Job Summary
The HPC System Administrator will manage day-to-day operations of HPC systems, ensuring stability, security, and performance. This role includes system monitoring, patching, user account management, job queue oversight, and incident resolution to support NSCC's supercomputing environment.
Roles and Responsibilities
· Administer HPC compute nodes, storage systems, and internal networks.
· Monitor system health using tools like Grafana, Prometheus, and custom scripts.
· Apply patches, updates, and configuration changes to ensure stability.
· Manage user accounts, access controls, and authentication mechanisms.
· Monitor job queues and assist users with job submission and scheduling issues.
· Implement and enforce resource allocation policies.