Current System Load
The plot below shows the status of the CPU nodes on the current Cirrus service for the past day (note: the Cirrus GPU nodes are not included in this plot).
A description of each of the status types is provided below the plot.
CPU
- alloc: Nodes running user jobs
- idle: Nodes available for user jobs
- resv: Nodes in reservation and not available for standard user jobs
- down, drain, maint, drng, comp: Nodes unavailable for user jobs
- mix: Nodes in multiple states
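The grouping above can be sketched in code. The following is a minimal illustration, not the method used to produce the plot: it assumes node states arrive as two-column text (compact state name, node count), such as Slurm's `sinfo -h -o "%t %D"` would print. The function name `summarise_nodes`, the category labels, and the sample figures are all hypothetical.

```python
import re
from collections import Counter

# Grouping taken from the status descriptions above; the category
# labels themselves are illustrative, not official names.
CATEGORY_BY_STATE = {
    "alloc": "running user jobs",
    "idle": "available",
    "resv": "reserved",
    "mix": "mixed",
    "down": "unavailable",
    "drain": "unavailable",
    "maint": "unavailable",
    "drng": "unavailable",
    "comp": "unavailable",
}


def summarise_nodes(sinfo_text):
    """Sum node counts per category from '<state> <count>' lines."""
    counts = Counter()
    for line in sinfo_text.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip blank or malformed lines
        state, node_count = parts
        # Drop any trailing flag characters (for example "*") from the state.
        state = re.sub(r"[^a-z]+$", "", state.lower())
        category = CATEGORY_BY_STATE.get(state, "other")
        counts[category] += int(node_count)
    return dict(counts)


# Hypothetical sample input; these figures are not real Cirrus data.
sample = """\
alloc 250
idle 12
resv 4
drain* 2
mix 10
down 1
"""

summary = summarise_nodes(sample)
print(summary)
```

States not listed above fall into an "other" category rather than raising an error, so an unfamiliar state is still counted.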
Service Alerts
Status | Start | End | Scope | Impact | Reason |
---|---|---|---|---|---|
Planned | 2025-09-11 12:00 | 2025-09-11 21:00 | foo bar | Login not accessible, SAFE not accessible | Due to work on SAFE database, SAFE and Cirrus login MFA are currently unavailable |
Ongoing | 2025-08-11 09:00 | 2025-08-11 11:00 | SAFE, MFA at login | Login not accessible, SAFE not accessible | Due to work on SAFE database, SAFE and Cirrus login MFA are currently unavailable |
Recently Resolved Service Alerts
This table lists the last five resolved service alerts. A full list of historical resolved service alerts is available.
Status | Start | End | Scope | Impact | Reason |
---|---|---|---|---|---|
Resolved | 2025-03-12 08:00 | 2025-03-12 12:00 | Issues with the Slurm controller have been observed | Users can connect to the login node but jobs will not start on the compute nodes. Users will not be able to issue Slurm commands. | Systems team are investigating the issue. |
Service Maintenance Sessions
Status | Start | End | Scope | Impact | Reason |
---|---|---|---|---|---|
Planned | 2025-08-24 12:00 | 2025-09-11 21:00 | Full Cirrus system | No login access, no access to storage systems, no jobs running | Major electrical work at the ACF datacentre |
We keep maintenance downtime to a minimum on the service but do occasionally need to perform essential work on the system. Maintenance sessions are used to ensure that:
- software versions are kept up to date;
- firmware levels on HPE and third-party peripheral equipment are kept up to date;
- essential security patches are applied;
- failed/suspect hardware can be replaced;
- new software can be installed;
- periodic essential maintenance on HPE electrical and mechanical support equipment (refrigeration systems, air blowers and power distribution units) can be undertaken safely.
Additional maintenance sessions can be scheduled for major hardware or software updates; major upgrades to facility plant and infrastructure; acceptance testing following major service upgrades; and statutory electrical testing.
A list of all previous maintenance sessions is available.