SlurmDB emergency maintenance

Incident Report for CU Boulder RC

Resolved

This incident has been resolved.
Posted Jan 28, 2020 - 12:16 MST

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Jan 27, 2020 - 17:33 MST

Update

While the database server is down, the user management portal "RCAMP" is also unavailable.
Posted Jan 27, 2020 - 15:45 MST

Investigating

We have disabled job submission on both Blanca and Summit while we add space to the database server that hosts our Slurm database. Jobs that are currently running will continue to run, but no new jobs will be started.
Posted Jan 27, 2020 - 13:19 MST
This incident affected: Blanca.