Unplanned outage in RC Core
Incident Report for CU Boulder RC
Resolved
The root cause of this outage has been identified and cleared in the upstream campus network. The affected elements of the RC Core infrastructure were vulnerable to this upstream outage because of an unrelated migration in progress, and will no longer be vulnerable once that migration is complete.
Posted Sep 29, 2018 - 23:36 MDT
Monitoring
Summit and Blanca have been returned to service. We will continue monitoring in case an upstream failure causes further issues.
Posted Sep 28, 2018 - 16:45 MDT
Investigating
An unplanned outage in the RC Core infrastructure has made RC directory services inaccessible. To prevent potential job failures, the queueing systems on Summit and Blanca have been stopped until access to the directory is restored.
Posted Sep 28, 2018 - 16:14 MDT
This incident affected: Research Computing Core, Blanca, and RMACC Summit.