Monitoring - A patch from support has been applied, and Blanca appears to be operational once again. We will continue monitoring the system closely.
Jan 16, 18:02 MST
Identified - Slurm support has identified the issue that has caused a failure in Blanca Slurm. Unfortunately, updating Slurm, including with a provided patch, has not yet resolved the issue.

We are continuing to work on this issue with upstream support.
Jan 16, 17:34 MST
Update - We are continuing to investigate the cause of the Slurm events we are experiencing today. Blanca Slurm has become unresponsive, and we are currently unable to start it. We are reaching out to Slurm support for assistance.
Jan 16, 12:34 MST
Investigating - We are investigating a Slurm incident that occurred today around 10:30 AM. This incident may have affected Blanca, Summit, and EnginFrame, up to and including the early termination of jobs.

We apologize for the interruption, and are working to determine the cause.
Jan 16, 11:15 MST
Research Computing Core - Operational
Science Network - Operational
RMACC Summit - Operational
Blanca - Operational
PetaLibrary - Operational
EnginFrame - Operational
JupyterHub - Operational
Past Incidents
Jan 22, 2019

No incidents reported today.

Jan 21, 2019

No incidents reported.

Jan 20, 2019

No incidents reported.

Jan 19, 2019

No incidents reported.

Jan 18, 2019

No incidents reported.

Jan 17, 2019

No incidents reported.

Jan 15, 2019

No incidents reported.

Jan 14, 2019

No incidents reported.

Jan 13, 2019

No incidents reported.

Jan 12, 2019

No incidents reported.

Jan 11, 2019

No incidents reported.

Jan 10, 2019

No incidents reported.

Jan 9, 2019
Resolved - Work on CSU authentication has been completed, and the issue appears to have been resolved.
Jan 9, 12:11 MST
Investigating - We are investigating an issue that is affecting CSU authentication in some circumstances, particularly during account request. As part of the troubleshooting effort, there may be brief interruptions to CSU authentication in general.
Jan 9, 11:15 MST
Completed - This scheduled maintenance has been completed.
Jan 9, 10:14 MST
In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
Jan 9, 10:00 MST
Scheduled - During the maintenance period we will be enacting a policy change to the way users access Blanca nodes. Jobs will be held during the maintenance. After the maintenance period, only users who have a job running on a node will be able to SSH into that node. This change is intended to prevent users without a job on a node from accessing it, and to bring the Blanca Slurm policies in line with Summit.
Jan 4, 12:18 MST
Completed - IdentiKey authentication appears to be working as expected following upstream network maintenance. If you experience any trouble, please contact rc-help@colorado.edu.
Jan 9, 09:03 MST
In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
Jan 9, 05:30 MST
Scheduled - Starting at 5:30 a.m. tomorrow (Wednesday), CU OIT will be updating a number of campus firewalls (the second of two updates) which will make the wired network and many of the campus’s online services unavailable for approximately 50 minutes.

What wasn’t clearly communicated to us in the first version of the service alert was that CU Boulder’s Federated Identity Service will also be affected by the maintenance. The Identity Service provides authentication services for the RC login nodes, DTN, EnginFrame, and RCAMP. This means that new CU logins to these services will likely not be possible during the maintenance. Those already logged in should be unaffected, and jobs should continue to run as normal.

We'll review the status of these services first thing in the morning to confirm that there are no lingering outages.

Sorry for the late notice.
Jan 8, 20:32 MST
Jan 8, 2019

No incidents reported.