Alpine scratch incident
Incident Report for CU Boulder RC
Resolved
This incident has been resolved. We will continue to monitor for any recurrence of the issue.
Posted Sep 13, 2023 - 07:38 MDT
Update
The team is continuing to monitor and to roll out fixes to some individual hosts. We believe the cause has been addressed, but we continue to verify and to gather information. We will keep this incident open at least until tomorrow morning (Wednesday, September 13).
Posted Sep 12, 2023 - 16:00 MDT
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Sep 12, 2023 - 14:11 MDT
Update
We are continuing to investigate. Users may notice job sluggishness and "stale file handle" errors in some Alpine jobs. Our HPC team is continuing to troubleshoot, and a case is open with the vendor.
Posted Sep 12, 2023 - 13:35 MDT
Investigating
We are currently investigating this issue.
Posted Sep 12, 2023 - 11:14 MDT
This incident affected: Alpine.