PetaLibrary metadata outage
Incident Report for CU Boulder RC
Resolved
The PetaLibrary BeeGFS metadata service has remained stable without further incident. We are continuing to work with our support vendor (the file system developer) to understand the root cause.
Posted Jan 17, 2020 - 09:54 MST
Update
The PetaLibrary has remained accessible after our previous administrative action, including during and after our separate maintenance activities.

We are still waiting to hear from support regarding why this happened in the first place.

In the meantime, we will keep this incident open and monitor the system closely.
Posted Jan 15, 2020 - 17:23 MST
Monitoring
We have implemented a fix for the beegfs-meta outage. The fix immediately restored access to the file system, but we are monitoring the system to see whether the problem recurs.

We are in contact with both our integrator and file system developer regarding this issue.
Posted Jan 15, 2020 - 15:55 MST
Investigating
We are currently experiencing a BeeGFS metadata outage affecting all services that use the PetaLibrary, including Summit and Blanca. We are actively investigating the incident and will provide updates as we learn more.

We currently believe this issue is unrelated to the maintenance work, as its initial symptoms started before any maintenance action had been taken and before the maintenance was scheduled to begin.

We apologize for the inconvenience and are working to restore service as soon as possible.
Posted Jan 15, 2020 - 15:06 MST
This incident affected: Blanca, PetaLibrary, and RMACC Summit.