PetaLibrary/active BeeGFS interruption for patch installation
Scheduled Maintenance Report for CU Boulder RC
The update of beegfs-meta from 7.1.2 to 7.1.4 succeeded, with virtually no problems during deployment. Our secondary metadata server is now resyncing from the primary, and we expect the resync to complete successfully.

This update fixes other issues as well, so we hope to see better overall stability in beegfs-meta as a result.

We do not believe that this update disrupted access to PetaLibrary beyond a brief pause in I/O, but we apologize if it interrupted any running jobs or caused errors in them.
Posted Jan 23, 2020 - 09:49 MST
In progress
Scheduled maintenance is currently in progress. We will provide updates as necessary.
Posted Jan 23, 2020 - 09:30 MST
We have identified that the BeeGFS metadata component of PetaLibrary/active is experiencing a race condition that is provoking errors in the storage cluster, interrupting access, and preventing the proper synchronization of metadata between the two metadata servers.

A workaround is available in a patch, but deploying it requires a (hopefully brief) interruption while we restart the beegfs-meta service on the primary metadata server.

Because this race condition is preventing the proper synchronization of our metadata, which impedes our ability to recover from a primary metadata server failure and puts the data at risk in general, we are working to deploy this patch as soon as possible.

We will patch the secondary server today, and have scheduled the patching of the primary to occur on or after 09:30 Thursday (tomorrow) morning.

We hope that this interruption will be only a brief pause in I/O and will not affect running jobs beyond that. Further updates will be provided here as we have them.
Posted Jan 22, 2020 - 13:57 MST
This scheduled maintenance affected: PetaLibrary.