PetaLibrary Active/Beegfs Storage Servers Tuning
Scheduled Maintenance Report for CU Boulder RC
Completed
The configuration change on the Beegfs storage servers was applied and the filesystem has been monitored.
No disruptions were identified. Benchmark tests have shown that the change improved how storage server processing is balanced among users. More tests will be performed, and the filesystem will continue to be monitored beyond the completion of the maintenance.

If you observe any issues, please report them to rc-help@colorado.edu as usual.
Posted Nov 25, 2019 - 12:20 MST
In progress
Scheduled maintenance is currently in progress. We will provide updates as necessary.
Posted Nov 25, 2019 - 10:15 MST
Scheduled
Some of you may have noticed performance degradation when using PetaLibrary Active/Beegfs this week.
This is the result of unbalanced processing of storage requests across users under the current Beegfs configuration.

Some time ago we improved Beegfs performance on the metadata servers by increasing the number of worker threads that process incoming requests. However, we have not yet tuned the storage servers, which handle the reading and writing of files.

Our monitoring shows that the performance issues on Beegfs now lie mostly on the storage servers.
As such, we are applying a change recommended by the Beegfs developers this coming Monday (November 25) in an attempt to balance the processing of storage requests among our users. The change consists of creating a storage message queue for each user instead of using the single queue that we have today. Under the present queue configuration, incoming read/write requests are processed in first-come, first-served order, so one user's large backlog of requests can delay every request that arrives behind it.
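The difference between a single shared queue and one queue per user can be illustrated with a small Python sketch. This is only a toy model, not Beegfs code: the function names, the user names, and the round-robin policy used to serve the per-user queues are assumptions made for illustration, and the actual scheduling inside Beegfs may differ.

```python
from collections import OrderedDict, deque


def fifo_order(requests):
    """Single shared queue: serve requests strictly in arrival order."""
    return list(requests)


def per_user_order(requests):
    """One queue per user, served round-robin across users.

    Round-robin is an assumption chosen for illustration; it is not
    taken from the Beegfs implementation.
    """
    queues = OrderedDict()
    for user, req in requests:
        queues.setdefault(user, deque()).append((user, req))

    served = []
    while queues:
        # Iterate over a snapshot of the keys so empty queues can be removed.
        for user in list(queues):
            served.append(queues[user].popleft())
            if not queues[user]:
                del queues[user]
    return served


# Hypothetical workload: "alice" submits four requests before "bob" submits one.
requests = [("alice", f"a{i}") for i in range(4)] + [("bob", "b0")]

print(fifo_order(requests))      # bob's request is served last
print(per_user_order(requests))  # bob's request is served second
```

In this model, the light user's single request no longer waits behind the heavy user's entire backlog, which is the kind of balancing the change aims for.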

We contacted one of our users to request that they place the system under high load on Monday, and we will run a benchmark provided by another PL user to verify how the system behaves after the change.

We expect the change itself to be non-disruptive, so users should not encounter any error messages when using the /pl/active filesystem while we apply the change.

Please note that the configuration change should be completed within 15 minutes. The 2-hour timeframe for the maintenance takes into account filesystem monitoring after the change.
Posted Nov 22, 2019 - 14:46 MST
This scheduled maintenance affected: PetaLibrary.