/rc_scratch full event
Incident Report for CU Boulder RC
Resolved
With the assistance of our user community, we were able to free up space on rc_scratch. Since then, we have enabled per-user quotas on rc_scratch. Not all users have quotas in place yet, but this new functionality will give us greater ability to manage near-full events in the future.

We have also set up additional monitoring of rc_scratch capacity and fullness.
Posted Aug 04, 2020 - 14:18 MDT
Identified
We have a plan for establishing per-user quotas in /rc_scratch, but it requires a configuration change for which the file system must be briefly unmounted. We are therefore scheduling a brief outage for our next PM date, 1 July, and plan to make the change then.

In the meantime, some space has been freed, and rc_scratch is now showing 85% full with 16T available. Per-user quotas have been partially implemented for a few rc_scratch storage outliers, and we will work with those users individually so that they can continue working until the quota configuration change is made.
Posted Jun 19, 2020 - 12:08 MDT
Investigating
The /rc_scratch file system, mostly used during Blanca computation, is full. We are working on a plan and hope to implement it today. Until then, individuals are being contacted to ask them to free up space.

Be advised: as always, data is automatically removed from /rc_scratch after 90 days, but bursts of write activity within that window can still fill the file system.
Posted Jun 19, 2020 - 09:28 MDT
This incident affected: Blanca.