Following an overnight reduction in storage usage, the scratch system's performance is back to normal, so we are closing this incident. In the coming days we will hold further discussions with the vendor, adjust our monitoring and alerting thresholds, and complete a previously planned expansion of the scratch system's maximum storage capacity.
Posted Apr 25, 2025 - 15:45 MDT
Update
We made substantial progress this afternoon freeing space on scratch; the shortage of free space is the proximate cause of this degradation. A process based on adjusted scratch policy settings is now running and will clear additional space over the coming hours. CURC will continue to evaluate the system and remain in communication with the vendor. We expect to post our next update here tomorrow.
Posted Apr 24, 2025 - 17:05 MDT
Investigating
The scratch filesystem (/scratch/alpine) is currently experiencing performance degradation. CURC has performed initial troubleshooting and opened written communication with the vendor.