Most Blanca nodes have been returned to service. The remaining nodes will be returned to service as they are rebooted.
Posted May 28, 2019 - 10:27 MDT
A specific workload running preemptably on Blanca has been identified as leaving large numbers of Blanca compute nodes in the "Kill task failed" state. These nodes are automatically drained and rebooted, but new jobs cannot be scheduled on them while they drain for reboot.
We have held all further work from this workload until we can correct the behavior.