(QOSGrpNodeLimit) Reason on Summit
Incident Report for CU Boulder RC
Resolved
This incident has been resolved.
Posted 6 months ago. Jun 07, 2018 - 17:25 MDT
Monitoring
We have implemented a fix and will monitor the queues to confirm that it has the intended effect.
Posted 6 months ago. Jun 07, 2018 - 17:05 MDT
Investigating
Some users' jobs are being held in the queue with the reason (QOSGrpNodeLimit) even though nodes are available within the partition to run them. We are investigating and are working with the vendor to resolve the issue.
Posted 6 months ago. Jun 07, 2018 - 13:07 MDT
This incident affected: RMACC Summit.