(QOSGrpNodeLimit) Reason on Summit
Incident Report for CU Boulder RC
Resolved
This incident has been resolved.
Posted Jun 07, 2018 - 17:25 MDT
Monitoring
We have implemented a fix and are monitoring the queues to confirm that it has the intended effect.
Posted Jun 07, 2018 - 17:05 MDT
Investigating
Some users' jobs are being held in the queue with the reason (QOSGrpNodeLimit) even though nodes are available in the partition to run them. We are investigating and working with the vendor to resolve the issue.
Posted Jun 07, 2018 - 13:07 MDT
This incident affected: RMACC Summit.