Due to a cooling failure in our Holyoke datacenter, many compute nodes (half of row 7C) are offline. The affected queues include:
- bigmem
- gpu
- shared
- various PI-owned partitions
We are investigating the cause of this cooling failure and will bring the nodes back online once cooling has been restored. Any impacted jobs will be requeued.
In the meantime, jobs may pend for longer than usual because fewer compute nodes are available.