FAS Research Computing - Monthly Maintenance July 1st, 2024 from 7am-11am – Maintenance details

Status page for the Harvard FAS Research Computing cluster and other resources.
WINTER BREAK: Harvard and FASRC will be closed for winter break as of Sat. Dec 21st, 2024. We will return on Jan. 2nd, 2025. We will monitor for critical issues. All other work will be deferred until we return.

Cluster Utilization (VPN and FASRC login required): Cannon | FASSE


Please scroll down to see details on any Incidents or maintenance notices.
Monthly maintenance occurs on the first Monday of the month (except holidays).

GETTING HELP
https://docs.rc.fas.harvard.edu | https://portal.rc.fas.harvard.edu | Email: rchelp@rc.fas.harvard.edu



Monthly Maintenance July 1st, 2024 from 7am-11am

Completed
Scheduled for July 01, 2024 at 11:00 AM – 3:00 PM

Affects

Cannon Cluster
SLURM Scheduler - Cannon
Cannon Compute Cluster (Holyoke)
Boston Compute Nodes
GPU nodes (Holyoke)
seas_compute
Updates
  • Completed
    July 01, 2024 at 3:00 PM
    Maintenance has completed successfully
  • In progress
    July 01, 2024 at 11:00 AM
    Maintenance is now in progress
  • Planned
    July 01, 2024 at 11:00 AM

    NOTICES

    MAINTENANCE TASKS

    Cannon cluster will be paused during this maintenance?: YES

    FASSE cluster will be paused during this maintenance?: YES

    Slurm scheduler upgrade
    -- Audience: All cluster users
    -- Impact: The schedulers will be upgraded to Slurm version 23.11.8. All jobs will be paused to accommodate this upgrade and will resume once complete.
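After the upgrade, users who want to confirm the scheduler version from a login node could do so along these lines. This is a minimal sketch, not an FASRC-provided tool: it assumes `sinfo --version` prints a string such as `slurm 23.11.8`, and the helper names `parse_slurm_version` and `installed_slurm_version` are hypothetical.

```python
import re
import subprocess

# Target version taken from the maintenance notice.
TARGET_VERSION = (23, 11, 8)


def parse_slurm_version(text):
    """Extract the first major.minor.patch triple from text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def installed_slurm_version():
    """Run `sinfo --version` and parse its output (assumed format: 'slurm X.Y.Z')."""
    result = subprocess.run(
        ["sinfo", "--version"], capture_output=True, text=True, check=True
    )
    return parse_slurm_version(result.stdout)


if __name__ == "__main__":
    version = installed_slurm_version()
    label = ".".join(str(part) for part in version)
    if version >= TARGET_VERSION:
        print(f"Slurm {label}: at or above the target version")
    else:
        print(f"Slurm {label}: older than the target version")
```

Comparing versions as integer tuples avoids the pitfalls of string comparison, where "23.2" would sort after "23.11".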

    PMIx upgrade - DEFERRED
    -- Status: This upgrade has been deferred due to issues found during testing and will not take place in this maintenance window.
    -- Audience: All MPI/OpenMP jobs
    -- Impact (as originally planned): The PMIx API was to be upgraded to version 5.0.2 while jobs were paused for the Slurm upgrade.

    Login node and Open OnDemand (OOD/VDI) reboots
    -- Audience: Anyone logged into a login node or VDI/OOD node
    -- Impact: Login and VDI/OOD nodes will be rebooted during this maintenance window.

    Scratch cleanup ( https://docs.rc.fas.harvard.edu/kb/policy-scratch/ )
    -- Audience: Cluster users
    -- Impact: Files older than 90 days will be removed. Please note that retention cleanup can run at any time, not just during the maintenance window.  
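To see which files may fall under the 90-day retention rule before a cleanup runs, a scan like the following can help. This is a rough sketch under stated assumptions: it uses each file's modification time as the age, which may not match the criterion the actual cleanup uses (see the scratch policy linked above), and the function name `files_older_than` and the directory passed on the command line are hypothetical.

```python
import os
import sys
import time
from pathlib import Path

# Retention period stated in the notice.
RETENTION_DAYS = 90


def files_older_than(root, days=RETENTION_DAYS, now=None):
    """Yield files under root whose modification time is more than `days` days before `now`.

    Assumption: age is measured by mtime; the real cleanup may use a different timestamp.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                mtime = path.lstat().st_mtime
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
            if mtime < cutoff:
                yield path


if __name__ == "__main__":
    # Pass your own scratch directory as the first argument.
    scratch_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    for old_file in files_older_than(scratch_dir):
        print(old_file)
```

The script only lists candidates and never deletes anything, so it is safe to run at any time to decide what to copy to longer-term storage.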

    Thanks,  

    FAS Research Computing
    Dept. Website: https://www.rc.fas.harvard.edu/
    Documentation: https://docs.rc.fas.harvard.edu/
    Status Page: https://status.rc.fas.harvard.edu/