FAS Research Computing - Notice history

Status page for the Harvard FAS Research Computing cluster and other resources.

Cluster Utilization (VPN and FASRC login required): Cannon | FASSE


Please scroll down to see details on any Incidents or maintenance notices.
Monthly maintenance occurs on the first Monday of the month (except holidays).

GETTING HELP
Documentation: https://docs.rc.fas.harvard.edu | Account Portal: https://portal.rc.fas.harvard.edu
Email: rchelp@rc.fas.harvard.edu | Support Hours


SLURM Scheduler - Cannon - Operational

Cannon Compute Cluster (Holyoke) - Operational

Boston Compute Nodes - Operational

GPU nodes (Holyoke) - Operational

seas_compute - Operational

SLURM Scheduler - FASSE - Operational

FASSE Compute Cluster (Holyoke) - Operational

Kempner Cluster CPU - Operational

Kempner Cluster GPU - Operational

FASSE login nodes - Operational

Cannon Open OnDemand/VDI - Operational

FASSE Open OnDemand/VDI - Operational

Netscratch (Global Scratch) - Operational

Home Directory Storage - Boston - Operational

Tape - (Tier 3) - Operational

Holylabs - Operational

Isilon Storage Holyoke (Tier 1) - Operational

Holystore01 (Tier 0) - Operational

HolyLFS04 (Tier 0) - Operational

HolyLFS05 (Tier 0) - Operational

HolyLFS06 (Tier 0) - Operational

Holyoke Tier 2 NFS (new) - Operational

Holyoke Specialty Storage - Operational

holECS - Operational

Isilon Storage Boston (Tier 1) - Operational

BosLFS02 (Tier 0) - Operational

Boston Tier 2 NFS (new) - Operational

CEPH Storage Boston (Tier 2) - Operational

Boston Specialty Storage - Operational

bosECS - Operational

Samba Cluster - Operational

Globus Data Transfer - Operational

Notice history

Apr 2026

FASRC monthly maintenance April 6th 2026 9am-1pm
Scheduled for April 06, 2026, 9am-1pm
  • Planned
    April 06, 2026 at 1:00 PM

    FASRC monthly maintenance will take place on April 6th 2026. Our maintenance tasks should be completed between 9am and 1pm.

    NOTICES:

    Cannon cluster will be paused during this maintenance: NO
    FASSE cluster will be paused during this maintenance: NO

    MAINTENANCE TASKS

    • two-factor.rc.fas.harvard.edu OpenAuth cut-over to new server

      • Audience: New accounts or anyone requesting an OpenAuth token

      • Impact: two-factor will be unavailable while moving to a new server

    • RStudio Server (Open OnDemand)

      • Audience: RStudio Server users on Cannon and FASSE

      • Impact: We will be decommissioning some versions of RStudio Server so we can properly maintain all production versions. Versions to be decommissioned:

        • R 4.1.3 (Bioconductor 3.14, RStudio 2022.02.0)

        • R 4.1.0 (Bioconductor 3.13, RStudio 1.4.1717)

        • R 4.0.3 (Bioconductor 3.12, RStudio 1.3.1093)

        • R 4.0.0 (Bioconductor 3.11, RStudio 1.3.1093)

      • If you use one of these versions, we recommend replacing it with the most recent version, R 4.4.2 (Bioconductor 3.20, RStudio 2024.12.0). You will need to reinstall any previously installed libraries for the new version.

    • Domain controller replacement

      • Audience: Internal

      • Impact: End users should not see any impact

    • OOD/Open OnDemand reboots

      • Audience: All OOD users; the OOD head nodes will be rebooted

      • Impact: Running sessions will not be affected

    • Login node reboots

      • Audience: All login node users

      • Impact: Login nodes will reboot during the maintenance window

    • Netscratch 90-day retention cleanup

      • Audience: All netscratch users

      • Impact: Files older than 90 days will be removed per our scratch policy. Please note that this cleanup can happen at any time, not just during maintenance. See the sketch below for one way to identify affected files.
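
    If you want to preview which of your files would fall under the 90-day policy before a cleanup runs, the short Python sketch below is one way to list them. It is illustrative only: the path /n/netscratch/your_lab is a placeholder, and judging age by modification time is an assumption; consult the FASRC scratch policy for the exact criteria.

    #!/usr/bin/env python3
    # Minimal sketch: list files older than 90 days under a netscratch directory.
    # The path is a placeholder and the use of modification time is an assumption.
    import os
    import time

    SCRATCH_DIR = "/n/netscratch/your_lab"   # hypothetical example path
    MAX_AGE_DAYS = 90
    cutoff = time.time() - MAX_AGE_DAYS * 24 * 3600

    for root, dirs, files in os.walk(SCRATCH_DIR):
        for name in files:
            path = os.path.join(root, name)
            try:
                mtime = os.lstat(path).st_mtime   # lstat: do not follow symlinks
            except OSError:
                continue                          # file vanished mid-scan; skip it
            if mtime < cutoff:
                print(path)

    Anything the script prints is a candidate for removal under the retention policy, so copy what you still need to longer-term storage.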

    Thank you,
    FAS Research Computing
    https://docs.rc.fas.harvard.edu/
    https://www.rc.fas.harvard.edu/

Mar 2026

Scheduler is degraded
  • Resolved

    This incident has been resolved. The scheduler is running normally.

  • Investigating

    The scheduler is in a degraded state due to thrashing. We are actively working to resolve this problem.

Network issues - Cluster degraded
  • Resolved

    This incident has been resolved by draining and rebooting any nodes with stuck mounts.

  • Monitoring

    Mounts to Holyoke Isilon (specifically /n/sw) are broken on numerous nodes across the cluster. We have a check rolling out to find these nodes so we can remediate them individually. Until remediated, the cluster will be in a degraded state. Running jobs may randomly die or fail as they hit nodes that have stale mounts.

    Running jobs will be risky for the next hour. After that, a large number of nodes will be closed while we wait for them to drain so that we can reboot them and fix the mounts.

    At this time we are unaware of any holy-isilon problems other than the effect this had on cluster nodes/running jobs. We will update should we identify any data storage concerns.
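
    For context, a stale NFS mount typically makes filesystem calls hang rather than fail quickly, which is why affected nodes have to be found by actively probing them. The Python sketch below is a hypothetical illustration of such a probe, not the actual FASRC health check; the default path and timeout are assumptions.

    #!/usr/bin/env python3
    # Hypothetical sketch of a stale-mount probe: statvfs() either hangs or
    # errors on a broken NFS mount, so run it in a child process with a timeout.
    import multiprocessing
    import os
    import sys

    def _probe(path):
        os.statvfs(path)            # hangs (or raises) if the mount is stale

    def mount_is_healthy(path, timeout=10):
        proc = multiprocessing.Process(target=_probe, args=(path,))
        proc.start()
        proc.join(timeout)
        if proc.is_alive():         # statvfs never returned: treat as stale
            proc.terminate()
            proc.join()
            return False
        return proc.exitcode == 0   # non-zero exit means statvfs raised an error

    if __name__ == "__main__":
        target = sys.argv[1] if len(sys.argv) > 1 else "/n/sw"   # assumed default
        print(f"{target}: {'OK' if mount_is_healthy(target) else 'STALE'}")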

  • Identified

    Mounts to Holyoke Isilon (specifically /n/sw) are broken on numerous nodes across the cluster. We have a check rolling out to find these nodes so we can remediate them individually. Until remediated, the cluster will be in a degraded state. Running jobs may randomly die or fail as they hit nodes that have stale mounts.

    Running jobs will be risky for the next hour. After that, a large number of nodes will be closed while we wait for them to drain so that we can reboot them and fix the mounts.

  • Investigating

    A network issue affecting storage critical to the cluster is causing instability. The cluster is currently in a degraded state as a result. We are looking into the problem. Updates to follow.

Feb 2026

Tape outage
  • Resolved

    This incident has been resolved. Normal tape operations are restored.

  • Monitoring

    The tape library outage has been extended further, to Wednesday March 4th at 9am, while we await a hardware replacement part due today. Data can still be uploaded to lab collections via Globus, but be mindful of the 10 TB buffer file limit. The outage affects both storing to and recalling from tape.

  • Identified

    NESE Tape Service is still working with IBM technical support to restore the inventory. The expected downtime has been extended until Tuesday March 3rd, 9am.
    Apologies for the inconvenience.

  • Investigating

    NESE Tape service will be down or operating with degraded service (no store or recall) Friday from 12 Noon EST until as late as Monday, 2 March at 9 AM.

    SUMMARY OF ISSUE:

    NESE Tape service is currently not able to store or recall files to and from tape due to vendor firmware issues in the IBM TS4500 tape library. The issue is related to the library robotics and cartridge database and we do NOT expect any data loss from this issue.

    The problem is apparently due to an issue with the inventory database introduced by a recent firmware update. The library can scrub and reconstruct this database by scanning the barcode labels on all of the cartridges to rebuild the inventory. The association of files in Globus with tapes is handled separately from the tape library and is not affected by the firmware update.

NESE tape maintenance Feb 19th 2026
  • Completed
    February 19, 2026 at 10:00 PM
    Maintenance has completed successfully
  • In progress
    February 19, 2026 at 1:00 PM
    Maintenance is now in progress
  • Planned
    February 19, 2026 at 1:00 PM

    From our partners at NESE. Details follow:

    We are installing four new tape frames, which will bring the tape system raw storage capacity to 253 petabytes.

    Service Affected: NESE Tape Service

    Maintenance Window: 8:00 AM - 5:00 PM (EST)

    • The tape service will be unavailable.

    • All upgrade activities are expected to be completed on the same day.

    NOTES:

    • Monitor the MGHPCC Slack #nese channel for status updates and announcements

    • Monitor https://nese.instatus.com/ for real-time updates on progress

    • Subscribe to https://nese.instatus.com/subscribe/email for updates and announcements
