FAS Research Computing - Incident History

Experiencing partially degraded performance

Status page for the Harvard FAS Research Computing cluster and other resources.

Cluster Utilization (VPN and FASRC login required): Cannon | FASSE


Please scroll down to see details on any Incidents or maintenance notices.
Monthly maintenance occurs on the first Monday of the month (except holidays).

GETTING HELP
Documentation: https://docs.rc.fas.harvard.edu | Account Portal: https://portal.rc.fas.harvard.edu
Email: rchelp@rc.fas.harvard.edu | Support Hours


The colors shown in the bars below were chosen to increase visibility for color-blind visitors.
For higher contrast, switch to light mode at the bottom of this page if the background is dark and colors are muted.

Degraded performance

SLURM Scheduler - Cannon - Degraded performance

Cannon Compute Cluster (Holyoke) - Degraded performance

Boston Compute Nodes - Degraded performance

GPU nodes (Holyoke) - Degraded performance

seas_compute - Degraded performance

Operational

SLURM Scheduler - FASSE - Operational

FASSE Compute Cluster (Holyoke) - Operational

Operational

Kempner Cluster CPU - Operational

Kempner Cluster GPU - Operational

Operational

FASSE login nodes - Operational

Operational

Cannon Open OnDemand - Operational

FASSE Open OnDemand - Operational

Operational

Netscratch (Global Scratch) - Operational

Home Directory Storage - Boston - Operational

Tape - (Tier 3) - Operational

Holylabs - Operational

Isilon Storage Holyoke (Tier 1) - Operational

Holystore01 (Tier 0) - Operational

HolyLFS04 (Tier 0) - Operational

HolyLFS05 (Tier 0) - Operational

HolyLFS06 (Tier 0) - Operational

Holyoke Tier 2 NFS (new) - Operational

Holyoke Specialty Storage - Operational

holECS - Operational

Isilon Storage Boston (Tier 1) - Operational

BosLFS02 (Tier 0) - Operational

Boston Tier 2 NFS (new) - Operational

CEPH Storage Boston (Tier 2) - Operational

Boston Specialty Storage - Operational

bosECS - Operational

Samba Cluster - Operational

Globus Data Transfer - Operational

Incident History

September 2024

Virtual Machine hypervisor down - Affects FASSE login/OOD
  • Resolved
    Resolving. The hypervisor and all but one VM, which has a separate issue, are operational.

  • Update
    FASSE Open OnDemand and FASSE login services should be operational now.

  • Monitoring
    FASSE OOD is back up.
    FASSE login nodes are still down.

  • Identified
    One of the hypervisors managing virtual machines is down. We are working to bring it back up. This affects the FASSE login and FASSE OOD nodes and may also degrade OpenAuth (two-factor).

    Affected hosts are:
    HOST -- STATUS
    dataverse-backup -- UNKNOWN
    demo2-l3-fs -- UNKNOWN
    enos-vote-l3-fs -- UNKNOWN
    fasselogin01 -- UNKNOWN
    fasselogin02 -- UNKNOWN
    frontier-squid02 -- UNKNOWN
    frontier-squid03 -- UNKNOWN
    frontier-squid04 -- UNKNOWN
    goel-adm24-l3-fs -- UNKNOWN
    goel-blind-l3-fs -- UNKNOWN
    goel-l3-fs -- UNKNOWN
    h-dev-fasseooda-01 -- UNKNOWN
    h-dev-fasseooda-lb01 -- UNKNOWN
    h-dev-fasseoodb-lb11 -- UNKNOWN
    h-fasseooda-01 -- UNKNOWN
    h-fasseooda-lb02 -- UNKNOWN
    h-fasseoodb-lb11 -- UNKNOWN
    h-fasseoodb-lb12 -- UNKNOWN
    h-fasseoodc-lb21 -- UNKNOWN
    h-fasseoodc-lb22 -- UNKNOWN
    h-qa-fasseooda-01 -- UNKNOWN
    h-qa-fasseooda-lb02 -- UNKNOWN
    holy-es-master01 -- UNKNOWN
    holy-es-master02 -- UNKNOWN
    holy-es-master03 -- UNKNOWN
    holynagios -- UNKNOWN
    kreindlerl3-fs -- UNKNOWN
    martin-su-l3-fs -- UNKNOWN
    mcconnell-l3-fs -- UNKNOWN
    openauth02 jtriley -- UNKNOWN
    shleifer-dsl3-fs -- UNKNOWN
    stock-solar-l3-fs -- UNKNOWN
    stopsack-l3-fs -- UNKNOWN
    xcat -- UNKNOWN

August 2024

Starfish upgrade
  • Completed
    August 27, 2024 at 12:00
    Starfish is back up.

  • Update
    August 26, 2024 at 14:35
    Starfish maintenance is still ongoing; no ETA at this time.

  • In progress
    August 24, 2024 at 00:00
    Maintenance is now in progress.

  • Not yet started
    August 24, 2024 at 00:00
    The Starfish Zones Dashboard will undergo upgrades and maintenance this weekend, from Friday, August 23rd at 8AM until Monday, August 26th at 8AM. The dashboard will not be accessible during this time. Further details will be provided if needed. Please email rchelp@rc.fas.harvard.edu if you have any questions or concerns.

July 2024

Authentication issues - Related to global Crowdstrike incident
  • Resolved
    All Crowdstrike-related resources are back up and operational.

  • Update
    Most FASRC resources affected by the Crowdstrike issue are back in full service. A few remaining issues involving the following may not be resolved until Monday:
    - waywiser2
    - proteomics2
    - tmsdb3
    - lic3

  • Update
    Please see HUIT Status (harvard.edu) for additional information on the global issue caused by the Crowdstrike security software, which Harvard relies on. This is an ongoing issue university-wide.

    Few systems at FASRC continue to be affected, but some Windows-based systems managed by or connected to FASRC may still be affected.

  • Monitoring
    Authentication is back up and running. Windows machines are still in a bad state and will need remedial work to get them back in service.

  • Identified
    Authentication is back up and running. Windows machines are still in a bad state and will need remedial work to get them back in service.

  • Detected
    Authentication is back up and running. Windows machines are still in a bad state and will need remedial work to get them back in service.

July 2024 to September 2024
