Railway - A subset of our hosts are currently offline – Incident details

Resolved
Major outage
Started 5 months ago
Lasted about 5 hours

Affected

Deployments

Edge Network

Major outage from 8:24 AM to 10:40 AM, Partial outage from 10:40 AM to 11:57 AM, Degraded performance from 11:57 AM to 1:00 PM

TCP Proxy

Major outage from 8:24 AM to 10:40 AM, Partial outage from 10:40 AM to 11:57 AM, Degraded performance from 11:57 AM to 1:00 PM

Legacy Edge Network

Major outage from 8:24 AM to 10:40 AM, Degraded performance from 9:08 AM to 1:00 PM, Partial outage from 10:40 AM to 11:57 AM, Operational from 12:29 PM to 1:00 PM

US West (us-west1 / Oregon, USA)

Major outage from 8:24 AM to 10:40 AM, Partial outage from 10:40 AM to 11:57 AM, Degraded performance from 11:57 AM to 1:00 PM

US East (us-east4 / Virginia, USA)

Major outage from 8:24 AM to 9:08 AM, Degraded performance from 9:08 AM to 12:29 PM, Operational from 12:29 PM to 1:00 PM

Updates
  • Resolved

    This incident has been resolved. We will publish a post-mortem once we have concluded our investigation into the root cause.

  • Update

    Services have been restored in all regions except US West (us-west1 / Oregon, USA). We are moving workloads off a single unhealthy host and continuing to verify that all hosts restarted during this incident are healthy.

  • Monitoring

    All impacted hosts have been restarted, and we are currently working to ensure they are healthy. Services should be back online within 15 minutes of this update. Some services may encounter networking issues as the restart process completes.

  • Update

    Restarts have completed for most hosts. We are currently restarting the remaining hosts, and are working to ensure the restarted hosts are healthy.

  • Update

    Restarts of all impacted hosts are still in progress. We are seeing further service restoration and are working towards full restoration.

    The root cause of this incident is still under investigation.

  • Identified

    We are seeing additional service restoration across Railway-hosted workloads. Hobby provisions have been re-enabled.

  • Update

    Restarts of all impacted hosts are currently in progress. We are seeing limited service restoration at the moment and will continue to work towards full restoration.

    The root cause of this incident is still under investigation.

  • Update

    We are currently performing a rolling restart of all impacted hosts. The root cause of the incident is still under investigation.

  • Update

    We are still investigating this incident. We have restarted some hosts to attempt service recovery.

  • Update

    We are still investigating this incident.

  • Update

    We have isolated the majority of impact to the US West (us-west1 / Oregon, USA) region. Other regions are currently unstable for a subset of users. We're still investigating this incident.

  • Update

    We are still investigating this incident. Deployments for Hobby Plan users are temporarily disabled.

  • Investigating

    We're currently investigating issues impacting a subset of our hosts. Railway-hosted workloads may be unresponsive during this time.