Outage in instance creation in EU compute shard.
Resolved
Full outage

No impact has been observed for the last 90 minutes; the issue is fully resolved.

Wed, Nov 13, 2024, 04:55 PM
Updates

Resolved

No impact has been observed for the last 90 minutes; the issue is fully resolved.

Wed, Nov 13, 2024, 04:55 PM
37m earlier...

Monitoring

We continue to observe the recovery; queuing times have been back to normal levels for one hour now. The team continues to monitor the situation.

Wed, Nov 13, 2024, 04:18 PM
36m earlier...

Identified

We have confirmed that functionality has recovered across all regions, and queuing times have been back at expected levels for the last 30 minutes. We continue to make changes and monitor the situation.

Wed, Nov 13, 2024, 03:41 PM
13m earlier...

Identified

We have added more capacity to our EU shard and are observing recovery of the region. Some workloads might still see slightly longer queuing times, but wait times are improving. The team continues to monitor the situation and is still adding more capacity.

Wed, Nov 13, 2024, 03:28 PM
36m earlier...

Identified

The team is currently working on adding compute capacity to the EU shard. Queue times for starting new jobs are still degraded.

Wed, Nov 13, 2024, 02:51 PM
20m earlier...

Identified

We continue to use US compute capacity to reduce pressure on the EU compute shard, and we see a slow recovery. In the meantime, we are preparing mitigations to recover EU capacity. AMD64 and ARM64 capacity is still experiencing degraded creation times. macOS capacity is operational.

Wed, Nov 13, 2024, 02:31 PM
42m earlier...

Identified

We are mitigating the impact of the unavailability of our EU compute shard by using capacity from our US shard. We still see high queue times for all jobs and a full outage for ARM64 instances.

Wed, Nov 13, 2024, 01:49 PM
47m earlier...

Identified

The team has identified a scheduling component that is overloaded by requests. The whole team is working on reducing the load on this component.

Wed, Nov 13, 2024, 01:01 PM
10m earlier...

Investigating

We are investigating an outage in the creation of new instances in our EU compute shard.

Wed, Nov 13, 2024, 12:51 PM