US1 Hosting Issues

Incident Report for HappyDance

Postmortem

Incident overview

On the evening of Sunday, Jan 25, 2026 (UTC), customer websites hosted on US1 (Central US) experienced a service disruption following an unexpected virtual machine restart.

The restart coincided with an urgent Azure preventive maintenance event affecting underlying network infrastructure in the Central US region. During the reboot process, the affected server became stuck in a Windows restart/update loop, delaying service recovery.

Root cause

Azure identified a degradation in a network device connected to one or more virtual machines in the Central US region and initiated urgent preventive repair work to avoid wider unplanned failures.

During maintenance of this kind, the Azure platform may automatically reboot or migrate virtual machines. In this case:

The US1 virtual machine underwent an unexpected reboot
During startup, Windows entered a restart/update loop
The loop prevented normal service startup and temporarily blocked standard recovery actions

No data loss occurred. Operating system and data disks were retained throughout the event.
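
Platform-initiated reboots and migrations of this kind are normally announced to the affected VM through Azure's Scheduled Events metadata endpoint. The following is a minimal sketch of how pending events could be filtered, assuming a Python agent running on the VM itself. The function names, the VM name "us1-web", the sample payload, and the choice of which event types count as disruptive are illustrative assumptions, not a description of our tooling.

```python
import json
import urllib.request

# Azure Instance Metadata Service endpoint for Scheduled Events.
# It is only reachable from inside an Azure VM.
SCHEDULED_EVENTS_URL = (
    "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"
)

# Assumption: event types treated as disruptive for hosted websites.
DISRUPTIVE_TYPES = {"Reboot", "Redeploy", "Freeze", "Terminate"}


def fetch_scheduled_events(timeout=5.0):
    """Fetch the raw Scheduled Events document (must run on the VM itself)."""
    request = urllib.request.Request(
        SCHEDULED_EVENTS_URL, headers={"Metadata": "true"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def disruptive_events(document, vm_name):
    """Return events from the document that would interrupt the named VM."""
    matches = []
    for event in document.get("Events", []):
        if event.get("EventType") not in DISRUPTIVE_TYPES:
            continue
        if vm_name in event.get("Resources", []):
            matches.append(event)
    return matches


# Hypothetical payload in the documented response shape, for illustration only.
SAMPLE_DOCUMENT = {
    "DocumentIncarnation": 1,
    "Events": [
        {
            "EventId": "example-event-id",
            "EventType": "Reboot",
            "ResourceType": "VirtualMachine",
            "Resources": ["us1-web"],
            "EventStatus": "Scheduled",
            "NotBefore": "Mon, 01 Jan 2024 00:00:00 GMT",
            "EventSource": "Platform",
        }
    ],
}
```

Calling disruptive_events(fetch_scheduled_events(), "us1-web") on the VM would return any pending reboot, redeploy, freeze, or terminate events for that machine. Urgent repairs may leave little or no lead time, so a check like this complements monitoring rather than replacing it.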

Incident response

The issue was detected shortly after the unexpected restart
Affected services were isolated to US1-hosted workloads
Engineering investigated both platform health and recovery options
The Windows restart loop resolved on its own once the underlying platform state stabilised
Services began returning to normal and the platform was placed under monitoring

Once stability was confirmed, the incident was marked as resolved.
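
For reference, the elapsed time between the public status updates on this page can be worked out as follows. This is a small illustrative calculation that uses only the update timestamps posted here. Those timestamps mark when each update was published, not the exact moments the VM restarted or recovered.

```python
from datetime import datetime, timezone

# Status update timestamps as posted on this page (all UTC).
UPDATES = {
    "investigating": datetime(2026, 1, 25, 22, 38, tzinfo=timezone.utc),
    "monitoring": datetime(2026, 1, 25, 23, 43, tzinfo=timezone.utc),
    "resolved": datetime(2026, 1, 26, 8, 40, tzinfo=timezone.utc),
}


def minutes_between(start, end):
    """Whole minutes elapsed between two status updates."""
    return int((UPDATES[end] - UPDATES[start]).total_seconds() // 60)


print(minutes_between("investigating", "monitoring"))  # 65
print(minutes_between("investigating", "resolved"))    # 602
```

In other words, roughly one hour passed between the first public notice and the monitoring update, and roughly ten hours between the first notice and the resolved update.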

Mitigation steps

Continued platform monitoring following recovery
Verification of service availability and application health
Review of Azure Service Health notifications related to Central US maintenance
Confirmation that no data corruption or configuration loss occurred
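
The service availability verification listed above could be automated along the following lines. This is a minimal sketch rather than our actual tooling: the function names are illustrative, and the rule that any 2xx or 3xx response counts as healthy is an assumption.

```python
import urllib.error
import urllib.request


def is_healthy_status(status_code):
    """Assumption: any 2xx or 3xx response means the site is being served."""
    return 200 <= status_code < 400


def check_site(url, timeout=10.0):
    """Return (url, healthy) after a single GET request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return url, is_healthy_status(response.status)
    except urllib.error.HTTPError as error:
        # The server answered, but with a 4xx/5xx status.
        return url, is_healthy_status(error.code)
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, or timeout.
        return url, False


def unhealthy_sites(results):
    """Filter (url, healthy) pairs down to the URLs that failed."""
    return [url for url, healthy in results if not healthy]
```

Running unhealthy_sites(check_site(url) for url in sites) over the list of hosted site URLs would return only the sites that still need attention. An empty list would indicate that every site responded successfully.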

Conclusion

This incident was caused by Azure-initiated urgent preventive maintenance combined with a Windows restart loop during VM recovery. Although the initial restart was unplanned from an application perspective, it is consistent with Azure’s advance notification of urgent infrastructure repair in the region.

All services are now operating normally. We will continue to monitor platform health and take proactive steps to reduce the impact of future infrastructure-level maintenance events.

Posted Jan 26, 2026 - 09:20 UTC

Resolved

The issue affecting websites hosted on US1 has been resolved and services are operating normally.

We’ll continue to monitor the platform to ensure ongoing stability.
Thank you for your patience.

Posted Jan 26, 2026 - 08:40 UTC

Monitoring

The issue affecting websites hosted on US1 has now stabilised and services are starting to return to normal.

We’re continuing to monitor the platform closely to ensure full recovery and stability. A further update will be provided if required.

Thank you for your patience.

Posted Jan 25, 2026 - 23:43 UTC

Investigating

We’re currently aware of an issue impacting some websites hosted on US1.
Our engineering team is investigating the cause and working to restore full service as quickly as possible.

We’ll provide further updates here as more information becomes available.
Thank you for your patience.

Posted Jan 25, 2026 - 22:38 UTC

This incident affected: HappyDance Infrastructure (Microsoft Azure).