Amazon Cognito outage: How StatusGator notified customers 30 minutes before Amazon did

On December 12, 2024, Amazon Cognito experienced a significant outage in the US-EAST-1 (N. Virginia) region, impacting authentication for numerous applications. This operational issue, caused by a configuration change deployment, led to widespread “TooManyRequestsException” errors for several hours. Many Amazon Cognito users were left scrambling to figure out why their application was down, why users couldn’t authenticate, and how to get back up and running.

In the early minutes of the outage, as IT teams were struggling to figure out how to recover, Amazon was silent on the issue, with their status page proclaiming “No recent issues”.

However, for StatusGator customers, the story unfolded quite differently: they were alerted minutes after the widespread outage began and nearly 30 minutes before Amazon officially acknowledged the issue on their status page.

The AWS Cognito Outage Timeline

At 02:24 UTC, StatusGator notified our users of authentication issues with Amazon Cognito — 28 minutes before AWS officially acknowledged the problem on their status page. Our early warning signal was powered by reports and patterns we observed starting at 02:17 UTC, allowing us to alert customers before their applications were deeply affected. This crucial lead time enabled proactive troubleshooting and communication to end-users, minimizing the impact of the outage.

What Happened?

According to AWS’s postmortem, the issue began at 00:35 UTC (4:35 PM PST) due to a configuration change deployment within Amazon Cognito. Here’s a full breakdown of the timeline:

  • 00:35 UTC: Amazon detects increased error rates in Cognito in the US-EAST-1 region; however, the issue is not yet widespread and Amazon does not publicly disclose the error rate increase.
  • 01:14 UTC: Amazon engineers begin investigating and working on a resolution, but the status page is not yet updated.
  • 02:00 UTC: Amazon identifies two root causes for the increase in error rates but still has not yet disclosed this investigation on the status page.
  • 02:17 UTC: The issue becomes more widespread and early reports of authentication errors start surfacing across the internet.
  • 02:24 UTC: StatusGator customers receive our Early Warning Signals alert about problems with Amazon Cognito.
  • 02:52 UTC: AWS updates its status page to acknowledge they are investigating an issue.
  • 02:55 UTC: StatusGator detects the change on AWS’s status page and updates the official status.
  • 03:17 UTC: AWS confirms the increase in error rates and isolates the issue to one of two root causes, pledging to continue investigating and hoping to resolve the issue within 60 minutes.
  • 03:37 UTC: AWS updates its status page to state that they have implemented a fix and are seeing signs of recovery.
  • 04:01 UTC: Time of full recovery as retroactively confirmed by AWS.
  • 04:38 UTC: AWS posts its final incident summary.

There are two critical moments in this timeline: At 9:17 PM ET / 6:17 PM PT, the issue became more widespread, and StatusGator notified its customers 7 minutes later. Amazon did not notify its customers for a further 28 minutes. This timeline highlights the critical gap between when problems first emerge and when providers acknowledge them. StatusGator bridges that gap, giving its users an edge.
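The gaps quoted above can be checked directly from the timeline. The following is a minimal Python sketch that uses only the UTC timestamps listed in the timeline; the helper and variable names are illustrative and not part of any StatusGator or AWS tooling.

```python
def to_minutes(hhmm: str) -> int:
    """Convert an 'HH:MM' UTC timestamp into minutes since midnight."""
    hours, minutes = map(int, hhmm.split(":"))
    return hours * 60 + minutes

# Timestamps copied from the timeline above (all on the same UTC day).
first_detected = to_minutes("00:35")      # Amazon detects increased error rates
widespread = to_minutes("02:17")          # issue becomes more widespread
statusgator_alert = to_minutes("02:24")   # Early Warning Signals alert sent
aws_acknowledged = to_minutes("02:52")    # AWS status page updated

# Minutes between the issue becoming widespread and the StatusGator alert.
detection_delay = statusgator_alert - widespread
# Minutes of advance notice before the AWS status page update.
lead_time = aws_acknowledged - statusgator_alert
# Minutes between first internal detection and public acknowledgement.
unacknowledged = aws_acknowledged - first_detected

print(f"StatusGator alert after widespread impact: {detection_delay} min")
print(f"Lead time over the AWS status page: {lead_time} min")
print(f"First detection to public acknowledgement: {unacknowledged} min")
```

The last figure is not stated in the article; it simply follows from the 00:35 and 02:52 entries in the timeline.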

How StatusGator Beats Status Pages

Our platform continuously monitors hundreds of status pages and collects early warning signals from a variety of sources. This unique capability allows StatusGator to detect and report issues before they become widely known. In this incident, we were able to capture signals such as:

  • User reports of “TooManyRequestsException” errors submitted to our public website.
  • Reports of issues with Amazon Web Services from StatusGator customers’ internal status pages.
  • A sudden spike in interest and activity surrounding the status of Amazon Cognito.
  • Reports of authentication-related issues on other official status pages that depend on Cognito.

By analyzing these signals in real time, StatusGator provides faster alerts and actionable insights that can help organizations respond quickly. We answer that critical question “Is it everyone or just us?” and help teams react to outages in real time.
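To make the idea of combining signals concrete, here is a rough Python sketch of one way independent signals could be corroborated before raising an alert. It is not StatusGator's actual detection logic, which this article does not describe. The `Signal` type, the `should_alert` function, the source labels, the 10-minute window, the two-source threshold, and the sample data are all assumptions invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Signal:
    source: str   # hypothetical label, e.g. "user_report" or "interest_spike"
    service: str  # service the signal refers to
    minute: int   # minutes since midnight UTC when the signal was seen

def should_alert(signals, service, now, window=10, min_sources=2):
    """Return True when at least `min_sources` distinct sources reported
    `service` within the last `window` minutes (assumed thresholds)."""
    recent_sources = {
        s.source
        for s in signals
        if s.service == service and 0 <= now - s.minute <= window
    }
    return len(recent_sources) >= min_sources

# Made-up sample signals, loosely placed between 02:17 and 02:24 UTC.
sample = [
    Signal("user_report", "Amazon Cognito", 2 * 60 + 18),
    Signal("dependent_status_page", "Amazon Cognito", 2 * 60 + 21),
    Signal("interest_spike", "Amazon Cognito", 2 * 60 + 23),
]

print(should_alert(sample, "Amazon Cognito", now=2 * 60 + 24))
print(should_alert(sample, "Amazon Cognito", now=2 * 60 + 17))
```

Requiring several distinct sources, rather than counting raw reports, is one simple way to reduce false alarms from a single noisy channel.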

Learnings and Takeaways

This incident underscores the importance of independent monitoring for critical services. While provider status pages are essential, they are often reactive, leaving customers to grapple with service disruptions until official updates are posted. StatusGator’s early detection capabilities empower teams to stay ahead, respond swiftly, and maintain trust with their customers.

Stay Ahead with StatusGator

Outages happen, but you don’t have to be caught off guard. With StatusGator, you gain the power of early detection and actionable insights. Whether you’re managing a critical application or global IT infrastructure, StatusGator keeps you informed and prepared.

Read more about Early Warning Signals to see how we make this possible, and join the growing number of organizations that rely on StatusGator for critical monitoring and communication by booking a demo.

Andy Libby

Andrew Libby is a veteran Ruby developer and technologist with over 25 years of experience; Andy is co-founder of StatusGator and leads engineering at Nimble Industries.