Google Cloud outage: StatusGator detected it first — here’s what happened

Published:

June 16, 2025

Updated:

July 26, 2025

Table of contents

On June 12, 2025, Google Cloud suffered a major global outage that affected dozens of its core services—including Compute Engine, BigQuery, Gmail, and Vertex AI. The disruption, caused by a faulty quota policy update, triggered API failures across nearly every region.

While Google Cloud’s own monitoring tools were slow to respond, StatusGator detected the issue at 18:09 UTC, giving users a critical head start—nearly an hour before Google publicly acknowledged the incident.

This breakdown explores what happened, what was affected, and why independent monitoring like StatusGator is essential when your cloud provider is in trouble.

The root cause: a quota policy crash

Google later revealed that a new quota policy—deployed without proper error handling or feature flag protections—contained blank metadata fields. These blank values triggered null pointer crashes in a critical backend service called Service Control.

Because this data was replicated globally, nearly all regions experienced API outages simultaneously. Worse, the system lacked exponential backoff, which overwhelmed backend infrastructure in key zones like us-central1, extending recovery time.

StatusGator detected the outage before Google did

Here’s how the key moments unfolded:

⏱️ Outage timeline (all times in UTC)

17:42 – Internal signals of degradation begin (per StatusGator correlation)
17:49 – Official incident start (Google confirms 10:49 PDT)
18:09 – 🟡 StatusGator detects the outage
19:02 – 🔴 Google Cloud posts first public update
20:49 – Majority of services fully recovered
21:49 – Recovery completed in us-central1
01:18 (June 13) – ✅ Final residual services (Vertex AI) fully recovered

Throughout this timeline, StatusGator continuously tracked the degradation and recovery, allowing customers to assess impact well before official communication.

What broke — and who was affected

The incident led to widespread 503 errors and API failures for both Google Cloud and Workspace products. Impacted services included:

Infrastructure & data: Compute Engine, Cloud SQL, Cloud Storage, BigQuery, Cloud Run
AI/ML: Vertex AI, AutoML, Dialogflow, Document AI
DevOps: Cloud Monitoring, Logging, Build, Workflows
Workspace: Gmail, Drive, Docs, Meet, Chat

Existing workloads like running VMs were largely unaffected—but any service relying on API access or new provisioning failed or degraded during the event.

Why StatusGator’s early detection mattered

When outages strike, time is everything. Organizations that rely solely on official cloud status pages are often caught flat-footed. In this case, Google’s own monitoring tools were impaired, delaying incident updates.

But StatusGator filled the gap:

Detected the outage at 18:09 UTC, ahead of Google
Aggregated signals from affected services
Alerted users across multiple channels, independent of the provider’s infrastructure

That early alert gave teams valuable lead time to pause deployments, initiate failovers, and notify stakeholders—before the chaos hit full force.

Google’s commitments going forward

In their postmortem, Google acknowledged systemic failures and committed to:

Modularizing critical services to fail open
Adding feature flag protections to all risky code paths
Implementing proper error handling and test coverage
Strengthening external communications—even during outages
Ensuring backoff strategies to avoid cascading failures

These are positive steps—but they don’t change the reality that even the most trusted platforms can (and do) break.

Final thoughts: don’t rely on just one signal

The June 12 outage proves one thing: you need monitoring that works even when your provider doesn’t. Whether you use Google Cloud, AWS, Azure—or all three—independent monitoring like StatusGator gives you visibility when it matters most.

StatusGator saw it first. And next time, it might be your team that needs that early warning. Try StatusGator for free now!

Early Warning Signals

Use Cases

Features

Pricing

Integrations

Chat

Embeds

Help Desk

Incident Management

Monitoring

Notifications

Private Status

Status Pages

Advanced

.st0{fill:#252F3E;} .st1{fill-rule:evenodd;clip-rule:evenodd;fill:;} AWS status

.cls-1{fill:url(#linear-gradient);}.cls-2{fill:url(#linear-gradient-2);}.cls-3{fill:#2684ff;} Opsgenie

.st0{fill:#252F3E;} .st1{fill-rule:evenodd;clip-rule:evenodd;fill:;} AWS status

.cls-1{fill:url(#linear-gradient);}.cls-2{fill:#2684ff;} Atlassian Statuspage

Google Cloud outage: StatusGator detected it first — here’s what happened

The root cause: a quota policy crash

StatusGator detected the outage before Google did

⏱️ Outage timeline (all times in UTC)

What broke — and who was affected

Why StatusGator’s early detection mattered

Google’s commitments going forward

Final thoughts: don’t rely on just one signal

Recent posts

AWS status

Opsgenie

AWS status

Atlassian Statuspage