Status Page Design Patterns: How the Best SaaS Companies Communicate Downtime

Status pages are one of the few interfaces that users only visit when something is already wrong. The user arrives stressed, looking for a specific answer: is the thing I depend on broken, and when will it be fixed? Every design decision on the page either reduces that anxiety or compounds it.

Most SaaS status pages are bad. They show a single green banner reading “All Systems Operational,” offer no history, and go silent during the exact moments users need them most. The companies that get status pages right treat them as a real design problem with real information architecture, not as a compliance checkbox buried in the footer.

What follows are five patterns that separate status pages that actually work from ones that exist only to technically exist.

Component-level status vs. the single green dot

The most common status page antipattern is a single indicator for the entire product. One green circle. “All Systems Operational.” This tells the user nothing about what they actually care about.

A product like Stripe has dozens of independent systems: the API, the Dashboard, Stripe.js, Connect, webhooks, Checkout. A user whose checkout flow is failing doesn’t care whether the Dashboard is fine. They need to know whether the specific system they depend on is affected. A single indicator forces every user to assume the worst or assume the best, with no middle ground.

GitHub’s status page breaks this down well. It separates Git Operations, API Requests, Issues, Pull Requests, GitHub Actions, GitHub Pages, and Codespaces into individual rows, each with its own indicator. When Actions has degraded performance but everything else is green, a developer whose CI pipeline just stalled gets a precise answer in two seconds. A developer pushing code sees green across their workflow and moves on.

The design implication is that the status page needs a component model, not just a state model. Each row needs a label that maps to something the user recognizes from their own workflow, a visual indicator (color + icon + text label), and enough whitespace to scan quickly. The failure mode is listing too many components. If you show forty internal microservice names, the page becomes noise. The right granularity matches how users think about the product, not how the engineering team has decomposed it internally.
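A component model like the one described above can be sketched in a few lines. This is a minimal illustration, not any vendor's actual schema: the `Status` levels, the `Component` type, and the helper names are assumptions made for the example, and the labels are borrowed from the GitHub rows mentioned earlier.

```python
from dataclasses import dataclass
from enum import IntEnum


class Status(IntEnum):
    """Ordered so that a higher value means a worse state (assumed levels)."""
    OPERATIONAL = 0
    DEGRADED = 1
    PARTIAL_OUTAGE = 2
    MAJOR_OUTAGE = 3


@dataclass(frozen=True)
class Component:
    label: str       # user-facing name, not an internal service name
    status: Status


def overall_status(components: list[Component]) -> Status:
    """Roll up to the worst component state, e.g. for a page banner."""
    return max((c.status for c in components), default=Status.OPERATIONAL)


def affected(components: list[Component]) -> list[str]:
    """Labels of components that are not fully operational."""
    return [c.label for c in components if c.status != Status.OPERATIONAL]


# Hypothetical snapshot: one degraded row, everything else green.
components = [
    Component("Git Operations", Status.OPERATIONAL),
    Component("API Requests", Status.OPERATIONAL),
    Component("GitHub Actions", Status.DEGRADED),
]

print(overall_status(components).name, affected(components))
```

The roll-up is kept separate from the per-component rows on purpose: the banner can summarize, but the rows are what let a user check only the part of the product they depend on.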

The 90-day uptime bar

Open Cloudflare’s status page and you’ll see it immediately: a horizontal bar stretching across the page, divided into thin vertical segments, each one representing a day. Green means clean. Yellow means a degraded period. Red means an outage. The pattern has become standard across the industry because it solves a specific problem that point-in-time status can’t: it shows whether the service is generally reliable, not just whether it’s up right now.

The bar works because it gives historical context at a glance. A user evaluating a vendor can scan 90 days of operational history in a single eye movement. Patterns emerge visually: if every other Wednesday has a yellow segment, that’s a recurring maintenance window or a systemic issue. A single red segment three weeks ago with clean green since then reads as “they had an incident and fixed it.” A cluster of red and yellow across the last two weeks reads as “this service is unstable.”

Design considerations matter here more than they might seem. Granularity is the first decision: per-day segments are common, but per-hour or per-check granularity (with grouped rendering) gives a more honest picture. Color accessibility is critical, since the entire visualization depends on color differentiation. Red-green color blindness affects roughly 8% of men, which means a pure green/red bar is unreadable for a meaningful share of the audience. Stripe handles this by using distinct shapes and patterns alongside color, and by never relying on hue alone to convey state. Hover states add a second layer: mousing over a segment should reveal the specific incident, its duration, and a link to the full timeline.
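The granularity decision can be made concrete with a small sketch that groups individual check results into one segment per day. Everything here is an assumption for illustration: the `(timestamp, ok)` input shape, the function name, the 50% failure threshold for calling a day an outage, and the choice to bucket by UTC date. Timestamps are assumed to be timezone-aware.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta, timezone


def daily_segments(checks, end_day, days=90, outage_threshold=0.5):
    """Group (timestamp, ok) checks into one state per day, oldest first.

    A day with no checks is reported as "no data" rather than green.
    """
    totals = defaultdict(lambda: [0, 0])  # day -> [failed, total]
    for ts, ok in checks:
        bucket = totals[ts.astimezone(timezone.utc).date()]
        bucket[0] += 0 if ok else 1
        bucket[1] += 1

    segments = []
    for offset in range(days - 1, -1, -1):
        day = end_day - timedelta(days=offset)
        failed, total = totals.get(day, (0, 0))
        if total == 0:
            state = "no data"
        elif failed == 0:
            state = "operational"
        elif failed / total >= outage_threshold:
            state = "outage"
        else:
            state = "degraded"
        segments.append((day, state))
    return segments


# Made-up sample checks covering three of the last four days.
utc = timezone.utc
checks = [
    (datetime(2024, 3, 1, 12, 0, tzinfo=utc), True),
    (datetime(2024, 3, 2, 12, 0, tzinfo=utc), True),
    (datetime(2024, 3, 2, 12, 5, tzinfo=utc), False),
    (datetime(2024, 3, 2, 12, 10, tzinfo=utc), True),
    (datetime(2024, 3, 3, 9, 0, tzinfo=utc), False),
]
bar = daily_segments(checks, end_day=date(2024, 3, 3), days=4)
for day, state in bar:
    print(day.isoformat(), state)
```

Reporting missing days as "no data" instead of green is a deliberate choice in this sketch: it keeps the bar from claiming reliability that was never measured.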

The bar is also a trust artifact. A status page that shows 90 days of data, including a few incidents with clear resolutions, is more credible than a page with no history at all. An empty incident log doesn’t read as “perfect uptime.” It reads as “nobody is updating this page.”

Incident timeline UX

When an incident is active, the status page becomes a live communication channel. The design pattern that has standardized across good status pages is the reverse-chronological timeline: a vertical sequence of timestamped updates, most recent at the top, each tagged with a state label.

The state labels follow a predictable progression: Investigating → Identified → Monitoring → Resolved. Linear’s status page uses this exact flow. Discord does the same. The labels do real work for the user because each one answers a different question. “Investigating” means the team knows something is wrong but hasn’t found the cause. “Identified” means they know what broke. “Monitoring” means the fix is deployed and they’re watching to confirm it holds. “Resolved” means it’s over.

Each update in the timeline should include three things: a timestamp (with timezone), the state label, and a prose description written by a human. The prose matters. “We are investigating reports of elevated error rates on the API” is useful. “Incident detected” is not. Users read these updates the way they read flight delay announcements. They’re looking for specificity, honesty, and a signal that someone competent is working on it.

The affected components should be tagged on each update so users can filter or scan for relevance. If the incident affects “API” and “Webhooks” but not “Dashboard,” tagging those components lets a dashboard-only user stop checking for updates.
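The timeline structure described above (timestamp with timezone, state label, human-written prose, tagged components) might be modeled roughly like this. The `Update` type and `render_timeline` function are hypothetical names, and the second message is invented sample text.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

STATES = ["Investigating", "Identified", "Monitoring", "Resolved"]


@dataclass(frozen=True)
class Update:
    at: datetime                 # must be timezone-aware
    state: str                   # one of STATES
    message: str                 # prose written by a human
    components: tuple[str, ...]  # affected components, for filtering


def render_timeline(updates, component=None):
    """Return plain-text lines, newest update first, optionally filtered."""
    for u in updates:
        if u.state not in STATES:
            raise ValueError(f"unknown state: {u.state}")
        if u.at.tzinfo is None:
            raise ValueError("timestamps need a timezone")

    lines = []
    for u in sorted(updates, key=lambda u: u.at, reverse=True):
        if component is not None and component not in u.components:
            continue
        stamp = u.at.strftime("%Y-%m-%d %H:%M %Z")
        lines.append(f"{stamp}  {u.state}: {u.message}")
    return lines


updates = [
    Update(datetime(2024, 3, 2, 12, 5, tzinfo=timezone.utc), "Investigating",
           "We are investigating reports of elevated error rates on the API.",
           ("API", "Webhooks")),
    Update(datetime(2024, 3, 2, 12, 40, tzinfo=timezone.utc), "Identified",
           "A configuration change caused the errors; a rollback is in progress.",
           ("API", "Webhooks")),
]

for line in render_timeline(updates):
    print(line)
```

Calling `render_timeline(updates, component="Dashboard")` returns an empty list here, which is the signal a dashboard-only user needs to stop checking for updates.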

The design trap is overdesigning the timeline. Animations, collapsible sections, and progressive disclosure add complexity to a page that users are scanning under stress. The best incident timelines are flat, scrollable, and plain. GitHub’s incident pages are a good reference: timestamps on the left, state labels in bold, update text in body copy, no accordion, no tabs.

Severity color systems and accessibility

Every status page maps operational states to colors. The standard palette is green (operational), yellow (degraded performance), orange (partial outage), and red (major outage). This mapping is so consistent across the industry that it functions as a shared design language. Users arrive already knowing what the colors mean.

The problem is that color alone is not accessible. Red-green color vision deficiencies (the protan and deutan types, which are by far the most common) can make green and red hard or impossible to tell apart. A status page that communicates state purely through background color on a row is broken for a significant portion of its users.

The fix is straightforward but consistently skipped: pair every color with a text label and a distinct icon. Cloudflare uses a filled circle for operational, a minus sign for degraded, and an X for outage, each with a text label beside it. This means the information is encoded three ways (color, shape, text), and any one of those channels is sufficient on its own.
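One way to enforce the three-channel rule is to keep the presentation in a single table and check that no channel is ambiguous. The sketch below follows the icon shapes described above, but the state keys, color names, CSS class names, and helper functions are all assumptions for illustration, not Cloudflare's actual markup.

```python
from html import escape

# state -> the three redundant channels (color, shape, text)
PRESENTATION = {
    "operational": {"color": "green", "icon": "●", "text": "Operational"},
    "degraded": {"color": "yellow", "icon": "-", "text": "Degraded performance"},
    "outage": {"color": "red", "icon": "✕", "text": "Major outage"},
}


def channels_are_distinct(presentation):
    """True if every state is identifiable from any single channel."""
    for key in ("color", "icon", "text"):
        values = [p[key] for p in presentation.values()]
        if len(set(values)) != len(values):
            return False
    return True


def render_row(label, state):
    """Render one component row; the icon is decorative, the text carries meaning."""
    p = PRESENTATION[state]
    return (
        f'<li class="status status--{p["color"]}">'
        f'<span aria-hidden="true">{p["icon"]}</span> '
        f'{escape(label)}: <strong>{p["text"]}</strong></li>'
    )


assert channels_are_distinct(PRESENTATION)
print(render_row("API", "degraded"))
```

Running a check like `channels_are_distinct` in a test suite is one way to catch a later redesign that quietly reuses an icon for two states.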

There’s a subtler design choice in the color system too. Some pages use only two states (up/down), which is honest but coarse. Others use four or five (operational, degraded, partial outage, major outage, maintenance), which gives finer resolution but requires users to understand the difference between “degraded” and “partial outage.” The right number of states depends on how granular your component model is. If you show five components with five severity levels, there are 25 component-state combinations a user may have to interpret, and the cognitive load of scanning them is real. Most status pages land on three or four states as a workable middle ground.

Trust through transparency

The patterns above are all in service of a larger meta-pattern: status pages build trust by showing more, not less. This runs counter to the instinct most teams have, which is to minimize what’s visible so the page looks clean and reassuring. A status page with zero incident history doesn’t look like perfect reliability. It looks abandoned.

The companies that use status pages most effectively as trust signals share a few characteristics. They show historical uptime percentages, usually in the 99.9%+ range, with the actual number visible rather than hidden. They keep past incidents accessible, often browsable by month, so a prospective customer doing due diligence can see how the team handles problems. They display resolution times, either explicitly or implicitly through the incident timeline. And they show component counts that reflect the real surface area of the product.
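The arithmetic behind an uptime percentage is simple enough to show directly. The function names below are hypothetical, and the sketch assumes downtime is tracked in minutes over a fixed window of whole days.

```python
def uptime_percent(downtime_minutes, window_days):
    """Share of the window during which the service was up, in percent."""
    total = window_days * 24 * 60
    return 100.0 * (total - downtime_minutes) / total


def downtime_budget_minutes(target_percent, window_days):
    """Downtime allowed in the window while still meeting the target."""
    total = window_days * 24 * 60
    return total * (100.0 - target_percent) / 100.0


# 90 days is 129,600 minutes, so 99.9% leaves about 129.6 minutes of downtime.
print(round(downtime_budget_minutes(99.9, 90), 1))

# A hypothetical three-hour outage in a 90-day window.
print(round(uptime_percent(180, 90), 3))
```

Showing the number with enough decimal places matters: a three-hour outage in 90 days still rounds to 99.9% at one decimal place, even though it exceeds the 99.9% budget.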

DevHelm’s status explorer aggregates public status pages across SaaS providers, which makes it a useful reference for seeing how different companies approach these design decisions side by side. Browsing through a few dozen real status pages reveals just how much variation exists in how the same basic patterns get implemented.

The empty-state problem is worth calling out specifically. When a status page launches with a clean slate, the absence of incidents can read two ways: either the service has been flawless, or nobody has bothered to post updates. Showing uptime data from day one (even if it’s just “100% over the last 7 days” with a short bar) signals that the system is actively monitored and the page is a living artifact, not a static placeholder.

How to evaluate status page design

If you’re designing a status page or evaluating one, run through these questions:

Does it show component-level status? A single indicator is a red flag. Users need to map their own workflows to individual components.

Does it show history? A page with no timeline, no uptime bar, and no past incidents offers no evidence that it’s maintained. The uptime bar is the minimum viable trust signal.

Is the severity system accessible without color? Turn on a color blindness simulator in your browser dev tools. If you can’t distinguish states, neither can roughly 8% of your male users.

Do incident updates read like a human wrote them? Automated “incident detected” messages with no follow-up are worse than no status page, because they prove the system is running but nobody is communicating.

Does the page load fast? Users visit status pages during outages, which is exactly when infrastructure may be degraded. A status page that depends on the same infrastructure it’s reporting on is a design failure. The best implementations run on separate infrastructure or a CDN-hosted static page that stays up even when the product doesn’t.

Status pages are a small surface area with outsized impact on user trust. The patterns are well-established, the reference implementations are public, and the design decisions are mostly settled. The difference between a good status page and a bad one isn’t engineering effort. It’s whether someone treated the page as a design problem worth solving.
