
Data at Scale

June 18, 2026

How Postgres' Rise To Enterprise Default Is Outpacing The Operational Model Behind It

Christophe Pettus, CEO of PGX, on how Postgres became the nerve center for systems far larger than itself, and why the next wave of production failures will be failures of operating it.

Credit: The Read Replica



Ask a room of senior engineers what database a new project should use in 2026 and you'll get a shrug and a meme: "just use Postgres." The line started as a Hacker News in-joke and hardened into an architectural default, and the data backs the attitude. PostgreSQL was the single fastest-rising database in the first half of 2026 on the DB-Engines ranking, closing to within roughly ten points of Microsoft SQL Server, a gap that stood near ninety a year earlier. Every major cloud now sells a flavor of it: Amazon's Aurora wraps the Postgres front end around a different storage engine, Google's AlloyDB does something similar, and a steady parade of "Postgres-plus" and "Postgres-like" products has turned the wire protocol into a land grab.

So the headline writes itself: Postgres won. But that's not the interesting bit. What's interesting is what winning did to the workload. It stopped being a box that holds your data and became the thing everything else plugs into, and most teams are still operating it as though it were the box.

The man who gets called when the database is on fire

Christophe Pettus has been working with Postgres since 1997, before the hosted era existed. He runs PGX (formerly PostgreSQL Experts), the boutique consultancy he founded in 2009, sits on the PostgreSQL community infrastructure team, and speaks regularly at conferences from PGConf EU to PGDay Chicago. PGX has seen a great many sick databases and can usually tell, fast, what's killing each one. Increasingly, though, the culprit is the set of operational assumptions around the database, not the database itself.

Pettus is unsentimental about the victory lap. "It's not news," he said of Postgres's rise. He's more interested in where the new failure modes are coming from, and his starting observation reads almost like architectural archaeology. "It's a product of the late 1990s," he said. "Back then, the way you deployed infrastructure was you picked up a phone, called Supermicro, and they shipped you a computer. You went into the data center, you racked it, and you installed the software on it. It had local disks, and Postgres grew up in that world." That world fused compute and storage into a single machine that you owned and babied. The modern one has inverted the relationship. "Now the compute is a disposable product. You fire up a VM, you pick one, and you're done. The storage is the long-term persistent thing. That's a different world than Postgres originally grew up in." The community edition is catching up to it, he said, but slowly. The assumptions a lot of teams operate under are older than the infrastructure they're running on.

Scaling Postgres is the easy part

It sounds counterintuitive, but Pettus argues that scaling the database itself is mostly a solved problem, and that is exactly why the danger moves elsewhere.

"The first thing that happens is you start running out of IO capacity," he said. "Postgres doesn't tend to burn a lot of CPU, but it does have to move the data on and off the disk." The reflexive fix is a bigger instance or more provisioned IOPS, which buys time until it doesn't. The next move is read replicas. "Physical replication like that is almost foolproof and really easy for anyone to set up. You don't need to be a specialist." His own shop stopped doing replica setups as paid work, he noted, because clients simply don't need the help anymore.

The genuine ceiling is writes. "The big problem is, at some point you run out of write capacity, because Postgres right now doesn't have a built-in multimaster write solution," he said. Commercial options exist to fill that gap, but they add operational weight in exchange. Past that, you're into sharding, where off-the-shelf tools like Citus work well and a scheme tied more closely to your own data sometimes works better. None of it, in his telling, is exotic. "We've worked on petabyte-sized databases running on conventional Postgres and conventional hardware, depending on your workload." A team can take Postgres remarkably far before the database is the thing that breaks.

So if the database scales fine, what's the three-in-the-morning call actually about?

The database spans everything now

Pettus's answer is that the database has stopped being a database. "The database, in effect, is spanning a large number of components within the infrastructure," he said. "Traditionally, there was the database, there was the CPU, it had its storage, and that was kind of it. But now we're finding databases integrated into queuing systems, into AI-focused, GPU-intensive operations." Sometimes the database is doing AI-flavored work itself, with pgvector and embeddings living next to the operational data. Sometimes it isn't touched directly at all; it's providing the control plane for an AI system running elsewhere, coordinating something far larger than itself.

That distinction matters more than it sounds, because the moment your database is wired into a constellation of other components connected by a network, what you're actually running is a distributed system. And Pettus has a definition he learned a long time ago: "The definition of a distributed system is one where the wire can break." A control plane built on the assumption that the wire never breaks is a control plane that gets blindsided when it does.

AI compounds the problem. Teams lean on tooling to spot patterns and surface anomalies, but as Pettus puts it, the tools "can only really see what they've seen. A lot of the time it requires this extra leap of human inference to get past the problem." Understanding the distributed system you've built is precisely the thing teams are most tempted to skip when the system spans everything.

He sees the same overreach in the opposite architectural instinct, too. One client arrived with a long paper proposing to spread a single Postgres workload across seven specialized data stores, trying to squeeze out every last optimization. His response was to count the cost nobody had counted. "Some of these data stores don't do what you think they do," he said, the usual hazard of buying from the glossy brochure. And even where they deliver, "you've bought a huge amount of operational complexity, because you have to get your data in and out of seven different data stores," keeping them correlated and handling every way the data can fail to move between them. Centralize on one spanning database or scatter across seven, the bill is the same: you've signed up to operate a distributed system, and most teams underestimate that invoice.

Recovery is where systems die

"We'll actually get called in not for the failure, but for the recovery of the failure," Pettus said. The outage itself rarely does the lasting damage. That comes from what the organization does next. "Every time they bring up the application, there's this thundering herd problem, and the database falls back over." Network blips inside the data center, which happen constantly, compound into full failures because nobody designed for graceful recovery.
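One common way to soften the thundering herd Pettus describes is for each client to retry with exponential backoff plus random jitter, so reconnects spread out instead of landing on the recovering database all at once. What follows is a minimal sketch of that general technique, not anything Pettus or PGX prescribes. The function names, the default timings, and the `connect` callable are all hypothetical stand-ins for a real driver call.

```python
import random
import time


def backoff_delays(attempts, base=0.5, cap=30.0, rng=None):
    """Yield one randomized delay (in seconds) per retry attempt.

    The ceiling doubles on each attempt up to `cap`, and the actual delay
    is drawn uniformly from [0, ceiling] so that clients spread out.
    The default timings are illustrative assumptions, not recommendations.
    """
    rng = rng or random.Random()
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng.uniform(0, ceiling)


def connect_with_backoff(connect, attempts=6, sleep=time.sleep, rng=None):
    """Call `connect()` until it succeeds or `attempts` tries are used up.

    `connect` is a hypothetical zero-argument callable that raises
    ConnectionError on failure; swap in the real driver call and its
    real exception type.
    """
    last_error = ConnectionError("no connection attempts were made")
    for delay in backoff_delays(attempts, rng=rng):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            sleep(delay)  # wait a random interval before the next try
    raise last_error
```

The randomness is the point: if every application server waits exactly the same interval, they all come back together and the herd simply arrives later. Backoff on the client side is only one piece; a connection pooler or a staged application restart addresses the same problem from the other direction.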

The pattern he sees is organizational, not technical. Take disaster recovery that mistakes adjacency for redundancy. His favorite example is local. A client's primary data center sat on Main Street in San Francisco; their backup was a Digital Realty facility across the bay in Oakland, four or five miles away, both built on landfill. "Any earthquake that's going to take out one has probably a 75% chance of taking out the other," he said. "It's a case of correlated failure." Two data centers is not a plan if the same event flattens both. And the substrate underneath them is shakier than the org chart implies: "US-East-1, which basically is the internet," he noted, has gone down repeatedly "because of storms, because of all kinds of things." He's not exaggerating. AWS's oldest and most-depended-on region went dark again in October 2025, a DNS fault that cascaded for hours and took a meaningful slice of the internet with it. Data centers, in the end, are just buildings. In 2021, an OVHcloud facility in Strasbourg burned to the ground, and customers who had treated 'it's hosted somewhere' as a backup strategy lost their data permanently.

The same gap shows up in backups. "A backup you haven't tested is a backup you don't have," is the rule he repeats, and the failure is almost never the backup system itself. It's the cluster of operational steps around the restore that nobody rehearsed. He told the story of a client who turned a bad night into a catastrophe: they restored a backup, brought the application up, and only then realized the backup was a day old. They had lost a day of data, a day's worth of unapplied migrations was now colliding with the running system, it was three in the morning, everyone was exhausted, errors were flying, and no one understood what was happening. The backups had worked perfectly. The recovery had no plan.
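The missing step in that story, checking how old the backup is and what has changed since before the application comes back up, is the kind of thing a rehearsed runbook can encode. The sketch below is one hypothetical shape for such a preflight check. The function name, its parameters, and the migration name are invented for illustration, and a real version would read the backup timestamp and migration history from whatever tooling the team actually uses.

```python
from datetime import datetime, timedelta, timezone


def restore_preflight(backup_taken_at, now, max_data_loss, migrations_since_backup):
    """Return a list of warnings to resolve before restarting the application.

    An empty list means neither check found a problem. All inputs are
    supplied by the caller; nothing here talks to a database.
    """
    warnings = []
    age = now - backup_taken_at
    if age > max_data_loss:
        warnings.append(
            f"backup is {age} old, which exceeds the agreed limit of {max_data_loss}"
        )
    if migrations_since_backup:
        names = ", ".join(migrations_since_backup)
        warnings.append(f"schema migrations applied since the backup: {names}")
    return warnings


if __name__ == "__main__":
    # Hypothetical scenario: a day-old backup and one migration applied since.
    now = datetime(2026, 6, 18, 3, 0, tzinfo=timezone.utc)
    for warning in restore_preflight(
        backup_taken_at=now - timedelta(days=1),
        now=now,
        max_data_loss=timedelta(hours=1),
        migrations_since_backup=["0042_add_index"],
    ):
        print("STOP:", warning)
```

A check like this does not recover the lost day of data. What it does is make the gap visible before the application starts writing against a stale schema, while there is still time to decide what to do about it.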

Run it like the critical system it has become

Pettus has built something of a sideline in fixing the organizational version of this problem: incident response with no incident command. "What turns an emergency into a crisis tends to be responding too quickly, responding reflexively rather than procedurally," he said. He once joined a live call with 43 people on it. Nobody had an assigned role. Nobody owned communication with leadership. "Inevitably, a vice president will hop on the call, because they think the only reason the problem hasn't been solved is that someone with a lofty enough title hasn't yelled at them yet."

The frustrating part, to him, is that this is a solved problem in every field that takes failure seriously. Google's own incident-management framework is built directly on the Incident Command System, the same protocol the United States uses for wildfires and earthquakes: a clear commander, defined roles, one person on communications, nobody straying onto someone else's turf. The knowledge is sitting on the shelf. Software culture mostly hasn't picked it up, Pettus argued, because the reflex runs the other way, toward 'move fast and break things.' "Sometimes the thing you break is something you actually wanted."

The throughline is that none of these are database problems. They're the operational discipline a critical system demands and a humble data store never did. The database changed jobs. The operating model didn't follow.

Avoiding the 3 a.m. panic

The temptation isn't going away, because it's a good one. "It's very tempting. You can do so much," Pettus said. "You try to get it to do a little bit of everything and be the core of your business. And that's not wrong."

More teams will keep asking Postgres to be the center of everything, and Postgres will keep saying yes. The ones who come through the next few years intact will be the ones who stopped operating a control plane with a data store's playbook. "Companies way underestimate how much operational complexity they're buying into," Pettus said, and a lot of them are about to find out, at three in the morning, exactly how much.
