Post-Postgres: What’s Driving a Great Migration from the Iconic Database
For 20 years, Postgres has served as the primary transactional data store for many applications because it is flexible, easy to use and backed by a strong community. Twenty years, however, is a long time in the tech world, where advances in hardware and software relentlessly change what businesses can do – and, more importantly, what developers come to expect and need.
Following the release of Postgres 16 in September of 2023 and the meteoric rise of data-heavy innovations such as generative AI, it’s a natural time to take stock of the landscape and consider how infrastructure is evolving to support the applications of today and tomorrow. While Postgres still has a role to play, companies are migrating to other databases to meet these changing requirements.
One-size-fits-all platforms that aim to handle every data need are increasingly inadequate and cost-prohibitive, while task-specific technologies are often better suited to current demands. Global systems can now generate terabytes and even petabytes of data daily, and new types of applications require real-time responses to analytical queries at scale.
The result is that technologies such as Postgres, while certainly still relevant, are being redefined in how they are used and augmented by new types of databases and data warehouses that excel at real-time insight at scale.
For example, Postgres makes sense if a dating app needs to change the “Location” field in a user’s profile. But if a cloud observability company needs to calculate the average bill price across billions of entries, it needs something else.
Better together
Companies of all sizes benefit from the insights they generate into mission-critical business areas where speed is of the essence and the data being crunched is the gating factor for performance. And businesses operating at market-dominant scale – like Uber or eBay – generate a staggering amount of data: petabytes of logs, metrics, traces and events every day.
Postgres’ architecture isn’t equipped to handle analytics and complex querying at scale, and efforts to scale it for this purpose are cumbersome and costly. Diving deeper, these limitations stem from a fundamental design shared by all transactional databases like Postgres: row-oriented organization of data. This architecture is great at processing transactions, but it does not scale well for analytical applications. Put another way, Postgres is optimized for update-heavy workloads, whereas column-oriented software is optimized for read-heavy workloads.
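To make the row-versus-column distinction concrete, here is a minimal Python sketch – not Postgres or ClickHouse internals, just an illustration using in-memory lists – of the same table stored both ways. A point update touches a single record in either layout, but an aggregate such as the average bill from the earlier example only needs to scan one contiguous array in the columnar layout, while the row layout forces a walk over every record.

```python
# Minimal sketch: the same table stored row-wise and column-wise,
# illustrating why columnar layouts favor analytical scans.

import random
import time

N = 1_000_000

# Row-oriented: each record kept together, as a transactional store would.
rows = [
    {"user_id": i, "location": "NYC", "bill": random.uniform(1, 100)}
    for i in range(N)
]

# Column-oriented: each attribute kept as its own contiguous array.
columns = {
    "user_id": [r["user_id"] for r in rows],
    "location": [r["location"] for r in rows],
    "bill": [r["bill"] for r in rows],
}

# Transactional update: touch a single record -- cheap in the row layout.
rows[42]["location"] = "Berlin"

# Analytical query: average bill across all records.
t0 = time.perf_counter()
avg_from_rows = sum(r["bill"] for r in rows) / N   # must walk every record
t1 = time.perf_counter()
avg_from_cols = sum(columns["bill"]) / N           # scans one array only
t2 = time.perf_counter()

print(f"row store:    avg={avg_from_rows:.2f} in {t1 - t0:.3f}s")
print(f"column store: avg={avg_from_cols:.2f} in {t2 - t1:.3f}s")
```

Even in this toy example, the columnar scan is noticeably faster because the query reads only the one column it needs; real column stores add compression and vectorized execution on top of the same idea.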
All this explains why the migration is taking place: organizations deploy Postgres and other Online Transaction Processing (OLTP) databases strictly for their transactional data, while leveraging a new type of database for workloads focused on analytics. These real-time analytical databases come in different flavors, but all share one trait: they organize data in columns – not rows, as Postgres does.
Real time
If row-oriented transactional databases are losing favor for compute-intensive analytical workloads, what is replacing them?
Technology providers describe themselves in different ways, including “data warehouses,” “analytics platforms,” “analytics databases” or “relational databases.” The throughline for developers, CIOs and companies is to look for technologies that perform well on benchmarks focused on real-time analysis, such as the Star Schema Benchmark.
This industry shift is playing out across thousands of companies. Real-time data technology underlies a vast variety of use cases across industries. Observability is a key example of where real-time analytics has found widespread adoption, because it can power instantly responsive user-facing dashboards over high volumes of data and high ingest rates. But really, any application that depends on efficiently accessing and quickly aggregating or analyzing data is a place where you can expect to see these real-time analytics providers win workloads from incumbents.
Part of what makes the data space compelling to me as a professional is that companies openly share their lessons and insights. A little Googling turns up examples of the engineering teams at GitLab, Sentry and PostHog sharing their journeys in search of technologies that complement Postgres. I look forward to seeing what the next twenty years hold, and I don’t doubt that we will see even more specialization and disruption.
About the Author
Tanya Bragin leads product at ClickHouse, the fastest open source analytical database. She has over 20 years of experience in information technology, ranging from data analytics to risk management and information security. She started her career in consulting and sales engineering, and spent the last decade and a half growing product organizations from early stages to maturity at two data analytics startups, Elastic and ExtraHop.