Technology
January 22, 2026
13 min read
2,424 words

We Stopped Using NoSQL for Everything—Relational Was Right All Along

Our MongoDB-first architecture promised flexibility but delivered consistency nightmares, query complexity, and hidden costs. Here's why we returned to PostgreSQL.

In 2019, our engineering team made what seemed like an obvious decision: we would build our entire new platform on MongoDB. The reasoning was compelling at the time. We were a startup moving fast, our data models were "evolving," and we didn't want to be "constrained" by rigid schemas. The NoSQL revolution had promised us freedom from the tyranny of relational databases, and we were eager believers.

Four years later, after countless production incidents, a complete re-architecture, and an estimated $2.3 million in engineering costs, we migrated our core systems back to PostgreSQL. This is the story of how document databases failed us, why the flexibility promise was a trap, and what we learned about choosing the right database for the right job.

The Seduction of Schema-Less Design

The appeal of MongoDB was immediate and visceral. Our product requirements were changing weekly. Traditional SQL databases required migrations, schema updates, and careful planning. MongoDB promised that we could just throw JSON documents into collections and figure out the structure later.

Our first prototype was built in three weeks. We were storing users, orders, products, and analytics events all as flexible documents. Need to add a new field? Just add it. Need to change a data type? Just change it. Need to store nested objects? No problem. The development velocity was intoxicating.

The problems started slowly, almost imperceptibly. First, it was small inconsistencies. Some user documents had a "phoneNumber" field, others had "phone_number," and still others had "contactPhone." Without a schema to enforce consistency, our data became a reflection of whoever happened to be writing code that day.

We created internal documentation listing our "standard" field names, but documentation is not enforcement. Six months in, a single query to analyze user contact methods required checking for twelve different field name variations. Our "flexible" schema had become a maintenance nightmare.

The Query Complexity Explosion

Relational databases solve a fundamental problem that we didn't appreciate until we lived without it: they provide a consistent, well-understood query language for relating data across tables. MongoDB's aggregation pipeline is powerful, but it's also dramatically more complex than SQL for common operations.

Consider a simple requirement: find all orders placed by users who signed up in the last 30 days, grouped by product category, with the average order value for each. In PostgreSQL, this is a straightforward JOIN with GROUP BY. In MongoDB, we needed a multi-stage aggregation pipeline with $lookup, $unwind, $match, $group, and $project stages.
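
Here's roughly what that SQL looks like (the table and column names are illustrative, not our actual schema):

```sql
-- Orders from users who signed up in the last 30 days,
-- grouped by product category with the average order value.
SELECT p.category,
       AVG(o.total_amount) AS avg_order_value,
       COUNT(*)            AS order_count
FROM orders o
JOIN users u    ON u.id = o.user_id
JOIN products p ON p.id = o.product_id
WHERE u.created_at >= NOW() - INTERVAL '30 days'
GROUP BY p.category;
```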

The aggregation pipeline for this query was 47 lines of JSON. The equivalent SQL was 8 lines. But the real problem wasn't the line count—it was maintainability. When requirements changed, modifying the aggregation pipeline required deep expertise. Our team had SQL proficiency, but aggregation pipeline expertise was concentrated in two senior engineers.

We started documenting our aggregation pipelines with extensive comments, but the cognitive load remained high. Code reviews for pipeline changes took three times longer than equivalent SQL changes. Junior developers avoided touching aggregation code entirely, creating bottlenecks around our MongoDB experts.

The performance implications were equally concerning. MongoDB's query planner is good, but it's not as mature as PostgreSQL's. We found ourselves manually optimizing aggregation stages, creating custom indexes for specific query patterns, and accepting performance characteristics that would be unacceptable in a relational system.

The Consistency Crisis

Our application handled financial transactions—users purchasing digital goods with real money. In a relational database, this would involve a transaction that debits the user's balance, creates an order record, updates inventory, and commits atomically. If any step fails, everything rolls back.
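
In SQL terms, the whole flow is a few statements inside one transaction block. A simplified sketch, with hypothetical table names and assuming a CHECK (balance >= 0) constraint on the balance column:

```sql
BEGIN;

UPDATE user_balances
   SET balance = balance - 499            -- debit the buyer (cents)
 WHERE user_id = 42;                      -- CHECK constraint aborts the txn on overdraft

INSERT INTO orders (user_id, product_id, amount_cents, status)
VALUES (42, 1001, 499, 'completed');      -- record the order

UPDATE inventory
   SET quantity = quantity - 1            -- release one unit of stock
 WHERE product_id = 1001;

COMMIT;  -- any failed statement aborts the transaction; nothing partial persists
```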

MongoDB's transaction support, when we started in 2019, was limited. We designed around this limitation using the "two-phase commit" pattern with status flags and reconciliation jobs. In theory, this approach works. In practice, it created a constant stream of edge cases and inconsistent states.

The reconciliation job ran every five minutes, checking for orders stuck in intermediate states and attempting to complete or roll back. But reconciliation has its own edge cases. What happens when the reconciliation job crashes mid-execution? What happens when two reconciliation jobs overlap? What happens when network latency causes a write to succeed after the timeout?

We built an increasingly complex state machine to handle order status transitions. The state machine had 14 states and 23 valid transitions. Each transition had its own validation logic, side effects, and failure handling. What started as a simple order model became an engineering project of its own.

The production incidents were predictable but demoralizing. Every few weeks, a customer would report being charged without receiving their purchase. Investigating these cases required tracing through status flags, checking timestamp sequences, and understanding which reconciliation job might have partially processed the order. Each investigation took hours.

MongoDB's multi-document ACID transactions matured in later releases, but by then our architecture was built around eventual consistency. Retrofitting transactions would have required rewriting our entire order processing pipeline. We were locked into a consistency model we never actually wanted.

The Hidden Costs of Flexibility

The document model's flexibility sounds like a feature, but we discovered it was often a liability. Without schema enforcement, the database couldn't optimize storage or indexing as effectively. Our documents varied dramatically in size—some user profiles had extensive preference objects, others were minimal.

MongoDB's memory management assumes relatively uniform document sizes within a collection. Our varying document sizes led to memory fragmentation and unpredictable performance. Queries that performed well with small documents would timeout when they happened to hit users with large preference objects.

The indexing story was similarly problematic. In a relational database, you index columns with well-defined types. In MongoDB, you index fields that may or may not exist, may have different types across documents, and may be nested at varying depths. Our index management became an ongoing project.

We discovered that 40% of our indexes were unused—they had been created for queries that no longer existed or were superseded by different access patterns. But we were afraid to drop them because our documentation of query patterns was incomplete. The unused indexes consumed RAM that could have improved query caching.

The opposite problem also occurred: queries that should have used indexes weren't because the query planner made suboptimal choices. We spent considerable time analyzing explain() output and adding query hints to force better plans. This was operational overhead that relational databases largely handled automatically.

The Operational Burden

Running MongoDB in production required specialized expertise that was harder to find than traditional DBA skills. Our operations team knew PostgreSQL administration—backup and recovery, replication, performance tuning. MongoDB's operational model was different enough that this experience didn't transfer cleanly.

Replica set management was conceptually simple but operationally complex. Elections happened more frequently than we expected, causing brief write unavailability. The oplog sizing required careful tuning—too small and secondaries couldn't catch up during maintenance, too large and we wasted SSD capacity.

Sharding was the ultimate trap. We started with a single replica set, planning to shard "when we needed to scale." When that time came, we discovered that adding sharding to an existing cluster required careful planning of shard keys, data migration, and application changes. The shard key choice was irreversible—we picked based on our 2020 access patterns, but our 2022 access patterns were completely different.

The monitoring and alerting story was equally frustrating. MongoDB's metrics were different from relational databases, so our existing monitoring infrastructure didn't apply. We built custom dashboards tracking document sizes, index usage, lock percentages, and replica lag. The observability investment was substantial.

The Migration Decision

By 2022, we had accumulated enough pain points that the engineering leadership approved a comprehensive evaluation. We analyzed our actual usage patterns and discovered something surprising: 85% of our data access was relational in nature. We were JOINing documents, filtering by multiple fields, and grouping for aggregation—exactly the patterns that SQL databases excel at.

The remaining 15% involved genuinely document-oriented access—retrieving a user with their nested preferences, storing free-form event payloads, managing configuration objects. These use cases were valid, but they didn't justify building our entire platform on a document database.

PostgreSQL 14 offered a compelling alternative: robust relational modeling for our core data with JSONB columns for genuinely flexible content. We could have the best of both worlds—transactional consistency for orders and payments, flexible documents for preferences and events.
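
A simplified sketch of that hybrid (the table and fields here are illustrative, not our production schema): relational columns carry the parts we query and constrain, while a JSONB column holds the free-form payload.

```sql
CREATE TABLE events (
    id         bigserial   PRIMARY KEY,
    user_id    bigint      NOT NULL REFERENCES users (id),
    event_type text        NOT NULL,
    payload    jsonb       NOT NULL DEFAULT '{}',   -- free-form event data
    created_at timestamptz NOT NULL DEFAULT now()
);

-- A GIN index keeps containment queries on the payload fast.
CREATE INDEX events_payload_idx ON events USING gin (payload);

-- Relational filters and JSONB filters in the same query.
SELECT user_id, payload ->> 'plan' AS plan
FROM events
WHERE event_type = 'subscription_changed'
  AND payload @> '{"source": "mobile"}';
```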

The migration planning took three months. We couldn't simply dump and restore—we needed to normalize our denormalized data, clean up field name inconsistencies, and design a proper relational schema. The schema design process was illuminating: decisions that we had deferred for years suddenly required explicit choices.

The Migration Execution

We adopted a strangler fig pattern, running both databases in parallel while gradually migrating workloads. New features were built against PostgreSQL. Existing features were migrated in dependency order—core entities first, then the services that consumed them.

The data migration scripts were substantial. Converting nested documents to relational tables required creating junction tables, foreign keys, and normalization logic. Our "flexible" user preferences object became three related tables with clear constraints.
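
To give a feel for the shape of that change, here's an illustrative (not literal) version of the normalized structure: a lookup table of known preference keys plus a junction table with real constraints.

```sql
CREATE TABLE preference_keys (
    id   serial PRIMARY KEY,
    name text   NOT NULL UNIQUE            -- e.g. 'email_digest', 'theme'
);

CREATE TABLE user_preferences (
    user_id bigint NOT NULL REFERENCES users (id) ON DELETE CASCADE,
    key_id  int    NOT NULL REFERENCES preference_keys (id),
    value   text   NOT NULL,
    PRIMARY KEY (user_id, key_id)          -- at most one value per user per key
);
```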

Some data was simply lost during migration—fields with inconsistent types, documents with conflicting reference IDs, records that violated constraints that should have existed from the beginning. We had been operating with data quality problems that the schema-less model had hidden from us.

The testing phase revealed numerous application bugs that had been masked by MongoDB's flexibility. Our code had been writing invalid data for years, and MongoDB had happily stored it. PostgreSQL's constraints immediately identified these issues, allowing us to fix the root causes rather than perpetually cleaning up downstream.

The Results

Six months after completing the migration, we conducted a comprehensive comparison:

Query Performance: Average query latency dropped by 40%. The improvement was most dramatic for complex queries involving multiple collections—what had been $lookup aggregations became simple JOINs with well-optimized execution plans.

Development Velocity: Feature development for data-intensive functionality accelerated by 60%. Engineers could write SQL they understood, use familiar tooling, and make schema changes through standard migration frameworks.

Operational Overhead: Our database-related incident rate dropped by 75%. PostgreSQL's consistency guarantees eliminated entire categories of bugs. The operations team could apply their existing expertise rather than learning a different operational model.

Cost: Our database infrastructure costs dropped by 30% despite running on similar hardware. PostgreSQL's better memory efficiency and more effective indexing meant we could serve the same workload with fewer resources.

The JSONB columns handled our genuinely flexible data beautifully. User preferences, event payloads, and configuration objects lived happily as JSON within a relational structure. We got the flexibility we actually needed without the chaos of building everything on documents.

What We Should Have Done Originally

Looking back, we made a classic architecture mistake: we chose technology based on ideology rather than requirements. The NoSQL movement told a compelling story about agility and scale, and we bought into that story without analyzing our actual needs.

Our requirements analysis should have asked different questions. What is the actual shape of our data relationships? How important is transactional consistency? What query patterns will dominate? What are our operational capabilities? What skills does our team have?

Honest answers would have pointed us toward PostgreSQL from the beginning. Our data was inherently relational—users had orders, orders had line items, products had categories. The relational model wasn't a constraint; it was a natural fit.

The "evolving schema" argument collapsed under scrutiny. We didn't need schema-less storage; we needed good migration tooling. PostgreSQL's ALTER TABLE and migration frameworks like Flyway or Alembic handle evolving schemas elegantly. The flexibility we thought we needed was really just discipline we were avoiding.

When Document Databases Make Sense

This post isn't anti-document-database in general—it's anti-choosing-without-thinking. There are legitimate use cases for document databases that we should have recognized weren't ours:

Content management systems often benefit from document storage. A blog post with embedded images, tags, and author information is genuinely document-shaped. The access pattern is primarily by document ID, and relationships between documents are loose.

Event sourcing can work well with documents. Events are append-only, self-contained, and rarely queried relationally. Storing events as documents with flexible payloads is a reasonable choice.

Caching layers often use document stores. When you're storing serialized objects for fast retrieval, the document model's flexibility is genuinely helpful.

Prototyping and rapid iteration can benefit from avoiding schema definition. If you're genuinely unsure what your data model will look like and you're willing to throw away the prototype, the schema-less approach reduces friction.

Our mistake was treating a prototyping convenience as a production architecture. Document databases can be excellent choices for specific problems. They're poor defaults for general-purpose application development.

The Broader Lesson

The database choice is not just a technical decision—it's a statement about what problems you expect to encounter. Choosing a document database says you expect flexible, evolving, document-shaped data with limited relational needs. Choosing a relational database says you expect structured, consistent, interrelated data.

Most business applications fall into the second category. Users, orders, products, inventory, payments—these entities have natural relationships that SQL databases model elegantly. The NoSQL movement's criticism of relational databases often applied to problems that most applications don't have.

We were not Google, dealing with planet-scale data volumes that exceeded any single database's capacity. We were not storing free-form content with genuinely variable structure. We were building a straightforward e-commerce platform with entirely predictable data models.

The tooling ecosystem told the same story. ORM frameworks, migration tools, query builders, and database IDEs all assumed relational databases. We fought against this ecosystem constantly, building custom tooling and workarounds for problems that didn't exist in SQL-land.

Conclusion

Our NoSQL adventure cost us years of engineering effort and millions in technical debt. The flexibility we thought we were buying turned into complexity we hadn't anticipated. The scale we thought we'd need never materialized, and the relational queries we avoided became aggregation pipelines we couldn't maintain.

PostgreSQL now runs our core platform reliably and efficiently. Our engineers write SQL instead of aggregation pipelines. Our operational burden is standard rather than specialized. Our data is consistent, queryable, and well-understood.

If you're starting a new project and considering MongoDB (or any document database) "because it's flexible," stop and analyze your actual requirements. Ask what your data really looks like, how you'll query it, and what consistency guarantees you need. There's a good chance that a relational database, perhaps with JSONB columns for genuinely flexible content, is the right choice.

The relational model exists because it solves real problems elegantly. Fifty years of database research and tooling development aren't invalidated by flexible documents. Choose your tools based on your problems, not the latest movement's marketing.

Tags: Technology, Tutorial, Guide

Written by XQA Team

Our team of experts delivers insights on technology, business, and design. We are dedicated to helping you build better products and scale your business.