
We had Redis everywhere. Session storage. API response caching. Rate limiting. Pub/sub for real-time notifications. Job queues. Feature flags. If there was a problem, someone would say "just throw it in Redis" and nobody would question it.
Then Redis went down on a Friday evening. Our session store disappeared. Every user was logged out simultaneously. Our rate limiter stopped working, so a misbehaving client hammered our API with 50,000 requests per minute. Our job queue vanished, losing 3 hours of background processing work.
One service failure cascaded into five. All because we had put too many eggs in the Redis basket without thinking about what would happen when the basket broke.
We spent the next quarter migrating every Redis use case back to Postgres. It was the best infrastructure decision we ever made.
The Cache Invalidation Nightmare
There are only two hard things in computer science: cache invalidation and naming things. We had solved neither, but cache invalidation was actively costing us money.
Our API caching layer worked like this: on every read, check Redis first. If the data exists and hasn't expired, return it. If not, query Postgres, store the result in Redis with a TTL, and return it.
Simple, right? Until you need to invalidate the cache.
The Stale Data Problem: A user updates their profile. The profile is cached in Redis with a 5-minute TTL. For the next 5 minutes, the user sees their old profile. They refresh. Still old. They contact support. "Your website is broken."
The "fix" was to invalidate the cache on every write. But our data model had relationships. Updating a user's role affected their permissions, which affected every API response that included permission checks. We needed to invalidate dozens of cache keys for a single database write.
We built an invalidation system. It was 2,000 lines of code dedicated entirely to figuring out which cache keys needed to be deleted when data changed. It had bugs. Of course it had bugs. Cache invalidation code always has bugs because the problem is fundamentally about maintaining consistency between two data stores, which is a distributed systems problem disguised as a performance optimization.
The Thundering Herd: When a popular cache key expired, hundreds of concurrent requests would simultaneously miss the cache and hit Postgres. This is the thundering herd problem. We implemented cache stampede protection with mutex locks in Redis. More complexity. More code. More bugs.
After removing Redis, we had zero cache invalidation bugs. Zero. Because there was no cache to invalidate. Postgres served the data directly, always fresh, always consistent.
Postgres Is Faster Than You Think
The assumption behind using Redis is that Postgres is "too slow" for certain workloads. This assumption is usually wrong.
With proper indexing: A well-indexed Postgres query returns results in 1-5ms. For comparison, a Redis GET returns in 0.1-0.5ms. The difference is 1-4ms. For most web applications, this difference is invisible to the user because network latency (50-200ms) dominates the request lifecycle.
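A minimal sketch of what "well-indexed" means in practice (table and column names are illustrative, not an actual schema):

```sql
-- Hypothetical example: index the column your hot query filters on.
CREATE INDEX idx_users_email ON users (email);

-- With the index in place, this becomes an index scan rather than a
-- sequential scan -- typically single-digit milliseconds.
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'alice@example.com';
```

EXPLAIN ANALYZE is the tool for verifying the planner actually uses the index before reaching for a cache.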
We were adding 2,000 lines of caching infrastructure to save 3ms per request. The engineering cost far exceeded the performance benefit.
With connection pooling (PgBouncer): One of Postgres's biggest scaling costs is per-connection overhead — each connection is a separate backend process consuming roughly 10MB of RAM. With PgBouncer in transaction pooling mode, we reduced our connection count from 500 to 50 while handling the same throughput.
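A minimal pgbouncer.ini sketch for transaction pooling — the values here are illustrative, not a recommended production configuration:

```ini
[databases]
app = host=127.0.0.1 port=5432 dbname=app

[pgbouncer]
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction   ; release the server connection at transaction end
max_client_conn = 500     ; clients the pooler will accept
default_pool_size = 50    ; actual Postgres connections per database/user pair
```

Note that transaction pooling breaks session-level features (prepared statements, advisory locks held across transactions), which is worth checking before switching modes.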
With UNLOGGED tables: For truly ephemeral data (sessions, rate-limiting counters), Postgres UNLOGGED tables skip write-ahead logging. This makes writes 2-3x faster at the cost of the table being truncated after a crash — a trade-off comparable to Redis's default snapshot persistence, which also loses recent writes on a crash.
With materialized views: For expensive aggregation queries that we previously cached in Redis, we used materialized views that refresh on a schedule. The data is pre-computed, stored in Postgres, and queryable like a regular table. No separate cache layer needed.
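A sketch of the materialized-view pattern — the table and aggregation here are illustrative:

```sql
-- Precompute an expensive aggregate once, instead of on every request.
CREATE MATERIALIZED VIEW daily_order_totals AS
SELECT order_date, SUM(amount) AS total
FROM orders
GROUP BY order_date;

-- A unique index is required for CONCURRENTLY, which lets reads
-- continue while the view is being refreshed.
CREATE UNIQUE INDEX ON daily_order_totals (order_date);

-- Run on a schedule (cron, pg_cron, etc.):
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_order_totals;

-- Queried like any other table:
SELECT total FROM daily_order_totals WHERE order_date = CURRENT_DATE;
```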
Replacing Every Redis Use Case
Sessions: Moved to Postgres with an UNLOGGED table. Session reads are a simple primary key lookup — sub-millisecond with an index. We added a periodic cleanup job to delete expired sessions.
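A sketch of what that session store can look like — the schema is illustrative:

```sql
-- UNLOGGED: no WAL, faster writes, contents lost on crash -- acceptable
-- for sessions, where the worst case is users logging in again.
CREATE UNLOGGED TABLE sessions (
    id         text PRIMARY KEY,        -- session token
    user_id    bigint NOT NULL,
    data       jsonb,
    expires_at timestamptz NOT NULL
);

-- Read path: a primary-key lookup, ignoring expired rows.
SELECT data FROM sessions WHERE id = $1 AND expires_at > now();

-- Periodic cleanup job:
DELETE FROM sessions WHERE expires_at <= now();
```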
Rate Limiting: Implemented with a single Postgres query that counts requests per IP over the last N seconds. With a partial index on recent timestamps, the query runs in under 2ms even with millions of rows.
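A sketch of this rate-limiting check — table, limits, and index cutoff are illustrative:

```sql
CREATE TABLE requests (
    ip         inet NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);

-- Partial index keeps the scanned portion of the table small. The
-- predicate must be a constant (now() is not allowed here), so the
-- index is periodically recreated with a newer cutoff, or the table
-- is partitioned by time instead.
CREATE INDEX idx_requests_recent ON requests (ip, created_at)
    WHERE created_at > '2024-01-01';

-- Allow the request only if fewer than 100 in the last 60 seconds:
SELECT count(*) < 100 AS allowed
FROM requests
WHERE ip = $1 AND created_at > now() - interval '60 seconds';
```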
Job Queues: Replaced with SKIP LOCKED, a Postgres feature designed exactly for this use case. Workers SELECT and lock rows atomically, process them, then delete. No separate queue infrastructure needed. If the worker crashes, the lock is released automatically.
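A sketch of a SKIP LOCKED worker loop — the schema is illustrative:

```sql
CREATE TABLE jobs (
    id      bigserial PRIMARY KEY,
    payload jsonb NOT NULL
);

-- Each worker claims one unclaimed job inside a transaction:
BEGIN;
SELECT id, payload
FROM jobs
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;   -- rows locked by other workers are skipped

-- ... process the job, then remove it:
DELETE FROM jobs WHERE id = $1;
COMMIT;
```

Row locks are tied to the transaction, so if a worker dies before COMMIT, Postgres releases the lock and another worker picks the job up — this is the automatic release the article refers to.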
Pub/Sub: Replaced with Postgres LISTEN/NOTIFY for simple real-time notifications. For higher throughput, we used logical replication to stream changes to consumers. Both are built into Postgres.
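A minimal LISTEN/NOTIFY sketch — the channel name and payload are illustrative:

```sql
-- Consumer session: subscribe to a channel.
LISTEN order_events;

-- Producer session: publish a payload (limited to roughly 8000 bytes).
NOTIFY order_events, '{"order_id": 42, "status": "paid"}';

-- Equivalent function form, usable from triggers or application code:
SELECT pg_notify('order_events', '{"order_id": 42, "status": "paid"}');
```

Notifications are delivered only to currently connected listeners; a consumer that is offline when the NOTIFY fires misses it, which is one reason the article mentions logical replication for higher-stakes streams.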
Feature Flags: A simple table with a flag name and a boolean, read on every request via a materialized view refreshed every second. Total query time: 0.3ms.
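A sketch of that flag table — names are illustrative:

```sql
CREATE TABLE feature_flags (
    name    text PRIMARY KEY,
    enabled boolean NOT NULL DEFAULT false
);

INSERT INTO feature_flags (name, enabled) VALUES ('new_checkout', true);

-- Per-request check (or against a materialized view of this table,
-- refreshed on a short schedule, to decouple reads from writes):
SELECT enabled FROM feature_flags WHERE name = 'new_checkout';
```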
API Response Caching: We eliminated this entirely. With proper indexing and query optimization, our API responses were fast enough without caching. The 3ms we "saved" with Redis was not worth the complexity.
The Operational Simplification
Running Redis in production is not free. It requires:
- Monitoring: Memory usage, eviction rates, connection counts, replication lag.
- Persistence configuration: RDB snapshots vs AOF logging vs hybrid. Each has trade-offs that affect data durability and performance.
- Cluster management: Redis Cluster for horizontal scaling, with all its partition and failover complexity.
- Memory management: Redis stores everything in RAM. When you run out of memory, Redis starts evicting keys based on a policy you probably haven't thought about carefully.
- Backup and recovery: A separate backup strategy from your database backups.
By removing Redis, we eliminated an entire class of operational concerns. Our infrastructure team now manages one data store instead of two. Our backup strategy is simpler. Our disaster recovery is simpler. Our monitoring dashboards have fewer panels.
Infrastructure complexity is a tax. Every additional service multiplies the number of failure modes, the number of things to monitor, the number of things to upgrade, and the number of things that can go wrong at 3am.
When Redis Actually Makes Sense
Redis is a great tool. It has legitimate use cases:
- Sub-millisecond latency requirements: If you genuinely need responses in under 1ms (high-frequency trading, real-time bidding), Redis is appropriate.
- Massive throughput: If you need millions of operations per second on a single node, Redis's in-memory design can sustain workloads Postgres cannot.
- Complex data structures: Redis's sorted sets, HyperLogLog, and Streams are genuinely useful for specific algorithms that would be awkward to implement in SQL.
- Shared state across services: If multiple services need to share ephemeral state without a database, Redis can serve as a lightweight coordination layer.
For most web applications, none of these apply. Your API does not need sub-millisecond responses. Your throughput is measured in hundreds of requests per second, not millions. Your data structures are rows and columns.
Conclusion
We used Redis because it was the default answer to performance questions. "Slow query? Cache it in Redis." This is cargo cult engineering — applying a solution because everyone else does, not because the problem requires it.
Postgres, properly configured, is remarkably fast. Fast enough for 95% of web application workloads. And it gives you something Redis cannot: consistency, durability, SQL, joins, transactions, and a single source of truth.
Before adding Redis to your stack, ask: have I actually optimized Postgres? Have I added proper indexes? Have I configured connection pooling? Have I tried UNLOGGED tables? If the answer to any of these is "no," you don't need Redis. You need better Postgres.
Written by XQA Team