Technology
January 22, 2026

Why We Stopped 'Serverless-First'. Cold Starts Killed User Experience.

We went all-in on Lambda. P50 latency: 80ms. P99 latency: 4.2 seconds. Users complained. We moved critical paths to containers and kept Lambda for async work.

"Pay only for what you use. Infinite scale. No servers to manage." The serverless pitch was irresistible. We rebuilt everything on AWS Lambda.

For async workloads, it was perfect. For user-facing APIs, it was a disaster.

The Numbers Nobody Warned Us About

| Metric | Value |
| --- | --- |
| P50 latency | 80ms |
| P95 latency | 450ms |
| P99 latency | 4,200ms |
| Cold start frequency | ~2% of requests |
| Cold start penalty | 3-5 seconds |
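That P99 figure falls straight out of the cold-start mix: when ~2% of requests pay a 3-5 second penalty, the 99th percentile lands inside the cold-start band. A quick simulation (synthetic latencies, not our production data) shows the mechanism:

```python
import random

random.seed(42)

# Synthetic request latencies: 98% warm (~110ms typical), 2% cold (+3-5s).
# All numbers are illustrative, not production measurements.
latencies = []
for _ in range(100_000):
    warm = max(random.gauss(mu=110, sigma=60), 20)  # warm path, ms
    if random.random() < 0.02:
        latencies.append(warm + random.uniform(3000, 5000))  # cold start
    else:
        latencies.append(warm)

latencies.sort()
p50 = latencies[len(latencies) // 2]
p99 = latencies[int(len(latencies) * 0.99)]
print(f"P50 ~ {p50:.0f}ms, P99 ~ {p99:.0f}ms")
```

The median barely notices cold starts, which is exactly why dashboards tracking only P50 told us everything was fine.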

Roughly 2% of requests took 4+ seconds. Because cold starts hit at random, nearly every active user eventually drew one: a terrible experience, delivered unpredictably. They thought our product was broken.

The Provisioned Concurrency Trap

"Just use provisioned concurrency!" AWS's solution to cold starts.

We tried it. Problems:

  • Cost exploded: Provisioned concurrency charges whether used or not. Our bill tripled.
  • Capacity planning returned: How many to provision? Too few = cold starts return. Too many = paying for idle.
  • Traffic spikes: Provisioned concurrency doesn't auto-scale fast enough for spikes.
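Back-of-the-envelope math shows why the bill moved. The rates below approximate AWS's published per-GB-second prices but are placeholders, not a pricing reference:

```python
# Rough Lambda cost model (rates approximate AWS list prices; verify yours).
ON_DEMAND_RATE = 0.0000166667    # $/GB-second, on-demand execution
PROVISIONED_RATE = 0.0000041667  # $/GB-second, provisioned-concurrency idle

def monthly_cost(memory_gb, provisioned_instances, req_per_sec, avg_duration_s):
    seconds = 30 * 24 * 3600
    # Provisioned capacity is billed every second it exists, used or not.
    idle = provisioned_instances * memory_gb * seconds * PROVISIONED_RATE
    # Execution time is billed on top of the idle charge.
    execution = req_per_sec * seconds * avg_duration_s * memory_gb * ON_DEMAND_RATE
    return idle + execution
```

Provision for a traffic peak (say 100 concurrent executions) while averaging 2 requests/second, and the idle charge dwarfs the execution charge; that is the "pay for what you might use" trap.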

We'd traded "pay for what you use" for "pay for what you might use plus cold start lottery."

The Hidden Complexity

Connection management: Each Lambda invocation opens fresh connections. Database connection pools exploded. We added RDS Proxy. More cost, more latency.
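The standard mitigation, creating the connection outside the handler so warm invocations reuse it, helps but does not solve the problem: every concurrent execution environment still holds its own connection, so N concurrent Lambdas means N database connections. A sketch of the pattern (the connection object here is a stand-in for something like a real database client):

```python
# Reuse a connection across warm invocations by creating it at module scope.
# Each concurrent Lambda execution environment still gets its own copy, so
# high concurrency still exhausts the database's connection limit.

_connection = None
connect_calls = 0

def get_connection():
    """Stand-in for a real client constructor; counts actual connection setups."""
    global _connection, connect_calls
    if _connection is None:
        connect_calls += 1
        _connection = object()  # placeholder for a real DB connection
    return _connection

def handler(event, context):
    conn = get_connection()  # warm invocations reuse the module-level connection
    return {"statusCode": 200, "reused": conn is _connection}

# Three warm invocations in the same environment open only one connection.
for _ in range(3):
    handler({}, None)
print(connect_calls)  # 1
```

This is why RDS Proxy exists: it pools connections on the far side of Lambda's concurrency model, at the cost of another hop and another line item.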

Local development: Lambda locally is different from Lambda in AWS. SAM and LocalStack helped but never matched production behavior exactly.

Debugging: Distributed traces across dozens of Lambdas. No persistent logs. X-Ray helped but added latency and cost.

Deployment complexity: Each function deployed separately. Coordinating deployments across 40 functions was its own project.

What We Do Now: Hybrid Architecture

User-facing APIs: ECS Fargate containers. Always warm. Predictable latency. Connection pooling that works.

Async processing: Lambda. Perfect fit. Event-driven, sporadic traffic, cold starts don't matter.

Scheduled jobs: Lambda. Run once, done. Cold start is irrelevant.

Webhooks: Lambda behind API Gateway. Traffic is occasional and callers tolerate the latency.
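A minimal sketch of the kind of webhook handler we keep on Lambda (event shape follows API Gateway's proxy integration; the payload fields are made up for illustration):

```python
import json

def handler(event, context):
    # API Gateway (proxy integration) delivers the raw body as a string.
    try:
        payload = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}

    # Acknowledge immediately; heavy work belongs on a queue. A cold start
    # here only delays the ACK by a few seconds -- acceptable for webhooks.
    return {
        "statusCode": 202,
        "body": json.dumps({"received": payload.get("type", "unknown")}),
    }

resp = handler({"body": json.dumps({"type": "invoice.paid"})}, None)
print(resp["statusCode"])  # 202
```

The latency tolerance is the whole point: the sender retries on failure and nobody is staring at a spinner, so the cold-start lottery that wrecked our user-facing APIs is harmless here.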

The New Metrics

| Metric | Before (Lambda) | After (Hybrid) |
| --- | --- | --- |
| P50 latency | 80ms | 45ms |
| P99 latency | 4,200ms | 180ms |
| Monthly cost | $8,000 | $5,500 |
| User complaints about slowness | Weekly | None |

Lower latency AND lower cost. The "serverless premium" was real.

When Serverless Wins

  • Truly sporadic workloads (occasional scripts, scheduled reports)
  • Event-driven backends (S3 triggers, queue processors)
  • Prototype/early stage (before latency matters)
  • Massive scale-to-zero requirements (multi-tenant with idle tenants)
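Our hybrid split reduces to a lookup you could write down. Encoded as a function (the workload categories are this post's taxonomy, not a universal rule):

```python
def choose_compute(workload: str) -> str:
    """Map a workload type from this post to our compute choice."""
    lambda_fit = {
        "async processing",   # event-driven, sporadic; cold starts don't matter
        "scheduled job",      # runs once, done
        "webhook",            # occasional traffic, latency-tolerant callers
        "sporadic script",    # scale-to-zero is the whole value
    }
    if workload in lambda_fit:
        return "lambda"
    if workload == "user-facing api":
        return "ecs fargate"  # always warm, predictable P99
    raise ValueError(f"unknown workload: {workload}")

print(choose_compute("user-facing api"))  # ecs fargate
print(choose_compute("webhook"))          # lambda
```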

Serverless isn't wrong. "Serverless-first" is. Let the workload dictate the architecture, not the hype.

Tags: Technology, Tutorial, Guide

Written by XQA Team
