March 2, 2026

We Stopped Writing Unit Tests—Integration Tests Were Better

We had 4,000 unit tests and 95% code coverage. Production still broke every week. The unit tests passed while the system failed, because they tested implementation details, not behavior. We replaced 80% of them with integration tests, and production incidents dropped 70%.


Our test suite had 4,000 unit tests. It ran in 45 seconds. Our CI pipeline showed a beautiful green checkmark on every PR. Code coverage was 95%. By every conventional metric, we had excellent test quality.

Production broke every week.

Not because of untested code. The failing code was tested. The unit tests for that code passed. They passed because they tested the wrong things: implementation details, mock interactions, internal state transformations. They verified that individual functions did what we told them to do. They never verified that the system actually worked.

We deleted 3,200 unit tests and replaced them with 400 integration tests. Production incidents dropped 70%. Here is why the testing pyramid is wrong for most applications.

The Problem with Unit Tests

They test implementation, not behavior.

A typical unit test looks like this: given this input, this function should return this output. This is useful when the function's logic is complex and independent. But most functions in a web application are not complex and independent. They are thin layers that coordinate between other components: validate input, call a service, transform the response, return the result.

When you unit test a controller, you mock the service layer. When you unit test the service layer, you mock the repository. When you unit test the repository, you mock the database. At the end, you have tests that verify each layer calls the next layer with the correct arguments. You have not verified that the system actually processes a request correctly.
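Here is a minimal sketch of the pattern described above, with hypothetical names (`create_user`, `service.create`): a thin controller unit-tested against a mocked service layer. The test verifies the call chain, nothing more.

```python
from unittest.mock import Mock

# Hypothetical controller: a thin layer that validates input and
# delegates to a service -- the kind of code described above.
def create_user(service, payload):
    if "email" not in payload:
        raise ValueError("email is required")
    return service.create(payload["email"])

# The unit test mocks the service and only checks the interaction.
service = Mock()
service.create.return_value = {"id": 1, "email": "a@example.com"}

result = create_user(service, {"email": "a@example.com"})

# This proves the controller called the next layer with the right
# arguments -- not that a user actually ends up in a database.
service.create.assert_called_once_with("a@example.com")
assert result == {"id": 1, "email": "a@example.com"}
```

The test passes even if the real service, repository, and database are all broken, because none of them are in the picture.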

They couple tests to implementation.

When you refactor code, unit tests break — even when behavior is unchanged. Move a function to a different module? Tests break. Change a method signature? Tests break. Extract a helper function? Tests break. Rename an internal variable? Tests might break.

This coupling creates a perverse incentive: engineers avoid refactoring because it means rewriting tests. The test suite, instead of enabling confident change, becomes a barrier to change. The code ossifies because the tests hold it in place.

Our codebase had entire modules that nobody would touch because "the tests are too fragile." The tests were supposed to give us confidence to change code. Instead, they gave us fear.

Mocks hide real failures.

Every mock is a lie. When you mock a database call, you assume the real database will behave like your mock. When you mock an HTTP client, you assume the real API will return what your mock returns. These assumptions are often wrong.

We had a unit test that verified our payment processing function handled successful payments correctly. The test mocked the payment gateway client and returned a success response. The test passed for two years. During that time, the actual payment gateway changed their response format three times. Our code broke in production each time, but the unit test never caught it because the mock was frozen in time.
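The failure mode can be sketched in a few lines. The function and response shapes below are illustrative, not our actual gateway integration: code written against the original response format keeps passing against a frozen mock while the real API has moved on.

```python
# Hypothetical payment handler, written against the gateway's
# original response format: {"status": ..., "charge_id": ...}
def record_payment(response):
    if response["status"] == "success":
        return {"paid": True, "charge": response["charge_id"]}
    return {"paid": False, "charge": None}

# The unit test's mock, frozen at the old format. It passes forever.
mocked = {"status": "success", "charge_id": "ch_123"}
assert record_payment(mocked) == {"paid": True, "charge": "ch_123"}

# Meanwhile the real gateway has moved to a nested format. The mock
# never learns this, so only production discovers the KeyError.
real = {"result": {"state": "succeeded", "id": "ch_123"}}
try:
    record_payment(real)
    failure = None
except KeyError as exc:
    failure = exc
assert failure is not None
```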

The Integration Test Alternative

An integration test exercises the real system. It sends an HTTP request to the API, which hits the real controller, which calls the real service, which queries the real database, which returns real data. If any layer is broken, the test fails.

Our integration test setup:

  • A test database (Postgres in Docker) with known seed data, created fresh for each test suite.
  • The actual application server running in test mode.
  • Real HTTP requests via a test client.
  • External services (payment gateways, email providers) replaced with contract-verified fakes — not mocks, but lightweight implementations that honor the same API contract.
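The last bullet is the subtle one, so here is a sketch of what we mean by a contract-verified fake, with illustrative names (`FakePaymentGateway`, `verify_gateway_contract`): a small real implementation of the client's interface, plus a contract check that can be run against both the fake and the real client so the two cannot silently drift apart.

```python
# A lightweight fake for a payment gateway: a working implementation
# of the client interface, not a per-test mock. Names are illustrative.
class FakePaymentGateway:
    def __init__(self):
        self.charges = {}
        self._next_id = 0

    def charge(self, amount_cents, currency):
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        self._next_id += 1
        charge_id = f"ch_{self._next_id}"
        self.charges[charge_id] = {"amount": amount_cents, "currency": currency}
        return {"id": charge_id, "status": "succeeded"}

    def refund(self, charge_id):
        if charge_id not in self.charges:
            raise KeyError(charge_id)
        return {"id": charge_id, "status": "refunded"}

# The contract check: the same assertions run against the fake in CI
# and against the real client in a scheduled verification job.
def verify_gateway_contract(gateway):
    receipt = gateway.charge(500, "usd")
    assert receipt["status"] == "succeeded"
    assert gateway.refund(receipt["id"])["status"] == "refunded"

verify_gateway_contract(FakePaymentGateway())
```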

What an integration test looks like:

Instead of testing "the UserService.createUser function calls UserRepository.save with the correct arguments," our integration test says: "POST /api/users with this JSON body returns a 201 response with a user object, and a subsequent GET /api/users/{id} returns the same user."

This test verifies the entire request lifecycle: routing, validation, business logic, database persistence, and serialization. If any component breaks, the test fails. And it tests behavior (creating and retrieving a user), not implementation (which functions call which other functions).
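As a runnable sketch of that behavioral style: in the real suite the test client sends HTTP requests to the running application, but here a minimal in-memory stand-in (`UsersApp`, an invented class) keeps the example self-contained. Note the test speaks only in requests and responses.

```python
# A minimal in-memory stand-in for the application under test.
# In the real suite, these calls are HTTP requests to a live server.
class UsersApp:
    def __init__(self):
        self.users = {}
        self._next_id = 0

    def post_users(self, body):  # POST /api/users
        if not body.get("email"):
            return 400, {"error": "email is required"}
        self._next_id += 1
        user = {"id": self._next_id, "email": body["email"]}
        self.users[user["id"]] = user
        return 201, user

    def get_user(self, user_id):  # GET /api/users/{id}
        user = self.users.get(user_id)
        return (200, user) if user else (404, None)

# The test: create a user, then read it back. Behavior, not wiring.
app = UsersApp()
status, created = app.post_users({"email": "ana@example.com"})
assert status == 201

status, fetched = app.get_user(created["id"])
assert status == 200 and fetched == created
```

A refactor can reshuffle every internal layer; as long as POST-then-GET still round-trips a user, this test stays green.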

But Integration Tests Are Slow

This is the primary objection. Unit tests run in milliseconds. Integration tests that hit databases and HTTP endpoints take seconds.

Our answer: so what?

Our 4,000 unit tests ran in 45 seconds. Our 400 integration tests run in 3 minutes. The integration tests take 4x longer, but they catch 10x more bugs. The ROI is dramatically better.

Moreover, 3 minutes is fast enough for CI. Engineers push code, CI runs the integration tests, and results are available before the code review is finished. The extra 2 minutes and 15 seconds have never been a bottleneck.

For local development, we use a watch mode that only runs integration tests affected by changed files. This typically runs 5-20 tests in under 10 seconds. Fast enough for the feedback loop.
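The selection logic behind that watch mode can be sketched simply. This is an assumed, simplified structure (a hand-written dependency map); in practice the map is derived from each test's imports.

```python
# Map each test to the source modules it exercises. In practice this
# map is computed from imports; here it is hard-coded for illustration.
TEST_DEPS = {
    "test_users.py": {"users/api.py", "users/store.py"},
    "test_payments.py": {"payments/api.py", "payments/gateway.py"},
    "test_emails.py": {"emails/sender.py"},
}

def affected_tests(changed_files):
    """Return only the tests whose dependencies touch a changed file."""
    changed = set(changed_files)
    return sorted(t for t, deps in TEST_DEPS.items() if deps & changed)

assert affected_tests(["users/store.py"]) == ["test_users.py"]
assert affected_tests(["emails/sender.py", "payments/api.py"]) == [
    "test_emails.py",
    "test_payments.py",
]
```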

The Coverage Myth

Our unit test coverage was 95%. Our integration test coverage is 72%. By the coverage metric, we regressed.

But coverage measures lines executed, not behavior verified. A unit test that mocks everything can achieve 100% coverage while verifying nothing about system behavior. Our 72% integration test coverage exercises less code but verifies more behavior. The uncovered 28% is mostly error handling and edge cases that we cover with a small number of targeted unit tests.

We kept unit tests for exactly two categories:

  • Pure functions with complex logic: Algorithms, calculations, data transformations. These are genuinely independent and benefit from exhaustive input/output testing.
  • Edge cases and error paths: Specific failure modes that are expensive to reproduce in integration tests (network timeouts, race conditions, malformed data).

Everything else — request handling, business workflows, data persistence, inter-service communication — is tested through integration tests.
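For the first category, here is the kind of function that still earns a unit test. `prorate` is a hypothetical example: pure, independent logic where exhaustive input/output checks are cheap and meaningful.

```python
# A pure function with real logic: split an amount across n periods
# in whole cents, spreading the remainder over the first periods.
def prorate(amount_cents, periods):
    if periods <= 0:
        raise ValueError("periods must be positive")
    base, remainder = divmod(amount_cents, periods)
    return [base + (1 if i < remainder else 0) for i in range(periods)]

# Exhaustive input/output testing: no mocks, no setup, no fragility.
assert prorate(100, 3) == [34, 33, 33]
assert prorate(0, 2) == [0, 0]
assert sum(prorate(999, 7)) == 999
```

No integration test would be a better fit here: the function has no collaborators to integrate with.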

Refactoring Became Easy

The most dramatic improvement was refactoring velocity. When tests verify behavior rather than implementation, you can completely restructure your code without touching a single test. The tests answer: "Does the system still do what it's supposed to do?" They don't care how.

We refactored our entire data access layer from raw SQL to an ORM. Zero integration tests changed. We extracted three microservices from a monolith. The integration tests caught two bugs in the extraction that would have gone to production.

Compare this to the unit test world, where the same refactoring would have required rewriting hundreds of tests — tests that were testing the old implementation, not the behavior that needed to be preserved.

Conclusion

The testing pyramid (many unit tests, fewer integration tests, even fewer E2E tests) was designed for a world where integration tests were expensive and slow. Modern tooling has made integration tests fast and cheap. The pyramid should be inverted for most web applications: many integration tests, fewer unit tests, minimal E2E tests.

Don't test that your code does what you told it to do. Test that your system does what your users need it to do. The difference between these two is the difference between a passing test suite and a working product.

Tags: Technology, Tutorial, Guide

Written by XQA Team

Our team of experts delivers insights on technology, business, and design. We are dedicated to helping you build better products and scale your business.