How to Decide What to Test

A thought process for choosing the right tests for a feature

Most arguments about testing are really arguments about vocabulary. Two engineers say "unit test" and mean completely different things, and neither realizes it. The fix is to separate two ideas that the common terms smash together: a test's purpose and its boundary.

Purpose vs boundary

A test's purpose is the question it answers — does this logic produce the right output, do these two systems agree on a contract, does the whole feature hang together. A test's boundary is the slice of the system it actually executes — one function, one component, one service, the full stack.

The words "unit" and "integration" are slippery because each can describe either axis. "Unit" might mean "tests one unit of behavior" (purpose) or "runs inside a single module with nothing real around it" (boundary). Once you notice this, a lot of confusion dissolves. I find it more useful to talk about four kinds of tests by their role.

Four kinds of tests by role

End-to-end tests exercise the happy paths through the real system. They are a sanity check that the application works, and they double as proof that all the pieces fit together.
Integration tests verify that two parts collaborate correctly — your code against the database, against another service, against a library — using the real dependency or a realistic stand-in for it.
Unit tests check isolated input-to-output with no live dependencies. You mock external APIs, the database, and any internal code the unit leans on, so the only thing under test is the logic itself.
Collaboration tests verify the interactions between components, or between concurrent tasks — you mock the collaborators and assert on how they are called.
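
To make the unit and integration roles concrete, here is a minimal sketch in Python. The names `apply_discount` and `OrderStore` are hypothetical, and the in-memory SQLite database is only one possible "realistic stand-in" for a real dependency; treat this as an illustration of the two roles, not a prescribed structure.

```python
import sqlite3


def apply_discount(total_cents: int, percent: int) -> int:
    """Pure logic: no database, no network. A natural unit-test target."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return total_cents - (total_cents * percent) // 100


class OrderStore:
    """Hypothetical persistence layer that runs SQL against a real engine."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS orders "
            "(id INTEGER PRIMARY KEY, total_cents INTEGER)"
        )

    def save(self, total_cents: int) -> int:
        cur = self.conn.execute(
            "INSERT INTO orders (total_cents) VALUES (?)", (total_cents,)
        )
        self.conn.commit()
        return cur.lastrowid

    def load(self, order_id: int) -> int:
        row = self.conn.execute(
            "SELECT total_cents FROM orders WHERE id = ?", (order_id,)
        ).fetchone()
        return row[0]


def test_unit_apply_discount() -> None:
    # Unit test: input in, output out, nothing live involved.
    assert apply_discount(10_000, 15) == 8_500


def test_integration_store_round_trip() -> None:
    # Integration test: our SQL meets a real database engine
    # (in-memory SQLite standing in for the production database).
    store = OrderStore(sqlite3.connect(":memory:"))
    order_id = store.save(apply_discount(10_000, 15))
    assert store.load(order_id) == 8_500


# Run directly; a runner such as pytest would also collect these.
test_unit_apply_discount()
test_integration_store_round_trip()
```

The unit test would still pass if the SQL were wrong, and the integration test would still pass if the discount rule changed in a way both sides agreed on. That is the sense in which each kind answers a different question.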

A step-by-step way to decide

When I pick up a feature, I walk through the same sequence rather than guessing at a coverage number.

1. Identify the user's expectations: what they expect to be true when this works.
2. Put those expectations in the order you will build them.
3. Determine the happy paths. The number of distinct happy paths drives how many end-to-end tests you need.
4. Take the first expectation and determine what is needed to satisfy it.
5. Determine where that thing comes from.
6. Trace the data flow to get it. For example: the frontend requests it from the backend, the backend reads it from the database, the backend returns it, the frontend renders it.
7. List the systems involved along that path: frontend, backend, database.
8. Is there real logic in producing the result? If yes, that logic earns unit tests. If no, it needs none.
9. Are the systems dependent on one another? If yes, that communication earns integration tests.
10. Is the feature doing more than one thing at once? If yes, it earns collaboration tests.
11. Tally up which tests you actually need, and write those.

This turns "what should I test?" from a vibe into a short walk down the data path.

Cost versus confidence

Each level buys you a different deal:

Acceptance / end-to-end gives excellent confidence but is slow and expensive to run and maintain.
Integration gives great confidence at moderate cost and moderate speed.
Unit gives narrower confidence but is cheap to write and very fast to run.

A few rules of thumb I lean on:

If your higher-level tests are fast, reliable, and cheap to change, you may not need lower-level ones underneath them.
If those higher-level tests are slow, brittle, or expensive to change, you do need the cheaper ones.
Some redundancy across levels is healthy, not waste. When the same behavior fails both a high-level and a low-level test, the low-level failure tells you where it broke.
Mock network calls in unit and integration tests.
Mock third-party services even in end-to-end tests — you do not want a vendor outage failing your build.
Testing at a higher level makes refactoring easier, because those tests bind to behavior rather than to internal structure.
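
As one way to keep the network out of a test, the sketch below passes the network-touching function in as a parameter so a test can swap it for a `unittest.mock.Mock`. The names `fetch_rate` and `price_in`, and the placeholder URL, are hypothetical; patching or a fake server would serve the same purpose.

```python
import json
from unittest.mock import Mock
from urllib.request import urlopen


def fetch_rate(currency: str) -> float:
    """Hypothetical call to a third-party rates API (placeholder URL)."""
    with urlopen(f"https://rates.example.invalid/{currency}") as resp:
        return json.load(resp)["rate"]


def price_in(currency: str, usd_cents: int, get_rate=fetch_rate) -> int:
    """Convert a price; the rate lookup is injectable so tests can replace it."""
    return round(usd_cents * get_rate(currency))


def test_price_in_without_network() -> None:
    # The mock stands in for the vendor, so no request leaves the machine.
    fake_rate = Mock(return_value=0.5)
    assert price_in("EUR", 1_000, get_rate=fake_rate) == 500
    fake_rate.assert_called_once_with("EUR")


test_price_in_without_network()
```

Because the test never reaches the vendor, an outage on their side cannot fail the build; whether the real API still behaves as expected is a separate question for a separate check.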

A concrete example: a React app

For a typical React frontend I split it like this:

Integration-test components and the router, where pieces collaborate and rendering meets state.
Unit-test the utilities, custom hooks, and state logic — reducers, selectors, thunks — where the real logic lives.
Reserve end-to-end for happy-path sanity checks across the whole app.

That keeps the fast tests where the logic is and the slow tests where the integration risk is, which is exactly the trade-off the cost-versus-confidence comparison above points at.
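
State logic is cheap to unit-test because a reducer is a pure function of (state, action). The sketch below is written in Python only to stay consistent with the other examples here; in a React codebase the same function would be JavaScript or TypeScript. The `cart_reducer` name and the action shapes are hypothetical.

```python
def cart_reducer(state: dict, action: dict) -> dict:
    """Return a new state for each action; never mutate the one passed in."""
    if action["type"] == "item_added":
        items = {**state["items"]}
        items[action["sku"]] = items.get(action["sku"], 0) + 1
        return {**state, "items": items}
    if action["type"] == "cart_cleared":
        return {**state, "items": {}}
    return state  # unknown actions leave the state untouched


def test_adding_same_item_twice_increments_quantity() -> None:
    state = {"items": {}}
    state = cart_reducer(state, {"type": "item_added", "sku": "A1"})
    state = cart_reducer(state, {"type": "item_added", "sku": "A1"})
    assert state == {"items": {"A1": 2}}


def test_reducer_does_not_mutate_input() -> None:
    before = {"items": {"A1": 1}}
    cart_reducer(before, {"type": "item_added", "sku": "A1"})
    assert before == {"items": {"A1": 1}}


test_adding_same_item_twice_increments_quantity()
test_reducer_does_not_mutate_input()
```

No rendering, router, or store setup is involved, which is why these tests stay fast.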

A note on the two schools

You will also run into two philosophies of how unit tests assert. State-based testing (often called Detroit or Chicago style) sets up inputs, runs the code, and checks the resulting state or return value. Behavior-based or interaction testing (the London or mockist style) uses mocks and asserts on how collaborators were called. Neither is wrong; state-based tests couple less to implementation, while interaction tests shine for the collaboration tests above. Knowing which school a test belongs to is just one more way of being clear about its purpose.
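
The difference is easiest to see when both styles test the same code. In this minimal sketch, `welcome` and `Inbox` are hypothetical names; the first test checks resulting state, the second uses `unittest.mock.Mock` and asserts on the call itself.

```python
from unittest.mock import Mock


class Inbox:
    """Simple in-memory collaborator used by the state-based test."""

    def __init__(self) -> None:
        self.messages = []

    def deliver(self, message: str) -> None:
        self.messages.append(message)


def welcome(user_name: str, inbox) -> None:
    """Code under test: sends one welcome message through its collaborator."""
    inbox.deliver(f"Welcome, {user_name}!")


def test_state_based() -> None:
    # Set up inputs, run the code, check the resulting state.
    inbox = Inbox()
    welcome("Ada", inbox)
    assert inbox.messages == ["Welcome, Ada!"]


def test_interaction_based() -> None:
    # Mock the collaborator and assert on how it was called.
    inbox = Mock()
    welcome("Ada", inbox)
    inbox.deliver.assert_called_once_with("Welcome, Ada!")


test_state_based()
test_interaction_based()
```

If `welcome` were refactored to call a different method that produced the same inbox contents, the state-based test would keep passing and the interaction test would fail. That is the coupling trade-off in miniature.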

If you keep your features small, this whole exercise gets shorter — see keeping stories small.