Security Breach Response

Detect, verify, contain, and recover — without destroying the evidence on the way.

A security breach is not a louder version of a normal outage. It carries obligations an operational incident does not: you have to preserve evidence, you may need to involve legal, and you are running against notification clocks the moment you confirm it. This play describes how to respond to a suspected breach in a way that contains the damage without compromising your ability to understand it or your legal standing afterward.

When to use this play

Use it the moment you suspect unauthorized access, modification, or exfiltration; do not wait until you have confirmed it. The verification step is part of the play precisely because most triggers turn out to be something benign, and you want a process that handles both outcomes. Reach for this instead of the standard operational playbook whenever the suspected cause is an actor rather than a fault, because the added steps around evidence and notification exist only here.

What makes a security incident different

A security incident layers three things on top of ordinary incident response: evidence preservation, legal involvement, and notification clocks. That changes how you contain. In an operational incident you might restart, wipe, or rebuild a system to restore service. Do that during a breach and you destroy the forensic record you will need to understand the attack and meet your obligations. Isolate affected systems without destroying evidence: snapshot before you change anything.
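One way to make "snapshot before you change anything" hard to skip is to put the rule in the tool that records containment actions. The sketch below is a minimal illustration, not a forensic tool: `ContainmentLog`, `EvidenceNotPreserved`, and the example system and snapshot names are all hypothetical, and the snapshot itself is assumed to be taken by whatever platform tooling you already use.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


class EvidenceNotPreserved(Exception):
    """Raised when a change is attempted before a snapshot is recorded."""


@dataclass
class ContainmentLog:
    """Timestamped record of actions taken on one affected system."""
    system: str
    entries: list = field(default_factory=list)
    snapshotted: bool = False

    def _record(self, action: str, detail: str) -> None:
        # Every entry carries a UTC timestamp so the timeline can be rebuilt.
        self.entries.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "system": self.system,
            "action": action,
            "detail": detail,
        })

    def snapshot(self, snapshot_id: str) -> None:
        # Assumption: the snapshot was taken by external tooling; this only
        # records that it happened and under which identifier.
        self._record("snapshot", snapshot_id)
        self.snapshotted = True

    def change(self, action: str, detail: str) -> None:
        # Refuse any state-changing action until a snapshot is on record.
        if not self.snapshotted:
            raise EvidenceNotPreserved(
                f"snapshot {self.system} before '{action}'"
            )
        self._record(action, detail)


log = ContainmentLog("web-01")            # hypothetical system name
log.snapshot("snap-0001")                 # hypothetical snapshot identifier
log.change("isolate", "moved to quarantine network segment")
```

Calling `change` before `snapshot` raises instead of proceeding, which turns the ordering from a habit into a check.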

How to run it

1. Detection and monitoring. A trigger fires from your monitoring or a report. Treat it as suspected, not confirmed, and start documenting immediately.

2. Initial response and containment. Isolate the affected systems to stop the spread, but snapshot first. Document every action with a timestamp. The goal is to stop the bleeding while preserving the scene.

3. Breach verification. Confirm it is real before you declare. Cross-reference application, system, and network logs, and explicitly rule out false positives: legitimate admin activity, scheduled maintenance, a known penetration test, or a simple misconfiguration. Declaring a breach that was actually a sanctioned pen test burns trust and triggers obligations you did not owe.

4. Impact assessment. Assess across the CIA triad: confidentiality (what was accessed), integrity (what was modified), and availability (what is down). Map the findings against any compliance obligations so you know which notification clocks apply.

5. Notification. Notify internally first so the right people are coordinating, then handle external and regulatory notification according to the clocks your impact assessment surfaced.

6. Mitigation. Close the hole. This typically spans network changes, authentication hardening, and system hardening, depending on how the actor got in.

7. Recovery. Restore from clean backups, apply patches, and test thoroughly before going live. The point is to restore from a backup that predates the compromise, so verify that it is actually clean.

8. Post-incident review. Every breach review includes a control-gap analysis: what control would have prevented this, and is it now in place? A review that does not close the gap just schedules the next breach.
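Steps 3 and 4 above lend themselves to a small sketch. The code below assumes you keep a list of sanctioned activity (pen tests, maintenance, approved admin changes) and a mapping from data assets to notification deadlines. Every name here (`SanctionedWindow`, `benign_explanations`, `assess_impact`, `applicable_clocks`) is hypothetical, and the hour values in the usage example are placeholders, not real regulatory deadlines.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SanctionedWindow:
    """A period of known, approved activity (fields are illustrative)."""
    kind: str        # e.g. "pen_test", "maintenance", "admin_change"
    source: str      # host or account the activity was approved for
    start: datetime
    end: datetime


def benign_explanations(event_source, event_time, windows):
    """Step 3: list sanctioned windows that could explain a suspicious event."""
    return [
        w for w in windows
        if w.source == event_source and w.start <= event_time <= w.end
    ]


def assess_impact(accessed, modified, unavailable):
    """Step 4: summarize findings along the CIA triad."""
    return {
        "confidentiality": sorted(accessed),
        "integrity": sorted(modified),
        "availability": sorted(unavailable),
    }


def applicable_clocks(impact, obligations):
    """Map accessed or modified assets to notification deadlines in hours."""
    touched = set(impact["confidentiality"]) | set(impact["integrity"])
    return {a: obligations[a] for a in sorted(touched) if a in obligations}


windows = [
    SanctionedWindow("pen_test", "10.0.0.5",
                     datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 17)),
]
candidates = benign_explanations("10.0.0.5", datetime(2024, 1, 1, 12), windows)

impact = assess_impact({"customer_pii"}, {"audit_log"}, {"api"})
clocks = applicable_clocks(impact, {"customer_pii": 72, "billing": 24})
```

A matching window is a candidate explanation to confirm with whoever owns that activity, not proof that the event was benign; real activity can overlap a sanctioned test.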

Detection triggers worth wiring up

A spike in failed authentication from a single source.
Anomalous bulk data downloads.
Unauthorized data modifications.
Sustained resource anomalies that do not match normal load.
Access from unexpected geographies or IP addresses.
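As an illustration of the first trigger in the list above, here is a minimal sliding-window counter for failed authentication per source. It is a sketch under assumptions: `FailedAuthSpikeDetector` is a hypothetical name, the default threshold and window are arbitrary, timestamps are plain seconds, and events for a given source are assumed to arrive in time order.

```python
from collections import defaultdict, deque


class FailedAuthSpikeDetector:
    """Flags a source once it accumulates too many recent failed logins."""

    def __init__(self, threshold=20, window_seconds=60):
        # Assumption: both defaults are placeholders to tune per environment.
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)

    def record_failure(self, source, timestamp):
        """Record one failed login; return True if the source is at or over threshold."""
        recent = self._failures[source]
        recent.append(timestamp)
        # Drop failures that have aged out of the sliding window.
        while recent and recent[0] <= timestamp - self.window_seconds:
            recent.popleft()
        return len(recent) >= self.threshold


detector = FailedAuthSpikeDetector(threshold=3, window_seconds=10)
for second in (0, 1, 2):
    fired = detector.record_failure("203.0.113.7", second)
```

A firing detector is a trigger for the play, which means it starts verification; it does not confirm a breach on its own.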

Build security into the pipeline

The cheapest breach to respond to is the one your pipeline prevented. Wire security gates into CI/CD: static analysis in the build, dependency vulnerability scanning, container image scanning, and infrastructure-as-code security validation. Block deploys that carry high or critical vulnerabilities rather than waving them through with a promise to fix later.
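A deploy gate of this kind can be very small. The sketch below assumes your scanners can emit findings as a JSON list of objects with `id` and `severity` fields; that report shape is an assumption made for illustration, not the output format of any particular scanner, and `gate` and `blocking_findings` are hypothetical names.

```python
import json
import sys

# Severities that block a deploy, per the policy described above.
BLOCKING_SEVERITIES = {"high", "critical"}


def blocking_findings(findings):
    """Return the findings whose severity should stop the deploy."""
    return [
        f for f in findings
        if str(f.get("severity", "")).lower() in BLOCKING_SEVERITIES
    ]


def gate(report_text):
    """Return a process exit code: 0 to allow the deploy, 1 to block it."""
    findings = json.loads(report_text)
    blockers = blocking_findings(findings)
    for finding in blockers:
        # Report each blocker on stderr so it shows up in the CI job log.
        print(
            f"BLOCKED: {finding.get('id', 'unknown')} ({finding['severity']})",
            file=sys.stderr,
        )
    return 1 if blockers else 0


example_report = json.dumps([
    {"id": "FINDING-1", "severity": "low"},        # hypothetical findings
    {"id": "FINDING-2", "severity": "critical"},
])
exit_code = gate(example_report)
```

In a pipeline step you would pass the scanner's report to `gate` and hand its return value to `sys.exit`, so a non-zero code fails the job instead of waving the deploy through.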

Drill before you need it

Run security drills on a cadence so the team is not learning the play during a real breach. Hold tabletop exercises that include an actual restore-from-backup test, run a larger simulation annually, and test your detection alerts periodically to confirm they still fire. A backup you have never restored is a hope, not a recovery plan.
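A restore drill needs a pass or fail answer. One possible check, sketched below, compares the restored files against a manifest of SHA-256 digests recorded when the backup was taken. The manifest format (relative path mapped to hex digest) and the names `sha256_of` and `verify_restore` are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large restores do not load into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(restore_dir, manifest):
    """Compare restored files to their expected digests.

    manifest maps relative path -> expected SHA-256 hex digest.
    Returns a dict of relative path -> "missing" or "mismatch";
    an empty dict means every listed file restored intact.
    """
    problems = {}
    root = Path(restore_dir)
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file():
            problems[rel_path] = "missing"
        elif sha256_of(target) != expected:
            problems[rel_path] = "mismatch"
    return problems
```

An empty result shows the restore reproduced what was backed up. It does not show that the backup predates a compromise, so that check still has to happen separately during recovery.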

Common traps

Rebuilding before snapshotting. The instinct to restore service fast can erase the only record of what happened. Snapshot first, always.
Declaring before verifying. Calling a sanctioned pen test a breach triggers obligations and panic you did not owe. Rule out the benign explanations first.
Assessing only availability. "Is it down" is the easy question. What was accessed and what was modified are the ones that drive notification.
Skipping the control-gap analysis. A review that does not name the missing control is just a story about how you got unlucky.
Untested backups. An active breach is the worst possible time to discover that your backups do not restore.

Signals it's working

Suspected breaches get snapshotted and documented before anyone touches the affected systems.
False positives get caught at verification instead of escalating into unnecessary notifications.
Every review ends with a named control gap and an implemented control that closes it.
Restore-from-backup tests pass during drills, so recovery is a known quantity, not a gamble.