Is It Actually Production-Ready?

A reusable rubric for the question everyone answers too optimistically

"Is it production-ready?" gets answered with optimism far more often than with evidence. To take the optimism out of it, I run an application through a reusable rubric and score each check honestly. The scoring legend is simple: Yes, Kinda, No or Unsure, and Not Applicable. "Kinda" and "Unsure" are not passing grades; they're flags telling you where to look before you ship.

Monitoring and alerting

Can you tell when something is wrong before your users tell you? You want uptime monitoring, error monitoring with alerts that actually reach a human, application performance monitoring to see how the system behaves under load, and thresholds on critical metrics so alerts fire on the things that matter rather than drowning everyone in noise.
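As a rough illustration of thresholds on critical metrics, here is a minimal sketch in Python. The metric names (`error_rate`, `p95_latency_ms`) and the limits are hypothetical placeholders, not recommendations, and the sketch treats a missing metric as an alert in its own right, which is my assumption rather than part of the rubric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str   # hypothetical metric name
    limit: float  # value above which a human should be alerted


def breached(samples: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Return one alert message per metric that is over its limit or missing."""
    alerts = []
    for t in thresholds:
        value = samples.get(t.metric)
        if value is None:
            # Assumption: silence from a critical metric is itself a problem.
            alerts.append(f"{t.metric}: no data")
        elif value > t.limit:
            alerts.append(f"{t.metric}: {value} > {t.limit}")
    return alerts


# Illustrative limits only; real ones come from your own baselines.
THRESHOLDS = [
    Threshold("error_rate", 0.01),
    Threshold("p95_latency_ms", 800.0),
]
```

Keeping the list of thresholds short and explicit is one way to make sure alerts fire on the metrics that matter instead of on everything that can be measured.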

Logging

Logs are how you reconstruct what happened after the fact. Confirm you have structured logging, environment-appropriate log levels, and an audit trail that records authentication successes and failures, plus sensitive operations like deletions and downloads. Exceptions should land at ERROR level, and all of it should be centralized and searchable, because logs you can't query are logs you don't really have.
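To make "structured logging" concrete, here is a minimal sketch using Python's standard `logging` module that emits one JSON object per line. The `JsonFormatter` class, the `build_logger` helper, and the field names `event` and `actor_id` are illustrative choices of mine, not a standard; `actor_id` is assumed to be an opaque identifier, not an email address or a name.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed via `extra={...}` become attributes on the record.
        for key in ("event", "actor_id"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        # Exceptions logged with logger.exception(...) carry exc_info.
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str, level: str = "INFO") -> logging.Logger:
    """Create a logger that writes JSON lines; level is set per environment."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid adding duplicate handlers on repeat calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger
```

An audit entry for a failed login could then look like `build_logger("audit").warning("login failed", extra={"event": "auth.failure", "actor_id": "u_123"})`, and calling `logger.exception(...)` inside an `except` block records the failure at ERROR level with its traceback.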

Observability

Beyond logs, can the system describe its own health? Look for health-check endpoints, key metrics exposed for inspection, and dependency health made visible so you can tell whether a problem is yours or an upstream service's.
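One way a health-check endpoint might surface dependency health is sketched below. The `health_report` function and the dependency names in the usage note are hypothetical, and the choice of HTTP 503 for a degraded state is a common convention that I am assuming here, not something the rubric mandates.

```python
from typing import Callable


def health_report(checks: dict[str, Callable[[], bool]]) -> tuple[int, dict]:
    """Run each dependency check and return (HTTP status code, response body)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "failing"
        except Exception as exc:
            # A check that raises is reported, not allowed to crash the endpoint.
            results[name] = f"error: {type(exc).__name__}"
    healthy = all(status == "ok" for status in results.values())
    body = {"status": "ok" if healthy else "degraded", "dependencies": results}
    return (200 if healthy else 503), body
```

A web handler for something like `/healthz` would call `health_report({"database": ping_db, "payments_api": ping_payments})`, where both ping functions are placeholders you would supply, and return the status code and body as JSON. Naming each dependency in the response is what lets you tell your own failure from an upstream one.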

Disaster recovery

Hope is not a recovery strategy. You need a documented backup and restore procedure and, crucially, the restore has to have been tested at least once. An untested backup is a guess, and the middle of an outage is the worst moment to be guessing.
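The shape of a restore test can be shown on a toy scale with SQLite, using the `backup` method on Python's `sqlite3.Connection`. This is only a sketch of the idea (copy, restore somewhere else, compare): the `restore_matches` helper is hypothetical, the row-count comparison is a deliberately crude check, and a real drill would restore your actual backups into an isolated environment.

```python
import sqlite3


def restore_matches(source: sqlite3.Connection, table: str) -> bool:
    """Copy the database into a fresh one and compare row counts for `table`.

    `table` must be a trusted, hard-coded name; it is interpolated into SQL.
    """
    restored = sqlite3.connect(":memory:")
    try:
        source.backup(restored)  # take the backup and "restore" it elsewhere
        query = f"SELECT COUNT(*) FROM {table}"
        original_count = source.execute(query).fetchone()
        restored_count = restored.execute(query).fetchone()
        return original_count == restored_count
    finally:
        restored.close()
```

The point is less the mechanism than the habit: the check runs against the restored copy, not against the backup file's existence.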

Data protection

Protecting the data is non-negotiable: encryption at rest and in transit, and no personally identifiable information in your logs. That second one catches a lot of teams who carefully encrypt the database and then leak the same data into a log file.
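As a last line of defence against PII leaking into logs, a logging filter can scrub messages before they are written. The sketch below uses Python's standard `logging.Filter` and handles only email addresses, with a deliberately simple pattern; the `redact` helper and the `[REDACTED_EMAIL]` placeholder are my own illustrative names, and a real deployment would need to cover whatever PII your application actually handles.

```python
import logging
import re

# Simplified email pattern for illustration; it is not a full RFC 5322 matcher.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(text: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL.sub("[REDACTED_EMAIL]", text)


class RedactingFilter(logging.Filter):
    """Scrub the fully formatted message before any handler emits it."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format first so values passed as %-style args are scrubbed too.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True  # keep the record, just with the message rewritten
```

Attach it with `handler.addFilter(RedactingFilter())`. A filter like this is a safety net; the better fix is still not passing the sensitive value to the logger in the first place.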

Application security

Tighten the perimeter: TLS on all web traffic and APIs, secrets stored securely and never committed to version control, and dependency vulnerabilities resolved before deploy rather than discovered after.
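For the secrets check, one common pattern is to read secrets from the process environment and fail loudly at startup when one is missing, so nothing sensitive needs to live in the repository. This is a minimal sketch under that assumption; `require_secret`, `MissingSecret`, and the `DB_PASSWORD` name are hypothetical, and a dedicated secrets manager is an equally valid source.

```python
import os
from collections.abc import Mapping
from typing import Optional


class MissingSecret(RuntimeError):
    """Raised when a required secret is absent or empty."""


def require_secret(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the secret called `name`, or raise instead of falling back.

    `env` defaults to the process environment; passing a dict makes testing easy.
    """
    source = os.environ if env is None else env
    value = source.get(name, "")
    if not value:
        # Deliberately no default value: a hard-coded fallback is a committed secret.
        raise MissingSecret(f"required secret {name} is not set")
    return value
```

Calling `require_secret("DB_PASSWORD")` during application startup turns a misconfigured environment into an immediate, obvious failure instead of a surprise at the first database call.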

Testing and QA

The tests have to mean something. Aim for meaningful coverage of critical logic, edge cases, and error handling, not a coverage percentage achieved by testing getters. The tests should run in CI/CD and all pass before anything reaches production, and the critical paths and integration points should be genuinely exercised, since those are where real failures live.

Infrastructure and deployment

Finally, the machinery underneath: infrastructure as code so environments are reproducible, an automated deploy pipeline instead of manual steps someone might fumble, a tested rollback so you can retreat safely, schema changes via migrations rather than ad-hoc edits, and backups that have been tested end to end.

How to use the rubric

Score every check with the four-value legend and resist the temptation to round "Kinda" up to "Yes." The value of the rubric is precisely in the uncomfortable entries. A wall of "Yes" with three honest "No or Unsure" marks is far more useful than a wall of optimistic green, because it tells you exactly where the risk lives before your users find it for you. Production readiness isn't a feeling. It's a set of questions with verifiable answers, and this rubric is just a structured way of refusing to skip them.
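If you want to keep the scores somewhere more durable than a spreadsheet, the legend maps naturally onto a small enum. The sketch below is one possible encoding, not a prescribed format: the `Score` and `flagged` names are mine, the check names in the usage note are made up, and treating "Not Applicable" as not needing follow-up is my assumption.

```python
from enum import Enum


class Score(Enum):
    YES = "Yes"
    KINDA = "Kinda"
    NO_OR_UNSURE = "No or Unsure"
    NOT_APPLICABLE = "Not Applicable"


# Assumption: only "Yes" and "Not Applicable" need no follow-up before shipping.
PASSING = {Score.YES, Score.NOT_APPLICABLE}


def flagged(results: dict[str, Score]) -> list[str]:
    """Return the checks that still need attention, in the order they were scored."""
    return [check for check, score in results.items() if score not in PASSING]
```

For example, scoring `{"Uptime monitoring": Score.YES, "Tested restore": Score.KINDA, "PII kept out of logs": Score.NO_OR_UNSURE}` and passing it to `flagged` gives back the two entries that are not a clean "Yes", which is exactly the list to work through before release.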