Personal Product · AI-Powered

Gooey Ops

A unified operations-monitoring platform: uptime checks, SSL certificate tracking, third-party status aggregation, and AI-assisted alert triage and incident summaries.

Highlights

  • AI-assisted alert triage and automatically generated incident summaries to cut on-call noise
  • Pluggable checker architecture (HTTP, TCP, DNS, SSL) behind a single dispatch interface
  • Provider-based notification fan-out (email, webhook, Slack) with escalation, quiet hours, and repeat intervals

Skills

AI/LLM integration · Backend systems design · Concurrency & worker patterns · Multi-tenant authorization · Strict TypeScript / ESM

Overview

Gooey Ops is a monitoring platform I built to consolidate the things teams usually stitch together from three or four separate tools: uptime monitoring, SSL certificate expiry tracking, aggregation of third-party service status pages, and the alerting layer that ties them together. On top of that alerting layer, it uses AI to triage incoming alerts and generate human-readable incident summaries, so on-call engineers see signal instead of a wall of raw notifications. It's an npm-workspaces monorepo built around a Fastify + Prisma + TypeScript API.

The Problem

Operational signal is fragmented. Uptime lives in one tool, certificate expiry in another, upstream-vendor outages in a dozen status pages nobody watches, and alert routing in yet another product. Gooey Ops unifies these behind one org-scoped data model and one alerting engine.

My Role

Founder and sole engineer, responsible for the architecture, data model, checker and worker subsystems, and notification layer.

Architecture & Approach

Requests flow routes → services → Prisma. Services own the business logic and verify org membership themselves rather than trusting the route layer alone. Everything is org-scoped through an organization/membership model with a role hierarchy (owner > admin > member > viewer).
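A minimal sketch of what a service-level membership check could look like under that role hierarchy. Only the role ordering comes from the description above; `Membership`, `ROLE_RANK`, `hasAtLeast`, `assertOrgRole`, and `ForbiddenError` are hypothetical names, and an in-memory array stands in for the Prisma lookup.

```typescript
// Hypothetical names throughout; only the role ordering is taken from the text.
type Role = "owner" | "admin" | "member" | "viewer";

// Higher rank means more privilege: owner > admin > member > viewer.
const ROLE_RANK: Record<Role, number> = {
  viewer: 0,
  member: 1,
  admin: 2,
  owner: 3,
};

interface Membership {
  userId: string;
  orgId: string;
  role: Role;
}

class ForbiddenError extends Error {}

function hasAtLeast(actual: Role, required: Role): boolean {
  return ROLE_RANK[actual] >= ROLE_RANK[required];
}

// Service-level guard: resolves the membership itself instead of trusting
// whatever the route layer already checked. The array is an assumption
// standing in for a database query.
function assertOrgRole(
  memberships: readonly Membership[],
  userId: string,
  orgId: string,
  required: Role,
): Membership {
  const found = memberships.find(
    (m) => m.userId === userId && m.orgId === orgId,
  );
  if (!found || !hasAtLeast(found.role, required)) {
    throw new ForbiddenError(`requires ${required} in org ${orgId}`);
  }
  return found;
}
```

A service method would call `assertOrgRole(...)` as its first step, so a route that forgets its own check still cannot reach another org's data.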

Two subsystems carry most of the interesting design:

  • Checkers implement a single Checker interface, one per protocol (HTTP, TCP, DNS, SSL), dispatched through a registry so new probe types are a single addition rather than a refactor.
  • Workers are polling loops with a shared shape — configurable poll interval and concurrency, an active-check counter, and a process-then-persist-then-alert cycle. They drive the checkers, transition entity status, and hand off to the notification layer.
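The two subsystems above can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: `CheckTarget`, `CheckResult`, `CheckerRegistry`, and `runTick` are hypothetical names, and the `persist` and `alert` callbacks stand in for the database write and the notification hand-off.

```typescript
type CheckType = "http" | "tcp" | "dns" | "ssl";

interface CheckTarget { id: string; type: CheckType; address: string; }
interface CheckResult { ok: boolean; detail: string; }

// One implementation per protocol.
interface Checker {
  readonly type: CheckType;
  check(target: CheckTarget): Promise<CheckResult>;
}

// Registry-based dispatch: adding a probe type is one register() call.
class CheckerRegistry {
  private readonly checkers = new Map<CheckType, Checker>();

  register(checker: Checker): void {
    this.checkers.set(checker.type, checker);
  }

  async dispatch(target: CheckTarget): Promise<CheckResult> {
    const checker = this.checkers.get(target.type);
    if (!checker) {
      return { ok: false, detail: `no checker registered for ${target.type}` };
    }
    return checker.check(target);
  }
}

// One worker tick: run the due checks with at most `concurrency` in flight,
// following process -> persist -> alert. Returns the peak active-check count.
async function runTick(
  registry: CheckerRegistry,
  due: readonly CheckTarget[],
  concurrency: number,
  persist: (t: CheckTarget, r: CheckResult) => Promise<void>,
  alert: (t: CheckTarget, r: CheckResult) => Promise<void>,
): Promise<number> {
  let next = 0;
  let active = 0;
  let peak = 0;

  // Each lane pulls the next target until the queue is drained.
  const lane = async (): Promise<void> => {
    while (next < due.length) {
      const target = due[next++]!;
      active++;
      peak = Math.max(peak, active);
      try {
        const result = await registry.dispatch(target); // process
        await persist(target, result);                  // persist
        if (!result.ok) await alert(target, result);    // alert
      } finally {
        active--;
      }
    }
  };

  const laneCount = Math.max(1, Math.min(concurrency, due.length));
  await Promise.all(Array.from({ length: laneCount }, lane));
  return peak;
}
```

A real worker would call something like `runTick` on a timer at the configured poll interval; the sketch leaves scheduling and status transitions out to keep the shared shape visible.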

Technical Highlights

  • AI-assisted alerting. An AI layer sits on top of the raw check results to triage incoming alerts and generate concise, human-readable incident summaries from the underlying failures — turning a noisy stream of notifications into something an on-call engineer can act on quickly.
  • Provider-based notifications. Email, webhook, and Slack providers implement a common interface and are fanned out by a notification service, layered under an alert-policy model with escalation delays, repeat intervals, and quiet hours.
  • Security-first auth. Argon2id password hashing, short-lived JWT access tokens with rotating refresh tokens, account lockout on repeated failures, and dedicated audit/security-event logging.
  • Strict ESM + TypeScript discipline. Native ESM with required .js import extensions and exactOptionalPropertyTypes enforced throughout: constraints that catch broken import paths and accidental undefined assignments to optional fields at compile time.
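The provider-based notification bullet can be illustrated with a short sketch. Everything here is assumed for illustration: `NotificationProvider`, `QuietHours`, `inQuietHours`, `fanOut`, and `DeliveryReport` are hypothetical names, quiet hours are modeled as minutes since midnight, and the sketch suppresses all delivery during quiet hours, which may not match the real policy. Escalation delays and repeat intervals are left out.

```typescript
interface Alert { title: string; }

// Common interface each channel (email, webhook, Slack) would implement.
interface NotificationProvider {
  readonly name: string;
  send(alert: Alert): Promise<void>;
}

// Quiet window in minutes since midnight; it may wrap past midnight
// (e.g. 22:00 to 06:00 is startMin = 1320, endMin = 360).
interface QuietHours { startMin: number; endMin: number; }

function inQuietHours(q: QuietHours, minuteOfDay: number): boolean {
  return q.startMin <= q.endMin
    ? minuteOfDay >= q.startMin && minuteOfDay < q.endMin
    : minuteOfDay >= q.startMin || minuteOfDay < q.endMin;
}

// `error` is omitted rather than set to undefined, which is what
// exactOptionalPropertyTypes expects for an optional field.
interface DeliveryReport { provider: string; ok: boolean; error?: string; }

// Fan out to every provider; one failing channel does not block the others.
async function fanOut(
  providers: readonly NotificationProvider[],
  alert: Alert,
  quiet: QuietHours | undefined,
  minuteOfDay: number,
): Promise<DeliveryReport[]> {
  if (quiet && inQuietHours(quiet, minuteOfDay)) return [];

  const settled = await Promise.allSettled(
    providers.map((p) => p.send(alert)),
  );
  return settled.map((s, i) => {
    const provider = providers[i]!.name;
    return s.status === "fulfilled"
      ? { provider, ok: true }
      : { provider, ok: false, error: String(s.reason) };
  });
}
```

Using `Promise.allSettled` rather than `Promise.all` is the design choice worth noting: a Slack outage should not stop the email from going out.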

Skills Demonstrated

AI/LLM integration for alert triage and incident summarization, backend systems design, concurrency and worker-pool patterns, pluggable/extensible architecture, multi-tenant authorization, and disciplined strict-TypeScript engineering.