System 04 — AI-enriched detection

Watchman

A raw alert is a noun with no verb. The EDR says "detection on a host"; the WAF says "a spike of N events." Neither says how bad, who, or what to do — so an analyst spends the first 10–30 minutes doing the same rote enrichment, every time, often at 2am. Watchman puts a bounded LLM in that gap and keeps a human in the thread.

Role: design + build, end to end
Mode: real-time
Model: Claude, strict schema
Boundary: assess & ask, never act

The problem

The mechanical first pass on an alert — pull the full event context, judge severity, identify the actor, decide what to do — is exactly the work that fatigues analysts and exactly the work a constrained model is good at. But "let an LLM handle alerts" is also how you get an automated system confidently closing a real incident. The problem is not "can AI triage"; it is triaging with the model on a short, auditable leash.

What I built

Two real-time pipelines that sit between the raw alert and the analyst.

SOC alert pipeline — a webhook drops the alert onto a queue (returns immediately, absorbs spikes); a long-lived processor re-queries the detection backend for the full raw event set (the webhook is only a pointer), backing off for indexing lag; Claude returns a strict-schema assessment — severity, category, impact, actor, recommendation, confidence, and an explicit "needs clarification" flag; the affected human is resolved across identity systems; a structured card goes to the right channel and a threaded message to the right person.
The human loop — if the model flags it needs input, it asks the actor in-thread. The analyst and actor converse with the bot; it re-assesses on every reply. The case auto-closes only when confidence clears a set threshold and nothing is outstanding — and the card turns green. Otherwise it stays open with a human.
WAF spike triage — an edge worker catches the WAF spike notification, pulls the actual firewall event log, aggregates it (path, IP, country, action, IP history), hands a pre-aggregated summary to Claude, and posts a classified verdict — credential stuffing vs DDoS vs a bad rule vs a benign bot — in about 30 seconds, replacing a 10+ minute manual dashboard dance.

  ┌────────┐     ┌────────────┐     ┌──────────────────────────────────────────┐
  │ EDR /  │ ──▶ │ spike-     │ ──▶ │ re-query full event set                  │
  │ WAF    │     │ safe       │     │ Claude · strict schema (severity, actor, │
  │ fires  │     │ queue      │     │ confidence, recommendation, needs-input?)│
  └────────┘     └────────────┘     │ resolve human across identity systems    │
                                    └──────────────────────────────────────────┘

                                         ▼
                                ┌─────────────────────────────────────────────┐
                                │ Slack: channel card + threaded message      │
                                │ if needs-input: bot asks the actor in-thread│
                                │ re-assess each reply · case auto-closes at  │
                                │ confidence ≥ T with nothing open · card →   │
                                │ green                                       │
                                └─────────────────────────────────────────────┘

  WAF lane: spike → edge worker pulls + aggregates
  event log → Claude classifies → verdict posted
  in ~30s

watchman — assess, ask, hand to a human; never act alone
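
The aggregation step in the WAF lane can be sketched as below. This is a minimal illustration, not the deployed worker: the function name `summarize_waf_events`, the event field names, and the sample events are all hypothetical, and the IP-history lookup is left out. The point it shows is that the model receives a small top-N summary rather than the raw firewall log.

```python
from collections import Counter
from typing import Any


def summarize_waf_events(events: list[dict[str, Any]], top_n: int = 3) -> dict[str, Any]:
    """Collapse raw firewall events into top-N counts per dimension.

    The dimension names are assumptions about the event shape.
    """
    dimensions = ("path", "ip", "country", "action")
    summary: dict[str, Any] = {"total_events": len(events)}
    for dim in dimensions:
        counts = Counter(event.get(dim, "unknown") for event in events)
        summary[f"top_{dim}"] = counts.most_common(top_n)
    return summary


# Invented sample events (documentation-range IPs), for illustration only.
sample_events = [
    {"path": "/login", "ip": "203.0.113.7", "country": "NL", "action": "block"},
    {"path": "/login", "ip": "203.0.113.7", "country": "NL", "action": "block"},
    {"path": "/login", "ip": "198.51.100.2", "country": "US", "action": "challenge"},
    {"path": "/search", "ip": "198.51.100.2", "country": "US", "action": "allow"},
]

summary = summarize_waf_events(sample_events)
print(summary["top_path"])
```

A summary like this keeps the prompt short and puts the distinguishing signals (one path hammered from few IPs versus many paths from many IPs) directly in front of the model.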

Design decisions

Re-query the full event set; never trust the alert stub

The webhook is a pointer, not evidence. A real assessment needs the raw events, so the processor goes back and fetches them before the model sees anything.

Trade-off: the backend has indexing lag, so this means polling with backoff and a hard timeout. Latency is traded for fidelity, which is the right trade for triage.
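
Polling with backoff and a hard timeout might look like the sketch below. The function name `fetch_with_backoff`, the delay values, and the 60-second deadline are assumptions chosen for illustration; `fetch` stands in for whatever call re-queries the detection backend.

```python
import time
from typing import Callable, Optional


def fetch_with_backoff(
    fetch: Callable[[], Optional[list]],
    timeout_s: float = 60.0,        # hard deadline (assumed value)
    initial_delay_s: float = 1.0,   # first wait (assumed value)
    max_delay_s: float = 16.0,      # cap on any single wait (assumed value)
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> list:
    """Call fetch() until it returns events, doubling the wait each time.

    Raises TimeoutError if nothing is indexed before the deadline.
    """
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        events = fetch()
        if events:
            return events
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError("events not indexed before deadline")
        # Never sleep past the deadline.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay_s)


# Simulated backend: empty twice (indexing lag), then the events appear.
responses = iter([None, None, [{"event_id": "e-1"}]])
delays: list[float] = []
events = fetch_with_backoff(lambda: next(responses), sleep=delays.append)
print(events, delays)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting; on timeout the caller can still post a degraded card rather than dropping the alert.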

Strict output schema, not free text

The model returns machine-consumable fields — severity, actor, confidence, needs-clarification — so downstream code branches deterministically on the assessment instead of parsing prose.

Trade-off: you constrain the model and have to design the schema well; the payoff is that the LLM's output is data the system can act on predictably.
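
One way to enforce a strict schema on the consuming side is to reject anything that is not exactly the expected shape. The sketch below uses only the standard library. The field names follow the list given earlier in this write-up, but the allowed severity values, the 0 to 1 confidence range, and the names `Assessment` and `parse_assessment` are assumptions.

```python
import json
from dataclasses import dataclass

SEVERITIES = {"low", "medium", "high", "critical"}  # assumed value set
TEXT_FIELDS = ("category", "impact", "actor", "recommendation")
REQUIRED = {"severity", "confidence", "needs_clarification", *TEXT_FIELDS}


@dataclass(frozen=True)
class Assessment:
    severity: str
    category: str
    impact: str
    actor: str
    recommendation: str
    confidence: float
    needs_clarification: bool


def parse_assessment(raw: str) -> Assessment:
    """Parse model output; raise ValueError on anything off-schema."""
    data = json.loads(raw)  # JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict) or set(data) != REQUIRED:
        raise ValueError("missing or unexpected fields")
    severity = data["severity"]
    if not isinstance(severity, str) or severity not in SEVERITIES:
        raise ValueError("invalid severity")
    confidence = data["confidence"]
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence out of range")
    if not isinstance(data["needs_clarification"], bool):
        raise ValueError("needs_clarification must be a boolean")
    if not all(isinstance(data[key], str) for key in TEXT_FIELDS):
        raise ValueError("text fields must be strings")
    return Assessment(**{**data, "confidence": float(confidence)})


# Invented example payload, for illustration only.
raw_output = json.dumps({
    "severity": "high",
    "category": "credential access",
    "impact": "single workstation",
    "actor": "jdoe",
    "recommendation": "confirm with the user, then rotate credentials",
    "confidence": 0.72,
    "needs_clarification": True,
})
assessment = parse_assessment(raw_output)
print(assessment.severity, assessment.needs_clarification)
```

Downstream code can then branch on `assessment.needs_clarification` and `assessment.confidence` directly, and an off-schema reply becomes an explicit error path instead of a silently misread sentence.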

Confidence-gated close, human in the thread

The system never silently closes on a guess. It closes only above a confidence threshold with no open questions; otherwise it pulls the actual person into the thread and re-evaluates on every reply.

Trade-off: some cases stay open longer. In security that is the correct behaviour, not a regression.
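
The close rule reduces to a small predicate. In this sketch the threshold value 0.9 and the names `Case` and `can_auto_close` are assumptions; the source only says that a set threshold exists and that nothing may be outstanding.

```python
from dataclasses import dataclass, field

CLOSE_THRESHOLD = 0.9  # assumed value; the real threshold is not stated


@dataclass
class Case:
    confidence: float
    needs_clarification: bool
    open_questions: list[str] = field(default_factory=list)


def can_auto_close(case: Case, threshold: float = CLOSE_THRESHOLD) -> bool:
    """Close only when confidence clears the threshold AND nothing is open."""
    nothing_outstanding = not case.needs_clarification and not case.open_questions
    return nothing_outstanding and case.confidence >= threshold


confident_and_clean = Case(confidence=0.95, needs_clarification=False)
confident_but_waiting = Case(
    confidence=0.95,
    needs_clarification=False,
    open_questions=["Did you run this script yourself?"],
)
unsure = Case(confidence=0.60, needs_clarification=False)

print(can_auto_close(confident_and_clean))    # True
print(can_auto_close(confident_but_waiting))  # False
print(can_auto_close(unsure))                 # False
```

Re-running this predicate after every reply in the thread gives the re-assessment loop: high confidence alone never closes a case while a question is still waiting on a human.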

It assesses and asks — it never acts

No autonomous remediation. Watchman enriches, attributes, recommends, and chases up. It does not touch production. Model inputs and outputs are persisted for review.

Trade-off: less "automation magic" than a demo. An LLM that can act unreviewed on EDR alerts is an incident generator in a domain where the adversary writes the inputs.
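
Persisting model inputs and outputs for review can be as simple as an append-only JSON Lines record per exchange. The sketch below is one possible shape, not the system's actual storage: the record fields, the function name `persist_exchange`, and the use of a text sink are all assumptions.

```python
import hashlib
import io
import json
from datetime import datetime, timezone
from typing import TextIO


def persist_exchange(sink: TextIO, alert_id: str, model_input: str, model_output: str) -> dict:
    """Append one reviewable record per model call as a JSON line."""
    record = {
        "alert_id": alert_id,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "input": model_input,
        "output": model_output,
        # The hash lets a reviewer spot identical prompts across cases.
        "input_sha256": hashlib.sha256(model_input.encode("utf-8")).hexdigest(),
    }
    sink.write(json.dumps(record, sort_keys=True) + "\n")
    return record


# In-memory sink for illustration; a file or log stream would work the same way.
audit_log = io.StringIO()
persist_exchange(audit_log, "alert-1", "summarized events ...", '{"severity": "low"}')
print(audit_log.getvalue())
```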

Operating profile

The analyst's channel stops carrying raw alerts and starts carrying structured, attributed, actionable assessments addressed to the right person. The bot does the rote first pass and the chase-up; humans review and decide. High-confidence, no-open-questions cases close themselves; everything else keeps a human in the loop. This is the bounded-AI position applied to the noisiest part of the day.

What I would change

The SOC pipeline and the WAF lane evolved separately and now duplicate assessment logic; they should share one schema and one assessment core. And the assessment prompt deserves a real evaluation harness — prompt changes are currently spot-checked rather than regression-tested, which is fine until the day it isn't.
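
A regression harness for the prompt could start as a set of golden cases replayed on every change. The sketch below is a starting shape under stated assumptions: the golden cases are invented, `run_regression` is a hypothetical name, and `stub_assess` stands in for the real model call so the example runs offline.

```python
from typing import Callable

# Invented golden cases: (alert, expected severity). Real cases would come
# from past alerts that a human has already reviewed.
GOLDEN_CASES = [
    ({"rule": "credential_dump", "host": "build-01"}, "high"),
    ({"rule": "known_admin_tool", "host": "it-laptop-7"}, "low"),
]


def run_regression(
    assess: Callable[[dict], dict],
    cases: list[tuple[dict, str]],
) -> list[str]:
    """Replay each golden alert and report severity mismatches."""
    failures = []
    for alert, expected in cases:
        got = assess(alert)["severity"]
        if got != expected:
            failures.append(f"{alert['rule']}: expected {expected}, got {got}")
    return failures


def stub_assess(alert: dict) -> dict:
    """Stand-in for the model call; swap in the real assessment function."""
    return {"severity": "high" if alert["rule"] == "credential_dump" else "low"}


failures = run_regression(stub_assess, GOLDEN_CASES)
print(failures or "all golden cases pass")
```

Run against the live prompt, a non-empty failure list blocks the change; because model output can vary between runs, a real harness would likely also need repeated runs or a tolerance, which this sketch does not cover.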