Your AI agent generated a pull request. Tests pass. CI is green. You merge it. The code runs perfectly - it just skips the fraud check before processing payments. Traditional monitoring sees success. You see a compliance violation six hours later.
Same trace. Two completely different interpretations.
Every span returned 200 OK, so status-based monitoring reports success. Story-based monitoring catches the sequence violation: the fraud check was skipped before payment processing. This is a silent failure - successful execution of the wrong behavior.
We built observability backward. We instrument code. Capture telemetry. Then try to figure out what it means. But we never wrote down what was supposed to happen in the first place.
You're reverse-engineering what the code should have done.
Intent is the starting point, not an afterthought.
Story-based monitoring flips the model. You write down what should happen before the agent runs. Then telemetry proves whether it did.
The manifest is a single file that connects intent, implementation, and verification. It's what you write before your agent ships code. It's what telemetry gets compared against after.
```yaml
story: "Process checkout with fraud check"
intent:
  actor: "checkout-service"
  goal: "Validate cart, check fraud, process payment, confirm order"
steps:
  - name: "validate cart"
    expects: "validate_cart span with items"
  - name: "check fraud"
    expects: "check_fraud span before process_payment"
    # ^ This is the step traditional monitoring misses
  - name: "process payment"
    expects: "process_payment span with amount"
  - name: "confirm order"
    expects: "confirm_order span with order_id"
verification:
  mode: "strict"
  on_sequence_violation: "alert"
```

You declare what the agent should do before it runs. This becomes ground truth - the source of "should" in your system.
Standard OpenTelemetry spans capture what actually happened: which operations ran, in what sequence, with what results.
Runtime telemetry is validated against the manifest. A visual storyboard shows what matched the intent - and what didn't.
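As a concrete illustration, here is a minimal sketch of that validation step. Spans are simplified dicts rather than real OTEL span objects, and the step names mirror the checkout manifest above; a production verifier would read actual trace data:

```python
# Expected step order, as declared in the manifest above.
EXPECTED_ORDER = ["validate_cart", "check_fraud", "process_payment", "confirm_order"]

def verify_sequence(spans, expected=EXPECTED_ORDER):
    """Check recorded spans against the declared order; return any violations."""
    names = [s["name"] for s in sorted(spans, key=lambda s: s["start_ns"])]
    violations = [f"missing: {step}" for step in expected if step not in names]
    present = [step for step in expected if step in names]
    for earlier, later in zip(present, present[1:]):
        if names.index(earlier) > names.index(later):
            violations.append(f"sequence violation: {earlier} must run before {later}")
    return violations

# Every span returned OK, but the fraud check ran after the payment:
trace = [
    {"name": "validate_cart",   "start_ns": 1, "status": "OK"},
    {"name": "process_payment", "start_ns": 2, "status": "OK"},
    {"name": "check_fraud",     "start_ns": 3, "status": "OK"},
    {"name": "confirm_order",   "start_ns": 4, "status": "OK"},
]
print(verify_sequence(trace))
# → ['sequence violation: check_fraud must run before process_payment']
```

Status-code monitoring sees four successful spans; the verifier sees the ordering rule broken.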
Traditional logs repeat the same text millions of times. Event templates store the structure once and extract only the variables - cutting bytes and surfacing what matters.
Payment declined: insufficient funds. Customer cust_8x9k2m attempted $299.99 charge but account balance is $45.20. Retry recommended with alternate payment method.
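To make that concrete, here is a minimal sketch of template-based extraction applied to the payment-declined message above. The `{name}` placeholder syntax and field names are illustrative assumptions, not a specific log-template format:

```python
import re

# Stored once per template; only the extracted variables are stored per event.
TEMPLATE = ("Payment declined: {reason}. Customer {customer_id} attempted "
            "{amount} charge but account balance is {balance}. "
            "Retry recommended with {suggestion}.")

def template_to_regex(template):
    """Compile a template with {name} placeholders into a regex with named groups."""
    parts = re.split(r"\{(\w+)\}", template)  # literals and names alternate
    pattern = "".join(
        re.escape(p) if i % 2 == 0 else f"(?P<{p}>.+?)"
        for i, p in enumerate(parts)
    )
    return re.compile(pattern + r"$")

PATTERN = template_to_regex(TEMPLATE)

MESSAGE = ("Payment declined: insufficient funds. Customer cust_8x9k2m attempted "
           "$299.99 charge but account balance is $45.20. "
           "Retry recommended with alternate payment method.")

variables = PATTERN.match(MESSAGE).groupdict()
print(variables)  # only these values need to be stored per event
```

Millions of such events share one template; only the per-event variables differ.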
Agents don't just write code - they ship it. They make autonomous decisions. And they can succeed at every step while completely missing the point. Traditional monitoring wasn't built for this.
An agent can complete every operation successfully and still violate the requirement. Status codes won't catch it.
The manifest becomes the source of "should" in your system. Not documentation. Not tribal knowledge. A versioned, durable artifact.
The storyboard doesn't just show what happened - it shows whether it matched the declared intent. Instantly.
No proprietary formats. No vendor lock-in. Standard OTEL spans, enriched with behavioral verification. Use your existing instrumentation.
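Enrichment can mean nothing more than adding attributes to the spans you already emit. The sketch below uses plain dicts in place of real OTEL span objects, and the `story.*` attribute keys are illustrative assumptions, not an OTEL semantic convention:

```python
# Steps declared in the manifest (assumed, mirroring the checkout example).
DECLARED_STEPS = {"validate_cart", "check_fraud", "process_payment", "confirm_order"}

def enrich_span(span, seen, declared=DECLARED_STEPS):
    """Return a copy of a span dict with behavioral-verification attributes added."""
    attrs = dict(span.get("attributes", {}))
    attrs["story.declared"] = span["name"] in declared
    # Encode the manifest's ordering rule: fraud check must precede payment.
    if span["name"] == "process_payment":
        attrs["story.sequence_ok"] = "check_fraud" in seen
    seen.add(span["name"])
    return {**span, "attributes": attrs}

seen = set()
enriched = [enrich_span({"name": n}, seen)
            for n in ["validate_cart", "process_payment", "confirm_order"]]
```

Here the payment span carries `story.sequence_ok: false` because the fraud check never ran; downstream tooling only has to read the attribute.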
Explore a real working example using Backlog.md - an open-source task manager. See how manifests validate behavior, catch silent failures, and generate storyboards in real time.
No signup required. Fully interactive in your browser.