Story-Based Monitoring

You're Shipping Code
You Didn't Write.

Your AI agent generated a pull request. Tests pass. CI is green. You merge it. The code runs perfectly - it just skips the fraud check before processing payments. Traditional monitoring sees success. You see a compliance violation six hours later.

The Problem: Silent Failures

All Green. All Wrong.

Same trace. Two completely different interpretations.

Traditional Monitoring: ALL CLEAR
validate_cart (12ms) → process_payment (234ms) → confirm_order (8ms) → api.response (4ms)
4 spans, 0 errors, 258ms total

Story-Based Monitoring: VIOLATION
validate_cart → check_fraud [SKIPPED] → process_payment → confirm_order
Sequence violation: fraud check required before payment processing

Every span returned 200 OK. Traditional monitoring sees success. Story-based monitoring catches the sequence violation: fraud check skipped before payment processing. This is a silent failure - successful execution of the wrong behavior.
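The check that catches this can be sketched in a few lines. This is a minimal illustration, not the product's implementation: it assumes the trace has already been reduced to an ordered list of span names, and the names and rule mirror the example above.

```python
# Minimal sketch: detect a sequence violation that status codes miss.
# Assumes spans arrive as an ordered list of operation names.

def find_sequence_violation(span_names, required_before, target):
    """Return a violation message if `target` runs without `required_before`
    having appeared earlier in the trace; otherwise return None."""
    seen = set()
    for name in span_names:
        if name == target and required_before not in seen:
            return f"Sequence violation: {required_before} required before {target}"
        seen.add(name)
    return None

# The trace from above: every span "succeeded", but the fraud check is missing.
trace = ["validate_cart", "process_payment", "confirm_order", "api.response"]
print(find_sequence_violation(trace, "check_fraud", "process_payment"))
# → Sequence violation: check_fraud required before process_payment
```

No span has to fail for this to fire - the violation lives in the ordering, which is exactly the signal status codes discard.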

The Root Cause

Monitoring Starts From
the Wrong End.

We built observability backward. We instrument code. Capture telemetry. Then try to figure out what it means. But we never wrote down what was supposed to happen in the first place.

TRADITIONAL APPROACH
Ship Code
Capture Telemetry
Guess Intent

You're reverse-engineering what the code should have done.

STORY-BASED APPROACH
Declare Intent
Ship Code
Verify Behavior

Intent is the starting point, not an afterthought.

Story-based monitoring flips the model. You write down what should happen before the agent runs. Then telemetry proves whether it did.

The Solution

The Missing Artifact:
OTEL Behavioral Manifest

A single file that connects intent, implementation, and verification. It's what you write before your agent ships code. It's what telemetry gets compared against after.

checkout.manifest.yaml (OTEL Behavioral Manifest)
story: "Process checkout with fraud check"

intent:
  actor: "checkout-service"
  goal: "Validate cart, check fraud, process payment, confirm order"

steps:
  - name: "validate cart"
    expects: "validate_cart span with items"
  - name: "check fraud"
    expects: "check_fraud span before process_payment"
    # ^ This is the step traditional monitoring misses
  - name: "process payment"
    expects: "process_payment span with amount"
  - name: "confirm order"
    expects: "confirm_order span with order_id"

verification:
  mode: "strict"
  on_sequence_violation: "alert"

01 · Intent

You declare what the agent should do before it runs. This becomes ground truth - the source of "should" in your system.

02 · Implementation

Standard OpenTelemetry spans capture what actually happened: which operations ran, in what sequence, with what results.

03 · Verification

Runtime telemetry is validated against the manifest. A visual storyboard shows what matched the intent - and what didn't.
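The verification pass can be sketched as a comparison of observed span names against the manifest's declared step order. This is a hedged sketch, not the real validator: the manifest is inlined as a dict mirroring checkout.manifest.yaml (to keep the example dependency-free), and the matching logic is an assumption.

```python
# Sketch: validate runtime telemetry against the declared manifest.
# Field names mirror checkout.manifest.yaml above; logic is illustrative.

manifest = {
    "story": "Process checkout with fraud check",
    "steps": ["validate_cart", "check_fraud", "process_payment", "confirm_order"],
    "verification": {"mode": "strict", "on_sequence_violation": "alert"},
}

def verify(observed_spans, manifest):
    """Return a list of findings: steps that never ran, and steps that ran
    out of the declared order."""
    findings = []
    cursor = 0  # index in observed_spans of the furthest matched step
    for step in manifest["steps"]:
        try:
            idx = observed_spans.index(step)
        except ValueError:
            findings.append(f"MISSING: {step}")
            continue
        if idx < cursor:
            findings.append(f"OUT OF ORDER: {step}")
        cursor = max(cursor, idx)
    return findings

# Every span below returned 200 OK, yet verification flags the skipped step.
print(verify(["validate_cart", "process_payment", "confirm_order"], manifest))
# → ['MISSING: check_fraud']
```

An empty findings list means the trace matched the story; anything else feeds the storyboard and, in strict mode, the alert.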

Under the hood

From Logs to Events

Traditional logs repeat the same text millions of times. Event templates store the structure once and extract only the variables - cutting bytes and surfacing what matters.

TRADITIONAL LOG (175 bytes)
Payment declined: insufficient funds. Customer cust_8x9k2m attempted $299.99 charge but account balance is $45.20. Retry recommended with alternate payment method.
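The template idea can be sketched with a regex over the log line above. This is an illustration of the concept, not the actual extraction engine: the template and field names are made up for this example.

```python
import re

# Sketch: store the message structure once as a template; per event, keep
# only the extracted variables. Template and field names are illustrative.

TEMPLATE = (
    r"Payment declined: (?P<reason>[\w ]+)\. Customer (?P<customer>\S+) "
    r"attempted \$(?P<amount>[\d.]+) charge but account balance is "
    r"\$(?P<balance>[\d.]+)\. Retry recommended with alternate payment method\."
)

def to_event(line):
    """Match a raw log line against the stored template; return only the
    variables, which is all that needs to be stored per occurrence."""
    m = re.match(TEMPLATE, line)
    return m.groupdict() if m else None

line = ("Payment declined: insufficient funds. Customer cust_8x9k2m attempted "
        "$299.99 charge but account balance is $45.20. Retry recommended with "
        "alternate payment method.")
print(to_event(line))
# → {'reason': 'insufficient funds', 'customer': 'cust_8x9k2m', 'amount': '299.99', 'balance': '45.20'}
```

The 175-byte sentence collapses to four short values plus a template ID - and the fields are now queryable instead of buried in prose.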
Why This Matters Now

AI Agents Need
Behavioral Guardrails

Agents don't just write code - they ship it. They make autonomous decisions. And they can succeed at every step while completely missing the point. Traditional monitoring wasn't built for this.

Silent failures are the new norm

An agent can complete every operation successfully and still violate the declared requirement. Status codes won't catch it.

You need ground truth

The manifest becomes the source of "should" in your system. Not documentation. Not tribal knowledge. A versioned, durable artifact.

Verification happens at runtime

The storyboard doesn't just show what happened - it shows whether it matched the declared intent. Instantly.

It's still OpenTelemetry

No proprietary formats. No vendor lock-in. Standard OTEL spans, enriched with behavioral verification. Use your existing instrumentation.

See It In Action

Try the Interactive Demo

Explore a real working example using Backlog.md - an open-source task manager. See how manifests validate behavior, catch silent failures, and generate storyboards in real time.

No signup required. Fully interactive in your browser.