Agent Integrity Infrastructure · v0.3.0

Agent handoffs leak confidence. We built the fix.

Babel is a validated wire protocol that catches metacognitive poisoning — when Agent A's uncertain guess becomes Agent B's confident input. Grammar validation catches local contradictions. The chain auditor catches global corruption.

npm install babel-validate
+0.60 Quality gain w/ envelopes
-0.76 Loss w/ wrong metadata
60% Derived claims overconfident
12 Experiments · 5,500+ calls

Nobody named this until now

Agent A researches a topic and writes a confident summary. Agent B reads it and makes a decision. But Agent A was guessing — it just didn't say so. By Agent C, the original uncertainty has vanished entirely.

This is metacognitive poisoning — cognitive state corruption that propagates through handoff chains. It's not hallucination. It's not a retrieval failure. It's what happens when agents can read each other's words but can't read each other's minds.

Three independent research groups published work in January 2026 that points the same way, and one of them formally proves this is an architectural invariant: it cannot be eliminated by improving model quality. It's a protocol problem. Babel is the protocol-level solution.

Chain Auditor

Basis laundering — caught

Three agents pass a growth estimate through a chain. Each envelope passes grammar validation individually. Only the chain auditor sees the pattern.

Agent       Claim                            Basis          Score
Scout       "Growth rate likely around 12%"  DERIVED        0.65
Analyst     "Growth rate is 12%"             DERIVED        0.82
Strategist  "12% growth confirmed"           VERIFIED_DATA  0.93
⚠ HIGH POISONING RISK
Confidence inflation: +28 points (0.65 → 0.93) across 3 handoffs
Basis laundering: DERIVED → VERIFIED_DATA at seq 2
Uncertainty repackaged as verified data. Original estimate erased.

Without the auditor, the board gets "confirmed 12% growth" that was always an estimate. With it, the poisoning is flagged before the memo is written.

The same chain, with confidence preserved:

✓ CLEAN
DERIVED @ 0.65 → DERIVED @ 0.68 → DERIVED @ 0.70 · No poisoning detected
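
The two checks described above can be sketched in a few lines. This is a minimal illustration, not the babel-validate implementation: the `Hop` shape, the `auditHops` function, and the 0.15 inflation threshold are all assumptions made for the example. It flags a chain when net confidence rises past the threshold, or when a claim's basis is upgraded to VERIFIED_DATA partway through the chain.

```typescript
// Hypothetical sketch: types, names, and the threshold are illustrative
// assumptions, not the babel-validate API.
type Basis = 'VERIFIED_DATA' | 'DERIVED' | 'REPORTED';

interface Hop {
  agent: string;
  basis: Basis;
  confidence: number; // expected range 0..1
}

interface ChainFinding {
  risk: 'CLEAN' | 'HIGH';
  inflation: number;             // last confidence minus first
  launderedAtSeq: number | null; // first seq where the basis became VERIFIED_DATA
}

// Assumed threshold: a net confidence gain above this counts as inflation.
const INFLATION_LIMIT = 0.15;

function auditHops(hops: Hop[]): ChainFinding {
  if (hops.length === 0) {
    return { risk: 'CLEAN', inflation: 0, launderedAtSeq: null };
  }
  const inflation = hops[hops.length - 1].confidence - hops[0].confidence;

  // Look for the first hop whose basis was upgraded relative to the previous hop.
  let launderedAtSeq: number | null = null;
  for (let seq = 1; seq < hops.length; seq++) {
    const upgraded =
      hops[seq - 1].basis !== 'VERIFIED_DATA' && hops[seq].basis === 'VERIFIED_DATA';
    if (upgraded) {
      launderedAtSeq = seq;
      break;
    }
  }

  const risk = inflation > INFLATION_LIMIT || launderedAtSeq !== null ? 'HIGH' : 'CLEAN';
  return { risk, inflation, launderedAtSeq };
}

// The two chains from the example above.
const poisoned = auditHops([
  { agent: 'Scout', basis: 'DERIVED', confidence: 0.65 },
  { agent: 'Analyst', basis: 'DERIVED', confidence: 0.82 },
  { agent: 'Strategist', basis: 'VERIFIED_DATA', confidence: 0.93 },
]);
const clean = auditHops([
  { agent: 'Scout', basis: 'DERIVED', confidence: 0.65 },
  { agent: 'Analyst', basis: 'DERIVED', confidence: 0.68 },
  { agent: 'Strategist', basis: 'DERIVED', confidence: 0.70 },
]);
console.log(poisoned.risk, poisoned.launderedAtSeq, clean.risk);
```

The sketch treats every basis upgrade as laundering. A real auditor would presumably need a way to tell a legitimate upgrade (the downstream agent actually checked the data) from a silent one.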

Twelve experiments. Non-overlapping CIs.

4.86 Mean score with envelopes
4.26 Mean score, flat text only
3.50 Mean score, wrong metadata
98% Strategy match with envelopes

Right metadata improves decisions. Wrong metadata is catastrophic. Right > None > Wrong — confirmed across three independent experiments with non-overlapping 95% confidence intervals.
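
The headline figures at the top of the page (+0.60 and -0.76) appear to be these mean scores measured against the flat-text baseline. The arithmetic, with illustrative variable names:

```typescript
// Mean scores as reported on this page; the object and field names are illustrative.
const meanScore = { envelopes: 4.86, flatText: 4.26, wrongMetadata: 3.5 };

// Differences relative to the flat-text baseline, rounded to two decimals.
const gain = (meanScore.envelopes - meanScore.flatText).toFixed(2);
const loss = (meanScore.wrongMetadata - meanScore.flatText).toFixed(2);

console.log(gain, loss); // prints "0.60 -0.76"
```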

Agents generate structurally perfect Babel envelopes 100% of the time. But they're overconfident on derived claims 60% of the time, treating inferences as verified data. That's the specific failure mode the chain auditor catches.

The Protocol

Every handoff carries an envelope

Six signal types per envelope: confidence (per-assertion with evidence basis), intent, register, affect state, organizational grounds, and trajectory. Five MUST rules reject contradictions. Six SHOULD rules flag risks.
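
To make the MUST/SHOULD split concrete, here is a small sketch of how the two rule tiers could behave. Everything in it is an assumption made for illustration: the `Assertion` shape, the `checkAssertions` function, and both rules are invented for this example and are not taken from the Babel rule set. The `REPORTED` basis name is inferred from the `.reported()` builder method. A MUST-style violation rejects the envelope, while a SHOULD-style violation only attaches a warning.

```typescript
// Hypothetical sketch: the shapes and both rules are invented for illustration
// and are not the actual Babel MUST/SHOULD rules.
type EvidenceBasis = 'VERIFIED_DATA' | 'DERIVED' | 'REPORTED';

interface Assertion {
  claim: string;
  basis: EvidenceBasis;
  confidence: number;
}

interface RuleResult {
  errors: string[];   // MUST-style violations: the envelope is rejected
  warnings: string[]; // SHOULD-style violations: the envelope passes with flags
}

function checkAssertions(assertions: Assertion[]): RuleResult {
  const result: RuleResult = { errors: [], warnings: [] };
  for (const a of assertions) {
    // Illustrative MUST-style rule: confidence has to lie in [0, 1].
    if (!(a.confidence >= 0 && a.confidence <= 1)) {
      result.errors.push(`confidence out of range for "${a.claim}"`);
    }
    // Illustrative SHOULD-style rule: a very confident inference gets flagged.
    if (a.basis === 'DERIVED' && a.confidence > 0.9) {
      result.warnings.push(`high-confidence DERIVED claim: "${a.claim}"`);
    }
  }
  return result;
}

const accepted = checkAssertions([
  { claim: 'Growth rate around 12%', basis: 'DERIVED', confidence: 0.65 },
]);
const rejected = checkAssertions([
  { claim: 'Growth rate is 12%', basis: 'DERIVED', confidence: 1.2 },
]);
console.log(accepted.errors.length, rejected.errors.length);
```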

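// Build a measured envelope with the fluent API
import { envelope } from 'babel-validate';

// chainId is assumed to be defined by the caller and shared by every envelope
// in the chain; the second argument to chain() appears to be this envelope's
// sequence position (0 = first hop).
const env = envelope()
  .sender('scout-market-intel')
  .recipient('strategist-01')
  .chain(chainId, 0)
  .inform()
  // Each assertion pairs a claim with a confidence score; the method name sets the evidence basis
  .verified('Q3 revenue was $2.1M', 0.95)
  .derived('Growth rate around 12%', 0.65)
  .reported('HealthStack entering mid-market', 0.30)
  .boardFacing()
  .buildAndValidate();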

// Audit a full chain for poisoning
import { auditChain } from 'babel-validate';

// env1, env2, env3: validated envelopes from the same chain, in handoff order
const audits = auditChain([env1, env2, env3]);
// → HIGH risk: basis laundering, confidence inflation

Zero dependencies. TypeScript. Works with CrewAI, LangGraph, AutoGen, or any agent framework that passes messages between agents.

Three teams proved we're right

Romanchuk & Bondar formally proved that standard agent architectures systematically conflate information transport with epistemic justification, an architectural invariant that cannot be fixed by improving model quality. Kelly's "Epistemic Suite" independently coined "confidence laundering", which is exactly what our chain auditor detects. Agentic Uncertainty Quantification proposed verbalized confidence as an active control signal. Our grammar rules are the stricter version: they reject at the wire level.

Their formal proofs. Our implementation. Our empirical results. The full story.

Get Started

The grammar is table stakes.
The auditor is the moat.

Open source validator. Enterprise chain auditing. Consulting into teams running multi-agent workflows.

npm install babel-validate · michael@hearth.so