Why beliefs

A bug you've already debugged

You tell your coding agent: "we use the new use() hook in this codebase, not useEffect." It acknowledges. Eight turns later, it suggests useEffect again, because the training-time prior won and the context window doesn't distinguish "user corrected me" from "I assumed this once."

You've seen this. It looks like flakiness, or model drift, or context rot. It's none of those. Your agent is missing a primitive: a belief state over its environment. A structured model of what's been observed, what's currently held true, and what evidence would change it.

1Turn 3   ─▶ "Use useEffect here"           ⟵ training prior. no source.
2Turn 4   ─▶ User: "We use use() instead"   ⟵ correction. acknowledged.
3Turn 12  ─▶ "Add a useEffect for cleanup"  ⟵ first prior wins. drift wins.

The training prior and the user's correction look identical to the model. There's no tracked confidence, no provenance, no detection that use() supersedes useEffect for this codebase.

With beliefs, the same trace becomes:

1// Turn 3
2await beliefs.add('Use useEffect for side effects', { confidence: 0.4 })  // training prior
3
4// Turn 4
5await beliefs.after('User: we use the new use() hook here, not useEffect.')
6// engine extracts user-assertion, supersedes prior with higher evidence weight
7
8// Turn 12
9const context = await beliefs.before(userMessage)
10// context.prompt surfaces use() as the active convention, with user-asserted provenance
11// the agent doesn't drift back to the training default

A bigger context window doesn't fix this. A 200K window carries more conflicting claims with more fluency, and the model interpolates across all of it. Wider window, more drift, not less.

Memory and RAG don't fix it either

mem0, LangGraph state, the Claude memory tool, RAG over a vector store. All retrieve. None hold a model of what's currently true.

Dimension	Memory / RAG	Beliefs
What it stores	Text chunks and vectors	Structured claims with confidence and evidence
Uncertainty	None. Every chunk looks equally valid	Tracked per claim, separated from how much it's been investigated
Conflicts	Returns both, or last-write-wins	Detected, tracked, resolved by source reliability
Multi-agent fusion	Per-agent stores, no shared coherence	One fused world. Cross-agent contradictions become visible
Decay	Falls out of context randomly	Beliefs lose strength as evidence ages, in a calibrated way
Provenance	"This chunk was retrieved"	Who stated it, what evidence, how confidence evolved
What is missing	No concept	Gaps are first-class. They drive the next action

Memory recalls. Beliefs judge.

What this unlocks

Hidden assumptions become auditable. Named, scored, traceable.
Uncertainty becomes directional. The system ranks what would reduce the most uncertainty as the next move.
Contradictions become signal. Multi-agent overlap feeds the fusion engine. It resolves through evidence.
Gaps are first-class. The agent knows what it doesn't know.

Where to next

Install

API key, install, scopes.

Learn more

Concepts

Beliefs, intent, clarity, moves, world.

Learn more

Patterns

Single-turn, multi-turn, streaming, multi-agent.

Learn more

A bug you've already debugged

1Turn 3   ─▶ "Use useEffect here"           ⟵ training prior. no source.
2Turn 4   ─▶ User: "We use use() instead"   ⟵ correction. acknowledged.
3Turn 12  ─▶ "Add a useEffect for cleanup"  ⟵ first prior wins. drift wins.

The training prior and the user's correction look identical to the model. There's no tracked confidence, no provenance, no detection that use() supersedes useEffect for this codebase.

With beliefs, the same trace becomes:

1// Turn 3
2await beliefs.add('Use useEffect for side effects', { confidence: 0.4 })  // training prior
3
4// Turn 4
5await beliefs.after('User: we use the new use() hook here, not useEffect.')
6// engine extracts user-assertion, supersedes prior with higher evidence weight
7
8// Turn 12
9const context = await beliefs.before(userMessage)
10// context.prompt surfaces use() as the active convention, with user-asserted provenance
11// the agent doesn't drift back to the training default

A bigger context window doesn't fix this. A 200K window carries more conflicting claims with more fluency, and the model interpolates across all of it. Wider window, more drift, not less.

Memory and RAG don't fix it either

mem0, LangGraph state, the Claude memory tool, RAG over a vector store. All retrieve. None hold a model of what's currently true.

Dimension

Memory / RAG

Beliefs

What it stores

Text chunks and vectors

Structured claims with confidence and evidence

Uncertainty

None. Every chunk looks equally valid

Tracked per claim, separated from how much it's been investigated

Conflicts

Returns both, or last-write-wins

Detected, tracked, resolved by source reliability

Multi-agent fusion

Per-agent stores, no shared coherence

One fused world. Cross-agent contradictions become visible

Decay

Falls out of context randomly

Beliefs lose strength as evidence ages, in a calibrated way

Provenance

"This chunk was retrieved"

Who stated it, what evidence, how confidence evolved

What is missing

No concept

Gaps are first-class. They drive the next action

Memory recalls. Beliefs judge.

What this unlocks

Hidden assumptions become auditable. Named, scored, traceable.

Uncertainty becomes directional. The system ranks what would reduce the most uncertainty as the next move.

Contradictions become signal. Multi-agent overlap feeds the fusion engine. It resolves through evidence.

Gaps are first-class. The agent knows what it doesn't know.