Open source · Apache-2.0

Stop feeding your AI chunks.

Coalent caches the understanding your LLM builds — not the raw chunks — reuses it across queries and agents, and refreshes only what a source change touched. Matches chunk-RAG's best accuracy at a fraction of the tokens — plus freshness and compounding reuse RAG can't give.

Get started → · ★ Star on GitHub

pip install coalent · sits on top of the stack you already run

Up to 63% fewer context tokens (structured-regime benchmark)

Traditional RAG

Chunks in. Answers out.

High tokens. Low understanding.

Token usage: 100% (High)

Coalent

Understanding in. Answers out.

Low tokens. High understanding.

Coalent Knowledge · Fresh

Summary

Concise understanding of the source.

Claims

Key assertions and takeaways.

Facts

Atomic facts, numbers, dates.

Provenance

Source links and citations.

Reusable Understanding Layer

LLM

Token usage: 37% (Low)

Reused Across Agents

Provenance-Aware

Always Fresh

Lower Token Costs

Same answer, a fraction of the tokens — try it live ↓

Works with the tools you already run

Qdrant · Chroma · pgvector · GraphRAG · OpenAI · Anthropic · LangGraph · CrewAI · MCP tools · Confluence · Jira · SQLite

See it live

Change a source. Watch it stay correct.

Understanding your AI can act on, and that never goes stale. Pick a question, then change a source.

Understanding · fresh · served · cache hit

Customers can be refunded within 30 days; partial refunds are pro-rated.

Window: 30 days
Partial: Pro-rated
Auto-approve: Under $500

Sources: zendesk:refund-guide, notion:billing-policy

Edit a source and watch it stay correct.

↑ live — only the unit that used the changed source rebuilds.

Why Coalent

Understanding, reused, and always fresh.

Real understanding

Coalent synthesizes your sources into a structured, decision-ready briefing — summary, claims, facts — not a pile of fragments.
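To make the idea of a structured briefing concrete, here is a minimal sketch of what one cached unit might hold. The class and field names (`UnderstandingUnit`, `Fact`) are hypothetical and are not Coalent's actual schema. The values come from the refund example on this page, and the mapping of each fact to a source is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Fact:
    """One atomic fact plus the source it was drawn from."""
    name: str
    value: str
    source: str


@dataclass
class UnderstandingUnit:
    """Hypothetical shape of one cached briefing (not Coalent's real schema)."""
    question: str
    summary: str
    claims: list[str] = field(default_factory=list)
    facts: list[Fact] = field(default_factory=list)

    @property
    def sources(self) -> set[str]:
        # Provenance: every source that any fact in this unit depends on.
        return {fact.source for fact in self.facts}


unit = UnderstandingUnit(
    question="what is our refund policy?",
    summary="Customers can be refunded within 30 days; partial refunds are pro-rated.",
    claims=["Refunds under $500 are auto-approved."],
    facts=[
        # Which fact came from which source is assumed here for illustration.
        Fact("window", "30 days", "zendesk:refund-guide"),
        Fact("partial", "pro-rated", "notion:billing-policy"),
    ],
)
print(sorted(unit.sources))
```

Keeping the source on each fact is what would let a system tell which briefings a changed document affects.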

Reused everywhere

Built once, matched by meaning, reused across every query, session, and agent. Less retrieval, less LLM spend, lower latency.

Always fresh

When a source changes, only the understanding that used it is rebuilt — surgically, the moment it changes. Never stale.
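One way to picture this targeted rebuild is a reverse index from each source to the units that used it. This is a minimal sketch under that assumption. `LineageIndex`, the unit ids, and the `confluence:sla` source are hypothetical, and Coalent's internals may differ.

```python
from collections import defaultdict


class LineageIndex:
    """Toy lineage store: which cached units depend on which sources."""

    def __init__(self) -> None:
        self.units: dict[str, set[str]] = {}  # unit id -> sources it used
        self.by_source: dict[str, set[str]] = defaultdict(set)  # source -> unit ids

    def record(self, unit_id: str, sources: set[str]) -> None:
        self.units[unit_id] = set(sources)
        for source in sources:
            self.by_source[source].add(unit_id)

    def invalidate(self, changed_source: str) -> set[str]:
        """Drop only the units that used the changed source; return their ids."""
        stale = self.by_source.pop(changed_source, set())
        for unit_id in stale:
            # Remove the dropped unit from its other sources' entries too.
            for source in self.units.pop(unit_id, set()):
                if source != changed_source:
                    self.by_source[source].discard(unit_id)
        return stale


index = LineageIndex()
index.record("refund-policy", {"zendesk:refund-guide", "notion:billing-policy"})
index.record("sla", {"confluence:sla"})  # hypothetical unrelated unit

stale = index.invalidate("zendesk:refund-guide")
print(stale)  # only the unit that used the changed source needs a rebuild
```

Units that never touched the changed source stay cached, which is the property the paragraph above describes.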

Measured, not asserted

Same answers. Always fresh. A fraction of the tokens.

On the structured-regime benchmark (synthetic templates, real embeddings, a deterministic number check, four answer models), Coalent matches naive dense RAG's accuracy at about a third of the context tokens (47 vs. 126 per read). It routes to the right source almost every time (route@1 ≈ 1.00), answers the multi-hop questions that naive RAG misses, and never goes stale.

System               Accuracy    Context / read   Multi-hop   Freshness   Notes
Naive dense RAG      baseline    126 tok          0%          fresh       re-reads the full context every query
Normal cache (raw)   = naive     126 tok          0%          stale       no token savings; goes stale on change
Coalent              = naive     47 tok           100%        fresh       ⅓ the tokens · route@1 ≈ 1.00 · always fresh
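The "about a third" and "63% fewer" figures both follow from the two context sizes in the comparison above. A quick check, using only the numbers already on this page:

```python
naive_tokens = 126   # context tokens per read, naive dense RAG
coalent_tokens = 47  # context tokens per read, Coalent

ratio = coalent_tokens / naive_tokens  # share of the naive context
savings = 1 - ratio                    # share of tokens avoided

print(f"{ratio:.0%} of the tokens, {savings:.0%} fewer")
# prints: 37% of the tokens, 63% fewer
```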

Real news (MultiHopRAG, pre-registered n=605): matches the best chunk-RAG arm at 0.79x its tokens · 95% null honesty

= naive accuracy (structured) · 0% stale · up to 63% fewer tokens (structured) · matches the best chunk-RAG arm at 79% of its tokens (real news) · see the benchmark →

Sits on your stack

Bring any retriever. Keep your model.

Coalent is the layer above retrieval — not a replacement. Plug in your vector DB, GraphRAG, tools, or APIs, and your model of choice. It's one call from any agent framework.

Qdrant · Chroma · pgvector · GraphRAG · OpenAI · Anthropic · LangGraph · CrewAI · MCP tools · Confluence · Jira

from coalent import SemanticCache, QdrantRetriever, LLMSynthesizer, OpenAIProvider

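# Assumes `qdrant` (your Qdrant client) and `embed` (your embedding function)
# are already defined elsewhere in your application.
cache = SemanticCache(
    QdrantRetriever(client=qdrant, collection="docs", embed=embed),  # your existing retriever
    LLMSynthesizer(OpenAIProvider()),  # your model of choice
)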

ctx = cache.get("what is our refund policy?")   # decision-ready, fresh

“Store understanding, not data.”

Caches go stale because they store answers without knowing where they came from. Coalent keeps the lineage — so it always knows exactly what to forget.

The Coalent principle

Community

Build it with us.

Questions, ideas, war stories — come hang out. We're building Coalent in the open, and Discord is the fastest way to reach us.

Join our Discord

Get unstuck, share what you're building, and help shape the roadmap. The fastest way to reach the maintainers.

Join the server →

Star us on GitHub

Open source, Apache-2.0. Star it, open issues, send PRs — every star helps an OSS project find its people.

View the repo →

Give your AI living understanding.

Open source, Apache-2.0. Sits on top of what you already run.

Read the docs → · GitHub