Coalent has a small number of moving parts. Let's meet them in plain language, then see how they fit together.
The four pieces
Retriever — how Coalent fetches your information.
You point it at wherever your knowledge lives — a vector database, a set of documents, an API, or a tool. Given a question, it returns the relevant raw pieces (each one is a Chunk). This is the only part most teams write, and it's a single method. → The Retriever
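To make "a single method" concrete, here is a minimal sketch of what such a retriever could look like. The names `Chunk`, `Retriever`, `retrieve`, and `KeywordRetriever` are assumptions made for illustration and are not taken from Coalent's API; the keyword match stands in for whatever lookup your own store provides.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Chunk:
    """One raw piece of evidence (hypothetical shape, not Coalent's class)."""
    text: str
    source: str  # identifier of where the text came from


class Retriever(Protocol):
    """Hypothetical interface: one method, from a question to raw chunks."""

    def retrieve(self, query: str) -> list[Chunk]: ...


class KeywordRetriever:
    """Toy retriever over an in-memory dict of documents."""

    def __init__(self, documents: dict[str, str]) -> None:
        self.documents = documents

    def retrieve(self, query: str) -> list[Chunk]:
        # Return every document that shares at least one word with the query.
        words = set(query.lower().split())
        return [
            Chunk(text=text, source=source)
            for source, text in self.documents.items()
            if words & set(text.lower().split())
        ]


docs = {
    "hr/leave.md": "Employees get 25 days of annual leave.",
    "ops/search.md": "The search service reindexes nightly.",
}
chunks = KeywordRetriever(docs).retrieve("annual leave policy")
```

A real retriever would replace the word overlap with a vector search, an API call, or a tool invocation, but the shape stays the same: a question goes in, a list of chunks with their sources comes out.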
Synthesizer — how raw information becomes understanding. It takes those raw chunks and produces a short, structured understanding — a summary plus the key facts and claims — using your LLM. It also notes which sources it actually used. → The Synthesizer
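The synthesis step can be sketched as a function that prompts an LLM with the chunks and records which sources the answer cites. Everything here is an assumption for illustration: the `Understanding` fields, the `synthesize` signature, the `[source]` citation convention, and `fake_llm`, which stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Understanding:
    """Hypothetical structured output of synthesis."""
    summary: str
    facts: list[str]
    sources_used: list[str]


def synthesize(
    query: str,
    chunks: list[tuple[str, str]],   # (source, text) pairs
    llm: Callable[[str], str],       # prompt -> completion
) -> Understanding:
    # Tag each chunk with its source so the model can cite it.
    evidence = "\n".join(f"[{source}] {text}" for source, text in chunks)
    summary = llm(f"Question: {query}\nEvidence:\n{evidence}")
    # Keep only the sources the summary actually cites.
    used = [source for source, _ in chunks if f"[{source}]" in summary]
    facts = [text for source, text in chunks if source in used]
    return Understanding(summary=summary, facts=facts, sources_used=used)


def fake_llm(prompt: str) -> str:
    # Stand-in for a real LLM call; always cites the leave policy.
    return "Annual leave is 25 days [hr/leave.md]."


result = synthesize(
    "how much annual leave?",
    [
        ("hr/leave.md", "Employees get 25 days of annual leave."),
        ("ops/search.md", "The search service reindexes nightly."),
    ],
    fake_llm,
)
```

Only `hr/leave.md` ends up in `sources_used`, so a later change to `ops/search.md` would have no reason to affect this understanding.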
Cognition unit — one cached understanding. The result of the above: a small briefing for one kind of question. It keeps the understanding, the raw evidence it was built from, and its sources (so Coalent knows what it depends on).
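As a rough picture of what one unit holds, the three parts named above map naturally onto a small record. The class and field names below are hypothetical and chosen for readability; Coalent's actual type may differ.

```python
from dataclasses import dataclass


@dataclass
class CognitionUnit:
    """Hypothetical shape of one cached understanding."""
    query: str
    summary: str                 # the distilled understanding
    facts: list[str]             # key facts and claims
    raw_chunks: list[str]        # the raw evidence it was built from
    sources: frozenset[str]      # provenance: what this unit depends on
    stale: bool = False          # set when one of its sources changes

    def depends_on(self, source: str) -> bool:
        return source in self.sources


unit = CognitionUnit(
    query="how much annual leave?",
    summary="Annual leave is 25 days.",
    facts=["Employees get 25 days of annual leave."],
    raw_chunks=["Employees get 25 days of annual leave."],
    sources=frozenset({"hr/leave.md"}),
)
```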
Invalidation — keeping it correct. When one of those sources changes, Coalent marks only the cognition units that used it as stale. They rebuild on the next read — nothing else is touched. → Provenance & freshness
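One way to mark only the affected units is a reverse index from each source to the units built from it. This is a minimal sketch of that idea under assumed names (`ProvenanceIndex`, `record`, `source_changed`); it is not a description of how Coalent stores provenance internally.

```python
from collections import defaultdict


class ProvenanceIndex:
    """Toy reverse index: source -> ids of the units that used it."""

    def __init__(self) -> None:
        self._units_by_source: dict[str, set[str]] = defaultdict(set)
        self.stale: set[str] = set()

    def record(self, unit_id: str, sources: list[str]) -> None:
        # Called when a unit is built, with the sources it actually used.
        for source in sources:
            self._units_by_source[source].add(unit_id)

    def source_changed(self, source: str) -> set[str]:
        # Mark only the units that depend on this source; rebuild is deferred.
        affected = set(self._units_by_source.get(source, ()))
        self.stale |= affected
        return affected


index = ProvenanceIndex()
index.record("leave-allowance", ["hr/leave.md"])
index.record("search-latency", ["ops/search.md", "ops/infra.md"])
affected = index.source_changed("ops/search.md")
```

Nothing is rebuilt at the moment of the change. The stale flag is only consulted the next time a matching question arrives, which is what keeps the rebuild lazy.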
The big picture
cache.get("why is the search service slow?")
1. Embed & match: embed the query; reuse a cached unit when the meaning matches.
2. Retriever: on a miss, fetch the evidence for the query using your retrieval (your vector DB, GraphRAG, tools, or APIs).
3. Synthesizer: build structured understanding and cite the sources it used (your LLM).
4. Cache the unit: store the understanding, the retained raw evidence, and provenance (in memory or SQLite).
5. Return: hand back the minimum decision-relevant context (raw on tap).
When a source changes, provenance marks only the cognition units that used it as stale. The next matching read rebuilds just those, lazily, and nothing else.
- First time you ask (a miss): Coalent retrieves, synthesizes a cognition unit, and caches it.
- Ask something similar (a hit): it serves the cached understanding without retrieving or synthesizing again. Matching is by meaning, so a rephrase still hits.
- A source changes: only the units that used it rebuild, lazily, on the next read.
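The three cases above can be traced end to end with a toy cache. This is a sketch, not Coalent's implementation: `TinyCache`, `invalidate`, and the `builds` counter are hypothetical, and the `_key` function is a crude stand-in for embedding-based matching that only ignores case, punctuation, and word order.

```python
class TinyCache:
    """Toy cache illustrating miss, hit, and lazy rebuild."""

    def __init__(self, retrieve, synthesize):
        self.retrieve = retrieve      # query -> list of (source, text)
        self.synthesize = synthesize  # (query, chunks) -> understanding
        self.units = {}               # key -> unit dict
        self.builds = 0               # how many times a unit was (re)built

    @staticmethod
    def _key(query):
        # Stand-in for embedding match: ignore case, punctuation, word order.
        cleaned = "".join(c if c.isalnum() else " " for c in query.lower())
        return frozenset(cleaned.split())

    def get(self, query):
        key = self._key(query)
        unit = self.units.get(key)
        if unit is None or unit["stale"]:
            # Miss or stale unit: retrieve, synthesize, and cache.
            chunks = self.retrieve(query)
            unit = {
                "understanding": self.synthesize(query, chunks),
                "raw": chunks,
                "sources": {source for source, _ in chunks},
                "stale": False,
            }
            self.units[key] = unit
            self.builds += 1
        return unit["understanding"]

    def invalidate(self, source):
        # Mark dependants stale; they rebuild on their next read.
        for unit in self.units.values():
            if source in unit["sources"]:
                unit["stale"] = True


docs = {"ops/search.md": "Search is slow during nightly reindexing."}


def retrieve(query):
    return list(docs.items())


def synthesize(query, chunks):
    return " ".join(text for _, text in chunks)


cache = TinyCache(retrieve, synthesize)
first = cache.get("Why is the search service slow?")   # miss: builds the unit
second = cache.get("why is the search service slow")   # hit: served from cache
builds_before_change = cache.builds

docs["ops/search.md"] = "Search is slow because the index is unsharded."
cache.invalidate("ops/search.md")
third = cache.get("Why is the search service slow?")   # stale: rebuilt now
```

The build counter goes up on the first read and again only after the source changes, which mirrors the miss, hit, and rebuild cases in the list.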
Matched by meaning
You never label your questions or configure routing. Coalent matches a question to cached understanding by its embedding — so "how much annual leave?" and "what's the leave allowance?" land on the same unit. The cache organizes itself around what questions mean.
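Matching by embedding usually comes down to comparing vectors. The sketch below uses cosine similarity with a cutoff; the three-dimensional vectors are made up by hand for illustration (a real embedding model produces much longer ones), and the `0.9` threshold and the `best_match` helper are assumptions, not Coalent settings.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(query_vec, cached, threshold=0.9):
    """Return the key of the closest cached unit, or None below the threshold."""
    best_key, best_score = None, threshold
    for key, vec in cached.items():
        score = cosine(query_vec, vec)
        if score >= best_score:
            best_key, best_score = key, score
    return best_key


# Hand-made toy embeddings of two cached questions.
cached = {
    "what's the leave allowance?": [0.9, 0.1, 0.0],
    "why is the search service slow?": [0.0, 0.2, 0.9],
}

# Toy embedding of "how much annual leave?": close to the leave question.
hit = best_match([0.8, 0.2, 0.1], cached)

# Toy embedding of an unrelated question: close to nothing in the cache.
miss = best_match([0.0, 1.0, 0.0], cached)
```

The threshold is the trade-off to watch: set it too low and different questions share a unit, set it too high and simple rephrasings miss.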
Decision-ready, not a data dump
A cognition unit isn't a pile of chunks — it's distilled, structured understanding, and on each read Coalent hands back the minimum slice relevant to your question (with the raw always reachable if the model needs a specific detail). Less noise to the LLM means better answers and fewer tokens. → Context intelligence
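To illustrate what handing back a "minimum slice" could mean, the sketch below ranks a unit's facts by word overlap with the question and keeps only the top few. The overlap scoring, the `minimum_slice` name, and the `limit` parameter are assumptions chosen for simplicity; Coalent's actual selection logic is not described here and may work differently.

```python
def tokens(text):
    # Lowercase words with surrounding punctuation stripped.
    return {word.strip(".,?!").lower() for word in text.split()} - {""}


def minimum_slice(question, facts, limit=2):
    """Keep the facts that overlap most with the question, up to `limit`."""
    question_words = tokens(question)
    scored = [(len(question_words & tokens(fact)), fact) for fact in facts]
    relevant = [pair for pair in scored if pair[0] > 0]
    # sort is stable, so ties keep their original order
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [fact for _, fact in relevant[:limit]]


facts = [
    "Annual leave is 25 days.",
    "Unused leave carries over up to 5 days.",
    "The office closes at 18:00.",
]
context = minimum_slice("How many days of annual leave?", facts, limit=1)
```

The facts left out of the slice are not discarded. They stay in the unit, so a follow-up question about carry-over can still be answered from the same cached understanding.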
Next
- The Retriever — connect your data.
- The Synthesizer — build understanding with your LLM.
- Examples — wire it to real systems.