A Retriever is how Coalent reaches your knowledge. Given a question, it returns the relevant raw pieces — each a Chunk. It's the only part most teams write, and the design gives you a ladder from zero-effort to total control:

You have…	Use	Effort
Qdrant / Chroma / pgvector	a shipped adapter	one line
another vector DB	extend `BaseVectorRetriever`	two methods
an existing search function	wrap it with `FunctionRetriever`	one line
several sources at once	fuse them with `CompositeRetriever`	one line
anything else	implement the `Retriever` interface	full control


Bring your own client. You pass the client you configured, at the version you run — Coalent never constructs or pins it, and uses capability detection so a client upgrade doesn't break the adapter. No vendor lock-in, nothing in your lockfile to fight.
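Capability detection means probing the client object for the methods it actually exposes instead of checking a version string. The sketch below is illustrative only and is not Coalent's internal code: `pick_search_method`, `NewClient`, and `OldClient` are hypothetical names, and the two method names are assumptions modelled on a client that renamed its search call between releases.

```python
from typing import Any, Callable


def pick_search_method(client: Any) -> Callable[..., Any]:
    """Return the first search-like method the client exposes.

    Probing with getattr lets the same adapter code work across client
    versions, as long as one of the known method names is present.
    """
    for name in ("query_points", "search"):  # newest name first (assumed names)
        method = getattr(client, name, None)
        if callable(method):
            return method
    raise TypeError(f"{type(client).__name__} exposes no known search method")


class NewClient:  # hypothetical: a newer client release
    def query_points(self, collection: str, vector: list) -> list:
        return [f"new:{collection}"]


class OldClient:  # hypothetical: an older client release
    def search(self, collection: str, vector: list) -> list:
        return [f"old:{collection}"]


# The caller never branches on a version number.
new_hits = pick_search_method(NewClient())("docs", [0.1, 0.2])
old_hits = pick_search_method(OldClient())("docs", [0.1, 0.2])
```

Both calls go through the same code path; only the method that gets resolved differs.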

The contract

def retrieve(self, query: str, *, namespace: str | None = None) -> list[Chunk]:
    ...

Retriever is a structural protocol — any class with this method is a retriever. There's nothing to inherit (the base and adapters below are conveniences, not requirements).
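To make "structural" concrete, here is a minimal, self-contained sketch using `typing.Protocol`. The `Chunk` and `Retriever` definitions are stand-ins written for this example (assumed to mirror the fields and signature shown on this page), and `WikiRetriever` is a hypothetical class that inherits from nothing.

```python
from dataclasses import dataclass
from typing import Optional, Protocol, runtime_checkable


@dataclass(frozen=True)
class Chunk:  # stand-in for coalent.Chunk
    artifact_id: str
    text: str
    version: Optional[str] = None


@runtime_checkable
class Retriever(Protocol):  # stand-in for the contract above
    def retrieve(self, query: str, *, namespace: Optional[str] = None) -> list:
        ...


class WikiRetriever:  # hypothetical; note there is no base class
    def __init__(self, pages: dict) -> None:
        self.pages = pages

    def retrieve(self, query: str, *, namespace: Optional[str] = None) -> list:
        # Naive substring match, just enough to return something.
        needle = query.lower()
        return [
            Chunk(artifact_id=f"wiki:{page_id}", text=text)
            for page_id, text in self.pages.items()
            if needle in text.lower()
        ]


retriever = WikiRetriever({"42": "Leave policy: 21 days of annual leave."})
chunks = retriever.retrieve("leave policy")
is_retriever = isinstance(retriever, Retriever)  # checks for the method only
```

The `isinstance` check passes because the method exists, not because of any inheritance. Note that `runtime_checkable` only verifies the method is present, not its signature.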

A Chunk is the content plus a stable id for its source:

from coalent import Chunk

Chunk(artifact_id="confluence:98231", text="Leave policy: 21 days...", version="7")

The artifact_id is the only thing freshness depends on — derive it from the source you fetched (its document id, row key, or URL). The same id is what you fire on change, so invalidation lines up. Never hardcode it.
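One way to keep the two ids aligned is to build them with the same rule in both places. This is a sketch under assumptions: the `page` and `event` dictionaries are hypothetical shapes for a fetched document and a change notification, and `Chunk` is a local stand-in with the three fields shown above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Chunk:  # stand-in for coalent.Chunk
    artifact_id: str
    text: str
    version: Optional[str] = None


def chunk_from_page(page: dict) -> Chunk:
    """Build a Chunk whose id comes from the fetched record itself."""
    return Chunk(
        artifact_id=f"confluence:{page['id']}",  # derived, never a literal
        text=page["body"],
        version=str(page["version"]),
    )


def artifact_id_for_event(event: dict) -> str:
    """The id to fire on change, built with the same rule as chunk_from_page."""
    return f"confluence:{event['page_id']}"


page = {"id": 98231, "body": "Leave policy: 21 days...", "version": 7}
chunk = chunk_from_page(page)
changed = artifact_id_for_event({"page_id": 98231, "type": "page_updated"})
```

Because both functions format the id identically, the change event for page 98231 names exactly the artifact the chunk was stored under.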

Shipped adapters — Qdrant, Chroma, pgvector

Pass your client and the field names; that's it:

from coalent import QdrantRetriever

retriever = QdrantRetriever(
    client=my_qdrant_client,   # you create and configure this
    collection="docs",
    embed=my_embed,            # any text -> vector function
)

from coalent import ChromaRetriever, PgVectorRetriever

ChromaRetriever(collection=my_chroma_collection, embed=my_embed)
PgVectorRetriever(connection=my_pg_conn, table="docs", embed=my_embed)

Want hybrid search, custom filters, or reranking? Pass your own search= callable. The adapter delegates the search to it unchanged, so you keep every vendor feature:

QdrantRetriever(client=client, collection="docs", search=my_hybrid_search)

Other vector DBs — extend the base

BaseVectorRetriever removes the boilerplate. Implement search (call your client's native API) and to_chunk (map one hit, deriving artifact_id):

from coalent import BaseVectorRetriever, Chunk

class WeaviateRetriever(BaseVectorRetriever):
    def search(self, query, namespace):
        return self.client.near_text(query, limit=6)        # your client, in full

    def to_chunk(self, hit):
        return Chunk(artifact_id=hit["doc_id"], text=hit["text"], version=hit["rev"])
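
For intuition, this is roughly the boilerplate such a base can take off your hands. It is a sketch of the template-method pattern, not the real `BaseVectorRetriever`, which may do more: `SketchVectorRetriever`, `FakeClient`, and `FakeStoreRetriever` are hypothetical names, and `Chunk` is a local stand-in.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Optional


@dataclass(frozen=True)
class Chunk:  # stand-in for coalent.Chunk
    artifact_id: str
    text: str
    version: Optional[str] = None


class SketchVectorRetriever:
    """Hypothetical base: subclasses supply search() and to_chunk()."""

    def __init__(self, client: Any) -> None:
        self.client = client

    def search(self, query: str, namespace: Optional[str]) -> Iterable[Any]:
        raise NotImplementedError

    def to_chunk(self, hit: Any) -> Chunk:
        raise NotImplementedError

    def retrieve(self, query: str, *, namespace: Optional[str] = None) -> list:
        # The shared part: run the search, then map every hit to a Chunk.
        return [self.to_chunk(hit) for hit in self.search(query, namespace)]


class FakeClient:  # hypothetical client returning dict-shaped hits
    def near_text(self, query: str, limit: int) -> list:
        hits = [{"doc_id": "kb:7", "text": "Leave policy: 21 days.", "rev": "3"}]
        return hits[:limit]


class FakeStoreRetriever(SketchVectorRetriever):
    def search(self, query, namespace):
        return self.client.near_text(query, limit=6)

    def to_chunk(self, hit):
        return Chunk(artifact_id=hit["doc_id"], text=hit["text"], version=hit["rev"])


chunks = FakeStoreRetriever(FakeClient()).retrieve("leave policy")
```

The subclass only knows about its client and its hit shape; `retrieve` and its signature live in one place.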

A function — the escape hatch

from coalent import FunctionRetriever, Chunk

retriever = FunctionRetriever(
    lambda q, ns: [Chunk(artifact_id=h.doc_id, text=h.text, version=h.rev) for h in my_search(q)]
)

Multiple sources — fuse them

Have a vector store and a live tool and Confluence? Wrap them in a CompositeRetriever and hand that one retriever to the cache. It fans out to all of them, merges the evidence, and the cache builds a single understanding whose provenance spans every source — so a change to any one of them invalidates the unit.

from coalent import CompositeRetriever, SemanticCache

retriever = CompositeRetriever([vector_retriever, tool_retriever, confluence_retriever])
cache = SemanticCache(retriever, synthesizer)

A flaky source fails open: if one sub-retriever errors (say, a tool returns a 5xx), the error is logged, that source is skipped, and the fused read still returns everything the healthy sources produced.
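
The fail-open behaviour can be pictured with a few lines of plain Python. This is a sketch of the idea, not `CompositeRetriever` itself: `FailOpenComposite`, `StaticRetriever`, and `BrokenRetriever` are hypothetical, the fan-out here is sequential for simplicity, and no deduplication or ranking is attempted.

```python
import logging
from dataclasses import dataclass
from typing import Optional

log = logging.getLogger("composite_sketch")


@dataclass(frozen=True)
class Chunk:  # stand-in for coalent.Chunk
    artifact_id: str
    text: str
    version: Optional[str] = None


class StaticRetriever:  # hypothetical healthy source
    def __init__(self, chunks):
        self.chunks = list(chunks)

    def retrieve(self, query, *, namespace=None):
        return list(self.chunks)


class BrokenRetriever:  # hypothetical source that always errors
    def retrieve(self, query, *, namespace=None):
        raise RuntimeError("tool returned 503")


class FailOpenComposite:
    def __init__(self, retrievers):
        self.retrievers = list(retrievers)

    def retrieve(self, query, *, namespace=None):
        merged = []
        for source in self.retrievers:
            try:
                merged.extend(source.retrieve(query, namespace=namespace))
            except Exception as exc:
                # Log and skip: one bad source must not sink the whole read.
                log.warning("skipping %s: %s", type(source).__name__, exc)
        return merged


fused = FailOpenComposite([
    StaticRetriever([Chunk("vector:1", "Leave policy: 21 days.")]),
    BrokenRetriever(),
    StaticRetriever([Chunk("confluence:9", "Carry-over: 5 days.")]),
]).retrieve("leave policy")
```

The broken source contributes a log line and nothing else; the two healthy sources' chunks come back in order.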

Give each source a distinct artifact_id (e.g. confluence:hr vs tool:hr-live). If two sources return the same id with different text, both are kept and freshness for that artifact gets noisy. Prefer separate caches (or namespaces) when you'd rather keep sources isolated than fuse them.

In-memory — for trying it out

from coalent import InMemoryRetriever

retriever = InMemoryRetriever()
retriever.add("confluence:98231", "Leave policy: 21 days of annual leave.")

Use the document id as artifact_id (not the per-chunk point uuid), so all chunks of a document share one id and a single change invalidates the unit cleanly.
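
A small illustration of why the document id is the better key. The hit dictionaries below are hypothetical (a `point_id` per stored vector plus a `doc_id` payload field), and `group_texts` is a helper written for this example.

```python
from collections import defaultdict

# Hypothetical vector-store hits: every point has its own uuid,
# while doc_id names the document the text was cut from.
hits = [
    {"point_id": "a1f0", "doc_id": "confluence:98231", "text": "Leave policy: 21 days."},
    {"point_id": "b2e9", "doc_id": "confluence:98231", "text": "Carry-over: up to 5 days."},
    {"point_id": "c3d8", "doc_id": "confluence:77102", "text": "Expense limit: 50 per day."},
]


def group_texts(hits, key):
    """Group chunk texts under whichever field is used as the artifact id."""
    groups = defaultdict(list)
    for hit in hits:
        groups[hit[key]].append(hit["text"])
    return dict(groups)


by_document = group_texts(hits, "doc_id")
by_point = group_texts(hits, "point_id")

# A single change event for the page reaches both of its chunks...
stale_by_document = by_document["confluence:98231"]
# ...whereas keyed by point uuid, each chunk sits under its own id.
ids_needed_by_point = len(by_point)
```

Keyed by document, one event for `confluence:98231` covers both chunks. Keyed by point uuid, the same edit would need one event per chunk, and the source system rarely knows those uuids.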

Embeddings — how the cache matches queries by meaning

The cache finds a hit by embedding the query and comparing it (cosine) to cached units. The embedder it uses matters a lot for hit quality:

With coalent[openai] installed and OPENAI_API_KEY set, the cache auto-uses OpenAI embeddings (text-embedding-3-small) — accurate semantic matching out of the box.
Otherwise it warns and falls back to HashingEmbedder, which matches on keyword overlap, not meaning — so "leave policy?" and "how many vacation days?" won't match. Fine for demos, not production.

Override anytime:

from coalent import SemanticCache, OpenAIEmbedder, FunctionEmbedder

# OpenAI (text-embedding-3-large for max accuracy):
cache = SemanticCache(retriever, synthesizer, embedder=OpenAIEmbedder("text-embedding-3-large"))

# or a local / self-hosted model via any callable:
# embedder=FunctionEmbedder(lambda t: my_model.encode(t).tolist())

The default HashingEmbedder is lexical. For real semantic cache hits, use OpenAIEmbedder (or your own embedder) — this is the #1 thing to set for accuracy.
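
To see why a lexical embedder misses paraphrases, here is a toy hashing embedder with cosine similarity. It is an assumption-laden sketch, not the shipped `HashingEmbedder`: the tokenizer, the bucket count, and the use of SHA-256 are choices made for this example.

```python
import hashlib
import math
import re

DIM = 1 << 16  # many buckets, so accidental collisions are unlikely


def hashing_embed(text: str) -> list:
    """Bag-of-words vector: each token increments one hashed bucket."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    return vec


def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


query = hashing_embed("leave policy?")
same_words = cosine(query, hashing_embed("annual leave policy"))
paraphrase = cosine(query, hashing_embed("how many vacation days?"))
```

`same_words` is high (roughly 0.82 when no buckets collide) because two tokens overlap. `paraphrase` shares no tokens with the query, so its score is zero barring a hash collision, even though both questions ask about the same policy. A semantic embedder is what closes that gap.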

Vector-search example — a shipped adapter end-to-end, with invalidation.
The Synthesizer — turns the chunks you return into understanding.
