This is a single paste-and-run script (no API key, no services) that exercises the whole loop: build, semantic hit, source change → surgical invalidation, and durable persistence. Swap the in-memory pieces for your real ones and nothing else changes.

The script

from coalent import (
    SemanticCache, InMemoryRetriever, StubSynthesizer, SQLiteCognitionStore,
)

# 1. Sources. In production this is your vector DB / tools — here, in-memory.
#    artifact_id is the document's natural id (derived from the source, not hardcoded).
def doc_id(page) -> str:
    return f"confluence:{page['id']}"

pages = [
    {"id": "98231", "body": "Leave policy: full-time staff get 21 days of annual leave."},
    {"id": "44120", "body": "Remote work: up to 3 days per week with manager approval."},
]

retriever = InMemoryRetriever()
for page in pages:
    retriever.add(doc_id(page), page["body"], version="1")

# 2. The cache — durable, so it survives a restart.
store = SQLiteCognitionStore("coalent.db")
cache = SemanticCache(retriever, StubSynthesizer(), store=store)

# 3. Cold read — builds and caches a unit.
ctx = cache.get("how much annual leave do we get?")
print("hit:", ctx.cache_hit)        # False
print("raw:", ctx.raw_text)         # has "21 days"

# 4. Warm read — a rephrase hits the same unit by meaning.
print("hit:", cache.get("what is the leave allowance?").cache_hit)   # True

# 5. A source changes — derive the SAME id and invalidate.
changed_page = {"id": "98231", "body": "Leave policy: now 25 days of annual leave."}
retriever.add(doc_id(changed_page), changed_page["body"], version="2")   # re-ingest
result = cache.source_changed(doc_id(changed_page), text=changed_page["body"])
print("dirtied:", result.dirtied)   # the unit that used confluence:98231

# 6. Next read rebuilds just that unit — fresh.
fresh = cache.get("how much annual leave do we get?")
print("hit:", fresh.cache_hit, "| raw:", fresh.raw_text)   # rebuilt, now "25 days"

print(cache.stats())                # {'units': 1, 'tracked_artifacts': 2}

What just happened

Sources carry natural ids (confluence:<id>), derived by doc_id() — never literals.
The cache is backed by SQLite, so units persist and the invalidation graph is rebuilt on restart.
A cold read builds a unit; a rephrase is a semantic hit (no rebuild).
A source change dirties only the unit that used it; the next read rebuilds just that one, with the new value.
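To make "dirties only the unit that used it" concrete, here is a minimal sketch of a reverse index from artifact id to the units built from it. It is a toy, not coalent's implementation: Unit, ToyInvalidationIndex, and their fields are hypothetical names invented for this illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Unit:
    """A cached answer plus the ids of the sources it was built from (hypothetical shape)."""
    question: str
    text: str
    artifact_ids: frozenset
    dirty: bool = False


class ToyInvalidationIndex:
    """Hypothetical reverse index: artifact id -> questions whose unit used it."""

    def __init__(self):
        self.units = {}
        self.by_artifact = defaultdict(set)

    def put(self, unit):
        # Store the unit and record which artifacts it depends on.
        self.units[unit.question] = unit
        for artifact_id in unit.artifact_ids:
            self.by_artifact[artifact_id].add(unit.question)

    def source_changed(self, artifact_id):
        # Mark only the units that depend on artifact_id; leave the rest alone.
        dirtied = sorted(self.by_artifact.get(artifact_id, ()))
        for question in dirtied:
            self.units[question].dirty = True
        return dirtied


index = ToyInvalidationIndex()
index.put(Unit("annual leave?", "21 days", frozenset({"confluence:98231"})))
index.put(Unit("remote work?", "3 days per week", frozenset({"confluence:44120"})))

dirtied = index.source_changed("confluence:98231")
print(dirtied)                              # ['annual leave?']
print(index.units["remote work?"].dirty)    # False
```

The lookup touches only the entries recorded for the changed id, so the remote-work unit stays clean and would still be served as a hit.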

Persistence across a restart

Run the script again (or in a new process pointing at coalent.db) and step 3 is a hit from disk, so it prints True rather than the False annotated for a first run. source_changed still finds and dirties the right unit, because the indexes are rebuilt on load.
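The restart behaviour can be pictured with the standard-library sqlite3 module. This sketch assumes a made-up one-table schema (units, with question, body, and artifact_ids columns) and a hypothetical rebuild_index() helper. SQLiteCognitionStore's real schema is not shown in this document and may differ.

```python
import json
import os
import sqlite3
import tempfile
from collections import defaultdict


def open_store(path):
    # Hypothetical schema: one row per cached unit, dependencies stored as JSON.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS units ("
        "question TEXT PRIMARY KEY, body TEXT NOT NULL, artifact_ids TEXT NOT NULL)"
    )
    return conn


def save_unit(conn, question, body, artifact_ids):
    conn.execute(
        "INSERT OR REPLACE INTO units VALUES (?, ?, ?)",
        (question, body, json.dumps(sorted(artifact_ids))),
    )
    conn.commit()


def rebuild_index(conn):
    # Recreate the artifact -> questions map from the persisted rows.
    index = defaultdict(set)
    for question, artifact_ids in conn.execute("SELECT question, artifact_ids FROM units"):
        for artifact_id in json.loads(artifact_ids):
            index[artifact_id].add(question)
    return index


path = os.path.join(tempfile.mkdtemp(), "toy_units.db")

# "First process": build a unit and persist it.
conn = open_store(path)
save_unit(conn, "annual leave?", "21 days", {"confluence:98231"})
conn.close()

# "Second process": reopen the same file and rebuild the in-memory index.
conn = open_store(path)
restored_index = rebuild_index(conn)
restored_body = conn.execute(
    "SELECT body FROM units WHERE question = ?", ("annual leave?",)
).fetchone()[0]
conn.close()

print(restored_body)                          # 21 days
print(restored_index["confluence:98231"])     # {'annual leave?'}
```

Because the dependencies are stored next to each unit, the in-memory index is derived state: it can be thrown away and rebuilt from the file on every start.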

Taking it to production

Swap one piece at a time — the rest is unchanged:

Real retrieval → replace InMemoryRetriever with your vector retriever (or tools / GraphRAG).
Real understanding → replace StubSynthesizer with LLMSynthesizer(OpenAIProvider()).
Automatic freshness → wire webhooks or a FreshnessPolicy instead of calling source_changed by hand.
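As a rough sketch of the webhook route, the handler below derives the id with the same doc_id() as the script and forwards the new text to source_changed. The payload shape, handle_page_updated, and the RecordingCache stand-in are assumptions made for this example. Only the source_changed(artifact_id, text=...) call shape is taken from the script above.

```python
import json


def doc_id(page) -> str:
    # Same derivation as the script: the id comes from the source, not a literal.
    return f"confluence:{page['id']}"


class RecordingCache:
    """Stand-in for the real cache; it records calls instead of invalidating."""

    def __init__(self):
        self.calls = []

    def source_changed(self, artifact_id, text=None):
        self.calls.append((artifact_id, text))


def handle_page_updated(cache, raw_body: bytes) -> str:
    # Hypothetical webhook handler; assumes a JSON payload like {"id": ..., "body": ...}.
    page = json.loads(raw_body)
    artifact_id = doc_id(page)
    cache.source_changed(artifact_id, text=page["body"])
    return artifact_id


cache = RecordingCache()
payload = json.dumps(
    {"id": "98231", "body": "Leave policy: now 25 days of annual leave."}
).encode()

handled = handle_page_updated(cache, payload)
print(handled)        # confluence:98231
print(cache.calls)
```

In a real deployment the handler would also re-ingest the page into your retriever, as step 5 of the script does, so the rebuild reads the new text.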

How it works — the architecture behind this script.
Benchmark — the cheap+fresh result, measured.