Self-hosted · MCP-native · MIT-licensed

Memory for AI agents that stays on your servers — and tells you when it's lying.

Mnemos is a self-hosted memory + evidence layer for AI agents. Every recall ships with citations, every contradiction surfaces explicitly in the query response, and every byte stays on your infrastructure. Drops into Claude Desktop, Cursor, LangGraph, or any MCP-compatible runtime in one config block.
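As a sketch of what that one config block can look like in Claude Desktop's `claude_desktop_config.json` (the `mnemos mcp` subcommand and the server name are assumptions for illustration, not documented flags):

```json
{
  "mcpServers": {
    "mnemos": {
      "command": "mnemos",
      "args": ["mcp"]
    }
  }
}
```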

1.0 · holdout recall pass rate
1.0 · critical pass rate
5 min · to first contradiction check
5 backends · SQLite, Postgres, MySQL, libSQL, memory

Built for teams where wrong memory is a business risk: regulated B2B AI, internal copilots, and on-prem agent platforms.

Own your memory infrastructure. Single Go binary, MIT license, and storage in your stack.

Reduce hallucination risk from memory drift. Contradictions are first-class and surfaced before response generation.

Pass audits faster. Every claim links to evidence and every correction carries a reason.

Integrate with your current agent runtime. Use HTTP, CLI, gRPC, or MCP with LangGraph, CrewAI, and custom systems.

Live proof

Same prompt, safer outcome when memory is auditable.

Mnemos helps your system avoid confidently repeating stale or contradictory memory by forcing evidence and contradiction handling into the loop.

Without Mnemos

Agent answer

"Deployment succeeded in production."

No contradiction surfaced.
No evidence shown.
Wrong state repeated.

With Mnemos

Agent answer + memory audit context

"There are contradictory claims about deployment status.
Latest resolution: rolled back at 14:02."

evidence: ev_1
contradiction: [cl_a, cl_b]
confidence: 0.84

OSS today, Team in private beta, Enterprise on request. OSS is the whole product: MIT-licensed, 5 storage backends, MCP / REST / gRPC. For managed registry sync + SSO, join the Team waitlist. For air-gapped deploy + SLA, email felix@felixgeelhaar.de; founder-led, replies same business day.

Fast activation

Start with one endpoint, expand only when needed.

Mnemos exposes one HTTP API. Any language with an HTTP client can add memory in five lines — copy/paste, then own it.

import httpx, uuid
m = "http://localhost:7777"
run = str(uuid.uuid4())

# Remember something
httpx.post(f"{m}/v1/events", json={"events": [{
    "id": str(uuid.uuid4()),
    "run_id": run,
    "source_input_id": "chat-session-1",
    "content": "user prefers vegetarian options",
    "timestamp": "2026-05-03T16:00:00Z",
    "metadata": {"role": "preference"},
}]})

# Recall it later (months later, same call)
events = httpx.get(f"{m}/v1/events", params={"run_id": run}).json()

Why this matters. Every event is keyed by run_id, so the full chain can be replayed months later. Add structured claims (fact, decision, hypothesis) when you need contradiction detection and evidence linking. Open source, self-hosted, no SDK to lock you in.

Proof of value

When facts conflict, Mnemos surfaces it before your AI response hardens the error.

Beyond the five-line memory write, Mnemos extracts structured claims, links evidence, and detects contradictions — the parts that make replay-by-run-id more than a glorified log. No API keys required for any of the below; add an LLM provider only when you want grounded answers.

$ go install github.com/felixgeelhaar/mnemos/cmd/mnemos@latest

$ mnemos process --text "The deployment succeeded in production.
The deployment did not succeed in production. Response times averaged 45ms."
ingested 1 event · extracted 3 claims · 1 contradiction detected

$ mnemos query "What happened with the deployment?"
{
  "claims": [
    { "id": "cl_a", "text": "The deployment succeeded in production",
      "type": "fact", "confidence": 0.91,
      "evidence": [{ "event_id": "ev_1", "span": [0, 47] }] },
    { "id": "cl_b", "text": "The deployment did not succeed in production",
      "type": "fact", "confidence": 0.91,
      "evidence": [{ "event_id": "ev_1", "span": [49, 95] }] }
  ],
  "contradictions": [
    { "between": ["cl_a", "cl_b"], "kind": "polarity_conflict" }
  ]
}

$ mnemos resolve cl_a --over cl_b --reason "rolled back at 14:02"
resolved · cl_a now valid_to=2026-05-02T14:02:00Z

$ mnemos query --at 2026-05-01 "What happened with the deployment?"
# returns the pre-rollback world — same store, point-in-time view

Operational outcome. Rule-based extraction, contradiction detection, evidence linking, and point-in-time replay — all without a single LLM call. The audit chain is the differentiator: every claim points at the event it came from, every resolution carries a reason, every query at a past timestamp returns what was true then.
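The JSON shape above is all an agent needs to gate its own answer. A minimal sketch of such a gate (the payload is the example `mnemos query` output from above; the blocking policy itself is illustrative, not part of Mnemos):

```python
import json

def audit_gate(query_response: dict) -> dict:
    """Decide whether an agent may answer from memory.

    Illustrative policy, not part of Mnemos: block when every claim
    is part of an unresolved contradiction.
    """
    contested = set()
    for c in query_response.get("contradictions", []):
        contested.update(c["between"])
    usable = [cl for cl in query_response.get("claims", [])
              if cl["id"] not in contested]
    return {
        "answerable": len(usable) > 0,
        "contested_claims": sorted(contested),
        "usable_claims": [cl["id"] for cl in usable],
    }

# The example response from `mnemos query` above:
response = json.loads("""{
  "claims": [
    {"id": "cl_a", "text": "The deployment succeeded in production",
     "type": "fact", "confidence": 0.91,
     "evidence": [{"event_id": "ev_1", "span": [0, 47]}]},
    {"id": "cl_b", "text": "The deployment did not succeed in production",
     "type": "fact", "confidence": 0.91,
     "evidence": [{"event_id": "ev_1", "span": [49, 95]}]}
  ],
  "contradictions": [{"between": ["cl_a", "cl_b"], "kind": "polarity_conflict"}]
}""")

verdict = audit_gate(response)
# Both claims are contested, so nothing is usable until `mnemos resolve` runs.
```

Once `mnemos resolve cl_a --over cl_b` records a resolution, the contradiction list empties and the gate opens.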

Where it fits

Most memory tools optimize for convenience. Mnemos optimizes for trust.

Mnemos is one of several ways to add memory to an AI app. The right choice depends on whether you need a hosted SDK, raw similarity search, a notes app, or a substrate you self-host. Mnemos is the last one — and the only one that ships with contradiction detection and replay-by-run-id built in.

mem0 / Zep / Letta (hosted)
Best for: Convenience, SDK polish, fast onboarding.
What you give up: Vendor cloud, per-call billing, contradictions silently merged, no per-claim evidence.

Vector databases
Best for: Pure semantic search across raw chunks.
What you give up: No claim/contradiction structure, no evidence trace, no replay.

Notes apps (Notion, Obsidian, Roam)
Best for: Humans organising their own thinking.
What you give up: Not built to be queried by an AI agent at scale; no programmatic API for memory writes.

Mnemos
Best for: AI memory in stacks that can't leave your servers — regulated, on-prem, air-gapped.
What you give up: You operate a binary + database in exchange for ownership, auditability, and no per-call memory tax.

Architecture

Ingest · Extract · Relate · Query.

Events are immutable. Claims are derived. Relationships connect claims. Embeddings rerank. Trust falls out of confidence × corroboration × freshness. No magic.
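The exact weighting behind that trust signal is internal to Mnemos; a toy sketch of the confidence × corroboration × freshness shape, with all constants illustrative:

```python
from datetime import datetime, timedelta, timezone

def trust(confidence: float, corroborations: int, observed_at: datetime,
          now=None, half_life_days: float = 30.0) -> float:
    """Toy trust score: confidence x corroboration x freshness.

    Illustrative only -- the real weighting is internal to Mnemos.
    """
    now = now or datetime.now(timezone.utc)
    age_days = max((now - observed_at).total_seconds() / 86400.0, 0.0)
    freshness = 0.5 ** (age_days / half_life_days)      # halves every 30 days
    corroboration = 1.0 - 0.5 ** (1 + corroborations)   # saturates toward 1.0
    return confidence * corroboration * freshness

now = datetime(2026, 5, 3, tzinfo=timezone.utc)
fresh = trust(0.9, 2, now - timedelta(days=1), now=now)
stale = trust(0.9, 2, now - timedelta(days=90), now=now)
# fresh > stale: the same claim decays as its evidence ages
```

The point of the shape, not the constants: a well-corroborated, recently observed claim outranks the same claim with stale evidence.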

Run it

From install to trust signal in under five minutes.

brew install felixgeelhaar/tap/mnemos
# or: go install github.com/felixgeelhaar/mnemos/cmd/mnemos@latest

mnemos process --text "The deployment succeeded in production.
The deployment did not succeed in production."

mnemos query "What happened with the deployment?"

SQLite ships with the binary. No Docker, no hosted service, no API key. Add a provider when you want grounded answers:

export MNEMOS_LLM_PROVIDER=anthropic
export MNEMOS_LLM_API_KEY=sk-ant-…
mnemos query --llm "What happened with the deployment?"

Without an LLM key. Rule-based extraction, contradiction detection, evidence linking, and point-in-time queries all work offline. Semantic search falls back to BM25 (fine on short text), and grounded answer generation falls back to a template. Wire a provider when you need the LLM paths; the audit chain stays the same either way.

Inspect the audit chain

mnemos history --kind=claim cl_a   # every state change with reason
mnemos verify cl_a                  # re-check against source events
mnemos export --kind=lesson         # YAML-frontmatter markdown for review

Wrap LangGraph, CrewAI, or any MCP-compatible agent

Mnemos doubles as the audit substrate beneath any AI agent. Each node or step posts one event keyed to a single run_id; weeks later, the full reasoning chain is one HTTP call away.

cd examples/refund_triage_langgraph
pip install -r requirements.txt
python agent.py --customer-id CUST-42 --amount 245.00

curl -s "http://localhost:7777/v1/events?run_id=<run-id>" | jq

The example wires raw HTTP, no SDK — four lines per node get you a defensible audit trail. Source: examples/refund_triage_langgraph/.
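What those four lines look like in practice, as a sketch against the same `/v1/events` endpoint used earlier (the node names and metadata key are hypothetical, not taken from the example):

```python
import uuid
from datetime import datetime, timezone

MNEMOS = "http://localhost:7777"

def node_event(run_id: str, node: str, content: str) -> dict:
    """Build the event payload one agent node posts per step."""
    return {
        "id": str(uuid.uuid4()),
        "run_id": run_id,
        "source_input_id": node,   # hypothetical node name, e.g. "triage"
        "content": content,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "metadata": {"node": node},
    }

# Inside each LangGraph/CrewAI node, the four lines are roughly:
#   import httpx
#   payload = node_event(run_id, "triage", "refund flagged for manual review")
#   httpx.post(f"{MNEMOS}/v1/events", json={"events": [payload]})
run_id = str(uuid.uuid4())
event = node_event(run_id, "triage", "refund flagged for manual review")
```

Because every node shares one run_id, the later `curl ... ?run_id=<run-id>` call returns the whole chain in order.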

FAQ

Questions teams ask before they replace hosted memory

Do I need an LLM API key to start?

No. Ingest, extraction, contradiction detection, evidence linking, and replay work offline. Add an LLM later for grounded natural-language answers.

Can we keep data fully in our environment?

Yes. Mnemos is self-hosted. You choose backend and network boundaries. Nothing requires vendor cloud by default.

How hard is migration from a hosted memory API?

Most teams start by dual-writing events for one workflow, validate replay/audit quality, then switch reads after confidence checks pass.
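A sketch of that dual-write step (the hosted-memory call is a deliberate placeholder; only the Mnemos payload mirrors the `/v1/events` shape shown earlier):

```python
import uuid
from datetime import datetime, timezone

def dual_write(hosted_write, mnemos_post, run_id: str, content: str) -> dict:
    """Write one memory to the existing hosted API and shadow it to Mnemos.

    `hosted_write` and `mnemos_post` are injected callables, so the hosted
    SDK stays a placeholder while Mnemos receives the documented
    /v1/events payload. Reads keep coming from the hosted API until
    replay/audit checks pass.
    """
    event = {
        "id": str(uuid.uuid4()),
        "run_id": run_id,
        "source_input_id": "migration-dual-write",  # hypothetical label
        "content": content,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "metadata": {"origin": "dual-write"},
    }
    hosted_write(content)               # existing hosted memory write
    mnemos_post({"events": [event]})    # shadow write to Mnemos
    return event

# Stand-ins for the two sinks, just to show the flow:
hosted_log, mnemos_log = [], []
evt = dual_write(hosted_log.append, mnemos_log.append,
                 run_id="run-42", content="user prefers vegetarian options")
```

Switching reads later is then a one-line change per workflow: point recall at `GET /v1/events?run_id=...` instead of the hosted SDK.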

What is the operational burden?

Single Go binary, SQLite by default. Move to Postgres/MySQL/libSQL when you need shared infra and stronger operational controls.