Anatoly - multi-LLM AI agent that audits your codebase

Use case

Audit your code on local LLMs. No network egress.

Run the full multi-axis audit on a model you host yourself. With local RAG embeddings, no source byte ever crosses your firewall — works with Ollama, LM Studio, vLLM, or any OpenAI-compatible server.

Why local LLMs for code review

Most AI code review tools assume your code can leave your network. It rarely can. Banking, healthcare, defense, public sector, tech due diligence under NDA — these are environments where shipping source to a hosted SaaS endpoint is simply a non-starter.

Anatoly was designed from day one to run as a CLI on your machine with no required outbound calls beyond the model provider you choose. Choose a local provider and the audit pipeline - multi-LLM orchestration, semantic indexing, evidence gathering, and the three-tier deliberation pass - runs entirely inside your perimeter.

[Diagram: your network, no outbound traffic. Inside your perimeter: CLI (anatoly, your code) · LLM (Ollama, localhost:11434) · RAG (GGUF / Jina, on-device). No third-party API, no SaaS, no telemetry.]

Wire it up in two minutes

Drop a .anatoly.yml at your project root, point Anatoly at your local server, and run.

.anatoly.yml · local-only mode
version: 3

providers:
  ollama:
    transport: openai_compatible
    base_url: http://localhost:11434/v1
    models:
      - qwen2.5-coder:14b
  local-embeddings:
    transport: onnxruntime_node    # in-process, no API key
    models:
      - jinaai/jina-embeddings-v2-base-code

routing:
  generation:
    quality: ollama/qwen2.5-coder:14b
  embeddings:
    code: local-embeddings/jinaai/jina-embeddings-v2-base-code

~/your-project
$ ollama serve # in another shell
$ npx anatoly run --plain
→ scanning · 142 files · TS/TSX
→ embedding · 1842 chunks · jina-v3-cpu (4m12s)
→ axes (parallel) · qwen2.5-coder:14b · 7/7 OK
→ deliberation · 18 findings → 9 confirmed
 .anatoly/runs/2026-05-05_103245/public_report.md
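
Before the first run, it can help to confirm that the local server answers on its OpenAI-compatible endpoint and that the model named in .anatoly.yml is available. The sketch below is not part of Anatoly. It assumes the server exposes the standard OpenAI-style GET /models route under base_url (Ollama does, at /v1/models), and the helper names models_url, model_ids, and list_models are hypothetical.

```python
import json
import urllib.request


def models_url(base_url: str) -> str:
    """Build the OpenAI-style model listing URL from a provider base_url."""
    return base_url.rstrip("/") + "/models"


def model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style {"data": [{"id": ...}]} payload."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(base_url: str, timeout: float = 5.0) -> list[str]:
    """Ask the local server which models it serves (one HTTP GET to base_url)."""
    with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
        return model_ids(json.load(resp))
```

With `ollama serve` running, `"qwen2.5-coder:14b" in list_models("http://localhost:11434/v1")` should be true once the model has been pulled. If the call raises a connection error, the server is not listening on that address.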

Supported local backends

Local RAG embeddings

The semantic index is the part that often forces teams back to a hosted embeddings API. Anatoly ships two local options:

Wire either as a provider in .anatoly.yml, then point routing.embeddings.code at it.

What "no network egress" actually means

With a local provider configured for both generation and embeddings, Anatoly makes zero outbound calls. Even the dependency-README pass that grounds the Correction axis reads READMEs from your local node_modules/ - never from npm, GitHub, or any registry.

Air-gapped boxes, defense procurement, NDA-bound client perimeters: no firewall exception to request, no egress proxy to whitelist.
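
One way to sanity-check a config before relying on it in a restricted environment is to confirm that every base_url points at a loopback or private address. This is a minimal sketch, independent of Anatoly: the function names are hypothetical, and the providers mapping is typed in by hand rather than parsed from .anatoly.yml. It inspects URLs only and does not observe actual traffic.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlparse


def is_local_endpoint(base_url: str) -> bool:
    """True if the host is 'localhost' or a loopback/private-range IP literal."""
    host = urlparse(base_url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Any other DNS name cannot be judged without resolving it,
        # so flag it as non-local and review it by hand.
        return False
    return addr.is_loopback or addr.is_private


def non_local_providers(providers: dict[str, Optional[str]]) -> list[str]:
    """Names of providers whose base_url is set and does not look local."""
    return [
        name
        for name, url in providers.items()
        if url is not None and not is_local_endpoint(url)
    ]


# Mirrors the sample config: in-process embeddings have no base_url.
providers = {
    "ollama": "http://localhost:11434/v1",
    "local-embeddings": None,
}
print(non_local_providers(providers))
```

An empty list means no provider in the mapping points outside the machine or private network. Internal hostnames are reported as non-local on purpose, so they get a manual look.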

Who this fits

What you get

The same Markdown report as the cloud-backed runs: a headline verdict, a 7-axis scorecard, and findings that name file + line + evidence - every one proven by grep, file reads, and RAG queries against the local index. The reports are reproducible and live in your repo under .anatoly/.
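
To pick up the newest report from a script, for example to attach it to a CI artifact, something like the following could work. It is a sketch that assumes the layout seen in the sample run above, .anatoly/runs/<timestamp>/public_report.md, with timestamp directory names that sort chronologically as strings. The function latest_report is hypothetical and not an Anatoly API.

```python
from pathlib import Path
from typing import Optional


def latest_report(project_root: str) -> Optional[Path]:
    """Return the public_report.md of the most recent run, or None if absent."""
    runs = Path(project_root) / ".anatoly" / "runs"
    if not runs.is_dir():
        return None
    # Directory names such as 2026-05-05_103245 sort oldest to newest,
    # so walk them in reverse and stop at the first run that has a report.
    run_dirs = sorted(
        (d for d in runs.iterdir() if d.is_dir()),
        key=lambda d: d.name,
        reverse=True,
    )
    for run_dir in run_dirs:
        report = run_dir / "public_report.md"
        if report.is_file():
            return report
    return None
```

Calling `latest_report(".")` from the project root returns a Path to the report, or None when no run has completed yet.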
