Anatoly - multi-LLM AI agent that audits your codebase
For codebases that already exist.

“Can I clean here?”

Audit what's already there.
Stay current as it changes.

Like a senior code reviewer that has to show its work.

Run it once on your repo as it stands today: a verdict, a 7-axis scorecard, findings that name file + line + evidence - every one proven by grep, file reads, and a local semantic index. From then on, it lives in your IDE - incremental, $0 on unchanged files.

Three ways to authenticate with your AI providers - no lock-in
subscriptions: Claude Code · Gemini CLI
api key: Anthropic · OpenAI · Google · Mistral · xAI · Groq · …
local: Ollama · LM Studio · vLLM
~/your-project - anatoly
$ npx anatoly run

    ___                __        __
   /   |  ____  ____ _/ /_____  / /_  __
  / /| | / __ \/ __ `/ __/ __ \/ / / / /
 / ___ |/ / / / /_/ / /_/ /_/ / / /_/ /
/_/  |_/_/ /_/\__,_/\__/\____/_/\__, /  [v0.9.4]
                               /____/   "Heavy!"
============================================

Pipeline
────────────────────────────
 Embedding code                        done
 LLM summaries & embedding             done
 Chunking project docs                 done
 Chunking internal docs                done
 Reviewing files                42 findings
 Deliberation                  12 confirmed
 Updating internal docs                done
 Generating report                     done

Done - 22 findings · 4 clean · 10m 5s
────────────────────────────
run          2026-04-28_133253
report       .anatoly/runs/.../report.md
reviews      .anatoly/runs/.../reviews/
transcripts  .anatoly/runs/.../logs/
log          .anatoly/runs/.../anatoly.ndjson

Cost: 0.00$ with claude code
01 - Real run

A report you can read in 10 seconds, and defend in 10 minutes.

Every run produces a single Markdown report - a headline verdict, a per-axis scorecard, and findings sourced with file + line + evidence. Here's a real one, on the public slot-engine benchmark.

15 files reviewed in 10 min - $6.13 in AI analysis so you don't have to.

Verdict: NEEDS_REFACTOR · 26 findings in 11 files

Axis             Health
Correction       🟥🟥🟥🟥🟥🟥🟥🟥🟥⬜ 93% OK
Utility          🟥🟥🟥🟥🟥🟥🟥🟥⬜⬜ 83% used
Duplication      🟥🟥🟥🟥🟥🟥🟥🟥🟥⬜ 90% unique
Overengineering  🟩🟩🟩🟩🟩🟩🟩🟩🟩⬜ 90% lean
Best Practices   🟥🟥🟥🟥🟥🟥🟥⬜⬜⬜ avg 7.4/10

A few of the 26 findings

  • 🐛 src/engine.ts computePayout

    The house-edge multiplier sign is inverted: it boosts payout instead of reducing it. RTP is broken.

  • 🐛 src/rng.ts weightedPick

    Math.random() used as the RNG source - in a function whose own docstring says "suitable for gaming RNG applications."

  • ♻️ src/types.ts LegacySpinResult

    Exported, imported by 0 files.

  • 📋 src/engine.ts checkLine · src/paytable.ts lineWins

    Semantically identical (RAG cosine 0.852): same WILD-skip, same consecutive-match loop, different names. A textual diff would never see it.

  • 🏗️ src/engine.ts EngineContainer

    A bespoke IoC container backed by a stringly-typed Map<string, unknown>, holding three values.
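
The duplication finding above quotes a cosine score between two functions. As a rough illustration of what that number measures, the sketch below computes cosine similarity between two embedding vectors. It is not Anatoly's implementation: the function names, the sample vectors and the 0.85 cutoff are assumptions made for the example.

```typescript
// Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|).
// Illustrative only; names and the threshold below are hypothetical.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical cutoff above which two functions count as semantic duplicates.
const DUPLICATE_THRESHOLD = 0.85;

function isSemanticDuplicate(a: number[], b: number[]): boolean {
  return cosineSimilarity(a, b) >= DUPLICATE_THRESHOLD;
}
```

Because the score depends only on the direction of the vectors, two functions with different names and formatting can still score high when their embeddings point the same way, which is why a textual diff misses them.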

See the full report on the slot-engine benchmark →

Every audit we publish is open

The reports above aren't curated highlights. The bench is reproducible, and the audits we run on real third-party code are published as-is - verdicts, findings, costs, runtimes. Read whichever feels closer to your codebase.

02 - Two ways to live with it

Run and walk away.
Or audit as you ship.

A 10-min audit is too long to sit through. So Anatoly doesn't ask you to. Pick the rail that matches how you work - they share the same engine.

01 fire and walk
Telegram · team

Get pinged when it's done.

For periodic and team work.

~
$ anatoly notifications create-bot
→ wizard creates bot · saves token · ~2 min
  • On every anatoly run, you get the verdict, scorecard, top findings, and a link to the full report.
  • One bot, whole team. Other devs add their username to .anatoly.yml - Anatoly resolves it on first run and caches it.
  • Fire-and-forget. Missing token, unresolved username, Telegram API down - the run logs a warning and produces the report anyway.
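
The fire-and-forget behaviour in the last bullet can be sketched as a wrapper that swallows notification errors and only logs them. This is a minimal sketch, assuming a generic async sender; `Notifier` and `notifySafely` are hypothetical names, not part of Anatoly's API.

```typescript
// Fire-and-forget: a failed notification must never fail the audit run.
// `Notifier` and `notifySafely` are hypothetical names for this sketch.
type Notifier = (message: string) => Promise<void>;

async function notifySafely(
  send: Notifier,
  message: string,
  warn: (msg: string) => void = console.warn,
): Promise<boolean> {
  try {
    await send(message);
    return true;
  } catch (err) {
    // Log and move on so the run can still produce its report.
    const reason = err instanceof Error ? err.message : String(err);
    warn(`notification skipped: ${reason}`);
    return false;
  }
}
```

The boolean return value lets the caller record whether the ping went out without ever having to handle an exception.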
02 audit as you ship
IDE · daily

Every save, audited.

For daily IC work.

~
$ anatoly hook init
→ writes Claude Code hook config
$ anatoly watch
→ daemon · re-audits on change
$ anatoly run
→ always incremental · --no-cache to re-run full
  • Claude Code hook. Write → audit → fix loop, with anti-loop protection. Findings surface in your editor session.
  • Watch mode. Daemon that monitors changes and re-runs only on what's been touched.
  • Always incremental. anatoly run only re-reviews files that changed (SHA-256 cache, $0 on unchanged). Use --no-cache when you need to re-run on the full codebase.
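
The incremental behaviour described above boils down to comparing content hashes between runs. The sketch below shows one way to do that with Node's built-in `crypto` module. The cache shape (path to hex digest) and the helper names are assumptions for illustration; the text only states that the cache is keyed on SHA-256.

```typescript
import { createHash } from "node:crypto";

// Content-hash cache: a file is re-reviewed only when its SHA-256 changes.
// The cache shape (path -> hex digest) is an assumption for this sketch.
type HashCache = Map<string, string>;

function sha256(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Returns the paths whose content differs from the cached digest and
// updates the cache in place so the next call sees them as unchanged.
function filesToReview(
  files: Record<string, string>, // path -> current file content
  cache: HashCache,
): string[] {
  const changed: string[] = [];
  for (const path of Object.keys(files)) {
    const digest = sha256(files[path]);
    if (cache.get(path) !== digest) {
      changed.push(path);
      cache.set(path, digest);
    }
  }
  return changed;
}
```

A real tool would probably write the new digest only after the review succeeds, so that a crashed review is retried on the next run. The sketch skips that step for brevity.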
03 - Audit your way

Your conventions, calibrated.

Drop an ANATOLY.md at your project root - Anatoly reads it on every run and shapes its judgment to your team's standards. No config language, no rules engine. Just Markdown.

ANATOLY.md · project root · auto-detected
# ANATOLY.md

## best_practices
- We don't use barrel exports. Flag any new
  index.ts that re-exports.
- Date handling: always Temporal.PlainDate,
  never Date.

## correction
- Components are colocated with their .test.tsx -
  a component without a sibling test = finding.

## documentation
- Public API exports must have JSDoc with at
  least one @example.

per-axis calibration

Each section calibrates its matching axis. Empty section = axis runs on defaults.

dilution warning

Sections > 2500 tokens trigger a warning - focused beats exhaustive. Anatoly tells you when you over-explain.
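
To make the per-axis sections and the dilution warning concrete, here is a minimal sketch that splits an ANATOLY.md-style file on `## ` headings and flags sections over the 2500-token mark. It is an illustration, not Anatoly's parser: the function names are hypothetical, and tokens are approximated as characters divided by four, which is a rough heuristic rather than a real tokenizer.

```typescript
// Split an ANATOLY.md-style document into "## section" blocks and flag
// sections that look too long. Token counts are approximated as
// characters / 4: a rough heuristic, not a real tokenizer.
const TOKEN_WARNING_LIMIT = 2500; // threshold quoted in the text above

interface Section {
  name: string;
  text: string;
}

function splitSections(markdown: string): Section[] {
  const sections: Section[] = [];
  let current: Section | null = null;
  for (const line of markdown.split(/\r?\n/)) {
    const heading = /^##\s+(.+?)\s*$/.exec(line);
    if (heading) {
      current = { name: heading[1], text: "" };
      sections.push(current);
    } else if (current) {
      current.text += line + "\n";
    }
  }
  return sections.map((s) => ({ name: s.name, text: s.text.trim() }));
}

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Names of the sections whose estimated size exceeds the warning limit.
function oversizedSections(markdown: string): string[] {
  return splitSections(markdown)
    .filter((s) => estimateTokens(s.text) > TOKEN_WARNING_LIMIT)
    .map((s) => s.name);
}
```

Lines before the first `## ` heading, such as the `# ANATOLY.md` title, are ignored by this sketch.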

auditors: kickoff deliverable

Write it with your client at kickoff. The audit runs against their conventions, not yours.

04 - Reads, writes, remembers

Builds the docs you wish you had.

Before reviewing, Anatoly reverse-engineers your codebase and writes a structured documentation tree under .anatoly/docs/. On every subsequent run, that documentation becomes business context for the audit - findings get smarter because the agent already understands what the code is supposed to do.

.anatoly/docs/ · auto-scaffolded · 19 files
.anatoly/docs/
├── index.md
├── 01-Getting-Started/
│   ├── 01-Overview.md
│   ├── 02-Installation.md
│   ├── 03-Configuration.md
│   └── 04-Quick-Start.md
├── 02-Architecture/
│   ├── 01-System-Overview.md
│   ├── 02-Core-Concepts.md
│   ├── 03-Data-Flow.md
│   └── 04-Design-Decisions.md
├── 03-Guides/ … 3 files
├── 04-API-Reference/ … 3 files
└── 05-Development/ … 4 files
01-Overview.md · reverse-engineered
# Overview

> A pure-logic TypeScript library for
  simulating a 5-reel, 3-row slot machine
  with paylines, wilds, and a progressive
  jackpot.

## What It Does

A single call to spin() performs:

1. Reel generation - five reels
   of three rows, weighted-random
   symbols.
2. Payline evaluation - ten lines
   checked left-to-right, with WILD
   substitution.
3. Wild multiplier - exponential
   bonus when WILDs are in a winning line.
   … 4 more steps

audit context loop

Subsequent runs use the docs as business context. Findings get smarter run after run.

auditor deliverable

Hand your client an audit and a complete doc tree they didn't have before.

instant onboarding

Run anatoly docs scaffold project to copy the tree to docs/ and publish.

See a real generated doc on the slot-engine benchmark →
05 - Stay in your perimeter

Audit code that can't leave your network.

Anatoly runs end-to-end on local models. Point it at Ollama, LM Studio, or any OpenAI-compatible server - combine with local RAG embeddings, and no byte of your code reaches a third party.

.anatoly.yml · local-only mode
providers:
  ollama:
    base_url: http://localhost:11434/v1
    transport: openai-compatible

rag:
  code_model: auto    # GGUF (GPU) or Jina (CPU)
  nlp_model: auto

regulated industries

Banking, healthcare, defense, public sector - wherever code can't touch a SaaS endpoint.

client NDAs

Tech due diligence, expert witness work, M&A - the audit runs inside the client's perimeter.

data sovereignty

EU, FR, defense procurement - your prompts and your code stay under your jurisdiction.

Built for teams whose code simply doesn't get to leave the building.

06 - The seven axes

A multi-LLM agent with read access to your
entire codebase and a local semantic index.

For every file, Anatoly runs seven audits in parallel - each routed to the model that fits it best. The agent must grep the codebase, read related files, and query the RAG index before any finding is reported. False positives are filtered through a 3-tier refinement pipeline.

Utility
recommended: claude haiku
verdicts: USED · DEAD · LOW_VALUE
Dead exports, unused code

Duplication
recommended: claude haiku
verdicts: UNIQUE · DUPLICATE
Semantic duplicates across the project

Correction
recommended: claude sonnet
verdicts: OK · NEEDS_FIX · ERROR
Bugs, logic errors, async issues

Overengineering
recommended: claude sonnet
verdicts: LEAN · OVER · ACCEPTABLE
Excess complexity vs purpose

Tests
recommended: claude sonnet
verdicts: GOOD · WEAK · NONE
Coverage quality per symbol

Best Practices
recommended: claude sonnet
verdicts: SCORE 0–10 · 17 RULES
Language-specific, context-aware

Documentation
recommended: claude sonnet
verdicts: DOCUMENTED · PARTIAL · UNDOCUMENTED
JSDoc gaps, /docs/ desync

Each axis is crash-isolated · all 7 run in parallel · routed to the model that fits the question
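
One common way to get "parallel and crash-isolated" in TypeScript is `Promise.allSettled`, which waits for every task and never rejects. The sketch below shows that pattern; it is an assumption about how such isolation could look, not a description of Anatoly's internals, and `AxisFn`, `AxisOutcome` and `runAxes` are hypothetical names.

```typescript
// Run every axis concurrently; one axis throwing must not sink the others.
// `AxisFn`, `AxisOutcome` and `runAxes` are placeholders for this sketch.
type AxisFn = (file: string) => Promise<string>;

interface AxisOutcome {
  axis: string;
  verdict: string | null; // null when the axis crashed
  error: string | null; // null when the axis succeeded
}

async function runAxes(
  file: string,
  axes: Record<string, AxisFn>,
): Promise<AxisOutcome[]> {
  const names = Object.keys(axes);
  // The async wrapper turns synchronous throws into rejections, and
  // Promise.allSettled reports each axis separately instead of rejecting.
  const settled = await Promise.allSettled(
    names.map(async (name) => axes[name](file)),
  );
  return settled.map((result, i) =>
    result.status === "fulfilled"
      ? { axis: names[i], verdict: result.value, error: null }
      : { axis: names[i], verdict: null, error: String(result.reason) },
  );
}
```

With `Promise.all`, the first failing axis would reject the whole batch; `allSettled` keeps the verdicts from the axes that finished.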

→ Deliberation

Two heavier models step in
when a finding is ambiguous.

claude opus · 3-tier deliberation

Every finding runs through a 3-tier deliberation pipeline before it ships. Tier 1: auto-resolution against the usage graph, AST, and RAG index ($0). Tier 2: inter-axis contradiction detection. Tier 3: claude opus with full tool access - Read · Grep · Bash · WebFetch - reasons through empirical evidence.

claude sonnet · two-pass correction

The Correction axis re-evaluates each finding against your project's dependency manifest (package.json · Cargo.toml · go.mod · pyproject.toml · …) and the README of each declared dependency, to filter API-misunderstanding false positives.

Deliberation is what separates a flagged finding from a proven one.
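
The tiered pipeline described above can be read as an escalation chain: try the cheapest check first and only move to a more expensive tier when the previous one could not decide. The sketch below shows that general shape under stated assumptions. The `Decision` values, the `Tier` interface and `deliberate` are hypothetical, and each tier's real logic (usage graph, contradiction detection, an LLM investigator) is left to the caller.

```typescript
// Escalation chain: try the cheapest tier first and stop at the first
// decisive answer. The types and names here are stand-ins for this sketch.
type Decision = "confirmed" | "dismissed" | "unresolved";

interface Finding {
  file: string;
  symbol: string;
  claim: string;
}

interface Tier {
  name: string;
  decide: (finding: Finding) => Promise<Decision>;
}

async function deliberate(
  finding: Finding,
  tiers: Tier[],
): Promise<{ decision: Decision; decidedBy: string | null }> {
  for (const tier of tiers) {
    const decision = await tier.decide(finding);
    if (decision !== "unresolved") {
      return { decision, decidedBy: tier.name };
    }
  }
  // No tier was decisive; the caller chooses how to treat this case.
  return { decision: "unresolved", decidedBy: null };
}
```

Ordering the tiers from free to expensive means the costly model is consulted only for the findings that the cheaper checks leave open.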

7 parallel audit axes
$6 first run · 15-file TS repo
10+ supported languages
$0 on unchanged files
07 - Integrations

Three integration paths.
Available today.

01 subscriptions

Claude Code

Drives Anatoly through your existing Claude.ai subscription. Tool use, real-time hooks (PostToolUse, Stop), and the Opus deliberation pass - no API key required.

$ npx anatoly run
02 subscriptions

Gemini CLI

Route axes to Gemini 2.5 Flash via Google OAuth - your Code Assist subscription, $0/token. Reduces Claude calls by ~69% with circuit-breaker fallback.

$ anatoly providers --gemini
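
The text mentions a circuit-breaker fallback without describing it. As a generic illustration of the pattern, the sketch below wraps a primary call and a fallback call: after a number of consecutive primary failures it stops calling the primary for a cooldown period. All names, defaults and the injected clock are assumptions for the example, not Anatoly's configuration.

```typescript
// Minimal circuit breaker: after `maxFailures` consecutive primary failures,
// skip the primary for `cooldownMs` and go straight to the fallback.
// All names and defaults here are illustrative.
function withCircuitBreaker<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
  maxFailures = 3,
  cooldownMs = 60000,
  now: () => number = Date.now,
): () => Promise<T> {
  let failures = 0;
  let openedAt: number | null = null;

  return async () => {
    if (openedAt !== null) {
      if (now() - openedAt < cooldownMs) {
        return fallback(); // circuit open: do not touch the primary
      }
      openedAt = null; // cooldown elapsed: allow one trial call
    }
    try {
      const result = await primary();
      failures = 0; // a success closes the circuit again
      return result;
    } catch {
      failures += 1;
      if (failures >= maxFailures) openedAt = now();
      return fallback();
    }
  };
}
```

The point of the pattern is that a provider that is rate-limited or down stops costing latency on every call, while a single trial call after the cooldown lets it recover on its own.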
03 byok · api key

Any modern provider

Direct API access to Anthropic, OpenAI, Google, Mistral, xAI, Groq, and more - your keys, your spend. Mix per axis: route Correction to Sonnet, Utility to Haiku.

$ anatoly run --provider openai
08 - You own it all

You own the intelligence,
the rules, the data.
Always.

Anatoly doesn't sit in the middle. It runs on your machine - your keys, your conventions, your code, your reports stay yours.

intelligence

Your keys. Your providers. Your bill.

BYOK across Claude, Gemini OAuth, OpenAI, Mistral, xAI, Groq, and any local server. Mix per axis - route correction to Sonnet, duplication to Gemini Flash. No proxy, no middleman, no lock-in. Better model tomorrow? Swap it in.

rules

Your conventions. Your dialect.

Drop an ANATOLY.md at the project root and the LLM's judgment calibrates to your standards. Per-axis sections, no rules engine, no DSL - just Markdown. Auditors: write it with your client at kickoff so the audit runs against their conventions.

data

Your code stays yours.

Run end-to-end on local models (Ollama, LM Studio, GGUF embeddings) when your code can't leave your network. Reports, findings, and generated docs live in your repo under .anatoly/. No SaaS dashboard, no third-party storage of audit history.

09 - Why evidence-based

“A finding must be proven,
or it doesn't ship.”

Every claim on the report card was reasoned out by the agent - grep first, then file reads, then RAG queries on the semantic index. A 3-tier refinement pass strips contradictions and ambiguous cases (auto-resolution → inter-axis contradiction detection → an Opus investigator for the rest). What lands in the report has been argued for.

10 - Who it's for

Built for three profiles maintaining code they didn't fully write.

engineers

Tech leads and senior ICs maintaining a codebase - solo, in a team, or in a PR queue.

solo builders

Founders and indie hackers who shipped fast with Cursor or Claude Code and need to know what they actually have.

auditors

Consultants and tech-DD analysts producing reports that have to hold up to scrutiny - including under client NDA.

From 100 to 100,000+ lines. From the laptop to the boardroom.

11 - Community

Built in the open.
Talk to the dev.

If you have a question, an idea, a bug report, or you just want to see how a multi-LLM agent is built from the inside - come hang out.

Anatoly audits Anatoly. The Checked-by-Anatoly badge on our README runs the full audit pipeline on the product itself, on every release.

12 - FAQ
Will Anatoly modify my code?

No. Read-only audit. Auto-clean is opt-in and commits each fix individually so you can review and revert per-commit.

Will Anatoly stay free?

Yes. Always. AGPL-3.0. A commercial license is available for companies that cannot comply with AGPL.

Do I bring my own model keys?

Yes. BYOK is the default - your keys, your providers, your spend. No proxy, no middleman. Use Claude Code or Gemini CLI subscriptions, direct API keys, or fully local models (Ollama, LM Studio, vLLM).

Is it CI-friendly?

Yes - exit codes (0 / 1 / 2), --plain mode for non-interactive pipelines, dry-run estimator, and SHA-cached re-runs at $0 on unchanged files.

Which languages?

TypeScript, Python, Rust, Go, Java, C#, Bash, SQL, YAML, JSON - auto-detected, with framework-aware evaluation for React and Next.js.

13 - Public roadmap

What's next.

Anatoly is free, open source, and will stay that way. Here's what we're working on. Subscribe to get a single email when each one ships - no newsletter, no drip, just roadmap updates.

  • in progress: Cross-run reports - track verdict and findings drift between audits, on the same repo, over time.
  • in progress: Expanded language coverage - Kotlin, Swift, Ruby, PHP join TypeScript, Python, Rust, Go, Java, C#.
  • next: Specialised audit modes - security-focused passes, accessibility passes, framework-aware presets.
  • next: Deeper CI integrations - first-class GitHub Actions, GitLab CI, and CircleCI templates with PR comment summaries.
  • exploring: A hosted runtime for teams that don't want to keep a laptop open - same engine, GitHub-native, on schedule.

Get notified when something ships

We'll email you once per major release. No newsletter, no third-party lists. Resend (EU) - unsubscribe in one click.

We store your email with Resend for the sole purpose of sending you Anatoly roadmap updates when major releases ship. See our privacy notice for retention, your rights, and contact.


One command.
Your whole codebase, audited.

$ npx anatoly run

Questions? The dev is on Discord.