For codebases that already exist.

“Can I clean here?”

Audit what's already there.
Stay current as it changes.

Like a senior code reviewer that has to show its work.

Run it once on your repo as it stands today: a verdict, a 7-axis scorecard, findings that name file + line + evidence - every one proven by grep, file reads, and a local semantic index. From then on, live in your IDE - incremental, $0 on unchanged files.

$ Run your first audit · See a real report →

Three ways to authenticate with your AI providers - no lock-in

subscriptions: Claude Code

api key: Anthropic · OpenAI · Google · Mistral · xAI · Groq · …

local: Ollama · LM Studio · vLLM

~/your-project - anatoly

$ npx anatoly run

    ___                __        __
   /   |  ____  ____ _/ /_____  / /_  __
  / /| | / __ \/ __ `/ __/ __ \/ / / / /
 / ___ |/ / / / /_/ / /_/ /_/ / / /_/ /
/_/  |_/_/ /_/\__,_/\__/\____/_/\__, /  [v0.9.4]
                               /____/   "Heavy!"
============================================

Pipeline
────────────────────────────
✓ Embedding code                        done
✓ LLM summaries & embedding             done
✓ Chunking project docs                 done
✓ Chunking internal docs                done
✓ Reviewing files                42 findings
✓ Deliberation                  12 confirmed
✓ Updating internal docs                done
✓ Generating report                     done

Done - 22 findings · 4 clean · 10m 5s
────────────────────────────
run          2026-04-28_133253
report       .anatoly/runs/.../report.md
reviews      .anatoly/runs/.../reviews/
transcripts  .anatoly/runs/.../logs/
log          .anatoly/runs/.../anatoly.ndjson

Cost: 0.00$ with claude code

01 - Real run

A report you can read in 10 seconds,
and defend in 10 minutes.

Every run produces a single Markdown report - a headline verdict, a per-axis scorecard, and findings sourced with file + line + evidence. Here's a real one, on the public slot-engine benchmark.

15 files reviewed in 10 min - $6.13 in AI analysis so you don't have to.

Verdict: NEEDS_REFACTOR · 26 findings in 11 files

Axis	Health
Correction	🟥🟥🟥🟥🟥🟥🟥🟥🟥⬜ 93% OK
Utility	🟥🟥🟥🟥🟥🟥🟥🟥⬜⬜ 83% used
Duplication	🟥🟥🟥🟥🟥🟥🟥🟥🟥⬜ 90% unique
Overengineering	🟩🟩🟩🟩🟩🟩🟩🟩🟩⬜ 90% lean
Best Practices	🟥🟥🟥🟥🟥🟥🟥⬜⬜⬜ avg 7.4/10

A few of the 26 findings

🐛 src/engine.ts computePayout
The house-edge multiplier sign is inverted: it boosts payout instead of reducing it. RTP is broken.
🐛 src/rng.ts weightedPick
Math.random() used as the RNG source - in a function whose own docstring says "suitable for gaming RNG applications."
♻️ src/types.ts LegacySpinResult
Exported, imported by 0 files.
📋 src/engine.ts checkLine ⇄ src/paytable.ts lineWins
Semantically identical (RAG cosine 0.852): same WILD-skip, same consecutive-match loop, different names. A textual diff would never see it.
🏗️ src/engine.ts EngineContainer
A bespoke IoC container backed by a stringly-typed Map<string, unknown>, holding three values.
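
The cosine score quoted in the duplication finding can be illustrated with a short sketch. This is not Anatoly's implementation: the vectors stand in for embeddings, and `DUPLICATE_THRESHOLD` is a hypothetical cutoff chosen only for the example.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention used here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


# Hypothetical cutoff, not a documented Anatoly setting.
DUPLICATE_THRESHOLD = 0.85


def looks_duplicated(a: Sequence[float], b: Sequence[float],
                     threshold: float = DUPLICATE_THRESHOLD) -> bool:
    """True when two embeddings point in nearly the same direction."""
    return cosine_similarity(a, b) >= threshold
```

Because the comparison works on embeddings rather than text, two functions with different names and formatting can still score close to 1.0.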

See the full report on the slot-engine benchmark →

Every audit we publish is open

The reports above aren't curated highlights. The bench is reproducible, and the audits we run on real third-party code are published as-is - verdicts, findings, costs, runtimes. Read whichever feels closer to your codebase.

bench

anatoly-bench

Reproducible benchmark - synthetic projects with known bugs and dead code, audited end-to-end.

real audits

anatoly-reports

Audits run on real third-party open-source code - published verbatim, no edits, no cherry-picking.

Want the reasoning behind the engine? Read the research - deep-dives on the RAG architecture, prompt strategy, and benchmarks against other AI code-audit tools.

02 - Two ways to live with it

Run and walk away.
Or audit as you ship.

A 10-min audit is too long to sit through. So Anatoly doesn't ask you to. Pick the rail that matches how you work - they share the same engine.

01 fire and walk

Telegram · team

Get pinged when it's done.

For periodic and team work.

$ anatoly notifications create-bot
→ wizard creates bot · saves token · ~2 min

✓ On every anatoly run, you get the verdict, scorecard, top findings, and a link to the full report.
✓ One bot, whole team. Other devs add their username to .anatoly.yml - Anatoly resolves it on first run and caches it.
✓ Fire-and-forget. Missing token, unresolved username, Telegram API down - the run logs a warning and produces the report anyway.
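
The fire-and-forget behaviour described above might look roughly like this. It is a minimal sketch under assumptions: `notify_best_effort`, `finish_run`, and `broken_sender` are hypothetical names, and the sender is any callable rather than a real Telegram client.

```python
import logging
from typing import Callable

logger = logging.getLogger("notify-sketch")


def notify_best_effort(send: Callable[[str], None], message: str) -> bool:
    """Try to send a notification; never let a failure propagate."""
    try:
        send(message)
        return True
    except Exception as exc:  # missing token, unresolved user, API down, ...
        logger.warning("notification skipped: %s", exc)
        return False


def finish_run(report_path: str, send: Callable[[str], None]) -> str:
    """Return the report path whether or not the ping went out."""
    notify_best_effort(send, f"Audit finished: {report_path}")
    return report_path


def broken_sender(message: str) -> None:
    """Stand-in for a transport that is currently unreachable."""
    raise ConnectionError("Telegram API down")
```

Calling `finish_run("report.md", broken_sender)` still returns the report path; the failure only shows up as a logged warning.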

02 audit as you ship

IDE · daily

Every save, audited.

For daily IC work.

$ anatoly hook init
→ writes Claude Code hook config
$ anatoly watch
→ daemon · re-audits on change
$ anatoly run
→ always incremental · --no-cache to re-run full

✓ Claude Code hook. Write → audit → fix loop, with anti-loop protection. Findings surface in your editor session.
✓ Watch mode. Daemon that monitors changes and re-runs only on what's been touched.
✓ Always incremental. anatoly run only re-reviews files that changed (SHA-256 cache, $0 on unchanged). Use --no-cache when you need to re-run on the full codebase.
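
A content-hash cache of the kind mentioned above can be sketched in a few lines. The function names and the in-memory cache shape (path mapped to hex digest) are assumptions for illustration; how Anatoly actually stores its cache is not described here.

```python
import hashlib
from typing import Dict, List, Mapping


def sha256_hex(content: bytes) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(content).hexdigest()


def files_to_review(contents: Mapping[str, bytes],
                    cache: Mapping[str, str]) -> List[str]:
    """Paths whose current digest is missing from, or differs from, the cache."""
    return sorted(
        path for path, data in contents.items()
        if cache.get(path) != sha256_hex(data)
    )


def updated_cache(contents: Mapping[str, bytes]) -> Dict[str, str]:
    """Digest table to persist after a successful run."""
    return {path: sha256_hex(data) for path, data in contents.items()}
```

An unchanged file hashes to the same digest, so it drops out of the review list and costs nothing on the next run.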

03 - Audit your way

Your conventions, calibrated.

Drop an ANATOLY.md at your project root - Anatoly reads it on every run and shapes its judgment to your team's standards. No config language, no rules engine. Just Markdown.

ANATOLY.md · project root · auto-detected

# ANATOLY.md

## best_practices
- We don't use barrel exports. Flag any new
  index.ts that re-exports.
- Date handling: always Temporal.PlainDate,
  never Date.

## correction
- Components are colocated with their .test.tsx -
  a component without a sibling test = finding.

## documentation
- Public API exports must have JSDoc with at
  least one @example.

per-axis calibration

Each section calibrates its matching axis. Empty section = axis runs on defaults.

dilution warning

Sections > 2500 tokens trigger a warning - focused beats exhaustive. Anatoly tells you when you over-explain.

auditors: kickoff deliverable

Write it with your client at kickoff. The audit runs against their conventions, not yours.

Every section, axis, and override is documented in the configuration reference.
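
To get a feel for the dilution warning before a run, a rough pre-check could look like the sketch below. The 2500-token limit comes from the text above; everything else is an assumption, in particular the four-characters-per-token estimate, which is a common heuristic and not Anatoly's tokenizer.

```python
import re
from typing import Dict, List

TOKEN_LIMIT = 2500     # threshold quoted in the text above
CHARS_PER_TOKEN = 4    # rough heuristic, an assumption


def split_sections(markdown: str) -> Dict[str, str]:
    """Map each '## name' heading to the text beneath it."""
    sections: Dict[str, List[str]] = {}
    current = None
    for line in markdown.splitlines():
        match = re.match(r"^##\s+(\S.*)$", line)
        if match:
            current = match.group(1).strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


def estimate_tokens(text: str) -> int:
    """Very rough token count based on character length."""
    return len(text) // CHARS_PER_TOKEN


def oversized_sections(markdown: str, limit: int = TOKEN_LIMIT) -> List[str]:
    """Names of sections whose estimated size exceeds the limit."""
    return [name for name, body in split_sections(markdown).items()
            if estimate_tokens(body) > limit]
```

An empty list from `oversized_sections` suggests every section is comfortably under the limit, within the accuracy of the estimate.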

04 - Reads, writes, remembers

Builds the docs you wish you had.

Before reviewing, Anatoly reverse-engineers your codebase and writes a structured documentation tree under .anatoly/docs/. On every subsequent run, that documentation becomes business context for the audit - findings get smarter because the agent already understands what the code is supposed to do.

.anatoly/docs/ · auto-scaffolded · 19 files

.anatoly/docs/
├── index.md
├── 01-Getting-Started/
│   ├── 01-Overview.md
│   ├── 02-Installation.md
│   ├── 03-Configuration.md
│   └── 04-Quick-Start.md
├── 02-Architecture/
│   ├── 01-System-Overview.md
│   ├── 02-Core-Concepts.md
│   ├── 03-Data-Flow.md
│   └── 04-Design-Decisions.md
├── 03-Guides/ … 3 files
├── 04-API-Reference/ … 3 files
└── 05-Development/ … 4 files

01-Overview.md · reverse-engineered

# Overview

> A pure-logic TypeScript library for
  simulating a 5-reel, 3-row slot machine
  with paylines, wilds, and a progressive
  jackpot.

## What It Does

A single call to spin() performs:

1. Reel generation - five reels
   of three rows, weighted-random
   symbols.
2. Payline evaluation - ten lines
   checked left-to-right, with WILD
   substitution.
3. Wild multiplier - exponential
   bonus when WILDs are in a winning line.
   … 4 more steps

audit context loop

Subsequent runs use the docs as business context. Findings get smarter run after run.

auditor deliverable

Hand your client an audit and a complete doc tree they didn't have before.

instant onboarding

Run anatoly docs scaffold project to copy the tree to docs/ and publish.

See a real generated doc on the slot-engine benchmark →

05 - Stay in your perimeter

Audit code that can't leave your network.

Anatoly runs end-to-end on local models. Point it at Ollama, LM Studio, or any OpenAI-compatible server - combine with local RAG embeddings, and no byte of your code reaches a third party.

.anatoly.yml · local-only mode

providers:
  ollama:
    base_url: http://localhost:11434/v1
    transport: openai-compatible

rag:
  code_model: auto    # GGUF (GPU) or Jina (CPU)
  nlp_model: auto
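
Before trusting a local-only setup, it can help to confirm that every configured endpoint really is local. The sketch below is a hypothetical pre-flight check, not an Anatoly feature; it assumes the config has already been parsed into a dict with the same shape as the `providers` block above, and it treats loopback and private IP literals as "inside the perimeter", which may be too generous for some networks.

```python
import ipaddress
from typing import Dict, List
from urllib.parse import urlparse


def is_local_endpoint(base_url: str) -> bool:
    """True if the host is localhost or a loopback/private IP literal."""
    host = urlparse(base_url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; not vouched for without resolving it
    return address.is_loopback or address.is_private


def remote_providers(providers: Dict[str, dict]) -> List[str]:
    """Names of providers whose base_url does not look local."""
    return sorted(
        name for name, cfg in providers.items()
        if not is_local_endpoint(cfg.get("base_url", ""))
    )
```

An empty result from `remote_providers` means no configured `base_url` points at a public host by this definition.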

regulated industries

Banking, healthcare, defense, public sector - wherever code can't touch a SaaS endpoint.

client NDAs

Tech due diligence, expert witness work, M&A - the audit runs inside the client's perimeter.

data sovereignty

EU, FR, defense procurement - your prompts and your code stay under your jurisdiction.

Built for teams whose code simply doesn't get to leave the building.

06 - The seven axes

A multi-LLM agent with read access to your
entire codebase and a local semantic index.

For every file, Anatoly runs seven audits in parallel - each routed to the model that fits it best. The agent must grep the codebase, read related files, and query the RAG index before any finding is reported. False positives are filtered through a 3-tier refinement pipeline.

Axis	Recommended model	Verdicts	What it detects
Utility	claude haiku	USED · DEAD · LOW_VALUE	Dead exports, unused code
Duplication	claude haiku	UNIQUE · DUPLICATE	Semantic duplicates across the project
Correction	claude sonnet	OK · NEEDS_FIX · ERROR	Bugs, logic errors, async issues
Overengineering	claude sonnet	LEAN · OVER · ACCEPTABLE	Excess complexity vs purpose
Tests	claude sonnet	GOOD · WEAK · NONE	Coverage quality per symbol
Best Practices	claude sonnet	SCORE 0–10 · 17 RULES	Language-specific, context-aware
Documentation	claude sonnet	DOCUMENTED · PARTIAL · UNDOCUMENTED	JSDoc gaps, /docs/ desync

Each axis is crash-isolated · all 7 run in parallel · routed to the model that fits the question
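
Running axes in parallel while keeping a crash in one from taking down the others can be sketched with `asyncio.gather`. This illustrates the pattern only: `run_axes` and the two demo axes are hypothetical, and real axes would call a model rather than return a constant.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Axis = Callable[[str], Awaitable[str]]


async def run_axes(path: str, axes: Dict[str, Axis]) -> Dict[str, str]:
    """Run every axis concurrently; a failing axis yields a CRASHED entry."""
    names = list(axes)
    outcomes = await asyncio.gather(
        *(axes[name](path) for name in names),
        return_exceptions=True,  # collect exceptions instead of raising
    )
    results: Dict[str, str] = {}
    for name, outcome in zip(names, outcomes):
        if isinstance(outcome, BaseException):
            results[name] = f"CRASHED: {outcome}"
        else:
            results[name] = outcome
    return results


async def demo_utility(path: str) -> str:
    """Demo axis that succeeds."""
    return "USED"


async def demo_correction(path: str) -> str:
    """Demo axis that fails, to show the isolation."""
    raise RuntimeError("model timeout")
```

With `return_exceptions=True`, the failing axis is reported alongside the successful one instead of aborting the whole file review.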

→ Deliberation

Two heavier models step in
when a finding is ambiguous.

claude opus · 3-tier deliberation

Every finding runs through a 3-tier deliberation pipeline before it ships. Tier 1: auto-resolution against the usage graph, AST, and RAG index ($0). Tier 2: inter-axis contradiction detection. Tier 3: claude opus with full tool access - Read · Grep · Bash · WebFetch - reasons through empirical evidence.

claude sonnet · two-pass correction

The Correction axis re-evaluates each finding against your project's dependency manifest (package.json · Cargo.toml · go.mod · pyproject.toml · …) and the README of each declared dependency, to filter API-misunderstanding false positives.

Deliberation is what separates a flagged finding from a proven one.
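
The tiered flow described above resembles a chain where each tier either decides or defers to the next, more expensive one. The sketch below shows that shape only. All names, the decision strings, and the made-up importer counts are assumptions; the real tiers consult the usage graph, AST, RAG index, and an LLM investigator.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Finding:
    axis: str
    symbol: str
    verdict: str


# A tier returns "confirmed", "dismissed", or None to defer to the next tier.
Tier = Callable[[Finding], Optional[str]]


def deliberate(finding: Finding, tiers: Iterable[Tier]) -> str:
    """Return the first decision any tier makes, cheapest tier first."""
    for tier in tiers:
        decision = tier(finding)
        if decision is not None:
            return decision
    return "unresolved"


def tier1_usage_graph(finding: Finding) -> Optional[str]:
    """Hypothetical free check: settle DEAD verdicts from importer counts."""
    importers = {"LegacySpinResult": 0, "computePayout": 3}  # made-up data
    if finding.verdict == "DEAD" and finding.symbol in importers:
        return "confirmed" if importers[finding.symbol] == 0 else "dismissed"
    return None


def tier2_contradictions(finding: Finding) -> Optional[str]:
    """Placeholder: no cross-axis conflict detected, so defer."""
    return None


def tier3_investigator(finding: Finding) -> Optional[str]:
    """Placeholder for an LLM investigation that always reaches a decision."""
    return "confirmed"
```

Ordering the tiers from cheapest to most expensive means the costly investigator only sees findings the earlier tiers could not settle.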

7 parallel audit axes

~10 min first run · 15-file TS repo

10+ supported languages

$0 on unchanged files

07 - Integrations

Two integration paths.
Available today.

01 subscriptions

Claude Code

Drives Anatoly through your existing Claude.ai subscription. Tool use, real-time hooks (PostToolUse, Stop), and the Opus deliberation pass - no API key required.

$ npx anatoly run

02 byok · api key

Any modern provider

Direct API access to Anthropic, OpenAI, Google, Mistral, xAI, Groq, and more - your keys, your spend. Mix per axis: route Correction to Sonnet, Utility to Haiku.

$ anatoly run --provider openai

08 - You own it all

You own the intelligence,
the rules, the data.
Always.

Anatoly doesn't sit in the middle. It runs on your machine - your keys, your conventions, your code, your reports stay yours.

intelligence

Your keys. Your providers. Your bill.

BYOK across Claude, OpenAI, Mistral, xAI, Groq, and any local server. Mix per axis - route correction to Sonnet, utility to Haiku. No proxy, no middleman, no lock-in. Better model tomorrow? Swap it in.

rules

Your conventions. Your dialect.

Drop an ANATOLY.md at the project root and the LLM's judgment calibrates to your standards. Per-axis sections, no rules engine, no DSL - just Markdown. Auditors: write it with your client at kickoff so the audit runs against their conventions.

data

Your code stays yours.

Run end-to-end on local models (Ollama, LM Studio, GGUF embeddings) when your code can't leave your network. Reports, findings, and generated docs live in your repo under .anatoly/. No SaaS dashboard, no third-party storage of audit history.

09 - Why evidence-based

“A finding must be proven,
or it doesn't ship.”

Every claim on the report card was reasoned out by the agent - grep first, then file reads, then RAG queries on the semantic index. A 3-tier refinement pass strips contradictions and ambiguous cases (auto-resolution → inter-axis contradiction detection → an Opus investigator for the rest). What lands in the report has been argued for.

10 - Who it's for

Built for three profiles
maintaining code they didn't fully write.

engineers

Tech leads and senior ICs maintaining a codebase - solo, in a team, or in a PR queue.

solo builders

Founders and indie hackers who shipped fast with Cursor or Claude Code and need to know what they actually have.

auditors

Consultants and tech-DD analysts producing reports that have to hold up to scrutiny - including under client NDA.

From 100 to 100,000+ lines. From the laptop to the boardroom.

11 - Community

Built in the open.
Talk to the dev.

If you have a question, an idea, a bug report, or you just want to see how a multi-LLM agent is built from the inside - come hang out.

★ Anatoly audits Anatoly. The Checked-by-Anatoly badge on our README runs the full audit pipeline on the product itself, on every release.

live chat

Discord

Direct line to the developer. Feature discussions, audit-results show-and-tell, real-time help.

discord.gg/zetMErTjNH →

source · issues · PRs

GitHub

Source code, releases, issue tracker, pull requests, and the complete documentation tree.

github.com/r-via/anatoly →

12 - FAQ

Will Anatoly modify my code?

No. Read-only audit. Auto-clean is opt-in and commits each fix individually so you can review and revert per-commit.

Will Anatoly stay free?

Yes. Always. AGPL-3.0. A commercial license is available for companies that cannot comply with AGPL.

Do I bring my own model keys?

Yes. BYOK is the default - your keys, your providers, your spend. No proxy, no middleman. Use a Claude Code subscription, direct API keys, or fully local models (Ollama, LM Studio, vLLM).

Is it CI-friendly?

Yes - exit codes (0 / 1 / 2), --plain mode for non-interactive pipelines, dry-run estimator, and SHA-cached re-runs at $0 on unchanged files.
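
A CI step could wrap the CLI along these lines. This is a sketch under stated assumptions: it uses only the `run` command and the `--plain` and `--no-cache` flags named on this page, assumes `--plain` can be passed to `run`, and assumes only that exit code 0 means success (the usual Unix convention). What 1 and 2 mean is not stated here, so the wrapper reports them verbatim.

```python
import subprocess
from typing import List


def audit_command(full: bool = False) -> List[str]:
    """Build the CLI invocation from flags named on this page."""
    command = ["npx", "anatoly", "run", "--plain"]
    if full:
        command.append("--no-cache")  # force a re-run on the full codebase
    return command


def describe_exit(code: int) -> str:
    """Only 0 is interpreted; other codes are surfaced unchanged."""
    return "pass" if code == 0 else f"non-zero exit ({code})"


def main() -> int:
    """Run the audit and hand its exit code back to the CI runner."""
    completed = subprocess.run(audit_command())
    print(describe_exit(completed.returncode))
    return completed.returncode
```

A pipeline step would call `sys.exit(main())` so that the job status follows the audit's own exit code.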

Which languages?

TypeScript, Python, Rust, Go, Java, C#, Bash, SQL, YAML, JSON - auto-detected, with framework-aware evaluation for React and Next.js.

13 - Public roadmap

What's next.

Anatoly is free, open source, and will stay that way. Here's what we're working on. Subscribe to get a single email when each one ships - no newsletter, no drip, just roadmap updates.

in progress · Cross-run reports - track verdict and findings drift between audits, on the same repo, over time.
in progress · Expanded language coverage - Kotlin, Swift, Ruby, PHP join TypeScript, Python, Rust, Go, Java, C#.
next · Specialised audit modes - security-focused passes, accessibility passes, framework-aware presets.
next · Deeper CI integrations - first-class GitHub Actions, GitLab CI, and CircleCI templates with PR comment summaries.
exploring · A hosted runtime for teams that don't want to keep a laptop open - same engine, GitHub-native, on schedule.

Get notified when something ships

We'll email you once per major release. No newsletter, no third-party lists. Resend (EU) - unsubscribe in one click.

One command.
Your whole codebase, audited.

$ npx anatoly run

⭐ Star on GitHub · 💬 Join the Discord · See the roadmap →

Questions? The dev is on Discord.
