Anatoly logo - multi-LLM AI agent that audits your codebase

Use case

A tech-DD report where every finding has a file, a line, and a proof.

Tech due diligence, M&A code audits, expert witness work — the claims have to be reproducible, sourced, defensible. Anatoly delivers each one with a file path, a line number, and an artifact you can pull. And it runs inside the client's perimeter when the NDA demands it.

Figure: five-step chain of evidence, left to right. 'Code' (a TypeScript editor with a highlighted line in src/engine.ts), 'Search results' (a terminal with two grep matches), 'Cross-read' (two overlapping file documents), 'Semantic match' (a RAG cosine-similarity diagram with score 0.852), 'Finding' (a report.md card with a DUPLICATE badge and a checkmark).
Every finding traces back to source — the chain is what survives a counter-expertise.

What due diligence actually demands

A tech-DD report isn't a vibe check. It's a document that has to survive a counter-expertise: a buying party's CTO, a litigation opponent, or a regulator pulling on every assertion. That means each finding needs three things - a precise location, an artifact you can show, and a justification for why it counts.
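
One way to picture those three requirements is as a record in which no field is optional. The sketch below is illustrative only: `Finding` and `isDefensible` are hypothetical names chosen for this example, not Anatoly's actual schema.

```typescript
// Hypothetical shape for a defensible finding; not Anatoly's real schema.
interface Finding {
  file: string;          // precise location: path relative to the repo root
  line: number;          // precise location: 1-based line number
  artifact: string;      // evidence you can show, e.g. a quoted snippet or a grep match
  justification: string; // why the finding counts
}

// A finding is only reportable if all three requirements are met:
// a location (file + line), an artifact, and a justification.
function isDefensible(f: Finding): boolean {
  return (
    f.file.trim().length > 0 &&
    Number.isInteger(f.line) &&
    f.line >= 1 &&
    f.artifact.trim().length > 0 &&
    f.justification.trim().length > 0
  );
}
```

Treating a missing field as a hard failure, rather than a warning, matches the standard a counter-expertise applies: an assertion without a location or an artifact cannot be checked.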

The run

Drop into the target repo, calibrate with ANATOLY.md, and run.

~/target-repo
$ npx anatoly run --plain > reports/audit_2026-04-28.md
→ scanning · 15 files · TypeScript
→ axes · 7/7 in parallel · RAG indexed
→ deliberation · 3-tier refinement pipeline
 verdict: NEEDS_REFACTOR · 26 findings in 11 files · 10.1 min · $6.13

The first run does the work: full embedding, all seven axes, deliberation. After that, every re-audit on the same target (post-remediation, second look, response to management comments) is SHA-256 cached, so unchanged files cost $0. You bill the deep audit once, then run as many follow-ups as the engagement needs. Cost scales roughly linearly with file count; the figures above are from the public slot-engine benchmark, a small TypeScript repo we publish in full to make this page verifiable.
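
The caching idea can be sketched in a few lines: hash each file's content and re-audit only the files whose hash changed. This is a minimal illustration under assumptions, not Anatoly's implementation. `filesToReaudit` and the in-memory cache are hypothetical, and what else goes into the real cache key is not stated here.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 digest of a file's content.
function sha256(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Hypothetical cache: file path -> digest of the content that was last audited.
type AuditCache = Record<string, string>;

// Returns the paths whose content differs from the cached digest and
// records the new digests, so an identical follow-up run returns nothing.
function filesToReaudit(files: Record<string, string>, cache: AuditCache): string[] {
  const changed: string[] = [];
  for (const path of Object.keys(files)) {
    const digest = sha256(files[path]);
    if (cache[path] !== digest) {
      changed.push(path);
      cache[path] = digest;
    }
  }
  return changed;
}
```

On a first run every file is new, so everything is audited. On a post-remediation run, only the files the target team actually touched come back from `filesToReaudit`.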

The report

Markdown, single file, same shape every time — opening verdict, per-axis health, then findings with file + line + evidence. Here is the actual scorecard from that benchmark run:

From the public slot-engine report — 15 files, 26 findings, $6.13

Verdict: NEEDS_REFACTOR

Correction       🟥🟥🟥🟥🟥🟥🟥🟥🟥⬜  93% OK
Utility          🟥🟥🟥🟥🟥🟥🟥🟥⬜⬜  83% used
Duplication      🟥🟥🟥🟥🟥🟥🟥🟥🟥⬜  90% unique
Overengineering  🟩🟩🟩🟩🟩🟩🟩🟩🟩⬜  90% lean
Tests            🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩  No data
Documentation    🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩  100% documented
Best Practices   🟥🟥🟥🟥🟥🟥🟥⬜⬜⬜  avg 7.4/10

Read the full report on GitHub →

Every claim above resolves to a file path, a symbol name, and an artifact you can pull and check.

Calibrate to the company's conventions

At kickoff, write an ANATOLY.md with the client. Anatoly reads it on every run and shapes its judgment to their standards - not yours.

ANATOLY.md (kickoff deliverable)
# ANATOLY.md - calibration for slot-engine

## best_practices
- No barrel exports. Flag any new index.ts
  that re-exports.
- Date handling: always Temporal.PlainDate,
  never Date.

## correction
- Components must be colocated with .test.tsx.
  A component without a sibling test = finding.

## documentation
- Public API exports require JSDoc with at least one @example.
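
To make the structure of that file concrete, here is a small sketch that groups each bullet under its axis heading. It assumes only the layout visible in the example above (`## axis` headings, `- ` bullets, indented continuation lines). `parseCalibration` is a hypothetical helper, and Anatoly's own handling of ANATOLY.md may differ.

```typescript
// Hypothetical parser: groups bullet rules under their "## axis" heading.
// Based only on the layout of the example file; not Anatoly's real loader.
function parseCalibration(markdown: string): Record<string, string[]> {
  const rules: Record<string, string[]> = {};
  let axis: string | null = null;

  for (const raw of markdown.split(/\r?\n/)) {
    const heading = /^##\s+(\S+)\s*$/.exec(raw);
    if (heading) {
      axis = heading[1];
      rules[axis] = [];
      continue;
    }
    if (axis === null) continue; // skip the title and anything before the first axis

    const bullet = /^-\s+(.*)$/.exec(raw);
    if (bullet) {
      rules[axis].push(bullet[1].trim());
    } else if (/^\s+\S/.test(raw) && rules[axis].length > 0) {
      // Indented continuation line: append it to the previous rule.
      const last = rules[axis].length - 1;
      rules[axis][last] += " " + raw.trim();
    }
  }
  return rules;
}
```

Keeping one short rule per bullet, under the axis it belongs to, also makes the file easy for the client to review line by line at kickoff.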

NDA & data residency

For client engagements where source can't leave the perimeter, Anatoly runs end-to-end on local models - Ollama, LM Studio, vLLM, with local RAG embeddings. See the local LLM use case for the full setup. EU / FR / defense procurement use cases are first-class.
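
When the NDA forbids source from leaving the machine, a cheap pre-flight check is to confirm that the model endpoint you are about to use is a loopback address. The guard below is a hypothetical helper written for this page, not an Anatoly feature, and it is deliberately strict: a real perimeter may also include private hosts on the client's network, which this check would reject.

```typescript
// Hypothetical pre-flight guard, not part of Anatoly: returns true only when
// the endpoint's host is the local machine (localhost or a loopback address).
function isLoopbackEndpoint(endpoint: string): boolean {
  let host: string;
  try {
    host = new URL(endpoint).hostname.toLowerCase();
  } catch {
    return false; // not a valid absolute URL
  }
  return (
    host === "localhost" ||
    host === "[::1]" || // IPv6 loopback; URL.hostname keeps the brackets
    /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(host)
  );
}
```

For example, `isLoopbackEndpoint("http://localhost:11434")` (Ollama's default port) passes, while a hosted API URL does not.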

How false positives get filtered

The Correction axis runs a second pass: every finding is re-evaluated against the project's dependency manifest (package.json, Cargo.toml, go.mod, pyproject.toml, …) and the README of each declared dependency. Findings that boil down to a misread API contract get dropped before they reach the report. What's left in the deliverable is what survived adversarial review — which is what a counter-expertise will probe first.
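
The flow described above can be sketched as a filter over candidate findings. Everything here is a hypothetical stand-in: the types, the `secondPass` function, and the `stillHolds` callback that represents the re-evaluation step. The sketch only shows where the manifest and the dependency READMEs enter the decision.

```typescript
// All names are hypothetical; this mirrors the described flow, not Anatoly's code.
interface CandidateFinding {
  id: string;
  dependency?: string; // package the finding is about, if any
  claim: string;
}

// Minimal view of a manifest such as package.json.
type Manifest = { dependencies?: Record<string, string> };

// Stand-in for the re-evaluation step: given a finding and the dependency's
// README, decide whether the finding still holds.
type Reviewer = (finding: CandidateFinding, readme: string) => boolean;

function secondPass(
  findings: CandidateFinding[],
  manifest: Manifest,
  readmes: Record<string, string>,
  stillHolds: Reviewer
): CandidateFinding[] {
  const declared = manifest.dependencies ?? {};
  return findings.filter((f) => {
    // Findings unrelated to a declared dependency pass through unchanged.
    if (!f.dependency || !(f.dependency in declared)) return true;
    const readme = readmes[f.dependency];
    if (readme === undefined) return true; // nothing to check the claim against
    return stillHolds(f, readme);
  });
}
```

The design choice worth noting is that the second pass can only remove findings, never add them, so the report can only get shorter and more defensible.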

Bonus deliverable: docs the target didn't have

Run anatoly docs scaffold and Anatoly reverse-engineers the codebase into a structured documentation tree under .anatoly/docs/ — Getting-Started, Architecture, Guides, API Reference, Development. Five sections, ~19 files for a small repo, all derived from reading the actual source.

Two things this buys you: (a) the docs themselves are a deliverable — hand the buyer an audit and a product spec they didn't have before; (b) on every subsequent audit run, those docs become business context (RAG-indexed at $0), so findings get judged against what the product is meant to do, not just against generic best practices.
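
For readers unfamiliar with how RAG-indexed context is retrieved: a common approach is to rank indexed chunks by cosine similarity between embedding vectors. The sketch below uses toy vectors and hypothetical names (`Chunk`, `topMatches`). Which embedding model, chunking, or threshold Anatoly uses is not stated on this page.

```typescript
// Cosine similarity between two equal-length vectors; the result lies in [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // convention for zero vectors
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical indexed documentation chunk.
interface Chunk {
  path: string;
  embedding: number[];
}

// Rank chunks by similarity to a query embedding and keep the best k.
function topMatches(query: number[], chunks: Chunk[], k: number): { path: string; score: number }[] {
  return chunks
    .map((c) => ({ path: c.path, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The highest-scoring chunks are the business context a finding would be judged against.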

See it on real code

We publish audits run on real third-party open-source codebases: verbatim, no edits, no cherry-picking. Useful before sending a proposal: pick the report closest to your target's stack and show the buyer what they'll get.

Browse the public reports →
