Public log

Changelog

Every release of Anatoly, in reverse chronological order.

Pulled live from GitHub Releases.

v0.9.7 — npm homepage → anatoly.cloud

Jun 21, 2026 · v0.9.7

## Maintenance release

A metadata-only release on top of v0.9.6.

- npm `homepage` now points to the official site, [anatoly.cloud](https://anatoly.cloud), instead of the GitHub README.
- Version bumped to 0.9.7 (package + lockfile).

No functional or behavioral changes to the audit engine.

**Full Changelog**: https://github.com/r-via/anatoly/compare/v0.9.6...v0.9.7

v0.9.6 — Estimate/Scan unification, RAG hardening & OSS hygiene

May 7, 2026 · v0.9.6

## Highlights

A consolidation release on top of v0.9.5: the `scan` command is now folded into `estimate`, the `estimate` forecast was recalibrated against real runs, RAG / local-embeddings handling was hardened, LLM cost bookkeeping was centralised, and the project gained a full open-source hygiene layer (SECURITY, CONTRIBUTING, CoC, CI).

---

## 🔭 Estimate / Scan unification

The standalone `anatoly scan` command is gone — its capabilities (new / modified / cached file accounting) now live inside `estimate`, which is the single entry point for forecasting.

- `anatoly scan` removed; new/modified/cached are surfaced directly in the `estimate` view.
- New `--no-cache` flag for a from-scratch forecast that ignores prior cache state.
- Cached files now excluded from the token + cost forecast (no more double-counting).
- Stale tasks pointing to deleted/missing source files are pruned from the estimator.
- Deliberation shard count derived from distinct directories rather than a flat `files / 20` heuristic.
- Bootstrap and coherence per-page token constants recalibrated from real runs.
- Estimate user-guide added under `docs/`.
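
The directory-based shard count described in the list above can be sketched as follows. This is a minimal illustration, not the estimator's actual code: `parentDir`, `estimateShardCount`, and `legacyShardCount` are hypothetical names, and rounding the old heuristic up with `Math.ceil` is an assumption.

```typescript
// Hypothetical helper: parent directory of a POSIX-style relative path.
function parentDir(path: string): string {
  const i = path.lastIndexOf("/");
  if (i === -1) return ".";
  return i === 0 ? "/" : path.slice(0, i);
}

// Assumed new behaviour: one deliberation shard per distinct directory.
function estimateShardCount(files: string[]): number {
  return new Set(files.map(parentDir)).size;
}

// The flat heuristic the changelog says was replaced (rounding is an assumption).
function legacyShardCount(files: string[]): number {
  return Math.ceil(files.length / 20);
}

const sampleFiles = ["src/a.ts", "src/b.ts", "src/core/c.ts", "README.md"];
console.log(estimateShardCount(sampleFiles), legacyShardCount(sampleFiles));
```

For a small repository spread over several directories, the directory-based count comes out higher than the flat heuristic, which is presumably the under-forecast this change addresses.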

## 🧭 Scan / config schema cleanup

- Glob include/exclude detection runs at wizard time, with a new `respect_gitignore` knob.
- `auto_detect` removed — `include` / `exclude` are now strictly authoritative.
- TypeScript-specific defaults dropped from the scan config schema (the project is multi-language).
- Language detection delegated to [`linguist-js`](https://www.npmjs.com/package/linguist-js) for accurate per-language stats.

## 🧠 RAG & local embeddings

- LanceDB tables auto-rebuild on embedding-dimension drift, so swapping providers no longer corrupts the index.
- System-local providers are now identified by name (`local-advanced`), not by URL heuristics — fixes routing when users override `base_url`.
- The local sidecar is unified under the canonical `local-advanced` name, with per-axis URLs routed through the slot map.
- `ANATOLY_LOCAL_DUMMY_KEY` footgun removed: local providers declare `auth: none` and skip API-key plumbing entirely.
- `local-embeddings` gained `init`, `cleanup`, and `downgrade` subcommands; YAML formatting is preserved across config edits.
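
As a rough sketch of the dimension-drift check mentioned above: compare the dimension recorded with the index against the dimension the current provider returns, and rebuild when they differ. `IndexMeta` and `needsRebuild` are hypothetical names, and no LanceDB call is shown.

```typescript
// Hypothetical metadata stored next to the vector table.
interface IndexMeta {
  provider: string;
  dimension: number;
}

// Rebuild when there is no index yet or when the embedding width has changed.
function needsRebuild(stored: IndexMeta | undefined, probedDimension: number): boolean {
  if (stored === undefined) return true;
  return stored.dimension !== probedDimension;
}

// Example: an index built with 384-wide vectors, then a provider returning 1024-wide ones.
console.log(needsRebuild({ provider: "lite", dimension: 384 }, 1024));
```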

## 📊 Metrics & telemetry

- Centralised LLM cost bookkeeping behind a single `recordLlmCost` helper — every code path (tier3, phase, refinement) now goes through one accounting hook.
- Refinement tokens propagate correctly through tier3 → phase → `recordLlmCost` (previously dropped on the floor).
- `llm_call` events are now emitted for the coherence-review and content-review passes too, so the cost ndjson covers the full pipeline.
- `runDocContentReview` is wrapped in `runWithContext` so its events carry the right ndjson phase tag.
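
A single accounting hook of the kind described above might look like the sketch below. Only the name `recordLlmCost` comes from the release notes; the `LlmUsage` shape, the `CostLedger` class, and the phase strings are assumptions made for illustration.

```typescript
// Assumed shape of one LLM call's usage record.
interface LlmUsage {
  phase: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

// Hypothetical ledger: every code path reports through recordLlmCost.
class CostLedger {
  private entries: LlmUsage[] = [];

  recordLlmCost(usage: LlmUsage): void {
    this.entries.push({ ...usage });
  }

  totalUsd(): number {
    return this.entries.reduce((sum, e) => sum + e.costUsd, 0);
  }

  totalsByPhase(): Map<string, number> {
    const totals = new Map<string, number>();
    for (const e of this.entries) {
      totals.set(e.phase, (totals.get(e.phase) ?? 0) + e.costUsd);
    }
    return totals;
  }
}

const ledger = new CostLedger();
ledger.recordLlmCost({ phase: "tier3", model: "model-a", inputTokens: 1000, outputTokens: 200, costUsd: 0.25 });
ledger.recordLlmCost({ phase: "refinement", model: "model-a", inputTokens: 400, outputTokens: 100, costUsd: 0.5 });
console.log(ledger.totalUsd());
```

Routing every call through one method is what makes it hard for a path such as refinement to drop its tokens silently.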

## ✂️ Output concision discipline

- New "output concision discipline" prompt block added to every axis and to non-RAG services.
- Empirical study (`docs/concision-discipline-study.md`) documents the calibration methodology behind the new prompts.

## 📚 Docs pipeline

- The `content-review` pass is now merged into the updater loop, removing one full pipeline phase and the round-trip it cost.

## 🛠️ CLI, build & Makefile

- `anatoly --version` now prints the git commit SHA alongside the package version.
- `make update` supports a `BRANCH=` variable to refresh from a non-`main` branch.
- `make update` supports a `COMMIT=` variable to pin a specific revision.
- `sync-motd` build step is idempotent (no spurious diffs on repeated builds).

## 🔒 Open-source hygiene

- New `SECURITY.md` with endpoint inventory and threat model.
- New `CONTRIBUTING.md`, `CODE_OF_CONDUCT.md`, plus a CI workflow.
- `README.md` got a TL;DR section.

## 🛡️ Security & dependencies

- `protobufjs` and `ip-address` pinned via `overrides` to clear critical CVEs.
- Stale `hasInstallScript` flag dropped from the lockfile.

## 🔧 Refactoring

- `RunConfig` type and builder extracted into a dedicated core module (`run.ts` is no longer the source of truth for config assembly).

---

## Full changelog

`git log v0.9.5..v0.9.6` — 34 commits.

**Compare:** https://github.com/r-via/anatoly/compare/v0.9.5...v0.9.6

v0.9.5 — External Embeddings, Config v3 & Estimate Forecast

May 5, 2026 · v0.9.5

## Highlights

This release lands two major epics — **Epic 50 (External Embedding Providers)** and **Epic 48/49 (First-Run Onboarding)** — alongside a complete overhaul of the cost forecasting / `estimate` view, a new **declarative v3 config schema**, and a runtime pricing registry sourced from upstream catalogs.

---

## 🌐 Epic 50 — External Embedding Providers

Anatoly is no longer tied to local GGUF / ONNX runtimes. You can now bring your own embedding provider — OpenAI, Voyage, Cohere, Mistral, Qwen (via OpenRouter), Azure OpenAI, HuggingFace Inference Endpoints, or any custom OpenAI-compatible base URL.

- **New `external` tier** in the first-run wizard with provider sub-prompts and best-of-breed defaults (`voyage-code-3` for code, Qwen3 for NLP).
- **Per-axis providers**: pick distinct providers/models for `code` vs `nlp`, or reuse the same with one click.
- **Vercel AI SDK embedding factory** with dim-probe and signature cache — replaces the legacy native GGUF runtime end-to-end.
- **Provider registry** (`KNOWN_EMBEDDING_PROVIDERS`) covering OpenAI, Voyage, Cohere, Mistral, Qwen, plus a `Custom (manual)` path with `base_url` + `env_key` validation.
- **Pre-flight connectivity check** for the configured embedding backend before audits start.
- **Config write**: fully self-explanatory `.anatoly.yml` emitted on first run, no implicit defaults.
- **Documentation**: new `docs/embedding-providers.md` covering Azure OpenAI internal, self-hosted GGUF clusters, and HF Inference Endpoints.
- **Cost projection**: `estimate` now projects embedding token counts and prices them per axis for SDK-backed providers.

## 🪄 Epic 48/49 — First-Run Onboarding

A fully reworked onboarding flow that takes a brand-new user from `npx anatoly` to a working audit without manual config.

- **First-run wizard** with tier (`lite` / `advanced` / `external`) and mode prompts.
- **Inline ONNX prefetch** for lite tier and **inline GGUF download** with streaming SHA-256 verification, partial-file cleanup, and post-download verify.
- **Subprocess `setup-embeddings`** (now renamed `local-embeddings <upgrade|status>`) runs automatically after GGUF download.
- **Always-write `.anatoly.yml`** with sane defaults so config is discoverable.
- **Cross-project preferences** stored in `~/.anatoly/preferences.yml`.
- **Plain-mode parity**: tier comparison table and transparency notice render correctly without TTY/colors.
- **Quick-win runtime filter** + summary suggestion to surface the cheapest first audit.
- **Recovery prompts** for download failures (retry / fallback to lite / abort).
- **End-of-setup 3-choice prompt** and post-audit progressive education hint.
- **`--defaults-settings` flag** for fully non-interactive CI runs.
- **Visual setup-to-audit transition** so users know when the wizard hands off to the audit.
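
The SHA-256 verification and partial-file cleanup in the list above could be approximated as below. This is a simplified sketch under stated assumptions: it hashes the file in chunks after the download rather than during it, and `sha256OfFile` and `verifyOrCleanup` are hypothetical names, not the wizard's real functions.

```typescript
import { createHash } from "node:crypto";
import * as fs from "node:fs";

// Hash a file in fixed-size chunks so a large model file never sits fully in memory.
function sha256OfFile(path: string, chunkSize = 1024 * 1024): string {
  const hash = createHash("sha256");
  const buffer = Buffer.alloc(chunkSize);
  const fd = fs.openSync(path, "r");
  try {
    let bytesRead = fs.readSync(fd, buffer, 0, buffer.length, null);
    while (bytesRead > 0) {
      hash.update(buffer.subarray(0, bytesRead));
      bytesRead = fs.readSync(fd, buffer, 0, buffer.length, null);
    }
  } finally {
    fs.closeSync(fd);
  }
  return hash.digest("hex");
}

// Returns true when the digest matches; otherwise removes the partial or corrupt file.
function verifyOrCleanup(path: string, expectedSha256: string): boolean {
  const ok = sha256OfFile(path) === expectedSha256.toLowerCase();
  if (!ok) fs.rmSync(path, { force: true });
  return ok;
}
```

A caller would run `verifyOrCleanup` once the download finishes and fall through to a recovery prompt (retry, fall back to lite, abort) when it returns `false`.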

## 📊 Estimate / Forecast — Major Refresh

The `estimate` command went from a flat dump to an actionable, table-driven forecast.

- **Unified `Cost breakdown` table** powered by `cli-table3`, with full model IDs, embedding labels, total footer, and a `based on latest public provider price` caption.
- **Per-step billing mode** (billed vs consumption) and per-step cost breakdown, including **Anthropic prompt-cache modeling** and a doc-generation heuristic (Pass A).
- **Forecast block reordered** for CLI-friendly reading: Forecast last, merged Configuration / Pipeline Plan first.
- **Scenario flags**: `--files`, `--axes`, `--no-deliberation`, `--no-internal-docs`.
- **Step IDs split into category + name**, plus `--json` output for programmatic use.
- **Deliberation cost** now modeled as a fixed cost per shard (not per axis × file), with per-shard tokens calibrated to opus-4-6 pricing.
- **Per-axis output multipliers** calibrated from R3 actuals and rebalanced to average 1.0.
- **Real bootstrap page count**, RAG-unfiltered scope, and the previously-missing coherence step are now reflected.
- **Pipeline Summary reworked**, with opaque rows like 'usage graph N edges' dropped.
- **NLP summarizer cost** included in LLM forecast totals (Pass 1) and `summaryModel` wired through.

## ⚙️ Config v3 Schema

- **Declarative providers + routing** in a single v3 schema — replaces the v0/v1/v2 migration chain (legacy migrations dropped).
- **Annotated YAML template** emitted on first run.
- **External tier wiring**: runtime correctly resolves the external embedding tier through the RAG pipeline.

## 💰 Pricing Registry

- **Runtime pricing registry** sourced from **litellm** + **OpenRouter** catalogs — replaces the hardcoded `MODEL_PRICING` map.
- **Fail-loud strict mode**: runs are blocked when any model has no pricing, instead of silently estimating zero.
- **Pricing gate moved** to fire after the first-run wizard, never before.
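
The fail-loud behaviour can be illustrated with a small gate that refuses to continue when any configured model lacks a price. The registry shape, `MissingPricingError`, and `assertAllPriced` are hypothetical and only show the idea of blocking rather than estimating zero.

```typescript
// Assumed price entry, in USD per million tokens.
interface ModelPrice {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Hypothetical error type listing every model without pricing.
class MissingPricingError extends Error {
  models: string[];
  constructor(models: string[]) {
    super(`No pricing found for: ${models.join(", ")}`);
    this.name = "MissingPricingError";
    this.models = models;
  }
}

// Throws instead of silently pricing unknown models at zero.
function assertAllPriced(models: string[], registry: Map<string, ModelPrice>): void {
  const missing = models.filter((m) => !registry.has(m));
  if (missing.length > 0) throw new MissingPricingError(missing);
}

const registry = new Map<string, ModelPrice>([
  ["model-a", { inputPerMTok: 3, outputPerMTok: 15 }],
]);
assertAllPriced(["model-a"], registry); // passes: every model is priced
```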

## 🔌 Provider & Transport Improvements

- **OpenRouter** integrated as an aggregator for Qwen3-Embedding-8B, with app-attribution headers.
- **Cache-token capture** fixed for Vercel AI SDK v6 + Gemini and Claude Agent SDK (snake_case `cache_creation_input_tokens` / `cache_read_input_tokens`).
- **Per-call LLM telemetry** persisted on disk; Anthropic token capture hardened.
- **Auth column** in providers list now derived from the v3 declaration and covers every provider.
- **Unified provider-auth notice** for Anthropic + Google with inline A./B. labels.
- **Anthropic pre-flight probe** before starting the run.

## 🩹 Wizard / UX Fixes

- Stop saying 'Embeddings (lite) ready' when advanced is the active tier.
- Unified embeddings tier notice + comparison into one block; renamed `default` → `lite`.
- OpenRouter (Qwen3-8B) promoted as the recommended NLP provider.
- External setup exits cleanly instead of failing the run.
- `--file` glob with zero matches now fails fast with a clear error.
- Run-summary shows input/output tokens in the cost line; hardcoded subscription hint removed.
- Pre-summary hint detector for missing init / lite RAG upgrade.
- `local-embeddings` patches config and skips the wizard prompt after upgrade.
- Doc-bootstrap: `scaffold-status` tag now certifies docs validity (no more file-presence guessing).

## 🛠 Install / Build / DevEx

- **Lazy-load model download** on first run — drops the postinstall download.
- **`make update`** recipe to refresh from `origin/main` and reinstall in one step; merged into a single shell so 'up to date' skips install.
- **WSL guard** for Windows-installed Node, with install doc note.
- **`prepare` script** runs `tsup` so `npm install <git-url>` works; self-heals devDeps when npm git-install skips them.
- Migrated `@xenova/transformers` → `@huggingface/transformers`.
- Bumped `@google/gemini-cli-core` 0.35.2 → 0.40.1.

## 🧹 Misc

- Centralised default model identifiers in `core/default-models.ts`.
- Code-review hardening pass.
- `removeRunIfEmpty` + `readLatestPointer` exports restored on `run.ts`.
- README links to `anatoly.cloud` + free/star CTA.
- README + CLI / module / cost-optimization docs updated to match the new `estimate` view.

---

**Full changelog**: https://github.com/r-via/anatoly/compare/v0.9.4...v0.9.5

v0.9.4 — Background Worktree Review & Internal Docs Injection

Apr 28, 2026 · v0.9.4

## Highlights

### Background worktree review
Run audits in isolated git worktree snapshots without locking your main checkout. The new background mode forks the audit into its own process, persists per-run status, and notifies you on completion — letting you keep working while a long review runs.

- Isolated worktree per run (no conflicts with WIP changes on `HEAD`)
- Parallel runs supported via per-run lock policy (global lock skipped in background mode)
- `anatoly status` enriched with tracked background runs (PID, phase, elapsed)
- Desktop notifications on completion (`notify-send` / equivalent)
- New `anatoly cleanup` command — prune stale worktrees, lock files, and orphaned run dirs

### Internal docs as ground truth in business-logic axes
The internal docs scaffolder (`.anatoly/docs/`, agent-curated business overview / architecture / invariants) is now injected as authoritative project context in the `correction`, `best_practices`, and `overengineering` axes — not just in `documentation`. Findings can cite the source page path, making the chain of reasoning auditable. Zero additional LLM cost: the docs are already generated by the existing scaffolder phase.

### Industry-domain prompting
When the model can confidently infer your project's domain (gambling/casino, finance/payments, healthcare/PII, cryptography, gaming RNG, real-time systems) from filenames, imports, package metadata, README, or internal docs — it now applies well-known industry rules from its pretrained knowledge. Examples: `Math.random()` flagged as non-certifiable for regulated gaming, floating-point arithmetic flagged on monetary code, deprecated cryptographic primitives (MD5/SHA-1/ECB) flagged as critical. Each such finding cites both the inferred domain and the rule, keeping the speculative chain auditable.

### Per-defect correction findings
Symbols carrying multiple independent defects now split into one row per defect in the report instead of collapsing into a single prose paragraph. Each finding has its own `line_start` / `line_end` / `detail` for clearer signals and reproducible verdicts.

### Transport-Level Resilience
Per-provider semaphores and circuit breakers are now centralized in `TransportRouter` with a unified `acquireSlot` / `release` API. Replaces the previous mix of manual semaphores and the legacy `GeminiCircuitBreaker`. All agentic call sites migrated; agentic and single-turn calls share concurrency policy by provider.
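
To make the pattern concrete, here is a deliberately simplified sketch of a per-provider gate that combines a concurrency limit with a consecutive-failure circuit breaker. It is not the real `TransportRouter`: `ProviderGate` and `SketchRouter` are hypothetical, `tryAcquireSlot` is a non-blocking stand-in for `acquireSlot`, and there is no half-open or timed reset state.

```typescript
// Hypothetical gate: one instance per provider.
class ProviderGate {
  private inFlight = 0;
  private consecutiveFailures = 0;
  private maxConcurrent: number;
  private failureThreshold: number;

  constructor(maxConcurrent: number, failureThreshold: number) {
    this.maxConcurrent = maxConcurrent;
    this.failureThreshold = failureThreshold;
  }

  // The breaker is open once too many calls in a row have failed.
  isOpen(): boolean {
    return this.consecutiveFailures >= this.failureThreshold;
  }

  // Non-blocking acquire: false when the breaker is open or all slots are taken.
  tryAcquireSlot(): boolean {
    if (this.isOpen() || this.inFlight >= this.maxConcurrent) return false;
    this.inFlight += 1;
    return true;
  }

  // Releasing a slot also reports the outcome, so tracking is automatic.
  release(success: boolean): void {
    if (this.inFlight === 0) throw new Error("release() without a matching acquire");
    this.inFlight -= 1;
    this.consecutiveFailures = success ? 0 : this.consecutiveFailures + 1;
  }
}

// Hypothetical router: agentic and single-turn calls share one gate per provider.
class SketchRouter {
  private gates = new Map<string, ProviderGate>();

  gateFor(provider: string): ProviderGate {
    let gate = this.gates.get(provider);
    if (gate === undefined) {
      gate = new ProviderGate(2, 3); // assumed limits, for illustration only
      this.gates.set(provider, gate);
    }
    return gate;
  }
}
```

Keeping both mechanisms behind one object per provider is what lets every call site share the same concurrency policy.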

## Bench progression (anatoly-bench / slot-engine fixture)

Each run is a full audit of the [`slot-engine`](https://github.com/r-via/anatoly-bench/tree/main/catalog/slot-engine) fixture, scored against a curated ground-truth catalog. Global F1 is the unweighted mean of per-axis F1s.

| Run | Date | Global F1 | correction | utility | duplication | overengineering | best-practices |
|-----|------|----------:|-----------:|--------:|------------:|----------------:|---------------:|
| v6  | 2026-04-24 | 56.8%     | 54.5%      | 60.0%   | **66.7%**   | 66.7%           | 36.4%          |
| v7  | 2026-04-26 | 65.5%     | 61.5%      | 60.0%   | 66.7%       | 66.7%           | **72.7%**      |
| v8  | 2026-04-27 | 62.7%     | 36.4%      | **85.7%** | 66.7%     | 75.0%           | 50.0%          |
| v9  | 2026-04-27 | 61.0%     | 46.2%      | 85.7%   | 66.7%       | 66.7%           | 40.0%          |
| v10 | 2026-04-28 | 65.0%     | 53.3%      | 85.7%   | 66.7%       | 75.0%           | 44.4%          |
| v11 | 2026-04-28 | 57.8%     | 44.4%      | 85.7%   | 66.7%       | 33.3%           | 58.8%          |
| **v12** | 2026-04-28 | **67.8%** | 53.3% | 85.7% | 66.7% | **66.7%** | **66.7%** |
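
The "unweighted mean of per-axis F1s" definition can be checked against a table row with a few lines of code. This is only a sketch: `globalF1` is a hypothetical helper, and the inputs are the rounded percentages from the v12 row rather than the benchmark's raw scores.

```typescript
// Unweighted mean of per-axis F1 scores (percentages in, percentage out).
function globalF1(perAxis: Record<string, number>): number {
  const values = Object.values(perAxis);
  if (values.length === 0) throw new Error("at least one axis is required");
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Rounded per-axis values copied from the v12 row of the table.
const v12 = {
  correction: 53.3,
  utility: 85.7,
  duplication: 66.7,
  overengineering: 66.7,
  "best-practices": 66.7,
};

console.log(globalF1(v12).toFixed(1)); // prints "67.8"
```

Because the table cells are already rounded, recomputing other rows this way can differ from the published global figure in the last digit (the v6 cells average to 56.86, published as 56.8%).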

Six fixes landed during this release window, each measured against the previous baseline:

- **v6 — duplication tier-1 invariant** ([44f0617](https://github.com/r-via/anatoly/commit/44f0617)). Tier-1 refinement was downgrading `DUPLICATE` verdicts when RAG similarity stayed below 0.68, even with a concrete `duplicate_target`. duplication: 0% → 66.7%.
- **v8 — per-axis triage policy** ([b784caf](https://github.com/r-via/anatoly/commit/b784caf)). Triage skip-tier was binary: type-only / trivial / barrel files bypassed every axis with safe defaults — utility lost real DEAD signals. Now skip decisions are per-axis, with usage graph consulted for utility on skipped files. utility: 66.7% → 85.7%.
- **v9 — multi-defect findings per symbol** ([75cdf08](https://github.com/r-via/anatoly/commit/75cdf08)). Correction now returns an optional `findings[]` array per symbol; the shard renderer emits one row per defect.
- **v10 — internal-docs injection into business-logic axes** ([a584b80](https://github.com/r-via/anatoly/commit/a584b80)). Anatoly's existing `.anatoly/docs/` already produced high-quality business context, used only by the documentation axis. Now also fed into `correction` / `best_practices` / `overengineering` with a ground-truth framing in the system prompt. correction: 46.2% → 53.3%; INV-ROUND detected.
- **v11 — industry-knowledge prompting** ([d0068a2](https://github.com/r-via/anatoly/commit/d0068a2)). Prompt rule inviting the model to apply well-known industry-specific rules (gaming RNG / monetary arithmetic / deprecated cryptographic primitives) when domain inference is confident, with mandatory citation of both inferred domain and rule. best-practices recall hit 100% (5/5) for the first time; BP-RNG (`Math.random()` in gaming) detected.
- **v12 — anti-collapse rules + temperature pin** ([d8fd931](https://github.com/r-via/anatoly/commit/d8fd931), [ebb8505](https://github.com/r-via/anatoly/commit/ebb8505)). Two changes: (1) "flag the source of a defect, not its consumer" rule on correction / OE / best-practices prompts — fixes run-to-run-flapping verdicts where the LLM oscillated between flagging one consumer-side finding vs N source-side findings. (2) `temperature: 0` pinned in the Vercel SDK transport for evaluator reproducibility (Anthropic Claude Agent SDK and Gemini CLI use SDK defaults — they do not expose temperature). OE: 33.3% → 66.7% with 100% precision; global F1: 57.8% → 67.8%.

**Net result for v0.9.4**: global F1 climbed from **56.8%** (start of cycle) to **67.8%** (+11.0 percentage points), with structural improvements on every axis. Full per-run baselines: [`anatoly-bench/baselines/`](https://github.com/r-via/anatoly-bench/tree/main/baselines).

## Fixes

- **Review progress counter** — display could show impossible values like `13/12` when triage skip-tier files had partial-axes policies (trivial files keeping correction/duplication/utility). `evaluateTotal` now mirrors the handler's actual evaluator-runs decision.
- **Refinement / DUPLICATE preservation** — never downgrade a `DUPLICATE` verdict in tier 2 when `duplicate_target` is populated; preserve `DUPLICATE` on dead code instead of collapsing to `DEAD`.
- **Triage** — per-axis skip policy keeps real signal on trivial / barrel-export / type-only / constants-only files (correction + duplication + utility still run on trivial files); usage-graph utility evaluation correctly resolves DEAD on skipped type/constant exports.
- **Anti-collapse rule** — correction and overengineering prompts now instruct: flag the source of a defect, not its consumer (defects have one canonical home — where they're defined).
- **Telegram notifications** — disabled axes excluded from the scorecard so cosmetic placeholder rows don't pollute the message.
- **Vercel SDK transport** — pinned `temperature=0` for evaluator reproducibility (deterministic verdicts on identical input).

## Internals

- BMAD/Ralph workflow integration improvements: parser tolerates story / epic heading variants, sprint-status sync points clarified.
- `domain-digest` feature explored, implemented, then reverted in favor of internal docs injection (same goal, no parallel extraction pipeline, zero additional LLM cost). Original spec preserved as deprecated history in `anatoly-bench/docs/02-domain-digest-spec.md`.

## Migration

Drop-in upgrade from 0.9.3. No config changes required; new flags are opt-in:

```bash
anatoly run                    # default — same as before
anatoly status                 # now shows background runs
anatoly cleanup                # new — prune stale worktrees / locks
```

The internal-docs injection only activates when `.anatoly/docs/` already exists for a project — generated automatically on first run by the existing scaffolder phase.

**Full changelog**: https://github.com/r-via/anatoly/compare/v0.9.3...v0.9.4

v0.9.3

Apr 24, 2026 · v0.9.3

## What's Changed since v0.9.2

### Features

- **Overengineering axis — usage-graph signal + duplication invariant** ([b129cd6](https://github.com/r-via/anatoly/commit/b129cd6)): the overengineering evaluator now factors in the usage graph (symbols with few runtime importers are weighted differently from hot code) and enforces a duplication invariant preventing contradictory verdicts.
- **Global refinement cache with freshFiles invalidation** ([3071dfc](https://github.com/r-via/anatoly/commit/3071dfc)): per-finding cache in `.anatoly/cache/` now survives across runs with freshly-reviewed files auto-evicted, preventing redundant tier-3 investigations.
- **Zod validation retry in agenticQuery** ([f617d1a](https://github.com/r-via/anatoly/commit/f617d1a)): transport layer retries once on schema validation failures before bubbling the error, reducing flaky runs caused by transient LLM output drift.

### Fixes

- **Review coherence and cache invariants** ([3c27a0a](https://github.com/r-via/anatoly/commit/3c27a0a)): multiple inter-axis coherence bugs and stale-cache edge cases resolved.
- **extract-json fence matching** ([fdd9c77](https://github.com/r-via/anatoly/commit/fdd9c77)): only matches ` ```json ` fences, no longer swallows ` ```rust ` or other-language blocks that happen to look JSON-like.
- **Tier1/tier2 progress output** ([e6a14a8](https://github.com/r-via/anatoly/commit/e6a14a8)): finding totals now surface in refinement progress so you can see how many findings each tier is processing.
- **Transport hardening — adversarial review #1-#10** ([b10e5c0](https://github.com/r-via/anatoly/commit/b10e5c0)): ten findings from the adversarial transport review addressed.
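
The fence-matching fix can be pictured with a regular expression that accepts only fences tagged exactly `json`. This is a sketch, not the project's `extract-json` module: `extractJson` and the pattern are assumptions, and any fallback the real code may have for unfenced JSON is left out.

```typescript
// Build the fence marker programmatically to keep this example readable inside markdown.
const FENCE = "`".repeat(3);

// Matches an opening fence tagged exactly "json", then lazily captures up to the next fence.
const JSON_FENCE = new RegExp(FENCE + "json[ \\t]*\\r?\\n([\\s\\S]*?)" + FENCE);

// Returns the parsed JSON from the first json-tagged fence, or undefined.
function extractJson(text: string): unknown {
  const match = JSON_FENCE.exec(text);
  if (match === null) return undefined;
  try {
    return JSON.parse(match[1] ?? "");
  } catch {
    return undefined;
  }
}

// Helper used only to build sample inputs.
function fenced(tag: string, body: string): string {
  return FENCE + tag + "\n" + body + "\n" + FENCE;
}

console.log(extractJson("Result:\n" + fenced("json", '{"verdict":"DEAD"}')));
console.log(extractJson(fenced("rust", '{"verdict":"DEAD"}'))); // undefined: not a json fence
```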

### Docs

- **Advanced Configuration rewrite for v2 schema** ([d6f0143](https://github.com/r-via/anatoly/commit/d6f0143)): [docs/03-Guides/02-Advanced-Configuration.md](https://github.com/r-via/anatoly/blob/main/docs/03-Guides/02-Advanced-Configuration.md) rewritten to match the current v2 config schema.

### Chore

- **Lint cleanup** ([06441cc](https://github.com/r-via/anatoly/commit/06441cc)): removed unused imports/vars, routed Telegram warnings through the central logger.

**Full changelog**: https://github.com/r-via/anatoly/compare/v0.9.2...v0.9.3

v0.9.2

Mar 31, 2026 · v0.9.2

## What's Changed since v0.9.1

### 3-Tier Refinement Pipeline (replaces per-file Deliberation)

The per-file Opus deliberation pass is replaced by a post-review refinement pipeline that processes all findings in batch:

- **Tier 1 — Deterministic auto-resolve** (0 tokens): usage graph confirms DEAD exports are truly unreferenced, AST validates line ranges, RAG confirms duplication candidates. Resolves ~40% of findings instantly
- **Tier 2 — Inter-axis coherence** (0 tokens): detects contradictions (DEAD + NEEDS_FIX is moot, type-only importers can't be OVER, LOW_VALUE coherence checks). Deterministic rules, no LLM
- **Tier 3 — Agentic investigation** (Opus): launches an agent with full tool access (Read, Grep, Bash, WebFetch) to investigate ambiguous findings with empirical evidence. Conversation transcripts dumped per finding
- **Global refinement cache** — per-finding persistence in `.anatoly/cache/` survives across runs; freshly-reviewed files are auto-evicted; `--no-cache` clears it
- **[CACHED] shard display** — cached shards show `[CACHED]` tag in deliberation output, matching review phase style
- **Finding totals in progress** — tier 1/tier 2 now show resolved/total and confirmed counts for full visibility
- **Results**: 22% faster, 20% cheaper, 150% more CLEAN files vs legacy deliberation
- Per-shard progress display with finding-level granularity
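
One of the Tier 2 rules listed above, "DEAD + NEEDS_FIX is moot", can be written as a deterministic check with no LLM involved. The sketch below is hypothetical: the `SymbolVerdicts` shape, the `USED` and `OK` labels, and the choice to drop the correction finding are assumptions made for illustration.

```typescript
// Assumed per-symbol verdicts; only DEAD, LOW_VALUE and NEEDS_FIX appear in the release notes.
interface SymbolVerdicts {
  symbol: string;
  utility?: "USED" | "DEAD" | "LOW_VALUE";
  correction?: "OK" | "NEEDS_FIX";
}

interface CoherenceResult {
  verdicts: SymbolVerdicts;
  notes: string[];
}

// Deterministic rule: fixing a symbol that is about to be deleted is moot.
function applyDeadNeedsFixRule(input: SymbolVerdicts): CoherenceResult {
  const verdicts: SymbolVerdicts = { ...input };
  const notes: string[] = [];
  if (verdicts.utility === "DEAD" && verdicts.correction === "NEEDS_FIX") {
    delete verdicts.correction;
    notes.push(`${verdicts.symbol}: NEEDS_FIX dropped because the symbol is DEAD`);
  }
  return { verdicts, notes };
}

console.log(applyDeadNeedsFixRule({ symbol: "legacyParse", utility: "DEAD", correction: "NEEDS_FIX" }));
```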

### Multi-Provider Transport Architecture (Epic 43)

Complete rewrite of the LLM transport layer:

- **Mode-aware TransportRouter** — routes models to native transports (subscription) or Vercel AI SDK (API billing) based on provider config
- **Vercel AI SDK transport** — unified API billing for any provider (Anthropic, Google, OpenAI) with cost calculation
- **Config v2 format** — `providers:`, `models:`, `agents:`, `runtime:` sections replace flat `llm.*` paths. Automatic v1→v2 migration
- **`anatoly init` wizard** — interactive multi-provider setup with model selection
- **Per-provider semaphores and circuit breakers** (Epic 46) — `acquireSlot()`/`release()` pattern with automatic success/failure tracking
- **`extractProvider()`/`stripPrefix()`** — model prefix inference (`google/gemini-2.5-flash` or bare `gemini-2.5-flash`)
- **Agentic query** — `agenticQuery()` on TransportRouter for tier 3 dispatch with Bash tool + web search
- **Zod validation retry** in agentic queries
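
The prefix handling named above (`extractProvider()` / `stripPrefix()`) might behave roughly like this sketch. The function names come from the release notes, but the signatures and the bare-name hint table are assumptions, not the project's actual logic.

```typescript
// Hypothetical hints for inferring a provider from a bare model name.
const BARE_NAME_HINTS: Array<[string, string]> = [
  ["gemini", "google"],
  ["claude", "anthropic"],
  ["gpt", "openai"],
];

// "google/gemini-2.5-flash" -> "gemini-2.5-flash"; bare names pass through unchanged.
function stripPrefix(model: string): string {
  const slash = model.indexOf("/");
  return slash > 0 ? model.slice(slash + 1) : model;
}

// Prefer an explicit "provider/" prefix; otherwise fall back to the assumed hints.
function extractProvider(model: string): string | undefined {
  const slash = model.indexOf("/");
  if (slash > 0) return model.slice(0, slash);
  const hint = BARE_NAME_HINTS.find(([prefix]) => model.startsWith(prefix));
  return hint ? hint[1] : undefined;
}

console.log(extractProvider("google/gemini-2.5-flash"), stripPrefix("google/gemini-2.5-flash"));
console.log(extractProvider("gemini-2.5-flash"), stripPrefix("gemini-2.5-flash"));
```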

### Telegram Notifications (Epic 45)

- **`anatoly notifications create-bot`** — interactive setup wizard for Telegram bot
- **`anatoly notifications test`** — send a test notification
- **`anatoly report --notify`** — send notification after report generation
- **Auto-notify** after each `anatoly run` — single photo+caption with compressed banner, health bars, severity breakdown, token stats
- Budget-aware findings truncation to fit Telegram's 1024 caption limit
- Fire-and-forget: delivery failures never break the pipeline
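
Budget-aware truncation against the 1024-character caption limit could work along these lines. This is an illustrative sketch: `buildCaption` and the `+N more` footer format are hypothetical, and length is measured with JavaScript's `string.length`.

```typescript
const TELEGRAM_CAPTION_LIMIT = 1024;

// Adds finding lines while they fit, always reserving room for a "+N more" footer.
function buildCaption(header: string, findings: string[], limit = TELEGRAM_CAPTION_LIMIT): string {
  const lines = [header];
  let used = header.length;
  for (let i = 0; i < findings.length; i++) {
    const remainingAfter = findings.length - i - 1;
    const footerCost = remainingAfter > 0 ? `\n+${remainingAfter} more`.length : 0;
    const lineCost = 1 + findings[i].length; // newline plus the finding text
    if (used + lineCost + footerCost > limit) {
      lines.push(`+${findings.length - i} more`);
      break;
    }
    lines.push(findings[i]);
    used += lineCost;
  }
  // Last-resort guard in case the header alone leaves no room for the footer.
  return lines.join("\n").slice(0, limit);
}

const caption = buildCaption("Audit done", ["x".repeat(600), "y".repeat(600), "z".repeat(600)]);
console.log(caption.length <= TELEGRAM_CAPTION_LIMIT, caption.endsWith("+2 more"));
```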

### User Instructions — ANATOLY.md

- **`ANATOLY.md`** project-level instructions file — custom rules injected into axis system prompts
- Loader with frontmatter parsing and section extraction
- Per-axis skip patterns in config (`axes.*.skip`)
- Show custom rules in configuration table during setup

### Review Engine

- **Duplication auto-UNIQUE** — skip LLM when no RAG similarity candidates exist
- **Utility retry** — retry when LLM omits symbols instead of crashing
- **Correction refinement** — 4 new deterministic rules reduce false positives
- Remove projectTree injection from overengineering and tests axes (token savings)
- Pass `userInstructions` to all evaluators

### RAG

- Detect and log NLP name mismatches that cause infinite re-indexing
- Garbage-collect orphaned cache entries after parser line shifts
- Truncate NLP summaries instead of rejecting on >400 chars
- Route NLP summarization through TransportRouter
- Remove duplicate `rag:` prefix from log messages

### CLI & UX

- Dynamic provider display in pipeline header
- Per-provider concurrency slots display (Claude + Google)
- Move run-only options from global scope to run command
- Unify deliberation step as single "Deliberation" task in UI
- Include file paths and messages in error summary
- Doc scaffold conversation transcripts
- `anatoly runs` command + latest pointer helpers
- Fix `--no-color` flag

### Report

- Add `--debug` flag for report generation
- Fix `--notify` to use real health percentages

### Clean Loop

- Kill spawned claude process on SIGINT/SIGTERM
- Restore original branch on interrupt (stash + checkout + pop)

### Config

- Gold-set: 6 fixtures covering all 7 axes
- Config v1.0 schema with validation tests
- Per-axis skip patterns

### Bug Fixes

- Fix `extract-json` to only match ` ```json ` fences, not ` ```rust ` or other langs
- Fix `isGeminiModel` crash, use `stripPrefix` everywhere
- Suppress `console.error` from gemini-cli-core rate limit retries
- Fix Zod v4 refine compatibility in utility axis
- Fix cached file metrics in triage
- Fix triage to respect enabled axes in skip reviews
- Adversarial reviews: Epic 41 (10 fixes), Epic 42 (5 fixes), Epic 43 (7 fixes), Epic 46 (10 fixes)

**Full Changelog**: https://github.com/r-via/anatoly/compare/v0.9.1...v0.9.2

v0.9.1

Mar 27, 2026 · v0.9.1

## What's Changed since v0.9.0

### Gemini Transport

- **GenAI SDK transport** — added `@google/genai` SDK as alternative Gemini transport alongside `gemini-cli-core`, with concurrency stress test and token optimizations

### RAG

- **Scoped `--rebuild-rag`** — when used with `--file`, only purges vector store entries and caches for matching files instead of dropping the entire table
- **Gemini semaphore** — pass `geminiSemaphore` through the full RAG pipeline (orchestrator → nlp-summarizer → runSingleTurnQuery)

### Report

- **Health bar severity scaling** — degrade health bar color based on high-severity finding count, scale thresholds by codebase size
- Remove `buildReportsBaseUrl` from report command

### Scripts

- Portable `awk` in `free_port`, bounded timeout in `wait_for_gguf`

**Full Changelog**: https://github.com/r-via/anatoly/compare/v0.9.0...v0.9.1

v0.9.0

Mar 27, 2026 · v0.9.0

## What's Changed since v0.8.2

### Multi-Provider LLM — Gemini 2.5 Flash (Experimental)

**LlmTransport abstraction** — pluggable provider layer with `AnthropicTransport` and `GeminiTransport`:
- `LlmTransport` interface and `TransportRouter` for model-to-provider routing (Story 37.1)
- `AnthropicTransport` wraps existing Claude SDK calls (Story 37.2)
- `GeminiTransport` wraps `@google/gemini-cli-core` with Google OAuth (Story 37.3)
- Auth check with graceful fallback to Claude when Gemini is unavailable (Story 37.5)
- Circuit breaker: auto-falls back to Claude on repeated Gemini failures

**Axis routing** — utility, duplication, overengineering now routed to Gemini 2.5 Flash (Story 38.1):
- 100% accuracy on gold-set benchmarks, 2-5s latency, implicit caching (96% hit rate on 2nd call)
- Correction, tests, best practices, documentation remain on Claude (quality-critical)
- Deliberation stays on Claude Opus (non-negotiable safety net)

**NLP summarization** routed to Gemini Flash (Story 39.1) — 100% schema validity, $0/token

**Impact:** ~69% reduction in Claude API calls, ~74% cost reduction, ~35-40% faster runs

**New commands:**
- `anatoly providers` — verify LLM connectivity (Claude + Gemini status)

**Infrastructure:**
- Dual semaphores for Claude and Gemini concurrency management
- Provider field in logs and run metrics (Story 39.2)

### Review Engine

- **AST-based import extraction** — replaced regex with tree-sitter AST traversal for `require()`, Python `from-import`, and Bash recursion
- **False positive reduction** — filter private symbols, deduplicate actions, calibrate severity, conservative test coherence
- **Test discovery** — expanded test file discovery, inject tests into deliberation context, deduplication
- **Deliberation memory overhaul** — group by symbol instead of per-axis entries, merge stale entries, escape regex, rebuild on corrupted JSON, truncate `original_detail` to reclassified axes only
- **`--flush-memory` flag** — reset deliberation memory before a run
- **Rate limit handling** — sleep until rate limit reset instead of degrading reviews
- Raise default review concurrency from 4 to 8

### Report

- **`public_report.md`** — new polished public-facing report layout
- **Report upstream extracted** — migrated to standalone script in `anatoly-reports` repo
- **Redesigned report sections** — merged Findings Summary into Axes table, emoji health bars, verdict breakdown for all-clear axes, doc coverage section, total findings count in hero block
- Absolute links to anatoly-reports + breadcrumb navigation
- Executive summary with all 7 axes, fix token metrics

### Clean Loop

- **Subcommand rename** — `clean-run` → `clean run`, `clean-sync` → `clean sync`, etc.
- `clean generate`, `clean run`, `clean sync` as proper subcommands with tests
- Bump default iterations from 10 to 50
- Rename Ralph → clean loop in source code

### Documentation Pipeline

- **Smart chunking** — programmatic H2+H3+paragraph splitting replaces Haiku LLM chunking ($0)
- **Coherence optimization** — single-pass Sonnet, auto-fix, content injection (was multi-pass Opus)
- **Incremental updates** — scope doc updates to touched modules only
- **Doc deduplication** — auto-sync project docs from internal when trees are identical, `docs identity` and `docs reset-project` commands
- **Batch doc embeddings** — one mega-batch instead of per-file
- Auto-fix missing h1 heading, strip preamble from multi-turn agent output
- Raise scaffold maxTurns from 5 to 15, coherence injection budget to 50%
- Track RAG costs, expose cached status for doc section indexing
- Modular type-context injection + remove dead strategy2
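
The programmatic splitting mentioned under Smart chunking could look roughly like the sketch below. It is not Anatoly's actual chunker: `chunkMarkdown` and the `Chunk` shape are hypothetical names, and emitting one chunk per paragraph tagged with its nearest H2/H3 heading is an assumption about how "H2+H3+paragraph splitting" behaves.

```typescript
// Hypothetical chunk shape: the heading path plus one paragraph of text.
interface Chunk {
  heading: string;
  text: string;
}

// Split Markdown into paragraph-sized chunks, each tagged with its nearest
// H2/H3 heading. Fenced code blocks are not treated specially in this sketch.
function chunkMarkdown(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  let h2 = "";
  let h3 = "";
  let paragraph: string[] = [];

  // Emit the buffered paragraph (if any) under the current heading path.
  const flush = (): void => {
    const text = paragraph.join("\n").trim();
    if (text.length > 0) {
      const heading = [h2, h3].filter((h) => h.length > 0).join(" > ");
      chunks.push({ heading, text });
    }
    paragraph = [];
  };

  for (const line of markdown.split("\n")) {
    const h3Match = /^###\s+(.+)$/.exec(line);
    const h2Match = /^##\s+(.+)$/.exec(line);
    if (h3Match) {
      flush();
      h3 = (h3Match[1] ?? "").trim();
    } else if (h2Match) {
      flush();
      h2 = (h2Match[1] ?? "").trim();
      h3 = ""; // a new H2 resets the H3 context
    } else if (line.trim() === "") {
      flush(); // blank line ends the current paragraph
    } else {
      paragraph.push(line);
    }
  }
  flush();
  return chunks;
}
```

Because the split is purely string-based, it makes no LLM call, which is consistent with the `$0` note in the entry above.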

### CLI & UX

- Show "Done !" instead of "0/N checks left" on file completion
- Clear screen on terminal resize to prevent rendering glitches
- Show "done" on completed pipeline tasks instead of stale counters
- Two-column models table, doc cache refresh, estimate doc stats
- Cache breakdown in pipeline output, error dumps, `[review]` prefix
- Reduce path truncation in progress display

### Rust Support

- Chained `super::` resolution, workspace glob expansion, cross-crate usage tracking
- Cargo workspace detection as Monorepo
- Language-aware output formatting

### Security

- Shell injection fixes — `execFileSync`, validate NWO format
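
As an illustration of the NWO ("name with owner", i.e. `owner/repo`) validation mentioned above, here is a minimal sketch. The pattern and the `isValidNwo` name are assumptions, not the rule Anatoly ships; the character classes are a conservative guess at what a repository slug may contain.

```typescript
// Assumed rule: owner is alphanumerics and hyphens (max 39 chars),
// repo is alphanumerics, dots, underscores and hyphens (max 100 chars).
const NWO_PATTERN = /^[A-Za-z0-9][A-Za-z0-9-]{0,38}\/[A-Za-z0-9._-]{1,100}$/;

// Returns true only for strings shaped like "owner/repo".
function isValidNwo(value: string): boolean {
  if (!NWO_PATTERN.test(value)) return false;
  const repo = value.slice(value.indexOf("/") + 1);
  // Reject path-traversal style repo names that the pattern would allow.
  return repo !== "." && repo !== "..";
}
```

A value that passes this check can then be handed to Node's `execFileSync` as a discrete argument. Unlike `execSync`, `execFileSync` does not spawn a shell by default, so metacharacters in arguments are not interpreted.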

### JSDoc Coverage (clean loop auto-fix)

- Added JSDoc for ~200 symbols across 70 files (auto-generated via clean loop)

### Bug Fixes

- RAG: block fallback to lite when index was built with advanced-gguf
- Symlink guard on sync/reset, conditional sync task
- Rate limiter max standby cycles, doc-gen rate limit detection
- Workspace module extraction, gap detection, Cargo detection
- Adversarial review — 12 fixes across CLI, RAG, core, commands
- Breadcrumb links, fork detection, doc coverage on cached runs

**Full Changelog**: https://github.com/r-via/anatoly/compare/v0.8.2...v0.9.0

v0.8.3

Mar 26, 2026 · v0.8.3

## What's Changed since v0.8.2

### Review Engine

- **AST-based import extraction** — replaced regex with tree-sitter AST traversal for `require()`, Python `from-import`, and Bash recursion
- **False positive reduction** — filter private symbols, deduplicate actions, calibrate severity, conservative test coherence
- **Test discovery** — expanded test file discovery, inject tests into deliberation context, deduplication
- **Deliberation memory overhaul** — group by symbol instead of per-axis entries, merge stale entries, escape regex, rebuild on corrupted JSON, truncate `original_detail` to reclassified axes only
- **`--flush-memory` flag** — reset deliberation memory before a run
- **Rate limit handling** — sleep until rate limit reset instead of degrading reviews
- Raise default review concurrency from 4 to 8

### Report

- **`public_report.md`** — new polished public-facing report layout
- **Report upstream extracted** — `report upstream` migrated to standalone script in `anatoly-reports` repo
- **Redesigned report sections** — merged Findings Summary into Axes table, emoji health bars, verdict breakdown for all-clear axes, doc coverage section, total findings count in hero block
- Absolute links to anatoly-reports + breadcrumb navigation
- Executive summary with all 7 axes, fix token metrics

### Clean (Ralph)

- **Subcommand rename** — `clean-run` → `clean run`, `clean-sync` → `clean sync`, etc.
- `clean generate`, `clean run`, `clean sync` as proper subcommands with tests
- Bump default Ralph iterations from 10 to 50

### Documentation Pipeline

- **Coherence optimization** — single-pass Sonnet, auto-fix, content injection (was multi-pass Opus)
- **Incremental updates** — scope doc updates to touched modules only
- **Doc deduplication** — auto-sync project docs from internal when trees are identical, `docs identity` and `docs reset-project` commands
- **Batch doc chunking** — N Haiku calls → 1 per file, maximize concurrency
- **Batch doc embeddings** — one mega-batch instead of per-file
- Auto-fix missing h1 heading by deriving title from filename
- Strip preamble from multi-turn agent output before writing
- Raise scaffold maxTurns from 5 to 15, coherence review injection budget to 50%
- Track RAG costs, expose cached status for doc section indexing
- Modular type-context injection + remove dead strategy2

### CLI & UX

- Show "Done !" instead of "0/N checks left" on file completion
- Clear screen on terminal resize to prevent rendering glitches
- Show "done" on completed pipeline tasks instead of stale counters
- Reduce path truncation in progress display

### Rust Support

- Chained `super::` resolution, workspace glob expansion, cross-crate usage tracking
- Cargo workspace detection as Monorepo
- Language-aware output formatting

### Security

- Shell injection fixes — `execFileSync`, validate NWO format

### JSDoc Coverage (Ralph auto-clean)

- Added JSDoc for ~200 symbols across 70 files (auto-generated via Ralph loop)

### Bug Fixes

- RAG: block fallback to lite when index was built with advanced-gguf
- Symlink guard on sync/reset, conditional sync task
- Rate limiter max standby cycles, doc-gen rate limit detection
- Workspace module extraction, gap detection, Cargo detection
- Adversarial review — 12 fixes across CLI, RAG, core, commands

**Full Changelog**: https://github.com/r-via/anatoly/compare/v0.8.2...v0.8.3

v0.8.2

Mar 25, 2026 · v0.8.2

## What's Changed since v0.8.1

### Internal Documentation Pipeline (Major)

**Full `docs scaffold` pipeline** — 6-step pipeline replacing `docs rebuild`:
- Scaffold → Coherence review → RAG index → Gap detection → Update → Lint
- Opus coherence review agent for cross-page structural + content consistency
- Deterministic heading linter replaces LLM-based structure review
- Neighbor page injection for cross-referencing during doc generation
- Site map context injected into doc-writer prompts
- `docs scaffold project` copies internal docs to `docs/` for publishing
- README.md injected as context for scaffold page generation

**Gap detection v2** — three-strategy architecture:
- Pre-computed `doc_vector` for gap detection (no runtime embedding, $0)
- Domain splitting for large codebases (sub-domains by file)
- Internal and project scope support

**New commands:**
- `anatoly docs index` — standalone RAG indexing (code + NLP + doc chunks)
- `anatoly docs update` — incremental doc updates with shared logic
- `anatoly docs lint` — deterministic structure lint
- `anatoly docs coherence` — lint + Opus coherence review
- `anatoly docs gap-detection internal|project` — scope-aware gap analysis

### RAG Engine

- Always dual embedding (code + NLP) — removed `dualEmbedding` flag
- Doc chunk cache to avoid re-chunking on rebuild
- `--drop-cache` flag for full rebuild (purges NLP summary + doc caches)
- `docSummary` added to function cards for doc gap detection
- Cumulative progress tracking for doc chunking (project + internal)
- Auto-purge doc caches when store has no doc sections

### CLI & UX

- Bump default SDK concurrency to 24
- Show "deliberating" state in UI
- Show "no project/internal docs" instead of "done" when none exist
- All pipeline steps show file activity in "In progress"
- Track current page during Sonnet update
- Semaphore passed to Opus agents for accurate UI counter
- Rate limit retry with exponential backoff for doc executor
- Plain mode improvements: log task transitions, error details, pipeline crashes
- Fix deadlock from double semaphore acquire

### Setup & Configuration

- Shared models directory across projects (skip redundant pulls)
- Remove deprecated `--ab-test` flag from `setup-embeddings`

### Bug Fixes

- Fix doc_vector migration warning and scaffold task gap
- Fix stale review-internal refs and coherence prompt contradictions
- Anti-hallucination + overwrite guard in content review prompt
- Linter detects unnumbered files in numbered directories
- Generic reference matching + deduplicate `pagesToUpdate`
- `--plain` no longer implies `--yes` (require explicit `-y` for destructive ops)
- Increase Opus agent maxTurns from 50 to 200
- Fix `onFileDone` for files with 0 sections (stuck "In progress")
- Block `docs index` if scaffold-only pages detected

**Full Changelog**: https://github.com/r-via/anatoly/compare/v0.8.1...v0.8.2

v0.8.1

Mar 22, 2026 · v0.8.1

## What's Changed since v0.8.0

### Prompt System Overhaul (Epic 33 + 34)

**Epic 33 — Universal Prompt Registry**
- Migrated all 28 system prompts from inline strings to `.system.md` files (`src/prompts/axes/`)
- Extracted 8 additional inline prompts to dedicated files
- Extended `prompt-resolver.ts` to universal registry (37 entries)
- Added bidirectional coherence tests for the registry
- Adversarial review & auto-fixes

**Epic 34 — Prompt Reinforcement (14 edge cases fixed)**
- **Story 34.1**: Removed contradictory JSON fences from 6 axis prompts, dynamic axis count
- **Story 34.2**: Added `guard-rails.system.md` anti-hallucination layer (confidence floor, symbol validation, line-range enforcement)
- **Story 34.3**: Score calibration anchoring for all 12 best-practices prompts (TypeScript + 11 languages)
- **Story 34.4**: Edge case handling for generated code, reinforced doc-writer and nlp-summarizer
- **Story 34.5**: Dynamic Zod schema → JSON example injection into axis system prompts
- **Story 34.6**: Gold-set integration test suite (8 fixtures, real LLM validation)
- **Story 34.7**: Adversarial validation pass across all stories

### Doc Generation
- Enriched LLM context with real project prerequisites
- LLM detects system dependencies from code (Docker, Redis, etc.)
- Prevented hallucinated API key instructions
- Separated bootstrap from update phase
- Added `anatoly docs` command with rebuild and status subcommands
- Switched default model from Haiku to Sonnet for better quality

### CLI & Infrastructure
- Added `--keep-docs` flag to reset command
- Extracted setup table renderer, enriched estimate command
- Added plain mode output for doc generation phases

### Scanner
- Download tree-sitter grammars on-the-fly with local caching

### RAG
- Parallelized doc chunking
- Skip scaffolded-only doc pages from RAG indexing
- Fixed stale "Saving index" display

### Report
- Replaced page ratio with symbol-based coverage metric

### Review Engine
- Wired prompt cascade into evaluators
- Dynamic code fence tags (no more hardcoded TypeScript)
- Derived axis count dynamically from evaluator registry

### Bug Fixes & Quality
- Adversarial reviews & auto-fixes for Epics 28, 29, 31, 33, 34
- Resolved pre-existing TypeScript compilation errors
- Code generation marker rule shared via guard-rails (all axes)
- Fixed gold-set test regex, symbol line numbers, and test assertions

**Full Changelog**: https://github.com/r-via/anatoly/compare/v0.8.0...v0.8.1

v0.8.0 — Multi-Language Support

Mar 21, 2026 · v0.8.0

## Highlights

### Multi-Language Support (Epic 31)
- **10 languages**: TypeScript, TSX, Python, Rust, Go, Java, C#, Bash/Shell, SQL, YAML, JSON
- **Auto-detection**: language distribution by file extension + framework detection (React, Next.js) by project markers
- **AST parsing**: tree-sitter grammar manager with dynamic WASM loading per language
- **Language-specific best-practices**: dedicated prompts for each language (PyGuard, RustGuard, GoGuard, ShellGuard, JavaGuard, CSharpGuard, SqlGuard, YamlGuard, JsonGuard)
- **Framework-aware evaluation**: React and Next.js specific prompts for best-practices and documentation axes
- **Auto-detect scanning**: `scan.auto_detect` discovers project files across all supported languages automatically
- **Usage graph**: multi-language import tracking (Python, Rust, Go, Bash source/dot imports)
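
Auto-detection by file extension can be pictured with the sketch below. The extension table and the `languageDistribution` helper are illustrative assumptions; the real detector may use a different mapping and also inspects project markers for frameworks.

```typescript
// Assumed extension-to-language table covering the languages listed above.
const EXTENSION_LANGUAGE: Record<string, string> = {
  ".ts": "TypeScript",
  ".tsx": "TSX",
  ".py": "Python",
  ".rs": "Rust",
  ".go": "Go",
  ".java": "Java",
  ".cs": "C#",
  ".sh": "Bash/Shell",
  ".sql": "SQL",
  ".yml": "YAML",
  ".yaml": "YAML",
  ".json": "JSON",
};

// Count files per language; files with unknown or missing extensions are skipped.
function languageDistribution(paths: readonly string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const path of paths) {
    const dot = path.lastIndexOf(".");
    if (dot < 0) continue;
    const language = EXTENSION_LANGUAGE[path.slice(dot).toLowerCase()];
    if (language === undefined) continue;
    counts.set(language, (counts.get(language) ?? 0) + 1);
  }
  return counts;
}
```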

### Documentation Pipeline (Stories 29.16-29.21)
- **LLM doc generation**: semaphore-bounded concurrent doc page generation via Haiku
- **Dynamic module pages**: scaffolder injects module-specific pages based on codebase analysis
- **Dual doc context**: documentation axis pulls from both `docs/` (user) and `.anatoly/docs/` (internal reference)
- **Configurable docs_path**: `documentation.docs_path` in `.anatoly.yml`
- **Distinct coverage tracking**: separate project vs internal export documentation coverage
- **Pipeline decoupling**: internal docs, post-review update, and bootstrap run as independent stages

### Fixes
- Review cache is now per-axis (prevents stale results when switching axes)
- Pipeline CLI display: proper left/right alignment in task list
- Ctrl+C interrupt no longer dumps noisy error traces
- Pre-existing TypeScript type errors resolved across tests and adapters

v0.7.0

Mar 20, 2026 · v0.7.0

## Highlights

- **Documentation scaffold pipeline** — full doc-scaffold system: project type detection, structure scaffolding, source code analysis, LLM page generation, scoring, recommendations, and Ralph sync mode (Epic 29)
- **SDK concurrency control** — global semaphore bounds parallel API calls to prevent rate-limit storms (Epic 30)
- **Docker GGUF embedding backend** — replace Python sidecar with Docker-based GGUF containers for GPU-accelerated embeddings (Epic 28)
- **RAG documentation indexing** — doc section extraction, NLP embeddings, and type-based filtering for review context
- **Structured metrics & events** — timeline, conversationStats, per-file/per-axis structured events in run-metrics.json
- **Report restructuring** — reports organized by axis with independent indexes
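
To make the SDK concurrency control concrete, here is a minimal counting-semaphore sketch. The `Semaphore` class, its method names, and the `inFlight` / `queued` getters are hypothetical; Anatoly's own semaphore may differ in API and fairness policy.

```typescript
// Minimal counting semaphore: at most `limit` holders at a time,
// further callers wait in FIFO order.
class Semaphore {
  private active = 0;
  private readonly waiters: Array<() => void> = [];

  constructor(private readonly limit: number) {}

  // Resolves once a slot is available.
  acquire(): Promise<void> {
    if (this.active < this.limit) {
      this.active++;
      return Promise.resolve();
    }
    return new Promise<void>((resolve) => {
      this.waiters.push(resolve);
    });
  }

  // Hands the slot directly to the next waiter, or frees it if nobody waits.
  release(): void {
    const next = this.waiters.shift();
    if (next) {
      next();
    } else {
      this.active--;
    }
  }

  get inFlight(): number {
    return this.active;
  }

  get queued(): number {
    return this.waiters.length;
  }
}
```

A caller would wrap each API request in `await sem.acquire()` and a `finally { sem.release() }` block, so a burst of requests queues locally instead of all hitting the provider at once.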

## Features

- **Doc scaffold pipeline** — project type detection, module granularity resolution, contextual scaffolding hints, code-to-doc mapping, LLM page content generation, incremental SHA-256 cache, 5-dimension scoring, dual-output recommendations, Ralph sync mode, and full CLI integration via `anatoly run` (Stories 29.1–29.15)
- **Concurrency** — global SDK semaphore wired through evaluation pipeline with configurable concurrency limits (Story 30.1)
- **Flat render** — code review fixes for Story 31.1
- **Docker GGUF containers** — VRAM detection, tier selection, A/B test for bf16 vs GGUF, container lifecycle management, setup wizard with Docker + NVIDIA Container Toolkit install (Epic 28)
- **RAG enhancements** — doc section indexing with NLP embeddings, sidecar model swap, per-function NLP summary cache, documentation axis in deliberation system, `--docs` flag for RAG status
- **Structured events** — per-file and per-axis structured events, unified run context for all commands, conversation dump infrastructure per LLM call
- **Timeline & metrics** — timeline phases, conversationStats with byModel breakdown in run-metrics.json
- **RAG UX** — animated spinners, phase checkmarks, concurrent file display, 3-phase progress (code/NLP/doc)
- **Haiku semantic chunking** — doc sections refined via Haiku before embedding
- **Batch embeddings** — batched NLP and code embedding requests for performance

## Fixes

- Triage phase added to timeline, byModel fix in conversationStats
- Normalize averaged NLP vector, restore lite-mode doc fallback
- GGUF container lifecycle: kill zombies on startup, stop on force exit, verify alive before reuse
- RAG progress display fixes (output stacking, concurrency display, Listr corruption)
- Setup-embeddings routes correctly per backend tier
- SHA-based cache for doc section indexing
- Guard test ensuring Anatoly never writes to `docs/`

v0.6.0

Mar 18, 2026 · v0.6.0

## Highlights

- **7th evaluation axis: Documentation** — evaluates JSDoc coverage on exports and `/docs/` synchronization
- **Deliberation memory** — generalized across all axes with learning loop feedback
- **AGPL-3.0 + dual licensing** — migrate from Apache-2.0, add commercial license option
- **Dry-run mode** — simulate the full pipeline without API calls (`--dry-run`)
- **Calibrated ETA** — per-axis timing based on historical runs
- **Sentence-transformers sidecar** — GPU-accelerated embeddings via nomic-embed-code 7B (3584d)

## Features

- **Documentation axis** — `DOCUMENTED` / `PARTIAL` / `UNDOCUMENTED` verdicts, docs_coverage in review JSON (Epic 26)
- **Deliberation memory** — persistent false-positive registry covers all axes, feeds back into learning loop
- **Holistic deliberation** — covers all axes per symbol with transitive usage-graph refs, NIH detection
- **Calibrated ETA** — estimate and pipeline summary display calibrated per-axis timing
- **Branch isolation** — `clean-run` enforces branch isolation before Ralph loop
- **Run lock** — block concurrent commands while a run is in progress
- **`--dry-run`** — phase-based estimate, calibrated timing, no runDir creation
- **`--axes` CLI option** — run specific axes (e.g. `--axes correction,tests`)
- **Tests axis enrichment** — test file content, callers, and project tree in context
- **Run Statistics & Axis Summary** — new report sections
- **`init` & `setup-embeddings` commands** — one-command project setup
- **RAG observability** — `rag-status` shows lite+advanced indexes
- **Dual code+NLP embedding** — hybrid similarity search with configurable weights
- **Hardware detection** — auto-select embedding models based on available hardware
- **Colored MOTD banner**, sidecar lifecycle overhaul, Ralph circuit breaker
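
The hybrid similarity search with configurable weights can be sketched as a weighted blend of two cosine similarities. This is an illustration only: the function names and the default weight of 0.5 are assumptions, not values taken from Anatoly's configuration.

```typescript
// Cosine similarity between two equal-length vectors; 0 if either is all zeros.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Blend code-embedding and NLP-embedding similarity.
// codeWeight is the share given to the code score (assumed default: 0.5).
function hybridScore(codeSim: number, nlpSim: number, codeWeight = 0.5): number {
  if (codeWeight < 0 || codeWeight > 1) {
    throw new RangeError("codeWeight must be within [0, 1]");
  }
  return codeWeight * codeSim + (1 - codeWeight) * nlpSim;
}
```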

## Fixes

- Documentation axis calibration, merge pipeline, prefix matching
- Adversarial review: wire docsTree, fix 6→7-axis refs
- Accumulative cache for complete reports across runs
- Best practices & tests findings trigger NEEDS_REFACTOR
- Calibration: max(axis) for parallel model, remove 3s sleep
- Deliberation: only reviews symbols with findings, always active by default
- Report: hide non-executed axes, full deliberation reasoning
- RAG: Arrow FloatVector crash, ONNX fallback
- Severity labels: French → English (CRITIQUE→CRITICAL, HAUTE→HIGH, MOYENNE→MEDIUM)
- Sidecar: correct dimensions, venv isolation, loading progress

## Breaking Changes

- **License**: Apache-2.0 → AGPL-3.0 (commercial license available, see COMMERCIAL.md)
- `fix`/`fix-sync` commands renamed to `clean`/`clean-sync`
- `--dual-embedding` replaced by `--rag-lite` / `--rag-advanced`
- Hook subcommands renamed to `on-edit`/`on-stop`

v0.5.1 — MOTD Banner, RAG Modes & Sidecar Overhaul

Mar 17, 2026 · v0.5.1

## Features

- Colored MOTD ASCII banner at startup with auto-generated sync script
- Sidecar lifecycle overhaul — cleanup, scoped spawn, idle timeout
- Show RAG file count in setup summary table
- List available axes in `--axes` help text
- Add `--lite` / `--advanced` RAG mode selection with separate indexes
- Show '-' for unevaluated axes when using the `--axes` filter
- Auto-disable dual embedding when nomic-7B sidecar is active
- Show sidecar loading progress in CLI spinner with elapsed time
- Replace Ollama with sentence-transformers sidecar for GPU embeddings
- Add Ollama runtime for GPU-accelerated embeddings and fix RAG schema mismatch
- Add `--axes` CLI option for runtime axis selection (Epic 26)
- Add Ollama backend for GPU-accelerated code embeddings via nomic-embed-code
- Harden Ralph clean loop with circuit breaker, anti-placeholder guards, and adaptive PRD
- Add RAG pipeline evaluation framework with ground-truth benchmarks
- Add hardware detection and configurable dual embedding models
- Add dual code+NLP embedding for improved semantic duplication detection
- Add final clean-sync after clean-run completes
- Rename fix → clean commands to match Anatoly branding
- Native TypeScript Ralph loop via Claude Agent SDK

## Fixes

- Resolve pre-existing TypeScript compilation errors
- Review progress counter matches triage evaluate count
- Exclude NLP-failed cards from cache so they get retried
- Purge RAG cache when vector store is empty
- Remove Arrow table migration — drop legacy table instead
- Align ConfigSchema default dual_embedding to true
- ONNX fallback always uses Jina, not the sidecar model
- Start sidecar BEFORE model resolution to avoid chicken-and-egg
- Correct nomic-embed-code dimension to 3584d (7B hidden state)
- Add model load timer to `--check` and increase sidecar timeout to 180s
- Correct model download size to ~14 GB (7B model in FP16)
- Use correct model ID nomic-ai/nomic-embed-code (public, no auth)
- Move embedding venv to .anatoly/.venv to avoid project collision
- Fallback to ONNX when Ollama embed fails instead of truncating
- Truncate input to Ollama embed to avoid GGML_ASSERT overflow

## Refactoring

- Replace `--dual-embedding` with `--rag-lite` / `--rag-advanced`
- Simplify estimate and triage CLI output
- Use Claude Code CLI instead of SDK for Ralph loop

---

> *"Can I clean here?"* — Anatoly looked at the codebase, mop in hand, and sighed. Fifty-one files. Eighteen thousand lines of dead imports. A `TODO` from 2019. He wrung the mop. *"Da. I clean everywhere."*

v0.5.0 — Fix Command & Documentation Restructure

Mar 15, 2026 · v0.5.0

## Highlights

**`anatoly fix`** — The cleaning man now fixes what he finds. Parse an audit report shard, generate Ralph artifacts, and launch an autonomous correction loop. Every finding gets a deterministic checkbox ID for traceable remediation.

### Features

- **`anatoly fix <report-file>`** — Generate prd.json + CLAUDE.md + ralph.sh from a report shard
- **`anatoly fix-sync <report-file>`** — Sync completed fixes back to the report (shard + index checklist)
- **Checkbox rendering** — Every action gets `- [ ] <!-- ACT-{hash}-{id} -->` for deterministic matching
- **Aggregated Checklist** — Report index now includes a severity-sorted checklist of all actions
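
The deterministic checkbox IDs can be illustrated as follows. Only the `- [ ] <!-- ACT-{hash}-{id} -->` shape comes from the release notes; the hash function (FNV-1a over the file path) and the helper names are assumptions chosen to keep the sketch dependency-free.

```typescript
// 32-bit FNV-1a hash rendered as 8 hex digits (stand-in for the real hash).
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Render one action as a checkbox line with a stable, hidden identifier.
function renderAction(filePath: string, actionId: number, description: string): string {
  return `- [ ] <!-- ACT-${fnv1a(filePath)}-${actionId} --> ${description}`;
}

// Extract the identifier back out of a rendered line, or null if absent.
function parseActionId(line: string): string | null {
  const match = /<!-- (ACT-[0-9a-f]{8}-\d+) -->/.exec(line);
  return match ? (match[1] ?? null) : null;
}
```

Because the ID depends only on stable inputs, a sync step can match a completed fix to its checkbox even after the surrounding report text has changed.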

### Fixes

- `--file` filter now correctly scopes estimate and triage phases (was estimating full project)
- Removed dead `summary`/`keyConcepts`/`behavioralProfile` fields from `rag-status` display

### Documentation

- **Complete restructure**: 27 files across 7 sections, all cross-referenced against source code
- Getting Started, Architecture, CLI Reference, Core Modules, Integration, Development, Design Decisions
- 32 factual corrections applied after adversarial review
- README updated with new doc links and [Anatoly Shmondenko](https://www.youtube.com/@vladimirfitness) origin story

See [CHANGELOG.md](CHANGELOG.md) for full details.

v0.4.2

Feb 25, 2026 · v0.4.2

## What's Changed

### Enriched Review Reports (.rev.md)

- **Best Practices section** — Score, rules table (WARN/FAIL only), and suggestions with before/after code blocks now rendered in individual file reviews
- **Structured symbol details** — Pipe-delimited axis output parsed into per-axis bullets (Utility, Duplication, Correction, Overengineering, Tests)
- **Categorized actions** — Quick Wins / Refactors / Hygiene sections with effort estimates
- **Exported column** added to symbols table
- **Defaulted axes flagged** — When an evaluator didn't produce a result, the review now shows *(default — evaluator did not produce a result)*

### Transcript Persistence

- Axis evaluation transcripts are now persisted to `runDir/logs/<file>.log` for both the `run` and `review` commands, enabling diagnosis of failed evaluators

**Full Changelog**: https://github.com/r-via/anatoly/compare/v0.4.1...v0.4.2

v0.4.1

Feb 25, 2026 · v0.4.1

## What's Changed

### Bug Fixes & Improvements

- **Dependency-aware evaluations** — Correction and best-practices axes now receive project dependency versions, reducing false positives (e.g., no longer flags missing try/catch when Commander v14+ handles async rejections natively)
- **Implicit no-cache for explicit commands** — `anatoly review` and `anatoly run --file` always re-evaluate files instead of skipping cached results
- **Renamed "dead code" → "utility"** in CLI display for clarity

### New in the Axis Pipeline (Epic 19)

- 6 axis evaluators: utility, duplication, overengineering, tests, correction, best_practices
- Axis merger, file evaluator, and simplified review pipeline
- Worker pool with compact axis progress display
- Updated estimator and reporter for the axis pipeline

### Internal

- New `dependency-meta` module with 14 unit tests
- Schemas v2 for multi-axis review output

**Full Changelog**: https://github.com/r-via/anatoly/compare/v0.4.0...v0.4.1

v0.4.0 — Triage, Fast Review & Sharded Reports

Feb 25, 2026 · v0.4.0

## What's New

### Triage Pipeline (Epic 16)
Files are now automatically classified into three tiers before review:
- **Skip** — barrels, type-only, constants → synthetic CLEAN review, zero API calls
- **Fast** — simple files (< 50 lines, < 3 symbols) → single-turn review (~5s)
- **Deep** — complex files → full agentic investigation (~45s)

Use `--no-triage` to disable and review all files with the full agent.
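
The three tiers above can be summarised in a few lines. This is a sketch under stated assumptions: the `FileFacts` fields are hypothetical, and the thresholds are read as "fewer than 50 lines and fewer than 3 symbols" combined.

```typescript
type Tier = "skip" | "fast" | "deep";

// Hypothetical per-file facts a scanner could provide.
interface FileFacts {
  lineCount: number;
  symbolCount: number;
  isBarrel: boolean; // only re-exports
  isTypeOnly: boolean; // only type declarations
  isConstantsOnly: boolean; // only constant definitions
}

function triage(file: FileFacts): Tier {
  // Skip tier: nothing worth an API call.
  if (file.isBarrel || file.isTypeOnly || file.isConstantsOnly) return "skip";
  // Fast tier: small and simple enough for a single-turn review.
  if (file.lineCount < 50 && file.symbolCount < 3) return "fast";
  // Everything else gets the full agentic review.
  return "deep";
}
```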

### Pre-computed Usage Graph (Epic 16)
A full import graph is built across all project files in a single local pass (< 1s). The agent receives pre-computed usage data in its prompt, eliminating ~90 redundant Grep tool calls per review for dead code verification.
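
A pre-computed usage graph of this kind can be modelled as a map from each exported symbol to the set of files importing it. The `ImportRecord` shape and the `file#symbol` key format below are assumptions for illustration, not Anatoly's internal representation.

```typescript
// One import statement: `from` imports `names` out of `target`.
interface ImportRecord {
  from: string;
  target: string;
  names: string[];
}

// Build symbol -> importing files in a single pass over all import records.
function buildUsageGraph(imports: readonly ImportRecord[]): Map<string, Set<string>> {
  const usage = new Map<string, Set<string>>();
  for (const record of imports) {
    for (const name of record.names) {
      const key = `${record.target}#${name}`;
      const importers = usage.get(key) ?? new Set<string>();
      importers.add(record.from);
      usage.set(key, importers);
    }
  }
  return usage;
}

// A symbol with no importers is a dead-code candidate.
function isUnused(usage: Map<string, Set<string>>, file: string, symbol: string): boolean {
  return (usage.get(`${file}#${symbol}`)?.size ?? 0) === 0;
}
```

Answering "who imports this symbol?" then becomes a map lookup rather than a search over the repository.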

### Fast Reviewer (Epic 17)
Simple files (fast tier) are reviewed in a single `query()` call with no tools — all context (file content, symbols, usage graph, RAG results) is included inline. If Zod validation fails after 2 attempts, the file is automatically promoted to deep review.

Optional `fast_model` config field lets you use a cheaper model (e.g., `claude-haiku-4-5-20251001`) for fast-tier reviews.
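
The promote-on-failure rule can be sketched as a small retry loop. For brevity the sketch is synchronous and takes the reviewer and validator as plain callbacks; in practice the review call is asynchronous and validation is done with a Zod schema. All names here are hypothetical.

```typescript
type Outcome<T> =
  | { tier: "fast"; review: T; attempts: number }
  | { tier: "deep"; attempts: number };

// Try the fast review up to maxAttempts times; promote to deep if every
// attempt fails validation.
function fastReviewOrPromote<T>(
  runFastReview: (attempt: number) => unknown,
  validate: (raw: unknown) => T | null,
  maxAttempts = 2,
): Outcome<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const review = validate(runFastReview(attempt));
    if (review !== null) {
      return { tier: "fast", review, attempts: attempt };
    }
  }
  return { tier: "deep", attempts: maxAttempts };
}
```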

### Sharded Reports (Epic 18)
Reports are now split into:
- `report.md` — compact index (~100 lines) with executive summary, severity table, and checkbox links to shards
- `report.N.md` — per-shard detail files (max 10 files each), sorted by severity

When triage is active, the index includes a **Performance & Triage** section showing skip/fast/deep distribution and estimated time saved.
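
Sharding itself is a sort followed by fixed-size slicing, roughly as sketched here. The severity names, their ordering, and the `FileFinding` shape are assumptions; only the limit of 10 files per shard and the severity sort come from the notes above.

```typescript
type Severity = "critical" | "high" | "medium" | "low";

interface FileFinding {
  path: string;
  severity: Severity;
}

// Lower rank sorts first, so the most severe files land in the first shard.
const SEVERITY_RANK: Record<Severity, number> = {
  critical: 0,
  high: 1,
  medium: 2,
  low: 3,
};

function shardFindings(files: readonly FileFinding[], maxPerShard = 10): FileFinding[][] {
  const sorted = [...files].sort(
    (a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity],
  );
  const shards: FileFinding[][] = [];
  for (let i = 0; i < sorted.length; i += maxPerShard) {
    shards.push(sorted.slice(i, i + maxPerShard));
  }
  return shards;
}
```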

### Code Review Fixes
- Removed 180-line legacy `renderReport()` (dead code, used raw LLM verdict instead of `computeFileVerdict`)
- Deduplicated `loadTasks()` calls in run pipeline (3x → 1x disk I/O)
- Fast transcripts use `.fast.transcript.md` suffix to avoid overwrite on deep promotion
- Added `export * from` handling in usage graph (prevents false DEAD on star re-exports)

## Commits

- feat(triage): add file triage module for skip/fast/deep classification
- feat(usage-graph): add pre-computed import usage graph
- feat(prompt): inject pre-computed usage graph into agent prompt
- feat(pipeline): integrate triage and usage graph into run pipeline
- feat(fast-reviewer): add simplified single-turn reviewer for fast-tier files
- feat(pipeline): dispatch fast-tier files to fast-reviewer with deep promotion
- feat(reporter): shard report into index + per-shard files (max 10 files each)
- feat(reporter): add Performance & Triage section to index when triage active
- fix: code review fixes for v0.4.0 (epics 16-18)
- chore: bump version to 0.4.0 and update README

**Full Changelog**: https://github.com/r-via/anatoly/compare/v0.3.0...v0.4.0

v0.3.0 — Conformity Audit & Fixes

Feb 24, 2026 · v0.3.0

## Changelog (v0.2.0 → v0.3.0)

### Features

- **Parallel reviews** — Worker pool with `--concurrency N` (default 4), thread-safe ProgressManager with serialized writes (`c18b8d8`)
- **Rate limiting** — Exponential backoff for Sonnet reviews (base 5s, max 120s, jitter ±20%, 5 retries) (`00dbd72`)
- **Multi-file renderer** — Worker slots showing `[1] reviewing...` per concurrent review, completion order in flow zone (`fe03e10`)
- **Claude Code hooks** — `anatoly hook post-edit` (async background review), `hook stop` (quality gate + feedback injection), `hook init` (template generator) (`ed8424f`, `f501d44`)
- **`min_confidence` config** — Filter hook findings below threshold (default 70) (`f501d44`)
- **`max_stop_iterations` config** — Anti-loop protection for hook stop cycle (default 3) (`c1276a6`)
- **RAG pre-resolved** — Similarity results injected statically in prompt, MCP tool removed (`96fe7d2`)
- **Parallel RAG indexation** — Haiku calls distributed via worker pool with rate limiting (base 2s, max 30s) (`87f7794`, `d7c2ca4`)
- **`index_model` config** — Configure RAG indexing model separately (default `claude-haiku-4-5-20251001`) (`c5df16d`)
- **RAG on by default** — `--no-rag` to disable, launch banner shows model info (`29c9169`)
- **RAG garbage collection** — Stale index entries for deleted/renamed files are automatically purged on re-index (`0538f73`)
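
The review backoff parameters listed above (base 5s, max 120s, jitter ±20%) translate into a delay function along these lines. The doubling factor, the order of capping and jitter, and the `backoffDelayMs` name are assumptions; the release notes only give the bounds.

```typescript
// Delay before retry number `attempt` (0-based), in milliseconds.
// `random` is injectable so the jitter can be made deterministic in tests.
function backoffDelayMs(
  attempt: number,
  random: () => number = Math.random,
  baseMs = 5000,
  maxMs = 120000,
  jitter = 0.2,
): number {
  // Exponential growth (assumed factor of 2), capped at maxMs.
  const capped = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  // Scale by a factor drawn from [1 - jitter, 1 + jitter).
  const factor = 1 + jitter * (2 * random() - 1);
  return Math.round(capped * factor);
}
```

With five retries the un-jittered delays would be 5s, 10s, 20s, 40s and 80s; the 120s cap only applies from the sixth attempt onward under these assumptions.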

### Fixes

- **`$NO_COLOR` env var** — Respects [no-color.org](https://no-color.org) standard (`c1276a6`)
- **`review` command renderer** — Now uses `renderer.ts` with progress bar and counters instead of raw `console.log` (`c1276a6`)
- **Hook stop anti-loop** — `stop_count` tracked in HookState, exits silently at max iterations (`c1276a6`)
- **Hook spec alignment** — `decision: "block"` output format per Claude Code Stop hook protocol (`2f557a7`)
- **RAG code review fixes** — 4 issues from Epic 12 (orchestrator, cache, error handling) (`c7d514a`)
- **Progress bar stuck** — Fixed renderer scrolling up during review (`9ffa954`)
- **Ctrl+C reliability** — Interrupt now works in all commands (`e447ef7`)
- **Error messages** — Improved hints, verdictColor DRY, README update (`25edcce`)

### Self-Audit Fixes (`daeb3d9`)

Anatoly audited its own codebase — 14 files reviewed, 29 findings. Applied in a single pass:

- **scanner** — `abstract_class_declaration` now recognized as `class` symbol kind
- **run** — AbortController renewal race fixed, dynamic `index_model` label, retry count no longer hardcoded
- **reviewer** — dead `retries` counter removed, `tools`/`allowedTools` deduplication
- **cache** — `readProgress` validates with `ProgressSchema.safeParse`, `atomicWriteJson` cleanup on failure
- **vector-store** — `distanceToCosineSimilarity` computed once per row, `safeParseJsonArray` type-filters elements
- **format** — `formatResultLine` uses `verdictColor()` instead of inlined switch
- **hook-state** — null-review guard fixed (`typeof null === 'object'`)
- **process** — `isProcessRunning` returns `true` on `EPERM` (cross-user process)
- **lock** — explicit `unlinkSync` for stale lock cleanup
- **config** — 6 dead sub-schema type exports removed
- **task** — `TaskSchema`, `SymbolKindSchema`, `SymbolInfoSchema`, `CoverageDataSchema` exported
- **report** — shared `progressPath`/`errorFiles` extraction hoisted out of branches

### Refactoring

- **RAG orchestrator decoupled** — `processFileForIndex()` as pure function, `needsReindex()` extracted, batch upsert post-pool (`eb88562`)

### Docs

- **README rewrite** — Stronger positioning, hook documentation, Mermaid architecture diagram (`42b068e`)

### Stats
- **280 tests passing**
- Build: 134.85 KB (ESM)
- 19 commits