anatoly

Public log

Changelog

Every release of Anatoly, in reverse chronological order.

Pulled live from GitHub Releases.

  1. v0.9.6: Estimate/Scan unification, RAG hardening & OSS hygiene


    ## Highlights
    
    A consolidation release on top of v0.9.5: the `scan` command is now folded into `estimate`, the `estimate` forecast was recalibrated against real runs, RAG / local-embeddings handling was hardened, LLM cost bookkeeping was centralised, and the project gained a full open-source hygiene layer (SECURITY, CONTRIBUTING, CoC, CI).
    
    ---
    
    ## 🔭 Estimate / Scan unification
    
    The standalone `anatoly scan` command is gone. Its capabilities (new / modified / cached file accounting) now live inside `estimate`, which is the single entry point for forecasting.
    
    - `anatoly scan` removed; new/modified/cached are surfaced directly in the `estimate` view.
    - New `--no-cache` flag for a from-scratch forecast that ignores prior cache state.
    - Cached files now excluded from the token + cost forecast (no more double-counting).
    - Stale tasks pointing to deleted/missing source files are pruned from the estimator.
    - Deliberation shard count derived from distinct directories rather than a flat `files / 20` heuristic.
    - Bootstrap and coherence per-page token constants recalibrated from real runs.
    - Estimate user-guide added under `docs/`.
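    
    The directory-based shard count can be illustrated with a minimal sketch. The names `parentDir`, `countDeliberationShards`, and `flatShardCount` are hypothetical, and both the one-shard-per-directory mapping and the rounding of `files / 20` are assumptions; the real estimator may group or cap shards differently.
    
    ```typescript
    // Hypothetical helper: parent directory of a forward-slash path.
    function parentDir(path: string): string {
      const idx = path.lastIndexOf("/");
      return idx === -1 ? "." : path.slice(0, idx);
    }
    
    // Assumption: one deliberation shard per distinct parent directory.
    function countDeliberationShards(files: string[]): number {
      return new Set(files.map(parentDir)).size;
    }
    
    // Flat heuristic shown only for comparison (rounding up is an assumption).
    function flatShardCount(files: string[]): number {
      return Math.ceil(files.length / 20);
    }
    
    const sampleFiles = ["src/core/run.ts", "src/core/estimate.ts", "src/rag/index.ts", "README.md"];
    console.log(countDeliberationShards(sampleFiles)); // 3 (src/core, src/rag, ".")
    console.log(flatShardCount(sampleFiles)); // 1
    ```
    
    With a layout like this, the directory-based count tracks how the code is actually organised instead of only how many files there are.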
    
    ## 🧭 Scan / config schema cleanup
    
    - Glob include/exclude detection runs at wizard time, with a new `respect_gitignore` knob.
    - `auto_detect` removed β€” `include` / `exclude` are now strictly authoritative.
    - TypeScript-specific defaults dropped from the scan config schema (the project is multi-language).
    - Language detection delegated to [`linguist-js`](https://www.npmjs.com/package/linguist-js) for accurate per-language stats.
    
    ## 🧠 RAG & local embeddings
    
    - LanceDB tables auto-rebuild on embedding-dimension drift, so swapping providers no longer corrupts the index.
    - System-local providers are now identified by name (`local-advanced`), not by URL heuristics β€” fixes routing when users override `base_url`.
    - The local sidecar is unified under the canonical `local-advanced` name, with per-axis URLs routed through the slot map.
    - `ANATOLY_LOCAL_DUMMY_KEY` footgun removed: local providers declare `auth: none` and skip API-key plumbing entirely.
    - `local-embeddings` gained `init`, `cleanup`, and `downgrade` subcommands; YAML formatting is preserved across config edits.
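    
    A minimal sketch of the dimension-drift check, assuming the index stores the embedding dimension it was built with. `IndexMeta`, `checkDimensionDrift`, and the dimensions 768 / 1024 are illustrative assumptions, not the project's actual types or values.
    
    ```typescript
    // Hypothetical record of what an index remembers about its embeddings.
    interface IndexMeta {
      provider: string;
      dimension: number;
    }
    
    type DriftAction = "reuse" | "rebuild";
    
    // Vectors of different dimensions cannot be compared, so a mismatch
    // between the stored and the freshly probed dimension forces a rebuild.
    function checkDimensionDrift(stored: IndexMeta | undefined, probedDimension: number): DriftAction {
      if (stored === undefined) return "rebuild"; // no table yet
      return stored.dimension === probedDimension ? "reuse" : "rebuild";
    }
    
    const storedMeta: IndexMeta = { provider: "local-advanced", dimension: 768 };
    console.log(checkDimensionDrift(storedMeta, 768)); // "reuse"
    console.log(checkDimensionDrift(storedMeta, 1024)); // "rebuild"
    ```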
    
    ## 📊 Metrics & telemetry
    
    - Centralised LLM cost bookkeeping behind a single `recordLlmCost` helper: every code path (tier3, phase, refinement) now goes through one accounting hook.
    - Refinement tokens propagate correctly through tier3 → phase → `recordLlmCost` (previously dropped on the floor).
    - `llm_call` events are now emitted for the coherence-review and content-review passes too, so the cost ndjson covers the full pipeline.
    - `runDocContentReview` is wrapped in `runWithContext` so its events carry the right ndjson phase tag.
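    
    The single-hook idea can be sketched as follows. Only the name `recordLlmCost` comes from the notes above; the `LlmCall` shape, the `CostLedger` class, and the sample figures are assumptions made for illustration.
    
    ```typescript
    // Hypothetical event shape; the real fields may differ.
    interface LlmCall {
      phase: string; // e.g. "tier3" or "content-review"
      inputTokens: number;
      outputTokens: number;
      costUsd: number;
    }
    
    class CostLedger {
      private calls: LlmCall[] = [];
    
      // Single accounting hook: every code path reports here.
      recordLlmCost(call: LlmCall): void {
        this.calls.push(call);
      }
    
      totalUsd(): number {
        return this.calls.reduce((sum, c) => sum + c.costUsd, 0);
      }
    
      // One JSON object per line, in the style of an ndjson cost log.
      toNdjson(): string {
        return this.calls.map((c) => JSON.stringify(c)).join("\n");
      }
    }
    
    const ledger = new CostLedger();
    ledger.recordLlmCost({ phase: "tier3", inputTokens: 1200, outputTokens: 300, costUsd: 0.25 });
    ledger.recordLlmCost({ phase: "content-review", inputTokens: 800, outputTokens: 100, costUsd: 0.5 });
    console.log(ledger.totalUsd()); // 0.75
    ```
    
    Routing every call through one method is what makes it hard for a code path to drop tokens silently.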
    
    ## ✂️ Output concision discipline
    
    - New "output concision discipline" prompt block added to every axis and to non-RAG services.
    - Empirical study (`docs/concision-discipline-study.md`) documents the calibration methodology behind the new prompts.
    
    ## 📚 Docs pipeline
    
    - The `content-review` pass is now merged into the updater loop, removing one full pipeline phase and the round-trip it cost.
    
    ## 🛠️ CLI, build & Makefile
    
    - `anatoly --version` now prints the git commit SHA alongside the package version.
    - `make update` supports a `BRANCH=` variable to refresh from a non-`main` branch.
    - `make update` supports a `COMMIT=` variable to pin a specific revision.
    - `sync-motd` build step is idempotent (no spurious diffs on repeated builds).
    
    ## 🔒 Open-source hygiene
    
    - New `SECURITY.md` with endpoint inventory and threat model.
    - New `CONTRIBUTING.md`, `CODE_OF_CONDUCT.md`, plus a CI workflow.
    - `README.md` got a TL;DR section.
    
    ## 🛡️ Security & dependencies
    
    - `protobufjs` and `ip-address` pinned via `overrides` to clear critical CVEs.
    - Stale `hasInstallScript` flag dropped from the lockfile.
    
    ## 🔧 Refactoring
    
    - `RunConfig` type and builder extracted into a dedicated core module (`run.ts` is no longer the source of truth for config assembly).
    
    ---
    
    ## Full changelog
    
    `git log v0.9.5..v0.9.6`: 34 commits.
    
    **Compare:** https://github.com/r-via/anatoly/compare/v0.9.5...v0.9.6
  2. v0.9.5: External Embeddings, Config v3 & Estimate Forecast


    ## Highlights
    
    This release lands two major epics, **Epic 50 (External Embedding Providers)** and **Epic 48/49 (First-Run Onboarding)**, alongside a complete overhaul of the cost forecasting / `estimate` view, a new **declarative v3 config schema**, and a runtime pricing registry sourced from upstream catalogs.
    
    ---
    
    ## 🌐 Epic 50 β€” External Embedding Providers
    
    Anatoly is no longer tied to local GGUF / ONNX runtimes. You can now bring your own embedding provider β€” OpenAI, Voyage, Cohere, Mistral, Qwen (via OpenRouter), Azure OpenAI, HuggingFace Inference Endpoints, or any custom OpenAI-compatible base URL.
    
    - **New \`external\` tier** in the first-run wizard with provider sub-prompts and best-of-breed defaults (`voyage-code-3` for code, Qwen3 for NLP).
    - **Per-axis providers**: pick distinct providers/models for `code` vs `nlp`, or reuse the same with one click.
    - **Vercel AI SDK embedding factory** with dim-probe and signature cache β€” replaces the legacy native GGUF runtime end-to-end.
    - **Provider registry** (`KNOWN_EMBEDDING_PROVIDERS`) covering OpenAI, Voyage, Cohere, Mistral, Qwen, plus a `Custom (manual)` path with `base_url` + `env_key` validation.
    - **Pre-flight connectivity check** for the configured embedding backend before audits start.
    - **Config write**: fully self-explanatory `.anatoly.yml` emitted on first run, no implicit defaults.
    - **Documentation**: new `docs/embedding-providers.md` covering Azure OpenAI internal, self-hosted GGUF clusters, and HF Inference Endpoints.
    - **Cost projection**: `estimate` now projects embedding token counts and prices them per axis for SDK-backed providers.
    
    ## 🪄 Epic 48/49: First-Run Onboarding
    
    A fully reworked onboarding flow that takes a brand-new user from `npx anatoly` to a working audit without manual config.
    
    - **First-run wizard** with tier (`lite` / `advanced` / `external`) and mode prompts.
    - **Inline ONNX prefetch** for lite tier and **inline GGUF download** with streaming SHA-256 verification, partial-file cleanup, and post-download verify.
    - **Subprocess `setup-embeddings`** (now renamed `local-embeddings <upgrade|status>`) runs automatically after GGUF download.
    - **Always-write `.anatoly.yml`** with sane defaults so config is discoverable.
    - **Cross-project preferences** stored in `~/.anatoly/preferences.yml`.
    - **Plain-mode parity**: tier comparison table and transparency notice render correctly without TTY/colors.
    - **Quick-win runtime filter** + summary suggestion to surface the cheapest first audit.
    - **Recovery prompts** for download failures (retry / fallback to lite / abort).
    - **End-of-setup 3-choice prompt** and post-audit progressive education hint.
    - **`--defaults-settings` flag** for fully non-interactive CI runs.
    - **Visual setup-to-audit transition** so users know when the wizard hands off to the audit.
    
    ## 📊 Estimate / Forecast: Major Refresh
    
    The `estimate` command went from a flat dump to an actionable, table-driven forecast.
    
    - **Unified `Cost breakdown` table** powered by `cli-table3`, with full model IDs, embedding labels, total footer, and a `based on latest public provider price` caption.
    - **Per-step billing mode** (billed vs consumption) and per-step cost breakdown, including **Anthropic prompt-cache modeling** and a doc-generation heuristic (Pass A).
    - **Forecast block reordered** for CLI-friendly reading: Forecast last, merged Configuration / Pipeline Plan first.
    - **Scenario flags**: `--files`, `--axes`, `--no-deliberation`, `--no-internal-docs`.
    - **Step IDs split into category + name**, plus `--json` output for programmatic use.
    - **Deliberation cost** now modeled as a fixed cost per shard (not per axis × file), with per-shard tokens calibrated to opus-4-6 pricing.
    - **Per-axis output multipliers** calibrated from R3 actuals and rebalanced to average 1.0.
    - **Real bootstrap page count**, RAG-unfiltered scope, and the previously-missing coherence step are now reflected.
    - **Pipeline Summary overhauled**, dropping opaque rows like 'usage graph N edges'.
    - **NLP summarizer cost** included in LLM forecast totals (Pass 1) and `summaryModel` wired through.
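    
    Rebalancing a set of multipliers so they average 1.0 amounts to dividing each by their mean. The sketch below shows that step only; `rebalanceToUnitMean` is a hypothetical name and the input numbers are made up, not the calibrated values.
    
    ```typescript
    // Scale per-axis multipliers so that their arithmetic mean is 1.0.
    function rebalanceToUnitMean(multipliers: Record<string, number>): Record<string, number> {
      const axes = Object.keys(multipliers);
      const mean = axes.reduce((sum, axis) => sum + multipliers[axis], 0) / axes.length;
      const out: Record<string, number> = {};
      for (const axis of axes) {
        out[axis] = multipliers[axis] / mean;
      }
      return out;
    }
    
    // Illustrative numbers only.
    const rawMultipliers = { correction: 3, utility: 1, duplication: 2 };
    const balanced = rebalanceToUnitMean(rawMultipliers);
    console.log(balanced); // { correction: 1.5, utility: 0.5, duplication: 1 }
    ```
    
    Keeping the mean at 1.0 lets the multipliers shift cost between axes without changing the overall output-token forecast.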
    
    ## ⚙️ Config v3 Schema
    
    - **Declarative providers + routing** in a single v3 schema, replacing the v0/v1/v2 migration chain (legacy migrations dropped).
    - **Annotated YAML template** emitted on first run.
    - **External tier wiring**: runtime correctly resolves the external embedding tier through the RAG pipeline.
    
    ## 💰 Pricing Registry
    
    - **Runtime pricing registry** sourced from **litellm** + **OpenRouter** catalogs, replacing the hardcoded `MODEL_PRICING` map.
    - **Fail-loud strict mode**: runs are blocked when any model has no pricing, instead of silently estimating zero.
    - **Pricing gate moved** to fire after the first-run wizard, never before.
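    
    The fail-loud behaviour can be sketched as a strict lookup that throws rather than pricing an unknown model at zero. `ModelPrice`, `assertAllPriced`, the model IDs, and the prices are all placeholders chosen for this example.
    
    ```typescript
    // Hypothetical price record (USD per million tokens).
    interface ModelPrice {
      inputPerMTok: number;
      outputPerMTok: number;
    }
    
    // Strict mode: refuse to continue if any configured model has no price.
    function assertAllPriced(models: string[], prices: Map<string, ModelPrice>): void {
      const missing = models.filter((m) => !prices.has(m));
      if (missing.length > 0) {
        throw new Error("No pricing for: " + missing.join(", "));
      }
    }
    
    const priceRegistry = new Map<string, ModelPrice>([
      ["example/model-a", { inputPerMTok: 1, outputPerMTok: 2 }], // placeholder prices
    ]);
    
    assertAllPriced(["example/model-a"], priceRegistry); // passes silently
    
    try {
      assertAllPriced(["example/model-a", "example/model-b"], priceRegistry);
    } catch (err) {
      console.log((err as Error).message); // No pricing for: example/model-b
    }
    ```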
    
    ## 🔌 Provider & Transport Improvements
    
    - **OpenRouter** integrated as an aggregator for Qwen3-Embedding-8B, with app-attribution headers.
    - **Cache-token capture** fixed for Vercel AI SDK v6 + Gemini and Claude Agent SDK (snake_case `cache_creation_input_tokens` / `cache_read_input_tokens`).
    - **Per-call LLM telemetry** persisted on disk; Anthropic token capture hardened.
    - **Auth column** in providers list now derived from the v3 declaration and covers every provider.
    - **Unified provider-auth notice** for Anthropic + Google with inline A./B. labels.
    - **Anthropic pre-flight probe** before starting the run.
    
    ## 🩹 Wizard / UX Fixes
    
    - Stop saying β€˜Embeddings (lite) ready' when advanced is the active tier.
    - Unified embeddings tier notice + comparison into one block; renamed `default` β†’ `lite`.
    - OpenRouter (Qwen3-8B) promoted as the recommended NLP provider.
    - External setup exits cleanly instead of failing the run.
    - `--file` glob with zero matches now fails fast with a clear error.
    - Run-summary shows input/output tokens in the cost line; hardcoded subscription hint removed.
    - Pre-summary hint detector for missing init / lite RAG upgrade.
    - `local-embeddings` patches config and skips the wizard prompt after upgrade.
    - Doc-bootstrap: `scaffold-status` tag now certifies docs validity (no more file-presence guessing).
    
    ## 🛠 Install / Build / DevEx
    
    - **Lazy-load model download** on first run, dropping the postinstall download.
    - **`make update`** recipe to refresh from `origin/main` and reinstall in one step; merged into a single shell so 'up to date' skips install.
    - **WSL guard** for Windows-installed Node, with install doc note.
    - **`prepare` script** runs `tsup` so `npm install <git-url>` works; self-heals devDeps when npm git-install skips them.
    - Migrated `@xenova/transformers` → `@huggingface/transformers`.
    - Bumped `@google/gemini-cli-core` 0.35.2 → 0.40.1.
    
    ## 🧹 Misc
    
    - Centralised default model identifiers in `core/default-models.ts`.
    - Code-review hardening pass.
    - `removeRunIfEmpty` + `readLatestPointer` exports restored on `run.ts`.
    - README links to `anatoly.cloud` + free/star CTA.
    - README + CLI / module / cost-optimization docs updated to match the new `estimate` view.
    
    ---
    
    **Full changelog**: https://github.com/r-via/anatoly/compare/v0.9.4...v0.9.5
  3. v0.9.4: Background Worktree Review & Internal Docs Injection


    ## Highlights
    
    ### Background worktree review
    Run audits in isolated git worktree snapshots without locking your main checkout. The new background mode forks the audit into its own process, persists per-run status, and notifies you on completion, letting you keep working while a long review runs.
    
    - Isolated worktree per run (no conflicts with WIP changes on `HEAD`)
    - Parallel runs supported via per-run lock policy (global lock skipped in background mode)
    - `anatoly status` enriched with tracked background runs (PID, phase, elapsed)
    - Desktop notifications on completion (`notify-send` / equivalent)
    - New `anatoly cleanup` command: prune stale worktrees, lock files, and orphaned run dirs
    
    ### Internal docs as ground truth in business-logic axes
    The internal docs scaffolder (`.anatoly/docs/`, agent-curated business overview / architecture / invariants) is now injected as authoritative project context in the `correction`, `best_practices`, and `overengineering` axes, not just in `documentation`. Findings can cite the source page path, making the chain of reasoning auditable. Zero additional LLM cost: the docs are already generated by the existing scaffolder phase.
    
    ### Industry-domain prompting
    When the model can confidently infer your project's domain (gambling/casino, finance/payments, healthcare/PII, cryptography, gaming RNG, real-time systems) from filenames, imports, package metadata, README, or internal docs, it now applies well-known industry rules from its pretrained knowledge. Examples: `Math.random()` flagged as non-certifiable for regulated gaming, floating-point arithmetic flagged on monetary code, deprecated cryptographic primitives (MD5/SHA-1/ECB) flagged as critical. Each such finding cites both the inferred domain and the rule, keeping the speculative chain auditable.
    
    ### Per-defect correction findings
    Symbols carrying multiple independent defects now split into one row per defect in the report instead of collapsing into a single prose paragraph. Each finding has its own `line_start` / `line_end` / `detail` for clearer signals and reproducible verdicts.
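    
    As a rough illustration of the one-row-per-defect shape: the field names `line_start`, `line_end`, and `detail` come from the note above, while `Finding`, `SymbolReview`, `renderRows`, and the sample data are assumptions.
    
    ```typescript
    // One entry per independent defect on a symbol.
    interface Finding {
      line_start: number;
      line_end: number;
      detail: string;
    }
    
    // Hypothetical container for a reviewed symbol.
    interface SymbolReview {
      symbol: string;
      findings: Finding[];
    }
    
    // Emit one table row per defect rather than one prose paragraph per symbol.
    function renderRows(review: SymbolReview): string[] {
      return review.findings.map(
        (f) => `| ${review.symbol} | ${f.line_start}-${f.line_end} | ${f.detail} |`
      );
    }
    
    const defectRows = renderRows({
      symbol: "computePayout",
      findings: [
        { line_start: 10, line_end: 12, detail: "floating-point arithmetic on money" },
        { line_start: 20, line_end: 20, detail: "unchecked negative stake" },
      ],
    });
    console.log(defectRows.length); // 2
    ```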
    
    ### Transport-Level Resilience
    Per-provider semaphores and circuit breakers are now centralized in `TransportRouter` with a unified `acquireSlot` / `release` API. Replaces the previous mix of manual semaphores and the legacy `GeminiCircuitBreaker`. All agentic call sites migrated; agentic and single-turn calls share concurrency policy by provider.
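    
    A simplified sketch of a per-provider slot limiter combined with a circuit breaker. It is synchronous for brevity: the hypothetical `tryAcquireSlot` returns `false` instead of waiting for a free slot, and the breaker never resets on a timer. `SlotRouter` and its thresholds are assumptions, not the `TransportRouter` implementation.
    
    ```typescript
    interface ProviderState {
      inFlight: number;
      consecutiveFailures: number;
    }
    
    class SlotRouter {
      private states = new Map<string, ProviderState>();
      private maxConcurrent: number;
      private failureThreshold: number;
    
      constructor(maxConcurrent: number, failureThreshold: number) {
        this.maxConcurrent = maxConcurrent;
        this.failureThreshold = failureThreshold;
      }
    
      private state(provider: string): ProviderState {
        let s = this.states.get(provider);
        if (!s) {
          s = { inFlight: 0, consecutiveFailures: 0 };
          this.states.set(provider, s);
        }
        return s;
      }
    
      // Returns false when the breaker is open or all slots are taken.
      tryAcquireSlot(provider: string): boolean {
        const s = this.state(provider);
        if (s.consecutiveFailures >= this.failureThreshold) return false; // breaker open
        if (s.inFlight >= this.maxConcurrent) return false; // no free slot
        s.inFlight += 1;
        return true;
      }
    
      // Frees the slot and records the outcome for the breaker.
      release(provider: string, ok: boolean): void {
        const s = this.state(provider);
        s.inFlight = Math.max(0, s.inFlight - 1);
        s.consecutiveFailures = ok ? 0 : s.consecutiveFailures + 1;
      }
    }
    
    const slotRouter = new SlotRouter(2, 3); // 2 slots, breaker opens after 3 failures
    console.log(slotRouter.tryAcquireSlot("google")); // true
    ```
    
    Because state is keyed by provider, a failing provider trips only its own breaker and leaves the others untouched.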
    
    ## Bench progression (anatoly-bench / slot-engine fixture)
    
    Each run is a full audit of the [`slot-engine`](https://github.com/r-via/anatoly-bench/tree/main/catalog/slot-engine) fixture, scored against a curated ground-truth catalog. Global F1 is the unweighted mean of per-axis F1s.
    
    | Run | Date | Global F1 | correction | utility | duplication | overengineering | best-practices |
    |-----|------|----------:|-----------:|--------:|------------:|----------------:|---------------:|
    | v6  | 2026-04-24 | 56.8%     | 54.5%      | 60.0%   | **66.7%**   | 66.7%           | 36.4%          |
    | v7  | 2026-04-26 | 65.5%     | 61.5%      | 60.0%   | 66.7%       | 66.7%           | **72.7%**      |
    | v8  | 2026-04-27 | 62.7%     | 36.4%      | **85.7%** | 66.7%     | 75.0%           | 50.0%          |
    | v9  | 2026-04-27 | 61.0%     | 46.2%      | 85.7%   | 66.7%       | 66.7%           | 40.0%          |
    | v10 | 2026-04-28 | 65.0%     | 53.3%      | 85.7%   | 66.7%       | 75.0%           | 44.4%          |
    | v11 | 2026-04-28 | 57.8%     | 44.4%      | 85.7%   | 66.7%       | 33.3%           | 58.8%          |
    | **v12** | 2026-04-28 | **67.8%** | 53.3% | 85.7% | 66.7% | **66.7%** | **66.7%** |
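    
    The Global F1 column can be reproduced from the per-axis columns as an unweighted mean. A small check on the v12 row (the helper name `globalF1` is just for this example):
    
    ```typescript
    // Unweighted mean of per-axis F1 scores, in percent.
    function globalF1(perAxis: number[]): number {
      return perAxis.reduce((a, b) => a + b, 0) / perAxis.length;
    }
    
    // v12 row: correction, utility, duplication, overengineering, best-practices.
    const v12Axes = [53.3, 85.7, 66.7, 66.7, 66.7];
    console.log(globalF1(v12Axes).toFixed(1)); // "67.8"
    ```
    
    Recomputing from the rounded table entries can differ from the reported figure by 0.1 on some rows (v6, for example, gives 56.9 against a reported 56.8), which is consistent with the reported value being computed from unrounded per-axis scores.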
    
    Six fixes landed during this release window, each measured against the previous baseline:
    
    - **v6: duplication tier-1 invariant** ([44f0617](https://github.com/r-via/anatoly/commit/44f0617)). Tier-1 refinement was downgrading `DUPLICATE` verdicts when RAG similarity stayed below 0.68, even with a concrete `duplicate_target`. duplication: 0% → 66.7%.
    - **v8: per-axis triage policy** ([b784caf](https://github.com/r-via/anatoly/commit/b784caf)). Triage skip-tier was binary: type-only / trivial / barrel files bypassed every axis with safe defaults, so utility lost real DEAD signals. Now skip decisions are per-axis, with usage graph consulted for utility on skipped files. utility: 66.7% → 85.7%.
    - **v9: multi-defect findings per symbol** ([75cdf08](https://github.com/r-via/anatoly/commit/75cdf08)). Correction now returns an optional `findings[]` array per symbol; the shard renderer emits one row per defect.
    - **v10: internal-docs injection into business-logic axes** ([a584b80](https://github.com/r-via/anatoly/commit/a584b80)). Anatoly's existing `.anatoly/docs/` already produced high-quality business context, used only by the documentation axis. Now also fed into `correction` / `best_practices` / `overengineering` with a ground-truth framing in the system prompt. correction: 46.2% → 53.3%; INV-ROUND detected.
    - **v11: industry-knowledge prompting** ([d0068a2](https://github.com/r-via/anatoly/commit/d0068a2)). Prompt rule inviting the model to apply well-known industry-specific rules (gaming RNG / monetary arithmetic / deprecated cryptographic primitives) when domain inference is confident, with mandatory citation of both inferred domain and rule. best-practices recall hit 100% (5/5) for the first time; BP-RNG (`Math.random()` in gaming) detected.
    - **v12: anti-collapse rules + temperature pin** ([d8fd931](https://github.com/r-via/anatoly/commit/d8fd931), [ebb8505](https://github.com/r-via/anatoly/commit/ebb8505)). Two changes: (1) a "flag the source of a defect, not its consumer" rule on correction / OE / best-practices prompts, which fixes verdicts that flapped from run to run as the LLM oscillated between flagging one consumer-side finding vs N source-side findings. (2) `temperature: 0` pinned in the Vercel SDK transport for evaluator reproducibility (Anthropic Claude Agent SDK and Gemini CLI use SDK defaults because they do not expose temperature). OE: 33.3% → 66.7% with 100% precision; global F1: 57.8% → 67.8%.
    
    **Net result for v0.9.4**: global F1 climbed from **56.8%** (start of cycle) to **67.8%** (+11.0 percentage points), with structural improvements on every axis. Full per-run baselines: [`anatoly-bench/baselines/`](https://github.com/r-via/anatoly-bench/tree/main/baselines).
    
    ## Fixes
    
    - **Review progress counter**: the display could show impossible values like `13/12` when triage skip-tier files had partial-axes policies (trivial files keeping correction/duplication/utility). `evaluateTotal` now mirrors the handler's actual evaluator-runs decision.
    - **Refinement / DUPLICATE preservation**: never downgrade a `DUPLICATE` verdict in tier 2 when `duplicate_target` is populated; preserve `DUPLICATE` on dead code instead of collapsing to `DEAD`.
    - **Triage**: per-axis skip policy keeps real signal on trivial / barrel-export / type-only / constants-only files (correction + duplication + utility still run on trivial files); usage-graph utility evaluation correctly resolves DEAD on skipped type/constant exports.
    - **Anti-collapse rule**: correction and overengineering prompts now instruct the model to flag the source of a defect, not its consumer (a defect has one canonical home: where it is defined).
    - **Telegram notifications**: disabled axes excluded from the scorecard so cosmetic placeholder rows don't pollute the message.
    - **Vercel SDK transport**: pinned `temperature=0` for evaluator reproducibility (deterministic verdicts on identical input).
    
    ## Internals
    
    - BMAD/Ralph workflow integration improvements: parser tolerates story / epic heading variants, sprint-status sync points clarified.
    - `domain-digest` feature explored, implemented, then reverted in favor of internal docs injection (same goal, no parallel extraction pipeline, zero additional LLM cost). Original spec preserved as deprecated history in `anatoly-bench/docs/02-domain-digest-spec.md`.
    
    ## Migration
    
    Drop-in upgrade from 0.9.3. No config changes required; the new commands are opt-in:
    
    ```bash
    anatoly run                    # default, same as before
    anatoly status                 # now shows background runs
    anatoly cleanup                # new: prune stale worktrees / locks
    ```
    
    The internal-docs injection only activates when `.anatoly/docs/` already exists for a project; that directory is generated automatically on first run by the existing scaffolder phase.
    
    **Full changelog**: https://github.com/r-via/anatoly/compare/v0.9.3...v0.9.4
  4. v0.9.3


    ## What's Changed since v0.9.2
    
    ### Features
    
    - **Overengineering axis: usage-graph signal + duplication invariant** ([b129cd6](https://github.com/r-via/anatoly/commit/b129cd6)): the overengineering evaluator now factors in the usage graph (symbols with few runtime importers are weighted differently from hot code) and enforces a duplication invariant preventing contradictory verdicts.
    - **Global refinement cache with freshFiles invalidation** ([3071dfc](https://github.com/r-via/anatoly/commit/3071dfc)): per-finding cache in `.anatoly/cache/` now survives across runs with freshly-reviewed files auto-evicted, preventing redundant tier-3 investigations.
    - **Zod validation retry in agenticQuery** ([f617d1a](https://github.com/r-via/anatoly/commit/f617d1a)): transport layer retries once on schema validation failures before bubbling the error, reducing flaky runs caused by transient LLM output drift.
    
    ### Fixes
    
    - **Review coherence and cache invariants** ([3c27a0a](https://github.com/r-via/anatoly/commit/3c27a0a)): multiple inter-axis coherence bugs and stale-cache edge cases resolved.
    - **extract-json fence matching** ([fdd9c77](https://github.com/r-via/anatoly/commit/fdd9c77)): only matches ` ```json ` fences, no longer swallows ` ```rust ` or other-language blocks that happen to look JSON-like.
    - **Tier1/tier2 progress output** ([e6a14a8](https://github.com/r-via/anatoly/commit/e6a14a8)): finding totals now surface in refinement progress so you can see how many findings each tier is processing.
    - **Transport hardening: adversarial review #1-#10** ([b10e5c0](https://github.com/r-via/anatoly/commit/b10e5c0)): ten findings from the adversarial transport review addressed.
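    
    The fence-matching fix can be illustrated with a pattern that accepts only fences tagged `json`. This is a sketch of the idea, not the project's `extract-json` code; `FENCE`, `JSON_FENCE`, and `extractJson` are names invented here, and the fence marker is built at runtime so the example can itself sit inside a fenced block.
    
    ```typescript
    // Three backticks, built programmatically.
    const FENCE = "`".repeat(3);
    
    // Match only fences explicitly tagged json; other languages are ignored.
    const JSON_FENCE = new RegExp(FENCE + "json[ \\t]*\\n([\\s\\S]*?)\\n[ \\t]*" + FENCE);
    
    function extractJson(text: string): unknown {
      const match = JSON_FENCE.exec(text);
      return match ? JSON.parse(match[1]) : undefined;
    }
    
    // A reply containing a rust block followed by the real JSON payload.
    const llmReply = [
      FENCE + "rust",
      "let v = vec![1, 2];",
      FENCE,
      FENCE + "json",
      '{"verdict": "CLEAN"}',
      FENCE,
    ].join("\n");
    
    console.log(extractJson(llmReply)); // { verdict: 'CLEAN' }
    ```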
    
    ### Docs
    
    - **Advanced Configuration rewrite for v2 schema** ([d6f0143](https://github.com/r-via/anatoly/commit/d6f0143)): [docs/03-Guides/02-Advanced-Configuration.md](https://github.com/r-via/anatoly/blob/main/docs/03-Guides/02-Advanced-Configuration.md) rewritten to match the current v2 config schema.
    
    ### Chore
    
    - **Lint cleanup** ([06441cc](https://github.com/r-via/anatoly/commit/06441cc)): removed unused imports/vars, routed Telegram warnings through the central logger.
    
    **Full changelog**: https://github.com/r-via/anatoly/compare/v0.9.2...v0.9.3
  5. v0.9.2


    ## What's Changed since v0.9.1
    
    ### 3-Tier Refinement Pipeline (replaces per-file Deliberation)
    
    The per-file Opus deliberation pass is replaced by a post-review refinement pipeline that processes all findings in batch:
    
    - **Tier 1: Deterministic auto-resolve** (0 tokens). Usage graph confirms DEAD exports are truly unreferenced, AST validates line ranges, RAG confirms duplication candidates. Resolves ~40% of findings instantly.
    - **Tier 2: Inter-axis coherence** (0 tokens). Detects contradictions (DEAD + NEEDS_FIX is moot, type-only importers can't be OVER, LOW_VALUE coherence checks). Deterministic rules, no LLM.
    - **Tier 3: Agentic investigation** (Opus). Launches an agent with full tool access (Read, Grep, Bash, WebFetch) to investigate ambiguous findings with empirical evidence. Conversation transcripts dumped per finding.
    - **Global refinement cache**: per-finding persistence in `.anatoly/cache/` survives across runs; freshly-reviewed files are auto-evicted; `--no-cache` clears it
    - **[CACHED] shard display**: cached shards show `[CACHED]` tag in deliberation output, matching review phase style
    - **Finding totals in progress**: tier 1/tier 2 now show resolved/total and confirmed counts for full visibility
    - **Results**: 22% faster, 20% cheaper, +150% CLEAN files vs legacy deliberation
    - Per-shard progress display with finding-level granularity
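    
    One of the tier-2 rules, "DEAD + NEEDS_FIX is moot", can be sketched as a pure function. The verdict names `DEAD` and `NEEDS_FIX` appear in the notes above; `USED`, `OK`, `SymbolVerdicts`, and `actionableVerdicts` are placeholders invented for this example.
    
    ```typescript
    // Hypothetical per-symbol verdicts from two axes.
    interface SymbolVerdicts {
      symbol: string;
      utility: "USED" | "DEAD";
      correction: "OK" | "NEEDS_FIX";
    }
    
    // Deterministic coherence rule: fixing dead code is moot,
    // so a DEAD symbol reports only DEAD.
    function actionableVerdicts(v: SymbolVerdicts): string[] {
      if (v.utility === "DEAD") return ["DEAD"];
      return v.correction === "NEEDS_FIX" ? ["NEEDS_FIX"] : [];
    }
    
    console.log(actionableVerdicts({ symbol: "legacyHelper", utility: "DEAD", correction: "NEEDS_FIX" })); // [ 'DEAD' ]
    console.log(actionableVerdicts({ symbol: "parseArgs", utility: "USED", correction: "NEEDS_FIX" })); // [ 'NEEDS_FIX' ]
    ```
    
    Rules of this kind need no LLM call, which is why the tier is listed at 0 tokens.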
    
    ### Multi-Provider Transport Architecture (Epic 43)
    
    Complete rewrite of the LLM transport layer:
    
    - **Mode-aware TransportRouter**: routes models to native transports (subscription) or Vercel AI SDK (API billing) based on provider config
    - **Vercel AI SDK transport**: unified API billing for any provider (Anthropic, Google, OpenAI) with cost calculation
    - **Config v2 format**: `providers:`, `models:`, `agents:`, `runtime:` sections replace flat `llm.*` paths. Automatic v1→v2 migration
    - **`anatoly init` wizard**: interactive multi-provider setup with model selection
    - **Per-provider semaphores and circuit breakers** (Epic 46): `acquireSlot()`/`release()` pattern with automatic success/failure tracking
    - **`extractProvider()`/`stripPrefix()`**: model prefix inference (`google/gemini-2.5-flash` or bare `gemini-2.5-flash`)
    - **Agentic query**: `agenticQuery()` on TransportRouter for tier 3 dispatch with Bash tool + web search
    - **Zod validation retry** in agentic queries
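    
    A sketch of what prefix inference along these lines might look like. The function names `extractProvider` and `stripPrefix` come from the notes above, but their signatures, the `BARE_PREFIXES` table, and the fallback behaviour are assumptions.
    
    ```typescript
    // Hypothetical table for inferring a provider from a bare model name.
    const BARE_PREFIXES: Array<[string, string]> = [
      ["gemini-", "google"],
      ["claude-", "anthropic"],
      ["gpt-", "openai"],
    ];
    
    // "google/gemini-2.5-flash" -> "google"; bare names fall back to the table.
    function extractProvider(model: string): string | undefined {
      const slash = model.indexOf("/");
      if (slash > 0) return model.slice(0, slash);
      for (const [prefix, provider] of BARE_PREFIXES) {
        if (model.startsWith(prefix)) return provider;
      }
      return undefined;
    }
    
    // "google/gemini-2.5-flash" -> "gemini-2.5-flash"; bare names pass through.
    function stripPrefix(model: string): string {
      const slash = model.indexOf("/");
      return slash > 0 ? model.slice(slash + 1) : model;
    }
    
    console.log(extractProvider("google/gemini-2.5-flash")); // "google"
    console.log(extractProvider("gemini-2.5-flash")); // "google"
    console.log(stripPrefix("google/gemini-2.5-flash")); // "gemini-2.5-flash"
    ```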
    
    ### Telegram Notifications (Epic 45)
    
    - **`anatoly notifications create-bot`**: interactive setup wizard for Telegram bot
    - **`anatoly notifications test`**: send a test notification
    - **`anatoly report --notify`**: send notification after report generation
    - **Auto-notify** after each `anatoly run`: single photo + caption with compressed banner, health bars, severity breakdown, token stats
    - Budget-aware findings truncation to fit Telegram's 1024-character caption limit
    - Fire-and-forget: delivery failures never break the pipeline
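    
    Budget-aware truncation can be sketched as adding findings one by one while the caption, including a "(+N more)" suffix, still fits. `buildCaption` and the suffix wording are assumptions, and string length here counts UTF-16 code units, which may not match how Telegram counts characters.
    
    ```typescript
    const TELEGRAM_CAPTION_LIMIT = 1024;
    
    // Keep the header, then add findings while the caption stays within budget.
    function buildCaption(header: string, findings: string[], limit: number = TELEGRAM_CAPTION_LIMIT): string {
      let caption = header;
      let shown = 0;
      for (const finding of findings) {
        const remaining = findings.length - shown - 1; // hidden count if this one is added
        const suffix = remaining > 0 ? `\n(+${remaining} more)` : "";
        const candidate = caption + "\n" + finding;
        if ((candidate + suffix).length > limit) break;
        caption = candidate;
        shown += 1;
      }
      const hidden = findings.length - shown;
      const result = hidden > 0 ? `${caption}\n(+${hidden} more)` : caption;
      // Hard cap in case the header alone is close to the limit.
      return result.length > limit ? result.slice(0, limit) : result;
    }
    
    console.log(buildCaption("Run done", ["a.ts: DEAD export", "b.ts: NEEDS_FIX"]));
    ```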
    
    ### User Instructions: ANATOLY.md
    
    - **`ANATOLY.md`** project-level instructions file: custom rules injected into axis system prompts
    - Loader with frontmatter parsing and section extraction
    - Per-axis skip patterns in config (`axes.*.skip`)
    - Show custom rules in configuration table during setup
    
    ### Review Engine
    
    - **Duplication auto-UNIQUE**: skip LLM when no RAG similarity candidates exist
    - **Utility retry**: retry when LLM omits symbols instead of crashing
    - **Correction refinement**: 4 new deterministic rules reduce false positives
    - Remove projectTree injection from overengineering and tests axes (token savings)
    - Pass `userInstructions` to all evaluators
    
    ### RAG
    
    - Detect and log NLP name mismatches that cause infinite re-indexing
    - Garbage-collect orphaned cache entries after parser line shifts
    - Truncate NLP summaries instead of rejecting on >400 chars
    - Route NLP summarization through TransportRouter
    - Remove duplicate `rag:` prefix from log messages
    
    ### CLI & UX
    
    - Dynamic provider display in pipeline header
    - Per-provider concurrency slots display (Claude + Google)
    - Move run-only options from global scope to run command
    - Unify deliberation step as single "Deliberation" task in UI
    - Include file paths and messages in error summary
    - Doc scaffold conversation transcripts
    - `anatoly runs` command + latest pointer helpers
    - Fix `--no-color` flag
    
    ### Report
    
    - Add `--debug` flag for report generation
    - Fix `--notify` to use real health percentages
    
    ### Clean Loop
    
    - Kill spawned claude process on SIGINT/SIGTERM
    - Restore original branch on interrupt (stash + checkout + pop)
    
    ### Config
    
    - Gold-set: 6 fixtures covering all 7 axes
    - Config v1.0 schema with validation tests
    - Per-axis skip patterns
    
    ### Bug Fixes
    
    - Fix `extract-json` to only match ` ```json ` fences, not ` ```rust ` or other langs
    - Fix `isGeminiModel` crash, use `stripPrefix` everywhere
    - Suppress `console.error` from gemini-cli-core rate limit retries
    - Fix Zod v4 refine compatibility in utility axis
    - Fix cached file metrics in triage
    - Fix triage to respect enabled axes in skip reviews
    - Adversarial reviews: Epic 41 (10 fixes), Epic 42 (5 fixes), Epic 43 (7 fixes), Epic 46 (10 fixes)
    
    **Full Changelog**: https://github.com/r-via/anatoly/compare/v0.9.1...v0.9.2
  6. v0.9.1


    ## What's Changed since v0.9.0
    
    ### Gemini Transport
    
    - **GenAI SDK transport**: added `@google/genai` SDK as alternative Gemini transport alongside `gemini-cli-core`, with concurrency stress test and token optimizations
    
    ### RAG
    
    - **Scoped `--rebuild-rag`**: when used with `--file`, only purges vector store entries and caches for matching files instead of dropping the entire table
    - **Gemini semaphore**: pass `geminiSemaphore` through the full RAG pipeline (orchestrator → nlp-summarizer → runSingleTurnQuery)
    
    ### Report
    
    - **Health bar severity scaling**: degrade health bar color based on high-severity finding count, scale thresholds by codebase size
    - Remove `buildReportsBaseUrl` from report command
    
    ### Scripts
    
    - Portable `awk` in `free_port`, bounded timeout in `wait_for_gguf`
    
    **Full Changelog**: https://github.com/r-via/anatoly/compare/v0.9.0...v0.9.1
  7. v0.9.0


    ## What's Changed since v0.8.2
    
    ### Multi-Provider LLM: Gemini 2.5 Flash (Experimental)
    
    **LlmTransport abstraction**, a pluggable provider layer with `AnthropicTransport` and `GeminiTransport`:
    - `LlmTransport` interface and `TransportRouter` for model-to-provider routing (Story 37.1)
    - `AnthropicTransport` wraps existing Claude SDK calls (Story 37.2)
    - `GeminiTransport` wraps `@google/gemini-cli-core` with Google OAuth (Story 37.3)
    - Auth check with graceful fallback to Claude when Gemini is unavailable (Story 37.5)
    - Circuit breaker: auto-falls back to Claude on repeated Gemini failures
    
    **Axis routing**, with utility, duplication, and overengineering now routed to Gemini 2.5 Flash (Story 38.1):
    - 100% accuracy on gold-set benchmarks, 2-5s latency, implicit caching (96% hit rate on 2nd call)
    - Correction, tests, best practices, documentation remain on Claude (quality-critical)
    - Deliberation stays on Claude Opus (non-negotiable safety net)
    
    **NLP summarization** routed to Gemini Flash (Story 39.1): 100% schema validity, $0/token
    
    **Impact:** ~69% reduction in Claude API calls, ~74% cost reduction, ~35-40% faster runs
    
    **New commands:**
    - `anatoly providers`: verify LLM connectivity (Claude + Gemini status)
    
    **Infrastructure:**
    - Dual semaphores for Claude and Gemini concurrency management
    - Provider field in logs and run metrics (Story 39.2)
    
    ### Review Engine
    
    - **AST-based import extraction**: replaced regex with tree-sitter AST traversal for `require()`, Python `from-import`, and Bash recursion
    - **False positive reduction**: filter private symbols, deduplicate actions, calibrate severity, conservative test coherence
    - **Test discovery**: expanded test file discovery, inject tests into deliberation context, deduplication
    - **Deliberation memory overhaul**: group by symbol instead of per-axis entries, merge stale entries, escape regex, rebuild on corrupted JSON, truncate `original_detail` to reclassified axes only
    - **`--flush-memory` flag**: reset deliberation memory before a run
    - **Rate limit handling**: sleep until rate limit reset instead of degrading reviews
    - Raise default review concurrency from 4 to 8
    
    ### Report
    
    - **`public_report.md`** - new polished public-facing report layout
    - **Report upstream extracted** - migrated to standalone script in `anatoly-reports` repo
    - **Redesigned report sections** - merged Findings Summary into Axes table, emoji health bars, verdict breakdown for all-clear axes, doc coverage section, total findings count in hero block
    - Absolute links to anatoly-reports + breadcrumb navigation
    - Executive summary with all 7 axes, fix token metrics
    
    ### Clean Loop
    
    - **Subcommand rename** - `clean-run` → `clean run`, `clean-sync` → `clean sync`, etc.
    - `clean generate`, `clean run`, `clean sync` as proper subcommands with tests
    - Bump default iterations from 10 to 50
    - Rename Ralph β†’ clean loop in source code
    
    ### Documentation Pipeline
    
    - **Smart chunking** - programmatic H2+H3+paragraph splitting replaces Haiku LLM chunking ($0)
    - **Coherence optimization** - single-pass Sonnet, auto-fix, content injection (was multi-pass Opus)
    - **Incremental updates** - scope doc updates to touched modules only
    - **Doc deduplication** - auto-sync project docs from internal when trees are identical, `docs identity` and `docs reset-project` commands
    - **Batch doc embeddings** - one mega-batch instead of per-file
    - Auto-fix missing h1 heading, strip preamble from multi-turn agent output
    - Raise scaffold maxTurns from 5 to 15, coherence injection budget to 50%
    - Track RAG costs, expose cached status for doc section indexing
    - Modular type-context injection + remove dead strategy2
    
    ### CLI & UX
    
    - Show "Done !" instead of "0/N checks left" on file completion
    - Clear screen on terminal resize to prevent rendering glitches
    - Show "done" on completed pipeline tasks instead of stale counters
    - Two-column models table, doc cache refresh, estimate doc stats
    - Cache breakdown in pipeline output, error dumps, `[review]` prefix
    - Reduce path truncation in progress display
    
    ### Rust Support
    
    - Chained `super::` resolution, workspace glob expansion, cross-crate usage tracking
    - Cargo workspace detection as Monorepo
    - Language-aware output formatting
    
    ### Security
    
    - Shell injection fixes - `execFileSync`, validate NWO format
    
    ### JSDoc Coverage (clean loop auto-fix)
    
    - Added JSDoc for ~200 symbols across 70 files (auto-generated via clean loop)
    
    ### Bug Fixes
    
    - RAG: block fallback to lite when index was built with advanced-gguf
    - Symlink guard on sync/reset, conditional sync task
    - Rate limiter max standby cycles, doc-gen rate limit detection
    - Workspace module extraction, gap detection, Cargo detection
    - Adversarial review - 12 fixes across CLI, RAG, core, commands
    - Breadcrumb links, fork detection, doc coverage on cached runs
    
    **Full Changelog**: https://github.com/r-via/anatoly/compare/v0.8.2...v0.9.0
  8. v0.8.3

    · v0.8.3

    ## What's Changed since v0.8.2
    
    ### Review Engine
    
    - **AST-based import extraction** - replaced regex with tree-sitter AST traversal for `require()`, Python `from-import`, and Bash recursion
    - **False positive reduction** - filter private symbols, deduplicate actions, calibrate severity, conservative test coherence
    - **Test discovery** - expanded test file discovery, inject tests into deliberation context, deduplication
    - **Deliberation memory overhaul** - group by symbol instead of per-axis entries, merge stale entries, escape regex, rebuild on corrupted JSON, truncate `original_detail` to reclassified axes only
    - **`--flush-memory` flag** - reset deliberation memory before a run
    - **Rate limit handling** - sleep until rate limit reset instead of degrading reviews
    - Raise default review concurrency from 4 to 8
    
    ### Report
    
    - **`public_report.md`** - new polished public-facing report layout
    - **Report upstream extracted** - `report upstream` migrated to standalone script in `anatoly-reports` repo
    - **Redesigned report sections** - merged Findings Summary into Axes table, emoji health bars, verdict breakdown for all-clear axes, doc coverage section, total findings count in hero block
    - Absolute links to anatoly-reports + breadcrumb navigation
    - Executive summary with all 7 axes, fix token metrics
    
    ### Clean (Ralph)
    
    - **Subcommand rename** - `clean-run` → `clean run`, `clean-sync` → `clean sync`, etc.
    - `clean generate`, `clean run`, `clean sync` as proper subcommands with tests
    - Bump default Ralph iterations from 10 to 50
    
    ### Documentation Pipeline
    
    - **Coherence optimization** - single-pass Sonnet, auto-fix, content injection (was multi-pass Opus)
    - **Incremental updates** - scope doc updates to touched modules only
    - **Doc deduplication** - auto-sync project docs from internal when trees are identical, `docs identity` and `docs reset-project` commands
    - **Batch doc chunking** - N Haiku calls → 1 per file, maximize concurrency
    - **Batch doc embeddings** - one mega-batch instead of per-file
    - Auto-fix missing h1 heading by deriving title from filename
    - Strip preamble from multi-turn agent output before writing
    - Raise scaffold maxTurns from 5 to 15, coherence review injection budget to 50%
    - Track RAG costs, expose cached status for doc section indexing
    - Modular type-context injection + remove dead strategy2
    
    ### CLI & UX
    
    - Show "Done !" instead of "0/N checks left" on file completion
    - Clear screen on terminal resize to prevent rendering glitches
    - Show "done" on completed pipeline tasks instead of stale counters
    - Reduce path truncation in progress display
    
    ### Rust Support
    
    - Chained `super::` resolution, workspace glob expansion, cross-crate usage tracking
    - Cargo workspace detection as Monorepo
    - Language-aware output formatting
    
    ### Security
    
    - Shell injection fixes - `execFileSync`, validate NWO format
    
    ### JSDoc Coverage (Ralph auto-clean)
    
    - Added JSDoc for ~200 symbols across 70 files (auto-generated via Ralph loop)
    
    ### Bug Fixes
    
    - RAG: block fallback to lite when index was built with advanced-gguf
    - Symlink guard on sync/reset, conditional sync task
    - Rate limiter max standby cycles, doc-gen rate limit detection
    - Workspace module extraction, gap detection, Cargo detection
    - Adversarial review - 12 fixes across CLI, RAG, core, commands
    
    **Full Changelog**: https://github.com/r-via/anatoly/compare/v0.8.2...v0.8.3
  9. v0.8.2

    · v0.8.2

    ## What's Changed since v0.8.1
    
    ### Internal Documentation Pipeline (Major)
    
    **Full `docs scaffold` pipeline** - 6-step pipeline replacing `docs rebuild`:
    - Scaffold → Coherence review → RAG index → Gap detection → Update → Lint
    - Opus coherence review agent for cross-page structural + content consistency
    - Deterministic heading linter replaces LLM-based structure review
    - Neighbor page injection for cross-referencing during doc generation
    - Site map context injected into doc-writer prompts
    - `docs scaffold project` copies internal docs to `docs/` for publishing
    - README.md injected as context for scaffold page generation
    
    **Gap detection v2** - three-strategy architecture:
    - Pre-computed `doc_vector` for gap detection (no runtime embedding, $0)
    - Domain splitting for large codebases (sub-domains by file)
    - Internal and project scope support
    
    **New commands:**
    - `anatoly docs index` - standalone RAG indexing (code + NLP + doc chunks)
    - `anatoly docs update` - incremental doc updates with shared logic
    - `anatoly docs lint` - deterministic structure lint
    - `anatoly docs coherence` - lint + Opus coherence review
    - `anatoly docs gap-detection internal|project` - scope-aware gap analysis
    
    ### RAG Engine
    
    - Always dual embedding (code + NLP) - removed `dualEmbedding` flag
    - Doc chunk cache to avoid re-chunking on rebuild
    - `--drop-cache` flag for full rebuild (purges NLP summary + doc caches)
    - `docSummary` added to function cards for doc gap detection
    - Cumulative progress tracking for doc chunking (project + internal)
    - Auto-purge doc caches when store has no doc sections
    
    ### CLI & UX
    
    - Bump default SDK concurrency to 24
    - Show "deliberating" state in UI
    - Show "no project/internal docs" instead of "done" when none exist
    - All pipeline steps show file activity in "In progress"
    - Track current page during Sonnet update
    - Semaphore passed to Opus agents for accurate UI counter
    - Rate limit retry with exponential backoff for doc executor
    - Plain mode improvements: log task transitions, error details, pipeline crashes
    - Fix deadlock from double semaphore acquire
    
    ### Setup & Configuration
    
    - Shared models directory across projects (skip redundant pulls)
    - Remove deprecated `--ab-test` flag from `setup-embeddings`
    
    ### Bug Fixes
    
    - Fix doc_vector migration warning and scaffold task gap
    - Fix stale review-internal refs and coherence prompt contradictions
    - Anti-hallucination + overwrite guard in content review prompt
    - Linter detects unnumbered files in numbered directories
    - Generic reference matching + deduplicate `pagesToUpdate`
    - `--plain` no longer implies `--yes` (require explicit `-y` for destructive ops)
    - Increase Opus agent maxTurns from 50 to 200
    - Fix `onFileDone` for files with 0 sections (stuck "In progress")
    - Block `docs index` if scaffold-only pages detected
    
    **Full Changelog**: https://github.com/r-via/anatoly/compare/v0.8.1...v0.8.2
  10. v0.8.1

    · v0.8.1

    ## What's Changed since v0.8.0
    
    ### Prompt System Overhaul (Epic 33 + 34)
    
    **Epic 33 - Universal Prompt Registry**
    - Migrated all 28 system prompts from inline strings to `.system.md` files (`src/prompts/axes/`)
    - Extracted 8 additional inline prompts to dedicated files
    - Extended `prompt-resolver.ts` to universal registry (37 entries)
    - Added bidirectional coherence tests for the registry
    - Adversarial review & auto-fixes
    
    **Epic 34 - Prompt Reinforcement (14 edge cases fixed)**
    - **Story 34.1**: Removed contradictory JSON fences from 6 axis prompts, dynamic axis count
    - **Story 34.2**: Added `guard-rails.system.md` anti-hallucination layer (confidence floor, symbol validation, line-range enforcement)
    - **Story 34.3**: Score calibration anchoring for all 12 best-practices prompts (TypeScript + 11 languages)
    - **Story 34.4**: Edge case handling for generated code, reinforced doc-writer and nlp-summarizer
    - **Story 34.5**: Dynamic Zod schema → JSON example injection into axis system prompts
    - **Story 34.6**: Gold-set integration test suite (8 fixtures, real LLM validation)
    - **Story 34.7**: Adversarial validation pass across all stories
    
    ### Doc Generation
    - Enriched LLM context with real project prerequisites
    - LLM detects system dependencies from code (Docker, Redis, etc.)
    - Prevented hallucinated API key instructions
    - Separated bootstrap from update phase
    - Added `anatoly docs` command with rebuild and status subcommands
    - Switched default model from Haiku to Sonnet for better quality
    
    ### CLI & Infrastructure
    - Added `--keep-docs` flag to reset command
    - Extracted setup table renderer, enriched estimate command
    - Added plain mode output for doc generation phases
    
    ### Scanner
    - Download tree-sitter grammars on-the-fly with local caching
    
    ### RAG
    - Parallelized doc chunking
    - Skip scaffolded-only doc pages from RAG indexing
    - Fixed stale "Saving index" display
    
    ### Report
    - Replaced page ratio with symbol-based coverage metric
    
    ### Review Engine
    - Wired prompt cascade into evaluators
    - Dynamic code fence tags (no more hardcoded TypeScript)
    - Derived axis count dynamically from evaluator registry
    
    ### Bug Fixes & Quality
    - Adversarial reviews & auto-fixes for Epics 28, 29, 31, 33, 34
    - Resolved pre-existing TypeScript compilation errors
    - Code generation marker rule shared via guard-rails (all axes)
    - Fixed gold-set test regex, symbol line numbers, and test assertions
    
    **Full Changelog**: https://github.com/r-via/anatoly/compare/v0.8.0...v0.8.1
  11. v0.8.0 - Multi-Language Support

    · v0.8.0

    ## Highlights
    
    ### Multi-Language Support (Epic 31)
    - **10 languages**: TypeScript, TSX, Python, Rust, Go, Java, C#, Bash/Shell, SQL, YAML, JSON
    - **Auto-detection**: language distribution by file extension + framework detection (React, Next.js) by project markers
    - **AST parsing**: tree-sitter grammar manager with dynamic WASM loading per language
    - **Language-specific best-practices**: dedicated prompts for each language (PyGuard, RustGuard, GoGuard, ShellGuard, JavaGuard, CSharpGuard, SqlGuard, YamlGuard, JsonGuard)
    - **Framework-aware evaluation**: React and Next.js specific prompts for best-practices and documentation axes
    - **Auto-detect scanning**: `scan.auto_detect` discovers project files across all supported languages automatically
    - **Usage graph**: multi-language import tracking (Python, Rust, Go, Bash source/dot imports)
    
    ### Documentation Pipeline (Stories 29.16-29.21)
    - **LLM doc generation**: semaphore-bounded concurrent doc page generation via Haiku
    - **Dynamic module pages**: scaffolder injects module-specific pages based on codebase analysis
    - **Dual doc context**: documentation axis pulls from both `docs/` (user) and `.anatoly/docs/` (internal reference)
    - **Configurable docs_path**: `documentation.docs_path` in `.anatoly.yml`
    - **Distinct coverage tracking**: separate project vs internal export documentation coverage
    - **Pipeline decoupling**: internal docs, post-review update, and bootstrap run as independent stages
    
    ### Fixes
    - Review cache is now per-axis (prevents stale results when switching axes)
    - Pipeline CLI display: proper left/right alignment in task list
    - Ctrl+C interrupt no longer dumps noisy error traces
    - Pre-existing TypeScript type errors resolved across tests and adapters
  12. v0.7.0

    · v0.7.0

    ## Highlights
    
    - **Documentation scaffold pipeline** - full doc-scaffold system: project type detection, structure scaffolding, source code analysis, LLM page generation, scoring, recommendations, and Ralph sync mode (Epic 29)
    - **SDK concurrency control** - global semaphore bounds parallel API calls to prevent rate-limit storms (Epic 30)
    - **Docker GGUF embedding backend** - replace Python sidecar with Docker-based GGUF containers for GPU-accelerated embeddings (Epic 28)
    - **RAG documentation indexing** - doc section extraction, NLP embeddings, and type-based filtering for review context
    - **Structured metrics & events** - timeline, conversationStats, per-file/per-axis structured events in run-metrics.json
    - **Report restructuring** - reports organized by axis with independent indexes
    
    ## Features
    
    - **Doc scaffold pipeline** - project type detection, module granularity resolution, contextual scaffolding hints, code-to-doc mapping, LLM page content generation, incremental SHA-256 cache, 5-dimension scoring, dual-output recommendations, Ralph sync mode, and full CLI integration via `anatoly run` (Stories 29.1–29.15)
    - **Concurrency** - global SDK semaphore wired through evaluation pipeline with configurable concurrency limits (Story 30.1)
    - **Flat render** - code review fixes for Story 31.1
    - **Docker GGUF containers** - VRAM detection, tier selection, A/B test for bf16 vs GGUF, container lifecycle management, setup wizard with Docker + NVIDIA Container Toolkit install (Epic 28)
    - **RAG enhancements** - doc section indexing with NLP embeddings, sidecar model swap, per-function NLP summary cache, documentation axis in deliberation system, `--docs` flag for RAG status
    - **Structured events** - per-file and per-axis structured events, unified run context for all commands, conversation dump infrastructure per LLM call
    - **Timeline & metrics** - timeline phases, conversationStats with byModel breakdown in run-metrics.json
    - **RAG UX** - animated spinners, phase checkmarks, concurrent file display, 3-phase progress (code/NLP/doc)
    - **Haiku semantic chunking** - doc sections refined via Haiku before embedding
    - **Batch embeddings** - batched NLP and code embedding requests for performance
    
    ## Fixes
    
    - Triage phase added to timeline, byModel fix in conversationStats
    - Normalize averaged NLP vector, restore lite-mode doc fallback
    - GGUF container lifecycle: kill zombies on startup, stop on force exit, verify alive before reuse
    - RAG progress display fixes (output stacking, concurrency display, Listr corruption)
    - Setup-embeddings routes correctly per backend tier
    - SHA-based cache for doc section indexing
    - Guard test ensuring Anatoly never writes to `docs/`
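    
    To illustrate why the "Normalize averaged NLP vector" fix matters: averaging unit vectors generally yields a vector shorter than unit length, which skews dot-product similarity. The sketch below is a generic TypeScript illustration; `averageAndNormalize` is a hypothetical helper, and the release notes do not say which vectors Anatoly averages or how.

    ```typescript
    // Averages equal-length vectors component-wise, then rescales the
    // result to unit L2 length so dot product and cosine similarity agree.
    function averageAndNormalize(vectors: number[][]): number[] {
      if (vectors.length === 0) {
        throw new Error("need at least one vector");
      }
      const dim = vectors[0].length;
      const mean = new Array<number>(dim).fill(0);
      for (const v of vectors) {
        if (v.length !== dim) {
          throw new Error("dimension mismatch");
        }
        for (let i = 0; i < dim; i++) {
          mean[i] += v[i] / vectors.length;
        }
      }
      const norm = Math.sqrt(mean.reduce((sum, x) => sum + x * x, 0));
      // A zero vector cannot be normalized; return it unchanged.
      return norm === 0 ? mean : mean.map((x) => x / norm);
    }
    ```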
  13. v0.6.0

    · v0.6.0

    ## Highlights
    
    - **7th evaluation axis: Documentation** - evaluates JSDoc coverage on exports and `/docs/` synchronization
    - **Deliberation memory** - generalized across all axes with learning loop feedback
    - **AGPL-3.0 + dual licensing** - migrate from Apache-2.0, add commercial license option
    - **Dry-run mode** - simulate the full pipeline without API calls (`--dry-run`)
    - **Calibrated ETA** - per-axis timing based on historical runs
    - **Sentence-transformers sidecar** - GPU-accelerated embeddings via nomic-embed-code 7B (3584d)
    
    ## Features
    
    - **Documentation axis** - `DOCUMENTED` / `PARTIAL` / `UNDOCUMENTED` verdicts, docs_coverage in review JSON (Epic 26)
    - **Deliberation memory** - persistent false-positive registry covers all axes, feeds back into learning loop
    - **Holistic deliberation** - covers all axes per symbol with transitive usage-graph refs, NIH detection
    - **Calibrated ETA** - estimate and pipeline summary display calibrated per-axis timing
    - **Branch isolation** - `clean-run` enforces branch isolation before Ralph loop
    - **Run lock** - block concurrent commands while a run is in progress
    - **`--dry-run`** - phase-based estimate, calibrated timing, no runDir creation
    - **`--axes` CLI option** - run specific axes (e.g. `--axes correction,tests`)
    - **Tests axis enrichment** - test file content, callers, and project tree in context
    - **Run Statistics & Axis Summary** - new report sections
    - **`init` & `setup-embeddings` commands** - one-command project setup
    - **RAG observability** - `rag-status` shows lite+advanced indexes
    - **Dual code+NLP embedding** - hybrid similarity search with configurable weights
    - **Hardware detection** - auto-select embedding models based on available hardware
    - **Colored MOTD banner**, sidecar lifecycle overhaul, Ralph circuit breaker
    
    ## Fixes
    
    - Documentation axis calibration, merge pipeline, prefix matching
    - Adversarial review: wire docsTree, fix 6→7-axis refs
    - Accumulative cache for complete reports across runs
    - Best practices & tests findings trigger NEEDS_REFACTOR
    - Calibration: max(axis) for parallel model, remove 3s sleep
    - Deliberation: only reviews symbols with findings, always active by default
    - Report: hide non-executed axes, full deliberation reasoning
    - RAG: Arrow FloatVector crash, ONNX fallback
    - Severity labels: French → English (CRITIQUE→CRITICAL, HAUTE→HIGH, MOYENNE→MEDIUM)
    - Sidecar: correct dimensions, venv isolation, loading progress
    
    ## Breaking Changes
    
    - **License**: Apache-2.0 → AGPL-3.0 (commercial license available, see COMMERCIAL.md)
    - `fix`/`fix-sync` commands renamed to `clean`/`clean-sync`
    - `--dual-embedding` replaced by `--rag-lite` / `--rag-advanced`
    - Hook subcommands renamed to `on-edit`/`on-stop`
  14. v0.5.1 - MOTD Banner, RAG Modes & Sidecar Overhaul

    · v0.5.1

    ## Features
    
    - Colored MOTD ASCII banner at startup with auto-generated sync script
    - Sidecar lifecycle overhaul - cleanup, scoped spawn, idle timeout
    - Show RAG file count in setup summary table
    - List available axes in --axes help text
    - Add --lite / --advanced RAG mode selection with separate indexes
    - Show '-' for unevaluated axes when using --axes filter
    - Auto-disable dual embedding when nomic-7B sidecar is active
    - Show sidecar loading progress in CLI spinner with elapsed time
    - Replace Ollama with sentence-transformers sidecar for GPU embeddings
    - Add Ollama runtime for GPU-accelerated embeddings and fix RAG schema mismatch
    - Add --axes CLI option for runtime axis selection (Epic 26)
    - Add Ollama backend for GPU-accelerated code embeddings via nomic-embed-code
    - Harden Ralph clean loop with circuit breaker, anti-placeholder guards, and adaptive PRD
    - Add RAG pipeline evaluation framework with ground-truth benchmarks
    - Add hardware detection and configurable dual embedding models
    - Add dual code+NLP embedding for improved semantic duplication detection
    - Add final clean-sync after clean-run completes
    - Rename fix → clean commands to match Anatoly branding
    - Native TypeScript Ralph loop via Claude Agent SDK
    
    ## Fixes
    
    - Resolve pre-existing TypeScript compilation errors
    - Review progress counter matches triage evaluate count
    - Exclude NLP-failed cards from cache so they get retried
    - Purge RAG cache when vector store is empty
    - Remove Arrow table migration - drop legacy table instead
    - Align ConfigSchema default dual_embedding to true
    - ONNX fallback always uses Jina, not the sidecar model
    - Start sidecar BEFORE model resolution to avoid chicken-and-egg
    - Correct nomic-embed-code dimension to 3584d (7B hidden state)
    - Add model load timer to --check and increase sidecar timeout to 180s
    - Correct model download size to ~14 GB (7B model in FP16)
    - Use correct model ID nomic-ai/nomic-embed-code (public, no auth)
    - Move embedding venv to .anatoly/.venv to avoid project collision
    - Fallback to ONNX when Ollama embed fails instead of truncating
    - Truncate input to Ollama embed to avoid GGML_ASSERT overflow
    
    ## Refactoring
    
    - Replace --dual-embedding with --rag-lite / --rag-advanced
    - Simplify estimate and triage CLI output
    - Use Claude Code CLI instead of SDK for Ralph loop
    
    ---
    
    > *"Can I clean here?"* - Anatoly looked at the codebase, mop in hand, and sighed. Fifty-one files. Eighteen thousand lines of dead imports. A `TODO` from 2019. He wrung the mop. *"Da. I clean everywhere."*
  15. v0.5.0 - Fix Command & Documentation Restructure

    · v0.5.0

    ## Highlights
    
    **`anatoly fix`** - The cleaning man now fixes what he finds. Parse an audit report shard, generate Ralph artifacts, and launch an autonomous correction loop. Every finding gets a deterministic checkbox ID for traceable remediation.
    
    ### Features
    
    - **`anatoly fix <report-file>`** - Generate prd.json + CLAUDE.md + ralph.sh from a report shard
    - **`anatoly fix-sync <report-file>`** - Sync completed fixes back to the report (shard + index checklist)
    - **Checkbox rendering** - Every action gets `- [ ] <!-- ACT-{hash}-{id} -->` for deterministic matching
    - **Aggregated Checklist** - Report index now includes a severity-sorted checklist of all actions
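    
    As a rough illustration of the `ACT-{hash}-{id}` checkbox marker, the TypeScript sketch below renders an action line and parses the marker back out. Everything beyond the marker format is an assumption: the release notes do not say what is hashed or which hash function is used, so `fnv1a`, `renderAction`, and `parseActionId` are hypothetical helpers, with the file path hashed by a 32-bit FNV-1a stand-in.

    ```typescript
    // Stand-in hash (32-bit FNV-1a); the hash Anatoly uses is not documented here.
    function fnv1a(text: string): string {
      let hash = 0x811c9dc5;
      for (let i = 0; i < text.length; i++) {
        hash ^= text.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193) >>> 0;
      }
      return ("0000000" + hash.toString(16)).slice(-8);
    }

    // Renders one action as a Markdown checkbox with a hidden, stable marker.
    // Hashing the file path is an assumption made for this sketch.
    function renderAction(file: string, id: number, text: string): string {
      return "- [ ] <!-- ACT-" + fnv1a(file) + "-" + id + " --> " + text;
    }

    // Extracts the marker so a later sync step can match the same action.
    function parseActionId(line: string): { hash: string; id: number } | null {
      const match = /<!-- ACT-([0-9a-f]+)-(\d+) -->/.exec(line);
      return match ? { hash: match[1], id: Number(match[2]) } : null;
    }
    ```

    Because the marker depends only on its inputs, re-rendering the same action yields the same line, which is what makes matching deterministic.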
    
    ### Fixes
    
    - `--file` filter now correctly scopes estimate and triage phases (was estimating full project)
    - Removed dead `summary`/`keyConcepts`/`behavioralProfile` fields from `rag-status` display
    
    ### Documentation
    
    - **Complete restructure**: 27 files across 7 sections, all cross-referenced against source code
    - Getting Started, Architecture, CLI Reference, Core Modules, Integration, Development, Design Decisions
    - 32 factual corrections applied after adversarial review
    - README updated with new doc links and [Anatoly Shmondenko](https://www.youtube.com/@vladimirfitness) origin story
    
    See [CHANGELOG.md](CHANGELOG.md) for full details.
  16. v0.4.2

    · v0.4.2

    ## What's Changed
    
    ### Enriched Review Reports (.rev.md)
    
    - **Best Practices section** - Score, rules table (WARN/FAIL only), and suggestions with before/after code blocks now rendered in individual file reviews
    - **Structured symbol details** - Pipe-delimited axis output parsed into per-axis bullets (Utility, Duplication, Correction, Overengineering, Tests)
    - **Categorized actions** - Quick Wins / Refactors / Hygiene sections with effort estimates
    - **Exported column** added to symbols table
    - **Defaulted axes flagged** - When an evaluator didn't produce a result, the review now shows *(default - evaluator did not produce a result)*
    
    ### Transcript Persistence
    
    - Axis evaluation transcripts are now persisted to runDir/logs/<file>.log for both run and review commands, enabling diagnosis of failed evaluators
    
    **Full Changelog**: https://github.com/r-via/anatoly/compare/v0.4.1...v0.4.2
  17. v0.4.1

    · v0.4.1

    ## What's Changed
    
    ### Bug Fixes & Improvements
    
    - **Dependency-aware evaluations** - Correction and best-practices axes now receive project dependency versions, reducing false positives (e.g., no longer flags missing try/catch when Commander v14+ handles async rejections natively)
    - **Implicit no-cache for explicit commands** - `anatoly review` and `anatoly run --file` always re-evaluate files instead of skipping cached results
    - **Renamed "dead code" → "utility"** in CLI display for clarity
    
    ### New in the Axis Pipeline (Epic 19)
    
    - 6 axis evaluators: utility, duplication, overengineering, tests, correction, best_practices
    - Axis merger, file evaluator, and simplified review pipeline
    - Worker pool with compact axis progress display
    - Updated estimator and reporter for the axis pipeline
    
    ### Internal
    
    - New `dependency-meta` module with 14 unit tests
    - Schemas v2 for multi-axis review output
    
    **Full Changelog**: https://github.com/r-via/anatoly/compare/v0.4.0...v0.4.1
  18. v0.4.0 - Triage, Fast Review & Sharded Reports

    · v0.4.0

    ## What's New
    
    ### Triage Pipeline (Epic 16)
    Files are now automatically classified into three tiers before review:
    - **Skip** - barrels, type-only, constants → synthetic CLEAN review, zero API calls
    - **Fast** - simple files (< 50 lines, < 3 symbols) → single-turn review (~5s)
    - **Deep** - complex files → full agentic investigation (~45s)
    
    Use `--no-triage` to disable and review all files with the full agent.
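    
    The three tiers can be sketched as a small classification function. This is a hypothetical TypeScript illustration: the `FileStats` shape and the `isTrivial` flag are assumptions, and combining the two thresholds with a logical AND is one reading of "< 50 lines, < 3 symbols".

    ```typescript
    type Tier = "skip" | "fast" | "deep";

    interface FileStats {
      lineCount: number;
      symbolCount: number;
      // Hypothetical flag: true for barrels, type-only files, and constants files.
      isTrivial: boolean;
    }

    // Classifies a file using the thresholds quoted in the release notes.
    function triage(file: FileStats): Tier {
      if (file.isTrivial) {
        return "skip";
      }
      if (file.lineCount < 50 && file.symbolCount < 3) {
        return "fast";
      }
      return "deep";
    }
    ```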
    
    ### Pre-computed Usage Graph (Epic 16)
    A full import graph is built across all project files in a single local pass (< 1s). The agent receives pre-computed usage data in its prompt, eliminating ~90 redundant Grep tool calls per review for dead code verification.
    
    ### Fast Reviewer (Epic 17)
    Simple files (fast tier) are reviewed in a single `query()` call with no tools: all context (file content, symbols, usage graph, RAG results) is included inline. If Zod validation fails after 2 attempts, the file is automatically promoted to deep review.
    
    Optional `fast_model` config field lets you use a cheaper model (e.g., `claude-haiku-4-5-20251001`) for fast-tier reviews.
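    
    The promotion rule for fast-tier reviews might be sketched as follows. This is a simplified, synchronous TypeScript illustration: `fastReviewOrPromote`, `runFast`, and `validate` are hypothetical names, the real reviewer calls are presumably asynchronous, and a plain validator function stands in for Zod to keep the example self-contained.

    ```typescript
    type ReviewOutcome<T> =
      | { tier: "fast"; review: T }
      | { tier: "deep" };

    // Runs the fast reviewer up to maxAttempts times. If no attempt passes
    // validation, the outcome signals promotion to deep review.
    function fastReviewOrPromote<T>(
      runFast: () => unknown,
      validate: (raw: unknown) => T | null,
      maxAttempts: number = 2
    ): ReviewOutcome<T> {
      for (let attempt = 0; attempt < maxAttempts; attempt++) {
        const review = validate(runFast());
        if (review !== null) {
          return { tier: "fast", review };
        }
      }
      return { tier: "deep" };
    }
    ```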
    
    ### Sharded Reports (Epic 18)
    Reports are now split into:
    - `report.md` - compact index (~100 lines) with executive summary, severity table, and checkbox links to shards
    - `report.N.md` - per-shard detail files (max 10 files each), sorted by severity
    
    When triage is active, the index includes a **Performance & Triage** section showing skip/fast/deep distribution and estimated time saved.
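    
    A minimal TypeScript sketch of the sharding step, assuming reviews are sorted by severity and then cut into groups of at most 10. The `FileReview` shape, the `shardReviews` name, and the `low` severity level are assumptions made for illustration.

    ```typescript
    type Severity = "critical" | "high" | "medium" | "low";

    interface FileReview {
      path: string;
      severity: Severity;
    }

    const SEVERITY_RANK: Record<Severity, number> = {
      critical: 0,
      high: 1,
      medium: 2,
      low: 3,
    };

    // Sorts reviews from most to least severe, then splits them into shards
    // of at most maxPerShard files (shard k would back report.k.md).
    function shardReviews(reviews: FileReview[], maxPerShard: number = 10): FileReview[][] {
      const sorted = [...reviews].sort(
        (a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity]
      );
      const shards: FileReview[][] = [];
      for (let i = 0; i < sorted.length; i += maxPerShard) {
        shards.push(sorted.slice(i, i + maxPerShard));
      }
      return shards;
    }
    ```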
    
    ### Code Review Fixes
    - Removed 180-line legacy `renderReport()` (dead code, used raw LLM verdict instead of `computeFileVerdict`)
    - Deduplicated `loadTasks()` calls in run pipeline (3x → 1x disk I/O)
    - Fast transcripts use `.fast.transcript.md` suffix to avoid overwrite on deep promotion
    - Added `export * from` handling in usage graph (prevents false DEAD on star re-exports)
    
    ## Commits
    
    - feat(triage): add file triage module for skip/fast/deep classification
    - feat(usage-graph): add pre-computed import usage graph
    - feat(prompt): inject pre-computed usage graph into agent prompt
    - feat(pipeline): integrate triage and usage graph into run pipeline
    - feat(fast-reviewer): add simplified single-turn reviewer for fast-tier files
    - feat(pipeline): dispatch fast-tier files to fast-reviewer with deep promotion
    - feat(reporter): shard report into index + per-shard files (max 10 files each)
    - feat(reporter): add Performance & Triage section to index when triage active
    - fix: code review fixes for v0.4.0 (epics 16-18)
    - chore: bump version to 0.4.0 and update README
    
    **Full Changelog**: https://github.com/r-via/anatoly/compare/v0.3.0...v0.4.0
  19. v0.3.0 - Conformity Audit & Fixes

    · v0.3.0

    ## Changelog (v0.2.0 → v0.3.0)
    
    ### Features
    
    - **Parallel reviews** - Worker pool with `--concurrency N` (default 4), thread-safe ProgressManager with serialized writes (`c18b8d8`)
    - **Rate limiting** - Exponential backoff for Sonnet reviews (base 5s, max 120s, jitter ±20%, 5 retries) (`00dbd72`)
    - **Multi-file renderer** - Worker slots showing `[1] reviewing...` per concurrent review, completion order in flow zone (`fe03e10`)
    - **Claude Code hooks** - `anatoly hook post-edit` (async background review), `hook stop` (quality gate + feedback injection), `hook init` (template generator) (`ed8424f`, `f501d44`)
    - **`min_confidence` config** - Filter hook findings below threshold (default 70) (`f501d44`)
    - **`max_stop_iterations` config** - Anti-loop protection for hook stop cycle (default 3) (`c1276a6`)
    - **RAG pre-resolved** - Similarity results injected statically in prompt, MCP tool removed (`96fe7d2`)
    - **Parallel RAG indexation** - Haiku calls distributed via worker pool with rate limiting (base 2s, max 30s) (`87f7794`, `d7c2ca4`)
    - **`index_model` config** - Configure RAG indexing model separately (default `claude-haiku-4-5-20251001`) (`c5df16d`)
    - **RAG on by default** - `--no-rag` to disable, launch banner shows model info (`29c9169`)
    - **RAG garbage collection** - Stale index entries for deleted/renamed files are automatically purged on re-index (`0538f73`)
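    
    The rate-limiting parameters above (base 5s, max 120s, jitter ±20%, 5 retries) can be illustrated with a short TypeScript sketch. The doubling factor, the choice to apply jitter after the cap, and the `backoffDelayMs` and `backoffSchedule` names are assumptions; the release notes give only the parameters.

    ```typescript
    const MAX_RETRIES = 5;

    // Delay before retry number `attempt` (0-based): the base delay doubles
    // each attempt, is capped at maxMs, then is scaled by a random factor
    // in [1 - jitter, 1 + jitter).
    function backoffDelayMs(
      attempt: number,
      random: () => number = Math.random,
      baseMs: number = 5000,
      maxMs: number = 120000,
      jitter: number = 0.2
    ): number {
      const capped = Math.min(maxMs, baseMs * Math.pow(2, attempt));
      const factor = 1 + jitter * (2 * random() - 1);
      return Math.round(capped * factor);
    }

    // Full schedule of delays for all retries, useful for inspection.
    function backoffSchedule(random: () => number = Math.random): number[] {
      const delays: number[] = [];
      for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
        delays.push(backoffDelayMs(attempt, random));
      }
      return delays;
    }
    ```

    Injecting `random` keeps the function deterministic under test while defaulting to `Math.random` in normal use.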
    
    ### Fixes
    
    - **`$NO_COLOR` env var** - Respects [no-color.org](https://no-color.org) standard (`c1276a6`)
    - **`review` command renderer** - Now uses `renderer.ts` with progress bar and counters instead of raw `console.log` (`c1276a6`)
    - **Hook stop anti-loop** - `stop_count` tracked in HookState, exits silently at max iterations (`c1276a6`)
    - **Hook spec alignment** - `decision: "block"` output format per Claude Code Stop hook protocol (`2f557a7`)
    - **RAG code review fixes** - 4 issues from Epic 12 (orchestrator, cache, error handling) (`c7d514a`)
    - **Progress bar stuck** - Fixed renderer scrolling up during review (`9ffa954`)
    - **Ctrl+C reliability** - Interrupt now works in all commands (`e447ef7`)
    - **Error messages** - Improved hints, verdictColor DRY, README update (`25edcce`)
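
    For readers unfamiliar with the `NO_COLOR` convention mentioned in the fixes above: the standard says colour is suppressed whenever the variable is present and not an empty string, whatever its value. A minimal sketch follows; `colorEnabled` and `paint` are hypothetical names, not Anatoly's renderer API.

    ```typescript
    // Hypothetical sketch of a NO_COLOR check; not Anatoly's actual renderer code.
    type Env = Record<string, string | undefined>;

    // Colour stays on only when NO_COLOR is unset or set to the empty string.
    function colorEnabled(env: Env): boolean {
      const noColor = env.NO_COLOR;
      return noColor === undefined || noColor === "";
    }

    // Wraps text in an ANSI SGR sequence (e.g. 32 = green) when colour is enabled.
    function paint(text: string, ansiCode: number, env: Env): string {
      return colorEnabled(env) ? `\u001b[${ansiCode}m${text}\u001b[0m` : text;
    }
    ```

    In a Node.js CLI the `env` argument would typically be `process.env`.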
    
    ### Self-Audit Fixes (`daeb3d9`)
    
    Anatoly audited its own codebase: 14 files reviewed, 29 findings. Applied in a single pass:
    
    - **scanner** - `abstract_class_declaration` now recognized as `class` symbol kind
    - **run** - AbortController renewal race fixed, dynamic `index_model` label, retry count no longer hardcoded
    - **reviewer** - dead `retries` counter removed, `tools`/`allowedTools` deduplication
    - **cache** - `readProgress` validates with `ProgressSchema.safeParse`, `atomicWriteJson` cleanup on failure
    - **vector-store** - `distanceToCosineSimilarity` computed once per row, `safeParseJsonArray` type-filters elements
    - **format** - `formatResultLine` uses `verdictColor()` instead of inlined switch
    - **hook-state** - null-review guard fixed (`typeof null === 'object'`)
    - **process** - `isProcessRunning` returns `true` on `EPERM` (cross-user process)
    - **lock** - explicit `unlinkSync` for stale lock cleanup
    - **config** - 6 dead sub-schema type exports removed
    - **task** - `TaskSchema`, `SymbolKindSchema`, `SymbolInfoSchema`, `CoverageDataSchema` exported
    - **report** - shared `progressPath`/`errorFiles` extraction hoisted out of branches
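
    Two of the fixes above (the **hook-state** null guard and the **process** `EPERM` case) follow common JavaScript patterns, sketched below. This is an illustration only: the function bodies and the injected `kill` parameter are assumptions, not Anatoly's actual implementation.

    ```typescript
    // Hypothetical sketch; names mirror the release notes but the bodies are assumed.

    // typeof null === "object", so a plain typeof check lets null through.
    // The explicit null comparison closes that gap.
    function isRecord(value: unknown): value is Record<string, unknown> {
      return typeof value === "object" && value !== null;
    }

    // Signature compatible with Node's process.kill; injected so the logic can be tested.
    type Kill = (pid: number, signal: number) => unknown;

    function isProcessRunning(pid: number, kill: Kill): boolean {
      try {
        // Signal 0 performs the existence and permission check without sending a signal.
        kill(pid, 0);
        return true;
      } catch (err) {
        // EPERM: the process exists but belongs to another user, so it is running.
        // Any other code (typically ESRCH) is treated as "not running".
        return isRecord(err) && err.code === "EPERM";
      }
    }
    ```

    In Node.js the call would look like `isProcessRunning(pid, process.kill)`.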
    
    ### Refactoring
    
    - **RAG orchestrator decoupled** - `processFileForIndex()` as pure function, `needsReindex()` extracted, batch upsert post-pool (`eb88562`)
    
    ### Docs
    
    - **README rewrite** - Stronger positioning, hook documentation, Mermaid architecture diagram (`42b068e`)
    
    ### Stats
    - **280 tests passing**
    - Build: 134.85 KB (ESM)
    - 19 commits
  20. v0.2.0

    · v0.2.0

    ## What's New in v0.2.0
    
    ### Features
    
    - **Semantic RAG cross-file similarity detection** - detect similar code patterns across files using embeddings (`feat(rag)`)
    - **Pre-index all functions with Haiku** before reviews for faster RAG lookups (`feat(rag)`)
    - **`rag-status` command** and improved SIGINT handling (`feat(cli)`)
    - **Run-scoped output directories** - organize outputs into time-scoped directories (`feat(output)`)
    - **UX/DX improvements (Epic 9)** - CLI usability and developer experience enhancements (`feat(ux)`)
    
    ### Bug Fixes
    
    - Fix `review-writer` skipping `duplicate_target` block when target fields are empty
    - Fix `PKG_VERSION` injection at build time for bundled `package.json` resolution
    - Resolve security, correctness and wiring issues in RAG module (code review fixes)
    - Code review fixes for Epic 9 (10 issues resolved)
    
    ### Refactoring
    
    - Clean up RAG module and sync architecture
    
    ### Documentation
    
    - Update README for v0.2.0: RAG, run-scoped outputs, new commands
    
    **Full Changelog**: https://github.com/r-via/anatoly/compare/v0.1.1...v0.2.0