How to reduce repeated repo crawls by coding agents

The point is not to make agents read less source. The point is to stop spending context on the same broad orientation crawl every session.

Technical blog | Pain, optimization, and cost reduction | Published May 24, 2026 | Updated May 25, 2026

// direct answer

Short answer

Repeated repo crawls are reduced by replacing the broad first pass with a maintained routing bundle, then spending the saved attention on current source, tests, evidence, and any stale or suspect areas.

Reduce cold-start orientation work while keeping source review strict where it matters.

// real failure mode

The failure mode

Every new session repeats the same repo walk: README, package files, source tree, tests, docs, and old summaries.

That crawl is sometimes useful, but it is also noisy. It does not tell the agent which notes are stale or which module summaries are only heuristic.

The waste is not only tokens. It is attention spent rediscovering project shape instead of checking the code path that matters.

// repo-local model

The repo-local fix

Make a routing bundle the first operational read, in place of the broad first pass.

Use the bundle to find target modules, trust status, critical files, and maintenance reports.

Then spend the source-reading budget where it matters: current code, tests, evidence, and any stale or suspect area.
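
As a minimal sketch of that route-first read, an agent-side helper might match a task to modules and flag the ones whose notes should not be trusted as written. The bundle shape here is an assumption: the `modules`, `trust`, `keywords`, and `criticalFiles` fields and the `routeTask` helper are hypothetical and are not the actual schema of `routing_bundle.json`.

```javascript
// Hypothetical bundle shape for illustration only; the real
// routing_bundle.json schema may differ.
const exampleBundle = {
  modules: [
    { name: "billing", trust: "fresh", keywords: ["invoice", "payment"], criticalFiles: ["src/billing/charge.js"] },
    { name: "auth", trust: "stale", keywords: ["login", "session"], criticalFiles: ["src/auth/session.js"] },
    { name: "reports", trust: "suspect", keywords: ["export", "invoice"], criticalFiles: ["src/reports/export.js"] },
  ],
};

// Return the modules whose keywords appear in the task description.
function routeTask(bundle, taskDescription) {
  const words = new Set(taskDescription.toLowerCase().split(/\W+/).filter(Boolean));
  return bundle.modules
    .filter((m) => m.keywords.some((k) => words.has(k)))
    .map((m) => ({
      name: m.name,
      readFirst: m.criticalFiles,
      // Source and tests are read either way; this flag only marks
      // notes that should not be relied on as written.
      notesNeedRecheck: m.trust !== "fresh",
    }));
}

const plan = routeTask(exampleBundle, "Fix rounding in invoice export");
console.log(JSON.stringify(plan, null, 2));
```

The helper only narrows where to look. It returns files to open first, not a reason to skip them.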

// concrete example

A cautious measurement method

Measure the baseline cold-start path first: the files an agent tends to read before it knows where the task belongs.

Then compare it with a route-first path: routing_bundle.json, the relevant module card, current source/tests, and evidence for the task.

Keep the claim narrow. A local smoke estimate can show orientation reduction, but it should not be sold as universal token savings for every repository.
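
The comparison above can be sketched as a small script. This is a rough illustration under stated assumptions: the four-characters-per-token heuristic is a crude stand-in for a real tokenizer, the character counts are invented for the example, and `compareOrientation` is a hypothetical helper, not part of the `.knowledge` tooling.

```javascript
// Crude estimator: roughly 4 characters per token. This is an assumption
// for smoke comparisons, not a tokenizer-accurate count.
function estimateTokens(charCount) {
  return Math.ceil(charCount / 4);
}

// Sum estimated tokens over a list of { path, chars } entries.
function pathCost(files) {
  return files.reduce((sum, f) => sum + estimateTokens(f.chars), 0);
}

// Compare two orientation paths. Task source and test files are left
// out on purpose: both paths must still read them.
function compareOrientation(baseline, routeFirst) {
  const before = pathCost(baseline);
  const after = pathCost(routeFirst);
  return {
    filesBefore: baseline.length,
    filesAfter: routeFirst.length,
    tokensBefore: before,
    tokensAfter: after,
    // Negative values mean the route-first path cost more.
    reductionPct: before === 0 ? 0 : Math.round(((before - after) / before) * 1000) / 10,
  };
}

// Hypothetical character counts, for illustration only.
const baselinePath = [
  { path: "README.md", chars: 8000 },
  { path: "package.json", chars: 2000 },
  { path: "docs/architecture.md", chars: 12000 },
  { path: "notes/old-summary.md", chars: 6000 },
];
const routeFirstPath = [
  { path: ".knowledge/maintenance/routing_bundle.json", chars: 9000 },
  { path: "module card (hypothetical path)", chars: 3000 },
];

const result = compareOrientation(baselinePath, routeFirstPath);
console.log(result);
```

The output describes only these made-up inputs. A real comparison would record the files an agent actually opened in each mode, on the same repository and the same task.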

// repo proof

Repo proof to inspect

Route-first entrypoint

`.knowledge/maintenance/routing_bundle.json`

The bundle replaces an aimless first crawl with a smaller first-read path.

Use it for

Choosing module and source paths
Finding stale and suspect knowledge

Do not use it for

Avoiding source review
Claiming exact token savings

Local metrics

`.knowledge/metrics/`

Metrics make orientation claims inspectable instead of anecdotal.

Use it for

Comparing local runs
Tracking health and context estimates

Do not use it for

Publishing universal benchmarks
Comparing unrelated repos without caveats

Benchmark notes

`.knowledge/docs/metrics-benchmarks.md`

Benchmark docs explain the limits of the estimator and the scope of the result.

Use it for

Documenting methodology
Keeping claims cautious

Do not use it for

Guaranteeing production savings

PR summary

`.knowledge/maintenance/pr_summary.md`

The PR summary turns trust and repair state into reviewable output.

Use it for

Review handoff
Checking what changed since the last run

Do not use it for

Replacing reviewer judgment

// command transcript

Commands and expected checks

node .knowledge/tools/collect-metrics.js

expected: Local health, token estimates, file counts, and graph metrics are collected.
inspect next: .knowledge/metrics/ and .knowledge/docs/metrics-benchmarks.md
caution: Treat estimates as local smoke data unless validated with a tokenizer-specific benchmark.

node .knowledge/tools/flow.js release --no-color

expected: The route-first artifacts are rebuilt and checked together.
inspect next: .knowledge/maintenance/quality_report.json
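
After running the commands, a quick presence check can confirm there is something to inspect. The sketch below assumes the paths named on this page exist once the tools have run; which command writes which file is not specified here, and `missingArtifacts` is a hypothetical helper rather than part of the shipped tooling.

```javascript
const fs = require("fs");
const path = require("path");

// Paths named on this page. Whether each one exists after a given
// command is an assumption to verify locally.
const EXPECTED_ARTIFACTS = [
  ".knowledge/maintenance/routing_bundle.json",
  ".knowledge/maintenance/quality_report.json",
  ".knowledge/maintenance/pr_summary.md",
  ".knowledge/metrics",
  ".knowledge/docs/metrics-benchmarks.md",
];

// Return the expected artifacts that are missing under repoRoot.
// The exists parameter is injectable so the check can be exercised
// without touching the file system.
function missingArtifacts(repoRoot, exists = fs.existsSync) {
  return EXPECTED_ARTIFACTS.filter((rel) => !exists(path.join(repoRoot, rel)));
}

const missing = missingArtifacts(process.cwd());
if (missing.length > 0) {
  console.log("Missing artifacts:\n" + missing.join("\n"));
} else {
  console.log("All expected artifacts are present.");
}
```

A file being present says nothing about its freshness or correctness; that still comes from reading the reports themselves.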

Before / after repo-local proof

First-orientation path

01 Before: README -> manifests -> source tree -> tests -> docs -> old summaries

02 After: routing_bundle.json -> target module -> source/tests -> evidence

03 Synthetic SaaS-shaped fixture: 14 orientation files -> 1 routing bundle

04 Published smoke result: about 22% estimated context reduction with one local estimator

// field guide

Measurement guardrails

Use metrics to improve workflow, not to overclaim precision.

Metric	Use it for	Do not claim
Orientation file count	Compare broad crawl vs route-first path.	Exact effort saved for every agent.
Estimated context	Spot rough direction and regression risk.	Tokenizer-verified universal savings.
Doctor score	Check knowledge health before handoff.	Proof the product code is correct.

// guardrails

What the agent should not trust blindly

Use the published numbers as order-of-magnitude smoke data, not production benchmarks.
Do not claim universal token savings. Tiny repositories can show overhead because the routing bundle has fixed structure cost.
Do not claim correctness from routing. Routing reduces aimless orientation; source and tests still decide behavior.
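
The fixed-cost point can be illustrated with simple arithmetic. All numbers below are hypothetical estimates chosen to show the shape of the trade-off, and `netSavings` is an illustrative helper, not a published metric.

```javascript
// Net orientation change when a routing bundle replaces a broad crawl.
// Positive means fewer estimated tokens; negative means overhead.
function netSavings({ crawlTokens, bundleTokens, moduleCardTokens }) {
  return crawlTokens - (bundleTokens + moduleCardTokens);
}

// Hypothetical estimates for two repositories of different size.
const tinyRepo = { crawlTokens: 1500, bundleTokens: 2000, moduleCardTokens: 400 };
const largerRepo = { crawlTokens: 20000, bundleTokens: 2500, moduleCardTokens: 800 };

console.log("tiny repo:", netSavings(tinyRepo));     // -900 (overhead)
console.log("larger repo:", netSavings(largerRepo)); // 16700
```

When the broad crawl is already cheaper than the bundle, routing adds cost instead of removing it, which is why the claim should stay local to the measured repository.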

// common mistakes

Common mistakes

Measuring only tokens and ignoring whether the agent read the right source files.
Comparing one repository's result to another without matching size, language, and task type.
Optimizing away the source read that prevents wrong-file edits.
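
One way to guard against the first and third mistakes is to track read coverage alongside token estimates. The sketch below assumes two hypothetical inputs: a list of files the task required (for example from a reviewer's checklist) and a list of files the agent actually opened (for example from a tool-call log). `readCoverage` is an illustrative helper.

```javascript
// Compare the files an agent actually read with the files the task
// required. Both lists are hypothetical inputs.
function readCoverage(requiredFiles, filesRead) {
  const read = new Set(filesRead);
  const required = new Set(requiredFiles);
  // Required files the agent never opened.
  const missed = requiredFiles.filter((f) => !read.has(f));
  // Files the agent opened that the task did not require.
  const offTarget = filesRead.filter((f) => !required.has(f));
  const coverage =
    requiredFiles.length === 0
      ? 1
      : (requiredFiles.length - missed.length) / requiredFiles.length;
  return { coverage, missed, offTarget };
}

const report = readCoverage(
  ["src/billing/charge.js", "test/billing/charge.test.js"],
  ["README.md", "src/billing/charge.js"]
);
console.log(report);
```

A run that lowers estimated context but leaves required tests in `missed` has optimized the wrong thing.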

// quick FAQ

FAQ

Does .knowledge guarantee lower token use?

No. It is designed to reduce aimless first-orientation work. Actual token impact depends on repository size, task type, model behavior, and how much source must be re-read.

Should an agent read fewer tests after routing is added?

No. Routing should help the agent find the right tests faster. It should not reduce test review for behavior-changing work.

// page-specific next step

Use this page when repeated orientation becomes a visible cost

Measure the broad crawl, switch to route-first onboarding, and keep the claim narrow: less aimless orientation, more targeted source review.
