Operator Workflows
This page covers the day-to-day command workflows you will use after the
workspace is indexed. Commands are shown using the short form that assumes
gather-step is on your PATH. Append --workspace /path/to/workspace to
every command if you have not set the workspace through an environment variable
or config.
For flag-level reference on any command, run gather-step <command> --help.
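If you would rather not repeat the flag by hand, a small shell wrapper can inject it. The sketch below is one way to do that in a POSIX shell. The function name gs and the variable GS_WORKSPACE are invented for this example and are not gather-step features. It places --workspace ahead of the subcommand, matching the position used by the re-index commands later on this page.

```shell
# Hypothetical helper: gs and GS_WORKSPACE are illustrative names only.
gs() {
  if [ -n "${GS_WORKSPACE:-}" ]; then
    # Inject the workspace flag ahead of the subcommand and its arguments.
    gather-step --workspace "$GS_WORKSPACE" "$@"
  else
    # Fall back to whatever the environment or config already provides.
    gather-step "$@"
  fi
}
```

With GS_WORKSPACE set to /path/to/workspace, calling gs status runs gather-step --workspace /path/to/workspace status.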
Inspect a Workspace
Always start here before running deeper analysis. These commands tell you whether the index exists, what it contains, and whether it is healthy enough to trust.
gather-step status
gather-step doctor
gather-step search <QUERY> --limit 20
status
Prints a table of all configured repos with these columns:
| Column | What it shows |
|---|---|
| repo | Logical repo name from gather-step.config.yaml |
| files | Number of source files indexed |
| symbols | Number of named symbols extracted |
| nodes | Graph node count for this repo |
| edges | Graph edge count touching this repo |
| unresolved | Call sites that could not be resolved to a target |
| semantic health | Summary of framework extraction quality |
Add --json to get machine-readable output for scripting.
doctor
Runs a structured health check against the indexed state. It reports issues in five categories:
- workspace — missing repo paths, config validation errors
- dangling edges — edges whose target node no longer exists in the graph
- unresolved inputs — call sites with no confident resolution, surfaced as actionable items
- search projection — nodes that should be in the search index but are absent
- semantic-link — framework-level extraction gaps (for example, a route node with no handler edge)
Run doctor before trusting benchmark results, pack output, or trace results
on an unfamiliar workspace.
search
Searches the indexed symbol space by name or pattern:
gather-step search createOrder --limit 10
gather-step search createOrder --kind Function --limit 5
The --kind flag filters by node kind (for example Function, Class,
Route, Topic). Use search to locate a symbol’s ID before passing it to
trace crud --symbol-id or pack.
Debug a Route Flow (CRUD Trace)
Route tracing answers the question: which frontend caller reaches this backend route, which handler serves it, and what does the request touch downstream?
By route
gather-step trace crud --method POST --path /orders
By backend symbol
gather-step trace crud --symbol-id <SYMBOL_ID>
The output contains:
- frontend callers — symbols in frontend repos that call this route, with evidence labels
- backend handlers — the NestJS (or equivalent) handler node that serves the route
- continuation nodes — services, functions, and methods the handler calls
- entities — schema-like nodes reachable from the continuation path
- persistence hints — database-adjacent nodes with confidence and traversal depth annotations
Evidence labels distinguish how a caller was resolved:
| Label | Meaning |
|---|---|
| literal | The path string appears as a literal in the source |
| imported_constant | The path was traced through an imported constant |
| hint | Heuristic match — treat with lower confidence |
Dynamic endpoints that cannot be safely reduced to a canonical path remain unresolved rather than being silently mislinked.
Map Async Topology (Events)
Event commands give you visibility into the Kafka event topology baked into the code graph. Producers and consumers are modeled as first-class nodes, so cross-repo event flows become graph traversals rather than text searches.
gather-step events trace order.created
gather-step events blast-radius order.created --depth 2
gather-step events orphans
events trace <SUBJECT>
Follows the event from every producer to every consumer. Output identifies which repos emit the event, which repos handle it, and the inferred payload contract on each side. Use this when debugging a missing event or checking that the consumer set is what you expect.
events blast-radius <SUBJECT> --depth <N>
Expands the graph outward from the event node up to --depth hops. Each hop
follows downstream PropagatesEvent and Consumes edges and records the
accumulated confidence at each level. Use this to understand how many repos
are transitively affected when a topic changes.
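A depth-bounded expansion of this kind can be pictured as a breadth-first walk that carries a confidence value along each path. The sketch below illustrates that idea with awk, and is not gather-step's implementation. It reads hypothetical "from to confidence" edge lines on stdin, treats every edge alike instead of filtering to PropagatesEvent and Consumes, and assumes confidence accumulates by multiplication, which this page does not specify. The function name blast_radius is also invented.

```shell
# Illustration only: blast_radius and the edge-list format are hypothetical.
# $1 = start node, $2 = max depth, stdin = "from to confidence" per line
blast_radius() {
  awk -v start="$1" -v depth="$2" '
    # Load every well-formed edge into an adjacency list keyed by source node.
    NF >= 3 { n = ++cnt[$1]; to[$1, n] = $2; conf[$1, n] = $3 }
    END {
      seen[start] = 1
      frontier[1] = start; fconf[1] = 1; size = 1
      for (d = 1; d <= depth && size > 0; d++) {
        nsize = 0
        for (i = 1; i <= size; i++) {
          node = frontier[i]
          for (j = 1; j <= cnt[node]; j++) {
            t = to[node, j]
            if (t in seen) continue   # first path to reach a node wins
            seen[t] = 1
            c = fconf[i] * conf[node, j]   # assumed rule: multiply along the path
            printf "%d %s %.2f\n", d, t, c
            nextf[++nsize] = t; nextc[nsize] = c
          }
        }
        # The nodes found at this depth become the frontier for the next hop.
        for (i = 1; i <= nsize; i++) { frontier[i] = nextf[i]; fconf[i] = nextc[i] }
        size = nsize
      }
    }'
}
```

Each output line gives the hop count, the node reached, and the accumulated confidence. Nodes beyond the requested depth are never visited, which is what keeps the walk bounded.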
events orphans
Lists events that have producers but no consumers, or consumers but no producers, in the indexed workspace. These are candidates for dead-code review or missing-handler investigation.
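The orphan rule amounts to a symmetric difference between the set of subjects that have a producer and the set that have a consumer. The sketch below illustrates that on two plain-text lists using standard POSIX tools. The function name find_orphans, the input files, and the output labels are hypothetical. gather-step derives the same answer from the graph, not from text files.

```shell
# Illustration only: find_orphans and its input files are hypothetical.
# $1 = file listing subjects that have a producer, one per line
# $2 = file listing subjects that have a consumer, one per line
find_orphans() (
  tmp=$(mktemp -d) || exit 1
  # comm requires sorted input; -u also drops duplicate subjects.
  sort -u "$1" > "$tmp/produced"
  sort -u "$2" > "$tmp/consumed"
  # Lines unique to the first list: produced but never consumed.
  comm -23 "$tmp/produced" "$tmp/consumed" | sed 's/^/no-consumer /'
  # Lines unique to the second list: consumed but never produced.
  comm -13 "$tmp/produced" "$tmp/consumed" | sed 's/^/no-producer /'
  rm -rf "$tmp"
)
```

Subjects that appear in both lists are dropped, so only the one-sided events remain for review.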
For architectural background on how events are modeled, see Concepts: event topology.
Estimate Change Impact
Two commands address change impact at different levels of depth.
Lightweight cross-repo view
gather-step impact createOrder
impact performs a bounded graph traversal from the named symbol and returns
a list of nodes in other repos that are reachable through dependency edges. It
is fast and good for a quick sanity check before a refactor.
Full context pack with ranked files
gather-step pack createOrder --mode change_impact
pack --mode change_impact runs a heavier analysis that returns ranked
relevant files, semantic bridges connecting the target to its consumers, a
list of identified gaps (for example, unresolved edges), and suggested next
steps. Use this when you need to communicate blast radius to a reviewer or
feed it to an AI coding assistant.
Build Context Packs for AI Assistants
Context packs are the primary surface for preparing task-shaped context. A pack bundles the graph neighborhood relevant to a target into a bounded, ranked response rather than a raw graph dump.
Basic syntax
gather-step pack <TARGET> --mode <MODE>
Supported modes
| Mode | Best for |
|---|---|
| planning | Estimating scope and identifying dependencies before starting work |
| debug | Investigating a broken behavior with relevant call paths highlighted |
| fix | Focused context for applying a targeted fix |
| review | Summarizing what changed and what it touches for review preparation |
| change_impact | Blast-radius analysis before a refactor or API change |
Additional flags
gather-step pack createOrder --mode planning --limit 50 --depth 3 --budget-bytes 65536
| Flag | Effect |
|---|---|
| --limit <N> | Maximum number of ranked items to include |
| --depth <N> | Maximum traversal depth from the target node |
| --budget-bytes <N> | Hard size cap on the response, useful when feeding output to a context window |
| --repo <NAME> | Restrict the pack to a single configured repo |
The response includes ranked relevant items, semantic bridge nodes (cross-repo connectors), next-step suggestions generated from graph structure, and a list of unresolved gaps. For a deeper explanation of how packs are assembled, see Concepts: context packs.
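When a pack is headed for a context window, it can be worth confirming the size of what was actually written. The sketch below wraps the documented pack flags and measures the saved output with wc. The function name pack_with_budget is hypothetical, and the check assumes the cap applies to the bytes emitted on stdout, which this page does not spell out.

```shell
# Hypothetical wrapper: pack_with_budget is not a gather-step command.
# $1 = target, $2 = mode, $3 = byte budget, $4 = output file
pack_with_budget() {
  gather-step pack "$1" --mode "$2" --budget-bytes "$3" --json > "$4" || return 1
  # wc -c counts bytes; tr strips the padding some wc builds add.
  size=$(wc -c < "$4" | tr -d ' ')
  if [ "$size" -gt "$3" ]; then
    echo "pack is $size bytes, over the $3-byte budget" >&2
    return 2
  fi
  # Report the measured size so callers can log it.
  echo "$size"
}
```

For example, pack_with_budget createOrder planning 65536 pack.json saves the pack to pack.json and prints its size in bytes.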
Export QA Planning Evidence
gather-step qa-evidence createOrder --base main --head feature/create-order --json
Use qa-evidence when a downstream planning workflow needs grounded code
evidence for a QA reference. The command emits normalized rows from
planning, review, and change_impact packs, plus local feature-flag and
existing-test signals, using the same canonical evidence metadata shape as the
underlying producers. It also surfaces dynamic feature-flag and scan-limit gaps
instead of hiding incomplete coverage. It intentionally stops at evidence:
requirement interpretation, test-case prose, and reviewer prompts belong in the
planning tool that consumes the manifest.
Generate Derived Artifacts
gather-step generate claude-md
gather-step generate codeowners
generate claude-md
Generates .agent-context/gather-step/*.md files from the live graph state.
The output files summarize system architecture, routes, and events in a format
that can be committed to the repository. They are loaded on demand through
an installed skill (.claude/skills/gather-step-context/SKILL.md for Claude
Code, .agents/skills/gather-step-context/SKILL.md for Codex) instead of
being pulled into every session, plus a small .claude/rules/gather-step-index.md
pointer that tells the agent when to invoke the skill. Because the data files
are derived from the indexed graph rather than maintained by hand, they stay in
sync with the codebase as the graph is refreshed.
The generator applies a byte budget so the output stays within practical context-window limits.
generate codeowners
Generates a CODEOWNERS-format file derived from ownership signals in the indexed graph. Use this as a baseline for repository ownership configuration.
Keep the Index Fresh
Manual incremental re-index
gather-step --workspace /path/to/workspace index
Re-running index on an already-indexed workspace is incremental. It compares
current file hashes against stored state, re-parses only changed files and
their dependents, and reconciles the graph. You do not need to clean first.
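The hash comparison at the heart of an incremental pass can be pictured with standard tools. The sketch below illustrates the idea and is not gather-step's implementation. It uses the POSIX cksum utility as a stand-in for whatever hash the indexer stores, and both changed_files and the manifest layout are hypothetical. It does not model the dependents step or deleted files.

```shell
# Illustration only: changed_files and the manifest layout are hypothetical.
# $1 = manifest saved by an earlier run (one cksum output line per file)
# remaining arguments = files to check now
changed_files() (
  manifest=$1
  shift
  for f in "$@"; do
    current=$(cksum "$f")
    # A file counts as unchanged only if its exact cksum line is in the manifest.
    if ! grep -qxF -- "$current" "$manifest"; then
      echo "$f"
    fi
  done
)
```

A manifest for this sketch can be produced with cksum followed by the same file paths, redirected to a file. Only the paths printed by changed_files would need re-parsing.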
Live watch mode
gather-step --workspace /path/to/workspace watch
watch starts a file-system watcher that applies incremental indexing
automatically as files change. Operational details:
- debounce — events are batched over a short window before triggering re-indexing to avoid thrashing on rapid saves.
- overflow rescan — if the event queue overflows (burst of many changes at once), the watcher schedules a repo-wide incremental pass rather than silently missing updates.
- repo-level backoff — if a repo produces consecutive indexing errors, it is temporarily suppressed rather than retried in a tight loop.
- clean shutdown — Ctrl+C cleanly stops the watcher, stops the local daemon, and emits the final status summary; pending queued changes are not guaranteed to be indexed before exit.
Use watch during active development sessions when you want CLI and MCP
answers to reflect current code without manual re-indexing.
Full reindex
gather-step --workspace /path/to/workspace reindex
Deletes and rebuilds the full index in one command. Use this after large-scale refactors, config changes, or when incremental state has drifted.
Compact generated state
gather-step --workspace /path/to/workspace compact
Compacts the generated graph and metadata stores without deleting indexed
state. Use it after large reindexes or long watch-mode sessions when you want
to compress .gather-step/ storage but keep CLI and MCP queries available.
Clean Local State
gather-step --workspace /path/to/workspace clean --yes
Removes everything under .gather-step/. The --yes flag is required to skip
the interactive confirmation prompt. When using --json output, --yes is
also required so that automated pipelines cannot hang on a prompt.
Source repositories are not affected. Only generated index state is removed.
Release Validation And Benchmarks
The benchmark harness lives in the gather-step-bench binary. It is primarily
for release work, not day-to-day operator queries.
gather-step-bench pr-oracle build-sample --help
gather-step-bench pr-oracle score --help
gather-step-bench release-gate --help
The release gate runs a real-workspace index with gather-step index --release-gate --artifact-path ..., then checks high-contract probes,
planning-pack quality, event tracing, change-impact parity, and PR-oracle
scores. Operators pass explicit planning, event, and impact targets so the
gate cannot accidentally reuse one target shape for all checks.
The v3.5.0 release benchmark uses a 31-repo workspace as the scale baseline. Local benchmark artifacts are not checked into the docs.
| Metric | Value |
|---|---|
| v3.5.0 release result | PASS |
| Link-quality tasks | 3 / 3 passing |
| Planning oracle | 25 / 25 passing (coverage 1.000, p50 3 ms / p95 8 ms / p99 15 ms) |
| Python planning | 1 / 1 passing |
| Workspace scale | 31 repos, 14,296 files, 216,663 symbols, 484,379 edges, 96,787 cross-repo |
| Storage thresholds | graph / metadata / search / total all under release-gate caps |
JSON-First Output
Every command supports --json for machine-readable output. This is useful
for piping results into other tools, scripting workflows, and feeding output
to an AI assistant as structured data.
Three flags apply broadly across all commands:
| Flag | Effect |
|---|---|
| --json | Emit JSON instead of human-formatted tables and text |
| --no-banner | Suppress the startup banner (useful in scripted contexts) |
| -v / --verbose | Increase log verbosity for debugging |
The --repo <NAME> flag is also accepted by most commands to scope output to
a single configured repo.
Review a PR
pr-review builds a disposable review index for any two refs and returns a structured delta report covering the changed surfaces across every affected repo.
Basic usage
gather-step pr-review --base main --head feat/my-change
The command:
- Resolves --base and --head to SHAs.
- Expands the affected repo set from changed files (direct path match, shared-package indicators, and reverse-dependent repos).
- Indexes the head branch into a disposable storage location.
- Computes the delta report and writes it to stdout (human-formatted by default, --json for machine-readable).
Key flags
| Flag | Effect |
|---|---|
| --base <REF> | Base ref — the PR target, typically main |
| --head <REF> | Head ref — the PR branch |
| --keep-cache | Preserve the review index for follow-up impact/trace/pack queries |
| --severity warn\|strict\|pedantic | Threshold for non-zero exit. warn always returns the report. |
| --json | Emit the DeltaReport as JSON |
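Because --severity controls when the command exits non-zero, it can act as a CI gate. The sketch below shows one way to wire that up using only the flags listed above. The function name review_gate and the report path are hypothetical. This page does not document which findings trip each threshold or which exit codes are used, so the sketch only saves the report and passes along whatever status pr-review returned.

```shell
# Hypothetical CI step: review_gate and the report path are illustrative.
# $1 = base ref, $2 = head ref, $3 = severity (warn, strict, or pedantic), $4 = report file
review_gate() {
  if gather-step pr-review --base "$1" --head "$2" --severity "$3" --json > "$4"; then
    echo "pr-review passed at severity $3"
  else
    rc=$?
    # Keep the JSON report on disk and hand the original exit status to the caller.
    echo "pr-review exited $rc at severity $3; see $4" >&2
    return "$rc"
  fi
}
```

A pipeline could call review_gate main feat/my-change strict report.json and fail the job when the function returns non-zero.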
Reading the report
The report sections are:
| Section | What it shows |
|---|---|
| changed_files | Repo-relative paths changed in merge_base..head |
| routes | Added / removed / changed HTTP routes with handler info and downstream impact |
| symbols | Added / removed / changed exported symbols; flags signature_changed and visibility_changed |
| payload_contracts | Field-level diffs: added, removed, type-changed, optional-required flips |
| events | Producer/consumer set diffs across topic, queue, subject, stream, and event virtual nodes |
| decorators | Permission, audit, and authorization decorator changes |
| contract_alignments | Cross-repo clusters of related payload contracts with confidence scores |
| removed_surface_risks | Removed routes / symbols / events with surviving consumers, classified high / medium / low |
| evidence | Canonical, query-time evidence rows derived from the typed delta surfaces (closed kind/source enums, structured citations, deterministic IDs) |
| deployment | Deployment-topology changes: Dockerfiles, Compose services, K8s manifests, env vars, secrets, config maps, GitHub Actions deploy jobs |
| suggested_followups | Ready-to-run gather-step pack and trace crud commands grounded in the actual delta when available; otherwise explicit placeholders (SYMBOL_PLACEHOLDER, /ROUTE_PATH_PLACEHOLDER) are emitted so it is clear they need substitution |
Follow-up queries against the kept index
When --keep-cache is set, the suggested_followups field includes commands pre-filled with --registry / --storage overrides pointing at the kept review index. Run them as-is to inspect PR-branch state rather than workspace baseline:
gather-step pr-review --base main --head feat/my-change --keep-cache --json
# then, from suggested_followups:
gather-step pack <TARGET> --mode review --registry <REVIEW_REGISTRY> --storage <REVIEW_STORAGE>
Cleaning up artifacts
Without --keep-cache, the review index is deleted after the report is returned. To manage kept artifacts:
gather-step pr-review clean --dry-run # list every kept artifact for this workspace
gather-step pr-review clean --older-than 7d # prune stale artifacts
gather-step pr-review clean --all # wipe all review artifacts
For a step-by-step walkthrough, see the PR review guide.
Next Steps
- PR review guide — step-by-step walkthrough of the review workflow.
- MCP clients — expose the same graph to an AI coding assistant through the stdio MCP server.
- CLI reference — complete command and flag documentation.
- Concepts: polyrepo graph — how cross-repo stitching works under the hood.