gather-step MCP tools reference
The gather-step serve command exposes a local stdio MCP server over the indexed workspace graph.
In normal use, engineers do not call these tools manually. An MCP-aware assistant selects them automatically based on the task. This page exists as a reference so the tool surface is explicit, inspectable, and easier to debug.
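When debugging, it can help to see roughly what an assistant's tool call looks like on the wire. The sketch below builds an MCP `tools/call` request as a JSON-RPC 2.0 message and frames it as a single newline-terminated line, as the stdio transport expects. The helper name `build_tool_call` is hypothetical, and sending an empty `arguments` object to get_graph_schema is an assumption rather than something this page specifies.

```python
import json


def build_tool_call(request_id, tool_name, arguments=None):
    """Return one newline-terminated JSON-RPC 2.0 message for an MCP tools/call."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # Assumption: tools that take no inputs accept an empty arguments object.
        "params": {"name": tool_name, "arguments": arguments or {}},
    }
    # Compact encoding keeps the whole message on a single line.
    return json.dumps(message, separators=(",", ":")) + "\n"


line = build_tool_call(1, "get_graph_schema")
print(line, end="")
```

A real client also performs the MCP initialize handshake before any tool call; that step is omitted here.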
Tool Groups
- Orientation: understand what is indexed before deeper queries
- Search and traversal: find symbols and walk local call relationships
- Topology and impact: trace routes, events, deployments, and cross-repo blast radius
- PR review: build a disposable review index for a PR branch and return a structured delta report
- Contracts: inspect payload shape and producer-consumer drift
- Context retrieval: return short summaries, combined context, and task-shaped packs
- Repo intelligence: inspect ownership, dead code, conventions, and repo summaries
Orientation
get_graph_schema
“What kinds of graph nodes and edges are available in this workspace?”
Used automatically at the start of a new session when the assistant needs a compact view of the indexed graph shape before issuing more specific calls.
get_graph_schema_summary
“Give me the quick schema summary for this workspace.”
Alias of get_graph_schema. It exists for client compatibility and returns the same compact graph-shape summary.
list_repos
“Which repositories are indexed right now, and are they fresh?”
Used automatically when the assistant needs to confirm repo coverage, framework detection, file counts, symbol counts, and freshness before trusting later answers.
Search and Traversal
search
“Find createOrder in the indexed workspace.”
Used automatically to locate symbols, routes, topics, types, or files before deeper analysis. This is often the first step before get_symbol, trace_route, or a context pack.
get_symbol
“Show me the stored metadata for this symbol ID.”
Used automatically after search when the assistant needs the exact symbol record, including repo, file path, source span, and other stored metadata.
get_callers
“What calls into this function or method?”
Used automatically when the assistant is debugging upstream entry points, understanding who depends on a function, or preparing review context.
get_callees
“What does this function call downstream?”
Used automatically when the assistant needs the direct delegated work of a function, method, or handler before tracing broader impact.
Topology and Impact
trace_impact
“What features, repos, or pages could be affected if I change this symbol?”
Used automatically to estimate cross-repo blast radius through routes, events, queues, topics, shared symbols, and other virtual graph surfaces.
trace_event
“Who produces and who consumes order.created?”
Used automatically when the assistant needs the producer-consumer map for an event-like target across one or more repos.
trace_route
“Which clients call POST /orders, and which handler serves it?”
Used automatically when the assistant needs the route surface for a known HTTP method and path, including callers and handlers attached to the same route node.
crud_trace
“Show me the end-to-end flow for POST /orders, including callers, handlers, and persistence touchpoints.”
Used automatically when the assistant needs a fuller request-path trace than trace_route, especially for debugging or implementation planning.
event_blast_radius
“If this event changes, what downstream code is likely affected?”
Used automatically when the assistant needs a transitive downstream walk from an event-like node rather than only the direct producer and consumer list.
list_orphan_topics
“Which topics or events have only producers or only consumers?”
Used automatically for event-topology audits, dead-path investigation, and integration checks where the assistant needs to surface incomplete or stale async wiring.
cross_repo_deps
“What other repositories does this repo depend on through shared graph surfaces?”
Used automatically when the assistant needs repo-level dependency structure before a refactor, migration, or deployment-isolation discussion.
where_deployed
“Where is service api deployed?”
Used automatically when the assistant needs concrete deployment evidence for a service-like workload. The request accepts service, optional repo, and optional limit (1-100).
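As a minimal sketch of the request shape described above, the helper below assembles a where_deployed arguments object and enforces the documented 1-100 range on `limit` before anything is sent. The function name `where_deployed_args` is hypothetical; only the field names `service`, `repo`, and `limit` come from this page.

```python
def where_deployed_args(service, repo=None, limit=None):
    """Build the arguments object for where_deployed from the documented fields."""
    if not service:
        raise ValueError("service is required")
    args = {"service": service}
    if repo is not None:
        args["repo"] = repo
    if limit is not None:
        # The page documents limit as an optional value in the range 1-100.
        if not 1 <= limit <= 100:
            raise ValueError("limit must be between 1 and 100")
        args["limit"] = limit
    return args


print(where_deployed_args("api", limit=25))
```

Validating on the client side is a convenience for catching mistakes early; the server remains the authority on what it accepts.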
service_env
“Which env vars does service worker read?”
Used automatically before env var changes or deployment-sensitive implementation work. Values from env files are not returned; Gather Step indexes names only.
env_var_consumers
“Which services consume DATABASE_URL?”
Used automatically to find deployment-level env var consumers across indexed Docker, Compose, Kubernetes, Kustomize, explicit Helm chart, GitHub Actions, configured env-file artifacts, and Compose env_file references.
undeployed_services
“Which indexed services have no deployment edge?”
Used automatically when planning needs to distinguish code/service nodes from deployable runtime owners.
deployed_but_no_code
“Which deployments have no connected service/source evidence?”
Used automatically for deployment-topology audits, especially after service renames, repo splits, or GitOps drift.
shared_infra
“What shared brokers or databases appear in deployment config?”
Used automatically when the assistant needs runtime-adjacent infrastructure names before planning a change.
get_shared_type_usage
“Where is this shared type used across the workspace?”
Used automatically when the assistant needs repo and file usage for a shared symbol or DTO before changing that contract.
PR Review
pr_review
“Review this PR using gather-step.”
Used automatically when the user asks to review a pull request, do a structural PR review, check what a branch changed, or analyze cross-repo impact of a PR. Trigger phrases include “review this PR”, “review the PR using gather-step”, “do a code review”, “what does this PR change”, “analyze the impact of branch X”.
Inputs.
| Field | Type | Required | Description |
|---|---|---|---|
| base | string | yes | Base ref (branch name, tag, or full SHA). The PR’s target branch, typically main. |
| head | string | yes | Head ref (branch name, tag, or full SHA). The PR’s source branch. |
| config | string | no | Path to a gather-step.config.yaml, absolute or relative to the workspace root. Use this when the reviewed repo does not commit its own config, or when reviewing a child repo with a parent workspace config. |
| cache_root | string | no | Override the OS cache root used for review artifacts, absolute or relative to the workspace root. |
| keep_cache | bool | no | Preserve the review artifact for follow-up impact/trace/pack queries. Default: false; the artifact is deleted after the report is returned. |
| severity | string | no | One of warn (default), strict, pedantic. strict and pedantic cause a non-zero exit on threshold violations; warn always returns the report regardless. |
| no_baseline_check | bool | no | Suppress the warning emitted when the workspace HEAD does not match base. |
| timeout_secs | integer | no | Child-process timeout in seconds, capped by the server. |
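The input table can be sketched as a small argument builder. This is an illustration only: the helper `pr_review_args` and the branch name `feature/orders` are hypothetical, while the field names and severity values are taken from the table above.

```python
SEVERITIES = {"warn", "strict", "pedantic"}
OPTIONAL_FIELDS = {
    "config", "cache_root", "keep_cache",
    "severity", "no_baseline_check", "timeout_secs",
}


def pr_review_args(base, head, **optional):
    """Build a pr_review arguments object using only the documented fields."""
    if not base or not head:
        raise ValueError("base and head are required")
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # severity defaults to warn when omitted, per the table.
    if optional.get("severity", "warn") not in SEVERITIES:
        raise ValueError(f"severity must be one of {sorted(SEVERITIES)}")
    return {"base": base, "head": head, **optional}


args = pr_review_args("main", "feature/orders", keep_cache=True, severity="strict")
print(args)
```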
Returns. A JSON DeltaReport (schema_version: 1) with these top-level sections:
- metadata: base/head SHAs, checkout mode, indexed repos, elapsed time, warnings (e.g., baseline-vs-base mismatch).
- safety: review storage path, run id, cleanup policy, cache key, config hash.
- changed_files: list of repo-relative paths changed in merge_base..head.
- evidence: canonical evidence rows computed from the typed delta surfaces at query time.
- routes: added / removed / changed HTTP routes by (method, canonical_path). Carry handler info via Serves edges and downstream impact summaries.
- symbols: added / removed / changed exported symbols by (repo, qualified_name). Detects signature_changed and visibility_changed flags. Removed and changed surfaces carry impact summaries.
- payload_contracts: field-level diffs (added / removed / type-changed / optional-required flips). Removed and changed contracts can carry impact summaries.
- events: producer/consumer set diffs across Topic, Queue, Subject, Stream, and Event virtual nodes.
- decorators: added / removed / changed permission, audit, and authorization decorators.
- contract_alignments: cross-repo clusters of related payload contracts with confidence scores.
- removed_surface_risks: removed routes / symbols / events with surviving consumers, classified by severity (high/medium/low).
- deployment: added / removed / changed deployment-topology surfaces (Dockerfiles, Compose services, K8s manifests, env vars, secrets, config maps, brokers, databases, GitHub Actions deploy jobs).
- suggested_followups: synthesized gather-step pack and gather-step trace crud commands for the highest-impact deltas.
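When inspecting a returned report by hand, a quick structural check can confirm which of the sections listed above are present. The sketch below assumes that `schema_version` and each section name appear as top-level keys of the parsed JSON object; the function name `missing_sections` and the stub report are hypothetical.

```python
DELTA_REPORT_SECTIONS = (
    "metadata", "safety", "changed_files", "evidence", "routes", "symbols",
    "payload_contracts", "events", "decorators", "contract_alignments",
    "removed_surface_risks", "deployment", "suggested_followups",
)


def missing_sections(report):
    """Return the documented top-level sections absent from a parsed DeltaReport."""
    # Assumption: schema_version is a top-level key of the report object.
    if report.get("schema_version") != 1:
        raise ValueError("expected a DeltaReport with schema_version 1")
    return [name for name in DELTA_REPORT_SECTIONS if name not in report]


# Hypothetical stub standing in for a real report.
stub = {"schema_version": 1, "metadata": {}, "routes": {}}
print(missing_sections(stub))
```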
Hard invariants.
- The workspace’s normal .gather-step/storage and .gather-step/registry.json are never modified.
- Review artifacts live under the OS cache directory by default (<cache>/gather-step/pr-review/<workspace_hash>/<run_id>/).
- The baseline index is checked against the resolved --base SHA; mismatches surface as a metadata.warnings entry rather than a hard error.
Latency. First runs take ~30-90 seconds because a fresh review index is built. Cache-hit runs complete in 1-2 seconds when a retained matching artifact exists for the same (base_sha, head_sha) pair.
Cleanup. Without keep_cache, the artifact is removed when the report is returned. With keep_cache, the artifact survives until pr-review clean is run (or the OS cache root is cleared). The suggested_followups field includes commands pre-filled with --registry / --storage overrides pointing at the kept review index.
Implementation note. The MCP tool shells out to the gather-step binary’s pr-review subcommand. The binary must be on PATH or in the same directory as the MCP server.
pr_review_set
“Review this related PR set using gather-step.”
Used automatically when the user asks to review multiple related pull requests, a PR stack, or a cross-repo PR set. It shells out to gather-step pr-review with either --pr-set <manifest> or --from-gh <query>, and always requests JSON output.
Inputs.
| Field | Type | Required | Description |
|---|---|---|---|
| pr_set | string | required unless from_gh is set | Path to a PR-set manifest, absolute or relative to the workspace root. Mutually exclusive with from_gh. |
| from_gh | string | required unless pr_set is set | GitHub search query to resolve into a temporary PR-set manifest using gh pr list. Mutually exclusive with pr_set. |
| set_id | string | no | Override the manifest id in the emitted report. |
| parallelism | integer | no | Number of independent entries to review in parallel. Dependency levels still run in order. |
| allow_unknown_repos | bool | no | With from_gh, include PRs whose GitHub repo is not listed in the workspace config. Default: false. |
| config | string | no | Path to a gather-step.config.yaml, absolute or relative to the workspace root. Use this for child repos that rely on a parent workspace config. |
| cache_root | string | no | Override the OS cache root used for review artifacts, absolute or relative to the workspace root. |
| keep_cache | bool | no | Preserve each child review artifact for follow-up queries. |
| severity | string | no | One of warn (default), strict, pedantic. |
| no_baseline_check | bool | no | Suppress baseline-vs-workspace HEAD mismatch warnings for each child review. |
| timeout_secs | integer | no | Child-process timeout in seconds, capped by the server. |
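The mutual-exclusion rule between pr_set and from_gh is easy to get wrong when assembling a request by hand. A minimal sketch of that check follows; the helper name `pr_review_set_source` and the manifest path are hypothetical.

```python
def pr_review_set_source(pr_set=None, from_gh=None):
    """Return the single source field for pr_review_set, rejecting zero or two sources."""
    # Exactly one of the two fields must be provided.
    if (pr_set is None) == (from_gh is None):
        raise ValueError("set exactly one of pr_set or from_gh")
    if pr_set is not None:
        return {"pr_set": pr_set}
    return {"from_gh": from_gh}


# Hypothetical manifest path, relative to the workspace root.
print(pr_review_set_source(pr_set="reviews/orders-set.yaml"))
```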
Returns. A JSON MultiPrDeltaReport (schema_version: 0) with:
- metadata: set id, manifest version/path, completed/failed/skipped counts, and set fingerprint.
- prs: each completed child DeltaReport with entry id, repo, PR number, and base/head.
- errors: failed and dependency-skipped entries.
- cross_pr.contract_drifts: producer payload-contract changes in the set that lack a matching consumer PR.
- threshold_exceeded: true when any completed child review crossed the requested severity mode.
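A set report can be condensed into a few counts for a quick read. The sketch below assumes that prs, errors, and cross_pr.contract_drifts are JSON arrays and that contract_drifts is nested under a cross_pr object; the function name `summarize_set_report` and the stub data are hypothetical.

```python
def summarize_set_report(report):
    """Condense a parsed MultiPrDeltaReport into a few counts."""
    # Assumption: contract_drifts is an array nested under the cross_pr object.
    drifts = report.get("cross_pr", {}).get("contract_drifts", [])
    return {
        "completed": len(report.get("prs", [])),
        "failed_or_skipped": len(report.get("errors", [])),
        "contract_drifts": len(drifts),
        "threshold_exceeded": bool(report.get("threshold_exceeded", False)),
    }


# Hypothetical stub standing in for a real report.
stub = {
    "prs": [{}, {}],
    "errors": [{}],
    "cross_pr": {"contract_drifts": []},
    "threshold_exceeded": True,
}
print(summarize_set_report(stub))
```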
Contracts
payload_schema
“What payload shape does this event or route appear to use?”
Used automatically when the assistant needs the inferred producer and consumer schema for a virtual target so it can reason about fields instead of only symbol names.
contract_drift
“Are producers and consumers disagreeing on this payload contract?”
Used automatically when the assistant needs mismatches between the producer-side and consumer-side inferred shapes for the same target.
projection_impact
“If this field changes, which source fields, projected fields, filters, indexes, and backfills need review?”
Used automatically when the assistant needs static field-level evidence for denormalized or persisted projections. The request accepts target, optional repo, optional limit (1-100), and optional evidence_verbosity (summary or full). The tool returns source and projected fields, derivation edges, read/write/filter/index/backfill evidence, evidence-source labels such as direct_field_access and local_alias_field_access, missing evidence, and risk hints such as source_field_unreviewed, backfill_unproven, index_or_search_mapping_unproven, frontend_only_focus, optional_payload_filter_mismatch, deployed_owner_unchecked, and deployed_owner_topology_observed.
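Because the risk hints above are introduced with "such as", the list may not be exhaustive. The sketch below separates hints named on this page from any others, which can help when reading a response during debugging. It assumes hints arrive as plain strings; `split_risk_hints` and `example_future_hint` are hypothetical names.

```python
# Risk hints named on this page; the tool may emit others.
DOCUMENTED_RISK_HINTS = frozenset({
    "source_field_unreviewed",
    "backfill_unproven",
    "index_or_search_mapping_unproven",
    "frontend_only_focus",
    "optional_payload_filter_mismatch",
    "deployed_owner_unchecked",
    "deployed_owner_topology_observed",
})


def split_risk_hints(hints):
    """Partition hint strings into (documented, undocumented), preserving order."""
    documented = [h for h in hints if h in DOCUMENTED_RISK_HINTS]
    undocumented = [h for h in hints if h not in DOCUMENTED_RISK_HINTS]
    return documented, undocumented


known, other = split_risk_hints(["backfill_unproven", "example_future_hint"])
print(known, other)
```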
breaking_change_candidates
“If I change this DTO or producer payload, which consumers are at risk?”
Used automatically when the assistant needs a targeted breaking-change view tied to a producer symbol, DTO, or related contract surface.
Context Retrieval
“Give me a one-screen summary of what this symbol is and why it matters.”
Used automatically for short summaries when the assistant needs lightweight orientation without spending budget on a deeper trace or pack.
context
“Give me the combined context around this target.”
Used automatically when the assistant needs a broader stitched view that combines symbol metadata, traversal, repo context, and impact hints in one call.
context_pack
“Build a focused context pack for this target.”
Used automatically when the assistant wants a bounded task-shaped retrieval and already knows which mode it wants, such as planning, debug, fix, review, or change_impact.
get_context_pack
“Return the context pack for this target.”
Alias of context_pack. It exists for client compatibility and returns the same bounded pack response.
planning_pack
“I’m about to work on this area. What do I need to understand first?”
Used automatically when the assistant needs planning-oriented context focused on entry points, dependencies, related surfaces, and likely next investigation steps.
plan_change
“Help me plan this change.”
Alias of planning_pack. It exists for clients that prefer a more task-shaped name.
debug_pack
“This behavior is broken. Give me the most relevant debug context.”
Used automatically when the assistant needs inbound paths, nearby event surfaces, persistence touchpoints, and other debugging-oriented evidence.
fix_pack
“I know the issue. What is the smallest safe edit surface?”
Used automatically when the assistant needs a narrower context set for applying a targeted fix without the wider planning surface.
fix_surface
“Show me the fix surface for this target.”
Alias of fix_pack. It exists for clients that frame the same retrieval in terms of edit surface rather than pack mode.
review_pack
“I’m reviewing this area. What should I think about before approving a change?”
Used automatically when the assistant needs review-oriented context such as impacted relationships, conventions, ownership hints, and nearby hotspots.
change_impact_pack
“What is the full blast radius if this changes?”
Used automatically when the assistant needs the richest impact-oriented context pack, including cross-repo dependents, bridges, and identified gaps.
get_change_impact_pack
“Return the change-impact pack for this target.”
Alias of change_impact_pack. It exists for client compatibility and returns the same impact-focused pack.
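Several tools on this page are aliases of another tool. When reading client logs, it can help to fold aliases back to the name they stand for. The mapping below is copied from the alias notes on this page; the function name `canonical_tool` is hypothetical.

```python
# Alias -> canonical tool name, as stated in the alias notes on this page.
TOOL_ALIASES = {
    "get_graph_schema_summary": "get_graph_schema",
    "get_context_pack": "context_pack",
    "plan_change": "planning_pack",
    "fix_surface": "fix_pack",
    "get_change_impact_pack": "change_impact_pack",
}


def canonical_tool(name):
    """Return the canonical tool name for an alias, or the name unchanged."""
    return TOOL_ALIASES.get(name, name)


print(canonical_tool("plan_change"))
```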
batch_query
“Run the search, context, and impact lookups together so I can answer this in one pass.”
Used automatically when the assistant wants multiple bounded Gather Step queries in one round trip instead of making several separate tool calls.
Repo Intelligence
who_owns
“Who has the strongest history of working in this file or symbol?”
Used automatically when the assistant needs history-based ownership percentages, likely owners, or bus-factor hints for review routing and change risk.
get_dead_code
“What dead-code candidates exist in this repo?”
Used automatically when the assistant needs graph-reachability-based dead-code candidates while reviewing cleanup work or repo health.
get_conventions
“What structural conventions does this repo seem to follow?”
Used automatically when the assistant wants repeated patterns discovered from the indexed graph so it can align suggestions with existing code shape.
get_overview
“Give me the high-level overview of this repo.”
Used automatically when the assistant needs a repo summary that combines graph-level shape with git-derived signals and other indexed analytics.
Notes for Operators
- All tools are read-only against the indexed state.
- Results depend on index freshness. Run index again, or keep watch running, if the code has changed.
- The assistant selects these tools automatically. The command names here are reference material, not a required manual workflow.