feat: extract operational tools into user-rag cli (PR-V2-4) by HumanBean17 · Pull Request #53 · HumanBean17/java-codebase-rag

HumanBean17 · 2026-05-07T14:01:44Z

Scope

Implement PR-V2-4 from plans/PLAN-MCP-API-V2.md § PR-V2-4: move 5 operational tools out of MCP into the user-rag CLI, leaving MCP with exactly 4 navigation tools.

Summary

Add user_rag package with user-rag console entrypoint and 5 subcommands: refresh, meta, tables, diagnose-ignore, analyze-pr.
Remove operational tool registrations from MCP server and keep navigation-only MCP surface (search/find/describe/neighbors).
Update packaging/docs/tests for installed entrypoint behavior, CLI migration docs, and final MCP surface assertions.

Test delta

Baseline + 8 new subprocess tests in tests/test_user_rag_cli.py.

Validation

pytest tests -q -> 322 passed, 7 skipped
pytest tests/test_user_rag_cli.py -q -> 8 passed
pytest tests/test_mcp_tools.py -q -> 3 passed

Manual evidence

# CLI installed
pip install -e .
user-rag --help   # lists 5 subcommands

# JSON when piped
user-rag meta | python -c "import json,sys; d=json.loads(sys.stdin.read()); print(sorted(d['edge_counts'].keys()))"
# Expect: 9 edge type keys

# Final MCP surface
grep -cE "@mcp.tool" server.py   # expect: 4

# Diagnose ignore on a real path
user-rag diagnose-ignore tests/bank-chat-system/.git/HEAD
# Expect: ignored=True

# Refresh into a temp kuzu path
user-rag refresh --source-root tests/bank-chat-system --kuzu-path /tmp/v24_smoke --quiet
ls /tmp/v24_smoke   # expect: kuzu files

Manual TODO (user skill)

Dmitry: update the cursor-pr-review user skill callsite from MCP analyze_pr to CLI user-rag analyze-pr --diff-file /tmp/pr.diff (not included in this repo diff by design).

Made with Cursor

Move graph_meta/analyze_pr/diagnose_ignore/list_code_index_tables/refresh_code_index out of MCP and into the new user-rag console CLI while keeping MCP as a 4-tool navigator surface. Update packaging, docs, and subprocess tests so installed entrypoints and CLI migration paths remain validated. Co-authored-by: Cursor <[email protected]>

Close the loop after all four V2 PRs by moving the propose/plan/prompt artifacts to completed locations and marking the propose status as locked. Co-authored-by: Cursor <[email protected]>

HumanBean17 · 2026-05-07T14:18:31Z

Review: PR-V2-4 — extract operational tools into `user-rag` CLI

Verdict: Approved with notes ⚠️

Final piece of MCP API v2 lands cleanly: 5 ops tools moved to a user-rag console script, MCP surface drops to exactly 4 navigation tools (search/find/describe/neighbors). Plan adherence is high — file scope is confined to deliverables, surface assertion is locked, pretty/JSON output respects the no-flag isatty() contract, and the pr_analysis engine is reused (not reimplemented). One real platform-dependent test failure on this sandbox (the PTY pretty-print test) is the single blocker-adjacent issue and the plan itself anticipated it.

Scope discipline (out-of-scope checks)

Sentinel	Status
`ONTOLOGY_VERSION` / `SCHEMA_VERSION`	✅ 0 hits — no ontology bump
`CREATE NODE TABLE` / `CREATE REL TABLE` / `DROP TABLE`	✅ 0 hits — no schema delta
`mcp_v2.` references in diff	✅ 2 hits, both unchanged kept-behavior context lines (server.py:299/300) — no `mcp_v2.py` edits
`NodeFilter` references in diff	✅ 1 hit, README doc cell — no shape change
`kuzu_queries.` references in diff	✅ 0 hits
`search_lancedb` references	✅ 2 hits — `pyproject.toml` `py-modules` list (intentional inventory) and a kept `from search_lancedb import TABLES` import (server.py); no behaviour change
`@mcp.tool` count in `server.py` HEAD	✅ exactly 4 (search/find/describe/neighbors)

Plan compliance

#	Step from plan	Verified
1	New `user_rag/` package with `__init__.py` + `cli.py` argparse entrypoint	✅ `user_rag/cli.py` has `build_parser()` + `main()` + 5 subparsers
2	5 subcommands: `refresh` / `meta` / `tables` / `diagnose-ignore` / `analyze-pr`	✅ all 5 wired to thin handlers `_cmd_` reusing `server.` and `pr_analysis.analyze_pr_pipeline`
3	Output mode: `isatty()` → pretty (pprint), piped → single-line `json.dumps`	✅ `_emit()` cli.py:32-37 — no `--pretty`/`--json` flag exists
4	Exit codes 0 / 1 / 2 (success / user error / internal)	✅ argparse returns 2 on bad subcommand; user-error paths in `_cmd_analyze_pr` / `_cmd_diagnose_ignore` return 1 on bad input; refresh failure returns 1 (subprocess failed) or 2 (pre-launch internal)
5	`pyproject.toml`: `packages = ["user_rag"]`, `[project.scripts] user-rag = "user_rag.cli:main"`, root scripts stay outside the package as `py-modules`	✅ pyproject.toml:25-30 — exactly that
6	`server.py`: delete 5 `@mcp.tool` ops registrations; helpers (`_graph_meta_output`, `list_code_index_tables_payload`, `run_refresh_pipeline`, `_cocoindex_subprocess_env`) kept as named functions for CLI to import	✅ verified by `grep -nE "^def\|^async def" server.py` — handlers retained, decorators gone
7	README: new `## CLI reference` (5 subcommand subsections + `### Migration from v1`); `### Tool reference` lists only the 4 v2 tools	✅ README.md:152 (4-row Tool reference) and :161 (CLI reference top-level section)
8	AGENTS.md mentions 4-tool MCP surface + `user-rag --help`	✅ AGENTS.md:11-12
9	8 new subprocess tests in `tests/test_user_rag_cli.py`	✅ test count matches; assertions targeted (JSON keys, `ignored=True`, `risk_score`, `blast_radius_total`, kuzu rebuild, exit 2 on bogus)
10	Surface-assertion test updated to exactly 4	✅ `tests/test_mcp_tools.py::test_registered_tool_surface_is_v2_navigation_only` — `names == {"search","find","describe","neighbors"}` and `len(names) == 4`
11	DoD = baseline + 8 new tests, suite green	⚠️ 321 passed, 7 skipped, 1 failed locally — see Tests section

Tests

1 failed, 321 passed, 7 skipped in 209.25s
FAILED tests/test_user_rag_cli.py::test_cli_meta_pretty_when_tty
  OSError: [Errno 5] Input/output error

Master baseline (re-run on this branch's checkout): 320 passed, 7 skipped. Per-file delta: tests/test_mcp_tools.py 9 → 3 (-6, removed v1 ops surface assertions / refresh smoke), tests/test_user_rag_cli.py 0 → 8 (+8). Net +2 collected → expected 322 passed if all green. The PR description's claim of 322 passed, 7 skipped matches that math, and the missing pass is exactly the PTY test.

The failing test (test_cli_meta_pretty_when_tty) is the case the plan flagged as a known PTY flakiness risk. Quoting plan § Tests: "If CI on the target platform makes PTY-based testing flaky, mark this single test @pytest.mark.skipif(...) with a clear reason rather than introducing a side-door flag." Implementation only guards not hasattr(os, "openpty"), which is a weaker check — openpty exists here but os.read() on the master fd raises OSError(EIO) after the child closes, which is platform-dependent kernel behaviour. Recommended fix in a follow-up: either swallow OSError(EIO) after proc.poll() returns (the child has written its full output and exited; EIO is just the master end seeing slave-closed) and treat it as end-of-stream, or expand the skipif to detect EIO at runtime. The non-failure case is worth keeping — it's the only test that proves pretty-mode actually fires.

All other 321 tests pass. The 7 skipped are the standard LANCEDB_MCP_RUN_HEAVY=1 gate (4 in test_lancedb_e2e.py) and the 3 fixture-conditional client-row skips in test_mcp_v2.py (already on master).

Manual evidence reproduced

$ pip install -e .                                  # Successfully installed java-enterprise-codebase-rag-0.0.0
$ user-rag --help                                   # 5 subcommands listed ✅

$ rm -rf /tmp/v24_smoke && \
  user-rag refresh --source-root tests/bank-chat-system \
    --kuzu-path /tmp/v24_smoke --quiet > /tmp/refresh_out.json
$ wc -l /tmp/refresh_out.json                       # 1 (single-line JSON when piped) ✅
$ python -c "import json; d=json.load(open('/tmp/refresh_out.json')); print(d['success'])"
True

$ KUZU_DB_PATH=/tmp/v24_smoke user-rag meta | \
  python -c "import json,sys; d=json.loads(sys.stdin.read()); print(sorted(d['edge_counts'].keys()))"
['ASYNC_CALLS', 'CALLS', 'DECLARES', 'DECLARES_CLIENT', 'EXPOSES', 'EXTENDS', 'HTTP_CALLS', 'IMPLEMENTS', 'INJECTS']
                                                    # 9 edge keys ✅

$ grep -cE "@mcp.tool" server.py                    # 4 ✅

$ user-rag diagnose-ignore tests/bank-chat-system/.git/HEAD | \
  python -c "import json,sys; print(json.loads(sys.stdin.read())['ignored'])"
True                                                # ✅

$ user-rag bogus; echo $?
user-rag: error: argument subcommand: invalid choice: 'bogus' ...
2                                                   # ✅

All five PR-claimed manual evidence lines reproduce on a fresh fixture rebuild.

Notes that earned my trust

_emit() is the single output policy point (cli.py:32-37). Every subcommand routes through it, so the isatty() contract holds uniformly — no subcommand can accidentally bypass it. Pretty mode uses pprint.pformat(..., sort_dicts=True) (matches the sort_keys=True in the JSON branch — keys sort identically across modes), which is a thoughtful design choice.
_apply_graph_env resets the KuzuGraph._instance singleton when --kuzu-path is overridden (cli.py:50-54). Without this, a CLI process that imported kuzu_queries early would silently keep the env-time path. This is the correct fix and not over-engineered.
subparsers.add_mutually_exclusive_group(required=True) for --diff-file | --diff-stdin (cli.py:152-154). argparse enforces "exactly one" for free, no manual validation drift.
Test 5 (test_cli_diagnose_ignore_unconditional_prune) asserts ignored=True on .git/foo — verifies the unconditional .git prune fires through the CLI, not just unit-tests the LayeredIgnore class. That's the right level for a CLI integration test.
tests/test_user_rag_cli.py::_install_user_rag_entrypoint is session-scoped — pip install -e . runs once for the whole module, then shutil.which("user-rag") is used per-test. Tests still work even if a developer hasn't manually pip install'd the repo.
Surface assertion uses names == expected and len(names) == 4 (test_mcp_tools.py:11-15) — same defensive double-check pattern from PR-V2-3. If anyone adds a 5th tool by accident, both arms catch it.

Observations (non-blocking)

PTY test EIO handling (tests/test_user_rag_cli.py:60-87). As discussed in the Tests section, the read loop should treat OSError(EIO) from os.read(master, ...) as EOF when proc.poll() is not None — that's normal POSIX kernel behaviour after the slave closes. Currently it crashes the test on platforms (this sandbox) where the kernel raises EIO instead of returning 0 bytes. Either expand the skipif at the top of the test (with a runtime probe) or wrap the os.read call in try: ... except OSError as e: if e.errno == errno.EIO: break; raise. Either is a 2-line fix.
_cmd_refresh exit-code split is sensible but undocumented (cli.py:59-69). Returning 1 when exit_code is not None (subprocess produced an exit code = it ran but failed) and 2 otherwise (pre-launch internal error like missing cocoindex binary) maps roughly to plan's "1 = user error, 2 = internal", but the precise rule isn't obvious from the code. A 2-line docstring on _cmd_refresh would lock it for future maintainers.
diagnose-ignore doesn't take --source-root semantics into account for relative paths (cli.py:88-95). The path resolution is (root / raw).resolve() where root = server._project_root(), but _project_root() reads LANCEDB_MCP_PROJECT_ROOT from env, which _apply_graph_env set from --source-root. So the chain works — but only because _apply_graph_env runs before _project_root(). If anyone reorders that or adds caching, relative paths silently start resolving against cwd. A short comment "# must run after _apply_graph_env" on the root = server._project_root() line would prevent that footgun.
_emit exception fallback at main() returns 2 but emits a payload missing exit_code (cli.py:165-167). Internal exceptions print {"success": False, "message": "internal error: ..."} but no exit_code field, which means downstream JSON consumers parsing payload.get("exit_code") get None instead of 2. The exit code is correct on the process boundary; this is a minor schema-consistency point for piped consumers. Non-blocking.
--quiet only suppresses --verbose on the graph builder subprocess, not cocoindex's own progress bar (verified empirically — user-rag refresh ... --quiet still streams cocoindex emoji progress to the captured stderr field of the JSON payload). The captured stderr ends up in the JSON output's stderr field, so the JSON itself is still single-line and clean. This is fine for piped consumers but worth noting for anyone who expects --quiet to actually mute terminal output. Non-blocking.
No tests/test_user_rag_cli.py::test_cli_analyze_pr_with_diff_stdin. The plan's test Document Kuzu MAP-as-STRING pattern in plan + prompts #6 only covers --diff-file. Given the mutually-exclusive group is the only place where --diff-stdin validation lives, it's worth an extra one-liner test passing stdin= to subprocess.run and asserting the same JSON shape. Defer to PR-E1 follow-ups.
Stale cursor-pr-review user skill reference (mentioned in PR description's "Manual TODO"). The user skill still calls analyze_pr via MCP. This is correctly out of repo scope — flagged here only so it doesn't fall off the radar.

Plan deltas needed

None for the locked PR scope. Two suggested footnotes for the plan:

Note in plan § Tests entry Tier 1 completion plan + proposals (B2a + B4 + B5) #2 that the implemented skipif(not hasattr(os, "openpty")) is weaker than the plan-suggested "a clear reason rather than introducing a side-door flag" and consider documenting EIO handling as the canonical pattern.
Update plan § DoD PR-A1: Route schema + literal extractor (B2a) #5 phrasing from 8 new tests to 8 new tests collected, 7 expected to pass on most platforms (PTY test platform-conditional) to match observed reality.

Bonus catches

_apply_graph_env correctly invalidates the KuzuGraph._instance singleton — fixes a latent bug if any prior CLI invocation in-process has touched the graph (e.g. multi-subcommand wrapper scripts).
pyproject.toml py-modules list is exhaustive — every root-level Python file is now a declared installed module. pip install -e . actually picks up server.py, mcp_v2.py, etc. as importables, which makes from user_rag.cli import main → import server work end-to-end.

Ready to merge. Final state after merge: MCP exposes exactly 4 navigation tools (search/find/describe/neighbors); 5 ops tools live in user-rag console script; PR-V2-4 series is complete. Suggested follow-up PR-E1: PTY EIO handling + 1-2 deferred items from PR-V2-3 (port 2-3 client-filter cases into test_mcp_v2.py, fix stale docstring at tests/test_assign_endpoint_client_extraction.py:133).

Don't forget the manual TODO in your PR description: update the cursor-pr-review user skill to call user-rag analyze-pr --diff-file /tmp/pr.diff instead of the (now-removed) MCP analyze_pr tool.

Handle PTY EIO edge cases in CLI pretty-output test, add analyze-pr stdin coverage, and tighten CLI error-path documentation/payload consistency for refresh and top-level internal failures. Co-authored-by: Cursor <[email protected]>

HumanBean17 and others added 2 commits May 7, 2026 17:01

move mcp api v2 artifacts to completed

0aff54d

Close the loop after all four V2 PRs by moving the propose/plan/prompt artifacts to completed locations and marking the propose status as locked. Co-authored-by: Cursor <[email protected]>

apply non-blocking review followups for v2 cli

86f58fd

Handle PTY EIO edge cases in CLI pretty-output test, add analyze-pr stdin coverage, and tighten CLI error-path documentation/payload consistency for refresh and top-level internal failures. Co-authored-by: Cursor <[email protected]>

HumanBean17 merged commit e90cbec into master May 7, 2026

HumanBean17 deleted the feat/user-rag-cli branch May 10, 2026 21:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: extract operational tools into user-rag cli (PR-V2-4)#53

feat: extract operational tools into user-rag cli (PR-V2-4)#53
HumanBean17 merged 3 commits into
masterfrom
feat/user-rag-cli

HumanBean17 commented May 7, 2026

Uh oh!

HumanBean17 commented May 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

HumanBean17 commented May 7, 2026

Scope

Summary

Test delta

Validation

Manual evidence

Manual TODO (user skill)

Uh oh!

HumanBean17 commented May 7, 2026

Review: PR-V2-4 — extract operational tools into user-rag CLI

Scope discipline (out-of-scope checks)

Plan compliance

Tests

Manual evidence reproduced

Notes that earned my trust

Observations (non-blocking)

Plan deltas needed

Bonus catches

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Review: PR-V2-4 — extract operational tools into `user-rag` CLI