feat: extract operational tools into user-rag cli (PR-V2-4)#53
Conversation
Move graph_meta/analyze_pr/diagnose_ignore/list_code_index_tables/refresh_code_index out of MCP and into the new user-rag console CLI while keeping MCP as a 4-tool navigator surface. Update packaging, docs, and subprocess tests so installed entrypoints and CLI migration paths remain validated. Co-authored-by: Cursor <[email protected]>
Close the loop after all four V2 PRs by moving the propose/plan/prompt artifacts to completed locations and marking the propose status as locked. Co-authored-by: Cursor <[email protected]>
Review: PR-V2-4 — extract operational tools into
|
| Sentinel | Status |
|---|---|
ONTOLOGY_VERSION / SCHEMA_VERSION |
✅ 0 hits — no ontology bump |
CREATE NODE TABLE / CREATE REL TABLE / DROP TABLE |
✅ 0 hits — no schema delta |
mcp_v2. references in diff |
✅ 2 hits, both unchanged kept-behavior context lines (server.py:299/300) — no mcp_v2.py edits |
NodeFilter references in diff |
✅ 1 hit, README doc cell — no shape change |
kuzu_queries. references in diff |
✅ 0 hits |
search_lancedb references |
✅ 2 hits — pyproject.toml py-modules list (intentional inventory) and a kept from search_lancedb import TABLES import (server.py); no behaviour change |
@mcp.tool count in server.py HEAD |
✅ exactly 4 (search/find/describe/neighbors) |
Plan compliance
| # | Step from plan | Verified |
|---|---|---|
| 1 | New user_rag/ package with __init__.py + cli.py argparse entrypoint |
✅ user_rag/cli.py has build_parser() + main() + 5 subparsers |
| 2 | 5 subcommands: refresh / meta / tables / diagnose-ignore / analyze-pr |
✅ all 5 wired to thin handlers _cmd_* reusing server.* and pr_analysis.analyze_pr_pipeline |
| 3 | Output mode: isatty() → pretty (pprint), piped → single-line json.dumps |
✅ _emit() cli.py:32-37 — no --pretty/--json flag exists |
| 4 | Exit codes 0 / 1 / 2 (success / user error / internal) | ✅ argparse returns 2 on bad subcommand; user-error paths in _cmd_analyze_pr / _cmd_diagnose_ignore return 1 on bad input; refresh failure returns 1 (subprocess failed) or 2 (pre-launch internal) |
| 5 | pyproject.toml: packages = ["user_rag"], [project.scripts] user-rag = "user_rag.cli:main", root scripts stay outside the package as py-modules |
✅ pyproject.toml:25-30 — exactly that |
| 6 | server.py: delete 5 @mcp.tool ops registrations; helpers (_graph_meta_output, list_code_index_tables_payload, run_refresh_pipeline, _cocoindex_subprocess_env) kept as named functions for CLI to import |
✅ verified by grep -nE "^def|^async def" server.py — handlers retained, decorators gone |
| 7 | README: new ## CLI reference (5 subcommand subsections + ### Migration from v1); ### Tool reference lists only the 4 v2 tools |
✅ README.md:152 (4-row Tool reference) and :161 (CLI reference top-level section) |
| 8 | AGENTS.md mentions 4-tool MCP surface + user-rag --help |
✅ AGENTS.md:11-12 |
| 9 | 8 new subprocess tests in tests/test_user_rag_cli.py |
✅ test count matches; assertions targeted (JSON keys, ignored=True, risk_score, blast_radius_total, kuzu rebuild, exit 2 on bogus) |
| 10 | Surface-assertion test updated to exactly 4 | ✅ tests/test_mcp_tools.py::test_registered_tool_surface_is_v2_navigation_only — names == {"search","find","describe","neighbors"} and len(names) == 4 |
| 11 | DoD = baseline + 8 new tests, suite green |
Tests
1 failed, 321 passed, 7 skipped in 209.25s
FAILED tests/test_user_rag_cli.py::test_cli_meta_pretty_when_tty
OSError: [Errno 5] Input/output error
Master baseline (re-run on this branch's checkout): 320 passed, 7 skipped. Per-file delta: tests/test_mcp_tools.py 9 → 3 (-6, removed v1 ops surface assertions / refresh smoke), tests/test_user_rag_cli.py 0 → 8 (+8). Net +2 collected → expected 322 passed if all green. The PR description's claim of 322 passed, 7 skipped matches that math, and the missing pass is exactly the PTY test.
The failing test (test_cli_meta_pretty_when_tty) is the case the plan flagged as a known PTY flakiness risk. Quoting plan § Tests: "If CI on the target platform makes PTY-based testing flaky, mark this single test @pytest.mark.skipif(...) with a clear reason rather than introducing a side-door flag." Implementation only guards not hasattr(os, "openpty"), which is a weaker check — openpty exists here but os.read() on the master fd raises OSError(EIO) after the child closes, which is platform-dependent kernel behaviour. Recommended fix in a follow-up: either swallow OSError(EIO) after proc.poll() returns (the child has written its full output and exited; EIO is just the master end seeing slave-closed) and treat it as end-of-stream, or expand the skipif to detect EIO at runtime. The non-failure case is worth keeping — it's the only test that proves pretty-mode actually fires.
All other 321 tests pass. The 7 skipped are the standard LANCEDB_MCP_RUN_HEAVY=1 gate (4 in test_lancedb_e2e.py) and the 3 fixture-conditional client-row skips in test_mcp_v2.py (already on master).
Manual evidence reproduced
$ pip install -e . # Successfully installed java-enterprise-codebase-rag-0.0.0
$ user-rag --help # 5 subcommands listed ✅
$ rm -rf /tmp/v24_smoke && \
user-rag refresh --source-root tests/bank-chat-system \
--kuzu-path /tmp/v24_smoke --quiet > /tmp/refresh_out.json
$ wc -l /tmp/refresh_out.json # 1 (single-line JSON when piped) ✅
$ python -c "import json; d=json.load(open('/tmp/refresh_out.json')); print(d['success'])"
True
$ KUZU_DB_PATH=/tmp/v24_smoke user-rag meta | \
python -c "import json,sys; d=json.loads(sys.stdin.read()); print(sorted(d['edge_counts'].keys()))"
['ASYNC_CALLS', 'CALLS', 'DECLARES', 'DECLARES_CLIENT', 'EXPOSES', 'EXTENDS', 'HTTP_CALLS', 'IMPLEMENTS', 'INJECTS']
# 9 edge keys ✅
$ grep -cE "@mcp.tool" server.py # 4 ✅
$ user-rag diagnose-ignore tests/bank-chat-system/.git/HEAD | \
python -c "import json,sys; print(json.loads(sys.stdin.read())['ignored'])"
True # ✅
$ user-rag bogus; echo $?
user-rag: error: argument subcommand: invalid choice: 'bogus' ...
2 # ✅All five PR-claimed manual evidence lines reproduce on a fresh fixture rebuild.
Notes that earned my trust
_emit()is the single output policy point (cli.py:32-37). Every subcommand routes through it, so theisatty()contract holds uniformly — no subcommand can accidentally bypass it. Pretty mode usespprint.pformat(..., sort_dicts=True)(matches thesort_keys=Truein the JSON branch — keys sort identically across modes), which is a thoughtful design choice._apply_graph_envresets theKuzuGraph._instancesingleton when--kuzu-pathis overridden (cli.py:50-54). Without this, a CLI process that importedkuzu_queriesearly would silently keep the env-time path. This is the correct fix and not over-engineered.subparsers.add_mutually_exclusive_group(required=True)for--diff-file | --diff-stdin(cli.py:152-154). argparse enforces "exactly one" for free, no manual validation drift.- Test 5 (
test_cli_diagnose_ignore_unconditional_prune) assertsignored=Trueon.git/foo— verifies the unconditional.gitprune fires through the CLI, not just unit-tests the LayeredIgnore class. That's the right level for a CLI integration test. tests/test_user_rag_cli.py::_install_user_rag_entrypointis session-scoped —pip install -e .runs once for the whole module, thenshutil.which("user-rag")is used per-test. Tests still work even if a developer hasn't manuallypip install'd the repo.- Surface assertion uses
names == expectedandlen(names) == 4(test_mcp_tools.py:11-15) — same defensive double-check pattern from PR-V2-3. If anyone adds a 5th tool by accident, both arms catch it.
Observations (non-blocking)
-
PTY test EIO handling (
tests/test_user_rag_cli.py:60-87). As discussed in the Tests section, the read loop should treatOSError(EIO)fromos.read(master, ...)as EOF whenproc.poll()is not None — that's normal POSIX kernel behaviour after the slave closes. Currently it crashes the test on platforms (this sandbox) where the kernel raises EIO instead of returning 0 bytes. Either expand the skipif at the top of the test (with a runtime probe) or wrap theos.readcall intry: ... except OSError as e: if e.errno == errno.EIO: break; raise. Either is a 2-line fix. -
_cmd_refreshexit-code split is sensible but undocumented (cli.py:59-69). Returning 1 whenexit_code is not None(subprocess produced an exit code = it ran but failed) and 2 otherwise (pre-launch internal error like missingcocoindexbinary) maps roughly to plan's "1 = user error, 2 = internal", but the precise rule isn't obvious from the code. A 2-line docstring on_cmd_refreshwould lock it for future maintainers. -
diagnose-ignoredoesn't take--source-rootsemantics into account for relative paths (cli.py:88-95). The path resolution is(root / raw).resolve()whereroot = server._project_root(), but_project_root()readsLANCEDB_MCP_PROJECT_ROOTfrom env, which_apply_graph_envset from--source-root. So the chain works — but only because_apply_graph_envruns before_project_root(). If anyone reorders that or adds caching, relative paths silently start resolving againstcwd. A short comment "# must run after _apply_graph_env" on theroot = server._project_root()line would prevent that footgun. -
_emitexception fallback atmain()returns 2 but emits a payload missingexit_code(cli.py:165-167). Internal exceptions print{"success": False, "message": "internal error: ..."}but noexit_codefield, which means downstream JSON consumers parsingpayload.get("exit_code")getNoneinstead of 2. The exit code is correct on the process boundary; this is a minor schema-consistency point for piped consumers. Non-blocking. -
--quietonly suppresses--verboseon the graph builder subprocess, not cocoindex's own progress bar (verified empirically —user-rag refresh ... --quietstill streams cocoindex emoji progress to the captured stderr field of the JSON payload). The captured stderr ends up in the JSON output'sstderrfield, so the JSON itself is still single-line and clean. This is fine for piped consumers but worth noting for anyone who expects--quietto actually mute terminal output. Non-blocking. -
No
tests/test_user_rag_cli.py::test_cli_analyze_pr_with_diff_stdin. The plan's test Document Kuzu MAP-as-STRING pattern in plan + prompts #6 only covers--diff-file. Given the mutually-exclusive group is the only place where--diff-stdinvalidation lives, it's worth an extra one-liner test passingstdin=tosubprocess.runand asserting the same JSON shape. Defer to PR-E1 follow-ups. -
Stale
cursor-pr-reviewuser skill reference (mentioned in PR description's "Manual TODO"). The user skill still callsanalyze_prvia MCP. This is correctly out of repo scope — flagged here only so it doesn't fall off the radar.
Plan deltas needed
None for the locked PR scope. Two suggested footnotes for the plan:
- Note in plan § Tests entry Tier 1 completion plan + proposals (B2a + B4 + B5) #2 that the implemented
skipif(not hasattr(os, "openpty"))is weaker than the plan-suggested "a clear reason rather than introducing a side-door flag" and consider documenting EIO handling as the canonical pattern. - Update plan § DoD PR-A1: Route schema + literal extractor (B2a) #5 phrasing from
8 new teststo8 new tests collected, 7 expected to pass on most platforms (PTY test platform-conditional)to match observed reality.
Bonus catches
_apply_graph_envcorrectly invalidates theKuzuGraph._instancesingleton — fixes a latent bug if any prior CLI invocation in-process has touched the graph (e.g. multi-subcommand wrapper scripts).pyproject.tomlpy-moduleslist is exhaustive — every root-level Python file is now a declared installed module.pip install -e .actually picks upserver.py,mcp_v2.py, etc. as importables, which makesfrom user_rag.cli import main→import serverwork end-to-end.
Ready to merge. Final state after merge: MCP exposes exactly 4 navigation tools (search/find/describe/neighbors); 5 ops tools live in user-rag console script; PR-V2-4 series is complete. Suggested follow-up PR-E1: PTY EIO handling + 1-2 deferred items from PR-V2-3 (port 2-3 client-filter cases into test_mcp_v2.py, fix stale docstring at tests/test_assign_endpoint_client_extraction.py:133).
Don't forget the manual TODO in your PR description: update the cursor-pr-review user skill to call user-rag analyze-pr --diff-file /tmp/pr.diff instead of the (now-removed) MCP analyze_pr tool.
Handle PTY EIO edge cases in CLI pretty-output test, add analyze-pr stdin coverage, and tighten CLI error-path documentation/payload consistency for refresh and top-level internal failures. Co-authored-by: Cursor <[email protected]>
Scope
Implement PR-V2-4 from plans/PLAN-MCP-API-V2.md § PR-V2-4: move 5 operational tools out of MCP into the
user-ragCLI, leaving MCP with exactly 4 navigation tools.Summary
user_ragpackage withuser-ragconsole entrypoint and 5 subcommands:refresh,meta,tables,diagnose-ignore,analyze-pr.search/find/describe/neighbors).Test delta
Baseline + 8 new subprocess tests in
tests/test_user_rag_cli.py.Validation
pytest tests -q->322 passed, 7 skippedpytest tests/test_user_rag_cli.py -q->8 passedpytest tests/test_mcp_tools.py -q->3 passedManual evidence
Manual TODO (user skill)
cursor-pr-reviewuser skill callsite from MCPanalyze_prto CLIuser-rag analyze-pr --diff-file /tmp/pr.diff(not included in this repo diff by design).Made with Cursor