NodeFilter: add symbol_kind / symbol_kinds to disambiguate class/method/interface/enum within find("symbol", ...) #56

@HumanBean17

Summary

find.kind="symbol" is functionally complete (it correctly maps to all Symbol nodes in Kuzu) but semantically too coarse. Inside the graph, Symbol.kind already takes 7 distinct values: class, interface, enum, record, annotation, method, constructor. The MCP find.kind parameter collapses all 7 into one bucket, and NodeFilter has no symbol_kind field — so today there is no way to ask "find me only methods" or "find me only interfaces" via find.

Why kind="symbol" collapses today

Three reasons it was designed as a single bucket in v2:

  1. _symbol_where_from_filter predicates (role, annotation, capability, fqn_prefix) work uniformly across class-level and method-level symbols.
  2. The result row shape is identical: id, fqn, microservice, module, role.
  3. Consumers historically wanted "the named thing" without caring whether it's a class or a method on a class.

Real queries you can't answer cleanly today

| Query | Current workaround | Cost |
|---|---|---|
| "List all interfaces in chat-core" | find("symbol", {"microservice":"chat-core"}), then filter client-side by s.kind; but kind isn't in the result projection | Impossible without a round-trip through describe |
| "List all @Scheduled methods" | find("symbol", {"capability":"SCHEDULED_TASK"}); works, but conflates classes that have a scheduled method with the methods themselves | Imprecise |
| "Methods on ChatController" | neighbors(class_id, "out", ["DECLARES"]) | Works, but requires knowing the class id first |
| "All enum types" | find("symbol", {"role":"???"}); enums don't have a role | No way |

Three possible splits, ranked

Option A — Add symbol_kind to NodeFilter (minimal, recommended)

Don't split find.kind. Add one field to the shared filter:

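from typing import Literal
from pydantic import BaseModel  # assumption: BaseModel is pydantic's

class NodeFilter(BaseModel):
    ...  # existing fields unchanged
    symbol_kind: Literal["class","interface","enum","record","annotation","method","constructor"] | None = None
    symbol_kinds: list[str] | None = None  # set-membership; covers "all type-level declarations" idiom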

Plumb into _symbol_where_from_filter:

if f.symbol_kind:
    preds.append("s.kind = $symbol_kind")
    params["symbol_kind"] = f.symbol_kind
if f.symbol_kinds:
    preds.append("s.kind IN $symbol_kinds")
    params["symbol_kinds"] = list(f.symbol_kinds)

Also: return s.kind in the projection so callers can see it without a describe round-trip.

  • Pros: zero breaking change, MCP surface stays at 4, find signature stays at 3 kinds.
  • Cons: NodeFilter grows from 15 keys to 17 (already accepted as a "fat NodeFilter" trade-off in v2).
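
To make the Option A plumbing concrete, here is a minimal, self-contained sketch of how the two predicates and the extra projection column could fit together. It is an illustration, not the code in mcp_v2.py: NodeFilterSketch, symbol_where and find_symbol_query are hypothetical names, a dataclass stands in for the real NodeFilter, and the Cypher text (including the s.microservice property) is assumed from the column list above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class NodeFilterSketch:
    """Hypothetical stand-in for NodeFilter; only the fields used below."""
    microservice: Optional[str] = None
    symbol_kind: Optional[str] = None
    symbol_kinds: Optional[list] = None


def symbol_where(f: NodeFilterSketch) -> tuple:
    """Build (predicates, params) for a Symbol query from the filter."""
    preds: list = []
    params: dict = {}
    if f.microservice:
        # Assumed property name; mirrors the "microservice" result column.
        preds.append("s.microservice = $microservice")
        params["microservice"] = f.microservice
    if f.symbol_kind:
        preds.append("s.kind = $symbol_kind")
        params["symbol_kind"] = f.symbol_kind
    if f.symbol_kinds:
        preds.append("s.kind IN $symbol_kinds")
        params["symbol_kinds"] = list(f.symbol_kinds)
    return preds, params


def find_symbol_query(f: NodeFilterSketch) -> tuple:
    """Return (query, params); the projection now includes s.kind AS kind."""
    preds, params = symbol_where(f)
    parts = ["MATCH (s:Symbol)"]
    if preds:
        parts.append("WHERE " + " AND ".join(preds))
    parts.append(
        "RETURN s.id AS id, s.fqn AS fqn, s.microservice AS microservice, "
        "s.module AS module, s.role AS role, s.kind AS kind"
    )
    return " ".join(parts), params


query, params = find_symbol_query(
    NodeFilterSketch(microservice="chat-core", symbol_kind="interface")
)
print(query)
print(params)
```

When both symbol_kind and symbol_kinds are set, the sketch ANDs the two predicates, which matches the snippet above; whether the real filter should instead reject that combination is left open here.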

Option B — Split find.kind to 5 values

kind: Literal["class","method","route","client","interface"]

Where class covers {class, record, enum, annotation} and method covers {method, constructor}. Routes and clients stay separate (different Kuzu tables, not Symbol nodes).

  • Pros: "give me all controllers" vs "give me all controller methods" becomes a discriminated query. The schema-grounded decoder sees a 5-element Literal and routes the model correctly.
  • Cons: breaking change to the v2 contract just shipped. Every doc/example/test/cursor rule that mentions kind="symbol" flips. class is overloaded (the keyword vs all class-like declarations) — would prefer type or declaration.
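
For comparison, the grouping that Option B implies can be written down as a lookup table. This is only a sketch of the mapping described above: OPTION_B_KINDS and symbol_kinds_for are hypothetical names, and mapping "interface" to the single Symbol.kind value "interface" is inferred from the text rather than stated in it.

```python
from typing import Optional

# All Symbol.kind values listed in the Summary.
ALL_SYMBOL_KINDS = frozenset(
    {"class", "interface", "enum", "record", "annotation", "method", "constructor"}
)

# Option B find.kind value -> Symbol.kind values it would cover.
OPTION_B_KINDS = {
    "class": frozenset({"class", "record", "enum", "annotation"}),
    "interface": frozenset({"interface"}),  # inferred, not stated explicitly
    "method": frozenset({"method", "constructor"}),
}


def symbol_kinds_for(find_kind: str) -> Optional[frozenset]:
    """Return the covered Symbol.kind values, or None for route/client,
    which live in other tables and are not Symbol nodes."""
    return OPTION_B_KINDS.get(find_kind)


# The three Symbol-backed kinds partition the seven Symbol.kind values.
covered = frozenset().union(*OPTION_B_KINDS.values())
print(covered == ALL_SYMBOL_KINDS)
print(symbol_kinds_for("route"))
```

The same groupings are expressible under Option A by passing the right-hand sets as symbol_kinds, which is part of why the split buys little beyond naming.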

Option C — Two-axis find: kind × level

kind: Literal["symbol","route","client"]                       # which table
level: Literal["type","method","field"] | None = None          # granularity within Symbol

  • Pros: orthogonal, clean. kind="symbol" stays as the umbrella; level is the new axis.
  • Cons: weak models will conflate level with the find.kind Literal — adding two Literal parameters increases the surface for "model picks the wrong combination". Trades ambiguity-by-overload for ambiguity-by-combinatorial-misuse.

Recommendation: Option A

The actual pain is "I can't filter by symbol granularity," not "the find.kind enum is wrong." Adding symbol_kind to NodeFilter (plus returning kind in the result projection) solves the gap with minimum disruption.

Estimated cost:

  • ~30 lines in mcp_v2.py (NodeFilter field + _symbol_where_from_filter predicates)
  • 1 line in the find("symbol", ...) result projection (s.kind AS kind)
  • README NodeFilter table updated (15 → 17 keys)
  • 3 tests: filter by symbol_kind="method", by symbol_kind="interface", by symbol_kinds=["class","interface","enum","record","annotation"]

Refinements to bundle in the same PR

  1. symbol_kinds: list[str] | None for IN queries — Java codebases commonly want "all type-level declarations" = ["class","interface","enum","record","annotation"]. Single field for both single-kind and set-membership cases is cleaner than symbol_kind + exclude_symbol_kinds.
  2. Default the find("symbol", ...) projection to include kind. Currently returns id, fqn, microservice, module, role — adding kind is one column. The model can then filter client-side as a stopgap if a future query needs a kind-set the schema doesn't anticipate.
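
As an illustration of the second refinement, once kind is part of each result row a caller could bucket rows client-side without a describe round-trip. The sketch below assumes rows arrive as dictionaries with the projected column names; group_by_kind and the sample rows are made up for the example and are not real query output.

```python
from collections import defaultdict


def group_by_kind(rows):
    """Bucket find("symbol", ...) rows by their projected kind column."""
    grouped = defaultdict(list)
    for row in rows:
        # Rows from an older projection without "kind" land under "unknown".
        grouped[row.get("kind", "unknown")].append(row["fqn"])
    return dict(grouped)


# Hypothetical rows shaped like the extended projection.
sample_rows = [
    {"id": "1", "fqn": "com.example.ChatController", "kind": "class"},
    {"id": "2", "fqn": "com.example.ChatController.send", "kind": "method"},
    {"id": "3", "fqn": "com.example.ChatService", "kind": "interface"},
    {"id": "4", "fqn": "com.example.ChatController.list", "kind": "method"},
]

by_kind = group_by_kind(sample_rows)
print(by_kind["method"])
```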

Suggested rollout

Single Cursor task PR, additive only, no ontology bump, no schema delta. Branch name: feat/nodefilter-symbol-kind. Plan file: plans/PLAN-NODEFILTER-SYMBOL-KIND.md (small).
