[refactor] Semantic function clustering: verified duplicates, outliers, scattered helpers & util reimplementations #36160

@github-actions

🔧 Semantic Function Clustering Analysis

Automated analysis of github/gh-aw — 871 Go source files (excl. _test.go); mostly pkg/workflow (393) & pkg/cli (309). Parallel finder agents clustered functions by purpose; each candidate was then adversarially re-checked against the real code. Build-tag (*_wasm.go) variants, engine-specific log parsers, factory-backed codemods, and dual-format renderers were inspected and rejected.

Summary

| Category | Findings | Top item |
| --- | --- | --- |
| Duplicate functions | 9 | Safe-output parse*Config scaffold (~10 handlers) |
| Outliers (wrong file) | 4 | Concurrency/engine-API logic in notify_comment.go |
| Scattered / util-reimpl | 7 | Dedup reimplemented vs sliceutil |
| Generics | 1 | typeutil.Lookup[T] |
| Near-clone files | 1 | inline_skill_extractor.go vs sub_agent_extractor.go |

Top 3 fixes (high impact, low risk): (1) collapse the two near-identical parser extractor files; (2) adopt existing sliceutil/stringutil/repoutil utils for ~10 local reimplementations; (3) extract the safe-output parse*Config scaffold (absorbs the 12× preprocess wrapper).


1. Duplicate / Near-Duplicate Functions

pkg/workflow — 5

1a. Safe-output parse*Config scaffold copy-pasted ~10× (HIGH): the same 5-step boilerplate (key check → preprocessIntFieldAsString("max") → parseConfigScaffold with a near-identical onError → default-max). Sites: assign_to_user.go:20, unassign_from_user.go:19, assign_to_agent.go:25, add_reviewer.go:21, add_comment.go:28, close_entity_helpers.go:104, comment_memory.go:26 (+ create_issue/discussion/pull_request). Fix: extend parseConfigScaffold (or parseSafeOutputConfigWithMax[T]) to take templatable fields + default-max; the *_entity_helpers.go generics show the target.

1b. Inline []any→[]string loops reimplement parseStringSliceAny (HIGH): parse_helpers.go:70 already does this, yet it is re-coded ~17× inline: tools_parser.go (11×, e.g. :230, :301, :321), comment.go:84, repo_memory.go (2×), safe_outputs_messages_config.go, mcp_config_types.go, claude_tools.go. Fix: call parseStringSliceAny (role_checks.go:186 is the model call site).

1c. toolCallMap upsert duplicated ~6× across engine log parsers (HIGH): claude_logs.go:386, codex_logs.go:130 & 167, copilot_logs.go:102, 394, 435. Fix: recordToolCall(toolCallMap, name, inputSize).

1d. preprocess<Field> + "Invalid X" + return nil wrapper 12× (HIGH): add_comment.go (4×), assign_to_user.go (2×), comment_memory.go (2×), assign_to_agent.go, add_reviewer.go, reply_to_pr_review_comment.go, noop.go. Fix: fold into 1a via preprocessFields(...).

1e. "cap max at 50" idiom 3× (MED): call_workflow.go:57, dispatch_workflow.go:65, dispatch_repository.go:107. Fix: capMax(maxPtr, limit, log).
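
As a rough illustration of the two smallest helpers proposed above (1c and 1e), the sketch below shows one possible shape. The ToolCallInfo struct and its fields are hypothetical stand-ins, and capMax is assumed to operate on a *int with a printf-style logger; the real code may store max differently (for example as a string), so treat this as a sketch under those assumptions rather than the repository's implementation.

```go
package main

import (
	"fmt"
	"log"
)

// ToolCallInfo is a hypothetical stand-in for the per-tool stats record
// kept by the engine log parsers; the real struct may carry more fields.
type ToolCallInfo struct {
	Name         string
	CallCount    int
	MaxInputSize int
}

// recordToolCall upserts one observed call: it creates the entry on first
// sight, otherwise bumps the count and keeps the largest input size seen.
func recordToolCall(toolCallMap map[string]*ToolCallInfo, name string, inputSize int) {
	info, ok := toolCallMap[name]
	if !ok {
		toolCallMap[name] = &ToolCallInfo{Name: name, CallCount: 1, MaxInputSize: inputSize}
		return
	}
	info.CallCount++
	if inputSize > info.MaxInputSize {
		info.MaxInputSize = inputSize
	}
}

// capMax clamps *maxPtr to limit and logs when it does so.
// A nil pointer (no max configured) is left untouched.
// Assumption: max is an int here; the real field type may differ.
func capMax(maxPtr *int, limit int, logf func(format string, args ...any)) {
	if maxPtr == nil || *maxPtr <= limit {
		return
	}
	logf("max %d exceeds limit, capping at %d", *maxPtr, limit)
	*maxPtr = limit
}

func main() {
	calls := map[string]*ToolCallInfo{}
	recordToolCall(calls, "bash", 120)
	recordToolCall(calls, "bash", 80)
	fmt.Println(calls["bash"].CallCount, calls["bash"].MaxInputSize) // 2 120

	configured := 75
	capMax(&configured, 50, log.Printf)
	fmt.Println(configured) // 50
}
```

Passing the logger as a function keeps each call site free to use its own file-scoped logger, which is the main thing the three duplicated sites appear to differ on.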

pkg/cli — 4

2a. Human-comment counting loop reimplemented (HIGH) — fetch issues/%d/comments, skip isBotUser, count: outcome_eval_issue.go:44, outcome_eval_pr.go:107, outcome_eval_comment.go:64 (adds time filter). Fix: countHumanComments(repo, num, after) in outcome_eval.go.

2b. Secret-prompt clones (HIGH): engine_secrets.go:291/342/389 (promptForCopilotPATUnified/...SystemToken.../...GenericAPIKey...) share the same huh password-form → setenv → upload skeleton; they differ only in intro text + Copilot validate. Fix: promptAndStoreSecret(req, config, intro, extraValidate, successMsg).

2c. Two parallel update-check subsystems (MED): update_check.go vs compile_update_check.go. updateLastCheckTime:140 and updateCompileUpdateCheckTime:298 are near-identical (differ by logger + perm); shouldCheckForUpdate and shouldRunCompileUpdateCheck share a throttle core. Fix: shared lastCheckThrottle{file,interval,perm,log}.

2d. gh secret set upload reimplemented (LOW): engine_secrets.go:480 & add_interactive_secrets.go:51. Fix: share setRepoSecret(name,value,repo).


2. Outlier Functions (wrong file)

pkg/workflow is well-decomposed; mismatches cluster in notify_comment.go.

| Function | Now in | Belongs in | Why |
| --- | --- | --- | --- |
| isGroupConcurrencyQueueEnabled/parseGroupConcurrencyQueueFeatureValue :632/644 | notify_comment.go | concurrency.go | concurrency-flag logic; already called from concurrency.go:84 (HIGH) |
| getEngineAPIHosts :764 | notify_comment.go | engine_api_targets.go | engine-API hostname table (HIGH) |
| toEnvVarCase :740 | notify_comment.go | strings.go | generic string transformer (MED) |
| splitShellTokens :231 | gh_cli_permissions.go | shell.go | generic shell tokenizer; single-caller (MED) |

3. Scattered Helpers & Util Reimplementations

The repo ships shared util packages; these are local reimplementations of them (✓ = verified in this run).

7 findings

3a. ✓ Order-preserving dedup vs sliceutil.Deduplicate/MergeUnique (HIGH): claude_tools.go:413 dedupeAllowedTools (verified identical body), docker.go:225 mergeDockerImages (= MergeUnique), central_slash_command_workflow.go:487 uniqueSorted, parser/tools_merger.go:184 mergeAllowedArrays, + inline loops in parser/frontmatter_hash.go:~238, parser/import_field_extractor.go:466.

3b. ✓ Pass-through wrappers over stringutil (HIGH, nuanced): parser/schema_suggestions.go:234/244 FindClosestMatches/LevenshteinDistance are one-line delegations (verified). Fix: import stringutil directly, unless they are retained as a deliberate re-export (maintainer call).

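3c. Truncate-with-ellipsis inline 4× → stringutil.Truncate (MED): audit_report.go:693, 749, 784, audit_report_analysis.go:57. Note the length difference: the inline versions produce N+3 characters (N plus the ellipsis), while Truncate caps the result at N, so a swap changes output length.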

3d. parseRepoSlugLiteral vs repoutil.SplitRepoSlug (MED): dispatch_workflow_validation.go:176; contrast the good cli/engine_secrets.go:694 splitRepoSlug, which delegates.

3e. containsAny reimplemented in 3 packages (MED): cli/audit_agentic_analysis.go:492, errorutil/errors.go:55 containsErrorSubstring, linters/errormessage/errormessage.go:213 containsAnyWholeWord. Fix: add stringutil.ContainsAny(s, subs...).
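
A possible shape for the helper proposed in 3e is sketched below. It is written as a standalone program for illustration; in the repository it would live in the stringutil package. Only the name and variadic signature come from the finding above, and the plain substring semantics are an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// ContainsAny reports whether s contains at least one of subs.
// It uses plain substring matching, so an empty substring matches any
// string (the same behavior as strings.Contains).
func ContainsAny(s string, subs ...string) bool {
	for _, sub := range subs {
		if strings.Contains(s, sub) {
			return true
		}
	}
	return false
}

func main() {
	msg := "request failed: connection refused"
	fmt.Println(ContainsAny(msg, "timeout", "refused")) // true
	fmt.Println(ContainsAny(msg, "timeout", "denied"))  // false
	fmt.Println(ContainsAny(msg))                       // false
}
```

Judging by its name, containsAnyWholeWord matches on word boundaries, so it would likely need its own variant rather than a direct swap to this helper.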


4. Generics Opportunity

typeutil.LookupString/LookupMap (convert.go:150/169) are identical apart from the asserted type → collapse to func Lookup[T any](m map[string]any, key string) (T, bool), keep the two as one-line wrappers (MED; ~3 call sites).
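
The generic collapse could look roughly like this. The Lookup[T] signature is the one given above; the wrapper signatures for LookupString and LookupMap are assumptions, since the existing definitions in convert.go are not quoted in this report.

```go
package main

import "fmt"

// Lookup returns m[key] asserted to T. It reports false when the key is
// missing or the stored value is not a T; reading a nil map is safe.
func Lookup[T any](m map[string]any, key string) (T, bool) {
	v, ok := m[key]
	if !ok {
		var zero T
		return zero, false
	}
	t, ok := v.(T) // on a failed assertion t is the zero value of T
	return t, ok
}

// LookupString and LookupMap stay as one-line wrappers so existing call
// sites keep compiling (assumed signatures).
func LookupString(m map[string]any, key string) (string, bool) {
	return Lookup[string](m, key)
}

func LookupMap(m map[string]any, key string) (map[string]any, bool) {
	return Lookup[map[string]any](m, key)
}

func main() {
	cfg := map[string]any{
		"name":  "gh-aw",
		"max":   3,
		"tools": map[string]any{"bash": true},
	}

	name, ok := LookupString(cfg, "name")
	fmt.Println(name, ok) // gh-aw true

	_, ok = LookupString(cfg, "max") // stored value is an int, not a string
	fmt.Println(ok)                  // false

	tools, ok := LookupMap(cfg, "tools")
	fmt.Println(len(tools), ok) // 1 true

	n, ok := Lookup[int](cfg, "max")
	fmt.Println(n, ok) // 3 true
}
```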


5. Near-Clone Files — highest single win

pkg/parser/inline_skill_extractor.go and sub_agent_extractor.go are structurally line-for-line identical, differing only by marker (skill: vs agent:), result struct (InlineSkill/InlineSubAgent, identical fields), valid-field set, and wording.

Paired functions
| inline_skill_extractor.go | sub_agent_extractor.go |
| --- | --- |
| ValidateInlineSkillsFrontmatter:18 | ValidateInlineSubAgentsFrontmatter:103 |
| ValidateInlineSkillsInBody:28 | ValidateInlineSubAgentsInBody:123 |
| validateInlineSkillFrontmatterFields:44 | validateSubAgentFrontmatterFields:143 |
| GetEngineSkillDir:70 | GetEngineSubAgentDir:178 |
| ExtractInlineSkills:91 | ExtractInlineSubAgents:248 |
| validateUniqueInlineSkillNames:116 | validateUniqueSubAgentNames:273 |
| extractInlineSkill:137 | extractInlineSubAgent:294 |
| collectInlineSkillH2Positions:129 + regex:89 | collectH2Positions:286 + regex:236 |

nextInlineSkillH2After:148/nextH2After:305 are byte-identical; both H2 regexes are (?m)^##[ \t].

Fix: one parameterized extractInlineSections(markdown, spec) + validateInlineSections(body, spec) driven by a small inlineSectionSpec{kind, sepRegex, validFields, validFieldList}; collapse InlineSkill/InlineSubAgent → InlineSection (or thin aliases), and GetEngine*Dir → engineConfigDir(engineID, subdir); keep one h2HeadingRegex + collectH2Positions.


Priority-Ordered Recommendations

P1 (high impact, mechanical): collapse the parser extractor files (§5); swap local reimplementations for sliceutil/stringutil/repoutil (§3a,3c,3d); extract the parse*Config scaffold + preprocess wrapper (§1a,1d).
P2: relocate the 4 outliers (§2, start with the cross-file-consumed pair); extract countHumanComments/promptAndStoreSecret/recordToolCall (§2a,2b,1c); add stringutil.ContainsAny + []any→[]string adoption (§3e,1b).
P3: unify update-check subsystems (§2c); add typeutil.Lookup[T] (§4); decide on the stringutil re-export wrappers (§3b).

Checklist

  • Collapse parser inline-skill/sub-agent extractors
  • Adopt sliceutil/stringutil/repoutil utilities at reimpl sites
  • Extract safe-output config-parse scaffold + preprocess wrapper
  • Relocate outlier functions; update imports
  • Extract countHumanComments/promptAndStoreSecret/recordToolCall
  • go build ./... + full tests to confirm no behavior change

Metadata

871 Go files analyzed (excl. _test.go) — pkg/workflow 393, pkg/cli 309, pkg/parser 42, pkg/console 26. 22 confirmed findings (9 dup · 4 outlier · 7 scattered/util · 1 generics · 1 near-clone). Method: parallel semantic finders → adversarial per-finding verification → Serena/gopls + naming clustering. Date 2026-06-01.

References: §26728321552

Generated by 🔧 Semantic Function Refactoring · opus48 9.4M ·

  • expires on Jun 3, 2026, 12:17 AM UTC
