🔧 Semantic Function Clustering Analysis
Automated analysis of github/gh-aw — 871 Go source files (excl. _test.go); mostly pkg/workflow (393) & pkg/cli (309). Parallel finder agents clustered functions by purpose; each candidate was then adversarially re-checked against the real code. Build-tag (*_wasm.go) variants, engine-specific log parsers, factory-backed codemods, and dual-format renderers were inspected and rejected.
Summary
| Category | Findings | Top item |
|---|---|---|
| Duplicate functions | 9 | Safe-output parse*Config scaffold (~10 handlers) |
| Outliers (wrong file) | 4 | Concurrency/engine-API logic in notify_comment.go |
| Scattered / util-reimpl | 7 | Dedup reimplemented vs sliceutil |
| Generics | 1 | typeutil.Lookup[T] |
| Near-clone files | 1 | inline_skill_extractor.go ↔ sub_agent_extractor.go |
Top 3 fixes (high impact, low risk): (1) collapse the two near-identical parser extractor files; (2) adopt existing sliceutil/stringutil/repoutil utils for ~10 local reimplementations; (3) extract the safe-output parse*Config scaffold (absorbs the 12× preprocess wrapper).
1. Duplicate / Near-Duplicate Functions
pkg/workflow — 5
1a. Safe-output parse*Config scaffold copy-pasted ~10× (HIGH) — same 5-step boilerplate (key check → preprocessIntFieldAsString("max") → parseConfigScaffold w/ near-identical onError → default-max). Sites: assign_to_user.go:20, unassign_from_user.go:19, assign_to_agent.go:25, add_reviewer.go:21, add_comment.go:28, close_entity_helpers.go:104, comment_memory.go:26 (+ create_issue/discussion/pull_request). Fix: extend parseConfigScaffold (or parseSafeOutputConfigWithMax[T]) to take templatable fields + default-max; the *_entity_helpers.go generics show the target.
1b. Inline []any→[]string loops reimplement parseStringSliceAny (HIGH) — parse_helpers.go:70 already does it, re-coded ~17× inline: tools_parser.go (11×, e.g. :230,:301,:321), comment.go:84, repo_memory.go (2×), safe_outputs_messages_config.go, mcp_config_types.go, claude_tools.go. Fix: call parseStringSliceAny (see model role_checks.go:186).
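For reference, the conversion that the inline loops in 1b perform can be sketched as below. This is a minimal illustration, not the body of parseStringSliceAny: the name toStringSlice is hypothetical, and the choice to silently skip non-string elements is an assumption that should be checked against parse_helpers.go:70 before any call site is migrated.

```go
package main

import "fmt"

// toStringSlice is a hypothetical stand-in for a shared []any -> []string helper.
// Assumption: non-string elements are skipped rather than reported as errors.
func toStringSlice(raw any) []string {
	items, ok := raw.([]any)
	if !ok {
		return nil // not a list at all
	}
	out := make([]string, 0, len(items))
	for _, item := range items {
		if s, ok := item.(string); ok {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	// Values decoded from YAML/JSON frontmatter typically arrive as []any.
	var allowed any = []any{"bash", 42, "edit"}
	fmt.Println(toStringSlice(allowed))
}
```

If any inline site treats a non-string element as an error instead of skipping it, that site is not a drop-in replacement and needs its own review.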
1c. toolCallMap upsert duplicated ~6× across engine log parsers (HIGH) — claude_logs.go:386, codex_logs.go:130&167, copilot_logs.go:102,394,435. Fix: recordToolCall(toolCallMap, name, inputSize).
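The upsert in 1c could be factored roughly as follows. The toolCallInfo struct and its fields are assumptions made for illustration; the real value type stored in toolCallMap may track different statistics, and only the insert-or-update shape is the point here.

```go
package main

import "fmt"

// toolCallInfo is a hypothetical stand-in for the per-tool stats kept in toolCallMap.
type toolCallInfo struct {
	Name         string
	CallCount    int
	MaxInputSize int
}

// recordToolCall inserts a new entry on first sight of a tool; otherwise it
// bumps the call count and keeps the largest input size seen so far.
func recordToolCall(toolCallMap map[string]*toolCallInfo, name string, inputSize int) {
	info, ok := toolCallMap[name]
	if !ok {
		toolCallMap[name] = &toolCallInfo{Name: name, CallCount: 1, MaxInputSize: inputSize}
		return
	}
	info.CallCount++
	if inputSize > info.MaxInputSize {
		info.MaxInputSize = inputSize
	}
}

func main() {
	calls := map[string]*toolCallInfo{}
	recordToolCall(calls, "bash", 120)
	recordToolCall(calls, "bash", 80)
	recordToolCall(calls, "edit", 300)
	// prints: 2 120 1
	fmt.Println(calls["bash"].CallCount, calls["bash"].MaxInputSize, calls["edit"].CallCount)
}
```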
1d. preprocess<Field>+"Invalid X"+return nil wrapper 12× (HIGH) — add_comment.go (4×), assign_to_user.go (2×), comment_memory.go (2×), assign_to_agent.go, add_reviewer.go, reply_to_pr_review_comment.go, noop.go. Fix: fold into 1a via preprocessFields(...).
1e. "cap max at 50" idiom 3× (MED) — call_workflow.go:57, dispatch_workflow.go:65, dispatch_repository.go:107. Fix: capMax(maxPtr, limit, log).
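A possible shape for the capMax helper proposed in 1e is sketched here. The parameter types are assumptions: the report only names capMax(maxPtr, limit, log), so the *int pointer and the printf-style logger are guesses about what the three call sites pass.

```go
package main

import (
	"fmt"
	"log"
)

// capMax clamps *maxPtr to limit and logs when it does so.
// Assumptions: max is held as *int, and the logger is printf-style.
func capMax(maxPtr *int, limit int, logf func(format string, args ...any)) {
	if maxPtr == nil || *maxPtr <= limit {
		return // unset or already within bounds
	}
	logf("max %d exceeds limit %d; capping", *maxPtr, limit)
	*maxPtr = limit
}

func main() {
	requested := 200
	capMax(&requested, 50, log.Printf)
	fmt.Println(requested)
}
```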
pkg/cli — 4
2a. Human-comment counting loop reimplemented (HIGH) — fetch issues/%d/comments, skip isBotUser, count: outcome_eval_issue.go:44, outcome_eval_pr.go:107, outcome_eval_comment.go:64 (adds time filter). Fix: countHumanComments(repo, num, after) in outcome_eval.go.
2b. Secret-prompt clones (HIGH) — engine_secrets.go:291/342/389 (promptForCopilotPATUnified/...SystemToken.../...GenericAPIKey...) share the same huh password-form → setenv → upload skeleton; differ only in intro text + Copilot validate. Fix: promptAndStoreSecret(req, config, intro, extraValidate, successMsg).
2c. Two parallel update-check subsystems (MED) — update_check.go vs compile_update_check.go: updateLastCheckTime:140/updateCompileUpdateCheckTime:298 near-identical (differ by logger+perm); shouldCheckForUpdate/shouldRunCompileUpdateCheck share throttle core. Fix: shared lastCheckThrottle{file,interval,perm,log}.
2d. gh secret set upload reimplemented (LOW) — engine_secrets.go:480 & add_interactive_secrets.go:51. Fix: share setRepoSecret(name,value,repo).
2. Outlier Functions (wrong file)
pkg/workflow is well-decomposed; mismatches cluster in notify_comment.go.
| Function | Now in | Belongs in | Why |
|---|---|---|---|
| isGroupConcurrencyQueueEnabled/parseGroupConcurrencyQueueFeatureValue :632/644 | notify_comment.go | concurrency.go | concurrency-flag logic; already called from concurrency.go:84 (HIGH) |
| getEngineAPIHosts :764 | notify_comment.go | engine_api_targets.go | engine-API hostname table (HIGH) |
| toEnvVarCase :740 | notify_comment.go | strings.go | generic string transformer (MED) |
| splitShellTokens :231 | gh_cli_permissions.go | shell.go | generic shell tokenizer; single-caller (MED) |
3. Scattered Helpers & Util Reimplementations
The repo ships shared util packages; these are local reimplementations of them (✓ = verified in this run).
7 findings
3a. ✓ Order-preserving dedup vs sliceutil.Deduplicate/MergeUnique (HIGH) — claude_tools.go:413 dedupeAllowedTools (verified identical body), docker.go:225 mergeDockerImages (=MergeUnique), central_slash_command_workflow.go:487 uniqueSorted, parser/tools_merger.go:184 mergeAllowedArrays, + inline loops in parser/frontmatter_hash.go:~238, parser/import_field_extractor.go:466.
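The pattern behind 3a, order-preserving deduplication, is sketched below for readers comparing the local copies against the shared package. The functions dedupe and mergeUnique are hypothetical illustrations written for this report; they are not the sliceutil implementations, whose exact signatures should be read from the package itself.

```go
package main

import "fmt"

// dedupe returns items with later duplicates removed, keeping first-seen order.
func dedupe[T comparable](items []T) []T {
	seen := make(map[T]struct{}, len(items))
	out := make([]T, 0, len(items))
	for _, it := range items {
		if _, dup := seen[it]; dup {
			continue
		}
		seen[it] = struct{}{}
		out = append(out, it)
	}
	return out
}

// mergeUnique concatenates the lists in order and deduplicates the result.
func mergeUnique[T comparable](lists ...[]T) []T {
	var all []T
	for _, l := range lists {
		all = append(all, l...)
	}
	return dedupe(all)
}

func main() {
	fmt.Println(dedupe([]string{"b", "a", "b", "c", "a"}))
	fmt.Println(mergeUnique([]string{"alpine", "node"}, []string{"node", "golang"}))
}
```

Note that uniqueSorted, judging by its name, also sorts its output, so it would need a sort after the shared dedup rather than a plain swap.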
3b. ✓ Pass-through wrappers over stringutil (HIGH, nuanced) — parser/schema_suggestions.go:234/244 FindClosestMatches/LevenshteinDistance are one-line delegations (verified). Fix: import stringutil directly — unless retained as a deliberate re-export (maintainer call).
3c. Truncate-with-ellipsis inline 4× → stringutil.Truncate (MED) — audit_report.go:693,749,784, audit_report_analysis.go:57 (note: inline len = N+3 vs Truncate cap-at-N).
3d. parseRepoSlugLiteral vs repoutil.SplitRepoSlug (MED) — dispatch_workflow_validation.go:176; contrast the good cli/engine_secrets.go:694 splitRepoSlug which delegates.
3e. containsAny reimplemented in 3 packages (MED) — cli/audit_agentic_analysis.go:492, errorutil/errors.go:55 containsErrorSubstring, linters/errormessage/errormessage.go:213 containsAnyWholeWord. Fix: add stringutil.ContainsAny(s, subs...).
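The proposed stringutil.ContainsAny in 3e could be as small as the sketch below. It is written in package main only so that it runs standalone; the plain-substring semantics are an assumption about what the three local copies have in common.

```go
package main

import (
	"fmt"
	"strings"
)

// ContainsAny reports whether s contains at least one of subs as a substring.
func ContainsAny(s string, subs ...string) bool {
	for _, sub := range subs {
		if strings.Contains(s, sub) {
			return true
		}
	}
	return false
}

func main() {
	msg := "dial tcp: connection refused"
	fmt.Println(ContainsAny(msg, "timeout", "connection refused"))
	fmt.Println(ContainsAny(msg, "rate limit"))
}
```

Judging by its name, containsAnyWholeWord matches on word boundaries, so it may need to stay separate or become a second helper rather than a caller of this one.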
4. Generics Opportunity
typeutil.LookupString/LookupMap (convert.go:150/169) are identical apart from the asserted type → collapse to func Lookup[T any](m map[string]any, key string) (T, bool), keep the two as one-line wrappers (MED; ~3 call sites).
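The collapse described above can be sketched as follows. The Lookup signature is the one given in the report; the wrapper signatures are assumptions about what convert.go:150/169 currently export.

```go
package main

import "fmt"

// Lookup returns m[key] asserted to T; ok is false when the key is
// missing or the stored value has a different dynamic type.
func Lookup[T any](m map[string]any, key string) (T, bool) {
	v, ok := m[key].(T)
	return v, ok
}

// LookupString and LookupMap stay as one-line wrappers so existing callers compile unchanged.
func LookupString(m map[string]any, key string) (string, bool) {
	return Lookup[string](m, key)
}

func LookupMap(m map[string]any, key string) (map[string]any, bool) {
	return Lookup[map[string]any](m, key)
}

func main() {
	cfg := map[string]any{
		"name": "ci",
		"on":   map[string]any{"push": true},
		"max":  3,
	}
	name, _ := LookupString(cfg, "name")
	on, _ := LookupMap(cfg, "on")
	max, _ := Lookup[int](cfg, "max")
	fmt.Println(name, on, max)
}
```

The comma-ok assertion also covers a nil map and a missing key, since indexing yields a nil interface in both cases and the assertion then fails cleanly.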
5. Near-Clone Files — highest single win
pkg/parser/inline_skill_extractor.go ↔ sub_agent_extractor.go are structurally line-for-line identical, differing only by marker (skill: vs agent:), result struct (InlineSkill/InlineSubAgent, identical fields), valid-field set, and wording.
Paired functions
| inline_skill_extractor.go | sub_agent_extractor.go |
|---|---|
| ValidateInlineSkillsFrontmatter:18 | ValidateInlineSubAgentsFrontmatter:103 |
| ValidateInlineSkillsInBody:28 | ValidateInlineSubAgentsInBody:123 |
| validateInlineSkillFrontmatterFields:44 | validateSubAgentFrontmatterFields:143 |
| GetEngineSkillDir:70 | GetEngineSubAgentDir:178 |
| ExtractInlineSkills:91 | ExtractInlineSubAgents:248 |
| validateUniqueInlineSkillNames:116 | validateUniqueSubAgentNames:273 |
| extractInlineSkill:137 | extractInlineSubAgent:294 |
| collectInlineSkillH2Positions:129+regex:89 | collectH2Positions:286+regex:236 |
nextInlineSkillH2After:148/nextH2After:305 are byte-identical; both H2 regexes are (?m)^##[ \t].
Fix: one parameterized extractInlineSections(markdown, spec) + validateInlineSections(body, spec) driven by a small inlineSectionSpec{kind, sepRegex, validFields, validFieldList}; collapse InlineSkill/InlineSubAgent→InlineSection (or thin aliases), and GetEngine*Dir→engineConfigDir(engineID, subdir); keep one h2HeadingRegex+collectH2Positions.
Priority-Ordered Recommendations
P1 (high impact, mechanical): collapse the parser extractor files (§5); swap local reimplementations for sliceutil/stringutil/repoutil (§3a,3c,3d); extract the parse*Config scaffold + preprocess wrapper (§1a,1d).
P2: relocate the 4 outliers (§2, start with the cross-file-consumed pair); extract countHumanComments/promptAndStoreSecret/recordToolCall (§2a,2b,1c); add stringutil.ContainsAny + []any→[]string adoption (§3e,1b).
P3: unify update-check subsystems (§2c); add typeutil.Lookup[T] (§4); decide on the stringutil re-export wrappers (§3b).
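Checklist
- [ ] Adopt sliceutil/stringutil/repoutil utilities at reimpl sites
- [ ] Extract countHumanComments/promptAndStoreSecret/recordToolCall
- [ ] Run go build ./... + full tests to confirm no behavior change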
Metadata
871 Go files analyzed (excl. _test.go) — pkg/workflow 393, pkg/cli 309, pkg/parser 42, pkg/console 26. 22 confirmed findings (9 dup · 4 outlier · 7 scattered/util · 1 generics · 1 near-clone). Method: parallel semantic finders → adversarial per-finding verification → Serena/gopls + naming clustering. Date 2026-06-01.
References: §26728321552
Generated by 🔧 Semantic Function Refactoring · opus48 9.4M