[aw-failures] [aw] Failure Investigator &mdash; 6h Review (2026-06-12 08:14 UTC) #38807

### Executive summary

**Fix the daily AIC (AI-credits) guardrail experience first: AI-credit consumption is behind 3 of the 7 in-window failures and is the dominant red-build source.** Two workflows hard-failed at `activation` because the daily AIC guardrail tripped, and a third (`Code Simplifier`) burned **12.3M tokens / 4,219 AIC / 244 turns** in a single run before the `agent` job crashed. A tripped guardrail is the system working as designed, but it surfaces as an opaque workflow failure. That problem is already tracked by #38624 and #38645, so no new tracking is needed there. The one genuinely uncovered fix is the `Code Simplifier` runaway (see sub-issue #38809).

**Note on data freshness:** the deterministic pre-fetch payload (`prefetch.json`, generated 2026-06-12 08:14 UTC) listed `failed_run_ids: []` and `failures: []`, missing all 7 in-window failures. Findings below were recovered by querying `gh run list --status failure` directly. The empty prefetch is itself a reliability gap worth fixing.
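
The recovery path described above could be sketched roughly as follows. This is a minimal illustration, not the workflow's actual code: the function names `prefetch_is_empty` and `recover_failed_runs` and the injectable `runner` are hypothetical, and only the `failed_run_ids` / `failures` keys and the `gh run list --status failure` command come from this report. Filtering the results to the 6h window is left out.

```python
import json
import subprocess
from typing import Callable, Sequence


def prefetch_is_empty(payload: dict) -> bool:
    """True when the prefetch payload reports no failures at all."""
    return not payload.get("failed_run_ids") and not payload.get("failures")


def _run_gh(args: Sequence[str]) -> str:
    # Shells out to the GitHub CLI; requires `gh` to be installed and authenticated.
    result = subprocess.run(
        ["gh", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def recover_failed_runs(
    payload: dict,
    runner: Callable[[Sequence[str]], str] = _run_gh,
) -> list[int]:
    """Return failed run IDs, falling back to a live query if prefetch is empty."""
    if not prefetch_is_empty(payload):
        return [int(run_id) for run_id in payload.get("failed_run_ids", [])]
    # Fallback: ask the GitHub CLI directly for failed runs as JSON.
    out = runner([
        "run", "list", "--status", "failure", "--limit", "100",
        "--json", "databaseId,workflowName,createdAt",
    ])
    return [run["databaseId"] for run in json.loads(out)]
```

Injecting `runner` keeps the fallback testable without network access, and an empty prefetch becomes a detectable condition rather than a silent "no failures" result.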

### Failure cluster table

| Cluster | Workflow(s) | Run ID(s) | Failing job | Root cause | Priority | Coverage |
|---|---|---|---|---|---|---|
| A &mdash; Daily AIC guardrail | PR Code Quality Reviewer, Test Quality Sentinel | [27393233737](https://github.com/github/gh-aw/actions/runs/27393233737), [27393233765](https://github.com/github/gh-aw/actions/runs/27393233765) | `activation` | Daily AIC guardrail exceeded (5959.4/5000; 5043.0/5000) | P1 | Tracked: #38624, #38645 |
| B &mdash; Agent runaway | Code Simplifier | [27395179213](https://github.com/github/gh-aw/actions/runs/27395179213) | `agent` | Agent hard-fail after 12.3M tok / 4,219 AIC / 244 turns / 32.8m | P1 | **Uncovered &rarr; sub-issue** |
| C &mdash; CI lint (out of scope) | CGO | [27393478310](https://github.com/github/gh-aw/actions/runs/27393478310), [27393440909](https://github.com/github/gh-aw/actions/runs/27393440909) | `lint-go` | Go lint exit code 1 &mdash; non-agentic CI pipeline | P2 | No action (out of scope) |
| D &mdash; Post-job push | Daily Safe Outputs Git Simulator | [27397597917](https://github.com/github/gh-aw/actions/runs/27397597917) | `push_repo_memory` | Agent succeeded (4/4 sim configs PASS); memory-push post-job failed | P2 | Monitor (single/transient) |

### Evidence

<details>
<summary>Cluster A &mdash; AIC guardrail (activation hard-fails)</summary>

Both runs failed in the `activation` job (agent never ran; `detection`/`safe_outputs` skipped):

- PR Code Quality Reviewer: `##[error]Daily workflow AIC guardrail exceeded for PR Code Quality Reviewer: 5959.44416/5000.`
- Test Quality Sentinel: `##[error]Daily workflow AIC guardrail exceeded for Test Quality Sentinel: 5043.04317/5000.`

The guardrail behaves correctly, but a tripped guardrail renders as a generic red failure with no distinct status. This is the recurring theme behind #38624 (raise cap) and #38645 (soft pre-cap guard).
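
One way to give a tripped guardrail a distinct status is to recognize its log line explicitly. The sketch below parses the two error lines quoted above; the regular expression is inferred from those two samples only, and `GuardrailTrip` / `parse_guardrail_trip` are hypothetical names, not part of gh-aw.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the two log lines in this report (an assumption,
# not a documented format).
GUARDRAIL_RE = re.compile(
    r"Daily workflow AIC guardrail exceeded for (?P<workflow>.+): "
    r"(?P<used>\d+(?:\.\d+)?)/(?P<cap>\d+(?:\.\d+)?)\.?$"
)


class GuardrailTrip(NamedTuple):
    workflow: str
    used: float
    cap: float

    @property
    def overage(self) -> float:
        """AIC consumed beyond the daily cap."""
        return self.used - self.cap


def parse_guardrail_trip(line: str) -> Optional[GuardrailTrip]:
    """Return a GuardrailTrip if the log line is a guardrail error, else None."""
    match = GUARDRAIL_RE.search(line.strip())
    if match is None:
        return None
    return GuardrailTrip(
        workflow=match["workflow"],
        used=float(match["used"]),
        cap=float(match["cap"]),
    )
```

A classifier built on this could label such runs as "budget exhausted" instead of a generic failure.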

</details>

<details>
<summary>Cluster B &mdash; Code Simplifier runaway</summary>

- Engine: GitHub Copilot CLI 1.0.60, model `claude-sonnet-4.6`, scheduled trigger.
- Metrics: 12,306,086 tokens, 4,219.8 AIC, 244 turns, 32.8m wall time, 0 write actions (read-only posture), 670 firewall requests / 0 blocked.
- `agent` job hard-failed; no structured error captured (the process terminated). A single run consumed ~84% of the 5000 daily AIC budget.
- Audit flagged: Resource Heavy For Domain (high), Many Iterations (244 turns), ~50% reducible to deterministic steps.

This is distinct from Cluster A: it is a per-PR/scheduled code-fix workflow with a turn/token runaway, not a tripped guardrail. See sub-issue for the proposed fix.
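
For reference, the per-run figures above work out as follows. The helper name `run_cost_summary` is illustrative; the inputs are the metrics reported for this run.

```python
def run_cost_summary(tokens: int, aic: float, turns: int, daily_cap: float) -> dict:
    """Derive budget share and per-turn averages from a run's raw metrics."""
    return {
        "budget_share": aic / daily_cap,
        "headroom_aic": daily_cap - aic,
        "tokens_per_turn": tokens / turns,
        "aic_per_turn": aic / turns,
    }


summary = run_cost_summary(tokens=12_306_086, aic=4_219.8, turns=244, daily_cap=5_000)
print(f"{summary['budget_share']:.1%} of the daily cap")       # 84.4% of the daily cap
print(f"{summary['headroom_aic']:.1f} AIC left for the day")   # 780.2 AIC left for the day
print(f"{summary['tokens_per_turn']:,.0f} tokens per turn")    # 50,435 tokens per turn
print(f"{summary['aic_per_turn']:.2f} AIC per turn")           # 17.29 AIC per turn
```

At roughly 17 AIC per turn, the 780.2 AIC of remaining headroom would have covered only about 45 more turns at the same rate.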

</details>

<details>
<summary>Cluster C &mdash; CGO lint-go (out of scope)</summary>

CGO is the project's main CI pipeline (validate-yaml, test, build, security scans, lint-go, etc.), not an agentic workflow. Both failures were `lint-go` exiting 1; all other ~24 jobs passed. Excluded from agentic-workflow remediation; flagged here only for completeness.

</details>

<details>
<summary>Cluster D &mdash; Git Simulator push_repo_memory</summary>

Run [27397597917](https://github.com/github/gh-aw/actions/runs/27397597917): `activation`, `agent`, `detection`, `safe_outputs`, `conclusion` all succeeded &mdash; the agent reported 4/4 simulator configs PASS and correctly emitted `noop`. Only the `push_repo_memory` post-job failed (36s). Single occurrence; likely a transient push/concurrency issue. Monitoring rather than tracking.

</details>

<details>
<summary>audit-diff &mdash; no behavioral regressions</summary>

Pairwise `audit-diff` across the failed runs showed `has_anomalies: false` and `anomaly_count: 0` for every pair. The only firewall deltas were expected engine differences (`api.anthropic.com` vs `api.githubcopilot.com` + `sentry.io`). No firewall, MCP, or tooling regression is implicated in any cluster.

</details>

### Existing issue correlation

- **#38624** (`ai-credits`) &mdash; Raise max-ai-credits for Failure Investigator: directly covers the Cluster A guardrail theme.
- **#38645** (`deep-report`) &mdash; Add a soft pre-cap AI-credits guard to heavy aggregator workflows: covers both Cluster A (graceful pre-cap) and the prevention angle of Cluster B.
- **#38767** (`agentic-workflows`) &mdash; Closed by this run as a stale transient self-report.
- #38460, #38609, #29109 &mdash; reliability/audit/meta issues; no direct match to in-window failures.

No duplicate tracking is created for Clusters A/C/D.

### Fix roadmap

- **P0:** None. No P0 failure lacks tracking coverage.
- **P1:**
  - Fix the `Code Simplifier` turn/token runaway (Cluster B) &mdash; see sub-issue below.
  - Land the soft pre-cap AIC guard (#38645) so guardrail trips degrade gracefully instead of hard-failing `activation` (Cluster A).
- **P2:**
  - Fix the empty deterministic prefetch payload so this workflow does not depend on live `gh` recovery.
  - Monitor the Git Simulator `push_repo_memory` post-job failure (Cluster D); track only if it recurs.
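
To make the "degrade gracefully" idea concrete, a soft pre-cap check might look like the sketch below. This is not the design proposed in #38645: the 90% soft threshold, the `estimated_run_aic` parameter, and the names `GuardDecision` and `precap_guard` are all assumptions for illustration.

```python
from enum import Enum


class GuardDecision(Enum):
    PROCEED = "proceed"
    SKIP_NEUTRAL = "skip_neutral"  # end the run with a neutral status, not a failure


def precap_guard(
    used_aic: float,
    daily_cap: float,
    estimated_run_aic: float = 0.0,
    soft_ratio: float = 0.9,  # assumed threshold, not taken from #38645
) -> GuardDecision:
    """Decide before `activation` whether the run should start at all."""
    if daily_cap <= 0:
        raise ValueError("daily_cap must be positive")
    projected = used_aic + estimated_run_aic
    if projected >= soft_ratio * daily_cap:
        return GuardDecision.SKIP_NEUTRAL
    return GuardDecision.PROCEED
```

With the Cluster A numbers, `precap_guard(5959.44416, 5000)` returns `SKIP_NEUTRAL`, so the run would be skipped cleanly rather than failing in `activation`.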

### Sub-issues created

- #38809 &mdash; Fix Code Simplifier agent runaway (244 turns / 12.3M tokens / 4,219 AIC).

**References:**
- [&sect;27395179213](https://github.com/github/gh-aw/actions/runs/27395179213)
- [&sect;27393233737](https://github.com/github/gh-aw/actions/runs/27393233737)
- [&sect;27397597917](https://github.com/github/gh-aw/actions/runs/27397597917)

---

### 6h Review addendum &mdash; 2026-06-12 19:25 UTC

**Prefetch was empty again** (`failed_run_ids: []`, `failures: []`) despite **33 failures** in the window &mdash; recovered via `gh run list --status failure`. Same reliability gap this report already flagged.

New uncovered fixes filed as sub-issues of this report:
- **Documentation Unbloat &mdash; Git LFS / `build:slides`** (#aw_lfs1): checkout lacks `lfs: true`, slides PDF is an LFS pointer, docs build exits 1 before the agent runs. Latent in `technical-doc-writer`, `update-astro`, `visual-regression-checker`. Deterministic regression vs 2026-06-11 baseline.
- **Copilot SDK-driver tool-denial runaway** (#aw_deny1): `Daily Formal Spec Verifier` + `Breaking Change Checker` hit `guard.tool_denials_exceeded` (mis-scoped allowlist), burning up to 341 AIC before hard-failing.
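
For the Git LFS case, a pre-flight check could catch an unresolved pointer before the docs build runs. The sketch below relies on the Git LFS pointer format, in which a pointer file is a small text file whose first line is `version https://git-lfs.github.com/spec/v1`. The function names are hypothetical, and the slides PDF path is not given in this report, so none is assumed here.

```python
from pathlib import Path

LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"
LFS_POINTER_MAX_BYTES = 1024  # pointer files are required to be smaller than this


def is_lfs_pointer(data: bytes) -> bool:
    """True if the bytes look like a Git LFS pointer rather than real content."""
    return len(data) < LFS_POINTER_MAX_BYTES and data.startswith(LFS_POINTER_PREFIX)


def file_is_lfs_pointer(path: Path) -> bool:
    """Read at most LFS_POINTER_MAX_BYTES from the file and test it."""
    with path.open("rb") as handle:
        return is_lfs_pointer(handle.read(LFS_POINTER_MAX_BYTES))
```

A build step could call `file_is_lfs_pointer` on the slides PDF and fail early with a message pointing at the missing `lfs: true` checkout option, instead of letting `build:slides` exit 1 without context.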

Observed but not filed (lower priority / already covered):
- **Cluster A still active** &mdash; `PR Code Quality Reviewer` (&times;10) + `Matt Pocock Skills Reviewer` (&times;7) failed at `activation` on the daily AIC guardrail (e.g. 5934.9/5000, 5169.0/5000), turns=0. Already tracked by #38624 / #38645; volume is PR re-trigger churn.
- **Daily Issues Report Generator** ([27425935620](https://github.com/github/gh-aw/actions/runs/27425935620)) &mdash; exit 127, Node.js missing in the AWF chroot while launching the experimental Python SDK driver. Experimental-driver gap.
- **Test Quality Sentinel** ([27420645901](https://github.com/github/gh-aw/actions/runs/27420645901)) &mdash; agent succeeded (26 turns, valid `add_comment`), but the `safe_outputs` job failed to post the comment because the GitHub REST API returned a 403 installation rate limit. Transient; consider retry/backoff on safe-output posting.
- **#38809 (Code Simplifier runaway)** &mdash; fix appears merged as commit `4d9c6ac` (PR #38851, "Cap Code Simplifier runaways..."); no in-window Code Simplifier failure observed. Recommend verifying on the next scheduled run before closing.
- Out of scope: CGO (&times;3), CI (&times;2), Copilot cloud agent (&times;3, non-gh-aw), `Daily Credit Limit Test (Intentionally Broken)`. `Super Linter Report` and `Daily Agent of the Day Blog Writer` failures were not individually root-caused this pass.
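
The retry/backoff suggestion for safe-output posting could take a shape like the following. This is a generic exponential-backoff sketch, not gh-aw code: `RateLimited`, `post_with_backoff`, and the default attempt count and delays are assumptions, and the caller is expected to raise `RateLimited` when the API responds with a 403 rate limit.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Raised by the caller's `post` function on a rate-limit response."""


def post_with_backoff(
    post: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `post`, retrying on RateLimited with exponentially growing delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return post()
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure
            sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s, ...
    raise AssertionError("unreachable")
```

Only rate-limit errors are retried, so a genuine permission failure still fails fast. A production version would likely also honor any retry delay the API returns.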

**References:** [&sect;27431567219](https://github.com/github/gh-aw/actions/runs/27431567219), [&sect;27428819181](https://github.com/github/gh-aw/actions/runs/27428819181), [&sect;27420645901](https://github.com/github/gh-aw/actions/runs/27420645901)

> Generated by [&#128269; [aw] Failure Investigator (6h)](https://github.com/github/gh-aw/actions/runs/27438027894) &middot; 400 AIC &middot; &#8982; 14.2 AIC &middot; &#8862; 5.1K &middot; [&#9719;](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Faw-failure-investigator%22&type=issues)
