feat: require operator-authored justification for dangerously-disable-sandbox-agent#38325
Conversation
… trust boundary in diagnostics Co-authored-by: pelikhan <[email protected]>
There was a problem hiding this comment.
Pull request overview
Tightens compilation-time controls around sandbox.agent: false by requiring an explicit, operator-authored justification string (and preserving it on the parsed workflow struct) so disabling the agent sandbox is always a reviewable, intentional trust-boundary change.
Changes:
- Enforce
features.dangerously-disable-sandbox-agentas a 20+ char literal string justification (no booleans / no${{ }}expressions) and store it onAgentSandboxConfig.DisableReason. - Remove the codemod behavior that silently injected a generic justification during
network.firewallmigration; migrated workflows now require an operator-provided reason. - Add unit tests asserting rejection/acceptance cases and that validation diagnostics include “trust boundary”.
Show a summary per file
| File | Description |
|---|---|
| pkg/workflow/sandbox.go | Adds DisableReason field to retain validated operator justification for downstream diagnostics/audit. |
| pkg/workflow/sandbox_validation.go | Updates sandbox-disable validation messaging and stores the validated justification on the config. |
| pkg/workflow/sandbox_validation_test.go | Adds tests for justification validation, trust-boundary wording, and struct storage. |
| pkg/cli/codemod_network_firewall.go | Removes silent insertion of a generic sandbox-disable justification during firewall migration. |
| pkg/cli/codemod_network_firewall_test.go | Updates codemod assertions to ensure no justification is silently injected and existing ones remain intact. |
| docs/src/content/docs/reference/sandbox.md | Documents the required operator-authored justification and the values the compiler rejects. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 6/6 changed files
- Comments generated: 1
| @@ -22,7 +22,9 @@ Configure the coding agent sandbox type to control how the AI engine is isolated | |||
| sandbox: | |||
| agent: awf | |||
|
|
|||
|
🧪 Test Quality Sentinel completed test quality analysis. |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. |
|
🧠 Matt Pocock Skills Reviewer was skipped during the skills-based review. |
|
✅ PR Code Quality Reviewer completed the code quality review. |
Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
🏗️ Design Decision Gate — ADR RequiredThis PR makes significant changes to core business logic (>100 new lines in 📄 Draft ADR committed:
📋 What to do next
Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision. ❓ Why ADRs MatterADRs create a searchable, permanent record of why the codebase looks the way it does. This PR makes a deliberate security trade-off — requiring operator intent before a workflow can remove a trust boundary — exactly the kind of decision future contributors will want the rationale for. 📋 Michael Nygard ADR Format ReferenceAn ADR must contain these four sections to be considered complete:
ADRs are stored in
|
🧪 Test Quality Sentinel Report✅ Test Quality Score: 80/100 — Excellent
📊 Metrics & Test Classification (3 tests analyzed)
Test Classification Details
Modified Existing Tests (Behavioral Improvements)Three assertion sites in
Language SupportTests analyzed:
|
There was a problem hiding this comment.
Four non-blocking findings across two medium and two low issues. No blocking correctness or security bugs, but three of the four touch the same getSandboxDisableJustification / test suite cluster and are worth cleaning up together.
### Findings summary
Medium — expression/length check ordering (test coverage gap exposes pre-existing bug)
All expression tests use ${{ inputs.reason }} which is exactly 20 chars — the minimum. A short expression like ${{ x }} (8 chars) hits the length gate first and returns the wrong error. Add a test for a sub-20-char expression and reorder the guards: expression check before length check.
Medium — DisableReason is write-only dead storage
The field is stored during validation but no production code reads it. The comment claims audit/diagnostics capabilities that do not exist yet. Either wire it to a real downstream pathway (lock-file metadata, structured log) or add an explicit TODO so the gap is visible.
Low — expression rejection test fragile against constant changes
The test input ${{ inputs.reason }} is exactly at the length boundary. Raise minSandboxDisableJustificationLength above 20 and the test silently starts failing for the wrong reason. Use an unambiguously long expression string to decouple the test from the constant.
Low — migrated YAML file gives no hint about the required next step
The # Migrated from deprecated network setting comment on agent: false is silent about the required dangerously-disable-sandbox-agent feature. The compiler error explains it, but the YAML file itself leaves operators without guidance on re-read. A one-line comment update would eliminate the second confused CI failure.
Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
proxy.golang.org
To allow these domains, add them to the
network.allowedlist in your workflow frontmatter:
network:
allowed:
- defaults
- "proxy.golang.org"See Network Configuration for more information.
🔎 Code quality review by PR Code Quality Reviewer · ⌖ 13.8 AIC
Comments that could not be inline-anchored
pkg/workflow/sandbox_validation_test.go:136
Test coverage gap hides a check-ordering bug in the underlying function: all expression tests use ≥ 20-char values (${{ inputs.reason }} is exactly 20 chars), so no test exercises what happens when an operator provides a short expression like ${{ x }} (8 chars). That case hits the length check first and returns "at least 20 characters" instead of "expressions not allowed", misdirecting the operator.
<details>
<summary>💡 Add a test case and reorder checks in getSandboxDisableJustific…
pkg/workflow/sandbox.go:53
DisableReason is write-only dead storage: the field is assigned in validateSandboxConfig and documented as "available for diagnostics and audit logging", but nothing in this PR — or the existing codebase — reads it. This creates a false impression of an audit trail.
<details>
<summary>💡 Either wire it to a real consumer or mark it as deferred</summary>
If a downstream consumer (e.g. lock-file metadata, audit log, telemetry) genuinely exists or is planned in this sprint, wire it now w…
pkg/cli/codemod_network_firewall.go:55
Codemod silently leaves workflows in a compiler-failing state with no in-file guidance: removing ensureSandboxDisableFeatureFlag is the right call (operators must supply their own justification), but the migrated YAML now reads agent: false # Migrated from deprecated network setting with no hint that compilation will fail until the operator adds features.dangerously-disable-sandbox-agent. The compilation error message is clear, but the YAML file itself is silent.
<details>
<summary>…
pkg/workflow/sandbox_validation_test.go:139
Expression rejection test is fragile: it relies on the input being exactly minSandboxDisableJustificationLength chars: ${{ inputs.reason }} is exactly 20 characters, so the length check passes by a margin of zero and the expression check runs. If minSandboxDisableJustificationLength is ever bumped above 20, the length guard fires first, the error no longer contains "expressions", and this test fails with a misleading message that hides the root cause.
<details>
<summary>💡 Use an e…
dangerously-disable-sandbox-agent: true(boolean) was accepted as a shorthand and thenetwork-firewall-migrationcodemod silently injected a generic string justification — both paths let a workflow remove its agent sandbox trust boundary without reviewable operator intent.Validation
true/false) and${{ expressions }}are now rejected at compile timeStruct:
AgentSandboxConfig.DisableReasonThe validated justification is stored on the parsed struct — available to diagnostics, logging, and audit, not just the compilation gate.
Codemod: no silent justification insertion
ensureSandboxDisableFeatureFlagandmigratedSandboxDisableJustificationare removed. A workflow migrated fromnetwork.firewall: falsenow fails compilation until the operator writes their own reason. Pre-existing operator-authored justifications are preserved.Example: valid frontmatter after migration
Tests added
truerejected;falserejected; empty/short/whitespace-padded rejected;${{ inputs.reason }}rejectedDisableReason"trust boundary"in message