sandbox: atomic state writes, multi-gateway SSH config, smarter post-register hint#5537
Closed
akshaysingla-db wants to merge 3 commits into
Now that the server stamps `gateway_host` on every SshKey and Sandbox response, `register` always populates the cache, and every subsequent operation that touches the API gets the workspace's real gateway directly. Two things become dead weight:

1. The hardcoded heuristic `resolveGatewayHost(workspaceHost)` that returned `uw2.dbrx.dev` for non-staging workspaces and `ue1.s.dbrx.dev` for staging. It produced a wrong answer for any workspace whose gateway didn't match those two values.
2. The `--gateway` flag on `sandbox ssh`. It was a manual override that made sense only when the CLI couldn't learn the gateway itself; with the cache always populated post-register, the only thing the flag now does is let a user typo their way into dialing a random host. The footgun outweighs the override value.

Removes:
- `defaultGatewayHost = "uw2.dbrx.dev"`
- `stagingDefaultGatewayHost = "ue1.s.dbrx.dev"`
- `resolveGatewayHost(workspaceHost string) string`
- `--gateway` flag and the `gatewayHost` local var in `newSSHCommand`

New resolution chain in `sandbox ssh`: fresh API response (populated by the `api.get` of the resolved sandbox ID) → cached value (set by any prior register / list / create / get / start / stop). If both are empty, error out with a pointer to `databricks sandbox register`.

Tests added:
- `acceptance/cmd/sandbox/register/success/`: pins the new contract. registerKey returns SshKey with gateway_host, register caches it in `~/.databricks/sandbox.json`, and prints the consent-skip notice in the non-interactive acceptance path. Parent `test.toml` now Ignores `.ssh` so the generated key + managed config don't appear as unexpected output files.
- `TestMaybeWriteSSHConfigSkipsOnEmptyGateway`: pins the empty-gateway short-circuit so a future refactor can't silently drop the "skip when server didn't stamp a host" branch.

Co-authored-by: Isaac
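The resolution chain described in this commit (fresh API response, then cached value, then an error) could be sketched roughly as below. This is an illustration only, not the code in the PR: `resolveGateway` and the `example.test` host names are hypothetical, and the real command reads both values from the API client and the state file.

```go
package main

import (
	"errors"
	"fmt"
)

// resolveGateway is a hypothetical helper, not a function from the PR.
// It prefers the gateway stamped on a fresh API response, falls back to
// the cached value, and returns an error pointing the user at
// `databricks sandbox register` when neither is available.
func resolveGateway(fresh, cached string) (string, error) {
	if fresh != "" {
		return fresh, nil
	}
	if cached != "" {
		return cached, nil
	}
	return "", errors.New("no gateway host known; run `databricks sandbox register` first")
}

func main() {
	// Placeholder host; the fresh value is empty, so the cache is used.
	host, err := resolveGateway("", "gw.example.test")
	fmt.Println(host, err)

	// Neither source has a value, so the caller gets an actionable error.
	_, err = resolveGateway("", "")
	fmt.Println(err)
}
```

With no manual override in the chain, the only way to reach an unexpected host is for the server or the cache to supply it, which is the point of dropping `--gateway`.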
…register hint

Three followups from the post-merge review of the gateway-host resolution path.

1. **Atomic saveState** (state.go). `os.WriteFile` is open-truncate-write, which means a concurrent reader (another CLI invocation racing this one) can see a half-written sandbox.json and fail to parse it; `loadState` then silently returns "" for `getDefault`/`getGatewayHost`. The acceptance suite documents the symptom in its script.prepare; production code now gets the real atomic write via tmp + Rename, mirroring `writeManagedConfig` in sshconfig.go.
2. **One Host stanza per cached gateway** (sshconfig.go, state.go). Previously, registering against a workspace in region B after one in region A overwrote the SSH config block, silently breaking IDE Remote-SSH for workspace A. Now `register` reads the deduplicated set of gateway hosts across every cached profile (new `allGatewayHosts` helper) and emits one Host stanza per unique gateway. `buildSSHConfigBlock` takes `[]string` instead of a single host; `maybeWriteSSHConfig` no longer takes a gateway parameter at all and reads from state. The same-gateway case (multiple profiles in the same region) is still trivially handled: the dedup collapses them.
3. **Post-register hint branches on `getDefault`** (register.go). The tail of `sandbox register` used to unconditionally suggest `databricks sandbox ssh`, even when the user had no default sandbox configured (so `ssh` would error). It now suggests `databricks sandbox create` when there's no default, and `ssh` only when there's one to connect to.

Tests:
- `TestBuildSSHConfigBlockMultipleGateways` pins the per-gateway repetition (Host line, Port, IdentityFile) in the new block shape.
- `TestMaybeWriteSSHConfigSkipsOnEmptyGateway` is renamed and refocused to `TestMaybeWriteSSHConfigSkipsWhenNoGatewaysCached`. The function no longer takes a gateway parameter, so the guarded branch is now "state is empty".
- The `acceptance/cmd/sandbox/register/success` golden picks up the new "Run databricks sandbox create" suggestion and the pluralised "add the sandbox gateway block(s)" consent message.

Stacked on databricks#5536 (drop heuristic + --gateway flag). The diff against main currently includes that work too; collapses to the changes above once databricks#5536 merges.

Co-authored-by: Isaac
Superseded by #5543: same diff pushed to an upstream branch so CI can pull JFrog tokens (fork PRs can't, per GitHub OIDC permission rules).
Summary
Three followups from the post-merge review of the gateway-host resolution path, stacked on #5536.
1. Atomic `saveState`

`os.WriteFile` is open-truncate-write, so a concurrent reader (another CLI invocation racing this one) can see a half-written `sandbox.json`. `loadState` then returns a parse error, which `getDefault`/`getGatewayHost` silently swallow as `""`. The acceptance suite documents the symptom in its `script.prepare` (per-test HOME isolation). Production code now gets the real atomic write via tmp + `Rename`, matching `writeManagedConfig` in `sshconfig.go`.

2. One Host stanza per cached gateway

Previously, registering against a workspace in region B after one in region A overwrote the SSH config block, silently breaking IDE Remote-SSH for workspace A. Now `register` reads the deduplicated set of gateway hosts across every cached profile (new `allGatewayHosts` helper) and emits one `Host` stanza per unique gateway. `buildSSHConfigBlock` takes `[]string` instead of a single host; `maybeWriteSSHConfig` no longer takes a gateway parameter at all and reads from state.

3. Post-register hint branches on `getDefault`

`sandbox register` used to unconditionally suggest `databricks sandbox ssh`, even when the user had no default sandbox (so `ssh` would error). It now suggests `databricks sandbox create` when there's no default, and `ssh` only when there's something to connect to.

Test plan

- `go test ./cmd/sandbox/...` passes
- `go test ./acceptance -run TestAccept/cmd/sandbox` passes
- `go build ./...` clean
- `./task lint` clean
- `TestBuildSSHConfigBlockMultipleGateways` pins per-gateway repetition
- `TestMaybeWriteSSHConfigSkipsWhenNoGatewaysCached` (was `…SkipsOnEmptyGateway`): the function no longer takes a gateway parameter

Stacking note

Stacked on #5536 (drop heuristic + `--gateway` flag). The diff against `main` currently includes that work too; collapses to just these three changes once #5536 merges.

This pull request and its description were written by Isaac.