fix(proxy): prevent stale-key collision in treedb.update() after SetUnFilteredMocks re-indexes by AkashKumar7902 · Pull Request #3946 · keploy/keploy

AkashKumar7902 · 2026-03-20T09:24:03Z

Describe the changes that are made

During replay, SetUnFilteredMocks is called twice per test-set:

Phase 1 (wide window, once at test-set start):
SendMockFilterParamsToAgent(ctx, [], BaseTime, now, ...)
→ FilterTcsMocks marks ALL mocks IsFiltered=true
→ SetUnFilteredMocks: deleteAll() + re-insert; assigns SortOrder/ID 0..N-1

Phase 2 (narrow window, once per test case):
SendMockFilterParamsToAgent(ctx, expectedNames, reqTime, respTime, ...)
→ FilterTcsMocks re-evaluates; startup mocks (outside window) become IsFiltered=false
→ SetUnFilteredMocks: deleteAll() + re-insert again; startup mocks move to end,
remaining mocks shift: what was ID=10 at Phase 1 is now ID=7 at Phase 2

Meanwhile, the integrations-side globalCache (processedMockCache in pkg/postgres/v2/replayer/replayer.go) caches mock pointers together with their Phase-1 TestModeInfo{SortOrder, ID} values. The cache is invalidated only when MockManager.Revision() changes — but in-flight connections that already read the old cache continue to use stale TestModeInfo values.

When such a connection matches a mock and calls UpdateUnFilteredMock → treedb.update(), it supplies the stale Phase-1 key (e.g. SortOrder=11, ID=10). After Phase 2 re-indexed the tree, that key slot now holds a completely different mock (e.g. mock-13). The old update() had no identity check — it trusted the caller's key unconditionally, removed mock-13, and left mock-10 (the actual target) untouched in the tree. Subsequent DB connections that needed mock-10 (e.g. the StartupMessage/auth for connID="2") found nothing and failed with "No matching mock found", causing exit code 1.

This was the root cause of test-set-132 failing while test-set-135 (single connID, no second DB connection) passed — test-set-135 never needed mock-10's equivalent.

Add three-layer identity validation inside treedb.update(), all executed under the existing mutex so the check-and-remove is atomic:

Layer 1 — Exact key match with name validation:
Look up oldKey in the RBT. If found, check mock.Name == expectedName before
proceeding. If the name doesn't match (stale key collision), mark as not-found
and fall through instead of removing the wrong mock.

Layer 2 — ID-index fallback with name validation:
If exact key failed, look up oldInfo.ID in idIndex to get the current key for
this mock (handles the case where only SortOrder shifted but ID is stable).
Validate name again before removing.

Layer 3 — Linear name scan fallback:
If both key and ID are stale (full re-index after deleteAll), iterate the RBT
to find the node whose mock.Name == expectedName. Use the found node's ID in
the new key to keep idIndex consistent.

expectedName is extracted from newObj (*models.Mock).Name. This is always correct because UpdateUnFilteredMock deep-copies the matched mock into updateMock before calling treedb.update(), so newObj.Name always carries the identity of the mock that was actually matched by the integrations layer.

Abstraction boundary: treedb is the authoritative store. Store integrity (only remove what was asked for) must be enforced by the store itself, not scattered across every caller.
Atomicity: any integrations-side fix would require a read-then-update pattern (re-fetch current key, then call update) across two separate operations, leaving a TOCTOU window where another goroutine could call SetUnFilteredMocks between the read and the update. The treedb fix happens entirely under the mutex — the name check and the removal are a single atomic operation.

pkg/agent/proxy/treedb_test.go (new file, 4 tests):

TestUpdateExactMatch
Normal happy path: mock at matching key is found and updated.

TestUpdateRejectsKeyCollision
Bug scenario: stale key points to a different mock after re-index.
update() must reject the operation and leave the innocent mock intact.

TestUpdateNameFallbackAfterReindex
Linear scan fallback: target mock exists under a different key after
SetUnFilteredMocks shifted all IDs. update() must find it by name and
update it correctly.

TestUpdateIDIndexRejectsCollision
ID-index path also validates identity: if idIndex[id] points to a node
holding a different mock, the operation is rejected.

All 4 tests pass: ok go.keploy.io/server/v3/pkg/agent/proxy 0.008s

pkg/agent/proxy/treedb.go — update() method rewritten with 3-layer validation
pkg/agent/proxy/treedb_test.go — new unit tests for update() identity semantics

Links & References

Closes: #[issue number that will be closed through this PR]

NA (if very small change like typo, linting, etc.)

🔗 Related PRs

NA

🐞 Related Issues

NA

📄 Related Documents

NA

What type of PR is this? (check all applicable)

Added e2e test pipeline?

👍 yes (added unit test)
🙅 no, because they aren't needed
🙋 no, because I need help

Added comments for hard-to-understand areas?

👍 yes
🙅 no, because the code is self-explanatory

Added to documentation?

📜 README.md
📓 Wiki
🙅 no documentation needed

Are there any sample code or steps to test the changes?

👍 yes, mentioned below
🙅 no, because it is not needed

Self Review done?

✅ yes
❌ no, because I need help

Any relevant screenshots, recordings or logs?

NA

🧠 Semantics for PR Title & Branch Name

Please ensure your PR title and branch name follow the Keploy semantics:

📌 PR Semantics Guide
📌 Branch Semantics Guide

Examples:

PR Title: fix: patch MongoDB document update bug
Branch Name: feat/#1-login-flow (You may skip mentioning the issue number in the branch name if the change is small and the PR description clearly explains it.)

Additional checklist:

Have you read the Contributing Guidelines on issues?
Have you followed the PR Semantics guide for naming this PR?
Have you followed the Branch Semantics guide for naming your branch?

…nFilteredMocks re-indexes During replay, SetUnFilteredMocks is called twice per test-set: Phase 1 (wide window, once at test-set start): SendMockFilterParamsToAgent(ctx, [], BaseTime, now, ...) → FilterTcsMocks marks ALL mocks IsFiltered=true → SetUnFilteredMocks: deleteAll() + re-insert; assigns SortOrder/ID 0..N-1 Phase 2 (narrow window, once per test case): SendMockFilterParamsToAgent(ctx, expectedNames, reqTime, respTime, ...) → FilterTcsMocks re-evaluates; startup mocks (outside window) become IsFiltered=false → SetUnFilteredMocks: deleteAll() + re-insert again; startup mocks move to end, remaining mocks shift: what was ID=10 at Phase 1 is now ID=7 at Phase 2 Meanwhile, the integrations-side globalCache (processedMockCache in pkg/postgres/v2/replayer/replayer.go) caches mock pointers together with their Phase-1 TestModeInfo{SortOrder, ID} values. The cache is invalidated only when MockManager.Revision() changes — but in-flight connections that already read the old cache continue to use stale TestModeInfo values. When such a connection matches a mock and calls UpdateUnFilteredMock → treedb.update(), it supplies the stale Phase-1 key (e.g. SortOrder=11, ID=10). After Phase 2 re-indexed the tree, that key slot now holds a completely different mock (e.g. mock-13). The old update() had no identity check — it trusted the caller's key unconditionally, removed mock-13, and left mock-10 (the actual target) untouched in the tree. Subsequent DB connections that needed mock-10 (e.g. the StartupMessage/auth for connID="2") found nothing and failed with "No matching mock found", causing exit code 1. This was the root cause of test-set-132 failing while test-set-135 (single connID, no second DB connection) passed — test-set-135 never needed mock-10's equivalent. Add three-layer identity validation inside treedb.update(), all executed under the existing mutex so the check-and-remove is atomic: Layer 1 — Exact key match with name validation: Look up oldKey in the RBT. If found, check mock.Name == expectedName before proceeding. If the name doesn't match (stale key collision), mark as not-found and fall through instead of removing the wrong mock. Layer 2 — ID-index fallback with name validation: If exact key failed, look up oldInfo.ID in idIndex to get the current key for this mock (handles the case where only SortOrder shifted but ID is stable). Validate name again before removing. Layer 3 — Linear name scan fallback: If both key and ID are stale (full re-index after deleteAll), iterate the RBT to find the node whose mock.Name == expectedName. Use the found node's ID in the new key to keep idIndex consistent. expectedName is extracted from newObj (*models.Mock).Name. This is always correct because UpdateUnFilteredMock deep-copies the matched mock into updateMock before calling treedb.update(), so newObj.Name always carries the identity of the mock that was actually matched by the integrations layer. 1. Abstraction boundary: treedb is the authoritative store. Store integrity (only remove what was asked for) must be enforced by the store itself, not scattered across every caller. 2. Atomicity: any integrations-side fix would require a read-then-update pattern (re-fetch current key, then call update) across two separate operations, leaving a TOCTOU window where another goroutine could call SetUnFilteredMocks between the read and the update. The treedb fix happens entirely under the mutex — the name check and the removal are a single atomic operation. pkg/agent/proxy/treedb_test.go (new file, 4 tests): TestUpdateExactMatch Normal happy path: mock at matching key is found and updated. TestUpdateRejectsKeyCollision Bug scenario: stale key points to a different mock after re-index. update() must reject the operation and leave the innocent mock intact. TestUpdateNameFallbackAfterReindex Linear scan fallback: target mock exists under a different key after SetUnFilteredMocks shifted all IDs. update() must find it by name and update it correctly. TestUpdateIDIndexRejectsCollision ID-index path also validates identity: if idIndex[id] points to a node holding a different mock, the operation is rejected. All 4 tests pass: ok go.keploy.io/server/v3/pkg/agent/proxy 0.008s pkg/agent/proxy/treedb.go — update() method rewritten with 3-layer validation pkg/agent/proxy/treedb_test.go — new unit tests for update() identity semantics Signed-off-by: Akash Kumar <[email protected]>

Copilot

Pull request overview

This PR hardens TreeDb.update() against stale-key collisions caused by SetUnFilteredMocks re-indexing, preventing accidental removal of the wrong mock during replay.

Changes:

Adds multi-layer identity validation to treedb.update() (exact-key → ID-index → name-scan fallback).
Introduces unit tests covering the collision scenario and fallback behaviors.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File	Description
pkg/agent/proxy/treedb.go	Adds identity checks and fallback lookup paths to avoid removing the wrong mock after re-indexing.
pkg/agent/proxy/treedb_test.go	Adds tests that exercise exact match, key-collision rejection, name fallback after reindex, and ID-index collision rejection.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-20T09:32:22Z

pkg/agent/proxy/treedb.go

+		currentKey, exists := db.idIndex[oldInfo.ID]
+		if exists {
+			// Verify identity before removing
+			if expectedName != "" {
+				idVal, idFound := db.rbt.Get(currentKey)
+				if idFound {
+					if existingMock, ok := idVal.(*models.Mock); ok && existingMock != nil {
+						if existingMock.Name != expectedName {
+							exists = false // Different mock at this ID, skip
+						}
+					}
+				}
+			}
+			if exists {
+				db.rbt.Remove(currentKey)
+				db.rbt.Put(newKey, newObj)
+				delete(db.idIndex, oldInfo.ID)


idFound == false currently leaves exists == true, so the ID-index path can still proceed and return success even when the idIndex points to a key that no longer exists in the tree. That can create “phantom updates” (inserting newKey without actually removing/replacing the intended old entry) and can corrupt idIndex consistency. Set exists = false (or return false) when db.rbt.Get(currentKey) does not find a node.

Copilot · 2026-03-20T09:32:22Z

pkg/agent/proxy/treedb.go

+			db.rbt.Remove(foundKey)
+			// Use the found mock's ID in the new key to keep idIndex consistent
+			adjustedNewKey := newInfo
+			if foundInfoOk {
+				adjustedNewKey.ID = foundInfo.ID
+				delete(db.idIndex, foundInfo.ID)
+			}
+			db.rbt.Put(adjustedNewKey, newObj)
+			if okNew {
+				db.idIndex[adjustedNewKey.ID] = adjustedNewKey
+			}


The fallback path always derives the new key from newInfo (a models.TestModeInfo), even if okNew is false. In that case newInfo is the zero value and db.rbt.Put(adjustedNewKey, ...) can insert under an unintended key type/value for the tree/comparator. This path should require okNew before using newInfo (otherwise avoid updating / return false), or build the adjusted key from the original newKey in a type-safe way.

Suggested change

db.rbt.Remove(foundKey)

// Use the found mock's ID in the new key to keep idIndex consistent

adjustedNewKey := newInfo

if foundInfoOk {

adjustedNewKey.ID = foundInfo.ID

delete(db.idIndex, foundInfo.ID)

}

db.rbt.Put(adjustedNewKey, newObj)

if okNew {

db.idIndex[adjustedNewKey.ID] = adjustedNewKey

}

// We can only safely construct a new tree key if the caller provided

// a valid TestModeInfo-based newKey (okNew) and the found key is also

// a TestModeInfo (foundInfoOk). Otherwise, avoid mutating the tree.

if !okNew || !foundInfoOk {

return false

}

db.rbt.Remove(foundKey)

// Use the found mock's ID in the new key to keep idIndex consistent

adjustedNewKey := newInfo

adjustedNewKey.ID = foundInfo.ID

delete(db.idIndex, foundInfo.ID)

db.rbt.Put(adjustedNewKey, newObj)

db.idIndex[adjustedNewKey.ID] = adjustedNewKey

Copilot · 2026-03-20T09:32:22Z

pkg/agent/proxy/treedb.go

+	// First try exact key match
+	existingVal, found := db.rbt.Get(oldKey)
+	if found && expectedName != "" {
+		if existingMock, ok := existingVal.(*models.Mock); ok && existingMock != nil {
+			if existingMock.Name != expectedName {
+				found = false // Different mock at this key, skip
+			}
+		}
+	}


The collision-prevention logic is skipped when expectedName == "", meaning a stale key can still remove the wrong entry (the original bug class) if a mock name is empty or if newObj isn’t a *models.Mock. If the goal is “never remove a different mock than intended,” consider treating missing expectedName as a hard failure (return false) or adding an alternate identity signal (e.g., validate against oldInfo.ID/other invariant) before allowing a remove.

Copilot · 2026-03-20T09:32:22Z

pkg/agent/proxy/treedb_test.go

+	// Stale reference uses old key for mock-7
+	staleKey := models.TestModeInfo{SortOrder: 8, ID: 7}
+	newMock7 := mockWith("mock-7", 99, 7)
+	newMock7.TestModeInfo.IsFiltered = false
+
+	ok := db.update(staleKey, newMock7.TestModeInfo, newMock7)


TestUpdateNameFallbackAfterReindex doesn’t assert that (a) the updated mock is present under the actual inserted key after fallback, and (b) the stored mock’s internal TestModeInfo is consistent with the key chosen by update() (especially since the implementation may adjust the ID). Adding assertions for the final key presence + mock.TestModeInfo/idIndex consistency would catch subtle regressions in the fallback logic.

AkashKumar7902 requested review from Sarthak160 and gouravkrosx as code owners March 20, 2026 09:24

Copilot AI review requested due to automatic review settings March 20, 2026 09:24

Copilot started reviewing on behalf of AkashKumar7902 March 20, 2026 09:30 View session

Copilot AI reviewed Mar 20, 2026

View reviewed changes

Merge branch 'main' into fix/treedb-stale-key-collision-on-reindex

c5ba3ac

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(proxy): prevent stale-key collision in treedb.update() after SetUnFilteredMocks re-indexes#3946

fix(proxy): prevent stale-key collision in treedb.update() after SetUnFilteredMocks re-indexes#3946
AkashKumar7902 wants to merge 2 commits intomainfrom
fix/treedb-stale-key-collision-on-reindex

AkashKumar7902 commented Mar 20, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 20, 2026

Uh oh!

Copilot AI Mar 20, 2026

Uh oh!

Copilot AI Mar 20, 2026

Uh oh!

Copilot AI Mar 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

AkashKumar7902 commented Mar 20, 2026

Describe the changes that are made

Links & References

🔗 Related PRs

🐞 Related Issues

📄 Related Documents

What type of PR is this? (check all applicable)

Added e2e test pipeline?

Added comments for hard-to-understand areas?

Added to documentation?

Are there any sample code or steps to test the changes?

Self Review done?

Any relevant screenshots, recordings or logs?

🧠 Semantics for PR Title & Branch Name

Additional checklist:

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants