Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only wake up from waitDeps to trace when debugging #2176

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jonjohnsonjr
Copy link
Contributor

@jonjohnsonjr jonjohnsonjr commented Nov 18, 2024

When a vertex is waiting on its dependencies, it blocks on those deps, but it also wakes up every 5 seconds in order to log what it's waiting for. Those logs are immediately discarded unless TRACE logs are enabled.

This is similar to #1810

I've pulled out the timer creation into a separate function which only conditionally creates the timer if debugging logs are enabled. This allows us to avoid creating/selecting on timers at all, which takes up 90% of the waitDeps on-CPU time (and puts additional pressure on the scheduler).

Grabbed some screenshots of perf difference against our (somewhat pathological) workload:

Profiling before (3.6s):

image

Profiling after (318ms):

image

Target Release

1.9.0

Checklist

  • I have read the contribution guide.
  • I have not used an AI coding assistant to create this PR.
  • I have written all code in this PR myself OR I have marked all code I have not written myself (including modified code, e.g. copied from other places and then modified) with a comment indicating where it came from.
  • I (and other contributors to this PR) have not looked at the Terraform source code while implementing this PR.

Go checklist

  • I have run golangci-lint on my change and receive no errors relevant to my code.
  • I have run existing tests to ensure my code doesn't break anything.
  • I have added tests for all relevant use cases of my code, and those tests are passing.
  • I have only exported functions, variables and structs that should be used from other packages.
  • I have added meaningful comments to all exported functions, variables, and structs.

Website/documentation checklist

  • I have locally started the website as described here and checked my changes.

@jonjohnsonjr jonjohnsonjr requested a review from a team as a code owner November 18, 2024 20:19
Copy link

Reminder for the PR assignee: If this is a user-visible change, please update the changelog as part of the PR.

@cam72cam cam72cam self-assigned this Nov 19, 2024
@jonjohnsonjr
Copy link
Contributor Author

I think I actually prefer this simpler diff:

diff --git a/internal/dag/walk.go b/internal/dag/walk.go
index d73575cd2f..6c882f87dd 100644
--- a/internal/dag/walk.go
+++ b/internal/dag/walk.go
@@ -408,6 +409,15 @@ func (w *Walker) walkVertex(v Vertex, info *walkerVertex) {
        w.diagsLock.Unlock()
 }
 
+func debugTimer() <-chan time.Time {
+       // This is expensive to do at scale, so we only do it when debugging.
+       if logging.IsDebugOrHigher() {
+               return time.After(time.Second * 5)
+       }
+
+       return nil
+}
+
 func (w *Walker) waitDeps(
        v Vertex,
        deps map[Vertex]<-chan struct{},
@@ -429,7 +439,7 @@ func (w *Walker) waitDeps(
                                doneCh <- false
                                return
 
-                       case <-time.After(time.Second * 5):
+                       case <-debugTimer():
                                log.Printf("[TRACE] dag/walk: vertex %q is waiting for %q",
                                        VertexName(v), VertexName(dep))
                        }

It's a little weird but it has the same result, let me know which you prefer (if any)!

cam72cam
cam72cam previously approved these changes Nov 26, 2024
When a vertex is waiting on its dependencies, it blocks on those deps,
but it also wakes up every 5 seconds in order to log what it's waiting
for. Those logs are immediately discarded unless TRACE logs are enabled.

This is similar to opentofu#1810

I've pulled out the timer creation into a separate function which only
conditionally creates the timer if debugging logs are enabled.
This allows us to avoid creating/selecting on timers at all, which takes
up 90% of the waitDeps on-CPU time (and puts additional pressure on the
scheduler).

Signed-off-by: Jon Johnson <[email protected]>
@jonjohnsonjr
Copy link
Contributor Author

Just re-pushed something that (I think) appeases the linter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants