Rate-limit write-channel-full log in file surfacer (#1190) #1191
Conversation
Hey Andrew 👋, how is it going? :)
Left some comments.
internal/surfacers/file/file.go (outdated)
```go
var (
	logLimiter         = rate.NewLimiter(rate.Every(10*time.Second), 1)
	suppressedLogCount int64
)
```
What do you think about moving these to the Surfacer struct, so that they are bound to an instance of the surfacer?
Typically people will have at most one file surfacer, but it's entirely possible to have more than one; in that case, package-level global variables won't play nicely.
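
For illustration, a minimal sketch of what the struct-scoped version could look like (field placement and the `newSurfacer` constructor are assumptions for the sketch, not the actual patch):

```go
package file

import (
	"time"

	"golang.org/x/time/rate"
)

// Sketch only: assumed shape of the revised Surfacer, not the actual code.
type Surfacer struct {
	// ... other surfacer fields ...

	// Per-instance throttling state, so that multiple file surfacers
	// don't share one limiter.
	logLimiter         *rate.Limiter
	suppressedLogCount int64
}

func newSurfacer() *Surfacer {
	return &Surfacer{
		// Allow at most one "channel full" log line every 10 seconds.
		logLimiter: rate.NewLimiter(rate.Every(10*time.Second), 1),
	}
}
```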
Done.
At first I couldn't see how the log messages from different surfacers could be differentiated from each other, but looking closer, I see they can be configured with different loggers, so it makes sense for the log throttling to be independent for different surfacers.
```diff
-	s.l.Errorf("Surfacer's write channel is full, dropping new data.")
+	if logLimiter.Allow() {
+		suppressed := atomic.SwapInt64(&suppressedLogCount, 0)
+		s.l.Errorf("Surfacer's write channel is full, dropping new data. (%d repeat messages suppressed)", suppressed)
```
Will it not always log "0 repeat messages suppressed", because you're resetting suppressed before logging?

I guess we can move this line before the suppressed reset? Also, it would be nice not to say "0 repeat messages suppressed" at the beginning, and when the log message is happening infrequently anyway. Can we add that part only when suppressed is not 0?
> Will it not always log "0 repeat messages suppressed", because you're resetting suppressed before logging?
No, because we are not logging suppressedLogCount itself; we are logging the result of atomic.SwapInt64, which returns the old value from before it was set to zero.
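
A quick standalone illustration of that swap-then-read behavior (not the PR code itself):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

func main() {
	var suppressedLogCount int64 = 5

	// SwapInt64 stores 0 and returns the previous value atomically,
	// so the old count is captured in the same step as the reset.
	suppressed := atomic.SwapInt64(&suppressedLogCount, 0)

	fmt.Println(suppressed)         // 5 -- the value before the reset
	fmt.Println(suppressedLogCount) // 0 -- the counter is now cleared
}
```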
> Also, it would be nice not to say "0 repeat messages suppressed" at the beginning, and when the log message is happening infrequently anyway. Can we add that part only when suppressed is not 0?
It is true that the first time this message gets logged, it will say "0 repeat messages suppressed". I actually thought that was a good thing: it alerts the user that log throttling is going on, and it keeps the code simpler as well.
My experience of this log message is that it either happens basically never (because we are writing below the rate the file can cope with) or at a very high rate (because we are persistently trying to write above that rate). I doubt there will be many cases where the log message is sporadic enough that every single instance says "0 repeat messages suppressed".
If you think the extra code complexity is justified to avoid logging "0 repeat messages suppressed", let me know and I will make the change.
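
For reference, the conditional variant would only be a few extra lines. A sketch, assuming the throttling state lives on the Surfacer struct as suggested above, with fmt and sync/atomic imported; logChannelFull is a hypothetical helper name, not part of the patch:

```go
// Sketch only: logs the "channel full" message at most once per limiter
// interval, and mentions the suppressed count only when it is non-zero.
func (s *Surfacer) logChannelFull() {
	if !s.logLimiter.Allow() {
		// Throttled: just count the message we're not emitting.
		atomic.AddInt64(&s.suppressedLogCount, 1)
		return
	}
	msg := "Surfacer's write channel is full, dropping new data."
	if suppressed := atomic.SwapInt64(&s.suppressedLogCount, 0); suppressed > 0 {
		msg += fmt.Sprintf(" (%d repeat messages suppressed)", suppressed)
	}
	s.l.Errorf("%s", msg)
}
```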