avoid regex 'parsing' format strings every time #2164

Merged
merged 2 commits into from
Apr 8, 2022

Conversation

@fbs fbs commented Mar 10, 2022

bpftrace::format spends about 60% of its time in the regex_iterator we
use to split the format string.
The FormatString type avoids this by splitting the string on first use
and storing the result for reuse.

I've kept the code mostly as is, only moving it and making minor modifications.
The output code as a whole could do with a refactoring, but this should be a
nice performance gain.

Test case is running opensnoop.bt while running `find /` in another
terminal:

original: bpftrace::perf_event_printer spent 75% of the time in format
new: bpftrace::perf_event_printer spent 15% of the time in format
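
The split-on-first-use idea described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `CachedFormat`, `parts()` and `split_count()` are hypothetical names, and the hand-written splitter only recognises a simplified set of conversion characters instead of the regex the real code uses.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a format string that caches its split form.
// Not bpftrace's actual FormatString.
class CachedFormat {
public:
  CachedFormat() = default;
  explicit CachedFormat(std::string fmt) : fmt_(std::move(fmt)) {}

  // Returns the literal and specifier pieces. The split runs only on the
  // first call; later calls return the stored result.
  const std::vector<std::string> &parts() {
    if (!split_done_) {
      split();
      split_done_ = true;
    }
    return parts_;
  }

  // How often the (expensive) split actually ran, for demonstration only.
  std::size_t split_count() const { return split_count_; }

private:
  // Simplified splitter (assumption): a specifier runs from '%' up to and
  // including the first of "diouxXcsp"; "%%" is kept as literal text.
  void split() {
    ++split_count_;
    parts_.clear();
    std::size_t start = 0;
    std::size_t i = 0;
    while (i < fmt_.size()) {
      if (fmt_[i] != '%') {
        ++i;
        continue;
      }
      if (i + 1 < fmt_.size() && fmt_[i + 1] == '%') {
        i += 2;
        continue;
      }
      if (i > start)
        parts_.push_back(fmt_.substr(start, i - start));
      std::size_t end = fmt_.find_first_of("diouxXcsp", i + 1);
      end = (end == std::string::npos) ? fmt_.size() : end + 1;
      parts_.push_back(fmt_.substr(i, end - i));
      start = i = end;
    }
    if (start < fmt_.size())
      parts_.push_back(fmt_.substr(start));
  }

  std::string fmt_;
  std::vector<std::string> parts_;
  bool split_done_ = false;
  std::size_t split_count_ = 0;
};
```

With this shape, a printer that formats the same string for every event pays the splitting cost once per format string rather than once per event.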

Checklist
  • Language changes are updated in man/adoc/bpftrace.adoc and if needed in docs/reference_guide.md
  • User-visible and non-trivial changes updated in CHANGELOG.md
  • The new behaviour is covered by tests

@fbs fbs force-pushed the optimize_fmtstr branch from 1d6e9bf to 32b7059 Compare March 10, 2022 23:30
fbs commented Mar 10, 2022

before: [image: profile-pre]

after: [image: profile-post]

@fbs fbs force-pushed the optimize_fmtstr branch from 32b7059 to 44a4c15 Compare March 10, 2022 23:34
void split();

public:
FormatString() = default;
Member Author

Shouldn't need this, but I got some confusing template errors. Will try to remove it.

Member Author

I don't see a way to avoid this, unless we change the vector<tuple<FormatString, vector<Field>>> into something with a custom serializer that does it correctly. Unless I'm doing something wrong.

Having this also keeps the serializer for this class easy.
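
To illustrate why a default constructor tends to be needed here, consider a generic loader that first creates the elements and then fills them in. The sketch below is an assumption about the general pattern, not the serializer bpftrace uses: `Fmt`, `save`, `load` and `load_all` are hypothetical names.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type standing in for a cached format string.
struct Fmt {
  Fmt() = default; // required by load_all below
  explicit Fmt(std::string s) : text(std::move(s)) {}
  std::string text;
};

// Hypothetical minimal "serializer": one line of text per element.
void save(std::ostream &out, const Fmt &f) { out << f.text << '\n'; }
void load(std::istream &in, Fmt &f) { std::getline(in, f.text); }

// Generic loader: creates `count` empty elements, then fills each one.
// The vector constructor value-initializes its elements, so T must be
// default constructible for this to compile.
template <typename T>
std::vector<T> load_all(std::istream &in, std::size_t count) {
  std::vector<T> result(count);
  for (auto &item : result)
    load(in, item);
  return result;
}
```

Removing `Fmt() = default;` makes `load_all<Fmt>` fail to compile, and because the failure surfaces inside the container template the resulting error can be hard to read.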

@fbs fbs force-pushed the optimize_fmtstr branch 2 times, most recently from 02bc547 to 5a662e9 Compare March 14, 2022 17:45
Contributor

@viktormalik viktormalik left a comment

Just one comment, LGTM otherwise.

@@ -6,70 +6,6 @@

namespace bpftrace {

std::string verify_format_string(const std::string &fmt, std::vector<Field> args)
Contributor

Should also be removed from src/printf.h.

@fbs fbs force-pushed the optimize_fmtstr branch 2 times, most recently from ac29327 to 4b62cc3 Compare April 8, 2022 10:15
fbs added 2 commits April 8, 2022 14:26
@fbs fbs force-pushed the optimize_fmtstr branch from 4b62cc3 to e18f936 Compare April 8, 2022 12:27
@fbs fbs merged commit e7b716f into bpftrace:master Apr 8, 2022
@fbs fbs deleted the optimize_fmtstr branch April 8, 2022 12:46