-
Notifications
You must be signed in to change notification settings - Fork 36.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tinyformat: enforce compile-time checks for format string literals #31149
tinyformat: enforce compile-time checks for format string literals #31149
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Only change since previous review (#30928 (review)) is dropping one commit. re-ACK 4cf1221 🔡 Show signatureSignature:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 4cf1221
Only dropped commit I had concerns about. l0rinc too.
Does not entirely prevent throwing of exceptions from incorrect format strings (notably translated strings), but is a step in the right direction (validating more format strings at compile time).
4cf1221
to
23560b4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rebased to address merge-conflict from #29936. Also added commit 99b12b5 to minimize c_str()
usage (an earlier review concern of @ryanofsky) and minimize code churn, as suggested by @maflcko.
re-ACK 23560b4 🎛 Show signatureSignature:
|
result += tfm::format_raw("<-> type net v mping ping send recv txn blk hb %*s%*s%*s ", | ||
m_max_addr_processed_length, "addrp", | ||
m_max_addr_rate_limited_length, "addrl", | ||
m_max_age_length, "age"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is all the indentation churn really necessary?
Does changing strprintf to tfm_format_raw here catch any potential issues that wouldn't be seen in manual testing?
Reduce unexpected run-time crashes from string formatting
Can you provide an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is all the indentation churn really necessary?
It's clang-format
output which I believe is what we're trying to adhere to.
Does changing strprintf to tfm_format_raw here catch any potential issues that wouldn't be seen in manual testing?
The code does not compile with strprintf
because this PR enforces compile-time checks, which fail because of the width specifiers in the format string (there's quite some discussion on these limitations in #30546). See also this excerpt from the PR description that addresses this:
- rename the non-compile-time-validated
tfm::format
overloads totfm::format_raw
, so existing callsites by default use the saferutil::ConstevalFormatString
tfm::format
overloads. Callsites that for some reason don't pass the compile-time checks (such as inbitcoin-cli.cpp
) can usetfm::format_raw
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-format can be helpful for new code, but I don't think it's necessary to re-indent code for it that isn't otherwise touched -- that's just noisy churn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please provide an example of how this code could crash currently without this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-format can be helpful for new code, but I don't think it's necessary to re-indent code for it that isn't otherwise touched -- that's just noisy churn.
I'm just going with our clang-format
settings as defined in src/.clang-format
- if you feel like those settings are suboptimal opening a separate pull/issue for that is probably more appropriate?
Please provide an example of how this code could crash currently without this change?
The PR states that there is no behaviour change, so (unless there are run-time bugs we haven't seen yet) we can't make this code crash without this change. The point of this PR is to make the code more robust against future changes, where a new/modified format string can cause a run-time error, e.g. if not all code-paths are manually tested by reviewers (which is not unusual).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was no bikeshedding about this before on #31149, it just started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By manual testing do you mean automated testing of print statements?
No, manual testing by building and running the CLI commands.
There is no automated test coverage of -netinfo
(the touched code above), nor, for instance, of -addrinfo
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By doing some of the validation at compile time, there is now some automated 'test coverage' on it - i.e. it will bite back if it's very wrong without any additional manual check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally unneeded changes can be avoided, but that was just a suggestion. Further historical context to get started:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Useful context, thanks!
We must however keep our kitchen clean, we can't just be cooking all the time.
This change by @stickies-v was a small, low-risk change, making the code slightly more maintainable (or at least slightly less unmaintainable).
We should encourage these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, ACK 23560b4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACK 23560b4
Changes since previous ACK:
- Added file-local
fuzz_fmt
function to reduce churn. - Rebased and applied conversions to new code from fuzz: wallet: add target for
CreateTransaction
#29936.
In future commits, the non-ConstevalFormatString tfm::format overloads will be renamed or removed. To avoid all callsites having to include util/string.h (or to have a circular dependency between tinyformat.h and util/string.h), just move the ConstevalFormatString overloads to tinyformat.h
In a future commit, the tfm::format std::string overload will be removed to enforce usage of a compile-time checked overload. To minimize code churn in future commits, add a test-only std::string tfm::format helper function. Co-authored-by: MarcoFalke <*~=`'#}+{/-|&$^[email protected]>
All (non-test) format strings are known at compile-time, so we don't need the std::string overload. Remove it so users use the safer util::ConstevalFormatString where possible, or the bilingual_str overload where applicable.
Callsites such as bitcoin-cli that (incorrectly) fail compile-time validation can still use tfm::format_raw.
23560b4
to
b448d60
Compare
Force-pushed to rebase to address merge conflict from #31042, no other changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compared to #30928 I see the following changes:
- clientversion.cpp: Merge conflict in clientversion at CopyrightHolders -> seems to be resolved correctly
- logging.h: old LogPrintFormatInternal is retained
- run-lint-format-strings.py: FALSE_POSITIVES not removed
- util_string_tests.cpp: FALSE_POSITIVES not moved to tests
- strprintf.cpp: fuzz_fmt added to avoid renames, try-catch added around fuzz_fmt calls (any reason for not try-catching inside fuzz_fmt instead?)
- tinyformat.h: try-catch removed from vformat
- wallet.h: new strprintf migrated
ACK b448d60
Sorry for causing unnecessary churn with my earlier comments. After reading Marco's comment in #31072 (comment) I finally understand what this PR is really trying to do, and why all the changes are being made here. Before reading that comment I thought this PR was trying not just to add compile-time checking for tinyformat format strings, but also to force calls that don't use compile-time checking to use different formatting functions like That being said, I think the approach in this PR is not ideal. I think introduction of I implemented an different approach in #31174 which only adds compile-time checking for |
Closing this PR in favour of #31174, which I think achieves mostly the same goal but does so in a much more elegant way. Thanks for your review and suggestion @ryanofsky, and everyone else here for the persistent re-review as this work is evolving.
I think there is merit in making the less safe (i.e. |
I agree with this, and commit b6a39c8 in this PR shows places where code is doing things like |
I agree that the changes here are still useful. It would be good to rebase on top of #31174, maybe after it is merged. |
…t strings fe39acf tinyformat: Add compile-time checking for literal format strings (Ryan Ofsky) 184f34f util: Support dynamic width & precision in ConstevalFormatString (Ryan Ofsky) Pull request description: Add compile-time checking for literal format strings passed to `strprintf` and `tfm::format` to make sure the right number of format arguments are passed. There is still no compile-time checking if non-literal `std::string` or `bilingual_str` format strings are passed, but this is improved in other PRs: - [#31061](#31061) implements compile-time checking for translated strings - [#31072](#31072) increases compile-time checking by using literal strings as format strings, instead of `std::string` and `bilingual_str` - [#31149](#31149) may drop the `std::string` overload for `strprintf` to [require](#31149 (comment)) compile-time checking ACKs for top commit: maflcko: re-ACK fe39acf 🕐 l0rinc: ACK fe39acf hodlinator: re-ACK fe39acf Tree-SHA512: f1ddef0c96b9468c5ffe31b42dc19f1922583dd14f2e180b618d992c98614c5cc7db9f9cd917ef503f833bbc7dbec78e4489d0035416dce6840837e1d66d87cb
…ages 0184d33 scripted-diff: Replace strprintf(Untranslated) with Untranslated(strprintf) (Ryan Ofsky) 006e4d1 refactor: Use + instead of strformat to concatenate translated & untranslated strings (Ryan Ofsky) 831d2bf refactor: Don't embed translated string in untranslated string. (Ryan Ofsky) 0580219 refactor: Avoid concatenation of format strings (Ryan Ofsky) Pull request description: This PR cleans up string formatting in the codebase so other PRs adding compile time checking can be simpler and easier to review (specifically #30928, #31061, #31074, and #31149). Currently these PRs are hard to review because in addition to changing formatting APIs, they have to update callers that are using the API's in unusual ways. Clean up these callers now so later PRs can be simpler. Specifically: - Use string literals instead of `std::string` format strings to enable more compile-time checking. - Avoid using untranslated bilingual strings as format strings. Use originals so they can by checked at compile time. - Favor `Untranslated(strprintf(...))` over `strprintf(Untranslated(...), ...)` for consistency and to prevent translated and untranslated strings from being unintentionally combined. ACKs for top commit: maflcko: lgtm ACK 0184d33 🔹 l0rinc: ACK 0184d33 - no overall difference because of the rebase Tree-SHA512: 37eb771187d30977f5f054eddb82af6dd69878ace89cede72032bb389e57200898119f9fe486ce6903ebb00cb24648be215ab3e44842b3d206b35e26038da750
Due to inactivity here, I went ahead and implemented this syntax in #31061, with a different name. So I think everything here has been picked up. |
Carved out of #30928 - this PR doesn't include the controversial change of suppressing
tinyformat::format_error
throwing behaviour, and just adds enforcement of compile-time checks for format string literals.Reduce unexpected run-time crashes from string formatting by enforcing compile-time format string checks for most*
tfm::format
andstrprintf
usage.This PR should introduce no behaviour change.
The main changes are:
const std::string&
tfm::format
overload since it's not necessary. Usage of this overload is removed by one of:bilingual_str
overloadtfm::format_raw
functions (only for tinyformat implementation or tests)tfm::format
overloads totfm::format_raw
, so existing callsites by default use the saferutil::ConstevalFormatString
tfm::format
overloads. Callsites that for some reason don't pass the compile-time checks (such as inbitcoin-cli.cpp
) can usetfm::format_raw
.See #31149 (comment) for an example on how this PR can prevent a run-time error for a small format string mistake.
* See #31061 for
bilingual_str format(const bilingual_str& fmt, const Args&... args)
compile-time checks.