tinyformat: Add compile-time checking for literal format strings #31174

ryanofsky · 2024-10-28T23:56:27Z

Add compile-time checking for literal format strings passed to strprintf and tfm::format to make sure the right number of format arguments are passed.

There is still no compile-time checking if non-literal std::string or bilingual_str format strings are passed, but this is improved in other PRs:

#31061 implements compile-time checking for translated strings
#31072 increases compile-time checking by using literal strings as format strings, instead of std::string and bilingual_str
#31149 may drop the std::string overload for strprintf to require compile-time checking

This is needed in the next commit to add compile-time checking to strprintf calls, because bitcoin-cli.cpp uses dynamic width in many format strings. This change is easiest to review ignoring whitespace.

Co-authored-by: MarcoFalke <*~=`'#}+{/-|&$^[email protected]>
Co-authored-by: Hodlinator <[email protected]>
Co-authored-by: l0rinc <[email protected]>

DrahtBot · 2024-10-28T23:56:30Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31174.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	maflcko, l0rinc, hodlinator

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#31061 (refactor: Check translatable format strings at compile-time by maflcko)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

maflcko

Nice. This looks less scary than expected. Left a nit to add more compile time checks, but this looks good either way.

ryanofsky

Updated 1d16d6e -> e6086e0 (pr/tcheck.1 -> pr/tcheck.2, compare) addressing comments and making ConstEvalFormatString parsing stricter to reject incomplete specifiers.
Updated e6086e0 -> e53829d (pr/tcheck.2 -> pr/tcheck.3, compare) cleaning up whitespace and comments.

hodlinator

Concept ACK e53829d

Cleanest attempt at increased compile-time validation of format strings so far. When reviewing #31149 I had the gnawing feeling that more complete format string support would have reduced the diff, but pushed it away for expediency (an earlier attempt at more complete support was made in #30999).

hodlinator · 2024-10-30T13:34:40Z

Rebased #30933 on top of this PR (as a separate branch for now) and the mismatches between our custom consteval checking and tinyformat are gone as far as our current tests go - rebased commit: 32d4cf3

ryanofsky · 2024-10-30T13:49:18Z

re: #31174 (comment)

mismatches between our custom consteval checking and tinyformat are gone as far as our current tests go

In case you do want a test with different behavior, I think you can use %n specifier which is not supported by tinyformat

hodlinator

In case you do want a test with different behavior, I think you can use %n specifier which is not supported by tinyformat

Could document non-parity like so (unless you prefer I do it as part of #30933):

    // Non-parity
    int n{};
    BOOST_CHECK_EXCEPTION(tfm::format(std::string{"%n"}, n), tfm::format_error,
        HasReason{"tinyformat: %n conversion spec not supported"});
    ConstevalFormatString<1>::Detail_CheckNumFormatSpecifiers("%n");

ryanofsky

Updated e53829d -> ecc5cb9 (pr/tcheck.3 -> pr/tcheck.4, compare) with review suggestions.

re: #31174 (review)

Could document non-parity like so (unless you prefer I do it as part of #30933):

I think that change doesn't really fit into this PR, since this PR isn't checking type characters. But it does make sense as part of #30933, so would be good to add there and I'd be happy to review it.

hodlinator

ACK ecc5cb9

Implemented my suggestions (except comment removal suggestion) + broke out parse_size() since my last review.

util_string_tests tests passed locally.

Left one comment, but nothing blocking.

maflcko

left some nit ideas for more tests, but this is good either way.

review ACK ecc5cb9 🕯

Signature:

untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: review ACK ecc5cb9a89c6b001df839675b23d8fc1f7ac69ba 🕯
+oFB4Q8dHdvzp6J/1Ir4akTLS5GbDLpGOeKvcRP31CsusrUqTTnwOMie2fGfDcGYiEyKkNN9JiriK4ne+GSICw==

l0rinc

Nice, simple approach, like it a lot!
I think we can simplify the validator a bit more - let me know what you think.

Also, not exactly sure why %n parity wasn't added like in https://github.com/bitcoin/bitcoin/pull/30999/files#diff-71badc1cc71ba46244f7841a088251bb294265f4fe9e662c0ad6b15be540eee4R69 (is it controversial or unnecessary or not useful)?

l0rinc · 2024-11-10T19:33:50Z

src/util/string.h

+                    ++it;
+                    add_arg();
+                } else {
+                    while ('0' <= *it && *it <= '9') ++it;


Given that we have two separate number "parsers" (one that keeps the result and one that throws it away), we might as well extract number parsing to a local lambda like you did with the other ones.

Diff

diff --git a/src/util/string.h b/src/util/string.h
--- a/src/util/string.h (revision ecc5cb9a89c6b001df839675b23d8fc1f7ac69ba)
+++ b/src/util/string.h (date 1731267170701)
@@ -45,14 +45,16 @@
                 continue;
             }
 
+            auto parse_number = [&] {
+                unsigned num{0};
+                for (; '0' <= *it && *it <= '9'; ++it) {
+                    num = num * 10 + (*it - '0');
+                }
+                return num;
+            };
+
             auto add_arg = [&] {
-                unsigned maybe_num{0};
-                while ('0' <= *it && *it <= '9') {
-                    maybe_num *= 10;
-                    maybe_num += *it - '0';
-                    ++it;
-                }
-
+                unsigned maybe_num = parse_number();
                 if (*it == '$') {
                     ++it;
                     // Positional specifier, like %8$s
@@ -75,7 +77,7 @@
                     ++it;
                     add_arg();
                 } else {
-                    while ('0' <= *it && *it <= '9') ++it;
+                    parse_number();
                 }
             };

re: #31174 (comment)

we might as well extract number parsing to a local lambda like you did with the other ones.

This seems reasonable, but I'm not sure it's clearer, and it does make the diff bigger by replacing the maybe_num code that doesn't have to change currently. Happy to apply this suggestion if other reviewers think it is a good idea.

@hodlinator, @maflcko, what do you think?
I can ack without this as well, but I'd prefer reducing duplication.

I think while ('0' <= *it && *it <= '9') ++it; is fine. It is pretty standard self-explanatory code. I don't think a one-line while loop needs to be de-duplicated. Also, I like that the diff is minimal as-is now.
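
To make the two digit-parsing uses in this thread concrete, here is a small standalone sketch of the parse_number idea together with the positional check it feeds. ParseNumber, AddArg and ArgCounts are hypothetical free-standing names (the code under review uses lambdas inside the validator), and this is an approximation of the discussed logic, not the PR's implementation.

```cpp
#include <algorithm>

// Hypothetical bookkeeping for the two kinds of specifiers.
struct ArgCounts {
    unsigned normal{0};     // number of non-positional specifiers seen
    unsigned positional{0}; // highest position used by a positional specifier
};

// Reads a run of decimal digits, advances `it` past them and returns the value.
// Returns 0 and leaves `it` untouched when no digit is present.
constexpr unsigned ParseNumber(const char*& it)
{
    unsigned num{0};
    for (; '0' <= *it && *it <= '9'; ++it) {
        num = num * 10 + (*it - '0');
    }
    return num;
}

// `it` points just after '%' (or after '*'). A number followed by '$' is a
// position, like the 8 in "%8$s"; anything else counts as one normal argument.
constexpr void AddArg(const char*& it, ArgCounts& counts)
{
    const unsigned maybe_num{ParseNumber(it)};
    if (*it == '$') {
        ++it;
        counts.positional = std::max(counts.positional, maybe_num);
    } else {
        ++counts.normal;
    }
}
```

The call site that only needs to skip a width can call ParseNumber and discard the result, which is the de-duplication being proposed; whether that reads better than the one-line while loop is the judgment call debated above.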

l0rinc · 2024-11-10T19:59:31Z

src/util/string.h

+            add_arg();
+
+            // Consume flags.
+            while (*it == '#' || *it == '0' || *it == '-' || *it == ' ' || *it == '+') ++it;


In C++23 this could be a simple .contains, but even in C++20 we should be able to group the flags to something like:

Suggested change

while (*it == '#' || *it == '0' || *it == '-' || *it == ' ' || *it == '+') ++it;

while ("#0- +"sv.find(*it) != std::string_view::npos) ++it;

(we could even extract the flag in which case we could get rid of the comment)

Time to return to C89? ;)

Suggested change

while (*it == '#' || *it == '0' || *it == '-' || *it == ' ' || *it == '+') ++it;

while (strchr("#0- +", *it)) ++it;

I thought of that, but not sure how to make it work, I'm getting:

note: non-constexpr function 'strchr' cannot be used in a constant expression

Still rebooting 🧠 for this week, sorry for the noise.

re: #31174 (comment)

I can apply the "#0- +"sv.find change if others like it, but to me it seems less readable and only a little shorter.

it could be more readable and a bit shorter if we extract the flags as a variable and delete the comment stating the same - what do you think?

My preference would be to leave style-only nits to a follow-up, especially given that they will be temporary anyway until C++23. This pull request has been basically ready for two weeks now, with more than 50 style-only or test-only comments. Unless there are any real issues or bugs with the code, and a force-push needs to happen anyway, I don't really see the value of asking reviewers to go through more comments and code changes, some of which don't even compile.
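
For reference, the string_view variant of the flag check can be sketched as a pair of constexpr helpers. IsFlag and SkipFlags are hypothetical names used only for this illustration; the flag set "#0- +" is the one from the line under review.

```cpp
#include <string_view>

// True for the printf-style flag characters '#', '0', '-', ' ' and '+'.
constexpr bool IsFlag(char c)
{
    constexpr std::string_view FLAGS{"#0- +"};
    return FLAGS.find(c) != std::string_view::npos;
}

// Returns a pointer to the first character that is not a flag.
constexpr const char* SkipFlags(const char* it)
{
    while (IsFlag(*it)) ++it;
    return it;
}
```

Besides not being usable in a constant expression, the strchr variant has a second difference: strchr treats the terminating '\0' as part of the string, so strchr("#0- +", '\0') returns a non-null pointer and the loop would not stop at the end of the format string. The string_view built from the literal excludes the terminator, so IsFlag('\0') is false.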

l0rinc · 2024-11-10T20:15:11Z

src/tinyformat.h

+// Unlike ConstevalFormatString this supports std::string for runtime string
+// formatting without compile time checks.
+template <unsigned num_params>
+struct FormatStringCheck {


Checked, and failures seem to be detected successfully from the command line, but, unlike with the previous versions, they don't seem to be shown in the IDE... Weird :/

ryanofsky

Updated ecc5cb9 -> fe39acf (pr/tcheck.4 -> pr/tcheck.5, compare) with suggested changes. Thanks for the reviews and suggestions!

ryanofsky · 2024-11-12T12:00:45Z

src/util/string.h

+            add_arg();
+
+            // Consume flags.
+            while (*it == '#' || *it == '0' || *it == '-' || *it == ' ' || *it == '+') ++it;


re: #31174 (comment)

I can apply the "#0- +"sv.find change if others like it, but to me it seems less readable and only a little shorter.

ryanofsky · 2024-11-12T12:11:21Z

src/util/string.h

+                    ++it;
+                    add_arg();
+                } else {
+                    while ('0' <= *it && *it <= '9') ++it;


re: #31174 (comment)

we might as well extract number parsing to a local lambda like you did with the other ones.

This seems reasonable, but I'm not sure it's clearer, and it does make the diff bigger by replacing the maybe_num code that doesn't have to change currently. Happy to apply this suggestion if other reviewers think it is a good idea.

ryanofsky · 2024-11-12T12:46:08Z

re: #31174 (review)

Also, not exactly sure why %n parity wasn't added

Thanks, I hadn't seen #30999, and it seems like that would be a reasonable thing to check for, though I think there is a case for keeping the code as simple as possible and not trying to reproduce tinyformat quirks here. But the reason for not making that change here is that I don't think it's related to this PR, and I think it's generally better to make separate changes in separate PRs so they can be evaluated correctly and discussed more clearly.

maflcko · 2024-11-13T15:42:55Z

re-ACK fe39acf 🕐

Signature:

untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: re-ACK fe39acf88ff552bfc4a502c99774375b91824bb1 🕐
Gv99KVENxeKH2/4i5mCpvzXcxQ66+sVorM28bzE1QmucATBYm7QTTpcvRNaaGKsCRRtGOmk0QEigsfGUPldxCA==

l0rinc · 2024-11-13T16:09:23Z

Thanks for improving developer productivity with these small changes <3

ACK fe39acf

l0rinc · 2024-11-13T16:07:39Z

src/test/util_string_tests.cpp

+    PassFmt<3>("'%- 0+*.*f'");
+    PassFmt<3>("'%1$- 0+*3$.*2$f'");


are the extra surrounding ' deliberate? If so, what do they mean?

re: #31174 (comment)

I'm pretty sure they must be accidental. These cases came from #31174 (comment), and I just pasted them without noticing the single quotes. Can remove if the PR is updated again.

The ' are used to denote the begin and the end of the string, which would otherwise not be possible, because trailing spaces can normally not be seen when printing. They are not needed in this test and they are a leftover when I tested this against tinyformat at runtime.
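
As an aside on why the first test case expects three arguments: each '*' takes one argument for the dynamic width or precision, and the conversion itself takes one more. A simplified counter for the non-positional form might look like the sketch below. CountNonPositionalArgs is a hypothetical name, the function does not handle positional specifiers such as %1$- 0+*3$.*2$f, and it is not the validator from this PR.

```cpp
// Counts the arguments required by a format string that uses only
// non-positional specifiers of the form %[flags][width][.precision]conv.
constexpr unsigned CountNonPositionalArgs(const char* it)
{
    unsigned count{0};
    while (*it != '\0') {
        if (*it++ != '%') continue;
        if (*it == '%') {
            ++it; // "%%" consumes no argument
            continue;
        }
        // Flags.
        while (*it == '#' || *it == '0' || *it == '-' || *it == ' ' || *it == '+') ++it;
        // Width: '*' reads the width from an argument, digits are a fixed width.
        if (*it == '*') {
            ++it;
            ++count;
        } else {
            while ('0' <= *it && *it <= '9') ++it;
        }
        // Precision, with the same two forms as the width.
        if (*it == '.') {
            ++it;
            if (*it == '*') {
                ++it;
                ++count;
            } else {
                while ('0' <= *it && *it <= '9') ++it;
            }
        }
        if (*it == '\0') throw "Format specifier incorrectly terminated by end of string";
        ++it; // conversion character, e.g. 'f'
        ++count;
    }
    return count;
}
```

The surrounding single quotes are ordinary characters to this counter, which matches the explanation above that they do not affect the test.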

hodlinator

re-ACK fe39acf

git range-diff master ecc5cb9 fe39acf

Added more FailFmtWithError-tests (maflcko).
Terser skipping of %% (l0rinc).
Comment regarding format string components updated (me).

ryanofsky and others added 2 commits October 28, 2024 19:11

tinyformat: Add compile-time checking for literal format strings

fe39acf

Co-authored-by: MarcoFalke <*~=`'#}+{/-|&$^[email protected]>

ryanofsky mentioned this pull request Oct 29, 2024

tinyformat: enforce compile-time checks for format string literals #31149

Closed

maflcko approved these changes Oct 29, 2024

l0rinc reviewed Oct 29, 2024

ryanofsky force-pushed the pr/tcheck branch from 1d16d6e to e6086e0 Compare October 29, 2024 15:43

ryanofsky commented Oct 29, 2024

ryanofsky force-pushed the pr/tcheck branch from e6086e0 to e53829d Compare October 29, 2024 20:01

DrahtBot mentioned this pull request Oct 29, 2024

refactor: Check translatable format strings at compile-time #31061

Merged

laanwj added the Utils/log/libs label Oct 30, 2024

hodlinator reviewed Oct 30, 2024

bitcoin deleted a comment Oct 30, 2024

hodlinator mentioned this pull request Oct 30, 2024

test: Prove+document ConstevalFormatString/tinyformat parity #30933

Merged

hodlinator reviewed Oct 30, 2024

ryanofsky force-pushed the pr/tcheck branch from e53829d to ecc5cb9 Compare November 4, 2024 22:31

ryanofsky commented Nov 4, 2024

hodlinator approved these changes Nov 8, 2024

maflcko approved these changes Nov 8, 2024

l0rinc reviewed Nov 10, 2024

ryanofsky force-pushed the pr/tcheck branch from ecc5cb9 to fe39acf Compare November 12, 2024 12:20

ryanofsky commented Nov 12, 2024

DrahtBot requested a review from hodlinator November 13, 2024 15:43

l0rinc reviewed Nov 13, 2024

hodlinator approved these changes Nov 13, 2024

fanquake merged commit f44e39c into bitcoin:master Nov 14, 2024
16 checks passed

This was referenced Nov 14, 2024

scripted-diff: Type-safe settings retrieval #31260

Open

refactor: Clean up messy strformat and bilingual_str usages #31072

Merged

ryanofsky mentioned this pull request Dec 10, 2024

log, refactor: Allow log macros to accept context arguments #29256

Open

stickies-v mentioned this pull request Jan 15, 2025

tinyformat: refactor: increase compile-time checks and don't throw for tfm::format_error #30928

Closed

bitcoin locked and limited conversation to collaborators Nov 14, 2025
