Add bswap built-in function #2166
Conversation
src/ast/passes/codegen_llvm.cpp (outdated)
// correct bswap intrinsic would be selected, (bswap.i16), yet LLVM would
// give to it a zero extended i64 argument! The trick below seems to fix
// this behaviour and the argument to bswap is always of the correct type.
Value *promoted_value = b_.CreateIntCast(expr_, b_.getInt64Ty(), false);
does this happen in the original IR or the optimized one? We have some weird casts in there for 'reasons' which can make the optimized IR confusing
I did a build without the promoting / demoting trick and compared the unoptimized / optimized IRs from tcpconnect.bt.
Unoptimized IR
%"$dport" = alloca i16, align 2
...
%75 = load i16, i16* %"struct sock_common.skc_dport", align 2
%76 = zext i16 %75 to i64
store i64 %76, i16* %"$dport", align 8
%78 = load i16, i16* %"$dport", align 2
%79 = call i16 @llvm.bswap.i16(i16 %78)
Optimized IR
%16 = load i16, i16* %"struct sock_common.skc_dport", align 2
%17 = zext i16 %16 to i64
%18 = call i16 @llvm.bswap.i16(i64 %17)
In the unoptimized version the bswap intrinsic is correctly called with an i16 argument, but the storing of an i64 value into the i16 $dport looks strange...
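For reference, a minimal sketch (not the exact bpftrace code; the helper name and parameters are illustrative) of how the argument can be forced to the intrinsic's declared width before the call, so that llvm.bswap.i16 always receives a genuine i16 rather than a zero-extended i64:

#include <llvm/IR/IRBuilder.h>
#include <llvm/IR/Intrinsics.h>
#include <llvm/IR/Module.h>

// Cast the expression to iN (N = 8 * byte size) and call the matching
// llvm.bswap.iN overload, so the intrinsic never sees a widened i64.
static llvm::Value *emit_bswap(llvm::IRBuilder<> &b, llvm::Module &m,
                               llvm::Value *expr, unsigned size_in_bytes)
{
  llvm::Type *ty = b.getIntNTy(8 * size_in_bytes);
  llvm::Value *arg = b.CreateIntCast(expr, ty, /*isSigned=*/false);
  llvm::Function *bswap =
      llvm::Intrinsic::getDeclaration(&m, llvm::Intrinsic::bswap, { ty });
  return b.CreateCall(bswap, { arg });
}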
auto scoped_del_arg = accept(arg);

assert(arg->type.IsIntegerTy());
if (arg->type.GetSize() > 1)
do we need this and the assert? Isn't this guaranteed by the semantic analyser?
I added the assert in order to explicitly declare this expectation / guarantee from the semantic analyser, and as a safeguard in case its behaviour changes in the future.
Regarding the arg->type.GetSize() check, the intention is that bswap is invoked only for multibyte integers, while i8 ones are simply passed through. Should I amend?
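A hedged sketch of the branch being described, reusing the names from the diff above (the intrinsic call itself is illustrative, not the exact PR code): only multi-byte integers are byte-swapped, while a single-byte value passes through unchanged.

assert(arg->type.IsIntegerTy());
if (arg->type.GetSize() > 1)
{
  // Swap at the argument's natural width (i16/i32/i64).
  llvm::Type *ty = b_.getIntNTy(8 * arg->type.GetSize());
  llvm::Value *narrowed = b_.CreateIntCast(expr_, ty, /*isSigned=*/false);
  expr_ = b_.CreateUnaryIntrinsic(llvm::Intrinsic::bswap, narrowed);
}
// else: an i8 value is returned as-is, since swapping a single byte is a no-op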
ah yeah skipping the byte case makes sense.
re assert, we should try to avoid them as the end user experience of asserts is quite bad. But in this case we have test coverage for it (test("kprobe:f { bswap(\"hello\"); }", 10);), so we will probably catch it before it ends up in the master branch.
I did a more thorough investigation and I added a changeset which fixes this strange behaviour:
After running the runtime tests, errors were reported for some test cases due to differences in the generated IRs. I had a look into the diffs and they appear to be benign. I also added a fix in the semantic analysis of bswap, because assertion failures were occurring when the tests were running in debug builds.
Before we merge this, we should run all the tools we have and make sure the output makes sense. We had some issues in the past and these weird casts are part of it, e.g. #1173
Went through all the tools. Exceptions (which by all accounts are irrelevant to the changes introduced with this PR):
I can confirm that these should be unrelated to this PR.
LGTM, thanks!
I think that getting rid of that unconditional cast is reasonable. Looking at the changes in codegen tests, it seems that each operation does typecasts on its inputs when needed (which is the way it should be done anyway), so hopefully nothing will break.
Could you update the codegen tests so that CI passes? There's a script in scripts/ for that.
Thanks @viktormalik! I have updated the codegen tests as you proposed above.
Great, thanks! Could you squash the commits, please?
The function takes an unsigned integer and returns an unsigned integer of the same width, but with the byte order reversed. Changed behaviour when reading elements from compound data structures: previously, the returned value would unconditionally be cast to a 64-bit integer, which would cause LLVM intrinsic invocations to fail if narrower integers were involved. Now, an integer of the same width as the accessed field is returned; it is the caller's responsibility to cast to a wider integer (something which is nevertheless currently done).
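For illustration, a hedged sketch of the caller-side widening described above, assuming a 16-bit field such as skc_dport and hypothetical pointer names (field_ptr, dst): the field is loaded at its natural width and only zero-extended where a 64-bit slot is actually required.

// Load the field as i16, then widen at the point of use (hypothetical names).
llvm::Value *field = b_.CreateLoad(b_.getInt16Ty(), field_ptr);
llvm::Value *wide = b_.CreateIntCast(field, b_.getInt64Ty(), /*isSigned=*/false);
b_.CreateStore(wide, dst); // dst is a 64-bit slot, e.g. a map value buffer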
@viktormalik Just pushed everything as a single commit. Thanks!
The function takes an unsigned integer and returns an unsigned integer of the same width, but with the byte order reversed.
Resolves #1875
One thing I'd like to draw your attention to is the integer type promotion / demotion used to fix a strange behaviour when 16-bit arguments were received. Since I am not really competent in LLVM, it is possible that something is escaping me.
Also, I have updated TCP-related programs (e.g. tcpconnect.bt) to use the new bswap function, instead of the manual byte reordering.

Checklist
man/adoc/bpftrace.adoc and if needed in docs/reference_guide.md
CHANGELOG.md