Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accessing pointers broken on LLVM <12 #1305

Closed
danobi opened this issue Apr 24, 2020 · 19 comments · Fixed by #1720
Closed

Accessing pointers broken on LLVM <12 #1305

danobi opened this issue Apr 24, 2020 · 19 comments · Fixed by #1720
Labels
bug Something isn't working

Comments

@danobi
Copy link
Member

danobi commented Apr 24, 2020

bpftrace --info

System
  OS: Linux 5.6.4-arch1-1 #1 SMP PREEMPT Mon, 13 Apr 2020 12:21:19 +0000
  Arch: x86_64

Build
  version: v0.10.0-63-g4b37
  LLVM: 10
  foreach_sym: yes
  unsafe uprobe: no
  btf: no
  bfd: yes
  bpf_attach_kfunc: yes

Kernel helpers
  probe_read: yes
  probe_read_str: yes
  probe_read_user: yes
  probe_read_user_str: yes
  probe_read_kernel: yes
  probe_read_kernel_str: yes
  get_current_cgroup_id: yes
  send_signal: yes
  override_return: yes

Kernel features
  Instruction limit: 1000000
  Loop support: yes

Map types
  hash: yes
  percpu hash: yes
  array: yes
  percpu array: yes
  stack_trace: yes
  perf_event_array: yes

Probe types
  kprobe: yes
  tracepoint: yes
  perf_event: yes
  kfunc: no

What reproduces the bug?

This works:

#include <linux/fs.h>
#include <linux/blk_types.h>
#include <linux/blkdev.h>
#include <linux/blk-cgroup.h>
#include <linux/cgroup-defs.h>
#include <linux/sched.h>

kprobe:blk_update_request
{
        $req = (struct request *)arg0;
        printf("cgroup name: %p\n", $req->bio->bi_blkg->blkcg->css.cgroup->kn->parent);
}

with output like this:

cgroup name: 0xffffa02fdd173480
cgroup name: 0xffffa02fdd173480
cgroup name: (nil)
cgroup name: (nil)

But this script does not work:

#include <linux/fs.h>
#include <linux/blk_types.h>
#include <linux/blkdev.h>
#include <linux/blk-cgroup.h>
#include <linux/cgroup-defs.h>
#include <linux/sched.h>

kprobe:blk_update_request
{
        $req = (struct request *)arg0;
        $kn = $req->bio->bi_blkg->blkcg->css.cgroup->kn;
        printf("cgroup name: %p\n", $kn->parent);
}

with output like:

cgroup name: (nil)
cgroup name: (nil)
cgroup name: (nil)
cgroup name: (nil)
cgroup name: (nil)
cgroup name: (nil)
cgroup name: (nil)
cgroup name: (nil)
cgroup name: (nil)
cgroup name: (nil)

There are no non-nils.

I think there might have been some change w/ handling pointers already on the stack. Maybe it's eliding a bpf_probe_read()? Could make sense cuz then kernel will fail the dereference and give you 0x0.

cc @mmisono

@danobi danobi added the bug Something isn't working label Apr 24, 2020
@mmisono
Copy link
Collaborator

mmisono commented Apr 24, 2020

It's strange. When I try

#include <linux/fs.h>
#include <linux/blk_types.h>
#include <linux/blkdev.h>
#include <linux/blk-cgroup.h>
#include <linux/cgroup-defs.h>
#include <linux/sched.h>

kprobe:blk_update_request
{
        $req = (struct request *)arg0;
        $parent = $req->bio->bi_blkg->blkcg->css.cgroup->kn->parent;

        $kn = $req->bio->bi_blkg->blkcg->css.cgroup->kn;
        $parent2 = $kn->parent;

        printf("cgroup name: %p, %p\n", $parent, $parent2);
}

then it seems to work.

Attaching 1 probe...
cgroup name: (nil), (nil)
cgroup name: 0xffff8c32d9e1a300, 0xffff8c32d9e1a300

but surely

kprobe:blk_update_request
{
        $req = (struct request *)arg0;
        $kn = $req->bio->bi_blkg->blkcg->css.cgroup->kn;
        printf("cgroup name: %p\n", $kn->parent);
}

does not.

@mmisono
Copy link
Collaborator

mmisono commented Apr 24, 2020

Apparently generated code is wrong.

kprobe:blk_update_request
{
        $req = (struct request *)arg0;
        $kn = $req->bio->bi_blkg->blkcg->css.cgroup->kn;
        printf("cgroup name: %p, %p\n", (uint64)$kn, $kn->parent);
}

The program store zero *(u64 *)(r10 -32) = r1 at the location where $kn is stored.

24: (79) r3 = *(u64 *)(r10 -32)
25: (07) r3 += 288
26: (bf) r1 = r10
27: (07) r1 += -32
28: (b7) r2 = 8
29: (85) call bpf_probe_read#4
last_idx 29 first_idx 24
regs=4 stack=0 before 28: (b7) r2 = 8
30: (b7) r1 = 0
31: (7b) *(u64 *)(r10 -32) = r1
last_idx 31 first_idx 24
regs=2 stack=0 before 30: (b7) r1 = 0
32: (79) r3 = *(u64 *)(r10 -32)
33: (7b) *(u64 *)(r10 -24) = r3
34: (07) r3 += 8
35: (bf) r1 = r10
36: (07) r1 += -8
37: (b7) r2 = 8
38: (85) call bpf_probe_read#4

all generated code
Attaching 1 probe...

Program ID: 361369

Bytecode: 
0: (bf) r6 = r1
1: (79) r3 = *(u64 *)(r6 +112)
2: (07) r3 += 56
3: (bf) r1 = r10
4: (07) r1 += -32
5: (b7) r2 = 8
6: (85) call bpf_probe_read#4
last_idx 6 first_idx 0
regs=4 stack=0 before 5: (b7) r2 = 8
7: (79) r3 = *(u64 *)(r10 -32)
8: (07) r3 += 72
9: (bf) r1 = r10
10: (07) r1 += -32
11: (b7) r2 = 8
12: (85) call bpf_probe_read#4
last_idx 12 first_idx 0
regs=4 stack=0 before 11: (b7) r2 = 8
13: (79) r3 = *(u64 *)(r10 -32)
14: (07) r3 += 40
15: (bf) r1 = r10
16: (07) r1 += -32
17: (b7) r2 = 8
18: (85) call bpf_probe_read#4
last_idx 18 first_idx 13
regs=4 stack=0 before 17: (b7) r2 = 8
19: (79) r3 = *(u64 *)(r10 -32)
20: (bf) r1 = r10
21: (07) r1 += -32
22: (b7) r2 = 8
23: (85) call bpf_probe_read#4
last_idx 23 first_idx 13
regs=4 stack=0 before 22: (b7) r2 = 8
24: (79) r3 = *(u64 *)(r10 -32)
25: (07) r3 += 288
26: (bf) r1 = r10
27: (07) r1 += -32
28: (b7) r2 = 8
29: (85) call bpf_probe_read#4
last_idx 29 first_idx 24
regs=4 stack=0 before 28: (b7) r2 = 8
30: (b7) r1 = 0
31: (7b) *(u64 *)(r10 -32) = r1
last_idx 31 first_idx 24
regs=2 stack=0 before 30: (b7) r1 = 0
32: (79) r3 = *(u64 *)(r10 -32)
33: (7b) *(u64 *)(r10 -24) = r3
34: (07) r3 += 8
35: (bf) r1 = r10
36: (07) r1 += -8
37: (b7) r2 = 8
38: (85) call bpf_probe_read#4
last_idx 38 first_idx 24
regs=4 stack=0 before 37: (b7) r2 = 8
39: (79) r1 = *(u64 *)(r10 -8)
40: (7b) *(u64 *)(r10 -16) = r1
41: (18) r7 = 0xffff8c32cad79400
43: (85) call bpf_get_smp_processor_id#8
44: (bf) r4 = r10
45: (07) r4 += -32
46: (bf) r1 = r6
47: (bf) r2 = r7
48: (bf) r3 = r0
49: (b7) r5 = 24
50: (85) call bpf_perf_event_output#25
last_idx 50 first_idx 39
regs=20 stack=0 before 49: (b7) r5 = 24
51: (b7) r0 = 0
52: (95) exit
processed 52 insns (limit 1000000) max_states_per_insn 0 total_states 4 peak_states 4 mark_read 3

@mmisono
Copy link
Collaborator

mmisono commented Apr 24, 2020

I found llvm 8 does not have the problem, but llvm9 and 10 has.

left: llvm10, right: llvm8
image

I don't see any LLVM IR differences other than attributes (left: llvm10, right: llvm8).

image

@fbs
Copy link
Member

fbs commented Sep 14, 2020

IR:

define i64 @"kprobe:blk_update_request"(i8* %0) local_unnamed_addr section "s_kprobe:blk_update_request_1" {
entry:
  %"struct kernfs_node.parent" = alloca i64, align 8
  %printf_args = alloca %printf_t, align 8
  %"struct cgroup.kn" = alloca i64, align 8
  %"struct cgroup_subsys_state.cgroup" = alloca i64, align 8
  %"struct blkcg_gq.blkcg" = alloca i64, align 8
  %"struct bio.bi_blkg" = alloca i64, align 8
  %"struct request.bio" = alloca i64, align 8
  %1 = getelementptr i8, i8* %0, i64 112
  %2 = bitcast i8* %1 to i64*
  %arg0 = load volatile i64, i64* %2, align 8
  %3 = add i64 %arg0, 56
  %4 = bitcast i64* %"struct request.bio" to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %4)
  %probe_read = call i64 inttoptr (i64 4 to i64 (i64*, i32, i64)*)(i64* nonnull %"struct request.bio", i32 8, i64 %3)
  %5 = load i64, i64* %"struct request.bio", align 8
  call void @llvm.lifetime.end.p0i8(i64 -1, i8* nonnull %4)
  %6 = add i64 %5, 72
  %7 = bitcast i64* %"struct bio.bi_blkg" to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %7)
  %probe_read1 = call i64 inttoptr (i64 4 to i64 (i64*, i32, i64)*)(i64* nonnull %"struct bio.bi_blkg", i32 8, i64 %6)
  %8 = load i64, i64* %"struct bio.bi_blkg", align 8
  call void @llvm.lifetime.end.p0i8(i64 -1, i8* nonnull %7)
  %9 = add i64 %8, 40
  %10 = bitcast i64* %"struct blkcg_gq.blkcg" to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %10)
  %probe_read2 = call i64 inttoptr (i64 4 to i64 (i64*, i32, i64)*)(i64* nonnull %"struct blkcg_gq.blkcg", i32 8, i64 %9)
  %11 = load i64, i64* %"struct blkcg_gq.blkcg", align 8
  call void @llvm.lifetime.end.p0i8(i64 -1, i8* nonnull %10)
  %12 = bitcast i64* %"struct cgroup_subsys_state.cgroup" to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %12)
  %probe_read3 = call i64 inttoptr (i64 4 to i64 (i64*, i32, i64)*)(i64* nonnull %"struct cgroup_subsys_state.cgroup", i32 8, i64 %11)
  %13 = load i64, i64* %"struct cgroup_subsys_state.cgroup", align 8
  call void @llvm.lifetime.end.p0i8(i64 -1, i8* nonnull %12)
  %14 = add i64 %13, 288
  %15 = bitcast i64* %"struct cgroup.kn" to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %15)
  %probe_read4 = call i64 inttoptr (i64 4 to i64 (i64*, i32, i64)*)(i64* nonnull %"struct cgroup.kn", i32 8, i64 %14)
  %16 = load i64, i64* %"struct cgroup.kn", align 8
  call void @llvm.lifetime.end.p0i8(i64 -1, i8* nonnull %15)
  %17 = bitcast %printf_t* %printf_args to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %17)
  %18 = add i64 %16, 8
  %19 = bitcast i64* %"struct kernfs_node.parent" to i8*
  %20 = getelementptr inbounds %printf_t, %printf_t* %printf_args, i64 0, i32 0
  store i64 0, i64* %20, align 8
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %19)
  %probe_read5 = call i64 inttoptr (i64 4 to i64 (i64*, i32, i64)*)(i64* nonnull %"struct kernfs_node.parent", i32 8, i64 %18)
  %21 = load i64, i64* %"struct kernfs_node.parent", align 8
  call void @llvm.lifetime.end.p0i8(i64 -1, i8* nonnull %19)
  %22 = getelementptr inbounds %printf_t, %printf_t* %printf_args, i64 0, i32 1
  store i64 %21, i64* %22, align 8
  %pseudo = call i64 @llvm.bpf.pseudo(i64 1, i64 1)
  %get_cpu_id = call i64 inttoptr (i64 8 to i64 ()*)()
  %perf_event_output = call i64 inttoptr (i64 25 to i64 (i8*, i64, i64, %printf_t*, i64)*)(i8* %0, i64 %pseudo, i64 %get_cpu_id, %printf_t* nonnull %printf_args, i64 16)
  call void @llvm.lifetime.end.p0i8(i64 -1, i8* nonnull %17)
  ret i64 0
}

llvm10

23: (85) call bpf_probe_read_compat#-53648
  24: (79) r3 = *(u64 *)(r10 -24)
  25: (07) r3 += 288
  26: (bf) r1 = r10
  27: (07) r1 += -24
  28: (b7) r2 = 8
  29: (85) call bpf_probe_read_compat#-53648
  30: (b7) r1 = 0
  31: (7b) *(u64 *)(r10 -24) = r1
  32: (79) r3 = *(u64 *)(r10 -24)
  33: (07) r3 += 8
  34: (bf) r1 = r10
  35: (07) r1 += -8
  36: (b7) r2 = 8
  37: (85) call bpf_probe_read_compat#-53648
  38: (79) r1 = *(u64 *)(r10 -8)
  39: (7b) *(u64 *)(r10 -16) = r1
  40: (18) r7 = map[id:413]
  42: (85) call bpf_get_smp_processor_id#109888
  43: (bf) r4 = r10
  44: (07) r4 += -24
  45: (bf) r1 = r6
  46: (bf) r2 = r7
  47: (bf) r3 = r0
  48: (b7) r5 = 16
  49: (85) call bpf_perf_event_output#-51984

llvm7

  23: (85) call bpf_probe_read#-52672
  24: (79) r3 = *(u64 *)(r10 -24)
  25: (bf) r1 = r10
  26: (07) r1 += -24
  27: (b7) r2 = 8
  28: (85) call bpf_probe_read#-52672
  29: (79) r3 = *(u64 *)(r10 -24)
  30: (b7) r1 = 0
  31: (7b) *(u64 *)(r10 -24) = r1
  32: (07) r3 += 8
  33: (bf) r1 = r10
  34: (07) r1 += -8
  35: (b7) r2 = 8
  36: (85) call bpf_probe_read#-52672
  37: (79) r1 = *(u64 *)(r10 -8)
  38: (7b) *(u64 *)(r10 -16) = r1
  39: (18) r7 = map[id:11601]
  41: (85) call bpf_get_smp_processor_id#98192
  42: (bf) r4 = r10
  43: (07) r4 += -24
  44: (bf) r1 = r6
  45: (bf) r2 = r7
  46: (bf) r3 = r0
  47: (b7) r5 = 16
  48: (85) call bpf_perf_event_output#-51120

Looking at the IR lifetime seem to be ok. Not sure what else could trigger this, I'm looking into it.

@fbs
Copy link
Member

fbs commented Sep 14, 2020

Using:

sudo bpftrace llvm9_issue.bt -d | sed -n '/ModuleID/,$p' | llc-7 --print-after-all --print-before-all 2>&1 | sed -n '/Before BPF DAG->DAG Pattern/,/IR Dump Before/p'

On llvm7:

  LIFETIME_START %stack.2.struct cgroup.kn
  ADJCALLSTACKDOWN 0, 0, implicit-def dead $r11, implicit $r11
  %18:gpr = MOV_rr %stack.2.struct cgroup.kn
  $r1 = COPY %18:gpr
  $r2 = COPY %4:gpr
  $r3 = COPY %17:gpr
  JAL 4, implicit-def $r0, implicit-def dead $r1, implicit-def dead $r2, implicit-def dead $r3, implicit-def dead $r4, implicit-def dead $r5, implicit $r11, implicit $r1, implicit $r2, implicit $r3
  ADJCALLSTACKUP 0, 0, implicit-def dead $r11, implicit $r11
  %19:gpr = COPY $r0
  %20:gpr = LDD %stack.2.struct cgroup.kn, 0 :: (dereferenceable load 8 from %ir."struct cgroup.kn")
  LIFETIME_END %stack.2.struct cgroup.kn
  LIFETIME_START %stack.1.printf_args
  %21:gpr = MOV_ri 0
  STD %21:gpr, %stack.1.printf_args, 0 :: (store 8 into %ir.19)
  LIFETIME_START %stack.0.struct kernfs_node.parent
  ADJCALLSTACKDOWN 0, 0, implicit-def dead $r11, implicit $r11
  %22:gpr = ADD_ri %20:gpr, 8
  %23:gpr = MOV_rr %stack.0.struct kernfs_node.parent
  $r1 = COPY %23:gpr
  $r2 = COPY %4:gpr
  $r3 = COPY %22:gpr
  JAL 4, implicit-def $r0, implicit-def dead $r1, implicit-def dead $r2, implicit-def dead $r3, implicit-def dead $r4, implicit-def dead $r5, implicit $r11, implicit $r1, implicit $r2, implicit $r3
  ADJCALLSTACKUP 0, 0, implicit-def dead $r11, implicit $r11
  %24:gpr = COPY $r0
  %25:gpr = LDD %stack.0.struct kernfs_node.parent, 0 :: (dereferenceable load 8 from %ir."struct kernfs_node.parent")
  LIFETIME_END %stack.0.struct kernfs_node.parent

llvm9

  LIFETIME_START %stack.2.struct cgroup.kn
  ADJCALLSTACKDOWN 0, 0, implicit-def dead $r11, implicit $r11
  %18:gpr = MOV_rr %stack.2.struct cgroup.kn
  $r1 = COPY %18:gpr
  $r2 = COPY %4:gpr
  $r3 = COPY %17:gpr
  JAL 4, implicit-def $r0, implicit-def dead $r1, implicit-def dead $r2, implicit-def dead $r3, implicit-def dead $r4, implicit-def dead $r5, implicit $r11, implicit $r1, implicit $r2, implicit $r3
  ADJCALLSTACKUP 0, 0, implicit-def dead $r11, implicit $r11
  %19:gpr = COPY $r0
  %20:gpr = MOV_ri 0
  STD %20:gpr, %stack.1.printf_args, 0 :: (store 8 into %ir.19)
  %21:gpr = LDD %stack.2.struct cgroup.kn, 0 :: (dereferenceable load 8 from %ir."struct cgroup.kn")
  LIFETIME_END %stack.2.struct cgroup.kn
  LIFETIME_START %stack.1.printf_args
  LIFETIME_START %stack.0.struct kernfs_node.parent
  ADJCALLSTACKDOWN 0, 0, implicit-def dead $r11, implicit $r11
  %22:gpr = ADD_ri %21:gpr(tied-def 0), 8
  %23:gpr = MOV_rr %stack.0.struct kernfs_node.parent
  $r1 = COPY %23:gpr
  $r2 = COPY %4:gpr
  $r3 = COPY %22:gpr
  JAL 4, implicit-def $r0, implicit-def dead $r1, implicit-def dead $r2, implicit-def dead $r3, implicit-def dead $r4, implicit-def dead $r5, implicit $r11, implicit $r1, implicit $r2, implicit $r3
  ADJCALLSTACKUP 0, 0, implicit-def dead $r11, implicit $r11
  %24:gpr = COPY $r0
  %25:gpr = LDD %stack.0.struct kernfs_node.parent, 0 :: (dereferenceable load 8 from %ir."struct kernfs_node.parent")
  STD killed %25:gpr, %stack.1.printf_args, 8 :: (store 8 into %ir.21)
  LIFETIME_END %stack.0.struct kernfs_node.parent

and the diff:

--- llvm7	2020-09-14 18:33:17.000000000 +0200
+++ llvm9	2020-09-14 18:33:09.000000000 +0200
@@ -7,14 +7,14 @@
   JAL 4, implicit-def $r0, implicit-def dead $r1, implicit-def dead $r2, implicit-def dead $r3, implicit-def dead $r4, implicit-def dead $r5, implicit $r11, implicit $r1, implicit $r2, implicit $r3
   ADJCALLSTACKUP 0, 0, implicit-def dead $r11, implicit $r11
   %19:gpr = COPY $r0
-  %20:gpr = LDD %stack.2.struct cgroup.kn, 0 :: (dereferenceable load 8 from %ir."struct cgroup.kn")
+  %20:gpr = MOV_ri 0
+  STD %20:gpr, %stack.1.printf_args, 0 :: (store 8 into %ir.19)
+  %21:gpr = LDD %stack.2.struct cgroup.kn, 0 :: (dereferenceable load 8 from %ir."struct cgroup.kn")
   LIFETIME_END %stack.2.struct cgroup.kn
   LIFETIME_START %stack.1.printf_args
-  %21:gpr = MOV_ri 0
-  STD %21:gpr, %stack.1.printf_args, 0 :: (store 8 into %ir.19)
   LIFETIME_START %stack.0.struct kernfs_node.parent
   ADJCALLSTACKDOWN 0, 0, implicit-def dead $r11, implicit $r11
-  %22:gpr = ADD_ri %20:gpr, 8
+  %22:gpr = ADD_ri %21:gpr(tied-def 0), 8
   %23:gpr = MOV_rr %stack.0.struct kernfs_node.parent
   $r1 = COPY %23:gpr
   $r2 = COPY %4:gpr
@@ -23,5 +23,6 @@
   ADJCALLSTACKUP 0, 0, implicit-def dead $r11, implicit $r11
   %24:gpr = COPY $r0
   %25:gpr = LDD %stack.0.struct kernfs_node.parent, 0 :: (dereferenceable load 8 from %ir."struct kernfs_node.parent")
+  STD killed %25:gpr, %stack.1.printf_args, 8 :: (store 8 into %ir.21)
   LIFETIME_END %stack.0.struct kernfs_node.parent

The IR after the last pass (IR Dump After Module Verifier) is identical between 7 and 9, so it looks like its in the BPF DAG->DAG Pattern Instruction Selection phase, which for ebpf looks mostly standard.

@fbs
Copy link
Member

fbs commented Sep 15, 2020

Removing the lifetime end fixes it:

--- llvm9-issue.ll	2020-09-15 09:46:45.062129782 +0000
+++ llvm9-issue-hack.ll	2020-09-15 09:58:51.975020340 +0000
@@ -48,7 +48,6 @@
   call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %15)
   %probe_read4 = call i64 inttoptr (i64 4 to i64 (i64*, i32, i64)*)(i64* nonnull %"struct cgroup.kn", i32 8, i64 %14)
   %16 = load i64, i64* %"struct cgroup.kn", align 8
-  call void @llvm.lifetime.end.p0i8(i64 -1, i8* nonnull %15)
   %17 = bitcast %printf_t* %printf_args to i8*
   call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %17)
   %18 = add i64 %16, 8
$ diff -u <(llc llvm9-issue.ll -o -) <(llc llvm9-issue-hack.ll -o -)
--- /dev/fd/63	2020-09-15 10:00:40.536629484 +0000
+++ /dev/fd/62	2020-09-15 10:00:40.536629484 +0000
@@ -34,12 +34,12 @@
 	r3 = *(u64 *)(r10 - 24)
 	r3 += 288
 	r1 = r10
-	r1 += -24
+	r1 += -32
 	r2 = 8
 	call 4
 	r1 = 0
 	*(u64 *)(r10 - 24) = r1
-	r3 = *(u64 *)(r10 - 24)
+	r3 = *(u64 *)(r10 - 32)
 	r3 += 8
 	r1 = r10
 	r1 += -8

But that doesn't explain why it reorders the load/store this way.

@yonghong-song
Copy link

From https://llvm.org/docs/LangRef.html, llvm.lifetime.{start,end} encloses a region of lifetime of a memory object.
From the above IR, I see

  %"struct cgroup.kn" = alloca i64, align 8
...
  %15 = bitcast i64* %"struct cgroup.kn" to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %15)
  %probe_read4 = call i64 inttoptr (i64 4 to i64 (i64*, i32, i64)*)(i64* nonnull %"struct cgroup.kn", i32 8, i64 %14)
  %16 = load i64, i64* %"struct cgroup.kn", align 8
  call void @llvm.lifetime.end.p0i8(i64 -1, i8* nonnull %15)
  %17 = bitcast %printf_t* %printf_args to i8*
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %17)
  %18 = add i64 %16, 8
  %19 = bitcast i64* %"struct kernfs_node.parent" to i8*
  %20 = getelementptr inbounds %printf_t, %printf_t* %printf_args, i64 0, i32 0
  store i64 0, i64* %20, align 8
  call void @llvm.lifetime.start.p0i8(i64 -1, i8* nonnull %19)
  %probe_read5 = call i64 inttoptr (i64 4 to i64 (i64*, i32, i64)*)(i64* nonnull %"struct kernfs_node.parent", i32 8, i64 %18)

After call void @llvm.lifetime.end.p0i8(i64 -1, i8* nonnull %15), the compiler is free to reuse the same stack location, and based on the above, the compiler is supposed to do load first and then reuse the stack. This is exactly llvm7 asm code is doing.

Need to do a little bit study on why llvm did the above reordering. I guess this might be a llvm bug. One comment mentioned it is after BPF DAG->DAG Pattern Instruction Selection. Let me take a further check and then will comment back.

@yonghong-song
Copy link

I tried to build bpftrace from source.
I have locally built llvm10 and its install bin in PATH env variable.
I then build bcc from source following https://github.com/iovisor/bcc/blob/master/INSTALL.md
I then tried to build bpftrace from source with
cmake -DCMAKE_BUILD_TYPE=Release ../, and then
make

I then hit the following error

[ 24%] Building CXX object src/CMakeFiles/bpftrace.dir/utils.cpp.o
[ 24%] Linking CXX executable bpftrace
ast/libast.a(codegen_llvm.cpp.o):(.rodata._ZTIN4llvm3orc22LegacyLookupFnResolverIZN8bpftrace6BpfOrcC4EPNS_13TargetMachineEEUlRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEE_EE[_ZTIN4llvm3orc22LegacyLookupFnResolverIZN8bpftrace6BpfOrcC4EPNS_13TargetMachineEEUlRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEE_EE]+0x10): undefined reference to `typeinfo for llvm::orc::SymbolResolver'
ast/libast.a(codegen_llvm.cpp.o):(.rodata._ZTIN8bpftrace13MemoryManagerE[_ZTIN8bpftrace13MemoryManagerE]+0x10): undefined reference to `typeinfo for llvm::SectionMemoryManager'
ast/libast.a(codegen_llvm.cpp.o):(.rodata._ZTIN4llvm9ErrorInfoINS_9ErrorListENS_13ErrorInfoBaseEEE[_ZTIN4llvm9ErrorInfoINS_9ErrorListENS_13ErrorInfoBaseEEE]+0x10): undefined reference to `typeinfo for llvm::ErrorInfoBase'
collect2: error: ld returned 1 exit status
make[2]: *** [src/CMakeFiles/bpftrace.dir/build.make:504: src/bpftrace] Error 1
make[1]: *** [CMakeFiles/Makefile2:1202: src/CMakeFiles/bpftrace.dir/all] Error 2
make: *** [Makefile:141: all] Error 2

Any particular tip on how to resolve this?

@fbs
Copy link
Member

fbs commented Sep 20, 2020

Is llvm built with rtti?

@yonghong-song
Copy link

with rtti on for llvm build, the bpftrace build issue is gone. Thanks!

@yonghong-song
Copy link

I filed a bug against llvm upstream: https://bugs.llvm.org/show_bug.cgi?id=47591

@fbs
Copy link
Member

fbs commented Sep 29, 2020

Thanks @yonghong-song

As nobody has found IR issues this looks like an LLVM bug, how can we best move that forward? I'd like to dig into it but it might take a while.

@yonghong-song
Copy link

@fbs This is a BPF backend fix. https://reviews.llvm.org/D88525 I need to resolve bpf selftest failures before merging. Also, I am wondering whether there is an issue selection dag optimization process. Please feel free to try the patch and also suggest whether there is a better approach. Thanks!

@fbs
Copy link
Member

fbs commented Sep 30, 2020

Ah nice 👍, I'l see if I can get that LLVM version built and tested.

Any idea how we can implement a w/a for this? Ditching all lifetime ends seems to work, but that might blow the stack quickly. Or maybe we can avoid the last lifetime end and hope thats enough. I'll try to experiment a bit

@yonghong-song
Copy link

Any idea how we can implement a w/a for this? Ditching all lifetime ends seems to work, but that might blow the stack quickly. Or maybe we can avoid the last lifetime end and hope thats enough. I'll try to experiment a bit

That is a problem. Selection dag seems assuming different stack alloc won't collide with each other in memory access.

I guess bpftrace could generate different IR. For example, do an analysis to find maximum stack requirement considering the access chain and just do one alloc so you will only have one lifetime.

@yonghong-song
Copy link

The bug is fixed in llvm12. This LLVM patch https://reviews.llvm.org/D91833 fixed the problem and I added a test case (derived from this issue) in BPF https://reviews.llvm.org/D92451.

@fbs
Copy link
Member

fbs commented Mar 6, 2021

Cannot reproduce this one and #1558 with the llvm12 build from #1716 \o/

@fbs fbs mentioned this issue Mar 6, 2021
@fbs fbs mentioned this issue Mar 17, 2021
@fbs fbs closed this as completed in #1720 Mar 17, 2021
@fbs fbs reopened this Mar 17, 2021
@danobi
Copy link
Member Author

danobi commented Apr 2, 2021

Going to close this out b/c we released support for llvm 12. Going to pin this issue so users are aware of implications of using llvm <12.

@danobi danobi closed this as completed Apr 2, 2021
@danobi danobi pinned this issue Apr 2, 2021
@danobi danobi changed the title Reading pointers off stack broken Accessing pointers broken on LLVM <12 Apr 2, 2021
@ajor ajor unpinned this issue Feb 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants
@mmisono @fbs @danobi @yonghong-song and others