- 1. Introduction
- 2. Environment
- 3. Compiling
- 4. What is opensnoop?
- 5. Walkthrough of opensnoop.bpf.c
- 6. Walkthrough of opensnoop.c
- 7. Conclusion
- 8. References
Author: 稲葉貴昭
1. Introduction
The previous overview article focused on what eBPF (hereafter BPF) is and how it is implemented, and also covered BCC, BCC-Tools, and CO-RE. This implementation article looks at the opensnoop command from libbpf-tools, the successor to BCC-Tools, and walks through it at the source-code level.
2. Environment
As of August 2022, BPF is still under active development, so unless you have a reason not to, it is better to use a recent kernel version.
For CO-RE in particular, the required libbpf version means that on Ubuntu you need 20.10 or later.
This investigation used Ubuntu 22.04 LTS, the latest LTS release.
The code examined is BCC v0.24.0, which was the latest version at the time of writing.*1
Looking at the libbpf-tools repository, you will find a large number of *.bpf.c and *.c files.
Files named "*.bpf.c" correspond to what the previous article called BPF programs.
Files named "*.c" correspond to what the previous article called BPF applications.
Of the many commands available, this article is built around opensnoop. Explaining every line of code would take far too much space, so only the parts the author considers key are covered.
3. Compiling
You can clone and compile with the following steps.
    sudo apt install make gcc libelf-dev clang llvm
    git clone https://github.com/iovisor/bcc.git
    cd bcc
    git checkout -b v0.24.0 refs/tags/v0.24.0
    git submodule update --init --recursive
    cd libbpf-tools
    make
4. What is opensnoop?
opensnoop is a command that traces which process opened which file whenever the open system call is invoked.
Its usage is as follows.
    Usage: opensnoop [OPTION...]
    Trace open family syscalls

    USAGE: opensnoop [-h] [-T] [-U] [-x] [-p PID] [-t TID] [-u UID] [-d DURATION] [-n NAME] [-e]

    EXAMPLES:
        ./opensnoop           # trace all open() syscalls
        ./opensnoop -T        # include timestamps
        ./opensnoop -U        # include UID
        ./opensnoop -x        # only show failed opens
        ./opensnoop -p 181    # only trace PID 181
        ./opensnoop -t 123    # only trace TID 123
        ./opensnoop -u 1000   # only trace UID 1000
        ./opensnoop -d 10     # trace for 10 seconds only
        ./opensnoop -n main   # only print process names containing "main"
        ./opensnoop -e        # show extended fields

      -d, --duration=DURATION    Duration to trace
      -e, --extended-fields      Print extended fields
      -n, --name=NAME            Trace process names containing this
      -p, --pid=PID              Process ID to trace
      -t, --tid=TID              Thread ID to trace
      -T, --timestamp            Print timestamp
      -u, --uid=UID              User ID to trace
      -U, --print-uid            Print UID
      -v, --verbose              Verbose debug output
      -x, --failed               Failed opens only
      -?, --help                 Give this help list
          --usage                Give a short usage message
      -V, --version              Print program version
Below is the output when opensnoop is run with no options.
Running it requires administrator privileges.
From left to right, the columns are: the ID of the process that called open, the process name, the file descriptor returned by open, errno, and the path of the opened file.
    PID    COMM              FD ERR PATH
    2870   systemd-journal   39   0 /proc/618498/comm
    2870   systemd-journal   39   0 /proc/618498/cmdline
    ...
    618500 opensnoop         24   0 /etc/localtime
    2870   systemd-journal   39   0 /proc/618498/loginuid
    2870   systemd-journal   39   0 /proc/618498/cgroup
    ...
    67345  tmux: server       9   0 /proc/618498/cmdline
    24927  irqbalance         6   0 /proc/interrupts
5. Walkthrough of opensnoop.bpf.c
opensnoop.bpf.c is the part of opensnoop that corresponds to the BPF program.
5.1 Map declarations
    const volatile __u64 min_us = 0;
    const volatile pid_t targ_pid = 0;
    const volatile pid_t targ_tgid = 0;
    const volatile uid_t targ_uid = 0;
    const volatile bool targ_failed = false;

    struct {
        __uint(type, BPF_MAP_TYPE_HASH);
        __uint(max_entries, 10240);
        __type(key, u32);
        __type(value, struct args_t);
    } start SEC(".maps");

    struct {
        __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
        __uint(key_size, sizeof(u32));
        __uint(value_size, sizeof(u32));
    } events SEC(".maps");
This part declares the maps, written in a style of C that is specific to BPF.
First, the const volatile variables at the top. targ_tgid, targ_pid, targ_uid, and targ_failed are used when opensnoop's -p, -t, -u, and -x options are specified.
For example, -p limits output to open system calls made by the process ID given as its argument, so once the BPF application has finished parsing its arguments, it sets the specified process ID in the variable.
Anyone who knows C may wonder: "These are declared const, yet they are rewritten later? And by a program in a different compilation unit?"
That question is answered later in the BPF application walkthrough. For now, just note that because the variables are declared const, they are placed in the rodata section.
The two structs below them are the map declarations.
start is a hash map with a maximum of 10240 entries. Its key is the PID in kernel terminology (that is, the thread ID), and its value holds the filename and flags passed to the open system call. It is used to store those arguments temporarily.
events is used to share the BPF program's results with the BPF application.
SEC() is a macro that places a variable in the specified section. Because it is written as SEC(".maps"), these variables are placed in the ELF .maps section.
5.2 Main processing of the BPF program
From here on is the main processing of opensnoop.bpf.c. opensnoop traces both the open and openat system calls, but the two do the same thing, so this article covers only the parts related to open.
     33 static __always_inline
     34 bool trace_allowed(u32 tgid, u32 pid)
     35 {
     36     u32 uid;
     37
     38     /* filters */
     39     if (targ_tgid && targ_tgid != tgid)
     40         return false;
     41     if (targ_pid && targ_pid != pid)
     42         return false;
     43     if (valid_uid(targ_uid)) {
     44         uid = (u32)bpf_get_current_uid_gid();
     45         if (targ_uid != uid) {
     46             return false;
     47         }
     48     }
     49     return true;
     50 }
     51
     52 SEC("tracepoint/syscalls/sys_enter_open")
     53 int tracepoint__syscalls__sys_enter_open(struct trace_event_raw_sys_enter* ctx)
     54 {
     55     u64 id = bpf_get_current_pid_tgid();
     56     /* use kernel terminology here for tgid/pid: */
     57     u32 tgid = id >> 32;
     58     u32 pid = id;
     59
     60     /* store arg info for later lookup */
     61     if (trace_allowed(tgid, pid)) {
     62         struct args_t args = {};
     63         args.fname = (const char *)ctx->args[0];
     64         args.flags = (int)ctx->args[1];
     65         bpf_map_update_elem(&start, &pid, &args, 0);
     66     }
     67     return 0;
     68 }
    ....
     88 static __always_inline
     89 int trace_exit(struct trace_event_raw_sys_exit* ctx)
     90 {
     91     struct event event = {};
     92     struct args_t *ap;
     93     int ret;
     94     u32 pid = bpf_get_current_pid_tgid();
     95
     96     ap = bpf_map_lookup_elem(&start, &pid);
     97     if (!ap)
     98         return 0;   /* missed entry */
     99     ret = ctx->ret;
    100     if (targ_failed && ret >= 0)
    101         goto cleanup;   /* want failed only */
    102
    103     /* event data */
    104     event.pid = bpf_get_current_pid_tgid() >> 32;
    105     event.uid = bpf_get_current_uid_gid();
    106     bpf_get_current_comm(&event.comm, sizeof(event.comm));
    107     bpf_probe_read_user_str(&event.fname, sizeof(event.fname), ap->fname);
    108     event.flags = ap->flags;
    109     event.ret = ret;
    110
    111     /* emit event */
    112     bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU,
    113                           &event, sizeof(event));
    114
    115 cleanup:
    116     bpf_map_delete_elem(&start, &pid);
    117     return 0;
    118 }
    119
    120 SEC("tracepoint/syscalls/sys_exit_open")
    121 int tracepoint__syscalls__sys_exit_open(struct trace_event_raw_sys_exit* ctx)
    122 {
    123     return trace_exit(ctx);
    124 }
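5.2.1 Lines 33-68
tracepoint__syscalls__sys_enter_open is the function called on the event of an open system call being invoked.
bpf_get_current_pid_tgid on line 55 is one of the BPF helper functions. It obtains the thread ID and process ID of the current task at the time of the call from kernel data structures. The 64-bit return value holds the process ID (TGID) in the upper 32 bits and the thread ID (PID in kernel terminology) in the lower 32 bits, which lines 57-58 split apart.
The helper itself is implemented inside the kernel.
trace_allowed decides whether the call should be traced (line 61). If so, the filename and flags passed to the open system call are stored in the map with the BPF helper function bpf_map_update_elem.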
5.2.2 Lines 88-124
tracepoint__syscalls__sys_exit_open is the function called when the open system call finishes.
trace_exit retrieves the data that tracepoint__syscalls__sys_enter_open stored in the map (line 96) and then copies the information the BPF application needs into the event struct (lines 104-109).
Finally, it calls the BPF helper function bpf_perf_event_output so that the BPF application can read the data.
5.2.3 Program types and context
The argument the kernel passes to a BPF program when it is called is known as the context.
The ctx passed to tracepoint__syscalls__sys_exit_open and tracepoint__syscalls__sys_enter_open is this context.
What is passed as the context is determined by the program type.
The program type indicates the kind of BPF program the user has written. Its values are defined in the kernel, and the BPF application must specify one when loading a BPF program.
The program type also affects which BPF helper functions the BPF program can call and, to begin with, when the kernel calls the BPF program.
The BPF helper functions callable from each program type are summarized in the BCC documentation.
How the BPF application specifies the program type is covered later, but to state the conclusion first: opensnoop uses BPF_PROG_TYPE_TRACEPOINT.
When BPF_PROG_TYPE_TRACEPOINT is used for a system call, as in opensnoop, the context passed to the BPF program can be inspected with the command below.
The context passed to sys_exit_open can be inspected by changing sys_enter_open to sys_exit_open in the path.
    sudo cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/format
5.3 Compiling opensnoop.bpf.c
Running "make opensnoop" produces the opensnoop executable through the following steps.
    clang -g -O2 -Wall -target bpf -D__TARGET_ARCH_x86 -Ix86/ -I.output -I../src/cc/libbpf/include/uapi -c opensnoop.bpf.c -o .output/opensnoop.bpf.o && llvm-strip -g .output/opensnoop.bpf.o
    bin/bpftool gen skeleton .output/opensnoop.bpf.o > .output/opensnoop.skel.h
    cc -g -O2 -Wall -I.output -I../src/cc/libbpf/include/uapi -c opensnoop.c -o .output/opensnoop.o
    cc -g -O2 -Wall .output/opensnoop.o bcc/libbpf-tools/.output/libbpf.a .output/trace_helpers.o .output/syscall_helpers.o .output/errno_helpers.o .output/map_helpers.o .output/uprobe_helpers.o -lelf -lz -o opensnoop
Line 1 uses Clang/LLVM to compile opensnoop.bpf.c into BPF bytecode.
Line 2 uses bpftool, a BPF development tool, to generate a header file named opensnoop.skel.h from opensnoop.bpf.o. This *.skel.h file is called a "skeleton header" and is the extracted framework of opensnoop.bpf.o.
Line 3 compiles opensnoop.c, and line 4 links it with libbpf and other objects to complete opensnoop.
Because the BPF program has to be loaded into the kernel, it is not linked into opensnoop itself.
How the BPF application (opensnoop.c) reads in the BPF program has to do with the skeleton header generated on line 2.
6. Walkthrough of opensnoop.c
opensnoop.c is the part of opensnoop that corresponds to the BPF application.
Sections 6.1 and 6.2 cover opensnoop.c itself, the main processing of the BPF application. The sections after that dig into the individual routines called from it.
The auto-generated skeleton code and libbpf's internal functions end up intermixed, but you can tell them apart by name: functions beginning with "opensnoop_" live in the skeleton header, and those beginning with "bpf_" live in libbpf.
6.1 Including header files
    17 #include "opensnoop.h"
    18 #include "opensnoop.skel.h"
    19 #include "trace_helpers.h"
Line 17 includes opensnoop.h, which defines the structs shared with the BPF program.
Line 18 includes the skeleton header auto-generated from the BPF program, which brings the bytecode embedded in the skeleton header into memory.
6.2 Main processing of the BPF application
In outline, the main processing of the BPF application does the following. The code itself comes after the list.
- opensnoop_bpf__open: reads the skeleton header, instantiates the various structs, and sets the program type
- opensnoop_bpf__load: creates the maps and loads the BPF programs into the kernel
- opensnoop_bpf__attach: attaches the BPF programs to events
- perf_buffer__poll: runs the handler whenever the BPF program writes to the map
    215 int main(int argc, char **argv)
    216 {
    ...
    231     libbpf_set_strict_mode(LIBBPF_STRICT_ALL);
    232     libbpf_set_print(libbpf_print_fn);
    233
    234     obj = opensnoop_bpf__open();
    ...
    241     obj->rodata->targ_tgid = env.pid;
    242     obj->rodata->targ_pid = env.tid;
    243     obj->rodata->targ_uid = env.uid;
    244     obj->rodata->targ_failed = env.failed;
    ...
    256     err = opensnoop_bpf__load(obj);
    ...
    262     err = opensnoop_bpf__attach(obj);
    ...
    278     /* setup event callbacks */
    279     pb = perf_buffer__new(bpf_map__fd(obj->maps.events), PERF_BUFFER_PAGES,
    280                           handle_event, handle_lost_events, NULL, NULL);
    ...
    297     /* main: poll */
    298     while (!exiting) {
    299         err = perf_buffer__poll(pb, PERF_POLL_TIMEOUT_MS);
    ...
    308     }
    ...
    315 }
6.3 Lines 231-232
libbpf_set_strict_mode on line 231 lets the application adapt early to the large-scale changes coming to libbpf in the near future. As of August 2022 the latest libbpf release is v0.8.1, and it has been announced that v1.0 will bring major changes.
Line 232 replaces the function that libbpf uses to print its messages. opensnoop's libbpf_print_fn writes to standard error using vfprintf.
6.4 opensnoop_bpf__open
opensnoop_bpf__open performs a variety of tasks, including reading the skeleton header, instantiating various structs, and setting the program type, and builds the opensnoop_bpf struct.
The opensnoop_bpf struct is defined in opensnoop.skel.h.
    struct opensnoop_bpf {
        struct bpf_object_skeleton *skeleton;
        struct bpf_object *obj;
        struct {
            struct bpf_map *start;
            struct bpf_map *events;
            struct bpf_map *rodata;
        } maps;
        struct {
            struct bpf_program *tracepoint__syscalls__sys_enter_open;
            struct bpf_program *tracepoint__syscalls__sys_enter_openat;
            struct bpf_program *tracepoint__syscalls__sys_exit_open;
            struct bpf_program *tracepoint__syscalls__sys_exit_openat;
        } progs;
        struct {
            struct bpf_link *tracepoint__syscalls__sys_enter_open;
            struct bpf_link *tracepoint__syscalls__sys_enter_openat;
            struct bpf_link *tracepoint__syscalls__sys_exit_open;
            struct bpf_link *tracepoint__syscalls__sys_exit_openat;
        } links;
        struct opensnoop_bpf__rodata {
            __u64 min_us;
            pid_t targ_pid;
            pid_t targ_tgid;
            uid_t targ_uid;
            bool targ_failed;
        } *rodata;
    };
As mentioned earlier, the skeleton header is auto-generated from the BPF program by bpftool, so many of the struct member names are the names given in the BPF program.
The role of each member is explained later as needed. One of the jobs of opensnoop_bpf__open is to create the opensnoop_bpf struct and set its values.
As shown below, opensnoop_bpf__open simply calls opensnoop_bpf__open_opts.
    53 static inline struct opensnoop_bpf *
    54 opensnoop_bpf__open_opts(const struct bpf_object_open_opts *opts)
    55 {
    ...
    59     obj = (struct opensnoop_bpf *)calloc(1, sizeof(*obj));
    ...
    65     err = opensnoop_bpf__create_skeleton(obj);
    ...
    69     err = bpf_object__open_skeleton(obj->skeleton, opts);
    ...
    78 }
    79
    80 static inline struct opensnoop_bpf *
    81 opensnoop_bpf__open(void)
    82 {
    83     return opensnoop_bpf__open_opts(NULL);
    84 }
Line 59 allocates the opensnoop_bpf struct, and line 65 calls opensnoop_bpf__create_skeleton to build the skeleton struct.
opensnoop_bpf__create_skeleton looks like this.
    static inline int
    opensnoop_bpf__create_skeleton(struct opensnoop_bpf *obj)
    {
        struct bpf_object_skeleton *s;

        s = (struct bpf_object_skeleton *)calloc(1, sizeof(*s));
        if (!s)
            goto err;
        obj->skeleton = s;

        s->sz = sizeof(*s);
        s->name = "opensnoop_bpf";
        s->obj = &obj->obj;

        /* maps */
        s->map_cnt = 3;
        s->map_skel_sz = sizeof(*s->maps);
        s->maps = (struct bpf_map_skeleton *)calloc(s->map_cnt, s->map_skel_sz);
        if (!s->maps)
            goto err;

        s->maps[0].name = "start";
        s->maps[0].map = &obj->maps.start;

        s->maps[1].name = "events";
        s->maps[1].map = &obj->maps.events;

        s->maps[2].name = "opensnoo.rodata";
        s->maps[2].map = &obj->maps.rodata;
        s->maps[2].mmaped = (void **)&obj->rodata;
    ...
        s->progs = (struct bpf_prog_skeleton *)calloc(s->prog_cnt, s->prog_skel_sz);
        if (!s->progs)
            goto err;

        s->progs[0].name = "tracepoint__syscalls__sys_enter_open";
        s->progs[0].prog = &obj->progs.tracepoint__syscalls__sys_enter_open;
        s->progs[0].link = &obj->links.tracepoint__syscalls__sys_enter_open;
    ...
        s->progs[2].name = "tracepoint__syscalls__sys_exit_open";
        s->progs[2].prog = &obj->progs.tracepoint__syscalls__sys_exit_open;
        s->progs[2].link = &obj->links.tracepoint__syscalls__sys_exit_open;
    ...
        s->data = (void *)opensnoop_bpf__elf_bytes(&s->data_sz);
    ...
    static inline const void *opensnoop_bpf__elf_bytes(size_t *sz)
    {
        *sz = 12752;
        return (const void *) "
    \x7f\x45\x4c\x46\x02\x01\x01\0\0\0\0\0\0\0\0\0\x01\0\xf7\0\x01\0\0\0\0\0\0\0\0\
    \0\0\0\0\0\0\0\0\0\0\0\xd0\x2c\0\0\0\0\0\0\0\0\0\0\x40\0\0\0\0\0\x40\0\x14\0\
    \x01\0\xbf\x16\0\0\0\0\0\0\x85\0\0\0\x0e\0\0\0\x63\x0a\xfc\xff\0\0\0\0\x18\x01\
The byte string returned by opensnoop_bpf__elf_bytes is the bytecode of the BPF program.
In later steps this ELF binary is parsed with libelf, transferred into the necessary structs, and finally loaded into the kernel.
This is why -lelf is specified when linking opensnoop.
As a result of calling opensnoop_bpf__create_skeleton, the opensnoop_bpf struct ends up as shown in Figure 1.
As Figure 1 shows, most of the contents of the bpf_object_skeleton struct are references into the opensnoop_bpf struct. This is because the following steps move into libbpf code, so the data in opensnoop_bpf, a struct specific to opensnoop, needs to be tied to bpf_object_skeleton, a generic struct.
bpf_object__open_skeleton looks like this.
     6957 static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf, size_t obj_buf_sz,
     6958                                           const struct bpf_object_open_opts *opts)
     6959 {
    ...
     6997     obj = bpf_object__new(path, obj_buf, obj_buf_sz, obj_name);
    ...
     7032     err = err ? : bpf_object__init_maps(obj, opts);
     7033     err = err ? : bpf_object_init_progs(obj, opts);
    ...
     7044 }
    ...
     7087 struct bpf_object *
     7088 bpf_object__open_mem(const void *obj_buf, size_t obj_buf_sz,
     7089                      const struct bpf_object_open_opts *opts)
     7090 {
    ...
     7094     return libbpf_ptr(bpf_object_open(NULL, obj_buf, obj_buf_sz, opts));
     7095 }
    ...
    11643 int bpf_object__open_skeleton(struct bpf_object_skeleton *s,
    11644                               const struct bpf_object_open_opts *opts)
    11645 {
    ...
    11664     obj = bpf_object__open_mem(s->data, s->data_sz, &skel_opts);
    ...
    11672     *s->obj = obj;
    ...
    11699 }
    ...
libbpf contains a lot of code, so parts that are not discussed have been boldly omitted.
bpf_object__open_skeleton calls bpf_object__open_mem→bpf_object_open to create a bpf_object struct.
The bpf_object struct is an important one: it ultimately holds the arguments passed to the kernel when the bpf system call is made.
Line 11672 sets it so that it can be referenced from the opensnoop_bpf struct.
bpf_object__new on line 6997 allocates the bpf_object struct and, among other things, sets a reference to the bytes of the opensnoop.bpf.o ELF binary.
6.4.1 bpf_object__init_maps
bpf_object__init_maps on line 7032 initializes the maps. As part of that, it processes the variables declared with SEC(".maps") in the BPF program through the following call chain.
bpf_object__init_maps→bpf_object__init_user_btf_maps→bpf_object__init_user_btf_map→parse_btf_map_def
In the end, start and events declared in the BPF program are stored in the maps array of the bpf_object struct and are used in a later step, when the bpf system call is made to create the maps.
The global variables declared const in the BPF program are also handled, through the following call chain.
bpf_object__init_maps→bpf_object__init_global_data_maps→bpf_object__init_internal_map
The initial values assigned to the global variables in the BPF program are copied, on line 1579, into a region allocated with the mmap system call.
6.4.2 bpf_object_init_progs
Line 6931 sets the program type of the programs to be loaded into the kernel.
As mentioned earlier, BPF_PROG_TYPE_TRACEPOINT is what gets set for opensnoop, and this is tied to the SEC("tracepoint/syscalls/sys_enter_open") annotation in the BPF program.
find_sec_def on line 6923 searches the section_defs table below for the entry whose string literal matches the section name. The result is a match with line 8585 of the table, which makes prog->sec_def->prog_type equal to BPF_PROG_TYPE_TRACEPOINT.
    8558 #define SEC_DEF(sec_pfx, ptype, atype, flags, ...) { \
    8559     .sec = sec_pfx, \
    8560     .prog_type = BPF_PROG_TYPE_##ptype, \
    8561     .expected_attach_type = atype, \
    8562     .cookie = (long)(flags), \
    8563     .preload_fn = libbpf_preload_prog, \
    8564     __VA_ARGS__ \
    8565 }
    ...
    8574 static const struct bpf_sec_def section_defs[] = {
    8575     SEC_DEF("socket", SOCKET_FILTER, 0, SEC_NONE | SEC_SLOPPY_PFX),
    ...
    8578     SEC_DEF("kprobe/", KPROBE, 0, SEC_NONE, attach_kprobe),
    8579     SEC_DEF("uprobe/", KPROBE, 0, SEC_NONE),
    8580     SEC_DEF("kretprobe/", KPROBE, 0, SEC_NONE, attach_kprobe),
    8581     SEC_DEF("uretprobe/", KPROBE, 0, SEC_NONE),
    ...
    8585     SEC_DEF("tracepoint/", TRACEPOINT, 0, SEC_NONE, attach_tp),
    8586     SEC_DEF("tp/", TRACEPOINT, 0, SEC_NONE, attach_tp),
    ...
6.5 Lines 241-245
These lines assign the filter option values that the user specified as arguments.
The BPF program refers to targ_tgid, targ_pid, and so on, and what it ultimately sees are the values set here.
Within the BPF program's own compilation unit, the variables are declared const and so cannot be modified.
What obj->rodata refers to, however, is the region allocated with mmap inside bpf_object__init_internal_map, which is why the values can be changed.
That said, changes to these global variables only take effect between opensnoop_bpf__open and opensnoop_bpf__load.
From opensnoop_bpf__load onward the BPF program has been loaded into the kernel, so the variables it refers to can no longer be changed.
6.6 opensnoop_bpf__load
opensnoop_bpf__load is implemented in the skeleton header file, but all it does is call libbpf's bpf_object__load_skeleton, as shown below.
    static inline int
    opensnoop_bpf__load(struct opensnoop_bpf *obj)
    {
        return bpf_object__load_skeleton(obj->skeleton);
    }
bpf_object__load_skeleton calls bpf_object__load→bpf_object_load.
    7464 static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const char *target_btf_path)
    7465 {
    ...
    7485     err = err ? : bpf_object__create_maps(obj);
    ...
    7487     err = err ? : bpf_object__load_progs(obj, extra_log_level);
    ...
    7530 }
6.6.1 bpf_object__create_maps
On line 7485, bpf_object_load creates the maps using the bpf system call.
Processing proceeds as follows:
bpf_object__create_maps→bpf_object__create_map→bpf_map_create→sys_bpf_fd→sys_bpf
and the maps are ultimately created by calling the bpf system call with cmd=BPF_MAP_CREATE.
Incidentally, for the map placed in the rodata section, bpf_object__populate_internal_map is called at this point, and it makes the map read-only by calling the bpf system call with BPF_MAP_FREEZE.
6.6.2 bpf_object__load_progs
bpf_object__load_progs loads the logic of the BPF programs, such as tracepoint__syscalls__sys_enter_open, into the kernel.
Processing proceeds as follows:
bpf_object__load_progs→bpf_object_load_prog→bpf_object_load_prog_instance→bpf_prog_load→sys_bpf_prog_load
and the programs are ultimately loaded into the kernel by calling the bpf system call with cmd=BPF_PROG_LOAD.
6.7 opensnoop_bpf__attach
When opensnoop_bpf__load calls the bpf system call, the program is checked by the Verifier, then JIT-compiled into native code, and loading into the kernel is complete.
To have the BPF program run when an event occurs, the loaded BPF program must be attached to a hook point.
That is the job of opensnoop_bpf__attach.
The code for opensnoop_bpf__attach is in the skeleton header and, as shown below, simply calls libbpf's bpf_object__attach_skeleton.
    static inline int
    opensnoop_bpf__attach(struct opensnoop_bpf *obj)
    {
        return bpf_object__attach_skeleton(obj->skeleton);
    }
bpf_object__attach_skeleton calls bpf_program__attach for each BPF program loaded into the kernel.
The attach_fn that is used is declared in libbpf and differs depending on the section in which the BPF program was placed.
attach_tp, which is called for programs placed in a tracepoint section, proceeds internally through bpf_program__attach_tracepoint→bpf_program__attach_tracepoint_opts.
bpf_program__attach_tracepoint_opts is the core of the attach processing and looks like this.
    struct bpf_link *bpf_program__attach_tracepoint_opts(const struct bpf_program *prog,
                                                         const char *tp_category,
                                                         const char *tp_name,
                                                         const struct bpf_tracepoint_opts *opts)
    {
    ...
        pfd = perf_event_open_tracepoint(tp_category, tp_name);
    ...
        link = bpf_program__attach_perf_event_opts(prog, pfd, &pe_opts);
    ...
    }
perf_event_open_tracepoint calls the perf_event_open system call to create a file descriptor for PERF_TYPE_TRACEPOINT.
perf is the Linux mechanism for performance monitoring.
The perf command, used for example to read CPU counter values, appears to call the perf_event_open system call internally.
As shown above, attaching to a tracepoint with BPF uses the perf mechanism. If SEC() in the BPF program specifies something like "kprobe/" instead, the kprobe mechanism is used.
In libbpf-tools, bindsnoop is one example that uses kprobes.
Processing then continues through bpf_program__attach_perf_event_opts→bpf_link_create, and the program is ultimately attached by calling the bpf system call with cmd=BPF_LINK_CREATE. If the kernel does not support BPF_LINK_CREATE, it appears that ioctl(PERF_EVENT_IOC_SET_BPF) is used to attach instead. After that, ioctl(PERF_EVENT_IOC_ENABLE) enables the perf event, so that the BPF program is executed whenever the perf event fires.
6.8 Lines 279-308
Once attaching is complete, all that remains is for the BPF application to read what the BPF program writes to the map and print it.
perf_buffer__new and perf_buffer__poll do exactly that: perf_buffer__new sets the map to poll and the handler (handle_event) to call when an event occurs, and perf_buffer__poll waits for events (writes to the map) until the exit condition is met.
The screen output code does nothing particularly difficult, so it is omitted.
As an aside, these days using ringbuf seems to be preferred over the perf_buffer approach that opensnoop uses. Since this article is about opensnoop, ringbuf is not pursued further.
perf_buffer__new is a function-like macro. What is actually called is perf_buffer__new_v0_6_0, which in turn calls __perf_buffer__new, where the main processing takes place.
__perf_buffer__new contains variable names of the form cpu*. These exist because perf_buffer keeps a separate buffer for each CPU.
__perf_buffer__new looks like this.
    10945 static struct perf_buffer *__perf_buffer__new(int map_fd, size_t page_cnt,
    10946                                               struct perf_buffer_params *p)
    10947 {
    ...
    10965     err = bpf_obj_get_info_by_fd(map_fd, &map, &map_info_len);
    ...
    10991     pb->sample_cb = p->sample_cb;
    ...
    10999     pb->epoll_fd = epoll_create1(EPOLL_CLOEXEC);
    ...
    11038     for (i = 0, j = 0; i < pb->cpu_cnt; i++) {
    ...
    11050
    11051         cpu_buf = perf_buffer__open_cpu_buf(pb, p->attr, cpu, map_key);
    ...
    11059         err = bpf_map_update_elem(pb->map_fd, &map_key,
    11060                                   &cpu_buf->fd, 0);
    ...
    11068
    11069         pb->events[j].events = EPOLLIN;
    11070         pb->events[j].data.ptr = cpu_buf;
    11071         if (epoll_ctl(pb->epoll_fd, EPOLL_CTL_ADD, cpu_buf->fd,
    11072                       &pb->events[j]) < 0) {
    ...
    11078         }
    ...
    11080     }
    ...
    11091 }
Line 10965 retrieves information about the map created in the kernel through its file descriptor.
This map is the one declared as events in the BPF program.
After that, pb (the perf_buffer struct) is allocated, and line 10991 assigns p->sample_cb. This sample_cb is the handle_event passed in from opensnoop.c.
Line 10999 creates an epoll file descriptor, which is later used to wait for events in perf_buffer__poll.
Line 11051 calls perf_buffer__open_cpu_buf, which, for each CPU, creates a buffer and calls the perf_event_open system call, among other things.
What perf_buffer__open_cpu_buf does appears to be what is required in order to use bpf_perf_event_output, which the BPF program calls.
The manual for bpf_perf_event_output describes the procedure needed to read the data from user space.
Line 11059 associates events with each per-CPU buffer. The key is the loop index (i).
Line 11071 calls epoll_ctl to add each per-CPU buffer to the set monitored by epoll.
Next is perf_buffer__poll, whose code looks like this.
    11106 static enum bpf_perf_event_ret
    11107 perf_buffer__process_record(struct perf_event_header *e, void *ctx)
    11108 {
    11109     struct perf_cpu_buf *cpu_buf = ctx;
    11110     struct perf_buffer *pb = cpu_buf->pb;
    11111     void *data = e;
    ...
    11117     switch (e->type) {
    11118     case PERF_RECORD_SAMPLE: {
    11119         struct perf_sample_raw *s = data;
    11120
    11121         if (pb->sample_cb)
    11122             pb->sample_cb(pb->ctx, cpu_buf->cpu, s->data, s->size);
    11123         break;
    11124     }
    ...
    11135     }
    11136     return LIBBPF_PERF_EVENT_CONT;
    11137 }
    ...
    11139 static int perf_buffer__process_records(struct perf_buffer *pb,
    11140                                         struct perf_cpu_buf *cpu_buf)
    11141 {
    ...
    11144     ret = perf_event_read_simple(cpu_buf->base, pb->mmap_size,
    11145                                  pb->page_size, &cpu_buf->buf,
    11146                                  &cpu_buf->buf_size,
    11147                                  perf_buffer__process_record, cpu_buf);
    ...
    11151 }
    ...
    11158 int perf_buffer__poll(struct perf_buffer *pb, int timeout_ms)
    11159 {
    ...
    11162     cnt = epoll_wait(pb->epoll_fd, pb->events, pb->cpu_cnt, timeout_ms);
    ...
    11165
    11166     for (i = 0; i < cnt; i++) {
    11167         struct perf_cpu_buf *cpu_buf = pb->events[i].data.ptr;
    11168
    11169         err = perf_buffer__process_records(pb, cpu_buf);
    ...
    11174     }
    ...
    11176 }
On line 11162, epoll_wait blocks until a per-CPU buffer becomes readable or the timeout (100 milliseconds) expires.
When something has been written to a buffer, line 11167 takes out the pointer to that buffer and perf_buffer__process_records is called.
The buffers created in __perf_buffer__new are used as ring buffers. perf_event_read_simple takes out the data actually written to the buffer and passes it to perf_buffer__process_record.
Finally, the sample_cb called there is the handle_event passed in from opensnoop.c, which prints the data handed over by the BPF program.
7. Conclusion
BPF, including its tooling, is a technology still very much under active development.
Last year Microsoft announced BPF support on Windows, and the BPF mechanism is spreading to operating systems other than Linux.
For a sense of where BPF is heading, the materials from Linux Kernel Developers' bpfconf 2022 may be a useful reference.
8. References
Liz Rice. What is eBPF? O'Reilly Media, Inc., April 2022.
*1: v0.25.0 was released on August 11, 2022, but this article targets v0.24.0, which was the latest version at the time of writing.