Systems Performance 2nd Ed.



BPF Performance Tools book

Recent posts:
Blog index
About
RSS

sysdig CCWFS 2016: Designing Tracing Tools

Video: https://www.youtube.com/watch?v=uibLwoVKjec

Keynote by Brendan Gregg for Sysdig Camp-Con-World-Fest-Summit 2016.

Description: "You have a system with an advanced programmatic tracer: do you know what to do with it? Brendan has used numerous tracers in production environments, and has published hundreds of tracing-based tools. In this talk he will share tips and know-how for creating CLI tracing tools and GUI visualizations, to solve real problems effectively. Programmatic tracing is an amazing superpower, and this talk will show you how to wield it!"

next
prev
1/40
next
prev
2/40
next
prev
3/40
next
prev
4/40
next
prev
5/40
next
prev
6/40
next
prev
7/40
next
prev
8/40
next
prev
9/40
next
prev
10/40
next
prev
11/40
next
prev
12/40
next
prev
13/40
next
prev
14/40
next
prev
15/40
next
prev
16/40
next
prev
17/40
next
prev
18/40
next
prev
19/40
next
prev
20/40
next
prev
21/40
next
prev
22/40
next
prev
23/40
next
prev
24/40
next
prev
25/40
next
prev
26/40
next
prev
27/40
next
prev
28/40
next
prev
29/40
next
prev
30/40
next
prev
31/40
next
prev
32/40
next
prev
33/40
next
prev
34/40
next
prev
35/40
next
prev
36/40
next
prev
37/40
next
prev
38/40
next
prev
39/40
next
prev
40/40

PDF: SysdigSummit2016_DesigningTracingTools.pdf

Keywords (from pdftotext):

slide 1:
    Designing
    Tracing
    Tools
    Brendan Gregg, Senior Performance Architect
    
    slide 2:
      Wielding Superpowers
      
      slide 3:
        I'm currently developing more tracing tools (bcc/BPF)
        
        slide 4:
          Tool Design
          • For tool developers
          • For everyone else: what you can ask for
          – Tool templates
          – GUI visualizations
          • The following is applicable to all tracers
          – sysdig, bcc/BPF, DTrace, SystemTap, LTTng, …
          
          slide 5:
            Methodologies
            
            slide 6:
              Methodology-driven Design
              • Methodologies provide ideas for purposeful tools
              • Find/draw a functional diagram, apply methods
              Operating Systems
              See: http://www.brendangregg.com/methodology.html
              
              slide 7:
                Methodology Examples
                Eg, at the syscall layer (well known & documented):
                • Workload Characterization
                – exec() or open() per-event trace
                (execsnoop, opensnoop)
                – connect()/accept() per-event trace (tcpconnect, tcpaccept)
                – read()/write() size histogram
                (one-liners)
                • Latency Analysis
                – read()/write() latency histogram
                (funclatency, …)
                • USE Method
                – network utilization by thread
                (not done yet)
                – syscall errors
                (fserrors, soerrors)
                
                slide 8:
                  CLI Tracing Tools
                  
                  slide 9:
                    CLI Templates
                    1. Per event output
                    *snoop, *slower 0, …
                    2. Filtered event output
                    *slower
                    3. Interval summary
                    *stat, *top
                    4. Count summary
                    *count
                    5. Histogram summary
                    *dist, *latency
                    6. Heatmap summary
                    spectrogram.lua, subsecoffset.lua, …
                    
                    slide 10:
                      Template 1: Per Event Output
                      Examples: *snoop, *slower 0, …
                      # opensnoop
                      PID
                      COMM
                      10085 sshd
                      10085 sshd
                      10085 sshd
                      10085 sshd
                      10085 sshd
                      10085 sshd
                      10085 sshd
                      10085 sshd
                      10085 sshd
                      10085 sshd
                      10085 sshd
                      10085 sshd
                      10085 sshd
                      10085 sshd
                      10085 sshd
                      10085 sshd
                      […]
                      FD ERR PATH
                      0 /lib/x86_64-linux-gnu/libkeyutils.so.1
                      0 /lib/x86_64-linux-gnu/libresolv.so.2
                      0 /lib/x86_64-linux-gnu/libgpg-error.so.0
                      0 /dev/urandom
                      2 /lib/x86_64-linux-gnu/.libcrypto.so.1.0.0.hmac
                      2 /proc/sys/crypto/fips_enabled
                      0 /proc/filesystems
                      0 /dev/null
                      0 /proc/10085/fd
                      0 /usr/lib/ssl/openssl.cnf
                      0 /etc/gai.conf
                      0 /etc/nsswitch.conf
                      0 /etc/ld.so.cache
                      0 /lib/x86_64-linux-gnu/libnss_compat.so.2
                      0 /etc/ld.so.cache
                      0 /lib/x86_64-linux-gnu/libnss_nis.so.2
                      
                      slide 11:
                        Template 2: Filtered Event Output
                        Examples: *slower
                        # sysdig -c fileslower 1
                        TIME
                        PROCESS
                        2014-04-13 20:40:43.973 cksum
                        2014-04-13 20:40:44.187 cksum
                        2014-04-13 20:40:44.689 cksum
                        2014-04-13 20:40:45.005 cksum
                        2014-04-13 20:40:45.193 cksum
                        […]
                        TYPE
                        read
                        read
                        read
                        read
                        read
                        LAT(ms) FILE
                        2 /mnt/partial.0.0
                        1 /mnt/partial.0.0
                        2 /mnt/partial.0.0
                        2 /mnt/partial.0.0
                        1 /mnt/partial.0.0
                        Tools like this can also do all event output:
                        # sysdig -c fileslower 0
                        TIME
                        PROCESS
                        2014-04-13 20:59:04.414 ls
                        2014-04-13 20:59:04.414 ls
                        2014-04-13 20:59:04.414 ls
                        2014-04-13 20:59:04.414 ls
                        2014-04-13 20:59:04.414 ls
                        2014-04-13 20:59:04.415 ls
                        2014-04-13 20:59:04.415 ls
                        [...]
                        TYPE
                        read
                        read
                        read
                        read
                        read
                        read
                        read
                        LAT(ms) FILE
                        0 /lib/x86_64-linux-gnu/librt.so.1
                        0 /lib/x86_64-linux-gnu/libacl.so.1
                        0 /lib/x86_64-linux-gnu/libc.so.6
                        0 /lib/x86_64-linux-gnu/libdl.so.2
                        0 /lib/x86_64-linux-gnu/libattr.so.1
                        0 /proc/filesystems
                        0 /proc/filesystems
                        
                        slide 12:
                          Template 3: Interval Summary
                          Examples: *stat, *top
                          # dcstat
                          TIME
                          08:11:47:
                          08:11:48:
                          08:11:49:
                          08:11:50:
                          08:11:51:
                          08:11:52:
                          08:11:53:
                          08:11:54:
                          08:11:55:
                          08:11:56:
                          08:11:57:
                          […]
                          REFS/s
                          SLOW/s
                          MISS/s
                          HIT%
                          
                          slide 13:
                            Template 4: Count Summary
                            Examples: *count
                            # funccount 'vfs_*'
                            Tracing... Ctrl-C to end.
                            ADDR
                            FUNC
                            ffffffff811efe81 vfs_create
                            ffffffff811f24a1 vfs_rename
                            ffffffff81215191 vfs_fsync_range
                            ffffffff81231df1 vfs_lock_file
                            ffffffff811e8dd1 vfs_fstatat
                            ffffffff811e8d71 vfs_fstat
                            ffffffff811e4381 vfs_write
                            ffffffff811e8c71 vfs_getattr_nosec
                            ffffffff811e8d41 vfs_getattr
                            ffffffff811e3221 vfs_open
                            ffffffff811e4251 vfs_read
                            Detaching...
                            COUNT
                            
                            slide 14:
                              Template 5: Histogram Summary
                              Examples: *dist, *latency
                              # biolatency
                              Tracing block device I/O... Hit Ctrl-C to end.
                              usecs
                              : count
                              distribution
                              4 ->gt; 7
                              : 0
                              8 ->gt; 15
                              : 0
                              16 ->gt; 31
                              : 0
                              32 ->gt; 63
                              : 0
                              64 ->gt; 127
                              : 1
                              128 ->gt; 255
                              : 12
                              |********
                              256 ->gt; 511
                              : 15
                              |**********
                              512 ->gt; 1023
                              : 43
                              |*******************************
                              1024 ->gt; 2047
                              : 52
                              |**************************************|
                              2048 ->gt; 4095
                              : 47
                              |**********************************
                              4096 ->gt; 8191
                              : 52
                              |**************************************|
                              8192 ->gt; 16383
                              : 36
                              |**************************
                              16384 ->gt; 32767
                              : 15
                              |**********
                              32768 ->gt; 65535
                              : 2
                              65536 ->gt; 131071
                              : 2
                              
                              slide 15:
                                Template 6: Heatmap Summary
                                Example: subsecoffset.lua (aka "spectrogram")
                                
                                slide 16:
                                  
                                  slide 17:
                                    Valuable
                                    Know what already exists, and what doesn't
                                    
                                    slide 18:
                                      Low Overhead (or documented)
                                      sysdig
                                      Kernel
                                      sysdig
                                      driver
                                      1. enable
                                      3. output
                                      lua
                                      program
                                      2. async
                                      read
                                      syscalls
                                      ring
                                      buffer
                                      • Understand tracing internals
                                      – For example, sysdig's design has ~20x lower overhead than strace
                                      (it still has overhead: test and measure to see if it's acceptable)
                                      – Tracing overhead is usually relative to event rate
                                      • Design for low overhead, and document expectations
                                      
                                      slide 19:
                                        Documentation
                                        • Good tools have 3 docs:
                                        .TH
                                        Title heading
                                        Code comments
                                        .SH
                                        Section heading
                                        Man page
                                        .IP
                                        Indented paragraph
                                        Examples file
                                        .TP
                                        Indented paragraph with label
                                        Bold
                                        • Man page
                                        – troff, docbook, …
                                        • Examples file:
                                        common man macros (see groff_man(7))
                                        Demonstrations of biosnoop, the Linux eBPF/bcc version.
                                        biosnoop traces block device I/O (disk I/O), and prints a line of output
                                        per I/O. Example:
                                        # ./biosnoop
                                        TIME(s)
                                        [...]
                                        COMM
                                        supervise
                                        PID
                                        DISK
                                        xvda1
                                        SECTOR
                                        BYTES
                                        LAT(ms)
                                        
                                        slide 20:
                                          Concise, intuitive, self-explanatory
                                          # ./iolatency
                                          Tracing block I/O. Output every 1 seconds. Ctrl-C to end.
                                          >gt;=(ms) .. gt; 1
                                          1 ->gt; 2
                                          2 ->gt; 4
                                          4 ->gt; 8
                                          8 ->gt; 16
                                          […]
                                          : I/O
                                          : 4381
                                          : 9
                                          : 5
                                          : 0
                                          : 1
                                          |Distribution
                                          |######################################|
                                          • Useful startup message
                                          – What I'm tracing, when there's output, when I'll end
                                          • Vigorous tooling is concise
                                          – No wasted text; leave less useful output for non-default options
                                          • Unix philosophy: do one thing and do it well
                                          
                                          slide 21:
                                            POSIX-style Arguments
                                            # ./biolatency -h
                                            usage: biolatency [-h] [-T] [-Q] [-m] [-D] [interval] [count]
                                            Summarize block device I/O latency as a histogram
                                            positional arguments:
                                            interval
                                            output interval, in seconds
                                            count
                                            number of outputs
                                            optional arguments:
                                            -h, --help
                                            -T, --timestamp
                                            -Q, --queued
                                            -m, --milliseconds
                                            -D, --disks
                                            examples:
                                            ./biolatency
                                            ./biolatency 1 10
                                            ./biolatency -mT 1
                                            ./biolatency -Q
                                            ./biolatency -D
                                            show this help message and exit
                                            include timestamp on output
                                            include OS queued time in I/O time
                                            millisecond histogram
                                            print a histogram per disk device
                                            # summarize block I/O latency as a histogram
                                            # print 1 second summaries, 10 times
                                            # 1s summaries, milliseconds, and timestamps
                                            # include OS queued time in I/O time
                                            # show each disk device separately
                                            
                                            slide 22:
                                              POSIX-style Arguments
                                              Option
                                              Alternate
                                              Expectation
                                              --all
                                              all events
                                              -c CMD
                                              --cmd …
                                              run this command
                                              -d SECONDS
                                              --duration …
                                              duration of tool execution
                                              --help
                                              help
                                              -i FILE
                                              --input …
                                              input file
                                              -i SECONDS
                                              --interval …
                                              summary interval
                                              -n name
                                              --name …
                                              this process name only
                                              -o FILE
                                              --output …
                                              output file
                                              -p PID
                                              --pid …
                                              this process ID only
                                              --by-process
                                              per-process ID breakdown
                                              -P PORT
                                              --port …
                                              this TCP port only
                                              -t or -T
                                              --[no]timestamp
                                              include or exclude timestamps
                                              --verbose
                                              verbose output
                                              --extended, --errors
                                              extended output, or only failures
                                              [interval [count]]
                                              summary interval, and # of outputs
                                              
                                              slide 23:
                                                Testing, Testing, Testing
                                                • If you can't write the workload, you can't write the tool
                                                – Be it 10 lines of C, some shell, or dd
                                                – dd if=/dev/urandom of=/dev/null bs=1k count=23
                                                • Known workload testing: create 23 events
                                                • Testing can be time consuming
                                                – I spend more time testing a tool than writing it
                                                – Automatic tool testing is a difficult problem
                                                
                                                slide 24:
                                                  Example: gethostlatency
                                                  # gethostlatency
                                                  TIME
                                                  PID
                                                  COMM
                                                  06:10:24 28011 wget
                                                  06:10:28 28127 wget
                                                  06:10:41 28404 wget
                                                  06:10:48 28544 curl
                                                  06:11:10 29054 curl
                                                  06:11:16 29195 curl
                                                  06:11:25 29404 curl
                                                  06:11:28 29475 curl
                                                  LATms HOST
                                                  90.00 www.iovisor.org
                                                  0.00 www.iovisor.org
                                                  9.00 www.netflix.com
                                                  35.00 www.netflix.com.au
                                                  31.00 www.plumgrid.com
                                                  3.00 www.facebook.com
                                                  72.00 foo
                                                  1.00 foo
                                                  
                                                  slide 25:
                                                    Example: ext4slower
                                                    # ext4slower 1
                                                    Tracing ext4 operations slower than 1 ms
                                                    TIME
                                                    COMM
                                                    PID
                                                    T BYTES
                                                    OFF_KB
                                                    06:49:17 bash
                                                    R 128
                                                    06:49:17 cksum
                                                    R 39552
                                                    06:49:17 cksum
                                                    R 96
                                                    06:49:17 cksum
                                                    R 96
                                                    06:49:17 cksum
                                                    R 10320
                                                    06:49:17 cksum
                                                    R 65536
                                                    06:49:17 cksum
                                                    R 55400
                                                    06:49:17 cksum
                                                    R 36792
                                                    06:49:17 cksum
                                                    R 15008
                                                    06:49:17 cksum
                                                    R 6123
                                                    repository
                                                    06:49:17 cksum
                                                    R 6280
                                                    06:49:17 cksum
                                                    R 27696
                                                    06:49:17 cksum
                                                    R 58080
                                                    06:49:17 cksum
                                                    R 906
                                                    […]
                                                    LAT(ms) FILENAME
                                                    7.75 cksum
                                                    1.34 [
                                                    5.36 2to3-2.7
                                                    14.94 2to3-3.4
                                                    6.82 411toppm
                                                    4.01 a2p
                                                    8.77 ab
                                                    16.34 aclocal-1.14
                                                    19.31 acpi_listen
                                                    17.23 add-apt18.40 addpart
                                                    2.16 addr2line
                                                    10.11 ag
                                                    6.30 ec2-meta-data
                                                    
                                                    slide 26:
                                                      Example: biolatency
                                                      # biolatency -m 1 5
                                                      Tracing block device I/O... Hit Ctrl-C to end.
                                                      msecs
                                                      0 ->gt; 1
                                                      2 ->gt; 3
                                                      4 ->gt; 7
                                                      8 ->gt; 15
                                                      16 ->gt; 31
                                                      32 ->gt; 63
                                                      64 ->gt; 127
                                                      […]
                                                      : count
                                                      : 36
                                                      : 1
                                                      : 3
                                                      : 17
                                                      : 33
                                                      : 7
                                                      : 6
                                                      distribution
                                                      |**************************************|
                                                      |***
                                                      |*****************
                                                      |**********************************
                                                      |*******
                                                      |******
                                                      
                                                      slide 27:
                                                        GUI Tracing Tools
                                                        
                                                        slide 28:
                                                          GUI Visualizations
                                                          1. Event logs
                                                          2. Tables
                                                          3. Line graphs
                                                          4. Histograms
                                                          5. Heatmaps (spectrographs)
                                                          6. Waterfall charts
                                                          7. Directed graphs
                                                          8. Flame graphs
                                                          9. Flame charts
                                                          
                                                          slide 29:
                                                            Visualization 1: Event Logs
                                                            https://commons.wikimedia.org/wiki/File:Wireshark_screenshot.png
                                                            
                                                            slide 30:
                                                              Visualization 2: Tables
                                                              
                                                              slide 31:
                                                                Visualization 3: Line Graphs
                                                                http://www.paradyn.org/html/screen-shots.html
                                                                
                                                                slide 32:
                                                                  Visualization 4: Histograms
                                                                  Or a density plot
                                                                  Or as a frequency trail (can cascade)
                                                                  
                                                                  slide 33:
                                                                    Visualization 5: Heat Maps
                                                                    eg, Oracle ZFS Storage Appliance Analytics (DTrace-based)
                                                                    
                                                                    slide 34:
                                                                      Visualization 5: Spectrograms
                                                                      
                                                                      slide 35:
                                                                        Visualization 6: Waterfall Charts
                                                                        
                                                                        slide 36:
                                                                          Visualization 7: Directed Graphs
                                                                          
                                                                          slide 37:
                                                                            Visualization 8: Flame Graphs
                                                                            fstat from disk
                                                                            directory read
                                                                            from disk
                                                                            file read
                                                                            from disk
                                                                            path read from disk
                                                                            pipe write
                                                                            Commonly used with CPU profilers. Also useful for tracers: off-CPU time, ...
                                                                            
                                                                            slide 38:
                                                                              Visualization 9: Flame Charts
                                                                              
                                                                              slide 39:
                                                                                Desirable Attributes
                                                                                • Valuable
                                                                                – Methodologies provide ideas for purposeful metrics
                                                                                • Documented
                                                                                – Tool tips, wikis
                                                                                • Tested
                                                                                • Real Time
                                                                                • Dashboards
                                                                                – To support methodologies
                                                                                
                                                                                slide 40:
                                                                                  Thank You!
                                                                                  http://www.brendangregg.com
                                                                                  http://slideshare.net/brendangregg
                                                                                  [email protected]
                                                                                  @brendangregg
                                                                                  References & Links:
                                                                                  – http://www.brendangregg.com/heatmaps.html
                                                                                  – http://www.brendangregg.com/flamegraphs.html
                                                                                  – http://www.slideshare.net/brendangregg/monitorama-2015-netflix-instance-analysis