Solaris Performance: Introduction

Slides for an introduction to performance analysis and tuning on the Solaris operating system in 2007 by Brendan Gregg.


PDF: Sun2007_Solaris_Performance_Intro.pdf

Keywords (from pdftotext):

slide 1:
    Solaris Performance:
    Introduction
    Brendan Gregg
    Sun Microsystems
    May 2007
    [Background: "vmstat 1" output]
    
    slide 2:
      Solaris Performance: Introduction
      • This presentation is an introduction to the field of
      Solaris performance.
      • These slides cover:
      > Solaris Performance Features
      – Top Features
      – Solaris
      – Solaris 10
      > Solaris Performance Observability
      – By-Layer Strategy
      – 3-Metric Strategy
      – System Components
      
      slide 3:
        Performance Matters
        • How performance helps you:
        1. Shipped performance features
        – Solaris can do more with less
        2. Tune performance features
        – Solaris tunables, library features, compiler optimisation, ...
        3. Manage resources
        – Get the best ROI
        4. Solve performance issues
        – Solaris has outstanding performance observability
        
        slide 4:
          Solaris Performance Features
          • Solaris is a mature operating system with numerous
          performance features
          • Top performance features are:
          > CPU and Memory Scalability
          > 64-bit Support
          > Fully Preemptive Kernel
          > Resource Management
          > Compiler Technology
          > Observability
          
          slide 5:
            CPU and Memory Scalability
            • Sun bet on SMP in the early 90s
            > Symmetric Multi Processing: user and kernel work
            distributed across all CPUs - best scalability
            • Per-CPU dispatcher queues
            • Thread CPU affinity
            • Processor sets and interrupt masking
            • CMP and CMT support and optimisations
            • Memory locality aware
            • Kernel page relocation - for hot plug and DR
            
            slide 6:
              64-Bit Support
              • Since Solaris 7 (October 1998)
              • Originally for SPARC, now also x64 (AMD64 and Intel EM64T)
              Fully Preemptive Kernel
              • Allows Real Time scheduling class
              
              slide 7:
                Resource Management
                • Standard tools: pbind, ulimit
                • Processor sets, pools
                • IPQoS - IP Quality of Service (network priorities)
                • SRM - Solaris Resource Manager
                • Zones + SRM = Containers
                • FSS - Fair Share Scheduler
                • Resource Controls
                > CPU shares
                > Max threads, CPU time, file descriptors, ...
                
                slide 8:
                  Compiler Technology
                  • Sun Studio compiler optimises for SPARC, x86
                  • Both gcc and cc can be used (try both and see)
                  • Java VM - hotspot compiler
                  
                  slide 9:
                    Observability
                    • DTrace
                    • Microstate Accounting - prstat -mL
                    • kstat - vmstat, mpstat, ...
                    • procfs - ps, prstat, truss, ...
                    • PICs - cpustat/cputrack, busstat
                    
                    slide 10:
                      Solaris Performance Feature List
                      • Scalability
                      • Reliability
                      • Fully preemptive kernel
                      • Real-Time scheduling class
                      • Cyclic page cache
                      • Inode cache
                      • UFS buffer cache
                      • DNLC
                      • 64-bit support
                      • direct I/O
                      • cpustat/cputrack
                      • truss/apptrace
                      • libumem
                      • lgroups
                      • TCP MDT
                      • cyclics
                      • processor sets
                      • kstat
                      • procfs
                      • SNMP
                      • DISM
                      • NCA
                      • MPSS
                      • MPO
                      • rcapd
                      • SRM
                      
                      slide 11:
                        Solaris 10 Performance Feature List
                        • DTrace
                        • ZFS
                        • Zones
                        • FireEngine - faster TCP/IP
                        • SMF - faster boot
                        • CMT, Niagara
                        • Numerous performance improvements
                        (many found using DTrace)
                        
                        slide 12:
                          Status
                          • Just Covered,
                          > Solaris Performance Features
                          – Top features
                          – Solaris
                          – Solaris 10
                          • Next up,
                          > Solaris Performance Observability
                          – By-Layer Strategy
                          – 3-Metric Strategy
                          – System Components
                          
                          slide 13:
                            Solaris Performance Observability
                            • Solaris provides numerous performance tools;
                            the trick is knowing what questions to ask, which
                            calls for a performance analysis strategy
                            
                            slide 14:
                              By-Layer Strategy
                              • All software stack layers are observable
                              > locate latency regardless of location
                              [Diagram: The Software Stack. Labels shown: Dynamic Languages, User Executable, Libraries, Syscall Interface, Memory allocation, Kernel, File Systems, Device Drivers, Scheduler, Hardware]
                              
                              slide 15:
                                By-Layer Strategy
                                • For an application transaction, is the latency,
                                > In the application code?
                                – e.g., an architecture that scales poorly
                                > In library code?
                                – e.g., synchronisation locks
                                > In syscalls?
                                – e.g., disk or network I/O
                                > In devices?
                                – e.g., memory bus latency
                                • Solaris observability tools can provide the answers
                                > especially DTrace
                                
                                slide 16:
                                  3-Metric Strategy
                                  • For every system component, look for,
                                  1. Utilisation
                                  2. Saturation
                                  3. Errors
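The three metrics can be applied mechanically by asking one question per metric for every component. The sketch below is illustrative only: the `CHECKLIST` table and the `questions` helper are hypothetical names, and the tool pairings are the ones given on the later slides of this deck, not an exhaustive list.

```python
# Hypothetical checklist: component -> metric -> tools named in these slides.
CHECKLIST = {
    "CPUs": {
        "utilisation": "vmstat (cpu:us + cpu:sy), mpstat",
        "saturation": "vmstat (kthr:r), load average",
        "errors": "fmadm faulty, fmstat -m cpumem-retire",
    },
    "Memory": {
        "utilisation": "vmstat (swap, free)",
        "saturation": "vmstat (page:sr)",
        "errors": "fmadm faulty, fmstat -m cpumem-retire",
    },
    "Disks": {
        "utilisation": "iostat (%b)",
        "saturation": "iostat (wait)",
        "errors": "iostat -E",
    },
    "Network": {
        "utilisation": "kstat, netstat, nicstat",
        "saturation": "kstat",
        "errors": "kstat",
    },
    "Busses": {
        "utilisation": "cpustat, cputrack, busstat",
        "saturation": "cpustat, cputrack, busstat",
        "errors": "cpustat, cputrack, busstat",
    },
}

METRICS = ("utilisation", "saturation", "errors")


def questions(checklist):
    """Yield one 'component: metric -> tools' line per checklist entry."""
    for component, tools in checklist.items():
        for metric in METRICS:
            yield f"{component}: {metric} -> {tools[metric]}"


for line in questions(CHECKLIST):
    print(line)
```

Walking the table row by row makes it harder to skip a component, or to check utilisation and forget errors.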
                                  
                                  slide 17:
                                    System Components
                                    How do you measure utilisation, saturation and
                                    errors for these?
                                    [Diagram: CPUs, Memory Busses, Memory, System Busses, Disks, Net. * Your architecture will vary]
                                    Simple diagram, simple question, this should be
                                    easy to answer.
                                    
                                    slide 18:
                                      System CPU
                                      • Load average = overall utilisation + saturation
                                      $ uptime
                                        2:30pm  up 39 day(s), 12:40,  5 users,  load average: 0.07, 0.07, 0.11
                                      > printed by uptime, prstat
                                      > 1, 5 and 15 minute averages.
                                      > Divide load average by CPU count,
                                      – value > 1.0 suggests saturation
                                      > Useful for an initial impression, then move on to other
                                      tools like vmstat and mpstat
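The divide-by-CPU-count rule can be written down directly. This is a minimal sketch in Python; the function names are made up for illustration, and `current_load_per_cpu` assumes a Unix-like system where `os.getloadavg()` is available.

```python
import os


def load_per_cpu(load_avg, cpu_count):
    """Return the load average divided by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_avg / cpu_count


def suggests_saturation(load_avg, cpu_count):
    """Apply the rule of thumb: a per-CPU load above 1.0 suggests saturation."""
    return load_per_cpu(load_avg, cpu_count) > 1.0


def current_load_per_cpu():
    """Per-CPU values of the 1, 5 and 15 minute load averages on this host."""
    cpus = os.cpu_count() or 1
    return tuple(load_per_cpu(avg, cpus) for avg in os.getloadavg())
```

With the uptime output above, `suggests_saturation(0.07, cpu_count)` is false for any CPU count, which matches an idle system. The result is only a first impression, since load average blends utilisation and saturation into one number.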
                                      
                                      slide 19:
                                        System CPU
                                        • vmstat - utilisation and saturation as metrics
                                        $ vmstat 1
                                         kthr     memory             page             disk          faults        cpu
                                         r b w    swap   free re mf pi po fr de sr cd s0 -- --   in    sy    cs us sy id
                                         0 0 0 4592308 120572  0  3  0  0  0  0  5 30 -1  0  0  967  5343   861  2  1 97
                                         2 0 0 4349740  48280 10 28  0  0  0  0  0  0  0  0  0  602  1253   791 55  0 45
                                         0 0 0 4349756  48320  0  0  0  0  0  0  0  0  0  0  0  608  1059   723 50  1 49
                                        [...]
                                        > first line is summary since boot
                                        > kthr:r = saturation, total threads on the run queues (but
                                        sampled at a low rate)
                                        > cpu:us + cpu:sy = utilisation, CPU user and system time
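As a rough illustration of reading these two metrics from one line of output, the Python sketch below splits a vmstat data line into named columns. It assumes the 22-column layout with four disk columns; the column names `disk0` to `disk3`, `faults_*` and `cpu_*` are invented here to keep keys unique, and the sample line is assembled from the second data row on this slide.

```python
# Column names for one data line of "vmstat 1" (assumed 22-column layout).
# The four disk columns get generic names; faults:sy and cpu:sy are
# disambiguated with prefixes.
VMSTAT_COLUMNS = [
    "r", "b", "w", "swap", "free",
    "re", "mf", "pi", "po", "fr", "de", "sr",
    "disk0", "disk1", "disk2", "disk3",
    "faults_in", "faults_sy", "faults_cs",
    "cpu_us", "cpu_sy", "cpu_id",
]


def parse_vmstat_line(line):
    """Split one vmstat data line into a dict of integer columns."""
    fields = line.split()
    if len(fields) != len(VMSTAT_COLUMNS):
        raise ValueError(
            f"expected {len(VMSTAT_COLUMNS)} fields, got {len(fields)}"
        )
    return dict(zip(VMSTAT_COLUMNS, (int(f) for f in fields)))


def cpu_summary(row):
    """Utilisation is user plus system time; kthr:r is the saturation hint."""
    return {
        "utilisation_pct": row["cpu_us"] + row["cpu_sy"],
        "run_queue": row["r"],
    }


SAMPLE = "2 0 0 4349740 48280 10 28 0 0 0 0 0 0 0 0 0 602 1253 791 55 0 45"
print(cpu_summary(parse_vmstat_line(SAMPLE)))
# prints {'utilisation_pct': 55, 'run_queue': 2}
```

Remember that the first line vmstat prints is the summary since boot, so a script like this should usually skip it.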
                                        
                                        slide 20:
                                          System CPU
                                          • mpstat - utilisation by-CPU
                                          $ mpstat 1
                                          CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
                                          [...]
                                          [Most column values were lost in extraction. Three samples of two CPUs partly survive, with idl values of 96 and 97, then 14 and 86, then 16 and 82.]
                                          • Classic performance problem - under-utilised CPUs
                                          due to poor threading architecture
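The pattern to look for in mpstat is one CPU busy while others sit idle. A small Python sketch of that check follows; the helper names and the 50-point spread threshold are arbitrary choices for illustration, and the assignment of the idl values 14 and 86 to CPU 0 and CPU 1 is an assumption.

```python
def busy_percent(idle_by_cpu):
    """Convert per-CPU idle percentages (mpstat idl) to busy percentages."""
    return {cpu: 100 - idl for cpu, idl in idle_by_cpu.items()}


def is_imbalanced(idle_by_cpu, spread_threshold=50):
    """True when the busiest and least busy CPUs differ by the threshold or more.

    The threshold is a hypothetical default, not a Solaris recommendation.
    """
    if not idle_by_cpu:
        return False
    busy = busy_percent(idle_by_cpu).values()
    return max(busy) - min(busy) >= spread_threshold


# idl values from one sample on the slide; the CPU numbering is assumed.
sample = {0: 14, 1: 86}
print(busy_percent(sample), is_imbalanced(sample))
```

A single-threaded application on a multi-CPU system produces exactly this shape: one CPU near 100% busy and the rest close to idle, while the system-wide average looks moderate.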
                                          
                                          slide 21:
                                            System CPU
                                            • Solaris 10 FMA detects and can automatically
                                            respond to CPU errors
                                            • fmadm faulty - what faults currently exist
                                            • fmstat -m cpumem-retire - raw statistics
                                            $ fmstat -m cpumem-retire
                                            NAME         VALUE  DESCRIPTION
                                            auto_flts        0  auto-close faults received
                                            bad_flts         0  invalid fault events received
                                            cpu_blfails      0  failed cpu blacklists
                                            cpu_blsupp       0  cpu blacklists suppressed
                                            cpu_fails        0  cpu faults unresolveable
                                            cpu_flts         0  cpu faults resolved
                                            cpu_supp         0  cpu offlines suppressed
                                            nop_flts         0  inapplicable fault events received
                                            [...]
                                            
                                            slide 22:
                                              System Memory
                                              • vmstat - swap and physical memory utilisation
                                              and saturation
                                              $ vmstat 1
                                               kthr     memory             page             disk          faults        cpu
                                               r b w    swap   free re mf pi po fr de sr cd s0 -- --   in    sy    cs us sy id
                                               0 0 0 4592236 120548  0  3  0  0  0  0  5 30 -1  0  0  967  5342   861  2  1 97
                                               0 0 0 4350572  48096 18 30  0  0  0  0  0  0  0  0  0  687  1114   781  0  1 99
                                               0 0 0 4350572  48124  0  0  0  0  0  0  0  0  0  0  0 6206 37271 11979  3 12 85
                                              [...]
                                              > swap - free virtual memory (RAM + disk based swap)
                                              > free - available physical memory (RAM)
                                              > page:sr - values suggest physical memory saturation
                                              • mdb -k - provides breakdown with ::memstat
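The page:sr rule can be expressed as a check over a series of samples. The Python sketch below is a simple reading of "values suggest saturation": any non-zero scan rate in an interval sample counts. The function names and the sample values are hypothetical.

```python
def scanning_intervals(sr_samples, skip_first=True):
    """Return the indexes of samples whose page scan rate (sr) is non-zero.

    The first vmstat line is the summary since boot, so it is skipped by
    default.
    """
    start = 1 if skip_first else 0
    return [i for i, sr in enumerate(sr_samples) if i >= start and sr > 0]


def suggests_memory_saturation(sr_samples, skip_first=True):
    """True when the page scanner ran during any interval sample."""
    return bool(scanning_intervals(sr_samples, skip_first))


# Hypothetical sr columns from two vmstat runs.
quiet = [5, 0, 0]
under_pressure = [5, 0, 240, 310]
print(suggests_memory_saturation(quiet))           # prints False
print(suggests_memory_saturation(under_pressure))  # prints True
```

A stricter policy, such as requiring several consecutive non-zero samples, would be an equally reasonable choice; the slides do not specify a threshold.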
                                              
                                              slide 23:
                                                System Memory
                                                • Solaris 10 FMA detects and can automatically
                                                respond to memory errors
                                                • For example, blacklisting a page of RAM that has
                                                had too many (correctable) ECC errors
                                                • fmadm faulty - what is currently faulted
                                                • fmstat -m cpumem-retire - raw statistics
                                                
                                                slide 24:
                                                  System Disks
                                                  • iostat - disk utilisation, saturation, errors
                                                  $ iostat -xnmpz 5
                                                  extended device statistics
                                                  r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %b device
                                                  [...] 0 c0t0d0
                                                  [...] 0 c1t0d0s0
                                                  [...] 0 c1t0d0s1
                                                  [...] 1 c1t0d0s3 (/)
                                                  [...] 0 c1t0d0s4
                                                  [Other per-device values were only partly extracted and cannot be assigned to columns.]
                                                  > first output is summary since boot
                                                  > %b - percent busy, a measure of utilisation
                                                  > wait - transactions waiting, a measure of saturation
                                                  • iostat -E - error summaries
                                                  
                                                  slide 25:
                                                    System Network
                                                    • kstat - network utilisation, saturation, errors
                                                    $ kstat -n nge0 10
                                                    module: nge                             instance: 0
                                                    name:   nge0                            class:    net
                                                            brdcstrcv
                                                            brdcstxmt
                                                            collisions
                                                            crtime
                                                            ierrors
                                                            ifspeed
                                                            ipackets
                                                    [...]
                                                    > output includes byte counts, various errors
                                                    • netstat and nicstat (opensource) provide
                                                    useful summaries of network stats
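kstat prints raw counters, so utilisation has to be derived from two readings. A minimal Python sketch of that arithmetic follows. It assumes a byte counter sampled twice (the name of that counter, for example `rbytes64`, is an assumption and is not shown on the slide), and that `ifspeed` is in bits per second. The numbers are made up.

```python
def link_utilisation(prev_bytes, curr_bytes, interval_s, ifspeed_bps):
    """Percent utilisation of one direction of a link between two readings.

    prev_bytes and curr_bytes are successive values of a byte counter,
    interval_s is the time between them in seconds, and ifspeed_bps is the
    interface speed in bits per second. Counter wrap is not handled.
    """
    if interval_s <= 0 or ifspeed_bps <= 0:
        raise ValueError("interval_s and ifspeed_bps must be positive")
    bits = (curr_bytes - prev_bytes) * 8
    return 100.0 * bits / (interval_s * ifspeed_bps)


# Hypothetical readings 10 seconds apart on a 1 Gbit/s interface.
print(link_utilisation(0, 125_000_000, 10, 1_000_000_000))  # prints 10.0
```

On a full-duplex link, receive and transmit should be calculated separately, since each direction has the full interface speed available.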
                                                    
                                                    slide 26:
                                                      System Busses
                                                      • Measuring utilisation, saturation and errors is
                                                      hard, but usually still possible with some effort
                                                      > cpustat - measure CPU Performance Instrumentation
                                                      Counters (PICs)
                                                      – PICs for cache activity, memory bus activity, instruction events
                                                      > cputrack - CPU PICs for a process
                                                      > busstat - On some SPARC systems, provides
                                                      hardware bus PICs
                                                      
                                                      slide 27:
                                                        Processes
                                                        • Apart from performance observability by-system,
                                                        also examine performance observability by-process.
                                                        • prstat -mL - useful microstates by thread
                                                        $ prstat -mL
                                                           PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
                                                           557 brendan  7.9 0.3 0.0 0.0 0.0 0.0  91 0.5 579 141  2K  96 Xorg/1
                                                           828 brendan  0.6 0.4 0.0 0.0 0.0 0.0  95 4.1 434 299  2K   0 ssh/1
                                                           830 brendan  0.2 0.0 0.0 0.0 0.0 0.0  99 0.3  36  11 160   0 gnome-termin/1
                                                           788 brendan  0.1 0.1 0.0 0.0 0.0 0.0 100 0.0  58   0 910   0 dtwm/1
                                                          1437 brendan  0.0 0.1 0.0 0.0 0.0 0.0 100 0.0  44   2 297   0 prstat/1
                                                           791 brendan  0.0 0.0 0.0 0.0 0.0 0.0 100 0.3   7  11 129   0 dtterm/1
                                                        [...]
                                                        • DTrace - measure custom microstates
                                                        > in terms of application activity, across all software layers
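The eight microstate columns (USR through LAT) are percentages of thread time, so the largest one shows where a thread spends most of its time. The Python sketch below parses one `prstat -mL` row and reports that state. The helper names are hypothetical, and the parser assumes the 15-field layout shown on this slide.

```python
MICROSTATES = ["USR", "SYS", "TRP", "TFL", "DFL", "LCK", "SLP", "LAT"]


def parse_prstat_row(line):
    """Parse one 'prstat -mL' data row into pid, user, microstates, thread."""
    fields = line.split()
    if len(fields) != 15:
        raise ValueError(f"expected 15 fields, got {len(fields)}")
    states = dict(zip(MICROSTATES, (float(v) for v in fields[2:10])))
    return {
        "pid": int(fields[0]),
        "user": fields[1],
        "states": states,
        "thread": fields[14],  # PROCESS/LWPID
    }


def dominant_state(row):
    """Name of the microstate with the largest share of thread time."""
    return max(row["states"], key=row["states"].get)


SAMPLE = "557 brendan 7.9 0.3 0.0 0.0 0.0 0.0 91 0.5 579 141 2K 96 Xorg/1"
row = parse_prstat_row(SAMPLE)
print(row["thread"], dominant_state(row), row["states"]["LAT"])
# prints Xorg/1 SLP 0.5
```

LAT is the column to watch for CPU saturation at the thread level, since it is time spent waiting for a CPU. The count columns (VCX, ICX, SCL, SIG) are left unparsed here because prstat abbreviates them, as in `2K`.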
                                                        
                                                        slide 28:
                                                          Further Observability
                                                          • Much more can be observed and analysed on
                                                          Solaris
                                                          > DTrace is its own field of study
                                                          • “You don't miss what you never had”
                                                          > Once you start exploring Solaris observability, other
                                                          OSes won't feel the same again
                                                          
                                                          slide 29:
                                                            References
                                                            • http://www.solarisinternals.com
                                                            > Latest Solaris Performance Slides
                                                            > Performance wiki
                                                            • The “Solaris Performance and Tools” book,
                                                            http://www.sun.com/books/catalog/solaris_perf_tools.xml
                                                            • Performance Community,
                                                            http://www.opensolaris.org/os/community/performance
                                                            
                                                            slide 30:
                                                              Ctrl-D
                                                              Brendan Gregg
                                                              [email protected]