USE Method: FreeBSD Performance Checklist

This page contains an example USE Method-based performance checklist for FreeBSD, for identifying common bottlenecks and errors. It is intended to be used early in a performance investigation, before moving on to more time-consuming methodologies. This should be helpful for anyone using FreeBSD, especially system administrators.

This was developed on FreeBSD 10.0 alpha, and focuses on tools shipped by default. With DTrace, I was able to create a few new one-liners to answer some of these metrics. See the notes below the tables.

Physical Resources

| component | type | metric |
|-----------|------|--------|
| CPU | utilization | system-wide: vmstat 1, "us" + "sy"; per-cpu: vmstat -P; per-process: top, "WCPU" for weighted and recent usage; per-kernel-process: top -S, "WCPU" |
| CPU | saturation | system-wide: uptime, "load averages" > CPU count; vmstat 1, "procs:r" > CPU count; per-cpu: DTrace to profile CPU run queue lengths [1]; per-process: DTrace of scheduler events [2] (sketch below the table) |
| CPU | errors | dmesg; /var/log/messages; pmcstat for PMCs and whatever error counters are supported (eg, thermal throttling) |
| Memory capacity | utilization | system-wide: vmstat 1, "fre" is main memory free; top, "Mem:"; per-process: top -o res, "RES" is resident main memory size, "SIZE" is virtual memory size; ps -auxw, "RSS" is resident set size (Kbytes), "VSZ" is virtual memory size (Kbytes) |
| Memory capacity | saturation | system-wide: vmstat 1, "sr" for scan rate, "w" for swapped-out threads (was saturated, may not be now); swapinfo, "Capacity" also for evidence of swapping/paging; per-process: DTrace [3] |
| Memory capacity | errors | physical: dmesg?; /var/log/messages?; virtual: DTrace failed malloc()s |
| Network interfaces | utilization | system-wide: netstat -i 1, assume one very busy interface and use input/output "bytes" / known max (note: includes localhost traffic); per-interface: netstat -I interface 1, input/output "bytes" / known max |
| Network interfaces | saturation | system-wide: netstat -s, for saturation-related metrics, eg, netstat -s \| egrep 'retrans\|drop\|out-of-order\|memory problems\|overflow'; per-interface: DTrace |
| Network interfaces | errors | system-wide: netstat -s \| egrep 'bad\|checksum', for various metrics; per-interface: netstat -i, "Ierrs", "Oerrs" (eg, late collisions), "Colls" [5] |
| Storage device I/O | utilization | system-wide: iostat -xz 1, "%b"; per-process: DTrace io provider, eg, iosnoop or iotop (DTT, needs porting) |
| Storage device I/O | saturation | system-wide: iostat -xz 1, "qlen"; DTrace for queue duration or length [4] |
| Storage device I/O | errors | DTrace io:::done probe when /args[0]->b_error != 0/ (one-liner below the table) |
| Storage capacity | utilization | file systems: df -h, "Capacity"; swap: swapinfo, "Capacity"; pstat -T also shows swap space |
| Storage capacity | saturation | not sure this one makes sense; once it's full, ENOSPC |
| Storage capacity | errors | DTrace; /var/log/messages file system full messages |
| Storage controller | utilization | iostat -xz 1, sum IOPS and throughput metrics for devices on the same controller, and compare to known limits [5] |
| Storage controller | saturation | check utilization, and use DTrace to look for kernel queueing |
| Storage controller | errors | DTrace the driver |
| Network controller | utilization | system-wide: netstat -i 1, assume one busy controller and examine input/output "bytes" / known max (note: includes localhost traffic) |
| Network controller | saturation | see network interface saturation |
| Network controller | errors | see network interface errors |
| CPU interconnect | utilization | pmcstat (PMC) for CPU interconnect ports, throughput / max |
| CPU interconnect | saturation | pmcstat and relevant PMCs for CPU interconnect stall cycles |
| CPU interconnect | errors | pmcstat and relevant PMCs for whatever is available |
| Memory interconnect | utilization | pmcstat and relevant PMCs for memory bus throughput / max, or, measure CPI and treat, say, 5+ as high utilization |
| Memory interconnect | saturation | pmcstat and relevant PMCs for memory stall cycles |
| Memory interconnect | errors | pmcstat and relevant PMCs for whatever is available |
| I/O interconnect | utilization | pmcstat and relevant PMCs for throughput / max if available; inference via known throughput from iostat/netstat/... |
| I/O interconnect | saturation | pmcstat and relevant PMCs for I/O bus stall cycles |
| I/O interconnect | errors | pmcstat and relevant PMCs for whatever is available |
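
As an example of the per-process scheduler tracing from the CPU saturation row, here is a minimal sketch of my own (not one of the numbered notes), assuming the DTrace sched provider is available, as it is on FreeBSD 10. It measures how long threads spend off-CPU; this includes run-queue wait as well as voluntary sleeps, so read it alongside the run queue length checks:

#!/usr/sbin/dtrace -s
/* offcpu.d: sketch of per-process off-CPU time.
 * Off-CPU time includes both run-queue wait and sleeps. */

sched:::off-cpu
{
	/* thread-local timestamp, taken when leaving CPU */
	self->ts = timestamp;
}

sched:::on-cpu
/self->ts/
{
	/* distribution of off-CPU durations, keyed by process name */
	@["off-CPU (ns)", execname] = quantize(timestamp - self->ts);
	self->ts = 0;
}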
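
The storage device I/O error check can be run directly as a one-liner; a sketch, counting completions by error code:

# dtrace -n 'io:::done /args[0]->b_error != 0/ { @["b_error", args[0]->b_error] = count(); }'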

Software Resources

| component | type | metric |
|-----------|------|--------|
| Kernel mutex | utilization | lockstat -H (held time); DTrace lockstat provider |
| Kernel mutex | saturation | lockstat -C (contention); DTrace lockstat provider [6]; spinning shows up with dtrace -n 'profile-997 { @[stack()] = count(); }' |
| Kernel mutex | errors | lockstat -E (errors); DTrace and fbt provider for return probes and error status |
| User mutex | utilization | DTrace pid provider for hold times; eg, pthread_mutex_*lock() return to pthread_mutex_unlock() entry |
| User mutex | saturation | DTrace pid provider for contention; eg, pthread_mutex_*lock() entry to return times (sketch below the table) |
| User mutex | errors | DTrace pid provider for EINVAL, EDEADLK, ...; see pthread_mutex_lock(3) etc. |
| Process capacity | utilization | current/max using: ps ax \| wc -l / sysctl kern.maxproc; top, "Processes:" also shows current |
| Process capacity | saturation | not sure this makes sense |
| Process capacity | errors | "can't fork()" messages |
| File descriptors | utilization | system-wide: pstat -T, "files"; sysctl kern.openfiles / sysctl kern.maxfiles; per-process: can figure out using fstat -p PID and ulimit -n |
| File descriptors | saturation | I don't think this one makes sense; if the kernel can't allocate or expand the descriptor array, it errors; see fdalloc() |
| File descriptors | errors | truss, dtruss, or custom DTrace to look for errno == EMFILE on syscalls returning fds (eg, open(), accept(), ...; sketch below the table) |
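
As a sketch of the user mutex contention timing above: this times pthread_mutex_lock() from entry to return for one process. PID is a placeholder for the target process ID, and the probe assumes pthread_mutex_lock is visible to the pid provider, which depends on how the target links libthr:

# dtrace -n '
    pid$target::pthread_mutex_lock:entry { self->ts = timestamp; }
    pid$target::pthread_mutex_lock:return /self->ts/ {
        /* distribution of lock acquisition times, including any wait */
        @["lock wait (ns)"] = quantize(timestamp - self->ts);
        self->ts = 0;
    }' -p PID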
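
And a sketch of the file descriptor error check, assuming the errno built-in and the EMFILE inline (from /usr/lib/dtrace/errno.d) are available to D on this FreeBSD version:

# dtrace -n 'syscall::open:return,syscall::openat:return,syscall::accept:return
    /errno == EMFILE/ { @[execname, probefunc] = count(); }'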

Other Tools

I didn't include procstat, sockstat, gstat, or others, as here I'm beginning with questions (the methodology) and only including tools that answer them, rather than the other way around: listing all the tools and trying to find a use for each. Those other tools are useful for other methodologies, which can be used after this one.

What's Next

See the USE Method for the follow-up methodologies after identifying a possible bottleneck. If you complete this checklist but still have a performance issue, move on to other methodologies: drill-down analysis and latency analysis.

Acknowledgements

Resources used:

Filling in this checklist required a lot of research, testing, and experimentation. Please reference back to this post if it helps you develop related material.

It's quite possible I've missed something or included the wrong metric somewhere (sorry); I'll update the post to fix these as they are understood, and note the update date at the top.

Also see my USE Method performance checklists for Solaris, SmartOS, Linux, and Mac OS X.


Last updated: 03-Apr-2014