Performance Tweaks and Tools For Linux

Download as pdf or txt
Download as pdf or txt
You are on page 1of 113

performance tweaks and

tools for linux


joe damato
twitter: @joedamato
blog: timetobleed.com
~12 hour flight
+11 hour time change
and
side effects are..
nasty bugs
ejpphoto (flickr)
fatboyke (flickr)
code
memory bloat
37prime (flickr)
?
TOOLS
LSOF
list open files

lsof -nPp <pid>


lsof -nPp <pid>
-n
Inhibits the conversion of network numbers to host names.

-P
Inhibits the conversion of port numbers to names for network files

FD TYPE NAME json


cwd DIR /var/www/myapp memcached
txt REG /usr/bin/ruby mysql
mem REG /json-1.1.9/ext/json/ext/generator.so http
mem REG /json-1.1.9/ext/json/ext/parser.so
mem REG /memcached-0.17.4/lib/rlibmemcached.so
mem REG /mysql-2.8.1/lib/mysql_api.so
0u CHR /dev/null
1w REG /usr/local/nginx/logs/error.log
2w REG /usr/local/nginx/logs/error.log
3u IPv4 10.8.85.66:33326->10.8.85.68:3306 (ESTABLISHED)
10u IPv4 10.8.85.66:33327->10.8.85.68:3306 (ESTABLISHED)
11u IPv4 127.0.0.1:58273->127.0.0.1:11211 (ESTABLISHED)
12u REG /tmp/RackMultipart.28957.0
33u IPv4 174.36.83.42:37466->69.63.180.21:80 (ESTABLISHED)
STRACE
trace system calls and signals

strace -cp <pid>


strace -ttTp <pid> -o <file>
strace -cp <pid>
-c
Count time, calls, and errors for each system call and report a
summary on program exit.

-p pid
Attach to the process with the process ID pid and begin tracing.

% time seconds usecs/call calls errors syscall


------ ----------- ----------- --------- --------- ----------------
50.39 0.000064 0 1197 592 read
34.65 0.000044 0 609 writev
14.96 0.000019 0 1226 epoll_ctl
0.00 0.000000 0 4 close
0.00 0.000000 0 1 select
0.00 0.000000 0 4 socket
0.00 0.000000 0 4 4 connect
0.00 0.000000 0 1057 epoll_wait
------ ----------- ----------- --------- --------- ----------------
100.00 0.000127 4134 596 total
strace -ttTp <pid> -o <file>
-tt
If given twice, the time printed will include the microseconds.

-T
Show the time spent in system calls.

-o filename
Write the trace output to the file filename rather than to stderr.

epoll_wait(9, {{EPOLLIN, {u32=68841296, u64=68841296}}}, 4096, 50) = 1 <0.033109>


accept(10, {sin_port=38313, sin_addr="127.0.0.1"}, [1226]) = 22 <0.000014>
fcntl(22, F_GETFL) = 0x2 (flags O_RDWR) <0.000007>
fcntl(22, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000008>
setsockopt(22, SOL_TCP, TCP_NODELAY, [1], 4) = 0 <0.000008>
accept(10, 0x7fff5d9c07d0, [1226]) = -1 EAGAIN <0.000014>
epoll_ctl(9, EPOLL_CTL_ADD, 22, {EPOLLIN, {u32=108750368, u64=108750368}}) = 0 <0.000009>
epoll_wait(9, {{EPOLLIN, {u32=108750368, u64=108750368}}}, 4096, 50) = 1 <0.000007>
read(22, "GET / HTTP/1.1\r"..., 16384) = 772 <0.000012>
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000007>
poll([{fd=5, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) <0.000008>
write(5, "1\0\0\0\0\0\0-\0\0\0\3SELECT * FROM `table`"..., 56) = 56 <0.000023>
read(5, "\25\1\0\1,\2\0x\234m"..., 16384) = 284 <1.300897>
http client connection
read 772 bytes

read(22, "GET / HTTP/1.1\r"..., 16384) = 772 <0.0012>

incoming http request


took 0.0012s
mysql connection
write sql query to db

write(5, "SELECT * FROM `table`"..., 56) = 56 <0.0023>


read(5, "\25\1\0\1,\2\0x\234m"..., 16384) = 284 <1.30>

read query
response
slow query
strace ruby to see some interesting things...
stracing ruby: SIGVTALRM
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 2207807 <0.000009>
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 0 <0.000009>
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 140734552062624 <0.000009>
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 140734552066688 <0.000009>
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 11333952 <0.000008>
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 0 <0.000009>
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 1 <0.000010>
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---

• ruby 1.8 uses signals to schedule its green threads


• process receives a SIGVTALRM signal every 10ms
stracing ruby: sigprocmask
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00 0.326334 0 3568567 rt_sigprocmask
0.00 0.000000 0 9 read
0.00 0.000000 0 10 open
0.00 0.000000 0 10 close
0.00 0.000000 0 9 fstat
0.00 0.000000 0 25 mmap
------ ----------- ----------- --------- --------- ----------------
100.00 0.326334 3568685 0 total

• debian/redhat compile ruby with --enable-pthread


• uses a native thread timer for SIGVTALRM
• causes excessive calls to sigprocmask: 30%
slowdown!
TCPDUMP
dump traffic on a network

tcpdump -i eth0 -s 0 -nqA


tcp dst port 3306
tcpdump -i <eth> -s <len> -nqA <expr>
tcpdump -i <eth> -w <file> <expr>
-i <eth>
Network interface.

-s <len>
Snarf len bytes of data from each packet.

-n
Don't convert addresses (host addresses, port numbers) to names.

-q
Quiet output. Print less protocol information.

-A
Print each packet (minus its link level header) in ASCII.

-w <file>
Write the raw packets to file rather than printing them out.

<expr>
libpcap expression, for example:
tcp src port 80
tcp dst port 3306
tcp dst port 80
19:52:20.216294 IP 24.203.197.27.40105 >
174.37.48.236.80: tcp 438
E...*[email protected].%&.....%0....POx..%s.oP.......
GET /poll_images/cld99erh0/logo.png HTTP/1.1
Accept: */*
Referer: http://apps.facebook.com/realpolls/?
_fb_q=1
tcp dst port 3306
19:51:06.501632 IP 10.8.85.66.50443 >
10.8.85.68.3306: tcp 98
E..."K@[email protected]
.UB
.UD.....z....L............
GZ.y3b..[......W....
SELECT * FROM `votes` WHERE (`poll_id` =
72621) LIMIT 1
tcpdump -w <file>
PERFTOOLS
google's performance tools

CPUPROFILE=/tmp/myprof ./myapp
pprof ./myapp /tmp/myprof
wget http://google-perftools.googlecode.com/files/google-
perftools-1.6.tar.gz download
tar zxvf google-perftools-1.6.tar.gz
cd google-perftools-1.6

./configure --prefix=/opt
make compile
sudo make install

# for linux
export LD_PRELOAD=/opt/lib/libprofiler.so setup

# for osx
export DYLD_INSERT_LIBRARIES=/opt/lib/libprofiler.dylib

CPUPROFILE=/tmp/ruby.prof ruby -e' profile


5_000_000.times{ "hello world" }
'

pprof `which ruby` --text /tmp/ruby.prof report


pprof ruby pprof ruby
ruby.prof --text ruby.prof --gif
Total: 103 samples
95 92.2% rb_yield_0
103 100.0% rb_eval
12 11.7% gc_sweep
52 50.5% rb_str_new3
3 2.9% obj_free
103 100.0% int_dotimes
12 11.7% gc_mark
Profiling MRI
• 10% of production
VM time spent in
rb_str_sub_bang
• String#sub!
• called from
Time.parse

return unless str.sub!(/\A(\d{1,2})/, '')


return unless str.sub!(/\A( \d|\d{1,2})/, '')
return unless str.sub!(/\A( \d|\d{1,2})/, '')
return unless str.sub!(/\A(\d{1,3})/, '')
return unless str.sub!(/\A(\d{1,2})/, '')
return unless str.sub!(/\A(\d{1,2})/, '')
Profiling EM + threads
Total: 3763 samples
2764 73.5% catch_timer
989 26.3% memcpy
3 0.1% st_lookup
2 0.1% rb_thread_schedule
1 0.0% rb_eval
1 0.0% rb_newobj
1 0.0% rb_gc_force_recycle

• known issue: EM+threads =


slow
• memcpy??
• thread context switches copy
the stack w/ memcpy
• EM allocates huge buffer on
the stack
• solution: move buffer to the
heap
PERFTOOLS.RB
perftools for ruby code

pprof.rb /tmp/myrbprof

github.com/tmm1/perftools.rb
gem install perftools.rb

RUBYOPT="-r`gem which perftools | tail -1`"


CPUPROFILE=/tmp/myrbprof
ruby myapp.rb

pprof.rb /tmp/myrbprof --text


pprof.rb /tmp/myrbprof --gif > /tmp/myrbprof.gif
require 'sinatra'
$ ab -c 1 -n 50 http://127.0.0.1:4567/compute
$ ab -c 1 -n 50 http://127.0.0.1:4567/sleep
get '/sleep' do
sleep 0.25
'done' • Sampling profiler:
end
• 232 samples total
get '/compute' do • 83 samples were in /compute
proc{ |n|
a,b=0,1 • 118 samples had /compute on
the stack but were in
n.times{ a,b = b,a+b } another function
b
}.call(10_000) • /compute accounts for 50%
'done' of process, but only 35% of
time was in /compute itself
end

== Sinatra has ended his set (crowd applauds)


PROFILE: interrupts/evictions/bytes = 232/0/2152

Total: 232 samples


83 35.8% 35.8% 118 50.9% Sinatra::Application#GET /compute
56 24.1% 59.9% 56 24.1% garbage_collector
35 15.1% 75.0% 113 48.7% Integer#times
CPUPROFILE_REALTIME=1
CPUPROFILE=app.prof
CPUPROFILE=app-rt.prof
redis-rb bottleneck
why is rubygems slow?
faster
bundle
install
• 23% spent in
Gem::Version#<=>
• simple patch to rubygems
improved overall install
performance by 15%
• http://gist.github.com/
458185
CPUPROFILE_OBJECTS=1
CPUPROFILE=app-objs.prof
• object allocation profiler
mode built-in
• 1 sample = 1 object
created
• Time parsing is both
CPU and object
allocation intensive
• using mysql2 moves
this to C
LTRACE
trace library calls

ltrace -cp <pid>


ltrace -ttTp <pid> -o <file>
ltrace -c ruby threaded_em.rb
% time seconds usecs/call calls function
------ ----------- ----------- --------- --------------------
48.65 11.741295 617 19009 memcpy
30.16 7.279634 831 8751 longjmp
9.78 2.359889 135 17357 _setjmp
8.91 2.150565 285 7540 malloc
1.10 0.265946 20 13021 memset
0.81 0.195272 19 10105 __ctype_b_loc
0.35 0.084575 19 4361 strcmp
0.19 0.046163 19 2377 strlen
0.03 0.006272 23 265 realloc
------ ----------- ----------- --------- --------------------
100.00 24.134999 82999 total

ltrace -ttT -e memcpy ruby threaded_em.rb


01:24:48.769408 --- SIGVTALRM (Virtual timer expired) ---
01:24:48.769616 memcpy(0x1216000, "", 1086328) = 0x1216000 <0.000578>
01:24:48.770555 memcpy(0x6e32670, "\240&\343v", 1086328) = 0x6e32670 <0.000418>

01:24:49.899414 --- SIGVTALRM (Virtual timer expired) ---


01:24:49.899490 memcpy(0x1320000, "", 1082584) = 0x1320000 <0.000628>
01:24:49.900474 memcpy(0x6e32670, "", 1086328) = 0x6e32670 <0.000479>
LTRACE/LIBDL
trace dlopen’d library calls

ltrace -F <conf> -bg -x


<symbol> -p <pid>

github.com/ice799/ltrace/tree/libdl
ltrace -F <conf> -b -g -x <sym>
-b
Ignore signals.

-g
Ignore libraries linked at compile time.

-F <conf>
Read prototypes from config file.

-x <sym>
Trace calls to the function sym.

-F ltrace.conf
int mysql_real_query(addr,string,ulong);
void garbage_collect(void);
int memcached_set(addr,string,ulong,string,ulong);
ltrace -x garbage_collect
19:08:06.436926 garbage_collect() = <void> <0.221679>
19:08:15.329311 garbage_collect() = <void> <0.187546>
19:08:17.662149 garbage_collect() = <void> <0.199200>
19:08:20.486655 garbage_collect() = <void> <0.205864>
19:08:25.102302 garbage_collect() = <void> <0.214295>
19:08:35.552337 garbage_collect() = <void> <0.189172>
ltrace -x mysql_real_query
mysql_real_query(0x1c9e0500, "SET NAMES 'UTF8'", 16) = 0 <0.000324>
mysql_real_query(0x1c9e0500, "SET SQL_AUTO_IS_NULL=0", 22) = 0 <0.000322>
mysql_real_query(0x19c7a500, "SELECT * FROM `users`", 21) = 0 <1.206506>
mysql_real_query(0x1c9e0500, "COMMIT", 6) = 0 <0.000181>
ltrace -x memcached_set
memcached_set(0x15d46b80, "Status:33", 21, "\004\b", 366) = 0 <0.01116>
memcached_set(0x15d46b80, "Status:96", 21, "\004\b", 333) = 0 <0.00224>
memcached_set(0x15d46b80, "Status:57", 21, "\004\b", 298) = 0 <0.01850>
memcached_set(0x15d46b80, "Status:10", 21, "\004\b", 302) = 0 <0.00530>
memcached_set(0x15d46b80, "Status:67", 21, "\004\b", 318) = 0 <0.00291>
memcached_set(0x15d46b80, "Status:02", 21, "\004\b", 299) = 0 <0.00658>
memcached_set(0x15d46b80, "Status:34", 21, "\004\b", 264) = 0 <0.00243>
GDB
the GNU debugger

gdb <executable>
gdb attach <pid>
Debugging Ruby Segfaults
test_segv.rb:4: [BUG] Segmentation fault
ruby 1.8.7 (2008-08-11 patchlevel 72) [i686-darwin9.7.0]

def test
#include "ruby.h" require 'segv'
4.times do
VALUE Dir.chdir '/tmp' do
segv() Hash.new{ segv }[0]
{ end
VALUE array[1]; end
array[1000000] = NULL; end
return Qnil;
} sleep 10
test()
void
Init_segv()
{
rb_define_method(rb_cObject, "segv", segv, 0);
}
1. Attach to running process
$ ps aux | grep ruby
joe 23611 0.0 0.1 25424 7540 S Dec01 0:00 ruby test_segv.rb

$ sudo gdb ruby 23611


Attaching to program: ruby, process 23611
0x00007fa5113c0c93 in nanosleep () from /lib/libc.so.6
(gdb) c
Continuing.

Program received signal SIGBUS, Bus error.


segv () at segv.c:7
7 array[1000000] = NULL;

2. Use a coredump
Process.setrlimit Process::RLIMIT_CORE, 300*1024*1024
$ sudo mkdir /cores
$ sudo chmod 777 /cores
$ sudo sysctl kernel.core_pattern=/cores/%e.core.%s.%p.%t

$ sudo gdb ruby /cores/ruby.core.6.23611.1259781224


def test
require 'segv'
4.times do
Dir.chdir '/tmp' do
Hash.new{ segv }[0]
end
end (gdb) where
end #0 segv () at segv.c:7
#1 0x000000000041f2be in call_cfunc () at eval.c:5727
test() ...
#13 0x000000000043ba8c in rb_hash_default () at hash.c:521
...
#19 0x000000000043b92a in rb_hash_aref () at hash.c:429
...
#26 0x00000000004bb7bc in chdir_yield () at dir.c:728
#27 0x000000000041d8d7 in rb_ensure () at eval.c:5528
#28 0x00000000004bb93a in dir_s_chdir () at dir.c:816
...
#35 0x000000000041c444 in rb_yield () at eval.c:5142
#36 0x0000000000450690 in int_dotimes () at numeric.c:2834
...
#48 0x0000000000412a90 in ruby_run () at eval.c:1678
#49 0x000000000041014e in main () at main.c:48
GDB.RB
gdb with MRI hooks

gem install gdb.rb


gdb.rb <pid>
github.com/tmm1/gdb.rb
(gdb) ruby eval 1+2
3

(gdb) ruby eval Thread.current


#<Thread:0x1d630 run>

(gdb) ruby eval Thread.list.size


8
(gdb) ruby threads list
0x15890 main thread THREAD_STOPPED WAIT_JOIN(0x19ef4)
0x19ef4 thread THREAD_STOPPED WAIT_TIME(57.10s)
0x19e34 thread THREAD_STOPPED WAIT_FD(5)
0x19dc4 thread THREAD_STOPPED WAIT_NONE
0x19dc8 thread THREAD_STOPPED WAIT_NONE
0x19dcc thread THREAD_STOPPED WAIT_NONE
0x22668 thread THREAD_STOPPED WAIT_NONE
0x1d630 curr thread THREAD_RUNNABLE WAIT_NONE
(gdb) ruby objects
HEAPS 8
SLOTS 1686252
LIVE 893327 (52.98%)
FREE 792925 (47.02%)

scope 1641 (0.18%)


regexp 2255 (0.25%)
data 3539 (0.40%)
class 3680 (0.41%)
hash 6196 (0.69%)
object 8785 (0.98%)
array 13850 (1.55%)
string 105350 (11.79%)
node 742346 (83.10%)
(gdb) ruby objects strings
140 u'lib'
158 u'0'
294 u'\n'
619 u''

30503 unique strings


3187435 bytes
def test
require 'segv'
4.times do
Dir.chdir '/tmp' do
Hash.new{ segv }[0]
end
end
end
(gdb) ruby threads
test()
0xa3e000 main curr thread THREAD_RUNNABLE WAIT_NONE
node_vcall segv in test_segv.rb:5
node_call test in test_segv.rb:5
node_call call in test_segv.rb:5
node_call default in test_segv.rb:5
node_call [] in test_segv.rb:5
node_call test in test_segv.rb:4
node_call chdir in test_segv.rb:4
node_call test in test_segv.rb:3
node_call times in test_segv.rb:3
node_vcall test in test_segv.rb:9
rails_warden leak
(gdb) ruby objects classes
1197 MIME::Type
2657 NewRelic::MetricSpec
2719 TZInfo::TimezoneTransitionInfo
4124 Warden::Manager
4124 MethodOverrideForAll
4124 AccountMiddleware
4124 Rack::Cookies
4125 ActiveRecord::ConnectionAdapters::ConnectionManagement
4125 ActionController::Session::CookieStore
4125 ActionController::Failsafe
4125 ActionController::ParamsParser
4125 Rack::Lock
4125 ActionController::Dispatcher
4125 ActiveRecord::QueryCache
4125 ActiveSupport::MessageVerifier
4125 Rack::Head

middleware chain leaking per request


mongrel sleeper thread
0x16814c00 thread THREAD_STOPPED WAIT_TIME(0.47) 1522 bytes
node_fcall sleep in lib/mongrel/configurator.rb:285
node_fcall run in lib/mongrel/configurator.rb:285
node_fcall loop in lib/mongrel/configurator.rb:285
node_call run in lib/mongrel/configurator.rb:285
node_call initialize in lib/mongrel/configurator.rb:285
node_call new in lib/mongrel/configurator.rb:285
node_call run in bin/mongrel_rails:128
node_call run in lib/mongrel/command.rb:212
node_call run in bin/mongrel_rails:281
node_fcall (unknown) in bin/mongrel_rails:19

def run
@listeners.each {|name,s|
s.run
}

$mongrel_sleeper_thread = Thread.new { loop { sleep 1 } }


end
god memory leaks
(gdb) ruby objects arrays 43 God::Process
elements instances 43 God::Watch
94310 3 43 God::Driver
94311 3 43 God::DriverEventQueue
94314 2 43 God::Conditions::MemoryUsage
94316 1 43 God::Conditions::ProcessRunning
43 God::Behaviors::CleanPidFile
5369 arrays 45 Process::Status
2863364 member elements 86 God::Metric
327 God::System::SlashProcPoller
many arrays with 327 God::System::Process
90k+ elements! 406 God::DriverEvent

5 separate god leaks fixed by Eric


Lindvall with the help of gdb.rb!
BLEAK_HOUSE
ruby memory leak detector

ruby-bleak-house myapp.rb
bleak /tmp/bleak.<PID>.*.dump
github.com/fauna/bleak_house
• BleakHouse
• installs a patched version of ruby: ruby-bleak-
house
• unlike gdb.rb, see where objects were created
(file:line)
• create multiple dumps over time with `kill -USR2
<pid>` and compare to find leaks
191691 total objects
Final heap size 191691 filled, 220961 free
Displaying top 20 most common line/class pairs
89513 __null__:__null__:__node__
41438 __null__:__null__:String
2348 ruby/site_ruby/1.8/rubygems/specification.rb:557:Array
1508 ruby/gems/1.8/specifications/gettext-1.9.gemspec:14:String
1021 ruby/gems/1.8/specifications/heel-0.2.0.gemspec:14:String
951 ruby/site_ruby/1.8/rubygems/version.rb:111:String
935 ruby/site_ruby/1.8/rubygems/specification.rb:557:String
834 ruby/site_ruby/1.8/rubygems/version.rb:146:Array
IOPROFILE
summarizes strace and lsof

wget http://aspersa.googlecode.com/svn/trunk/ioprofile
strace -cp <pid>

ioprofile is a script that captures one sample of lsof output then


starts strace for a specified amount of time.

after strace finishes, the results are processed.

below is an example that comes with ioprofile

$ ioprofile t/samples/ioprofile-001.txt
total pread read pwrite write filename
10.094264 10.094264 0.000000 0.000000 0.000000 /data/data/abd_2dia/aia_227_228.ibd
8.356632 8.356632 0.000000 0.000000 0.000000 /data/data/abd_2dia/aia_227_223.ibd
0.048850 0.046989 0.000000 0.001861 0.000000 /data/data/abd/aia_instances.ibd
0.035016 0.031001 0.000000 0.004015 0.000000 /data/data/abd/vo_difuus.ibd
0.013360 0.000000 0.001723 0.000000 0.011637 /var/log/mysql/mysql-relay.002113
0.008676 0.000000 0.000000 0.000000 0.008676 /data/data/master.info
0.002060 0.000000 0.000000 0.002060 0.000000 /data/data/ibdata1
0.001490 0.000000 0.000000 0.001490 0.000000 /data/data/ib_logfile1
0.000555 0.000000 0.000000 0.000000 0.000555 /var/log/mysql/mysql-relay-log.info
0.000141 0.000000 0.000000 0.000141 0.000000 /data/data/ib_logfile0
0.000100 0.000000 0.000000 0.000100 0.000000 /data/data/abd/9fvus.ibd
strace -ttTp <pid> -o <file>
-c CELL
specify what to put in the cells of the output. ‘times’, ‘count’,
or ‘sizes‘.

below is an example of -c sizes:

$ ioprofile -c sizes t/samples/ioprofile-001.txt


total pread read pwrite write filename
90800128 90800128 0 0 0 /data/data/abd_2dia/aia_227_223.ibd
52150272 52150272 0 0 0 /data/data/abd_2dia/aia_227_228.ibd
999424 0 0 999424 0 /data/data/ibdata1
638976 131072 0 507904 0 /data/data/abd/vo_difuus.ibd
327680 114688 0 212992 0 /data/data/abd/aia_instances.ibd
305263 0 149662 0 155601 /var/log/mysql/mysql-relay.002113
217088 0 0 217088 0 /data/data/ib_logfile1
22638 0 0 0 22638 /data/data/master.info
16384 0 0 16384 0 /data/data/abd/9fvus.ibd
1088 0 0 0 1088 /var/log/mysql/mysql-relay-log.info
512 0 0 512 0 /data/data/ib_logfile0
/proc
lots of interesting things hiding in proc
/proc/[pid]/pagemap

• (as of linux 2.6.25)

• /proc/[pid]/pagemap - find out which physical


frame each virtual page is mapped to and swap
flag.

• /proc/kpagecount - stores number of times a


particular physical page is mapped.

• /proc/kpageflags - flags about each page (locked,


slab, dirty, writeback, ...)

• read more: Documentation/vm/pagemap.txt


/proc/[pid]/stack

• get a process’s stack trace


/proc/[pid]/status

• lots of small bits of information


/proc/sys/kernel/core_pattern
• tell system where to output core dumps
• can also launch user program when
core dumps happen.

• echo "|/path/to/core_helper.rb %p %s %u
%g" > /proc/sys/kernel/core_pattern

• /proc/[pid]/ files are kept in place until your


handler exits, so full state of the process at
death may be inspected.

• http://gist.github.com/587443
and lots more
/proc/meminfo
/proc/scsi/*
/proc/net/*
/proc/sys/net/ipv4/*
...
MONITOR RAID STATUS
• how do you know when a hard drive in
your RAID array fails?

• turns out there some command line tools


that most RAID vendors provide.

• adaptec - /usr/StorMan/arcconf getconfig 1


AL

• 3ware - /usr/bin/tw_cli
snooki:/# /usr/StorMan/arcconf getconfig 1 AL
Controllers found: 1
----------------------------------------------------------------------
Controller information
----------------------------------------------------------------------
Controller Status : Optimal
Channel description : SAS/SATA
Controller Model : Adaptec 3405
Controller Serial Number : 7C4911519E3
Physical Slot :2
Temperature : 42 C/ 107 F (Normal)
Installed memory : 128 MB
Copyback : Disabled
Background consistency check : Enabled
Automatic Failover : Enabled
Global task priority : High
Stayawake period : Disabled
Spinup limit internal drives :0
Spinup limit external drives :0
Defunct disk drive count :0
Logical devices/Failed/Degraded : 1/0/0
--------------------------------------------------------
Controller Version Information
--------------------------------------------------------
BIOS : 5.2-0 (17304)
Firmware : 5.2-0 (17304)
Driver : 1.1-5 (2461)
• write a script to parse that
• run it with cron
• the ugly ruby script i use for adaptec:
http://gist.github.com/643666
more well known tools:
iostat, vmstat, top, and free
LINUX TWEAKS
ADJUST TIMER FREQUENCY
CONFIG_HZ_100=y
CONFIG_HZ=100
• Set the timer interrupt frequency.
• Fewer timer interrupts means processes
run with fewer interruptions.
• Servers (without interactive software)
should have lower timer frequency.
CONNECTOR
CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y
• connector kernel module is useful for
process monitoring.
• build or use a system like god to watch
processes.
• when processes die the kernel notifies you
• you can restart/recover/etc.
TCP SEGMENTATION OFFLOADING
sudo ethtool -K eth1 tso on
• Allows kernel to offload large packet
segmentation to the network adapter.
• Frees the CPU to do more useful work.
• After running the command above, verify
with:
[joe@timetobleed]% dmesg | tail -1

[892528.450378] 0000:04:00.1: eth1: TSO is Enabled


http://kerneltrap.org/node/397
Tx/Rx TCP file send long (bi-directional Rx/Tx):
w/o TSO: 1500Mbps, 82% CPU
w/ TSO: 1633Mbps, 75% CPU

Tx TCP file send long (Tx only):


w/o TSO: 940Mbps, 40% CPU
w/ TSO: 940Mbps, 19% CPU
INTEL I/OAT DMA ENGINE
CONFIG_DMADEVICES=y
CONFIG_INTEL_IOATDMA=y
CONFIG_DMA_ENGINE=y
CONFIG_NET_DMA=y

• these options enable Intel I/OAT DMA


engine present in recent Xeon CPUs.
• increases throughput because kernel can
offload network data copying to the DMA
engine.
• CPU can do more useful work.
• statistics about savings can be found in
sysfs: /sys/class/dma/
check if I/OAT is enabled
[joe@timetobleed]% dmesg | grep ioat

ioatdma 0000:00:08.0: setting latency timer to 64

ioatdma 0000:00:08.0: Intel(R) I/OAT DMA Engine


found, 4 channels, device version 0x12, driver
version 3.64

ioatdma 0000:00:08.0: irq 56 for MSI/MSI-X


DIRECT CACHE ACCESS
CONFIG_DCA=y
• I/OAT includes Direct Cache Access (DCA)
• DCA allows a driver to warm CPU cache.
• Requires driver and device support.
• Intel 10GbE driver (ixgbe) supports this
feature.
• Must enable this feature in the BIOS.
• Some vendors hide BIOS option so you will
need a hack to enable DCA.
DCA enable hack
get a hack to enable DCA on some vendor
locked systems:
http://github.com/ice799/dca_force

read about the hack here:


http://timetobleed.com/enabling-bios-
options-on-a-live-server-with-no-rebooting/
THROTTLE NIC INTERRUPTS
insmod e1000e.ko
InterruptThrottleRate=1
• SOME drivers allow you to specify the interrupt throttling
algorithm.
• e1000e is one of these drivers.
• Two dynamic throttling algorithms: dynamic (1) and dynamic
conservative (3).
• The difference is the interrupt rate for “Lowest Latency”
traffic.
• Algorithm 1 is more aggressive for this traffic class.
• Read driver documentation for more information.
• Be careful to avoid receive livelock.
PROCESS AFFINITY
Process Affinity
• Linux allows you to set which CPU(s) a
process may run on.
• For example, set PID 123 to CPUs 4-6:
# taskset -c 4,5,6 123
IRQ AFFINITY
IRQ Affinity
• NIC, disk, etc IRQ handlers can be set to
execute on specific processors.
• The IRQ to CPU map can be found at:
/proc/interrupts
• Individual IRQs may be set in the file:
/proc/irq/[IRQ NUMBER]/smp_affinity
IRQ affinity
• Can pin IRQ handlers for devices to specific
CPUs.
• Can then use taskset to pin important
processes to other CPUs.
• The result is NIC and disk will not interrupt
important processes running elsewhere.
• Can also help preserve CPU caches.
irqbalance
• http://www.irqbalance.org
• “irqbalance is a Linux* daemon that
distributes interrupts over the processors
and cores you have in your computer
system.”
OPROFILE
CONFIG_OPROFILE=y
CONFIG_HAVE_OPROFILE=y
• oprofile is a system wide profiler that can
profile both kernel and application level code.
• oprofile has a kernel driver which collects data
from CPU registers (on x86 these are MSRs).
• oprofile can also annote source code with
performance information.
• this can help you find and fix performance
problems.
LATENCYTOP
CONFIG_LATENCYTOP=y
• LatencyTOP helps you understand what
caused the kernel to block a process.
nasty bugs
ejpphoto (flickr)
fatboyke (flickr)
code
memory bloat
37prime (flickr)
TOOLS
TWEAKS
Thanks.
(spasiba ?)

@joedamato
timetobleed.com

You might also like