Maximum Performance of MySQL and Q4M
I usually blog my rough ideas on my Japanese blog (id:kazuhooku's memos). When I wrote down my thoughts on how to further optimize Q4M, Nishida-san asked me, "how fast is the raw performance without client overhead?" Although that is a bit difficult to answer directly, it is easy to measure the performance of the MySQL core and the storage engine interface, and by deducting their overhead, the raw performance of the I/O operations in Q4M can be estimated. All the benchmarks below were taken on Linux 2.6.18 running on two Opteron 2218s.
First, I measured the raw performance of the MySQL core on my testbed using mysqlslap, which came to 115k queries per second.
$ perl -e 'print "select 1;\n" for 1..10000' > /tmp/select10k.sql && \
  /usr/local/mysql51/bin/mysqlslap --query=/tmp/select10k.sql \
    --socket=/tmp/mysql51.sock --iterations=1 --concurrency=40
Benchmark
        Average number of seconds to run all queries: 3.470 seconds
        Minimum number of seconds to run all queries: 3.470 seconds
        Maximum number of seconds to run all queries: 3.470 seconds
        Number of clients running queries: 40
        Average number of queries per client: 10000
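For reference, the headline number falls out of the mysqlslap output above: 40 concurrent clients each ran 10,000 queries in 3.470 seconds of wall time. A quick sanity check of the arithmetic (all figures are taken from the output above):

```python
# Aggregate throughput of the MySQL core, from the mysqlslap run above:
# 40 concurrent clients x 10000 queries each, 3.470 seconds elapsed.
clients = 40
queries_per_client = 10000
elapsed_sec = 3.470

qps = clients * queries_per_client / elapsed_sec
print(round(qps))  # ~115k queries per second
```

The same arithmetic applies to the later runs; e.g. 400,000 queries in 5.282 seconds gives roughly 76k queries per second for the Q4M select benchmark below.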
And the throughput of single-row selects through the Q4M storage engine was 76k queries per second.
$ perl -e 'print "select * from test.q4m_t limit 1;\n" for 1..10000' > /tmp/select10k.sql && \
  /usr/local/mysql51/bin/mysqlslap --query=/tmp/select10k.sql \
    --socket=/tmp/mysql51.sock --iterations=1 --concurrency=40
Benchmark
        Average number of seconds to run all queries: 5.282 seconds
        Minimum number of seconds to run all queries: 5.282 seconds
        Maximum number of seconds to run all queries: 5.282 seconds
        Number of clients running queries: 40
        Average number of queries per client: 10000
And finally, the queue consumption speed of Q4M (configure options: --with-mt-pwrite --with-sync=no) was 28k messages per second. When I set the --with-sync flag to fsync, the speed was 20k messages per second. Considering that consuming a single row requires two queries (one to retrieve the row and one to remove it), the numbers look quite good to me, although further optimization should be possible.
$ MESSAGES=200000 CONCURRENCY=40 \
  DBI='dbi:mysql:test;mysql_socket=/tmp/mysql51.sock' t/05-multireader.t
1..4
ok 1
ok 2
ok 3
ok 4
Multireader benchmark result:
  Number of messages: 200000
  Number of readers: 40
  Elapsed: 7.040 seconds
  Throughput: 28410.198 mess./sec.
And regarding the question about the raw performance of Q4M, the answer would be that the overhead of consuming a single row is about 30 microseconds in the Q4M core with fsync enabled, and about 15 microseconds when only pwrites are issued.
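The deduction behind those figures can be sketched as follows. This is my reconstruction of the arithmetic, not an exact calculation: the per-message cost is the inverse of the measured consumption throughput, and subtracting the cost of two no-op queries through the MySQL core (measured at about 115k qps above) leaves an estimate of the time spent inside Q4M itself.

```python
# Estimate Q4M's internal overhead per consumed row by deducting the cost
# of two queries through the MySQL core from the measured per-message time.
mysql_core_qps = 115_000     # "select 1" throughput measured above
consume_pwrite_mps = 28_410  # messages/sec with --with-sync=no (pwrite only)
consume_fsync_mps = 20_000   # messages/sec with --with-sync=fsync

# Two queries per consumed message: one to retrieve it, one to remove it.
core_cost_us = 2 / mysql_core_qps * 1e6  # ~17 microseconds of MySQL overhead

pwrite_overhead_us = 1e6 / consume_pwrite_mps - core_cost_us
fsync_overhead_us = 1e6 / consume_fsync_mps - core_cost_us
print(round(pwrite_overhead_us), round(fsync_overhead_us))  # ~18 and ~33
```

These come out around 18 and 33 microseconds, roughly matching the ~15 and ~30 microsecond estimates above once rounding is accounted for.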