Some interesting timing info with this:
(both machines are 12-core Xeons with hyperthreading enabled, 12 SSDs in RAID 10, on a 1Gb network)
- transferring with no compression saturates the 1Gb network, giving disk writes on the target of about 120MB/s.
- transferring with minimal compression cuts the network traffic roughly in half, allowing higher disk throughput on the target than a plain 1Gb network can deliver.
- using 12 threads (the number of physical cores, not counting hyperthreading) lowered compression throughput (the level/thread flags are sketched just after the table below).
- the transfer seemed to be disk bound at about 250MB/s; raising the compression level (> 6) and/or the number of threads (> 24) didn't seem to help much after that.
- qpress is faster, but pigz is already on all the machines and in the package repo (and these machines are already disk bound).
- I played with making the block size larger, but it didn't appear to make a big difference in these rounds of copies (there's a block-size example after the transfer commands below).
| compression level | compression threads | network traffic | disk write speed | cpu usage |
| --- | --- | --- | --- | --- |
| 0 (none) | 0 | 1Gb/s (saturated) | 120MB/s | 2% |
| 1 | 24 | 510Mb/s | 200MB/s | 20% |
| 6 | 12 | 370Mb/s | 200MB/s | 40% |
| 6 | 24 | 420Mb/s | 230MB/s | 60% |
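For the table rows that don't use pigz's defaults, the level and thread count have to be passed explicitly; the exact invocations aren't listed here, so the following is just a sketch of how those settings would be passed (-1/-6 pick the compression level, -p the number of threads):

#level 1, 24 threads (second row of the table) - sketch, not the original command
tar cvf - . | pigz -1 -p 24 | nc 1.2.3.4 5678 #send
#level 6, 12 threads (third row) - sketch, not the original command
tar cvf - . | pigz -6 -p 12 | nc 1.2.3.4 5678 #send

The recv side stays the same as in the commands below, since pigz -d doesn't care what level or thread count produced the stream.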
#no compression
tar cvf - . | nc 1.2.3.4 5678 #send (run on the source)
tar xvf - < <(nc -l 5678) #recv (run on the destination first, so nc is listening)

#level 6 (default), 24 threads (default)
tar cvf - . | pigz | nc 1.2.3.4 5678 #send (run on the source)
tar xvf - < <(nc -l 5678 | pigz -d) #recv (run on the destination first, so nc is listening)
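For the block-size experiments mentioned above, pigz takes the compression block size via -b (in KiB, 128KiB by default); a larger-block run would look something like this, with 4096 (4MiB) as an illustrative value rather than the size actually tested:

#larger compression blocks - sketch only, block size value is illustrative
tar cvf - . | pigz -b 4096 | nc 1.2.3.4 5678 #send
tar xvf - < <(nc -l 5678 | pigz -d) #recv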
I've typically used scp to transfer files around, but more and more, for faster results, I find myself using a combo of tar + pigz + netcat, especially when transferring larger amounts of data.
Here’s how to run tar + pigz + netcat. On the source server, use netcat to open up a listening socket…