tcpreplay won't replay all packets in a pcap file with --netmap flag at 5+ Gbps rate #255

Closed
c-f-s-s opened this issue Jul 21, 2016 · 4 comments

@c-f-s-s

c-f-s-s commented Jul 21, 2016

I have FreeBSD 10.3 set up with a dual-port 10Gb Intel X520-SR2 network card, the netmap drivers, and tcpreplay.

Recently I discovered that if I replay a pcap file with the --netmap flag at a rate of 5 Gbps or higher, not all packets are transmitted. Tcpreplay reports that every packet was transmitted, but the receiving equipment reports that some are missing. The number missing is not consistent from run to run, but some are always missing. I have a high-performance capture adapter receiving the traffic; it doesn't report any dropped packets or bad CRCs. Between the capture adapter and the Intel card transmitting the packets, I have a Gigamon GigaVUE-HB2, which doesn't report any dropped packets either and corroborates the count reported by my capture device.

If I remove the --netmap option, or if I keep the --netmap option but replay at a lower rate such as 1 Gbps, the issue does not reproduce. (Without --netmap my system can't reach 5 Gbps or higher anyway, so I'm not sure how much differentiating those two scenarios means.)

This issue seems to reproduce regardless of the pcap file used. If necessary, I may be willing to find one that I'm ok with sharing.

A screenshot below shows the packet counts reported by all three tools.

Command example:

tcpreplay -i ix1 -M 10000 -K --netmap [pcap file here]

dmesg output regarding the dual-port 10Gb Intel X520-SR2:

ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 3.1.13-k> port 0xe020-0xe03f mem 0xdfd80000-0xdfdfffff,0xdfe04000-0xdfe07fff irq 16 at device 0.0 on pci1
ix1: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 3.1.13-k> port 0xe000-0xe01f mem 0xdfd00000-0xdfd7ffff,0xdfe00000-0xdfe03fff irq 17 at device 0.1 on pci1

uname -rsp:

FreeBSD 10.3-RELEASE amd64

tcpreplay version info:

tcpreplay version: 4.1.1 (build git:v4.1.1)
Copyright 2013-2014 by Fred Klassen - AppNeta
Copyright 2000-2012 by Aaron Turner
The entire Tcpreplay Suite is licensed under the GPLv3
Cache file supported: 04
Compiled against libdnet: 1.12
Compiled against libpcap: 1.4.0
64 bit packet counters: enabled
Verbose printing via tcpdump: enabled
Packet editing: disabled
Fragroute engine: enabled
Default injection method: bpf send()
Not compiled with Quick TX
Optional injection method: netmap

[Screenshot "missingpackets": packet counts reported by tcpreplay, the Gigamon GigaVUE-HB2, and the capture adapter]

@fklassen
Member

Tcpreplay "Successful packets" reported means that netmap accepted the packets for transmission. Possibly you adapter needs more time before switching out of netmap mode, causing some packets to clip. Can you try playing with --nm-delay to see if it goes away?

What version of netmap are you using?

@c-f-s-s
Author

c-f-s-s commented Jul 22, 2016

That sounds plausible: when I save the resulting captures in CSV format and diff them, the missing packets are always the last packets of the pcap file.

However, this flag only seems to affect the delay when switching into netmap mode. Switching out of netmap mode appears to be nearly instantaneous regardless of what I set the flag to; I've tried values up to 30 seconds with no improvement in reliability. Is the specified delay supposed to affect switching out of netmap mode as well as into it?

I'm not sure how to check the netmap version on FreeBSD, actually. It's whatever version is included in 10.3, as I just recompiled the kernel with the version in FreeBSD's source tree. You wouldn't happen to know how I could check the version, would you? I'm not too familiar with working with *nix kernel modules, and I'm coming up short on Google.
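One possible way to check (an assumption on my part, not something confirmed in this thread) is to grep the netmap API version constant out of the header that ships with FreeBSD base, assuming it is installed at the usual path:

grep NETMAP_API /usr/include/net/netmap.h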

@fklassen
Member

I guess it's hard-coded. I'll have a look. Maybe #250 will help as well, once it gets implemented.

@fklassen fklassen added 4.2.0 and removed 4.1.2 labels Feb 28, 2017
@fklassen fklassen added this to the 4.2 milestone Feb 28, 2017
@fklassen fklassen self-assigned this Feb 28, 2017
@fklassen
Member

I did notice similar issues. I've added a delay before switching netmap back to normal mode.

Example of the fix below. Note that the difference in the eth5 TX counter (ifconfig before vs. after the run) matches the number of packets tcpreplay reported as sent, so the driver confirms that every packet was transmitted.

root@r400-10GBaseT:/home/admin# ifconfig eth5
eth5      Link encap:Ethernet  HWaddr 00:90:fb:49:f8:8d  
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:67854920 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 B)  TX bytes:30738965109 (28.6 GiB)
          Interrupt:17 Memory:f7800000-f7820000 

root@r400-10GBaseT:/home/admin# ./tcpreplay -i netmap:eth5 -Kt -l10 --stats 0  /home/admin/bigFlows.pcap 
Switching network driver for eth5 to netmap bypass mode... 
done!
File Cache is enabled
Test start: 2017-02-27 23:01:33.891493 ...
Loop 1 of 10...
Actual: 791615 packets (355417784 bytes) sent in 2.99 seconds
Rated: 118596900.0 Bps, 948.77 Mbps, 264148.58 pps
Loop 2 of 10...
Actual: 1583230 packets (710835568 bytes) sent in 5.99 seconds
Rated: 118538700.0 Bps, 948.30 Mbps, 264018.98 pps
Loop 3 of 10...
Actual: 2374845 packets (1066253352 bytes) sent in 8.99 seconds
Rated: 118539500.0 Bps, 948.31 Mbps, 264020.80 pps
Loop 4 of 10...
Actual: 3166460 packets (1421671136 bytes) sent in 11.99 seconds
Rated: 118531800.0 Bps, 948.25 Mbps, 264003.53 pps
Loop 5 of 10...
Actual: 3958075 packets (1777088920 bytes) sent in 14.99 seconds
Rated: 118535800.0 Bps, 948.28 Mbps, 264012.47 pps
Loop 6 of 10...
Actual: 4749690 packets (2132506704 bytes) sent in 17.99 seconds
Rated: 118530600.0 Bps, 948.24 Mbps, 264000.95 pps
Loop 7 of 10...
Actual: 5541305 packets (2487924488 bytes) sent in 20.98 seconds
Rated: 118532400.0 Bps, 948.25 Mbps, 264004.96 pps
Loop 8 of 10...
Actual: 6332920 packets (2843342272 bytes) sent in 23.98 seconds
Rated: 118525700.0 Bps, 948.20 Mbps, 263990.03 pps
Loop 9 of 10...
Actual: 7124535 packets (3198760056 bytes) sent in 26.98 seconds
Rated: 118519800.0 Bps, 948.15 Mbps, 263976.88 pps
Loop 10 of 10...
Test complete: 2017-02-27 23:02:03.879221
Actual: 7916150 packets (3554177840 bytes) sent in 29.98 seconds
Rated: 118521000.0 Bps, 948.16 Mbps, 263979.65 pps
Flows: 40686 flows, 1356.75 fps, 7911790 flow packets, 4360 non-flow
Statistics for network device: eth5
	Successful packets:        7916150
	Failed packets:            0
	Truncated packets:         0
	Retried packets (ENOBUFS): 0
	Retried packets (EAGAIN):  6758695
Switching network driver for eth5 to normal mode... done!
root@r400-10GBaseT:/home/admin# ifconfig eth5
eth5      Link encap:Ethernet  HWaddr 00:90:fb:49:f8:8d  
          UP BROADCAST PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:75771070 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 B)  TX bytes:34324767972 (31.9 GiB)
          Interrupt:17 Memory:f7800000-f7820000 
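In other words, the TX counter grew by 75,771,070 - 67,854,920 = 7,916,150 packets, exactly the "Successful packets" count tcpreplay reported.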
