43. RPS netperf results
netperf benchmark results from lwn.net:
e1000e on 8 core Intel
Without RPS: 90K tps at 33% CPU
With RPS: 239K tps at 60% CPU
forcedeth on 16 core AMD
Without RPS: 103K tps at 15% CPU
With RPS: 285K tps at 49% CPU
Friday, June 7, 2013
54. RFS netperf results
netperf benchmark results from lwn.net:
e1000e on 8 core Intel
No RFS or RPS: 104K tps at 30% CPU
No RFS (best RPS config): 290K tps at 63% CPU
RFS: 303K tps at 61% CPU
RPC test         tps    CPU%   50/90/99% usec latency   StdDev
No RFS or RPS    103K   48%    757/900/3185             4472.35
RPS only         174K   73%    415/993/2468             491.66
RFS              223K   73%    379/651/1382             315.61
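57. Limitations of Receive Side Scaling
• Disabling RSS on the fly is not allowed, and the 82599 must be reset after RSS is disabled.
• When RSS is disabled, packets are assigned an RSS output index = zero.
• When multiple request queues are enabled in RSS mode, un-decodable packets are assigned an RSS output index = zero. The 32-bit tag (normally a result of the hash function) equals zero.
[Figure 7.10. RSS Block Diagram: the parsed receive packet is hashed to a 32-bit RSS hash, which is written to the packet descriptor. The 7 LS bits of the hash index the redirection table (128 x 4 bits), and the selected 4-bit entry becomes the RSS output index. If RSS is disabled, or RSS is enabled and the packet is not decodable, the index is 0.]
• Only 7 of the 32 hash bits are used for the lookup, so at most 128 flows can be distinguished.
• The result is narrowed further to a 4-bit RSS output index.
• With many flows, hash collisions occur, so RSS is not suited to queueing a specific flow to a specific CPU.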
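The lookup in the RSS block diagram can be sketched in a few lines. This is only an illustration of the data path shown on the slide, not driver or hardware code: the function name `rss_output_index` and the round-robin table contents are assumptions made for the example.

```python
# Sketch of the RSS queue selection shown in the block diagram.
# Assumption: the redirection table is filled round-robin over 16 queues;
# real contents are programmed by the driver.

REDIRECTION_TABLE_SIZE = 128   # indexed by the 7 LS bits of the hash
INDEX_MASK = 0x7F              # 7 LS bits
ENTRY_MASK = 0xF               # each entry is 4 bits wide


def rss_output_index(rss_hash, redirection_table, rss_enabled=True, decodable=True):
    """Return the 4-bit RSS output index for a 32-bit RSS hash."""
    # RSS disabled, or RSS enabled but packet not decodable: index is zero.
    if not rss_enabled or not decodable:
        return 0
    entry = redirection_table[rss_hash & INDEX_MASK]
    return entry & ENTRY_MASK


# Hypothetical table for illustration only.
table = [slot % 16 for slot in range(REDIRECTION_TABLE_SIZE)]

# Two different 32-bit hashes that share their 7 LS bits land on the same queue.
same_queue = rss_output_index(0x00000005, table) == rss_output_index(0xFFFFFF85, table)
```

Because the upper 25 bits of the hash never influence the result, any two flows whose hashes agree in the low 7 bits are indistinguishable to RSS, which is the collision problem the slide points out.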
60. Flow Director
The following figure shows a block diagram of the flow director filters. Received flows are assigned to buckets by a hash function on the relevant tuples as defined by the FDIR...M registers. Each bucket is organized in a linked list indicated by the hash lookup table. Buckets can have a variable length, and the last filter in each bucket is marked as last. There is no upper limit for a linked list length during programming; however, a received packet that matches a filter that exceeds the FDIRCTRL.Max-Length is reported to software (see Section 7.1.2.7.5).
[Figure: Flow Director filters block diagram. The Rx packet tuples are logically ANDed with the flexible filters mask registers (~350) and hashed to a 15 bit address into the hash lookup table (addresses 0 to M; each entry holds Bucket Valid and First Filter PTR). Each entry points to a bucket (linked list 0 to linked list M) of filters, each holding Hash-Index, Flow ID fields, Filter Action, Collision flag, and Next Filter PTR. The full Flow ID fields are compared in "Perfect Match mode"; a second 15 bit hash (signature) serves as the Flow ID field in "Signature mode". A linked list longer than the max recommended length (FDIRCTRL.Max-Length) is 'too long'. Labels: 32K; the flexible filters table shares the Rx packet buffer memory space.]
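The bucket structure described on this slide (hash lookup table, then a linked list of filters per bucket) can be modelled roughly as follows. This is a minimal sketch under stated assumptions: the class names, the `hash_fn` parameter, and the use of Python's built-in `hash` as a stand-in for the hardware hash are all hypothetical, and the "too long" condition is simply returned as a flag rather than reported the way the device does.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Filter:
    flow_id: tuple                    # Flow ID fields (or a signature)
    action: int                       # Filter Action, e.g. an Rx queue number
    next: Optional["Filter"] = None   # Next Filter PTR


def default_bucket_hash(flow_id: tuple) -> int:
    # Stand-in for the hardware hash: keep only a 15 bit address.
    return hash(flow_id) & 0x7FFF


class FlowDirectorTable:
    def __init__(self, max_length: int,
                 hash_fn: Callable[[tuple], int] = default_bucket_hash):
        self.max_length = max_length           # models FDIRCTRL.Max-Length
        self.hash_fn = hash_fn
        self.buckets: Dict[int, Filter] = {}   # address -> First Filter PTR

    def add(self, flow_id: tuple, action: int) -> None:
        # New filters are pushed at the head of the bucket's linked list.
        addr = self.hash_fn(flow_id)
        self.buckets[addr] = Filter(flow_id, action, self.buckets.get(addr))

    def lookup(self, flow_id: tuple) -> Tuple[Optional[int], bool]:
        """Return (action, too_long); action is None when nothing matches."""
        node = self.buckets.get(self.hash_fn(flow_id))
        depth = 0
        while node is not None:
            depth += 1
            if node.flow_id == flow_id:
                # A match beyond max_length is flagged as 'too long'.
                return node.action, depth > self.max_length
            node = node.next
        return None, False
```

Walking the list costs time proportional to its depth, which is why the diagram marks a recommended maximum length even though programming itself imposes no limit.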
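61. Linux usage example #1 (automatic)
[Figure: transmit path. process → system call → socket → protocol stack → driver → Txq → NIC, with a filter-update arrow into the NIC's Flow Director Filters.]
When a packet is sent from process context, the filter is updated using the sending CPU and the packet header.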
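The automatic update described on this slide can be sketched as a small model. It is not the driver's code: the class `AutoFilterTable`, its method names, and the assumption that Rx queue N is serviced by CPU N are all made up for illustration.

```python
class AutoFilterTable:
    """Toy model of filters learned from the transmit path."""

    def __init__(self, default_queue=0):
        self.default_queue = default_queue
        self.filters = {}   # receive-side flow key -> Rx queue

    def on_transmit(self, tx_cpu, src_ip, dst_ip, src_port, dst_port):
        # Replies to this packet arrive with source and destination swapped,
        # so the filter is keyed on the reversed tuple.
        rx_key = (dst_ip, src_ip, dst_port, src_port)
        # Assumption: Rx queue N is serviced by CPU N.
        self.filters[rx_key] = tx_cpu

    def rx_queue_for(self, src_ip, dst_ip, src_port, dst_port):
        # Flows without a learned filter fall back to the default queue.
        key = (src_ip, dst_ip, src_port, dst_port)
        return self.filters.get(key, self.default_queue)


filters = AutoFilterTable()
# A process running on CPU 3 sends a request to a server.
filters.on_transmit(3, "10.0.0.1", "10.0.0.2", 5000, 80)
# The reply is steered to the queue matching the sender's CPU.
reply_queue = filters.rx_queue_for("10.0.0.2", "10.0.0.1", 80, 5000)
```

If the process later transmits from a different CPU, the next `on_transmit` call overwrites the entry, so received packets follow the process without any manual configuration.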