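Overview
Following this post, I ran performance tests on KVM guests after applying the Meltdown/Spectre patches.
For now this covers only UnixBench. I plan to add more results as I verify further.
Reference information from Red Hat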
https://access.redhat.com/ja/security/vulnerabilities/3311961
Prerequisites
- The KVM host that runs the test VMs has already been patched.
- Test VMs: CentOS 7.3, 4 CPU cores, 16 GB memory
Version information
VM (unpatched)
$ uname -a
Linux vm1 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 13 10:46:25 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
VM (patched)
$ uname -a
Linux vm2 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Jan 4 01:06:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
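To check a VM against the patched release without reading `uname -a` by eye, the two release strings above can be compared numerically. This is only a minimal sketch: `release_key` and `is_patched` are hypothetical helper names, the threshold is simply the patched kernel used in this test, and the comparison looks only at the leading numeric fields rather than implementing full RPM version ordering.

```python
import re

# Patched kernel release used in this test (taken from the uname output above).
PATCHED_RELEASE = "3.10.0-693.11.6.el7.x86_64"


def release_key(release: str) -> tuple:
    """Convert '3.10.0-693.11.6.el7.x86_64' into (3, 10, 0, 693, 11, 6)."""
    match = re.match(r"(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)", release)
    if match is None:
        raise ValueError(f"unexpected kernel release format: {release!r}")
    version, build = match.groups()
    return tuple(int(part) for part in (version + "." + build).split("."))


def is_patched(release: str, threshold: str = PATCHED_RELEASE) -> bool:
    """Return True if the release is at least the threshold release."""
    return release_key(release) >= release_key(threshold)


print(is_patched("3.10.0-693.5.2.el7.x86_64"))   # unpatched VM -> False
print(is_patched("3.10.0-693.11.6.el7.x86_64"))  # patched VM -> True
```

Comparing integer tuples avoids the string-comparison trap where "693.5.2" would sort after "693.11.6".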
Vulnerability check
Unpatched VM
Vulnerable to Spectre Variant 1 and Variant 2 and to Meltdown (Variant 3).
$ sudo sh ./spectre-meltdown-checker.sh
Spectre and Meltdown mitigation detection tool v0.31
Checking for vulnerabilities against running kernel Linux 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 13 10:46:25 EDT 2017 x86_64
CPU is QEMU Virtual CPU version 1.5.3
CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Checking count of LFENCE opcodes in kernel: NO
> STATUS: VULNERABLE (only 21 opcodes found, should be >= 70, heuristic to be improved when official patches become available)
CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Mitigation 1
* Hardware (CPU microcode) support for mitigation
* The SPEC_CTRL MSR is available: YES
* The SPEC_CTRL CPUID feature bit is set: NO
* Kernel support for IBRS: NO
* IBRS enabled for Kernel space: NO
* IBRS enabled for User space: NO
* Mitigation 2
* Kernel compiled with retpoline option: NO
* Kernel compiled with a retpoline-aware compiler: NO
> STATUS: VULNERABLE (IBRS hardware + kernel support OR kernel with retpoline are needed to mitigate the vulnerability)
CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* Kernel supports Page Table Isolation (PTI): NO
* PTI enabled and active: NO
* Checking if we're running under Xen PV (64 bits): NO
> STATUS: VULNERABLE (PTI is needed to mitigate the vulnerability)
A false sense of security is worse than no security at all, see --disclaimer
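Patched VM
Still vulnerable to Spectre Variant 2; everything else is OK. The kernel now supports IBRS, but the checker reports that the SPEC_CTRL CPUID feature bit is not set on this QEMU virtual CPU and that IBRS is not enabled.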
$ sudo sh ./spectre-meltdown-checker.sh
Spectre and Meltdown mitigation detection tool v0.31
Checking for vulnerabilities against running kernel Linux 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Jan 4 01:06:37 UTC 2018 x86_64
CPU is QEMU Virtual CPU version 1.5.3
CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Checking count of LFENCE opcodes in kernel: YES
> STATUS: NOT VULNERABLE (106 opcodes found, which is >= 70, heuristic to be improved when official patches become available)
CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Mitigation 1
* Hardware (CPU microcode) support for mitigation
* The SPEC_CTRL MSR is available: YES
* The SPEC_CTRL CPUID feature bit is set: NO
* Kernel support for IBRS: YES
* IBRS enabled for Kernel space: NO
* IBRS enabled for User space: NO
* Mitigation 2
* Kernel compiled with retpoline option: NO
* Kernel compiled with a retpoline-aware compiler: NO
> STATUS: VULNERABLE (IBRS hardware + kernel support OR kernel with retpoline are needed to mitigate the vulnerability)
CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: YES
* Checking if we're running under Xen PV (64 bits): NO
> STATUS: NOT VULNERABLE (PTI mitigates the vulnerability)
A false sense of security is worse than no security at all, see --disclaimer
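When comparing several VMs, it can help to reduce each report to one status per CVE. The sketch below assumes the plain-text layout shown above (a `CVE-...` header line followed later by a `> STATUS:` line); `summarize_report` is a hypothetical helper of mine, not part of spectre-meltdown-checker, and other versions of the tool may format their output differently.

```python
import re

# Excerpt of the patched VM's report, copied from the output above.
SAMPLE_REPORT = """\
CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Checking count of LFENCE opcodes in kernel: YES
> STATUS: NOT VULNERABLE (106 opcodes found, which is >= 70, heuristic to be improved when official patches become available)
CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Kernel support for IBRS: YES
> STATUS: VULNERABLE (IBRS hardware + kernel support OR kernel with retpoline are needed to mitigate the vulnerability)
CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* PTI enabled and active: YES
> STATUS: NOT VULNERABLE (PTI mitigates the vulnerability)
"""


def summarize_report(report: str) -> dict:
    """Map each CVE id in the report to 'VULNERABLE' or 'NOT VULNERABLE'."""
    statuses = {}
    current_cve = None
    for raw_line in report.splitlines():
        line = raw_line.strip()
        cve = re.match(r"(CVE-\d{4}-\d+)", line)
        if cve:
            current_cve = cve.group(1)
            continue
        # "NOT VULNERABLE" is listed first so it is not mistaken for "VULNERABLE".
        status = re.match(r">\s*STATUS:\s*(NOT VULNERABLE|VULNERABLE)", line)
        if status and current_cve:
            statuses[current_cve] = status.group(1)
    return statuses


for cve_id, status in summarize_report(SAMPLE_REPORT).items():
    print(cve_id, status)
```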
UnixBench results
Only the 4-core run (4 parallel copies) is listed.
Unpatched VM
------------------------------------------------------------------------
Benchmark Run: Wed Jan 17 2018 15:07:20 - 15:14:03
4 CPUs in system; running 4 parallel copies of tests
Dhrystone 2 using register variables 131231880.5 lps (10.0 s, 1 samples)
Double-Precision Whetstone 16937.1 MWIPS (9.6 s, 1 samples)
Execl Throughput 16987.2 lps (29.3 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks 1389846.0 KBps (30.0 s, 1 samples)
File Copy 256 bufsize 500 maxblocks 385034.0 KBps (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks 4373700.0 KBps (30.0 s, 1 samples)
Pipe Throughput 6748830.2 lps (10.0 s, 1 samples)
Pipe-based Context Switching 1417922.6 lps (10.0 s, 1 samples)
Process Creation 44711.5 lps (30.0 s, 1 samples)
Shell Scripts (1 concurrent) 21357.6 lpm (60.0 s, 1 samples)
Shell Scripts (8 concurrent) 3414.1 lpm (60.0 s, 1 samples)
System Call Overhead 6678039.5 lps (10.0 s, 1 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 131231880.5 11245.2
Double-Precision Whetstone 55.0 16937.1 3079.5
Execl Throughput 43.0 16987.2 3950.5
File Copy 1024 bufsize 2000 maxblocks 3960.0 1389846.0 3509.7
File Copy 256 bufsize 500 maxblocks 1655.0 385034.0 2326.5
File Copy 4096 bufsize 8000 maxblocks 5800.0 4373700.0 7540.9
Pipe Throughput 12440.0 6748830.2 5425.1
Pipe-based Context Switching 4000.0 1417922.6 3544.8
Process Creation 126.0 44711.5 3548.5
Shell Scripts (1 concurrent) 42.4 21357.6 5037.2
Shell Scripts (8 concurrent) 6.0 3414.1 5690.2
System Call Overhead 15000.0 6678039.5 4452.0
========
System Benchmarks Index Score 4523.3
Patched VM
------------------------------------------------------------------------
Benchmark Run: Wed Jan 17 2018 15:29:47 - 15:36:30
4 CPUs in system; running 4 parallel copies of tests
Dhrystone 2 using register variables 128905413.5 lps (10.0 s, 1 samples)
Double-Precision Whetstone 16450.1 MWIPS (9.8 s, 1 samples)
Execl Throughput 11097.3 lps (29.0 s, 1 samples)
File Copy 1024 bufsize 2000 maxblocks 1203052.0 KBps (30.0 s, 1 samples)
File Copy 256 bufsize 500 maxblocks 313511.0 KBps (30.0 s, 1 samples)
File Copy 4096 bufsize 8000 maxblocks 3669716.0 KBps (30.0 s, 1 samples)
Pipe Throughput 2038905.8 lps (10.0 s, 1 samples)
Pipe-based Context Switching 638231.7 lps (10.0 s, 1 samples)
Process Creation 34631.1 lps (30.0 s, 1 samples)
Shell Scripts (1 concurrent) 16080.8 lpm (60.0 s, 1 samples)
Shell Scripts (8 concurrent) 2448.6 lpm (60.0 s, 1 samples)
System Call Overhead 1568599.1 lps (10.0 s, 1 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 128905413.5 11045.9
Double-Precision Whetstone 55.0 16450.1 2990.9
Execl Throughput 43.0 11097.3 2580.8
File Copy 1024 bufsize 2000 maxblocks 3960.0 1203052.0 3038.0
File Copy 256 bufsize 500 maxblocks 1655.0 313511.0 1894.3
File Copy 4096 bufsize 8000 maxblocks 5800.0 3669716.0 6327.1
Pipe Throughput 12440.0 2038905.8 1639.0
Pipe-based Context Switching 4000.0 638231.7 1595.6
Process Creation 126.0 34631.1 2748.5
Shell Scripts (1 concurrent) 42.4 16080.8 3792.6
Shell Scripts (8 concurrent) 6.0 2448.6 4081.0
System Call Overhead 15000.0 1568599.1 1045.7
========
System Benchmarks Index Score 2905.0
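The figures in both runs are consistent with each INDEX being RESULT / BASELINE × 10 and the final score being the geometric mean of the twelve indices. The sketch below reproduces that arithmetic from the numbers above; the function names are my own and the code is not taken from UnixBench itself.

```python
import math


def index_value(result: float, baseline: float) -> float:
    """Per-test index: result relative to the baseline, scaled by 10."""
    return result / baseline * 10.0


def overall_score(indices) -> float:
    """Geometric mean of the per-test index values."""
    values = list(indices)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# INDEX columns copied from the two runs above.
UNPATCHED = [11245.2, 3079.5, 3950.5, 3509.7, 2326.5, 7540.9,
             5425.1, 3544.8, 3548.5, 5037.2, 5690.2, 4452.0]
PATCHED = [11045.9, 2990.9, 2580.8, 3038.0, 1894.3, 6327.1,
           1639.0, 1595.6, 2748.5, 3792.6, 4081.0, 1045.7]

# Dhrystone on the unpatched VM: RESULT 131231880.5, BASELINE 116700.0.
print(round(index_value(131231880.5, 116700.0), 1))
print(round(overall_score(UNPATCHED), 1))
print(round(overall_score(PATCHED), 1))
```

Because the score is a geometric mean, a large drop in a few syscall-heavy tests pulls the total down even when the CPU-bound tests barely change.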
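UnixBench performance comparison
I confirmed that the same test items reported in the Qiita article I referenced drop significantly. The largest drops here are System Call Overhead, Pipe Throughput, and Pipe-based Context Switching, which fall to 23.49%, 30.21%, and 45.01% of their unpatched index values.
System Benchmarks Index Values | Before patch | After patch | After / before (%) |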
---|---|---|---|
Dhrystone 2 using register variables | 11245.2 | 11045.9 | 98.23% |
Double-Precision Whetstone | 3079.5 | 2990.9 | 97.12% |
Execl Throughput | 3950.5 | 2580.8 | 65.33% |
File Copy 1024 bufsize 2000 maxblocks | 3509.7 | 3038 | 86.56% |
File Copy 256 bufsize 500 maxblocks | 2326.5 | 1894.3 | 81.42% |
File Copy 4096 bufsize 8000 maxblocks | 7540.9 | 6327.1 | 83.90% |
Pipe Throughput | 5425.1 | 1639 | 30.21% |
Pipe-based Context Switching | 3544.8 | 1595.6 | 45.01% |
Process Creation | 3548.5 | 2748.5 | 77.46% |
Shell Scripts (1 concurrent) | 5037.2 | 3792.6 | 75.29% |
Shell Scripts (8 concurrent) | 5690.2 | 4081 | 71.72% |
System Call Overhead | 4452 | 1045.7 | 23.49% |
System Benchmarks Index Score | 4523.3 | 2905.0 | 64.22% |
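The last column of the comparison table is the patched index expressed as a percentage of the unpatched index, so 100% would mean no change. A minimal sketch of that calculation, with hypothetical helper names and a few rows copied from the table:

```python
def ratio_percent(before: float, after: float) -> float:
    """Patched index as a percentage of the unpatched index."""
    return round(after / before * 100, 2)


def slowdown_percent(before: float, after: float) -> float:
    """How much of the unpatched index was lost, in percent."""
    return round(100 - after / before * 100, 2)


# (before patch, after patch) index values copied from the table above.
ROWS = {
    "Pipe Throughput": (5425.1, 1639.0),
    "System Call Overhead": (4452.0, 1045.7),
    "System Benchmarks Index Score": (4523.3, 2905.0),
}

for name, (before, after) in ROWS.items():
    print(name, ratio_percent(before, after), slowdown_percent(before, after))
```

Read this way, the overall score of 64.22% corresponds to a loss of about 36% in this particular UnixBench run.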
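For the meaning of each test item, I referred to this page.
I would have liked to run a proper test under production-equivalent load, but I will stop here for now. I will add updates if there is any follow-up.
Update (2018/01/23)
mysqlslap results
Following this article, I measured performance with mysqlslap.
Overall, performance degraded by roughly 10% to 20%. The degradation is most noticeable for insert (write) workloads, where the average run time got 16.89% worse.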
All tests use 50 threads, 1000 rows of data, and 1000 queries. Times are in seconds as reported by mysqlslap; a negative change means the patched VM was slower.

Load type | Metric | Before patch (s) | After patch (s) | Change (%) |
---|---|---|---|---|
read (table scan) | Average | 0.602 | 0.676 | -12.29% |
read (table scan) | Minimum | 0.495 | 0.548 | -10.71% |
read (table scan) | Maximum | 0.873 | 1.016 | -16.38% |
write (insert into table) | Average | 0.148 | 0.173 | -16.89% |
write (insert into table) | Minimum | 0.124 | 0.152 | -22.58% |
write (insert into table) | Maximum | 0.173 | 0.196 | -13.29% |
key (primary key read) | Average | 0.075 | 0.085 | -13.33% |
key (primary key read) | Minimum | 0.069 | 0.079 | -14.49% |
key (primary key read) | Maximum | 0.085 | 0.095 | -11.76% |
mixed (half inserts, half table scans) | Average | 0.106 | 0.121 | -14.15% |
mixed (half inserts, half table scans) | Minimum | 0.098 | 0.104 | -6.12% |
mixed (half inserts, half table scans) | Maximum | 0.13 | 0.128 | 1.54% |
update | Average | 0.176 | 0.189 | -7.39% |
update | Minimum | 0.159 | 0.175 | -10.06% |
update | Maximum | 0.197 | 0.209 | -6.09% |
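The last column of the mysqlslap table appears to be the relative change in run time with the sign flipped, so that a longer run time after patching shows up as a negative number. The sketch below reproduces a few rows under that assumption; `change_percent` is a hypothetical helper name.

```python
def change_percent(before_s: float, after_s: float) -> float:
    """Relative change in run time; negative means the patched run took longer."""
    return round((before_s - after_s) / before_s * 100, 2)


# (before patch, after patch) run times copied from the table above.
SAMPLES = {
    "read / Average": (0.602, 0.676),
    "write / Minimum": (0.124, 0.152),
    "mixed / Maximum": (0.13, 0.128),
}

for label, (before_s, after_s) in SAMPLES.items():
    print(label, change_percent(before_s, after_s))
```

Note that this differs from the UnixBench table, which reports a ratio of index values rather than a signed change in run time.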
I will add more here if there is any follow-up.