Copyright 2014 FUJITSU LIMITED 
Btrfs: Current Status and Future Prospects
Oct 13 2014 
Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> 
Fujitsu LTD.
Agenda
Background
Core Features
Development Statistics
Future Prospects
Background
Fujitsu has been developing Btrfs for Mission Critical (MC) systems since 2010
Requirements of MC systems
High robustness
•Don’t crash: data duplication
•Error detection: checksum
•Repair, recovery: snapshot, backup/restore, repair tools
High availability: must run 24 hours a day, 365 days a year
•Limited maintenance time: enlarge storage and take backups online
Btrfs is designed to meet these requirements
Core Features
Multi-volumes
Copy-on-Write Style Update
Data/Metadata Checksum
Subvolume
Snapshot
Transparent Compression
Multi-volumes
A Btrfs file system can consist of multiple volumes
Fewer layers and lower overhead than LVM
Many features: RAID, online {add/remove/replace} of devices
[Figure: I/O stack comparison. XFS or ext4 + LVM: VFS, file system (XFS, ext4, and so on), LVM, block device. Btrfs: VFS, Btrfs, block device.]
# mkfs.btrfs /dev/sd{a,b,c}1
Copy-on-Write (CoW) style update
Btrfs uses CoW style data/metadata update
Safer than overwrite style update by design
Overwrite style: Update the data in place
•System crash => data can become inconsistent
CoW style: Copy, update, and replace the pointer
•The old copy will be deleted later
•System crash => data stays consistent
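The difference between the two update styles can be illustrated with a toy model. This is a minimal Python sketch, not Btrfs code: the names `Crash`, `overwrite_update`, and `CowFile` are hypothetical, Python lists stand in for disk blocks, and the crash is simulated with an exception.

```python
class Crash(Exception):
    """Simulated power failure in the middle of an update."""


def overwrite_update(blocks, new_data, crash_after=None):
    """Overwrite style: modify the live blocks in place."""
    for i, value in enumerate(new_data):
        if crash_after is not None and i == crash_after:
            raise Crash()
        blocks[i] = value  # old contents are lost immediately


class CowFile:
    """CoW style: write a new copy, then switch a single pointer."""

    def __init__(self, data):
        self.current = list(data)  # the "pointer" to the live copy

    def update(self, new_data, crash_after=None):
        copy = list(self.current)              # 1. copy
        for i, value in enumerate(new_data):
            if crash_after is not None and i == crash_after:
                raise Crash()                  # live copy is untouched
            copy[i] = value                    # 2. update the copy
        self.current = copy                    # 3. replace the pointer


old, new = ["a1", "a2", "a3"], ["b1", "b2", "b3"]

# Crash after one block has been written, in both styles.
in_place = list(old)
try:
    overwrite_update(in_place, new, crash_after=1)
except Crash:
    pass

cow = CowFile(old)
try:
    cow.update(new, crash_after=1)
except Crash:
    pass

print("overwrite after crash:", in_place)
print("CoW after crash:      ", cow.current)
```

In this model the overwritten file ends up as a mix of old and new blocks, while the CoW file still points at the complete old version because the pointer is only switched after the whole copy is written.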
CoW versus Overwrite
Test with 1,000 unexpected power failures
Linux File System Analysis for IVI system, Mitsuharu Ito, Fujitsu
http://events.linuxfoundation.jp/sites/events/files/slides/linux_file_system_analysis_for_IVI_systems.pdf
Result
Ext4: Metadata was corrupted
Btrfs: Worked fine without any problem
In my similar internal testing, XFS was corrupted too.
Data/Metadata Checksum
Btrfs keeps a checksum for each data/metadata extent to detect and repair broken data
When Btrfs reads a broken extent, it detects the checksum mismatch
With mirroring: RAID1/RAID10
•Read a correct copy
•Repair the broken extent from the correct copy
Without mirroring
•Discard the broken extent and return EIO
With “btrfs scrub”, Btrfs traverses all extents and fixes incorrect ones
•Runs online as a background job
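The read path described on this slide can be sketched as follows. This is a simplified Python model under stated assumptions, not the kernel implementation: `read_extent` and `checksum` are hypothetical names, `zlib.crc32` is only a stand-in for the real checksum algorithm, and a list of byte strings stands in for the mirror copies.

```python
import errno
import zlib


def checksum(data: bytes) -> int:
    # Stand-in checksum; an assumption made for this sketch only.
    return zlib.crc32(data)


def read_extent(copies, expected):
    """Return good data, repairing broken mirror copies on the way.

    copies   -- list of bytes, one per mirror (one entry = no mirroring)
    expected -- checksum recorded when the extent was written
    """
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        # No intact copy is left: report an I/O error to the caller.
        raise OSError(errno.EIO, "checksum mismatch, no good copy")
    for i, c in enumerate(copies):
        if checksum(c) != expected:
            copies[i] = good  # repair the broken copy from the good one
    return good


data = b"important data"
stored = checksum(data)

mirrored = [b"importent data", data]   # first mirror is corrupted
result = read_extent(mirrored, stored)

single = [b"importent data"]           # no mirror to fall back on
try:
    read_extent(single, stored)
    error = None
except OSError as exc:
    error = exc.errno
```

A scrub, in this model, would simply call `read_extent` for every extent instead of waiting for an application to read it.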
Subvolume
A subvolume is a file system inside a file system
Can be treated as a file system root
•Mountable: most mount options are shared
•Own inode namespace and quota limit
Efficient: Available space is shared
[Figure: subvolume "sub" under the file system root "/"]
# btrfs subvolume create sub
Snapshot
Copy of a subvolume
Far faster than LVM
•Not a full copy; only metadata is updated in CoW style
Read-only snapshot: with the -r option
Incremental snapshot: snapshot of a snapshot
# btrfs subvolume snapshot [-r] ./sub ./snap
[Figure: reference counts for data A, B, and C in three states. Initially only "sub" exists and every count is 1. After capturing a snapshot, "snap" shares everything with "sub" and the counts become 2. After updating data C in the snapshot, a new copy C’ with count 1 belongs to "snap" and the count of the original C drops back to 1.]
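The reference counting in the figure can be approximated with a small model. This is a minimal Python sketch under simplifying assumptions: `Store`, `snapshot`, and `write` are hypothetical names, a subvolume is a flat dictionary from file name to block id, and the real on-disk tree structure is not modeled.

```python
class Store:
    """Toy block store with per-block reference counts."""

    def __init__(self):
        self.data = {}      # block id -> contents
        self.refs = {}      # block id -> reference count
        self.next_id = 0

    def alloc(self, contents):
        block = self.next_id
        self.next_id += 1
        self.data[block] = contents
        self.refs[block] = 1
        return block

    def release(self, block):
        self.refs[block] -= 1
        if self.refs[block] == 0:   # nobody uses it any more
            del self.refs[block], self.data[block]


def snapshot(store, subvol):
    """Share every block instead of copying it: only counts change."""
    for block in subvol.values():
        store.refs[block] += 1
    return dict(subvol)


def write(store, subvol, name, contents):
    """CoW write: allocate a new block, then drop the old reference."""
    old = subvol[name]
    subvol[name] = store.alloc(contents)
    store.release(old)


store = Store()
sub = {name: store.alloc(name) for name in ("A", "B", "C")}

snap = snapshot(store, sub)                     # capture a snapshot
counts_after_snapshot = [store.refs[b] for b in sub.values()]

write(store, snap, "C", "C'")                   # update C in the snapshot
```

Capturing the snapshot touches only the counters, which is why it is fast regardless of how much data the subvolume holds; the data itself is copied only when one side writes.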
Performance of Snapshot: Btrfs versus LVM
1. Copy the following data to a volume
100 directories, each containing 100 files
•File size: 1MB
2. Capture a snapshot of the volume
Hardware Environment
•PRIMERGY RX300 S6
•CPU: Intel Xeon X5690 3.47GHz x 12 cores
•Memory: 16GiB
•Storage: 100GB HDD x 2
Software Environment
•Red Hat Enterprise Linux 7.0
•File systems
  •Btrfs
    •Data/metadata: RAID1
    •Other options: default
  •XFS: default options
•Volume manager for XFS
  •dm-thinp: chunk size is 256KiB
  •LVM: RAID1
Result
Copy: Btrfs > LVM >>> dm-thinp
Snapshot: Btrfs > dm-thinp >>> LVM

Volume type        Copy    Snapshot                Snapshot
                           (without page cache)    (with page cache)
Btrfs              106s    0.126s                  11.7s
XFS on dm-thinp    209s    0.260s                  15.5s
XFS on LVM         133s    1.03s                   45.2s
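To make the ordering in the result easier to read, the measured times can be expressed relative to Btrfs. The following Python sketch only does arithmetic on the numbers from this slide; the dictionary keys are hypothetical labels chosen for this example.

```python
# Times in seconds, copied from the results on this slide.
results = {
    "Btrfs":           {"copy": 106.0, "snap_no_cache": 0.126, "snap_cache": 11.7},
    "XFS on dm-thinp": {"copy": 209.0, "snap_no_cache": 0.260, "snap_cache": 15.5},
    "XFS on LVM":      {"copy": 133.0, "snap_no_cache": 1.03,  "snap_cache": 45.2},
}

baseline = results["Btrfs"]

# Ratio > 1 means "this many times slower than Btrfs".
slowdown = {
    name: {test: times[test] / baseline[test] for test in times}
    for name, times in results.items()
}

for name, ratios in slowdown.items():
    cells = ", ".join(f"{test}: {ratio:.2f}x" for test, ratio in ratios.items())
    print(f"{name:16s} {cells}")
```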
Transparent compression
Automatically compresses/expands file data on I/O
Low space consumption and high I/O performance
•Needs some extra CPU time
Usage: mount -o compress={lzo,zlib} <device> <mntpoint>
•Can also be enabled/disabled for each file
[Figure: data flow between the system (page cache) and storage, without compression and with compression (compress/expand step)]
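The trade-off on this slide (less data on storage in exchange for CPU time) can be seen with a user-space stand-in. This is a minimal Python sketch, assuming the standard `zlib` module as a substitute for the in-kernel compressor; `write_file` and `read_file` are hypothetical helpers, and the sample data is deliberately repetitive so that it compresses well.

```python
import zlib


def write_file(data: bytes) -> bytes:
    """Compress on the way to storage; the caller never sees this step."""
    return zlib.compress(data)


def read_file(stored: bytes) -> bytes:
    """Expand on the way back from storage."""
    return zlib.decompress(stored)


original = b"Btrfs transparent compression " * 1000
stored = write_file(original)
restored = read_file(stored)

saved = 1 - len(stored) / len(original)
print(f"original: {len(original)} bytes, stored: {len(stored)} bytes")
print(f"space saved: {saved:.1%}")
```

The application reads back exactly what it wrote, and fewer bytes travel to and from storage, which is where the I/O performance gain comes from. Data that is already compressed would see little or no saving.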
Development statistics
Patch statistics
Performance
Summary
Patch Statistics
[Figure: patch statistics]
Patch Statistics: Tips of v3.17
[Figure: patch statistics for v3.17, with the annotation "Rejected by Linus"]
Patch statistics: Main changes
Inode properties
offline dedup
Improve sync write ~60%
RAID5/6
replace subcommand
quota
send/receive
btrfsck
repair
auto defrag
scrub
Improve error handling
Fujitsu’s contribution
•btrfsck, error handling
•fast {random/async} write
•LZO compression
•read-only snapshot
•miscellaneous bug fixes
•enrich xfstests
Fujitsu’s contribution: btrfs-progs
•fsck
•error handling
•miscellaneous bug fixes
•enrich xfstests
•documentation
Performance measurement
Hardware Environment
•PRIMERGY TX300 S6
•CPU: Xeon X5670 x 2
  •12 cores
  •HT is disabled
•Memory: 4GB
•HDD: 300GB x 1
  •MegaRAID SAS, HITACHI HUS156030VLS600
Software Environment
•Benchmark software: filebench
•Kernel: 3.14.11, 3.15.4, 3.16.3, and 3.17-rc2
•I/O scheduler: deadline
•File systems: Btrfs (single volume), XFS, and ext4
  •default mkfs options and mount options
The result: Comparison with other file systems
Kernel version: v3.17-rc2
[Figure: benchmark results for Btrfs, XFS, and ext4]
The result: Comparison with older Btrfs versions
[Figure: benchmark results for Btrfs across kernel versions]
VFS has also improved performance
Accomplished by VFS layer performance enhancement
Summary
Ready to use, as long as RAID5/6 is not used
Performance: OK
Stability: OK
•# of new features has decreased
•Test coverage has increased
Features: almost OK
•RAID5/6: Lacks scrub and replace subcommands
RAID1 and RAID10 are the best choice
Especially safe and stable
Future Prospects: Fujitsu’s plan
RAID 5/6 enhancement
Add scrub and replace subcommands
•We’re testing patches now and will post them to the linux-btrfs ML soon
Add five tests for these features to xfstests
Further enhancement of robustness and performance
Repair tools and so on
Education and documents for this purpose
Operation know-how
•Btrfs operations are different from other file systems
•e.g. "Btrfs Basics, Part 1: Features" (Btrfsの基礎 part1 機能編; in Japanese, English translation in progress)
http://www.slideshare.net/fj_staoru_takeuchi/btrfs-part1
File system structure
Code logic
Future Prospects: Btrfs users are increasing
Will be the default file system of openSUSE 13.2
Supported by Ubuntu
Available in RHEL7 as a tech preview
Will be used for In-Vehicle Infotainment (IVI) systems
Linux File System Analysis for IVI system, Mitsuharu Ito, Fujitsu
http://events.linuxfoundation.jp/sites/events/files/slides/linux_file_system_analysis_for_IVI_systems.pdf
Conclusion
Please try Btrfs
It’s ready to use
RAID1/10 are the best choice
RAID5/6 need some more work
We recommend the newest stable kernel
References
Linux File System Analysis for IVI system, Mitsuharu Ito, Fujitsu
http://events.linuxfoundation.jp/sites/events/files/slides/linux_file_system_analysis_for_IVI_systems.pdf
Btrfs Basics, Part 1: Features (Btrfsの基礎 part1 機能編, in Japanese)
http://www.slideshare.net/fj_staoru_takeuchi/btrfs-part1
linux-btrfs ML: linux-btrfs@vger.kernel.org
Btrfs wiki
https://btrfs.wiki.kernel.org/index.php/Main_Page
