Linux Multi-Queue Block IO Queueing Mechanism (blk-mq) Details
blk-mq (Multi-Queue Block IO Queueing Mechanism) is a new framework for the Linux block layer that was introduced with Linux Kernel 3.13 and became feature-complete with Kernel 3.16.[1] Blk-mq allows for over 15 million IOPS with high-performance flash devices (e.g. PCIe SSDs) on 8-socket servers, though single- and dual-socket servers also benefit considerably from it.[2] To use a device with blk-mq, a blk-mq capable driver for that device is required.
This article explains how blk-mq integrates into the Linux storage stack and which devices have blk-mq compatible drivers already included in the Linux kernel.
blk-mq in the Linux Storage Stack
Blk-mq integrates seamlessly into the Linux storage stack. It provides device drivers with basic functions for mapping I/O requests to multiple queues. The work is thereby distributed across multiple threads and therefore across multiple CPU cores (per-core software queues). Blk-mq compatible drivers inform blk-mq how many parallel hardware queues a device supports (the number of submission queues is declared when the hardware dispatch queues are registered).
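The following is a condensed sketch of this driver-side registration, loosely modeled on early blk-mq drivers such as null_blk and using the API roughly as it looked in Linux 3.16 (the interface has changed in later kernels). All my_* names as well as the queue count and depth are hypothetical illustration values:

```c
#include <linux/blk-mq.h>
#include <linux/blkdev.h>
#include <linux/err.h>

/* Called by blk-mq for each request, potentially from many CPUs in
 * parallel. A real driver would place the request on one of the
 * device's hardware submission queues. */
static int my_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *rq)
{
	blk_mq_end_io(rq, 0);		/* complete immediately in this sketch */
	return BLK_MQ_RQ_QUEUE_OK;
}

static struct blk_mq_ops my_mq_ops = {
	.queue_rq  = my_queue_rq,
	.map_queue = blk_mq_map_queue,	/* default CPU-to-queue mapping */
};

static struct blk_mq_tag_set my_tag_set = {
	.ops          = &my_mq_ops,
	.nr_hw_queues = 4,		/* parallel hardware submission queues */
	.queue_depth  = 64,		/* requests in flight per hardware queue */
	.numa_node    = NUMA_NO_NODE,
	.flags        = BLK_MQ_F_SHOULD_MERGE,
};

static int my_register_queue(struct request_queue **q)
{
	int ret = blk_mq_alloc_tag_set(&my_tag_set);

	if (ret)
		return ret;
	/* Creates the request_queue including the per-core software
	 * queues that blk-mq maps onto the hardware queues above. */
	*q = blk_mq_init_queue(&my_tag_set);
	return IS_ERR(*q) ? PTR_ERR(*q) : 0;
}
```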
Blk-mq-based device drivers bypass the previous Linux I/O scheduler. Some drivers already did this in the past without blk-mq (iomemory-vsl, nvme, mtip32xx), but as bio-based (block-I/O-based) drivers they had to implement many generic functions on their own ("stacked" approach).
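As a minimal sketch of that bio-based approach (API roughly as of Linux 3.13-3.16; the my_* names are hypothetical): such a driver hooks in at the bio level and never sees the I/O scheduler:

```c
#include <linux/blkdev.h>
#include <linux/bio.h>

/* A bio-based driver receives each block I/O unit (bio) directly;
 * neither the I/O scheduler nor the request queue logic is involved,
 * so merging, accounting etc. are the driver's own job. */
static void my_make_request(struct request_queue *q, struct bio *bio)
{
	/* a real driver would translate the bio into device commands */
	bio_endio(bio, 0);		/* signal completion (0 = success) */
}

static struct request_queue *my_setup_bio_queue(void)
{
	struct request_queue *q = blk_alloc_queue(GFP_KERNEL);

	if (q)
		blk_queue_make_request(q, my_make_request);
	return q;
}
```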
All device drivers that use the previous block I/O layer continue to work independently of blk-mq as request-based drivers whose requests pass through the Linux I/O scheduler (request_fn based approach, see Linux I/O Stack Diagram).[3] How much longer this request_fn based approach will remain in the Linux kernel is currently unclear (as of July 2014).[4][5]
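For contrast, a minimal sketch of this traditional request_fn based approach (legacy block layer API; my_* names are hypothetical): a single request queue, protected by one lock, is fed by the I/O scheduler:

```c
#include <linux/blkdev.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(my_queue_lock);

/* Called with my_queue_lock held whenever the I/O scheduler has
 * requests ready; this single serialized entry point is one of the
 * bottlenecks that blk-mq removes. */
static void my_request_fn(struct request_queue *q)
{
	struct request *rq;

	while ((rq = blk_fetch_request(q)) != NULL) {
		/* a real driver would hand rq to the hardware here */
		__blk_end_request_all(rq, 0);	/* 0 = success */
	}
}

static struct request_queue *my_setup_rq_queue(void)
{
	/* one queue, one lock: all CPUs contend on my_queue_lock */
	return blk_init_queue(my_request_fn, &my_queue_lock);
}
```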
Device Drivers
Driver | Device Name | Supported Devices | blk-mq Since Kernel Version |
---|---|---|---|
null_blk | /dev/nullb*[6] | none (test drivers) | 3.13 (git commit) |
virtio-blk | /dev/vd* | virtual block devices in guests (e.g. under KVM[7][8]) | 3.13 (git commit) |
mtip32xx | /dev/rssd* | Micron RealSSD PCIe | 3.16 (git commit) |
scsi (scsi_mq) | /dev/sd* | e.g. SAS and SATA SSDs/HDDs | 3.17 (git commit) |
nvme | /dev/nvme* | e.g. Intel SSD DC P3600 and DC P3700 Series[9] | 3.19 (git commit) |
rbd | /dev/rbd* | RADOS Block Device (Ceph) | 4.0 (git commit) |
ubi/block | /dev/ubiblock* | UBI volumes | 4.0 (git commit) |
loop | /dev/loop* | loopback devices | 4.0 (git commit) |
dm / dm-mpath | /dev/dm-* | request-based device mapper targets (currently this is exclusively dm-multipath) | planned for 4.1[10] |
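Whether a device node is actually served by a blk-mq driver can be checked from user space: blk-mq devices expose an mq directory (with one subdirectory per hardware queue) in sysfs. The following small user-space sketch assumes this sysfs layout as introduced with kernel 3.13:

```c
#include <stdio.h>
#include <sys/stat.h>

/* Returns 1 if /sys/block/<dev>/mq exists, i.e. the device is
 * driven via blk-mq, 0 otherwise. */
static int uses_blk_mq(const char *dev)
{
	char path[256];
	struct stat st;

	snprintf(path, sizeof(path), "/sys/block/%s/mq", dev);
	return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

int main(int argc, char **argv)
{
	const char *dev = argc > 1 ? argv[1] : "sda";

	printf("%s: %s\n", dev,
	       uses_blk_mq(dev) ? "blk-mq" : "legacy/bio-based");
	return 0;
}
```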
Additional Resources
- Kernel-Log – What's New in 3.13 (1): File Systems and Storage (heise.de, Dec. 10, 2013)
- The Multi-queue Block Layer (lwn.net, June 5, 2013)
- Linux & NVM File and Storage System Challenges (snia.org, Slides from Ric Wheeler, Senior Engineering Manager Kernel File Systems, Red Hat, Inc.)
References
1. Blk-mq Is Almost Feature Complete & Fast With Linux 3.16 (phoronix.com, June 2, 2014)
2. Linux Block IO: Introducing Multi-queue SSD Access on Multi-core Systems (Matias Bjørling, Jens Axboe, David Nellans, Philippe Bonnet at SYSTOR 2013 - 6th Annual International Systems and Storage Conference)
3. blk-mq: New Multi-queue Block IO Queueing Mechanism (git commit by Jens Axboe from Oct. 25, 2013)
4. Re: [PATCH RFC - TAKE TWO - 00/12] New Version of the BFQ I/O Scheduler: "... yes, we're likely going to maintain that code for a long time, so it's not going anywhere anytime soon ..." (Jens Axboe, June 2, 2014)
5. Re: [PATCH RFC - TAKE TWO - 00/12] New Version of the BFQ I/O Scheduler: "I'm really planning on not maintaining the old request based SCSI code for a long time once we get positive reports in from users of various kinds of older hardware." (Christoph Hellwig, June 4, 2014)
6. Null block device driver (kernel.org/doc/Documentation)
7. Virtio (www.linux-kvm.org)
8. Boot from virtio block device (www.linux-kvm.org)
9. Intel Solid-State Drive Data Center Family for PCIe (www.intel.com)
10. dm: add full blk-mq support to request-based DM (dm-devel mailing list, March 13, 2015)
Author: Werner Fischer. Werner Fischer works in the Knowledge Transfer team at Thomas-Krenn and completed his studies of Computer and Media Security at FH Hagenberg in Austria. He is a regular speaker at conferences such as LinuxTag, OSMC, OSDC, and LinuxCon, and an author for various IT magazines. In his spare time he enjoys playing the piano and training for a good result at the annual Linz marathon relay.