Linux Kernel vs DPDK: HTTP Performance Showdown (ScyllaDB)
In this session I will use a simple HTTP benchmark to compare the performance of the Linux kernel networking stack with userspace networking powered by DPDK (kernel-bypass).
It is said that kernel-bypass technologies avoid the kernel because it is "slow", but in reality much of the performance advantage they bring comes simply from enforcing certain constraints.
As it turns out, many of these constraints can be enforced without bypassing the kernel. If the system is tuned just right, one can achieve performance that approaches kernel-bypass speeds while still benefiting from the kernel's battle-tested compatibility and rich ecosystem of tools.
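One example of such a constraint is per-core connection sharding: each CPU owns a listening socket and handles its connections end to end, which is the shared-nothing layout that kernel-bypass frameworks enforce by construction. A kernel-based server can approximate it with SO_REUSEPORT plus CPU affinity. The Python sketch below is my own illustration of the idea, not the benchmark code from the talk; the port and worker count are arbitrary:

    import os
    import socket

    PORT = 8080                      # arbitrary example port
    WORKERS = os.cpu_count() or 1

    def serve(cpu: int) -> None:
        # Pin this worker to a single CPU so its socket and application
        # logic stay core-local (one of the constraints kernel-bypass
        # stacks impose by design).
        os.sched_setaffinity(0, {cpu})

        # SO_REUSEPORT (Linux >= 3.9) gives every worker its own accept
        # queue for the same port; the kernel spreads connections across them.
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind(("0.0.0.0", PORT))
        sock.listen(1024)
        while True:
            conn, _ = sock.accept()
            conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
            conn.close()

    if __name__ == "__main__":
        for cpu in range(WORKERS):
            if os.fork() == 0:
                serve(cpu)
        for _ in range(WORKERS):
            os.wait()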
P2P Container Image Distribution on IPFS With containerd and nerdctl (Kohei Tokunaga)
Talked at FOSDEM 2022 about IPFS-based P2P image distribution with containerd and nerdctl (February 6, 2022).
https://fosdem.org/2022/schedule/event/container_ipfs_image/
nerdctl is a Docker-compatible CLI for containerd, developed as a subproject of containerd. nerdctl recently added support for P2P image distribution on IPFS. This makes it possible to share container images among hosts without hosting or relying on a registry.
In this session, Kohei, one of the maintainers of nerdctl, will introduce IPFS-based P2P image distribution with containerd and nerdctl. The session will also show how IPFS-based distribution can be combined with existing image distribution techniques, focusing on lazy pulling (eStargz) and image encryption (OCIcrypt). The status of integration work with other tools, including Kubernetes, will also be shared.
Related blog post: "P2P Container Image Distribution on IPFS With Containerd", https://medium.com/nttlabs/nerdctl-ipfs-975569520e3d
The document discusses using the Storage Performance Development Kit (SPDK) to optimize Ceph performance. SPDK provides userspace libraries and drivers to unlock the full potential of Intel storage technologies. It summarizes current SPDK support in Ceph's BlueStore backend and proposes leveraging SPDK further to accelerate Ceph's block services through optimized SPDK targets and caching. Collaboration is needed between the SPDK and Ceph communities to fully realize these optimizations.
This document provides an agenda and overview for a hands-on lab on using DPDK in containers. It introduces Linux containers and how they use fewer system resources than VMs. It discusses how containers still use the kernel network stack, which is not ideal for SDN/NFV usages, and how DPDK can be used in containers to address this. The hands-on lab section guides users through building DPDK and Open vSwitch, configuring them to work with containers, and running packet generation and forwarding using testpmd and pktgen Docker containers connected via Open vSwitch.
- The document discusses Linux network stack monitoring and configuration. It begins with definitions of key concepts like RSS, RPS, RFS, LRO, GRO, DCA, XDP and BPF.
- It then provides an overview of how the network stack works, from the hardware interrupt and driver level up through routing and TCP/IP to the socket level.
- Monitoring tools like ethtool, ftrace and /proc/interrupts are described for viewing hardware statistics, software stack traces and interrupt information.
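As a small illustration of the kind of data these interfaces expose (my sketch, not code from the slides), the snippet below tallies per-CPU counts from /proc/interrupts and prints the busiest interrupt sources; this is a quick way to check how well RSS is spreading NIC queues across cores:

    def irq_counts(path="/proc/interrupts"):
        # Each data row is "IRQ:" followed by one counter per CPU and a
        # free-form description of the interrupt source.
        with open(path) as f:
            cpus = f.readline().split()          # header: CPU0 CPU1 ...
            for line in f:
                fields = line.split()
                if not fields or not fields[0].endswith(":"):
                    continue
                counts = []
                for tok in fields[1:1 + len(cpus)]:
                    if not tok.isdigit():
                        break
                    counts.append(int(tok))
                desc = " ".join(fields[1 + len(counts):])
                yield fields[0].rstrip(":"), counts, desc

    if __name__ == "__main__":
        rows = sorted(irq_counts(), key=lambda r: -sum(r[1]))[:10]
        for irq, counts, desc in rows:
            print(f"{irq:>5}  total={sum(counts):>12}  per-cpu={counts}  {desc}")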
Video: https://www.youtube.com/watch?v=FJW8nGV4jxY and https://www.youtube.com/watch?v=zrr2nUln9Kk. Tutorial slides for O'Reilly Velocity SC 2015, by Brendan Gregg.
There are many performance tools nowadays for Linux, but how do they all fit together, and when do we use them? This tutorial explains methodologies for using these tools, and provides a tour of four tool types: observability, benchmarking, tuning, and static tuning. Many tools will be discussed, including top, iostat, tcpdump, sar, perf_events, ftrace, SystemTap, sysdig, and others, as well as observability frameworks in the Linux kernel: PMCs, tracepoints, kprobes, and uprobes.
This tutorial updates and extends an earlier talk that summarized the Linux performance tool landscape. The value of this tutorial is not just learning that these tools exist and what they do, but hearing when and how they are used by a performance engineer to solve real-world problems, context that is typically not included in the standard documentation.
This document outlines the process flow for receiving a packet on a network interface, passing it through various networking stacks in the kernel, and delivering it to a socket or application. Key steps include:
1) The packet is received by the network interface driver and passed to netif_receive_skb.
2) It then goes through processing such as checksum verification, filtering by iptables, and defragmentation if needed.
3) The packet is then routed and delivered to the appropriate socket using functions like ip_local_deliver.
4) Data from the packet is then placed into the receive queue for the socket's application to read.
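From the application's point of view, the end of this path is simply a blocking read on a socket whose receive queue the kernel has filled. A minimal sketch (my illustration; the port is arbitrary):

    import socket

    # Bind a UDP socket; the kernel receive path described above ends with
    # datagrams being queued on this socket's receive queue.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", 9000))

    while True:
        data, peer = sock.recvfrom(2048)   # dequeues from the receive queue
        print(f"{len(data)} bytes from {peer}")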
Mainly an introduction to the paper "Weak Consistency: A Generalized Theory and Optimistic Implementations for Distributed Transactions".
https://pmg.csail.mit.edu/pubs/adya99__weak_consis-abstract.html
This document discusses the internals of WalB Driver, which is a data storage driver developed by Cybozu Lab. It records only redo logs, not undo logs, to avoid performance degradation. WalB completes I/O operations by just writing redo logs to log storage, without needing to read current data or generate undo logs. This allows it to overlap and parallelize log flushing and data I/O for efficient write performance.
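The redo-only design can be illustrated with a toy write path (a sketch under assumed file names such as log.dev and data.dev, not WalB's actual on-disk format): a write is acknowledged as soon as its redo record is durable in the log, and the data device is updated afterwards in the background, so neither reads of the old data nor undo records are ever needed.

    import os
    import queue
    import struct
    import threading

    LOG, DATA = "log.dev", "data.dev"   # stand-ins for the log and data devices
    pending = queue.Queue()

    def submit_write(log, offset, payload):
        # 1. Append and flush a redo record (offset, length, payload);
        #    the write can be acknowledged here, with no read of the old
        #    data and no undo record.
        log.write(struct.pack("<QI", offset, len(payload)) + payload)
        log.flush()
        os.fsync(log.fileno())
        # 2. The data-device update proceeds later, overlapping with
        #    further log appends.
        pending.put((offset, payload))

    def apply_worker():
        # Background application of logged writes to the data device.
        with open(DATA, "r+b") as data:
            while True:
                offset, payload = pending.get()
                data.seek(offset)
                data.write(payload)
                pending.task_done()

    if __name__ == "__main__":
        with open(DATA, "wb") as f:
            f.truncate(1 << 20)          # 1 MiB toy data device
        threading.Thread(target=apply_worker, daemon=True).start()
        with open(LOG, "ab") as log:
            submit_write(log, 4096, b"hello")
            submit_write(log, 8192, b"world")
        pending.join()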
The document introduces an algorithm called B2ST (Big tree, Big string Suffix Tree construction) for constructing suffix trees over data larger than main memory. B2ST splits the input string into partitions that fit in memory, sorts suffixes within partition pairs using suffix arrays with LCP information, and merges the results by building a suffix tree from the on-disk suffix array streams and order arrays in a single pass, without reloading the entire input.
The document introduces two linear-time algorithms for constructing a suffix array: SA-IS and SA-DS. SA-IS uses induced sorting of LMS (leftmost S-type) substrings, while SA-DS uses radix sorting of fixed-length d-critical substrings. The document provides pseudocode for both algorithms and explains the terms and data structures used, including LMS suffixes, L-type and S-type characters, and the buckets used for sorting.
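For reference, a suffix array (the structure both algorithms build, and which B2ST consumes together with LCP values) can be constructed naively as follows. This sketch only illustrates the definitions; it is my own example, not the linear-time algorithms described in the document:

    def suffix_array(s: str):
        # Naive construction: sort suffix start positions lexicographically.
        return sorted(range(len(s)), key=lambda i: s[i:])

    def lcp_array(s: str, sa):
        # LCP[i] = length of the longest common prefix of the suffixes
        # starting at sa[i-1] and sa[i].
        def lcp(a: str, b: str) -> int:
            n = 0
            while n < len(a) and n < len(b) and a[n] == b[n]:
                n += 1
            return n
        return [0] + [lcp(s[sa[i - 1]:], s[sa[i]:]) for i in range(1, len(sa))]

    if __name__ == "__main__":
        text = "mississippi"
        sa = suffix_array(text)
        print(sa)                  # [10, 7, 4, 1, 0, 9, 8, 6, 3, 5, 2]
        print(lcp_array(text, sa))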
An Efficient Backup and Replication of Storage (Takashi Hoshino)
This document describes WalB, a Linux kernel device driver that provides efficient backup and replication of storage using block-level write-ahead logging (WAL). It has negligible performance overhead and avoids issues like fragmentation. WalB works by wrapping a block device and writing redo logs to a separate log device. It then extracts diffs for backup/replication. The document discusses WalB's architecture, algorithm, performance evaluation and future work.
WalB is a block device driver that uses write-ahead logging (WAL) to provide efficient incremental backups. It aims to address the lack of a good backup solution that works online, with low overhead, across various applications, and using commodity hardware and free software. WalB acts as a wrapper device that logs writes to a separate log device to enable consistent incremental backups of the data device.
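To make the "extract diffs from the log" step concrete, the sketch below reuses the toy record format from the write-path sketch earlier in this document (again my illustration, not WalB's real log layout): redo records written since the last backup are read back from the log and replayed onto a replica image.

    import struct

    def read_log_records(path="log.dev"):
        # Parse the toy format: 8-byte offset, 4-byte length, then payload.
        with open(path, "rb") as log:
            while True:
                header = log.read(12)
                if len(header) < 12:
                    return
                offset, length = struct.unpack("<QI", header)
                yield offset, log.read(length)

    def replicate(records, replica_path="replica.dev"):
        # Replaying redo records onto another image reproduces the writes;
        # shipping only the records since the previous backup gives an
        # incremental diff.
        with open(replica_path, "r+b") as replica:
            for offset, payload in records:
                replica.seek(offset)
                replica.write(payload)

    if __name__ == "__main__":
        with open("replica.dev", "wb") as f:
            f.truncate(1 << 20)
        replicate(read_log_records())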