I/O Performance HOWTO

Sharon Snider [mailto:snidersd@us.ibm.com snidersd@us.ibm.com]

Publication date: v1.1, 05/2002

This HOWTO covers information on available patches for the 2.4 kernel that can improve the I/O performance of your Linux™ operating system.

Revision History:

||v1.1||2002-05-01||sds||Updated technical information and links.||
||v1.0||2002-04-01||sds||Wrote and converted to DocBook XML.||

Distribution Policy

The I/O Performance-HOWTO is copyrighted © 2002 by IBM Corporation.

Permission is granted to copy, distribute, and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation with no Invariant Sections, no Front-Cover text, and no Back-Cover text. A copy of the license can be found at [http://www.gnu.org/licenses/fdl.txt ].

Introduction

This HOWTO provides information on improving the input/output (I/O) performance of the Linux operating system for the 2.4 kernel. Additional patches will be added as they become available.

Please send any comments or contributions via e-mail to [mailto:snidersd@us.ibm.com Sharon Snider].

Avoiding Bounce Buffers

This section provides information on applying and using the bounce buffer patch on the Linux 2.4 kernel. The bounce buffer patch, written by Jens Axboe, enables device drivers that support direct memory access (DMA) I/O to high-address physical memory to avoid bounce buffers.

This document provides a brief overview of memory and addressing in the Linux kernel, followed by information on why and how to make use of the bounce buffer patch.

Memory and Addressing in the Linux 2.4 Kernel

The Linux 2.4 kernel includes configuration options for specifying the amount of physical memory in the target computer. By default, the configuration is limited to the amount of memory that can be directly mapped into the kernel's virtual address space starting at PAGE_OFFSET. On i386 systems the default mapping scheme limits kernel-mode addressability to the first gigabyte (GB) of physical memory, also known as low memory. Conversely, high memory is normally the memory above 1 GB. High memory is not directly accessible or permanently mapped by the kernel. Support for high memory is an option that is enabled during configuration of the Linux kernel.

The Problem with Bounce Buffers

When DMA I/O is performed to or from high memory, an area known as a bounce buffer is allocated in low memory. When data travels between a device and high memory, it is first copied through the bounce buffer.

Systems with a large amount of high memory and intense I/O activity can create a large number of bounce buffers, which can cause memory shortage problems. In addition, the excessive number of bounce buffer data copies can lead to performance degradation.

Peripheral Component Interconnect (PCI) devices normally address up to 4 GB of physical memory. When a bounce buffer is used for high memory that is below 4 GB, time and memory are wasted because the peripheral has the ability to address that memory directly. Using the bounce buffer patch can decrease, and possibly eliminate, the use of bounce buffers.
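The decision described above can be modeled with a little arithmetic. The following is only an illustrative user-space sketch, not kernel code: the function names are hypothetical, the 1 GB low-memory boundary follows the simplified description in this HOWTO, and the 4 GB limit assumes a typical 32-bit PCI device. It also looks only at the start address of a buffer and ignores its length.

```python
GB = 1 << 30

# Assumed boundaries for this illustration only.
LOW_MEMORY_LIMIT = 1 * GB   # low/high memory boundary as described in the text
PCI_DMA_LIMIT = 4 * GB      # what a typical 32-bit PCI device can address


def needs_bounce_unpatched(phys_addr):
    """Without the patch, any buffer in high memory goes through a bounce buffer."""
    return phys_addr >= LOW_MEMORY_LIMIT


def needs_bounce_patched(phys_addr, device_dma_limit=PCI_DMA_LIMIT):
    """With the patch and an enabled driver, only memory the device
    cannot address directly still needs a bounce buffer."""
    return phys_addr >= device_dma_limit


if __name__ == "__main__":
    for addr in (GB // 2, 2 * GB, 6 * GB):
        print(f"{addr / GB:4.1f} GB  "
              f"unpatched={needs_bounce_unpatched(addr)}  "
              f"patched={needs_bounce_patched(addr)}")
```

In this model, a buffer at 2 GB is the wasteful case: it is bounced without the patch even though the device could reach it directly.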

Locating the Patch

The bounce buffer patch is available at [http://kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/ ].

Configuring the Linux Kernel to Avoid Bounce Buffers

This section includes information on configuring the Linux kernel to avoid bounce buffers. The Linux Kernel-HOWTO at [http://www.linuxdoc.org/HOWTO/Kernel-HOWTO.html ] explains the process of re-compiling the Linux kernel.

Enabled Device Drivers

The bounce buffer patch provides the kernel infrastructure, as well as the SCSI and IDE mid-level driver modifications to support DMA I/O to high memory. Updates for several device drivers to make use of the added support are also included with the patch.

If the bounce buffer patch is applied and you configure the kernel to support high memory I/O, many IDE configurations and the device drivers listed below perform DMA I/O without the use of bounce buffers:

aic7xxx_drv.o, aic7xxx_old.o, cciss.o, cpqarray.o, megaraid.o, qlogicfc.o, sym53c8xx.o

Modifying Your Device Driver to Avoid Bounce Buffers

If your device driver is not listed above in the Enabled Device Drivers section, and the device is capable of high-memory DMA I/O, you can modify your device driver to make use of the bounce buffer patch. More information on rebuilding a Linux device driver is available at [http://www.xml.com/ldd/chapter/book/index.html ].

Raw I/O Variable-Size Optimization Patch

This section provides information on the raw I/O variable-size optimization patch for the Linux 2.4 kernel written by Badari Pulavarty. This patch is also known as the RAW VARY or PAGESIZE_io patch.

The raw I/O variable-size patch changes the block size used for raw I/O from hardsect_size (normally 512 bytes) to 4 kilobytes (KB). The patch improves I/O throughput and CPU utilization by reducing the number of buffer heads needed for raw I/O operations.
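To get a feel for the reduction, the sketch below counts how many block-sized pieces a raw I/O request breaks into. It is a simplified model under the assumption that each block needs one buffer head; the function name and the 64 KB request size are chosen only for illustration.

```python
def buffer_heads_needed(request_bytes, block_size):
    """Blocks (and, in this model, buffer heads) needed to cover a request."""
    return -(-request_bytes // block_size)  # ceiling division


if __name__ == "__main__":
    request = 64 * 1024  # hypothetical 64 KB raw I/O request
    before = buffer_heads_needed(request, 512)    # hardsect_size blocks
    after = buffer_heads_needed(request, 4096)    # 4 KB blocks with the patch
    print(f"512-byte blocks: {before} buffer heads")
    print(f"4 KB blocks:     {after} buffer heads")
```

Under these assumptions the same request needs one eighth as many buffer heads, which is where the throughput and CPU utilization gains come from.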

Locating the Patch

You can download the patch from one of the following locations:

Andrea Arcangeli has made the patch available at [http://www.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.18pre7aa2/ ]. The name of the file is 10_rawio-vary-io-1.

Alan Cox has included the patch in the 2.4.18pre9-ac2 kernel patch. The patch is available at [http://www.kernel.org/pub/linux/kernel/people/alan/linux-2.4/2.4.18/ ].

The patch is available from SourceForge at [http://sourceforge.net/projects/lse/io ]. The latest version is PAGESIZE_io-2.4.17.patch.

Modifying Your Driver for the Raw I/O Variable-Size Optimization Patch

In previous versions of this patch, changes were enabled for all drivers. However, the 2.4.17 and later versions of the patch enable the changes only for the Adaptec, Qlogic ISP1020, and IBM ServeRAID drivers. All other drivers for version 2.4.17 and later must be modified to make use of the patch by setting the can_do_varyio bit in the Scsi_Host_Template structure.

I/O Request Lock Patch

This section provides information on the I/O request lock patch, also known as the SCSI concurrent queuing patch (sior1), written by Johnathan Lahr.

The I/O request lock patch improves SCSI I/O performance on Linux 2.4 multi-processor systems by providing concurrent I/O request queuing. Significant I/O performance and CPU utilization improvements are possible when multiple processors can concurrently drive multiple block devices.

Before the patch is applied, block I/O requests are queued one at a time while holding the global spin lock, io_request_lock. Once the patch is applied, SCSI requests are queued while holding the lock specific to the queue associated with the request. Requests that are made to different devices are queued concurrently, and requests that are made to the same device are queued serially.
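The change in locking granularity can be pictured with a small user-space model. This is a sketch only, not the patch itself: GLOBAL_LOCK stands in for io_request_lock, and the RequestQueue class, lock_for, and enqueue are hypothetical names introduced for the illustration.

```python
import threading

GLOBAL_LOCK = threading.Lock()  # stands in for the global io_request_lock


class RequestQueue:
    """Hypothetical per-device request queue with its own lock."""

    def __init__(self, device):
        self.device = device
        self.lock = threading.Lock()  # queue-specific lock used after the patch
        self.requests = []


def lock_for(queue, patched):
    """Pick the lock that guards queuing: per-queue if patched, global if not."""
    return queue.lock if patched else GLOBAL_LOCK


def enqueue(queue, request, patched):
    """Queue a request while holding the appropriate lock."""
    with lock_for(queue, patched):
        queue.requests.append(request)


if __name__ == "__main__":
    sda, sdb = RequestQueue("sda"), RequestQueue("sdb")
    workers = [
        threading.Thread(target=enqueue, args=(q, n, True))
        for q in (sda, sdb) for n in range(100)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(len(sda.requests), len(sdb.requests))
```

In the unpatched case every queue shares one lock, so queuing to different devices is serialized. In the patched case only requests to the same queue contend with each other.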

Locating the Patch

You can download the I/O request lock patch from SourceForge at [http://sourceforge.net/projects/lse/io ]. The latest version is sior1-v1.2416. Patches that enable concurrent queuing for specific drivers are also available at SourceForge. The patch for the Emulex SCSI/FC is lpfc_sior1-v0.249 and the patch for Adaptec SCSI is aic_sior1-v0.249.

Modifying Your Driver for the I/O Request Lock Patch

The I/O request lock patch installs concurrent queuing capability into the SCSI midlayer. Concurrent queuing is activated separately for each SCSI adapter device driver. To activate it for a driver, the concurrent_queue field in the Scsi_Host_Template structure must be set when the driver is registered.

Additional Resources

The following Web sites provide additional information on modifying device drivers and configuring the Linux kernel.

Information on Dynamic DMA mapping is available at [http://lwn.net/2001/0712/a/dma-interface.php3 ].

Kernel-HOWTO is available from the Linux Documentation Project at [http://www.linuxdoc.org/HOWTO/Kernel-HOWTO.html ].

Linux Device Drivers, 2nd Edition published by O'Reilly is available online at [http://www.xml.com/ldd/chapter/book/index.html ].

I/O_Performance_HOWTO (last edited 2008-09-18 17:23:12 by SvetoslavChukov)