The cost of small hard drives seems to be forever plummeting. For example, a 9 GB drive was recently introduced for under $5000, a cost per MB that was unthinkable even a year ago! Unfortunately, drive access speeds have not improved as quickly as prices have fallen. Even the new generation of 'AV' hard drives isn't fast enough for large database systems. What can you do to maximize the performance of these small, inexpensive hard drives? More and more companies are considering (and installing) RAID.
RAID stands for Redundant Array of Inexpensive Disks, a name that describes the basic premise quite well: take many small hard drives and organize them so that data is spread across them. This idea is not new, but it was formally defined in 1987 when a Berkeley paper was published describing five levels of architecture. These levels (along with RAID 0) are described below:
RAID 0 -- Technically there is no redundancy at this level, but it does provide a speed advantage over a single disk drive by striping data in parallel sectors across multiple disk drives. The I/O transfer speed is increased for this architecture; however, a single drive failure can result in unrecoverable data loss.
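As a rough illustration of striping, the sketch below maps a logical block number onto a disk and an offset on that disk. The function name stripe_location and the simple round-robin, one-block-per-stripe-unit layout are assumptions made for the example; real controllers choose their own stripe sizes and layouts.

```python
# Hypothetical sketch: round-robin striping with one block per stripe unit.
def stripe_location(logical_block, num_disks):
    """Return (disk index, block offset on that disk) for a logical block."""
    if num_disks < 1:
        raise ValueError("need at least one disk")
    if logical_block < 0:
        raise ValueError("logical block must be non-negative")
    # Consecutive logical blocks land on consecutive disks, so a large
    # sequential transfer keeps every drive busy at the same time.
    return logical_block % num_disks, logical_block // num_disks


for block in range(6):
    disk, offset = stripe_location(block, num_disks=3)
    print(f"logical block {block} -> disk {disk}, offset {offset}")
```

Because every disk holds a share of every large file, losing any one disk damages all of them, which is the data-loss risk noted above.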
RAID 1 -- Independent data paths allow for complete disk duplication or "data mirroring" in this architecture. This level introduces redundancy in the sense that there are two copies of all data; this complete duplication also doubles the cost per megabyte. Transfers are faster than with a single drive because of overlapping reads and parallel writes. Access time can also be improved because either copy of the data can be accessed. Additional costs also derive from custom controllers and/or operating system changes.
RAID 2 -- Level 2 introduces Hamming code error checking across the disks. This introduces the possibility of data recovery without a complete duplication of data, although it does require several check disks. It also requires that all disks in a group be accessed, even for small transfers, and that the transfer wait for the slowest disk to finish before it is complete.
RAID 3 -- The distinguishing feature of level 3 is a single parity drive to accomplish redundancy. This is achieved by interleaving the parity information at the byte level. Typically, the drive spindles are synchronized. It still requires that all disks in a group be accessed, even for small transfers, and that the transfer wait for the slowest disk to finish before it is complete. Spindle synchronization is expensive and often limits the choice of disk elements.
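The way a single parity drive provides redundancy can be sketched in a few lines, assuming the parity is a bytewise XOR of the data drives (the usual choice). The helper name xor_parity and the sample data are invented for the illustration.

```python
from functools import reduce


# Hypothetical sketch: bytewise XOR parity across equal-length blocks.
def xor_parity(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must be the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives and the parity drive computed from them.
data = [b"UNIX", b"RAID", b"DISK"]
parity = xor_parity(data)

# Simulate losing drive 1: XOR the survivors with the parity to rebuild it.
rebuilt = xor_parity([data[0], data[2], parity])
print(rebuilt == data[1])
```

The same XOR that produces the parity also rebuilds any one missing drive, which is why one extra drive is enough to survive a single failure.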
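RAID 4 -- Level 4 introduces the concept of interleaving parity at the sector or transfer level. This permits faster individual disk reads for small transfers, since a small read touches only the disk holding the data, and a small write touches only that disk and the parity disk. Because every write must update the parity check disk, that disk becomes a throughput bottleneck.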
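To see why the parity disk is involved in every write, the sketch below updates XOR parity for a single changed block without reading the other data disks. XOR parity is an assumption here, and the names xor_bytes and updated_parity are hypothetical.

```python
# Hypothetical sketch: read-modify-write parity update for one small write.
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_parity, old_data, new_data):
    """Remove the old block's contribution from the parity, then add the new one."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


disks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_bytes(xor_bytes(disks[0], disks[1]), disks[2])

# Overwrite the block on disk 1; only disk 1 and the parity disk are touched.
new_block = b"ZZZZ"
parity = updated_parity(parity, disks[1], new_block)
disks[1] = new_block

# The incrementally updated parity matches a full recomputation.
full = xor_bytes(xor_bytes(disks[0], disks[1]), disks[2])
print(parity == full)
```

Each small write costs two reads and two writes, and one of each always lands on the parity disk, no matter which data disk is being written.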
RAID 5 -- Parity information is spiraled across all data drives in level 5, which attacks the problem of the parity disk bottleneck. This distributed parity increases write performance, but introduces high overhead to track the location of parity addresses.
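One way to picture parity being spread across the drives is to rotate the parity position from stripe to stripe. The rotation rule below is only one of several possible layouts, chosen for the illustration, and the names parity_disk and layout_row are hypothetical.

```python
# Hypothetical sketch: rotate the parity block one disk to the left per stripe.
def parity_disk(stripe, num_disks):
    """Return the index of the disk holding parity for the given stripe."""
    if num_disks < 2:
        raise ValueError("need at least two disks")
    return (num_disks - 1 - stripe) % num_disks


def layout_row(stripe, num_disks):
    """Return a row of 'D' (data) and 'P' (parity) markers for one stripe."""
    p = parity_disk(stripe, num_disks)
    return ["P" if disk == p else "D" for disk in range(num_disks)]


for stripe in range(4):
    print(stripe, " ".join(layout_row(stripe, 4)))
```

With the parity position moving on each stripe, writes to different stripes update parity on different drives instead of queuing on one.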
If you have a Novell server within your organization, you may already be running RAID level 0 or 1 as they are now an integral part of the Novell Operating System. In the future, RAID will become a standard part of many operating systems. The simple levels of RAID have several disadvantages that may be acceptable in a Novell server, but unacceptable on a high-availability UNIX machine.
Mirroring or duplexing data is simple -- it requires almost no extra hardware (other than the drives and controllers). The main disadvantages of these methods are that disk access is relatively slow, and that the system may be interrupted if a drive fails. A RAID system can offer faster access times and transparent protection against hardware failure. On RAID levels 2-5, however, the faster access times are negated by the need to write parity information either to a separate drive or within the data drives.
RAID 6 -- RAID level 6 attempts to address this problem by allowing the parity drive to be asynchronously updated via an independent bus and cache.
In the late 80's, Storage Computer Corporation (StorComp) set out to improve on RAID level 6. They believed that all data access should have the same privileges as the parity drive in RAID 6. In 1991, they released RAID 7 -- a vast improvement over the existing RAID standards. The level 7 architecture allows each individual drive to access data as fast as it possibly can by incorporating three crucial features:
For further information, you can contact John O'Brien at:
Storage Computer Corporation
11 Riverside Street
Nashua, NH 03062
Tel. 603-880-3005
Fax. 603-889-7232
Manitoba UNIX User Group
7 November 1994