Redundant Array of Independent Disks
RAID comes in several flavors, each with a combination of tradeoffs between cost, performance, redundancy and error recovery.
Most Common Levels
RAID 0: Performance, no redundancy. Data is striped across multiple drives, allowing parallel reads and thus better performance. Striping is done in blocks ranging from 512 bytes to several megabytes.
Note: you can gain some of this advantage without RAID by having multiple SCSI drives and arranging the system temp files, data tables and static HTML files so that typical requests to the server pull data from multiple drives.
RAID 3: Fault tolerance, medium performance. Uses one drive as a parity check for the others.
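RAID 5: Error recovery, low write performance. Similar to RAID 0 in that data is striped in blocks (not at the bit level), but parity blocks are stored as well and are spread across all the drives, so the parity for a given block lives on a different drive from the data it protects. Thus, if a drive fails it can be replaced and rebuilt from the remaining data and parity blocks. Requires at least three disks.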
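The striping and parity ideas above can be sketched in a few lines of Python. This is a toy model, not how any particular controller works: the function names (locate_block, xor_blocks, rebuild_missing) and the four-byte blocks are made up for illustration, and the round-robin layout is just the simplest possible striping scheme.

```python
from __future__ import annotations

from functools import reduce


def locate_block(logical_block: int, num_drives: int) -> tuple[int, int]:
    """RAID 0 style striping: map a logical block number to
    (drive index, block offset on that drive), round-robin."""
    return logical_block % num_drives, logical_block // num_drives


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving: list[bytes]) -> bytes:
    """Recover the single missing block of a stripe by XORing
    every surviving block (data and parity) together."""
    return xor_blocks(surviving)


# Three data blocks of one stripe (hypothetical contents) and their parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the drive holding data[1] has failed, then rebuild its block.
recovered = rebuild_missing([data[0], data[2], parity])
print(recovered)
```

Because XOR is its own inverse, the same operation that computes the parity also regenerates any one lost block, which is why a single failed drive can be rebuilt but two cannot.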
See: http://www.computerworld.com/res/quickstudy.html for a list of quick studies on various technologies, including more details on RAID.
It is important that the disks be nearly identical in order to improve performance and data integrity, but they should not all come from the same manufacturer's lot. One bad lot and all your data is in jeopardy. A good supplier will be able to help you with this.
An external RAID box moves all RAID handling "intelligence" into a controller that sits in the external disk subsystem. The whole subsystem is connected to the host via a normal SCSI controller and appears to the host as a single disk.
This solution uses only a single SCSI channel to transfer data into the computer, which creates a bottleneck. Four SCSI drives can already completely saturate a SCSI bus.
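The bottleneck is simple arithmetic, sketched below. The throughput figures are assumptions chosen for illustration and are not taken from the text above: 80 MB/s is the nominal peak of an Ultra2 Wide SCSI bus, and 20 MB/s is a hypothetical sustained rate per drive. Substitute the numbers for your own hardware.

```python
def bus_utilisation(num_drives: int, drive_mb_per_s: float, bus_mb_per_s: float) -> float:
    """Fraction of the bus bandwidth the drives would demand
    if all of them streamed data at the same time."""
    return num_drives * drive_mb_per_s / bus_mb_per_s


# Assumed figures, for illustration only.
DRIVE_MB_PER_S = 20.0   # hypothetical sustained rate of one drive
BUS_MB_PER_S = 80.0     # nominal peak of an Ultra2 Wide SCSI bus

for drives in range(1, 7):
    load = bus_utilisation(drives, DRIVE_MB_PER_S, BUS_MB_PER_S)
    print(f"{drives} drive(s): {load:.0%} of the bus")
```

Anything above 100% means the drives can deliver data faster than the single channel can carry it, so adding drives to the external box no longer adds throughput.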
DPT's SCSI controllers are a good example of an internal controller-based RAID solution.
The intelligent controller manages the RAID subsystem independently of the host. The advantage over an external subsystem is that the controller can span the RAID subsystem over multiple SCSI channels and so remove the limiting factor of external RAID solutions: the transfer rate over the SCSI bus.
Adaptec's RAID controllers are an example: they have no RAID functionality whatsoever on the controller and depend on external drivers to provide all RAID functionality.
They are basically just several AHA2940 controllers integrated on one card. Linux detects them as AHA2940 and treats them accordingly.
Every OS needs its own special driver for this type of RAID solution, which is error-prone and not very compatible.
Simulating data corruption
RAID assumes that if a write to a disk doesn't return an error, then the write was successful. Therefore, if your disk corrupts data without returning an error, your data will become corrupted. This is of course very unlikely to happen, but it is possible, and it would result in a corrupt filesystem.
RAID cannot and is not supposed to guard against data corruption on the media. Therefore, it makes no sense to purposely corrupt data on a disk (using dd, for example) to see how the RAID system will handle it. Unless you corrupt the RAID superblock, the RAID layer will most likely never find out about the corruption, but the filesystem on the RAID device will be corrupted.
This is the way things are supposed to work. RAID is not a guarantee of data integrity; it just allows you to keep your data if a disk dies (that is, with RAID level 1 or above, of course).
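A small toy model may make the point concrete. It is only a sketch under stated assumptions: the Stripe class and its method names are hypothetical, the read path is assumed to return data blocks without consulting parity, and the consistency check is assumed to be a separate step that a given RAID implementation may or may not offer.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class Stripe:
    """Toy stripe with several data blocks and one parity block."""

    def __init__(self, data):
        self.data = list(data)
        self.parity = xor_blocks(self.data)

    def read(self, index):
        # Assumed normal read path: hand back the stored block as is.
        # Parity is not consulted because no drive reported an error.
        return self.data[index]

    def parity_consistent(self):
        # A separate check can see THAT something is wrong,
        # but not WHICH block is the bad one.
        return xor_blocks(self.data) == self.parity

    def rebuild(self, index):
        # Rebuilding only helps when we already know which drive is bad,
        # which is exactly what a reported drive failure tells us.
        others = [b for i, b in enumerate(self.data) if i != index]
        return xor_blocks(others + [self.parity])


stripe = Stripe([b"good", b"data", b"here"])
stripe.data[1] = b"daXa"  # silent corruption: the drive reports no error

silently_wrong = stripe.read(1)            # corrupted bytes reach the filesystem
mismatch = not stripe.parity_consistent()  # a check notices a mismatch exists
rebuilt = stripe.rebuild(1)                # correct only because we knew the index
```

The rebuild works here only because the sketch knows block 1 is the bad one. A drive that fails outright identifies itself; a drive that silently returns wrong data does not.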
Summary of levels
RAID-0 is the fastest and most efficient array type but offers no fault-tolerance.
RAID-1 is the array of choice for performance-critical, fault-tolerant environments. In addition, RAID-1 is the only choice for fault-tolerance if no more than two drives are desired.
RAID-2 is seldom used today since ECC is embedded in almost all modern disk drives.
RAID-3 can be used in data intensive or single-user environments which access long sequential records to speed up data transfer. However, RAID-3 does not allow multiple I/O operations to be overlapped and requires synchronized-spindle drives in order to avoid performance degradation with short records.
RAID-4 offers no advantages over RAID-5 and does not support multiple simultaneous write operations.
RAID-5 is the best choice in multi-user environments that are not sensitive to write performance. However, at least three, and more typically five, drives are required for RAID-5 arrays.
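The trade-offs in the summary above also show up in how much disk space each level leaves you. The sketch below uses the commonly cited formulas and assumes all drives are the same size; the function name usable_capacity is made up for illustration, RAID-1 is assumed to be a simple two-drive mirror, and RAID-2 is left out.

```python
def usable_capacity(level: int, num_drives: int, drive_size_gb: float) -> float:
    """Usable space in GB for an array of identical drives (simplified model)."""
    if level == 0:
        # Pure striping: every drive holds data.
        if num_drives < 2:
            raise ValueError("RAID-0 needs at least two drives")
        return num_drives * drive_size_gb
    if level == 1:
        # Assumed two-drive mirror: one drive's worth of space.
        if num_drives != 2:
            raise ValueError("this sketch models RAID-1 as a two-drive mirror")
        return drive_size_gb
    if level in (3, 4, 5):
        # One drive's worth of space goes to parity.
        if num_drives < 3:
            raise ValueError(f"RAID-{level} needs at least three drives")
        return (num_drives - 1) * drive_size_gb
    raise ValueError(f"RAID-{level} is not covered by this sketch")


# Hypothetical example: five 100 GB drives.
for level in (0, 5):
    print(f"RAID-{level}: {usable_capacity(level, 5, 100.0):.0f} GB usable")
```

With RAID-5 the parity overhead is always one drive's worth, so the fraction of space lost shrinks as drives are added, which is one reason five-drive arrays are more typical than three-drive ones.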
Also, see http://support.usdatacenters.com/serversetup.htm for a nice tutorial on web-server setup.
How to repair a basic RAID-5 volume (stripe set with parity)
1. Open Disk Management.
2. Right-click the RAID-5 volume you want to repair, and then click Repair Volume.
The RAID-5 volume's status should change to Regenerating, then Healthy. If the volume does not return to the Healthy status, right-click the volume and then click Regenerate Parity.
The BIG raid reference: http://www.pcguide.com/ref/hdd/perf/raid/conf/drive.htm
Good diagrams of RAID 0 through 53 : http://www.acnc.com/04_01_00.html
IBM's RAID overview with pictorial diagrams : http://www.storage.ibm.com/hardsoft/diskdrls/ramref/ramref5.htm
SCSI and RAID : http://www.uni-mainz.de/~neuffer/scsi/
Formulas to find your usable disk space : http://www.penguinmagazine.com/Magazine/This_Issue/0011
Dell's guide on RAID : http://www.dell.com/us/en/esg/topics/power_ps3q99-raid.htm
Comparing RAID 3 v. RAID 5 : http://www.networkbuyersguide.com/search/294003.htm
Why RAID 5 is slow on writes : http://www.winntmag.com/Articles/Index.cfm?ArticleID=8255
How to set up RAID 5 on Windows 2000 server : http://support.microsoft.com/default.aspx?scid=kb;EN-US;q303237
Contributors: Lauren Clarke, Paul Jordan
( Topic last updated: 2001.12.07 02:02:28 AM )