RAID
- The Basics of RAID
What is
RAID?
RAID stands for
"Redundant Array of Independent
Disks". There are six levels of
RAID: level 0 - level 5. Each level
supports a different storage layout
scheme on the disk drives, from mirroring
to parity striping. All drives in a RAID
set MUST BE the exact same model of drive.
What is
meant by redundancy, parity or parity
data?
Redundancy means
that there is protection against any
single disk failure. Parity data is
information used by a RAID system to
rebuild the data on a disk in the event
of a failure. Parity data is created by
using a logical exclusive-OR (XOR) on
actual user data and storing the result
on disk. Example: In an array of 5 drives,
4 drives are used as the storage devices
and the 5th as the parity drive. Data on
the first sector of each of the 4 data
drives is XORed, creating parity data that
is stored on the first sector of the
parity drive. The same holds true for the
second sector onward.
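The parity example above can be sketched in a few lines of Python. This is only an illustration: the sector contents are made-up two-byte values and the helper name xor_blocks is hypothetical, not part of any RAID product. It shows that XORing the surviving sectors with the parity sector reproduces the sector of a failed drive.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Made-up contents of the first sector on each of the 4 data drives.
data = [b"\x0f\xa0", b"\x33\x0c", b"\x55\x11", b"\xf0\x01"]

# Parity sector, as it would be stored on the 5th (parity) drive.
parity = xor_blocks(data)

# Simulate the failure of data drive 2: XOR the surviving data
# sectors with the parity sector to rebuild the missing one.
survivors = data[:2] + data[3:]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == data[2])
```

The same routine serves for both creating parity and rebuilding data, because XOR is its own inverse.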
What
are the different RAID levels and what do
they support?
Level 0 :
Disk Striping - data is transferred in
parallel across an array of disks.
Redundancy is not provided in this level.
Level 1 :
Disk Mirroring - duplicate contents of
one disk are written onto another disk.
Level 2 :
Bit interleaving data across multiple
disks with parity information created
using a Hamming code. A Hamming code
detects errors that occur and determines
which part is in error. RAID level 2
specifies 39 disks with 32 disks of user
storage and 7 disks of error recovery
coding.
Level 3 :
Data is striped across multiple drives
and parity is written to a dedicated
drive. Level 3 is typically implemented
at the BYTE level.
Level 4 :
Data is striped across multiple drives
and parity is written to a dedicated
drive. Level 4 is typically implemented
at the BLOCK level.
Level 5 :
Error correction data is striped at the
block level across all the drives in the
array. Reads and writes may be performed
concurrently.
What
are the minimum requirements to run a
RAID set?
If you are
implementing RAID level 0, you will need
a minimum of 2 disk drives to create a
rank, a RAID controller card, and Windows
NT. If you are implementing RAID level 3
or 5, then you will need a minimum of 3
disks to create a rank. RAID is also
supported in some Novell and Unix
environments.
Detailed
RAID From
the Linux high performance SCSI
& RAID
page.
In 1987, Patterson,
Gibson and Katz at the University of
California Berkeley, published a paper
entitled "A Case for Redundant
Arrays of Inexpensive Disks (RAID)".
This paper described various types of
disk arrays, referred to by the acronym
RAID. The basic idea of RAID was to
combine multiple small, inexpensive disk
drives into an array of disk drives which
yields performance exceeding that of a
Single Large Expensive Drive (SLED).
Additionally, this array of drives
appears to the computer as a single
logical storage unit or drive.
The Mean Time
Between Failure (MTBF) of the array will
be equal to the MTBF of an individual
drive, divided by the number of drives in
the array. Because of this, the MTBF of
an array of drives would be too low for
many application requirements. However,
disk arrays can be made fault-tolerant by
redundantly storing information in
various ways.
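As a rough worked example of the MTBF rule above, the sketch below simply divides the per-drive MTBF by the number of drives. It assumes identical drives that fail independently, and the 500,000-hour figure is a hypothetical value chosen for illustration, not a number from the original page.

```python
def array_mtbf_hours(drive_mtbf_hours, drive_count):
    """MTBF of a non-redundant array: drive MTBF divided by drive count.

    Assumes identical drives that fail independently of one another.
    """
    if drive_count < 1:
        raise ValueError("an array needs at least one drive")
    return drive_mtbf_hours / drive_count

# Hypothetical drive rated at 500,000 hours, used in a 5-drive array.
print(array_mtbf_hours(500_000, 5))  # 100000.0
```

Under these assumptions, a five-drive array without redundancy is expected to suffer a failure five times as often as a single drive.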
Five types of array
architectures, RAID-1 through RAID-5,
were defined by the Berkeley paper, each
providing disk fault-tolerance and each
offering different trade-offs in features
and performance. In addition to these
five redundant array architectures, it
has become popular to refer to a
non-redundant array of disk drives as a
RAID-0 array.
Data
Striping
Fundamental to RAID
is "striping", a method of
concatenating multiple drives into one
logical storage unit. Striping involves
partitioning each drive's storage space
into stripes which may be as small as one
sector (512 bytes) or as large as several
megabytes. These stripes are then
interleaved round-robin, so that the
combined space is composed alternately of
stripes from each drive. In effect, the
storage space of the drives is shuffled
like a deck of cards. The type of
application environment, I/O or data
intensive, determines whether large or
small stripes should be used.
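The round-robin interleaving described above amounts to a small piece of arithmetic. The following sketch maps a logical sector number to a drive and a sector on that drive. The function name and the stripe size of 8 sectors are assumptions made for illustration; real controllers may number drives and stripes differently.

```python
def locate(logical_sector, drive_count, stripe_sectors):
    """Map a logical sector to (drive index, sector on that drive).

    Stripes are dealt out to the drives round-robin, like cards.
    """
    stripe_index, offset_in_stripe = divmod(logical_sector, stripe_sectors)
    row, drive = divmod(stripe_index, drive_count)
    return drive, row * stripe_sectors + offset_in_stripe

# Hypothetical array: 4 drives, stripes of 8 sectors (4KB) each.
for sector in (0, 8, 35):
    print(sector, locate(sector, drive_count=4, stripe_sectors=8))
```

With a stripe size of one sector, consecutive sectors land on consecutive drives, so a long record is spread over the whole array. With large stripes, a short record usually stays on one drive.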
Most multi-user
operating systems today, like NT, Unix
and Netware, support overlapped disk I/O
operations across multiple drives.
However, in order to maximize throughput
for the disk subsystem, the I/O load must
be balanced across all the drives so that
each drive can be kept busy as much as
possible. In a multiple drive system
without striping, the disk I/O load is
never perfectly balanced. Some drives
will contain data files which are
frequently accessed and some drives will
only rarely be accessed. In I/O intensive
environments, performance is optimized by
striping the drives in the array with
stripes large enough so that each record
potentially falls entirely within one
stripe. This ensures that the data and
I/O will be evenly distributed across the
array, allowing each drive to work on a
different I/O operation, and thus
maximize the number of simultaneous I/O
operations which can be performed by the
array.
In data intensive
environments and single-user systems
which access large records, small stripes
(typically one 512-byte sector in length)
can be used so that each record will span
across all the drives in the array, each
drive storing part of the data from the
record. This causes long record accesses
to be performed faster, since the data
transfer occurs in parallel on multiple
drives. Unfortunately, small stripes rule
out multiple overlapped I/O operations,
since each I/O will typically involve all
drives. However, operating systems like
DOS, which do not allow overlapped disk
I/O, will not be negatively impacted.
Applications such as on-demand
video/audio, medical imaging and data
acquisition, which utilize long record
accesses, will achieve optimum
performance with small stripe arrays.
A potential
drawback to using small stripes is that
synchronized spindle drives are required
in order to keep performance from being
degraded when short records are accessed.
Without synchronized spindles, each drive
in the array will be at different random
rotational positions. Since an I/O cannot
be completed until every drive has
accessed its part of the record, the
drive which takes the longest will
determine when the I/O completes. The
more drives in the array, the more the
average access time for the array
approaches the worst case single-drive
access time. Synchronized spindles assure
that every drive in the array reaches its
data at the same time. The access time of
the array will thus be equal to the
average access time of a single drive
rather than approaching the worst case
access time.
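The effect of unsynchronized spindles can be estimated with a simple model. The model is an assumption of this sketch, not something stated in the original text: each drive's rotational delay is taken to be independent and uniformly distributed over one revolution, and seek time is ignored. Under that model the expected delay of the slowest of n drives is n/(n+1) of a revolution, against 1/2 of a revolution for a single drive or for synchronized spindles.

```python
def expected_wait_fraction(drive_count):
    """Expected rotational delay, in revolutions, of the slowest drive.

    Assumes independent delays, each uniform over one revolution,
    so the expected maximum of n delays is n / (n + 1).
    """
    if drive_count < 1:
        raise ValueError("an array needs at least one drive")
    return drive_count / (drive_count + 1)

for n in (1, 2, 4, 8):
    print(n, f"{expected_wait_fraction(n):.3f}")
```

The value starts at the single-drive average of half a revolution and climbs toward a full revolution, the worst case, as drives are added.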
The
different RAID levels
- RAID-0
- RAID Level 0
is not redundant, hence does not
truly fit the "RAID"
acronym. In level 0, data is
split across drives, resulting in
higher data throughput. Since no
redundant information is stored,
performance is very good, but the
failure of any disk in the array
results in data loss. This level
is commonly referred to as
striping.
- RAID-1
- RAID Level 1
provides redundancy by writing
all data to two or more drives.
The performance of a level 1
array tends to be faster on reads
and slower on writes compared to
a single drive, but if either
drive fails, no data is lost.
This is a good entry-level
redundant system, since only two
drives are required; however,
since one drive is used to store
a duplicate of the data, the cost
per megabyte is high. This level
is commonly referred to as
mirroring.
- RAID-2
- RAID Level 2,
which uses Hamming error
correction codes, is intended for
use with drives which do not have
built-in error detection. All
SCSI drives support built-in
error detection, so this level is
of little use when using SCSI
drives.
- RAID-3
- RAID Level 3
stripes data at a byte level
across several drives, with
parity stored on one drive. It is
otherwise similar to level 4.
Byte-level striping requires
hardware support for efficient
use.
- RAID-4
- RAID Level 4
stripes data at a block level
across several drives, with
parity stored on one drive. The
parity information allows
recovery from the failure of any
single drive. The performance of
a level 4 array is very good for
reads (the same as level 0).
Writes, however, require that
parity data be updated each time.
This slows small random writes,
in particular, though large
writes or sequential writes are
fairly fast. Because only one
drive in the array stores
redundant data, the cost per
megabyte of a level 4 array can
be fairly low.
- RAID-5
- RAID Level 5
is similar to level 4, but
distributes parity among the
drives. This can speed small
writes in multiprocessing
systems, since the parity disk
does not become a bottleneck.
Because parity data must be
skipped on each drive during
reads, however, the performance
for reads tends to be
considerably lower than a level 4
array. The cost per megabyte is
the same as for level 4.
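The difference between levels 4 and 5 on small writes can be sketched as follows. A small write updates parity by XORing the old parity with the old and new data, so every such write touches the drive that holds the parity for that stripe. The rotation rule used for level 5 below is only one possible layout, chosen for illustration. Actual implementations differ, and the function names are hypothetical.

```python
def update_parity(old_parity, old_data, new_data):
    """New parity after overwriting one data block (read-modify-write)."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))

def parity_drive(stripe_row, drive_count, level):
    """Index of the drive holding parity for a given stripe row."""
    if level == 4:
        return drive_count - 1                       # dedicated parity drive
    if level == 5:
        # Assumed rotation: parity moves one drive to the left per row.
        return (drive_count - 1 - stripe_row) % drive_count
    raise ValueError("only levels 4 and 5 are modelled here")

# Parity of three made-up one-byte blocks, then one block is rewritten.
a, b, c = b"\x0f", b"\x33", b"\x55"
parity = bytes(x ^ y ^ z for x, y, z in zip(a, b, c))
print(update_parity(parity, b, b"\xff").hex())

# Which drive takes the parity write for rows 0..4 of a 5-drive array?
print([parity_drive(r, 5, level=4) for r in range(5)])
print([parity_drive(r, 5, level=5) for r in range(5)])
```

In the level 4 case every row names the same drive, which is why that drive becomes the bottleneck for concurrent small writes. In the level 5 case the parity writes are spread over all drives.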
RAID-0 is the
fastest and most efficient array type but
offers no fault-tolerance.
RAID-1 is the array
of choice for performance-critical,
fault-tolerant environments. In addition,
RAID-1 is the only choice for
fault-tolerance if no more than two
drives are desired.
RAID-2 is seldom
used today since ECC is embedded in
almost all modern disk drives.
RAID-3 can be used
in data intensive or single-user
environments which access long sequential
records to speed up data transfer.
However, RAID-3 does not allow multiple
I/O operations to be overlapped and
requires synchronized-spindle drives in
order to avoid performance degradation
with short records.
RAID-4 offers no
advantages over RAID-5 and does not
support multiple simultaneous write
operations.
RAID-5 is the best
choice in multi-user environments which
are not write performance sensitive.
However, at least three, and more
typically five drives are required for
RAID-5 arrays.
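The cost trade-offs summarized above can be made concrete by computing what fraction of the raw capacity is usable at each level. This is a simplified sketch under stated assumptions: all drives are the same size, a level 1 array keeps a full copy on every drive, and level 2 is left out. The minimum drive counts for levels 0, 1, 3 and 5 follow the text; the minimum for level 4 is assumed to match level 3.

```python
def usable_fraction(level, drive_count):
    """Fraction of raw capacity available for user data."""
    minimum = {0: 2, 1: 2, 3: 3, 4: 3, 5: 3}
    if level not in minimum:
        raise ValueError("level not modelled in this sketch")
    if drive_count < minimum[level]:
        raise ValueError("too few drives for this level")
    if level == 0:
        usable = drive_count          # no redundancy
    elif level == 1:
        usable = 1                    # every drive holds the same data
    else:
        usable = drive_count - 1      # one drive's worth of parity
    return usable / drive_count

for level, drives in ((0, 4), (1, 2), (5, 3), (5, 5)):
    print(level, drives, usable_fraction(level, drives))
```

A two-drive mirror gives up half of its capacity, while a parity array gives up one drive's worth, so the overhead shrinks as drives are added.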
Possible
approaches to RAID
Hardware RAID
The hardware based system manages the
RAID subsystem independently from the
host and presents to the host only a
single disk per RAID array. This way the
host doesn't have to be aware of the RAID
subsystem(s).
- The controller
based hardware solution
DPT's SCSI controllers are a good
example of a controller based
RAID solution.
The intelligent controller manages
the RAID subsystem independently
from the host. The advantage over
an external SCSI---SCSI RAID
subsystem is that the controller
is able to span the RAID
subsystem over multiple SCSI
channels and thereby remove
the limiting factor external RAID
solutions have: the transfer rate
over the SCSI bus.
- The external
hardware solution (SCSI---SCSI
RAID)
An external RAID box moves all
RAID handling
"intelligence" into a
controller that sits in the
external disk subsystem. The
whole subsystem is connected to
the host via a normal SCSI
controller and appears to the host
as a single disk.
This solution has a drawback
compared to the controller based
solution: the single SCSI channel
used in this solution creates a
bottleneck.
As few as 4 SCSI drives can
completely flood a SCSI bus,
since the average transfer size
is around 4KB and the command
transfer overhead, which even
in Ultra SCSI is still done
asynchronously, takes most of the
bus time.
Software
RAID
- The MD driver
in the Linux kernel is an example
of a RAID solution that is
completely hardware independent.
However, its application is
limited, since it provides only
RAID level 0, not levels
1 and 5. The author has stopped
working on it.
- Adaptec's RAID
controllers are another example.
They have no RAID functionality
whatsoever on the controller and
depend on external drivers
to provide all RAID
functionality.
They are basically just multiple
single AHA2940 controllers which
have been integrated on one card.
Linux detects them as AHA2940 and
treats them accordingly.
Every OS needs its own special
driver for this type of RAID
solution, which is error prone and
not very compatible.
Hardware vs.
Software RAID
Just like any other application,
software-based arrays occupy host system
memory, consume CPU cycles and are
operating system dependent. By contending
with other applications that are running
concurrently for host CPU cycles and
memory, software-based arrays degrade
overall server performance. Also, unlike
hardware-based arrays, the performance of
a software-based array is directly
dependent on server CPU performance and
load.
Except for the
array functionality, hardware-based RAID
schemes have very little in common with
software-based implementations. Since the
host CPU can execute user applications
while the array adapter's processor
simultaneously executes the array
functions, the result is true hardware
multi-tasking. Hardware arrays also do
not occupy any host system memory, nor
are they operating system dependent.
Hardware arrays are
also highly fault tolerant. Since the
array logic is based in hardware,
software is NOT required to boot. Some
software arrays, however, will fail to
boot if the boot drive in the array
fails. For example, an array implemented
in software can only be functional when
the array software has been read from the
disks and is memory-resident. What
happens if the server can't load the
array software because the disk that
contains the fault tolerant software has
failed? Software-based implementations
commonly require a separate boot drive,
which is NOT included in the array.
Reasons why
you should use RAID:
- Speed
- Increased
Storage capacity
- The economic
costs of disk failure
- In
addition to downtime,
consider:
- Emergency
service cost
- Cost
of restoring data
- Immediate
lost productivity
- Long
term lost sales
- Lost
repeat sales
- Lost
word-of-mouth advertising
- In a
commercial enterprise,
the cost of a disk
failure when there is no
mirroring or RAID is much
larger than usually
recognized.
- Unexpectedly,
the largest cost is the
accumulated lost sales
over a long period of
time.