A RAID array (Redundant Array of Independent, originally Inexpensive, Disks) is a set of disks combined into a single logical unit. All disks are managed by one or several controllers. RAID arrays are usually built from hard drives, although building them from SSDs is increasingly common. Such an assembly can increase speed and improve the reliability of data storage.
Arrays differ in both the number of disks and the way data is distributed among them. The data distribution method is called the RAID type or level. Along with other technical characteristics, the level defines the array's properties, its operating speed and its ability to withstand hardware failures.
There are several array levels, each with its own properties and applications. In addition, almost every array controller offers a JBOD (Just a Bunch of Disks) mode, in which the disks are treated as a conventional set of separate drives rather than as a single whole.
Most RAID levels (RAID0 excepted) can keep operating after one or even more disks fail. This is called degraded mode, or limited-functionality mode: data protection is reduced or absent, and speed is limited.
RAID0 is based on block-level data striping: data is split into blocks that are distributed evenly among all disks of the array (a stripe). This distribution significantly increases the speed of the whole system because several disks can read or write simultaneously.
Unfortunately, this technology does nothing for reliability. If one of the disks fails, the data of the entire array is practically unrecoverable.
RAID0 suits storage where speed matters more than reliability, for example a server for video streaming or photo editing.
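The block striping described above can be sketched in a few lines. This is a minimal illustration, not any controller's actual implementation: the function names `raid0_locate` and `raid0_stripe` and the round-robin block order are assumptions made for the example.

```python
def raid0_locate(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk)."""
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    disk = logical_block % num_disks      # blocks rotate across the disks
    offset = logical_block // num_disks   # one full rotation is one stripe
    return disk, offset


def raid0_stripe(data: bytes, num_disks: int, block_size: int) -> list[bytes]:
    """Split data into blocks and deal them out round-robin to the disks."""
    disks = [bytearray() for _ in range(num_disks)]
    for start in range(0, len(data), block_size):
        disk, _ = raid0_locate(start // block_size, num_disks)
        disks[disk] += data[start:start + block_size]
    return [bytes(d) for d in disks]


if __name__ == "__main__":
    # Five 4-byte blocks spread over three disks.
    for disk_no, content in enumerate(raid0_stripe(b"AAAABBBBCCCCDDDDEEEE", 3, 4)):
        print(disk_no, content)
```

Because consecutive blocks land on different disks, a large sequential read or write touches all disks at once. No disk holds a complete copy of anything, which is why losing a single disk damages every file larger than one block.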
- The highest performance of all RAID levels.
- High read and write speed.
- The entire array capacity is available for user data.
- The lowest cost per gigabyte of storage.
- Poor reliability: the failure of any disk may result in data loss.
RAID1 uses mirroring: each data disk has an identical twin that holds an exact copy of its contents. A RAID1 array therefore consists of two or more drives that are exact copies of one another.
With simple mirroring on two disks, only half of the array's total capacity is available to the user, which doubles the cost of storage: a 500 GB RAID1 array requires two 500 GB disks. On the other hand, if one disk fails, it can be replaced quickly, and the system then copies the data from the remaining disk to the new one automatically. Configuring a RAID1 array requires at least two disks.
RAID1 offers high reliability, often at the cost of lower write speed, since the data is written to one disk first and then to the other. Read speed, however, can be high because data can be read from the disks in parallel. These traits depend entirely on the hardware (or software) RAID configuration.
RAID1 suits critical data, such as accounting systems, as well as small data servers.
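As a rough illustration of mirroring, the toy model below keeps an identical copy of every block on each member disk, keeps serving reads after one member fails, and refills a replacement disk from a survivor. The class `Raid1Mirror` and its methods are hypothetical names invented for this sketch. Disks are modelled as in-memory dictionaries.

```python
class Raid1Mirror:
    """Toy RAID1: every member holds an identical copy of each block."""

    def __init__(self, num_disks: int = 2):
        if num_disks < 2:
            raise ValueError("RAID1 needs at least two disks")
        # A member is a dict of block number -> data, or None once it has failed.
        self.members = [{} for _ in range(num_disks)]

    def write(self, block_no: int, data: bytes) -> None:
        # Every write goes to all working members.
        for member in self.members:
            if member is not None:
                member[block_no] = data

    def read(self, block_no: int) -> bytes:
        # Any working member can serve the read.
        for member in self.members:
            if member is not None:
                return member[block_no]
        raise OSError("all mirrors have failed")

    def fail(self, index: int) -> None:
        self.members[index] = None

    def replace(self, index: int) -> None:
        # Rebuild: copy everything from a surviving member to the new disk.
        survivors = [m for m in self.members if m is not None]
        if not survivors:
            raise OSError("no surviving mirror to copy from")
        self.members[index] = dict(survivors[0])


if __name__ == "__main__":
    array = Raid1Mirror(2)
    array.write(0, b"ledger")
    array.fail(0)
    print(array.read(0))   # still readable from the second disk
    array.replace(0)
```

The model also shows where the capacity goes: two members store the same blocks, so the usable space equals that of a single disk.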
- High fault tolerance.
- The array keeps working after one of the data disks fails (as long as a working mirror remains).
- High read speed (depending on the controller).
- The most widespread configuration, with the broadest support among RAID levels.
- Low write speed (without independent controllers).
- Only half of the disk space is available to the user.
- High price per gigabyte.
RAID2 and RAID3
RAID level 2 also uses data striping, but it splits the data at the bit level rather than into blocks. For fault tolerance, RAID2 sets aside space for a Hamming code, so it needs at least three disks.
RAID3 splits data into bytes and distributes them across the disks. A separate disk in the array is dedicated to error correction.
Because data is distributed at such a fine granularity, all drives of the array work as a single unit and can perform only one operation at a time. These RAID levels are hardly used anymore.
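To give an idea of what a Hamming code does, here is a small sketch. The text does not say which variant RAID2 uses, so the classic Hamming(7,4) code is assumed purely for illustration: four data bits are protected by three check bits, and any single flipped bit can be located and corrected. In a RAID2-style array, each bit position of a codeword would sit on a different drive. The function names are hypothetical.

```python
def hamming74_encode(data_bits: list[int]) -> list[int]:
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # check bit for positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # check bit for positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # check bit for positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword: list[int]) -> list[int]:
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the bad bit, or 0 if none.
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[4] ^= 1                      # simulate one bad bit (one bad drive)
    print(hamming74_decode(word))     # the original four data bits
```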
RAID4 also uses data striping and has a separate disk for error correction. Unlike levels 2 and 3, however, it divides data into blocks. This lets the disks work independently and serve several read operations at once, so read speed is high, comparable to RAID0. Write speed is lower because parity information is kept on a dedicated disk for error correction, and that information must be updated every time new data is written.
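The write penalty can be made concrete with a small sketch. It assumes XOR parity, the usual scheme for a single parity disk. Overwriting one data block then means reading the old data and the old parity, computing the new parity, and writing both back, and the parity disk takes part in every such write. The helper names are invented for the example.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            result[i] ^= value
    return bytes(result)


def parity_after_write(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """New parity for a stripe after one of its data blocks is overwritten."""
    # Remove the old block's contribution and add the new one.
    return xor_blocks(old_parity, old_data, new_data)


if __name__ == "__main__":
    stripe = [b"\x0f\x0f", b"\xf0\x01", b"\x55\xaa"]
    parity = xor_blocks(*stripe)

    new_block = b"\x12\x34"
    updated = parity_after_write(parity, stripe[1], new_block)
    stripe[1] = new_block

    # The shortcut gives the same parity as recomputing over the whole stripe.
    print(updated == xor_blocks(*stripe))
```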
Assembling a RAID4 array requires at least three disks. This level has found use because it offers a workable compromise between speed and reliability: if one of the blocks is lost, the system can recover it on its own from the remaining blocks of the stripe and the parity information.
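Recovering a lost block from its neighbours and the parity works as follows, again assuming XOR parity: XOR-ing the parity block with all surviving data blocks of the stripe yields the missing one. The function names are hypothetical.

```python
def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equally sized blocks together byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            result[i] ^= value
    return bytes(result)


def make_parity(data_blocks: list[bytes]) -> bytes:
    """Parity block for one stripe."""
    return xor_blocks(data_blocks)


def recover_block(surviving_blocks: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing data block of a stripe."""
    return xor_blocks(surviving_blocks + [parity])


if __name__ == "__main__":
    stripe = [b"RAID", b"four", b"demo"]
    parity = make_parity(stripe)
    # The disk holding the middle block has failed.
    print(recover_block([stripe[0], stripe[2]], parity))
```

The same calculation fails as soon as two blocks of a stripe are missing, because one parity block gives only one equation per byte.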
- Fairly high fault tolerance, with the ability to self-recover.
- Can operate in degraded mode.
- High read speed.
- Low write speed because one disk is dedicated to error correction.
- Considerably lower read speed in degraded mode.
- After the loss of one disk, rebuilding the data onto a new disk can take a long time. If another disk fails during this period, the data can be lost irreversibly.
RAID5 is probably the most popular RAID configuration used in NAS storage. At present it is considered a good balance of price, speed and reliability.
RAID5 is almost identical to RAID4, but instead of writing the parity information to a dedicated disk, it spreads the parity evenly over all disks. A RAID5 array can therefore survive the failure of one disk without losing data or access to it, because the data and parity information on the remaining disks are used to recover the lost blocks.
Depending on the vendor and its goals, RAID5 implementations may differ in how they distribute data and parity information.
Building a RAID5 array requires at least three disks.
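Spreading parity over all disks can be pictured with a small layout table. The sketch below assumes one particular rotation, often called left-symmetric: parity starts on the last disk and moves one disk to the left with each stripe, and data blocks follow the parity block and wrap around. Since implementations differ, this is only one possible arrangement, and `raid5_layout` is a hypothetical helper.

```python
def raid5_layout(num_disks: int, num_stripes: int) -> list[list[str]]:
    """Return one row per stripe: 'P' for parity, 'D<n>' for logical data block n."""
    if num_disks < 3:
        raise ValueError("RAID5 needs at least three disks")
    rows = []
    block_no = 0
    for stripe in range(num_stripes):
        # Parity moves one disk to the left on every stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = [""] * num_disks
        row[parity_disk] = "P"
        # Data blocks start right after the parity disk and wrap around.
        for k in range(num_disks - 1):
            row[(parity_disk + 1 + k) % num_disks] = f"D{block_no}"
            block_no += 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in raid5_layout(num_disks=4, num_stripes=4):
        print("  ".join(f"{cell:>3}" for cell in row))
```

No single disk carries all the parity, so parity updates are shared among the disks instead of queuing on one dedicated drive as in RAID4.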
- High read and write speed.
- If one of the disks fails, the user still has access to the data (in degraded mode).
- Resistance to failures and errors.
- A disk failure reduces performance.
- After the loss of one disk, rebuilding the data onto a new disk can be time-consuming. If another disk fails during this period, the data can be lost irreversibly.
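How long a rebuild takes can be estimated roughly, because the whole replacement disk has to be written once. The sketch below assumes a constant rebuild rate, which is a simplification: real rates depend on the controller, the disks and the load on the array. The function name and the sample figures are illustrative only.

```python
def rebuild_hours(disk_capacity_tb: float, rebuild_mb_per_s: float) -> float:
    """Rough time to refill one replacement disk at a constant rate."""
    if rebuild_mb_per_s <= 0:
        raise ValueError("rebuild rate must be positive")
    capacity_mb = disk_capacity_tb * 1_000_000   # decimal units: 1 TB = 10^6 MB
    return capacity_mb / rebuild_mb_per_s / 3600


if __name__ == "__main__":
    # Assumed example: a 4 TB disk rebuilt at a sustained 100 MB/s.
    print(f"{rebuild_hours(4, 100):.1f} hours")
```

During all of that time the array runs without redundancy, which is the window in which a second failure is fatal.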
RAID6 is similar to RAID5 with one difference: instead of single parity information, it uses a double Reed-Solomon code, which takes up the capacity of two disks. The minimum number of disks therefore rises to four, and the system can keep working even after losing two disks.
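The following sketch shows how a second, independent checksum makes it possible to rebuild two lost blocks. It assumes a widely used construction: a plain XOR parity P plus a checksum Q computed in the finite field GF(2^8) with the polynomial 0x11d and generator 2. The text does not specify these parameters, and the function names are hypothetical.

```python
# Log/antilog tables for GF(2^8), polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11d).
EXP = [0] * 512
LOG = [0] * 256
_value = 1
for _i in range(255):
    EXP[_i] = _value
    LOG[_value] = _i
    _value <<= 1
    if _value & 0x100:
        _value ^= 0x11D
for _i in range(255, 512):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a: int, b: int) -> int:
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_div(a: int, b: int) -> int:
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^8)")
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 255]


def compute_pq(data_blocks: list[bytes]) -> tuple[bytes, bytes]:
    """P is the XOR of all blocks; Q weights block k by 2**k in GF(2^8)."""
    size = len(data_blocks[0])
    p, q = bytearray(size), bytearray(size)
    for index, block in enumerate(data_blocks):
        weight = EXP[index]
        for i, byte in enumerate(block):
            p[i] ^= byte
            q[i] ^= gf_mul(weight, byte)
    return bytes(p), bytes(q)


def recover_two(blocks: list, x: int, y: int, p: bytes, q: bytes) -> tuple[bytes, bytes]:
    """Rebuild data blocks x and y (given as None in `blocks`) from P and Q."""
    size = len(p)
    known = [b if b is not None else bytes(size) for b in blocks]
    p_known, q_known = compute_pq(known)   # checksums with the lost blocks as zeros
    gx, gy = EXP[x], EXP[y]
    dx, dy = bytearray(size), bytearray(size)
    for i in range(size):
        p_delta = p[i] ^ p_known[i]        # equals Dx ^ Dy
        q_delta = q[i] ^ q_known[i]        # equals gx*Dx ^ gy*Dy
        dx[i] = gf_div(q_delta ^ gf_mul(gy, p_delta), gx ^ gy)
        dy[i] = p_delta ^ dx[i]
    return bytes(dx), bytes(dy)


if __name__ == "__main__":
    data = [b"alpha123", b"bravo456", b"charlie7", b"delta890"]
    p, q = compute_pq(data)
    damaged = [data[0], None, data[2], None]   # two data disks lost
    print(recover_two(damaged, 1, 3, p, q))
```

Two checksums give two independent equations per byte, which is enough to solve for two unknown blocks. A third lost disk leaves the system underdetermined.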
There are several popular nested (combined) arrays, e.g. RAID10 (1+0), RAID01 (0+1), RAID50 (5+0), etc.
RAID10 is a RAID0 array built from RAID1 arrays, while RAID01 is a RAID1 array built from RAID0 arrays. Similarly, RAID50 is a RAID0 array built from RAID5 arrays.
Such combinations increase performance and fault tolerance, but at the expense of a significant rise in price. For instance, building RAID10 requires at least four hard drives.
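The price of each level shows up as the share of raw capacity left for data. The sketch below is a rough calculator under simplifying assumptions: all disks have the same size, RAID10 uses two-way mirrors, and RAID50 is split into equal RAID5 groups. The function name and the `groups` parameter are invented for the example.

```python
def usable_capacity_gb(level: str, num_disks: int, disk_gb: float, groups: int = 2) -> float:
    """Usable capacity of an array of identical disks.

    `groups` is used only for RAID50: the number of RAID5 sets striped together.
    """
    if level == "raid0":
        minimum, data_disks = 2, num_disks
    elif level == "raid1":
        minimum, data_disks = 2, 1                 # every disk holds the same copy
    elif level == "raid5":
        minimum, data_disks = 3, num_disks - 1     # one disk's worth of parity
    elif level == "raid6":
        minimum, data_disks = 4, num_disks - 2     # two disks' worth of checksums
    elif level == "raid10":
        if num_disks % 2:
            raise ValueError("two-way mirrors need an even number of disks")
        minimum, data_disks = 4, num_disks // 2
    elif level == "raid50":
        if groups < 2 or num_disks % groups:
            raise ValueError("disks must divide evenly into two or more RAID5 groups")
        minimum, data_disks = 3 * groups, num_disks - groups
    else:
        raise ValueError(f"unsupported level: {level}")
    if num_disks < minimum:
        raise ValueError(f"{level} needs at least {minimum} disks")
    return data_disks * disk_gb


if __name__ == "__main__":
    for level, disks in [("raid0", 4), ("raid1", 2), ("raid5", 4),
                         ("raid6", 4), ("raid10", 4), ("raid50", 6)]:
        print(level, disks, "x 500 GB ->", usable_capacity_gb(level, disks, 500), "GB")
```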
Combining two or more RAID5 arrays into a RAID0 array speeds up performance substantially because data is accessed in parallel.
RAID10 and RAID50 have become very popular because they are simple to implement and combine sufficient speed with redundancy, even though their high price keeps them out of small systems. RAID01 is less favored: its capacity and speed are practically identical to those of RAID10, but it generally tolerates fewer combinations of failed disks.
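One place where RAID10 and RAID01 can differ is in which combinations of failed disks they survive. The sketch below enumerates failure combinations under a simple model, which is an assumption rather than a description of any particular controller: in RAID10, disks (0,1), (2,3) and so on form mirror pairs; in RAID01, the first half of the disks is one stripe set, the second half is its mirror, and a stripe set is unusable once any of its disks has failed.

```python
from itertools import combinations


def raid10_survives(failed: set, num_disks: int) -> bool:
    """RAID10 survives unless both disks of some mirror pair have failed."""
    pairs = [{i, i + 1} for i in range(0, num_disks, 2)]
    return all(not pair <= failed for pair in pairs)


def raid01_survives(failed: set, num_disks: int) -> bool:
    """RAID01 survives only while at least one stripe set is fully intact."""
    half = num_disks // 2
    set_a = set(range(half))
    set_b = set(range(half, num_disks))
    return not (failed & set_a) or not (failed & set_b)


def survivable(survives, num_disks: int, num_failed: int) -> int:
    """Count the failure combinations of the given size that the array survives."""
    return sum(
        survives(set(combo), num_disks)
        for combo in combinations(range(num_disks), num_failed)
    )


if __name__ == "__main__":
    for name, check in [("RAID10", raid10_survives), ("RAID01", raid01_survives)]:
        print(name, "survives", survivable(check, 4, 2), "of 6 two-disk failures")
```

Both layouts survive any single failure. The difference only appears once a second disk fails.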
Data recovery from a RAID array is the process of restoring data lost as a result of a failure.
Possible symptoms of a damaged RAID array include:
- The array is in degraded mode, but all files are still accessible.
- The RAID was rebuilt incorrectly after a failure, or its settings or disk order were changed; access to files is limited or absent.
- The array status is “not active”, “not found” or “switched off”; the system does not display the array, and access to files is blocked accordingly.
RAID arrays are subject to all the data loss problems typical of ordinary disks: deletion, software failure, overwriting, file system damage, etc. Therefore, if the RAID works normally but the data is inaccessible, the loss may have another cause, such as operator error or a computer virus.
A RAID failure may also be caused by a controller breakdown or disk defects. Depending on the RAID level, data recovery may face certain difficulties and restrictions.
For RAID0, data can be retrieved easily when the disk order is known.
If any one of the disks cannot be recovered, the data cannot be recovered either.
For RAID1, data is easily recoverable from any single member disk.
For parity-based arrays such as RAID5, data is easily recoverable. Recovery requires the disk order, the block size and the parity distribution method.
Recovery is possible as long as only one disk is damaged. If more disks are damaged, recovery is impossible.
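The three parameters named above (disk order, block size, parity distribution) are exactly what a reassembly routine needs. The sketch below rebuilds the logical data from member images with at most one image missing. It assumes XOR parity and a left-symmetric rotation, which may not match a given array. `raid5_build` exists only to create test images, and all names are hypothetical.

```python
def xor_blocks(blocks: list[bytes]) -> bytes:
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            result[i] ^= value
    return bytes(result)


def raid5_build(data: bytes, num_disks: int, block_size: int) -> list:
    """Lay data out on member images (used here only to produce test input)."""
    disks = [bytearray() for _ in range(num_disks)]
    stripe_bytes = block_size * (num_disks - 1)
    data = data + bytes(-len(data) % stripe_bytes)        # zero-pad the last stripe
    for stripe, start in enumerate(range(0, len(data), stripe_bytes)):
        chunk = data[start:start + stripe_bytes]
        blocks = [chunk[i:i + block_size] for i in range(0, stripe_bytes, block_size)]
        parity_disk = (num_disks - 1 - stripe) % num_disks
        disks[parity_disk] += xor_blocks(blocks)
        for k, block in enumerate(blocks):
            disks[(parity_disk + 1 + k) % num_disks] += block
    return [bytes(d) for d in disks]


def raid5_reassemble(images: list, block_size: int) -> bytes:
    """Rebuild logical data from images given in disk order; one may be None."""
    num_disks = len(images)
    missing = [i for i, image in enumerate(images) if image is None]
    if len(missing) > 1:
        raise ValueError("more than one member is missing: data cannot be rebuilt")
    length = len(next(image for image in images if image is not None))
    out = bytearray()
    for stripe in range(length // block_size):
        lo = stripe * block_size
        blocks = [img[lo:lo + block_size] if img is not None else None for img in images]
        if missing:
            # The missing block is the XOR of all other blocks in the stripe.
            blocks[missing[0]] = xor_blocks([b for b in blocks if b is not None])
        parity_disk = (num_disks - 1 - stripe) % num_disks
        for k in range(num_disks - 1):
            out += blocks[(parity_disk + 1 + k) % num_disks]
    return bytes(out)


if __name__ == "__main__":
    original = b"The quick brown fox jumps over the lazy dog"
    images = raid5_build(original, num_disks=4, block_size=4)
    images[2] = None                                       # one member is lost
    print(raid5_reassemble(images, block_size=4)[:len(original)])
```

Feeding the images in the wrong order, or with the wrong block size or rotation, produces scrambled output. That is why these parameters have to be known or determined before recovery.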
For RAID6, data is easily recoverable. Recovery requires the disk order, the block size and the Reed-Solomon code distribution method.
Recovery is possible as long as no more than two disks are damaged. If more disks are damaged, recovery is impossible.
For nested arrays, each component array must be reconstructed first, and then the upper-level array is assembled from the components.
Data is recoverable as long as a sufficient number of components are in working order. For instance, RAID50 needs every one of its RAID5 components to be working, even if only in degraded mode.