
Different types of RAID

In this tutorial, we are going to see different types of RAID.

RAID (Redundant Array of Inexpensive Disks, sometimes Redundant Array of Independent Disks) is a technology that builds a single storage unit from several hard drives. Depending on the configuration, the resulting unit (called an array, or a cluster in this tutorial) offers high fault tolerance (high availability), higher capacity, or higher read/write speed. When redundancy is used, distributing data over several hard disks makes the data safer and the services that depend on it more reliable.

This technology was developed in 1987 by three researchers (Patterson, Gibson, and Katz) at the University of California, Berkeley. Since 1992, the RAID Advisory Board has managed these specifications. The idea is to build a large-capacity storage unit, which would be expensive as a single disk, out of smaller inexpensive disks (each of which, taken alone, has a low MTBF, Mean Time Between Failures).
 

 
Disks combined using RAID technology can be used in different ways, called RAID levels. The Berkeley researchers defined five levels (1 to 5), to which levels 0 and 6 have been added. Each level describes the way in which data is distributed over the disks:

  • Level 0: called striping
  • Level 1: called mirroring, shadowing, or duplexing
  • Level 2: called striping with Hamming-code error correction (obsolete)
  • Level 3: called disk array with bit-interleaved data
  • Level 4: called disk array with block-interleaved data
  • Level 5: called disk array with block-interleaved distributed parity
  • Level 6: called disk array with block-interleaved double distributed parity

Each of these levels represents a mode of use of the cluster, chosen according to:

  • Performance
  • Cost
  • Disk access

 

Level 0 or RAID-0:

RAID-0, called striping, consists of storing data by distributing it over all the disks in the cluster. In this way, there is no redundancy, so we cannot talk about fault tolerance. In fact, if one of the disks fails, all of the data spread over the disks will be lost.

However, because the disks can be accessed in parallel (ideally each through its own controller), this solution offers a high transfer rate.

RAID 0 thus consists of the logical juxtaposition (aggregation) of several physical hard disks. In RAID-0 mode, data is written in “stripes”:
 


Image source: https://commons.wikimedia.org/wiki/File:RAID_0.svg

 
 
The interleaving factor characterizes the size of the fragments (strips) stored on each physical unit. The average transfer rate depends on this factor: the smaller each strip, the more disks a given request is spread over, which tends to improve the transfer rate for large transfers.
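To make the role of the interleaving factor concrete, here is a minimal sketch in Python. It assumes the strip size is expressed in blocks; the function name `locate_block` and its parameters are hypothetical and chosen only for illustration. It maps a logical block number to the disk that holds it and to the block offset on that disk.

```python
def locate_block(logical_block, num_disks, strip_blocks):
    """Map a logical block number to (disk index, block offset on that disk).

    strip_blocks is the interleaving factor: the number of consecutive
    blocks written to one disk before moving on to the next disk.
    """
    # Which strip the block falls in, and where inside that strip.
    strip_index, offset_in_strip = divmod(logical_block, strip_blocks)
    # Strips are dealt out to the disks in round-robin order.
    stripe_index, disk = divmod(strip_index, num_disks)
    return disk, stripe_index * strip_blocks + offset_in_strip


# Layout of the first 8 logical blocks on 2 disks, with 2 blocks per strip.
layout = [locate_block(b, num_disks=2, strip_blocks=2) for b in range(8)]
print(layout)
```

With a smaller `strip_blocks`, consecutive logical blocks change disk more often, so a request of a given length involves more disks at once.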

If one of the disks in the cluster is larger than the others, striping stops once the smallest disk is full. The final size is therefore equal to the number of disks multiplied by the capacity of the smallest disk (twice the smaller capacity for a cluster of two disks):

  • Two 20 GB disks will result in a 40 GB logical disk.
  • A 10 GB disk used in conjunction with a 27 GB disk will give a 20 GB logical disk (17 GB of the second disk will then be unused).

Note: It is recommended to use disks of the same size for RAID-0; otherwise the larger disks will not be fully used.
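The two examples above can be checked with a small calculation. This is only a sketch; the function name `raid0_capacity` is hypothetical.

```python
def raid0_capacity(disk_sizes_gb):
    """Return (usable, unused) capacity in GB for a striped cluster."""
    # Every disk contributes only as much as the smallest disk can hold.
    usable = min(disk_sizes_gb) * len(disk_sizes_gb)
    unused = sum(disk_sizes_gb) - usable
    return usable, unused


print(raid0_capacity([20, 20]))  # (40, 0)
print(raid0_capacity([10, 27]))  # (20, 17)
```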
 

Level 1 or RAID-1:

The purpose of level 1 is to duplicate the information to be stored on several disks. This process is therefore called mirroring or shadowing.
 


Image source: https://commons.wikimedia.org/wiki/File:RAID_1.svg

 
 
This provides greater data security because if one of the disks fails, the data is still available on the other. In addition, reading can be much faster when both disks are in operation, since read requests can be shared between them. Finally, since each disk has its own controller, the server can continue to function even when one of the disks fails, just as a truck can continue to drive if one of its tires bursts because it has several on each wheel.

On the other hand, RAID-1 technology is very expensive since only half of the storage capacity is actually used.
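The behavior described in this section can be illustrated with a toy model. This is a sketch, not how a real controller is implemented: the class `Mirror` and its methods are hypothetical, and each "disk" is just a Python dictionary.

```python
class Mirror:
    """Toy RAID-1 cluster: writes go to every healthy disk, reads use any one."""

    def __init__(self, num_disks=2):
        # Each disk is modeled as a dict: block number -> data.
        self.disks = [dict() for _ in range(num_disks)]
        self.failed = set()

    def write(self, block, data):
        # The same data is duplicated on every disk still in service.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block] = data

    def read(self, block):
        # Any healthy disk can answer, since all of them hold a full copy.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[block]
        raise IOError("all mirrors have failed")

    def fail(self, index):
        self.failed.add(index)


array = Mirror()
array.write(0, b"payroll")
array.fail(0)          # one disk fails, like one tire of a twin pair
print(array.read(0))   # the data is still served by the surviving disk
```

The model also shows where the cost comes from: two disks are occupied to store a single copy's worth of data.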
 

Level 2 or RAID-2:

RAID-2 is now obsolete. It provides error control by Hamming code (ECC, Error-Correcting Code), but this function is now built directly into hard disk controllers.

This technology stores data according to the same principle as RAID-0 but writes the ECC control bits on separate disks (generally 3 ECC disks are used for 4 data disks).
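The ratio of 3 ECC disks for 4 data disks matches the classic Hamming(7,4) code, in which 3 parity bits protect 4 data bits. The sketch below assumes that code and the usual bit ordering (parity bits at positions 1, 2, and 4); the function names are hypothetical. In a RAID-2 layout, each of the 7 bits of a code word would be stored on a different disk.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit code word."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # checks positions 4, 5, 6, 7
    # Positions 1..7 hold: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(code_word):
    """Return the 4 data bits, repairing one flipped bit if there is one."""
    c = list(code_word)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the faulty bit (0 = no error).
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = [1, 0, 1, 1]
stored = hamming74_encode(word)
stored[4] ^= 1                      # simulate a bit error on one disk
print(hamming74_correct(stored))    # the original 4 data bits
```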
 


Image source: https://commons.wikimedia.org/wiki/File:RAID2_arch.svg

 
 
RAID 2 technology offers poor performance but a high level of security.
 

Level 3 or RAID-3:

Level 3 stripes the data across the disks in very small units (bytes) and dedicates one of the disks to storing the parity.
 


Image source: https://commons.wikimedia.org/wiki/File:Raid3.png

 
 
In this way, if one of the disks fails, it is possible to reconstruct its content from the other disks. After this reconstruction, the content of the failed disk is available again. On the other hand, if two disks were to fail simultaneously, it would be impossible to recover the lost data.
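The reconstruction relies on the XOR (exclusive or) operation: the parity is the XOR of all data disks, so XOR-ing the surviving disks with the parity gives back the missing one. A minimal sketch, with hypothetical names and short byte strings standing in for disks:

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of several byte strings of equal length."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data_disks = [b"RAID", b"-3 i", b"s ok"]
parity_disk = xor_blocks(data_disks)

# Disk 1 fails: rebuild it from the surviving data disks and the parity disk.
survivors = [data_disks[0], data_disks[2], parity_disk]
rebuilt = xor_blocks(survivors)
print(rebuilt)
```

With two disks missing, the single XOR equation has two unknowns, which is why a double failure cannot be recovered at this level.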
 

Level 4 or RAID-4:

Level 4 is very close to level 3. The difference is that the data is striped, and the parity computed, by sector (called a block) rather than by bit or byte; the parity is still stored on a dedicated disk. In other words, the interleaving factor is larger than in RAID 3.
 


Image source: https://commons.wikimedia.org/wiki/File:Raid4.png

 
 
Thus, to read a small number of blocks, the system does not have to access every physical drive, but only those on which the data is actually stored. On the other hand, every write must also update the parity on the dedicated disk, so that disk has to keep up with the combined write load of all the data disks; otherwise it limits the performance of the cluster.
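The load on the dedicated parity disk can be seen in a sketch of a small write. It assumes XOR parity and the usual read-modify-write update (new parity = old parity XOR old data XOR new data); the function names are hypothetical.

```python
def xor(a, b):
    """Byte-wise XOR of two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(data_disks, parity, disk, new_block):
    """Overwrite one data block and return the updated parity block.

    Only the target data disk and the parity disk are accessed; the
    other data disks are neither read nor written.
    """
    old_block = data_disks[disk]
    new_parity = xor(xor(parity, old_block), new_block)
    data_disks[disk] = new_block
    return new_parity


disks = [b"aaaa", b"bbbb", b"cccc"]
parity = xor(xor(disks[0], disks[1]), disks[2])

# Two writes to two different data disks: both must go through the parity disk.
parity = small_write(disks, parity, 1, b"BBBB")
parity = small_write(disks, parity, 2, b"CCCC")
print(parity == xor(xor(disks[0], disks[1]), disks[2]))
```

Writes to different data disks could proceed in parallel, except that each of them also has to read and rewrite the same parity disk.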
 

Level 5 or RAID-5:

Level 5 is similar to level 4 in that parity is calculated at the level of a sector, but the parity is spread over all the disks in the cluster instead of being stored on a dedicated disk.
 


Image source: https://commons.wikimedia.org/wiki/File:RAID_5.svg

 
 
In this way, RAID 5 improves data access compared with RAID 4 (especially for writes) because access to the parity blocks is distributed among the various disks in the cluster instead of being concentrated on one.
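The distribution of parity can be sketched as a simple rotation. The rotation order used below (parity on the last disk for the first stripe, then moving one disk to the left per stripe) is an assumption; it is one common convention, and real implementations offer several. The function name is hypothetical.

```python
def raid5_stripe_layout(stripe, num_disks):
    """Return one label per disk for a stripe: 'P' for parity, 'D' for data."""
    # Assumed rotation: the parity moves one disk to the left on each stripe.
    parity_disk = (num_disks - 1 - stripe) % num_disks
    return ["P" if disk == parity_disk else "D" for disk in range(num_disks)]


for stripe in range(4):
    print(stripe, " ".join(raid5_stripe_layout(stripe, num_disks=4)))
```

Over any run of `num_disks` consecutive stripes, each disk holds the parity exactly once, so parity updates are shared evenly.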

RAID-5 read performance comes close to that of RAID-0 while tolerating the failure of one disk (writes pay an extra cost for updating the parity), which is why it is one of the most attractive RAID modes in terms of the balance between performance and reliability.

Note: The useful disk space on a cluster of n disks is equal to the capacity of n-1 disks, so the larger the number of disks, the smaller the share of capacity spent on parity and the more “profitable” RAID-5 becomes.
 

Level 6 or RAID-6:

Level 6 was added to the levels defined by Berkeley. It uses two independent parity functions, so the equivalent of two disks' capacity is devoted to parity (distributed over the disks of the cluster, as in level 5). This level preserves the data even if two disks fail simultaneously. At least 4 disks are required to implement a RAID-6 system.
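The text does not say which two parity functions are used. One widely used scheme, assumed here, keeps a plain XOR parity (P) and a second parity (Q) in which each data block is first multiplied by a distinct power of 2 in the finite field GF(2^8), with the reduction polynomial 0x11D. The sketch below, with hypothetical function names, recovers two lost data blocks from P and Q.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reduction polynomial 0x11D (assumed)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    """Inverse of a non-zero byte: a^254, because a^255 == 1 in this field."""
    return gf_pow(a, 254)


def compute_pq(data):
    """Return the P and Q parity blocks for a list of equal-size data blocks."""
    size = len(data[0])
    p, q = bytearray(size), bytearray(size)
    for i, block in enumerate(data):
        coeff = gf_pow(2, i)              # a distinct coefficient per disk
        for k, byte in enumerate(block):
            p[k] ^= byte
            q[k] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(data, x, y, p, q):
    """Rebuild data blocks x and y (stored as None in data) from P and Q."""
    # Treat the missing blocks as zeros to get the survivors' contribution.
    survivors = [b if b is not None else bytes(len(p)) for b in data]
    p_part, q_part = compute_pq(survivors)
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(len(p)), bytearray(len(p))
    for k in range(len(p)):
        a = p[k] ^ p_part[k]              # equals Dx ^ Dy
        b = q[k] ^ q_part[k]              # equals gx*Dx ^ gy*Dy
        dx[k] = gf_mul(b ^ gf_mul(gy, a), inv)
        dy[k] = a ^ dx[k]
    return bytes(dx), bytes(dy)


blocks = [b"disk", b"zero", b"one!", b"two."]
p, q = compute_pq(blocks)
damaged = [None, blocks[1], None, blocks[3]]    # disks 0 and 2 have failed
print(recover_two(damaged, 0, 2, p, q))
```

Two independent equations (P and Q) allow two unknown blocks to be solved for, which is what a single XOR parity cannot do.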
 


Image source: https://commons.wikimedia.org/wiki/File:RAID_6.svg

 
 

Comparison:

The most common RAID solutions are RAID level 1 and RAID level 5.

The choice of a RAID solution is related to three factors:

  • Security: both RAID 1 and RAID 5 offer a high level of security, but the method of rebuilding a disk differs between the two solutions. After a disk failure, RAID 5 recomputes the missing disk from the data and parity stored on the other disks, while RAID 1 simply copies from the surviving disk.
  • Performance: RAID 1 offers better performance than RAID 5 for reading, but suffers during large write operations.
  • Cost: The cost is directly related to the amount of raw storage that must be installed to obtain a given effective capacity. The RAID 5 solution offers a useful volume of (n-1)/n of the allocated volume, that is 80 to 90% for clusters of 5 to 10 disks (the rest is used for parity). The RAID 1 solution, on the other hand, only offers an available volume representing 50% of the total volume (because the data is duplicated).
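The cost figures above follow from a short calculation. This is a sketch with a hypothetical function name, assuming disks of equal size: RAID 1 keeps a full copy on every disk, and RAID 5 spends the equivalent of one disk on parity.

```python
def usable_fraction(level, num_disks):
    """Fraction of the raw capacity available for data (equal-size disks)."""
    if level == 1:
        return 1 / num_disks                  # every disk holds a full copy
    if level == 5:
        return (num_disks - 1) / num_disks    # one disk's worth of parity
    raise ValueError("only RAID 1 and RAID 5 are compared here")


print(f"RAID 1,  2 disks: {usable_fraction(1, 2):.0%}")
print(f"RAID 5,  5 disks: {usable_fraction(5, 5):.0%}")
print(f"RAID 5, 10 disks: {usable_fraction(5, 10):.0%}")
```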

 

How to Set up a RAID solution

There are several different ways to set up a RAID solution on a server:

  • Software: this is generally a driver in the computer’s operating system capable of creating a single logical volume from several disks (SCSI or IDE).
  • Hardware:
    • with DASD hardware (Direct Access Storage Device): these are external storage units with their own power supply. These devices are also equipped with connectors that allow disks to be replaced while the system is running (such disks are said to be hot-swappable). The unit manages its own disks and appears to the server as a standard SCSI disk.
    • with RAID disk controllers: these are cards that plug into PCI or ISA slots and allow you to control several hard disks.

 