Published on Nov 12, 2015
Information has become a commodity in today's world, and protecting that information has become mission critical. The Internet has helped push this information age forward. Popular websites process so much information that any slowdown or downtime can mean the loss of millions of dollars. Clearly, just a bunch of hard disks won't cut it anymore.
Redundant Array of Independent (or Inexpensive) Disks (RAID) was developed to increase the performance and reliability of data storage by spreading data across multiple drives. RAID technology has evolved over the years to meet ever-growing demands for speed and data security.
RAID provides speed, reliability, and increased storage capacity by using multiple disks rather than a single-disk solution. It takes multiple hard drives and allows them to be used as one large hard drive, with benefits that depend on the scheme, or level, of RAID being used. More capable RAID implementations generally cost more, and there is no single best RAID implementation.
Some implementations are better than others depending upon the actual application. RAID used to be available only in expensive server systems. However, with the advent of inexpensive RAID controllers, it has reached the mainstream market.
A drive array is a collection of hard disk drives that are grouped together. When we talk about RAID, we often distinguish between physical drives and arrays on the one hand and logical drives and arrays on the other. Physical arrays can be divided or grouped together to form one or more logical arrays. These logical arrays can be divided into logical drives that the operating system sees. The logical drives are treated as single hard drives and can be partitioned and formatted accordingly.
The RAID controller manages how the data is stored and accessed across the physical and logical arrays. It ensures that the operating system sees only the logical drives and need not worry about managing the underlying scheme. As far as the system is concerned, it is dealing with regular hard drives.
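As a rough illustration of the logical-drive idea, the following Python sketch carves one array's block range into logical drives and translates a logical drive's block number into a block number within the array. The function names (`carve_logical_drives`, `to_array_block`) and the simple back-to-back layout are assumptions made for this example; real controllers use their own layouts and metadata.

```python
def carve_logical_drives(array_blocks, sizes):
    """Split an array of `array_blocks` blocks into logical drives.

    Returns a list of (start, end) block ranges, one per logical drive.
    Hypothetical layout: logical drives are laid out back to back.
    """
    if sum(sizes) > array_blocks:
        raise ValueError("logical drives do not fit in the array")
    ranges = []
    start = 0
    for size in sizes:
        ranges.append((start, start + size))
        start += size
    return ranges


def to_array_block(ranges, drive_index, block):
    """Translate a block number on a logical drive to a block in the array."""
    start, end = ranges[drive_index]
    if not 0 <= block < end - start:
        raise IndexError("block is outside the logical drive")
    return start + block


# Two logical drives carved out of a 100-block array.
ranges = carve_logical_drives(100, [60, 40])
array_block = to_array_block(ranges, 1, 5)
```

The operating system would only ever use the logical drive index and block number; the translation to the underlying array stays hidden behind the controller.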
A RAID controller's functions can be implemented in hardware or software. Hardware implementations are better for RAID levels that require a large amount of calculation. With today's fast processors, software RAID implementations are more feasible, but the CPU can still get bogged down under heavy I/O.
The basic concepts made use of in RAID are:
Mirroring involves having two copies of the same data on separate hard drives or drive arrays, so the data is effectively mirrored on another drive. The system writes data simultaneously to both hard drives. This is one of the two data redundancy methods used in RAID to protect against data loss. The benefit is that when one hard drive or array fails, the system can continue to operate, since there are two copies of the data. Downtime is minimal and data recovery is relatively simple: all you need to do is rebuild the data from the good copy.
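The behavior described above can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: each "drive" is an in-memory list of blocks, and the class name `MirroredPair` and its methods are made up for this example rather than taken from any real controller.

```python
class MirroredPair:
    """Toy model of a two-way mirror: every write goes to both drives."""

    def __init__(self, num_blocks):
        # Each "drive" is a list of blocks; None marks an unwritten block.
        self.drives = [[None] * num_blocks, [None] * num_blocks]
        self.failed = [False, False]

    def write(self, block, data):
        # Write the same data to every drive that is still healthy.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                drive[block] = data

    def read(self, block):
        # Serve the read from the first healthy drive.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                return drive[block]
        raise IOError("both drives have failed")

    def fail(self, index):
        # Simulate a drive failure: its contents are lost.
        self.failed[index] = True
        self.drives[index] = [None] * len(self.drives[index])

    def rebuild(self, index):
        # Replace the failed drive with a copy of the surviving one.
        good = 1 - index
        if self.failed[good]:
            raise IOError("no good copy to rebuild from")
        self.drives[index] = list(self.drives[good])
        self.failed[index] = False


mirror = MirroredPair(4)
mirror.write(0, b"alpha")
mirror.fail(0)                 # drive 0 dies
survived = mirror.read(0)      # still readable from drive 1
mirror.write(1, b"beta")       # the system keeps operating
mirror.rebuild(0)              # copy the good drive onto the replacement
```

After the rebuild, both drives hold identical contents again, which is the "rebuild from the good copy" step in miniature.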
A RAID controller writes the same data blocks to each mirrored drive. This means that each drive or array has the same information in it. We can add another level of complexity by introducing yet another technique called striping. If we have one striped array, we can mirror it at the same time on a second striped array. Because mirroring as described here pairs each drive with a copy, it needs at least two drives, and the total number of drives is normally even.
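To make the combination of striping and mirroring concrete, here is a small Python sketch that shows where one logical block would land when a striped array is mirrored onto a second striped array. The array labels "A" and "B", the function name `locate_block`, and the one-block stripe unit are assumptions for illustration; actual controllers typically stripe in larger chunks.

```python
def locate_block(logical_block, drives_per_stripe):
    """Return the physical locations of a logical block.

    Each location is (array_label, drive_index, block_offset).
    Blocks are striped round-robin across the drives of array "A",
    and array "B" holds an identical copy in the same position.
    """
    if drives_per_stripe < 1:
        raise ValueError("a striped array needs at least one drive")
    drive_index = logical_block % drives_per_stripe
    block_offset = logical_block // drives_per_stripe
    return [("A", drive_index, block_offset),
            ("B", drive_index, block_offset)]


# Logical block 5 on a pair of two-drive striped arrays.
locations = locate_block(5, 2)
```

With two drives per striped array, consecutive logical blocks alternate between drive 0 and drive 1, and every block exists once in each array.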