By Napster
This is a very common problem that system/Linux administrators have to deal with on a regular basis. I wrote this post in an effort to save you some time.
Let’s say you have a RAID 5 array with four disks: sda, sdb, sdc, and sdd.
This is what mdstat will show:
# cat /proc/mdstat
Personalities : [raid5]
md0 : active raid5 sdd1[3] sdb1[1] sdc1[2] sda1[0]
      95425536 blocks level 5, 128k chunk, algorithm 2 [4/4] [UUUU]
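For a fuller picture than /proc/mdstat, mdadm can also report the array and per-device state. It is worth running this while everything is healthy so you know what a normal baseline looks like (output omitted here):
# mdadm --detail /dev/md0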
Imagine a situation where one of your hard drives fails, or the array comes up broken after a power failure or improper shutdown. This is what mdstat will show you after the failure.
# cat /proc/mdstat
Personalities : [raid5]
md0 : inactive raid5 sdd1[3] sdc1[2] sda1[0]
      95425536 blocks level 5, 128k chunk, algorithm 2 [4/3] [U_UU]
This is the status of an inactive RAID 5 array that originally had four disks (sda1, sdb1, sdc1, and sdd1) but now has one bad disk (sdb1).
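Before changing anything, it helps to confirm which disk actually dropped out. The kernel log, and SMART data if smartmontools is installed, usually make this obvious; /dev/sdb is the suspect here based on the mdstat output above:
# dmesg | grep -i sdb
# smartctl -H /dev/sdb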
These steps will help you fix the problem.
1. Replace the failed hard drive.
2. Create a new partition on the replacement drive. (The fdisk transcript below uses /dev/sdc as an example; substitute the device name your system assigned to the new disk.)
# fdisk /dev/sdc
The number of cylinders for this disk is set to 91201.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): p
Disk /dev/sdc: 750.1 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-91201, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-91201, default 91201):
Using default value 91201
Command (m for help): p
Disk /dev/sdc: 750.1 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdc1 1 91201 732572001 83 Linux
Command (m for help): t
Selected partition 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)
Command (m for help): w
Remember to set the partition type to fd (Linux raid autodetect).
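If you would rather not step through fdisk interactively, a common shortcut on MBR-style disks is to copy the partition layout from a surviving member with sfdisk. In this sketch /dev/sda is assumed to be a healthy member and /dev/sdc the blank replacement; double-check the target device, because this overwrites its partition table:
# sfdisk -d /dev/sda | sfdisk /dev/sdc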
3. Stop the inactive array.
# mdadm --stop /dev/md0
4. If you try to assemble the array now, it will fail with an error saying that sdb1 has no superblock.
# mdadm --assemble /dev/md0
mdadm: looking for devices for /dev/md0
mdadm: no RAID superblock on /dev/sdb1
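At this point it can help to inspect the RAID superblocks on the surviving members. mdadm --examine prints each disk's view of the array (UUID, event counter, update time), which tells you whether the three remaining disks still agree with one another:
# mdadm --examine /dev/sda1
# mdadm --examine /dev/sdc1
# mdadm --examine /dev/sdd1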
5. Re-create the array. This sounds counterintuitive: if we create a RAID array from scratch, won't the data be overwritten? No. mdadm can "see" that the disks in the new array were members of a previous array. Knowing this, mdadm will do its best to make the new array work: if the parameters match the existing array's configuration, mdadm rebuilds the new array on top of it in a non-destructive manner, keeping the contents of the disks (see the sketch below).
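As a rough sketch only (not something to run casually): re-creating means repeating the original creation command with exactly the same level, chunk size, number of devices, and device order. The values below are taken from the mdstat output earlier; if any parameter differs from the original array, data can be destroyed. mdadm will notice that the members belonged to an existing array and ask for confirmation before proceeding.
# mdadm --create /dev/md0 --level=5 --raid-devices=4 --chunk=128 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1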
6. I chose instead to assemble the array with fewer disks, which seems safer. This method lets me start the array with the three good disks and then add the fourth disk to the running RAID array, which initiates the recovery process and does all the rest (sketched below).
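Assuming the replacement disk shows up as /dev/sdb and its new partition is /dev/sdb1 (the fdisk example above happened to use /dev/sdc; use whatever name your system assigned), the assemble-then-add sequence looks roughly like this:
# mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdc1 /dev/sdd1
# mdadm --add /dev/md0 /dev/sdb1
Adding the new partition starts the rebuild automatically; you can then watch the progress: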
# watch -n 1 cat /proc/mdstat
Personalities : [raid5]
md0 : active raid5 sdb1[1] sdc1[2] sda1[0]
      127872 blocks [2/1] [U_]
      [======>..............] recovery = 34.4% (44608/127872) finish=2.9min speed=461K/sec
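Once recovery reaches 100%, confirm that all four members show as active and the array state is clean, and refresh the mdadm configuration so the array assembles correctly at boot. The config file path varies by distribution (commonly /etc/mdadm.conf or /etc/mdadm/mdadm.conf); adjust it to match your system:
# mdadm --detail /dev/md0
# mdadm --detail --scan >> /etc/mdadm.conf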