Workaround for the SSD RAID1 dual drive failure mode

At the least, it can keep things operating while we get parts out. We shipped a number of them today and placed another order for the new vendor's drives. I can confirm that heat is an issue with SSDs.

As root, on the unit, decide where you are going to place the image file, then run

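    # create a sparse 32G backing file (only one byte is actually written)
    dd if=/dev/zero of=/path/to/loopback/raid.img bs=1 count=1 \
        seek=32G
    # attach the file as a block device
    losetup /dev/loop0 /path/to/loopback/raid.img
    # grow the mirror to three members, then add the loop device as the third
    mdadm --grow /dev/md0 -n3
    mdadm /dev/md0 --add /dev/loop0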

This will mirror the OS onto a file, so we can recover later, if need be, from problems such as both OS drives dying (which is the failure mode that caught our eye).
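
A note on the dd line above: it seeks out to the requested offset and writes a single byte, so the image is sparse and only consumes real disk space as the mirror syncs data into it. The file needs to be at least as large as the existing RAID1 members, or mdadm will refuse to add it. Here is a minimal sketch of that step as a function; make_sparse_image is a made-up name for illustration, and it assumes GNU dd (for the size suffix) and GNU stat.

```shell
#!/bin/sh
# make_sparse_image is a hypothetical helper, not part of mdadm or losetup.
# Usage: make_sparse_image /path/to/loopback/raid.img 32G
make_sparse_image() {
    img=$1
    size=$2
    # Seek to $size and write a single zero byte; nothing before it is allocated.
    dd if=/dev/zero of="$img" bs=1 count=1 seek="$size" 2>/dev/null || return 1
    # Print the apparent size in bytes (the seek offset plus the one byte written).
    stat -c '%s' "$img"
}
```

Comparing `ls -l` against `du -h` on the resulting file shows the difference between the apparent size and the space actually allocated.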

But more than that, it will let you keep operating, albeit slowly, through a complete SSD failure while we get new parts out.
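
Since the point is to keep running on the loopback member after the SSDs drop out, it helps to know when that has happened. The following is a rough sketch, not something from our tooling: md_degraded is a hypothetical helper that reads /proc/mdstat-style text on stdin and prints any array whose member map (the `[UU_]` field) shows a missing device. An array that is still resyncing onto a new member will also be reported.

```shell
#!/bin/sh
# md_degraded is a hypothetical helper: it reads /proc/mdstat-style text on
# stdin and prints the name of every array with a missing member.
md_degraded() {
    awk '
        # Array header line, e.g. "md0 : active raid1 ..."
        /^md/ { name = $1 }
        # Status line ends in a member map such as [UU_]; "_" marks a missing member.
        $NF ~ /^\[[U_]+\]$/ && $NF ~ /_/ { print name }
    '
}
```

On a live unit this would be run as `md_degraded < /proc/mdstat`; `mdadm --detail /dev/md0` gives the full picture for a single array.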
