This one took me a while to figure out. I had to start probing why a system would crash the MD stack shortly after booting, but not in single-user mode.
So I started delving into the RAID, and found that the folks who set this unit up had a RAID0 with 0.90 metadata on the devices, and then 1.2 metadata on the MDS.
So along comes the Lustre-ized kernel, and whammo. Oddly, this did not happen with the CentOS 6.3 kernel. There it worked “fine”.
If you see odd crashes in the MD RAID subsystem with Lustre-ized kernels, boot into another kernel and check whether the system has conflicting MD RAID metadata.
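Here is a rough sketch of how that check could be scripted. It is not what I ran at the time. It assumes `mdadm` is installed, that you run it as root, and that you pass the member devices on the command line. The helper names (`parse_version`, `group_by_version`, `examine`) are made up for this example. It reads the `Version :` line that `mdadm --examine` prints for each device and warns when more than one metadata version shows up. Mixed versions are only a hint that something like the above may be going on, not proof of it.

```python
import re
import subprocess
import sys

# Matches the "Version : 1.2" / "Version : 0.90.00" line in `mdadm --examine` output.
VERSION_RE = re.compile(r"^\s*Version\s*:\s*(\S+)", re.MULTILINE)


def parse_version(examine_output):
    """Return the metadata version string from examine output, or None if absent."""
    match = VERSION_RE.search(examine_output)
    return match.group(1) if match else None


def group_by_version(versions):
    """Turn {device: version} into {version: [devices]}."""
    groups = {}
    for device, version in versions.items():
        groups.setdefault(version, []).append(device)
    return groups


def examine(device):
    """Run `mdadm --examine` on one device and return its stdout (hypothetical helper)."""
    result = subprocess.run(
        ["mdadm", "--examine", device],
        capture_output=True,
        text=True,
        check=False,  # a device without a superblock exits non-zero; treat as "no version"
    )
    return result.stdout


if __name__ == "__main__":
    # Devices to inspect come from the command line; nothing is hard-coded.
    found = {dev: parse_version(examine(dev)) for dev in sys.argv[1:]}
    groups = group_by_version(found)
    for version, devices in groups.items():
        print(f"{version}: {', '.join(devices)}")
    if len(groups) > 1:
        print("WARNING: more than one MD metadata version found, look closer")
```

Usage would be something like `python check_md.py /dev/sdb1 /dev/sdc1 /dev/md0`, where the script name and device names are placeholders for whatever your system has.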