Kurt Seifried

SSD Failure modes

So everyone constantly talks about SSD failure modes and how they die suddenly with no warning and blah blah blah (honestly: everything dies, and sometimes the building burns down, so make sure you have off-site backups). But here is an interesting failure mode I found this week. I have two older Intel 80GB SSDs in a RAID 0 configuration that I use for scratch space (e.g. unpacking the Linux kernel and grepping for stuff). But the RAID 0 array was suddenly very, very slow. Since I was retiring that machine anyway and had to wipe it, I popped in the DBAN disk and started wiping all the drives. It said it would take 79 hours, which seemed a bit excessive:

DBAN screen showing two identical SSDs being wiped with very different results.

You may have missed it (I did the first time): there’s no “K” in the speed listing for the first SSD. It’s writing at 559240 BYTES/second, not Kilobytes. The second drive is writing at a reasonable ~56.5 Megabytes/sec but the first one is writing at 0.53 Megabytes/sec, about 100 times slower. This certainly explains why my RAID 0 was behaving so badly. The kicker is that there are no write or read errors from that first drive; it’s just really, really slow. So I guess the lesson to be learned is that you should check whether your RAID card/software can report drive-specific statistics, or periodically read/write test your SSDs individually if things start behaving oddly.
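If you want to do that kind of per-drive check by hand, here is a rough sketch in Python. It is not what DBAN or any RAID tool does internally; it just reads a chunk of each path sequentially, times it, and flags any drive that is far slower than the fastest one. The function names (read_throughput, slow_members), the 256 MiB sample size, and the 10x threshold are my own assumptions for illustration. Reading a raw device such as /dev/sda normally needs root, and the OS page cache can inflate the numbers on repeat runs, so treat the output as a hint rather than a benchmark.

```python
import sys
import time

MIB = 1024 * 1024


def read_throughput(path, total_bytes=256 * MIB, chunk_size=4 * MIB):
    """Sequentially read up to total_bytes from path; return bytes per second."""
    read_so_far = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while read_so_far < total_bytes:
            chunk = f.read(min(chunk_size, total_bytes - read_so_far))
            if not chunk:  # reached end of file/device early
                break
            read_so_far += len(chunk)
    elapsed = time.perf_counter() - start
    return read_so_far / elapsed if elapsed > 0 else 0.0


def slow_members(rates, factor=10.0):
    """Return names whose rate is more than `factor` times below the fastest."""
    if not rates:
        return []
    fastest = max(rates.values())
    return [name for name, rate in rates.items() if rate * factor < fastest]


if __name__ == "__main__":
    # Paths are supplied by the user, e.g. the individual RAID member devices.
    rates = {path: read_throughput(path) for path in sys.argv[1:]}
    for path, rate in rates.items():
        print(f"{path}: {rate / MIB:.2f} MiB/s")
    for path in slow_members(rates):
        print(f"WARNING: {path} is far slower than its peers")
```

Plugging in the two figures from the DBAN screen (559240 bytes/sec versus roughly 56.5 Megabytes/sec), slow_members would flag the first drive, since the gap is about 100x and well past the assumed 10x threshold.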