Wednesday, 21 August 2013

Fsck Errors on RAID 5 Enclosure

We have a server running Ubuntu with an eSATA-attached 5-disk enclosure.
The enclosure runs hardware RAID 5 and has (had?) an ext4 file system on
it.
In other words, the enclosure uses RAID 5 to combine the five drives so
that they appear to Ubuntu as a single large drive over an eSATA link.
I hope I've described that adequately.

The whole volume has been acting oddly, and I can't tell whether the
problem is the enclosure or one of the drives. The enclosure doesn't
indicate an issue with any of the drives, and even if one drive had a
problem, I'd expect RAID 5 to keep the array running until a second drive
failed.
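
For reference, that expectation rests on how RAID 5 parity is usually described: each stripe holds one parity block, the XOR of its data blocks, so any single missing block can be rebuilt from the rest. Below is a minimal sketch in Python, assuming plain XOR parity over one stripe of a 5-drive array. The block contents and the names xor_blocks and rebuild are made up for illustration, and it says nothing about how the Addonics enclosure actually implements RAID 5.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(stripe, lost):
    """Rebuild the block at index `lost` from the other blocks in the stripe."""
    survivors = [blk for i, blk in enumerate(stripe) if i != lost]
    return xor_blocks(survivors)


# One stripe on a hypothetical 5-drive array: four data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)
stripe = data + [parity]

# Losing any single block (data or parity) is recoverable.
for lost in range(len(stripe)):
    assert rebuild(stripe, lost) == stripe[lost]
```

With two blocks missing from the same stripe, the XOR of the remaining three does not determine either one, which is the two-drive limit mentioned above.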

Then, this morning, the drive representing the enclosure went offline
and won't come back.
I tried running fsck.ext4 on it, but that ultimately ends with:
Deleted inode 3489056 has zero dtime. Fix? yes
Inode 84410440 has an invalid extent node (blk 319823992, lblk 0)
Clear? yes
fsck.ext4: e2fsck_read_bitmaps: illegal bitmap block(s) for /dev/sdd
/dev/sdd: ***** FILE SYSTEM WAS MODIFIED *****
e2fsck: aborted
/dev/sdd: ***** FILE SYSTEM WAS MODIFIED *****

So my first question is where the problem lies: with the drives, with the
enclosure, or maybe with the eSATA controller?
If it's not clear, how do I find out? The individual drives aren't visible
to the OS (as far as I know), so I'm not sure how to determine which one
is faulty and should be replaced.
If it turns out to be one of the drives, why is the array failing?
Doesn't RAID 5 prevent that? Can I determine which drive has failed
and just replace it (I have a spare on hand)?

So, I guess my question boils down to this:
What is the likely issue, or what can I do to narrow it down? Bonus
points for how to recover from this, and for an explanation of why RAID 5
didn't protect me.

FWIW, the drives are Seagates and the enclosure is an Addonics ST55RPM
Storage Tower V.
