Oh good grief. I'm working on standing up a new web server to take the place of alpha and beta (the original pair, huddling in my sister's basement at the faaaaar end of a DSL line). I had scrounged disks, an enclosure, and an old RAID card ... all obsolete by present standards but more than enough for my needs.

I got the hardware configured and the OS built. It all looked pretty good, except that the machine would halt and stutter like an old man with Parkinson's. And the syslog had all sorts of ominous entries in it like this:

Feb 24 12:55:54 solstice kernel: scsi : aborting command due to timeout : pid 139871, scsi0, channel 0, id 0, lun 0 0x2a 00 01 04 0c b4 00 00 80 00
Feb 24 12:55:54 solstice kernel: scsi : aborting command due to timeout : pid 139872, scsi0, channel 0, id 0, lun 0 0x2a 00 01 04 0d 34 00 00 80 00
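
For the curious: the ten hex bytes trailing each entry appear to be the SCSI command block the kernel gave up on. What follows is only a rough Python sketch of decoding one, assuming the standard 10-byte command layout (opcode, flags, 4-byte block address, group, 2-byte transfer length, control) and assuming each message sits on a single line, so a pid that wrapped across two lines has to be rejoined first. The function name `decode_timeout_line` and the tiny opcode table are hypothetical, made up for illustration.

```python
import re

# A couple of opcode names from the SCSI block command set; far from complete.
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}

# Matches the tail of the kernel message: the device address, then 10 hex bytes.
PATTERN = re.compile(
    r"pid (\d+), scsi(\d+), channel (\d+), id (\d+), lun (\d+) "
    r"((?:0x)?[0-9a-f]{2}(?: [0-9a-f]{2}){9})"
)

def decode_timeout_line(line):
    """Decode one 'aborting command due to timeout' message, or return None."""
    m = PATTERN.search(line)
    if m is None:
        return None
    # int(..., 16) accepts both "0x2a" and "2a".
    cdb = bytes(int(b, 16) for b in m.group(6).split())
    return {
        "pid": int(m.group(1)),
        "host": int(m.group(2)),
        "channel": int(m.group(3)),
        "id": int(m.group(4)),
        "lun": int(m.group(5)),
        "op": OPCODES.get(cdb[0], hex(cdb[0])),
        "lba": int.from_bytes(cdb[2:6], "big"),     # starting block address
        "blocks": int.from_bytes(cdb[7:9], "big"),  # transfer length in blocks
    }
```

Read that way, both entries are 128-block writes to the device at id 0, and the second one starts exactly 128 blocks after the first: a plain sequential write stalling on its way to the array.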

One of the disks was going bad. Once I looked at the disk enclosure, the status lights made that plain toot sweet.

So, I pulled the disk. You can do that -- the disk and the enclosure are set up for "hot plugging". The stuttering and halting ceased immediately. And the plaintive wailing of the alarm on the RAID card started in at once.

I slapped in a replacement disk, and the array controller started rebuilding the contents of the "bad" disk on the replacement right away. But that did not satisfy the screechy little alarm god in his little plastic alarm house on the RAID card. I rebooted, went into the controller setup and told it to "silence alarm". Blessed peace. Until I had to restart the box to get out of the controller config. Then beeeeeep   beeeeeep   beeeeeep   beeeeeep   beeeeeep.

Next time I take solstice down for a restart I will be selecting "disable alarm", thankyouverymuch. Unless the array rebuild finishes first and quells the fury of the dime-sized tormentor currently filling the basement with his electronic caterwauling.

Le sigh. And to think, this is what I do to relax.


Feb. 25th, 2007 01:10 am (UTC)
...Yeah, some of the olde Compaq RAID cards (the IDA controllers) had a toggle switch that stuck out the back -- you could hit it and silence the damn thing.

The problem is that, if a disk goes legs up, you really do want to know about it. Next time I'll know to choose disable. Of course, I will then have to remember to re-enable the alarm once the problem is fixed.
