Imagine my great distress when I woke up Sunday morning the 17th, the big public opening, to a screen full of alert messages – our web site (and Art on Call) had been down since about 4:30 that morning. I had a terminal window open from the night before, so I quickly tried to restart one of the Apache servers — file not found?!! There were no files in the software folder. No files in the home folders for our websites. Panicked, I checked the logs: full of I/O errors for the drive. Trying to reboot left the machine completely unresponsive. AHHHHH!!!
I knew there were backups being made by the company we’re colocated with – Onvoy – but I’d never had to use them and didn’t quite know where to start. Some quick reconnaissance in our internal wiki told me the drive was a SCSI drive. Crap. On the 17th I knew just enough about SCSI to know I didn’t know enough to run out and buy a new drive on the spot – way too many options to wade through. A call to a local hardware store (General Nanosystems) confirmed my fears – “is it SCA or LVD?” “um…”
The SCSI interface (pronounced “scuzzy”) is really quite incredible. Most desktop machines use the IDE interface to connect their hard drives, which is all well and good for their needs, but production-quality servers need something more – something faster, more reliable, better engineered, and self-diagnosing… Enter SCSI drives.
I headed out to Onvoy with a pit in my stomach – even if I could get a new drive that day, I wasn’t confident I could learn, or find someone who knew, how to restore from the backups… Oh, did I mention it was the biggest day for the Walker since I started working here? The grand public re-opening?
Tune in next time for part two of the saga, in which our hero saves the day — but really only postpones disaster…