Exchange RAID Recovery
If your RAID crashed, read this before trying to rebuild!
A RAID that holds Exchange database files should be treated differently from one in a file server. It is serving a database, and the database is always open while Exchange is running.
When the RAID goes down for whatever reason, the database is not shut down properly and will have errors because of this. The damage can range from the extreme, 0 bytes in the file, to just a dirty shutdown. The problem is that you cannot know which, and if the database is mission critical, precautions must be taken to save it.
Everyone ASSUMES that their BACKUP IS GOOD. This is where the trouble starts. Never assume anything. Until you have restored the database to a DIFFERENT machine and it has been checked, treat the situation as if you have no backup.
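As a rough illustration of what "checked" can mean for a restored copy, here is a minimal Python sketch. It assumes you have already run Microsoft's eseutil tool in header-dump mode (eseutil /mh) against the restored file on the recovery machine and captured its text output, and that the output contains a "State:" line reporting a clean or dirty shutdown. The function name and return values are hypothetical, chosen only for this example. A clean header is a first check, not proof that the data inside is intact.

```python
import os
import re


def classify_restored_db(edb_path, header_dump):
    """Rough first check of a restored database copy (hypothetical helper).

    edb_path    -- path to the restored .edb file on the recovery machine
    header_dump -- text captured from a header dump of that file
                   (assumed to contain a line such as "State: Dirty Shutdown")
    """
    # The extreme case described above: the file exists but holds nothing.
    if os.path.getsize(edb_path) == 0:
        return "empty"

    # Look for the shutdown state reported in the header dump.
    match = re.search(r"^\s*State:\s*(.+?)\s*$", header_dump, re.MULTILINE)
    if match is None:
        return "unknown"

    state = match.group(1).lower()
    if "clean shutdown" in state:
        return "clean"
    if "dirty shutdown" in state:
        return "dirty"
    return "unknown"
```

Anything other than "clean" means the backup should not yet be counted as good.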
If a hot spare was not part of the RAID, or the hot spare did not kick in and rebuild the array on its own, DO NOT try to do a rebuild until you have backed up the system and checked the backup. If you find that the RAID went offline and did not run in a degraded state (1 drive missing), DO NOT attempt to bring it online. It failed for a reason, and the reason is that you have a serious problem: more than one drive failed and the RAID is now broken. Check your backups first on a recovery machine. Taking the easy way out can cost you everything if you have no recourse. Rushing under pressure will do the same thing.
If you have no backup, or you find that the backup you have is bad, you must clone the drives before you do anything to the RAID. Adding a drive for a rebuild will cause writes to the RAID no matter what else is done. The RAID has to scrub itself to correct any bad data pointers and to synchronize the parity across all the drives whenever a major change is made, and a rebuild onto a new drive is a major change. This means writing to the drives, and if the information is wrong, the pointers will point to incorrect data areas when the data is retrieved.
If the RAID is not online and you can’t see the data, then you definitely do not want to force a rebuild, because you don’t know what is going to be written to the drives.
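To see why a forced rebuild can write bad data, here is a minimal Python sketch of single-parity reconstruction. It assumes a simple XOR parity scheme of the kind used by RAID 5, which is an assumption made for illustration; real controllers differ in layout and behavior, and the block contents below are invented.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity block computed from them.
d0, d1, d2 = b"EXCH", b"DATA", b"BASE"
parity = xor_blocks([d0, d1, d2])

# Healthy rebuild: the drive holding d1 is lost, the other members are good.
rebuilt = xor_blocks([d0, d2, parity])
assert rebuilt == d1

# Broken array: the drive holding d2 is stale or damaged (a second failure).
stale_d2 = b"OLD!"
bad_rebuild = xor_blocks([d0, stale_d2, parity])

# The rebuild still "succeeds", but what it produces is not the lost data.
# On a real array this wrong block would be written to the new drive.
assert bad_rebuild != d1
```

The arithmetic has no way of knowing that one of its inputs is wrong, which is why the drives should be cloned before any rebuild is attempted.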
Exchange database files are sensitive enough without having bad information written into them. If the pointers to the file are off by just one, the file is no good and will be difficult to recover.
Clones of the drives must be made at this point. Any problems after this can be corrected by re-cloning the drive. What should be done to recover the data depends on the RAID, the controller card and the circumstances. This is where experience and knowledge help the most. Time is always critical when it comes to Exchange servers, and knowing what to do without trial and error can save days.
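Before working from a clone, it is worth confirming that it matches the original. The following is a minimal Python sketch, assuming both the original and the clone can be read as files or image files and that the original reads without errors; the function names are hypothetical. A drive with unreadable sectors will not hash cleanly and needs a proper imaging tool instead.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file or image in chunks so large drives do not fill memory."""
    digest = hashlib.sha256()
    # Opened read-only in binary mode: nothing is written to the source.
    with open(path, "rb") as source:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def clone_matches(original_path, clone_path):
    """Return True when the clone is byte-for-byte identical to the original."""
    return sha256_of(original_path) == sha256_of(clone_path)
```

If the hashes differ, re-clone before doing any recovery work on the copy.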
Call us at any time for help at 212-759-0946, or simply fill out a request form.

