Networker Disaster Recovery Fails

2007-12-25 11:50:00

Sun-Managers:

My original post is listed at the end of this message, but in a nutshell I
was asking about the problems that Networker has when completely restoring
a system from scratch. I had terrible problems with the system being
corrupt after I tried to recover from the Networker server.

Two people hit the solution right on the head, but each had a different way
of going about it, so instead of repeating what they said, I'll just
include their replies. To summarize: with the first (from Birger Wethne),
you boot off the CD-ROM, get networking running, then drop out of the
install and mount, load, and run the Networker client. With the second
(from Mark Fromm), you take the disk(s) out of the dead box, mount them on
a working Sun box, and run the Networker client there, relocating the
restored data to the newly mounted partitions.

The whole problem was trying to restore to a non-quiescent system. Most
likely the /etc and /usr directories contain files that should not be
touched at run time. Both solutions get around this by running from
something other than the target disk (the CD-ROM or a second machine),
mounting the target drive, and restoring to that mounted drive, which is
not being used by the running system.
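
One way to guard against this is to check, before starting a restore, that
the relocation target is not on the file system the machine is currently
running from. The sketch below is only an illustration: fs_of and
safe_target are hypothetical helper names, not part of Networker, and it
assumes that "df -k" prints the device in the first column of its second
output line.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Networker).

# Print the device backing the file system that holds path $1.
# Assumes "df -k" puts the device first on its second output line.
fs_of() {
    df -k "$1" 2>/dev/null | awk 'NR == 2 { print $1 }'
}

# Succeed only if $1 is on a different file system than the running root.
safe_target() {
    target_fs=`fs_of "$1"`
    root_fs=`fs_of /`
    [ "$1" != "/" ] && [ -n "$target_fs" ] && [ "$target_fs" != "$root_fs" ]
}

# Restoring straight onto / is the case that went wrong, so refuse it.
if safe_target /; then
    echo "target looks safe"
else
    echo "refusing: target is on the live root file system"
fi
```

A restore script could call safe_target on the relocation directory (for
example /a) and stop if the check fails.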

It appears that many people have had problems with disaster recovery and
Networker, but with these two solutions I've successfully restored my
system without any problems, and it's back to 100%. Thanks also to all of
the following people who responded:

Rich Kulawiec (rsk@gsp.org)
ganeshan@gcs.com.au
Niall Obroin (nobroin@sced.esoc.esa.de)
Steve Boronski (spb@stoke.gov.uk)
James Wendling (jbwendl@bnpcn.com)
Sean Ward (sdward@uswest.com) (Thanks Sean for the detailed help!)

==============================================

(Mark Fromm's response)

Greetings,

The way I have accomplished this (total recovery of networker client) is:

1. Take boot disk for hosed Networker client, put on a working Networker
     client with the same OS and architecture (so the
     /usr/platform/"arch"/lib/fs/ufs/bootblk file matches)

2. Partition disk, create file systems on the disk using the working
     client.

3. Restore the "dead" client's files (including OS) using a recover path
     pointing to the mounted disk drive on the working client.

4. Install boot block on the drive. Example:

     installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c1t0d0s0

5. Pull disk off working client, install in dead client, cross fingers and
     try booting.
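
Steps 2 through 4 above can be sketched as a dry run that only prints the
commands, so nothing is touched until you have checked the device names.
This is an illustration, not Mark's actual script: DISK follows the
c1t0d0 example above, MNT is a hypothetical mount point, only the root
slice is shown, and partitioning (done interactively with format) is left
out.

```shell
#!/bin/sh
# Dry run: prints the commands for steps 2-4 instead of executing them.
# DISK and MNT are hypothetical; adjust them to the borrowed disk.
DISK=c1t0d0            # the dead client's disk as seen by the working client
MNT=/mnt/deadclient    # hypothetical temporary mount point

plan() {
    echo "newfs /dev/rdsk/${DISK}s0"
    echo "mkdir -p ${MNT}"
    echo "mount /dev/dsk/${DISK}s0 ${MNT}"
    echo "# recover the dead client's files here, relocated to ${MNT}"
    echo 'installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk' "/dev/rdsk/${DISK}s0"
    echo "umount ${MNT}"
}

plan
```

Other slices (/usr, /opt and so on) would get the same newfs and mount
treatment under ${MNT} before the restore is started.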

For what it's worth, I have had headaches doing this when GNU dd was first
in the path; it screws up installboot. Make sure /bin/dd is the dd found
first in your path before using "installboot".
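
A quick way to check this before running installboot, as a minimal sketch:
it only changes PATH for the current shell session, and the exact wording
printed by "type" differs between shells.

```shell
#!/bin/sh
# Put /bin at the front of the search path for this shell session only,
# so the system dd is found before any GNU dd installed elsewhere.
PATH=/bin:$PATH
export PATH

# Report which dd the shell will now run; the path shown should be /bin/dd.
type dd
```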

Hope that helps. Good luck.

Mark

================================================

(Birger Wethne's response)

It could be that the root file system isn't at the same patch level as the
restored /usr. There could also be problems if the new and old Box A are
different hardware.

If the new Box A is the same kind of hardware, try the following restore
procedure:

- Boot machine from Solaris 2.x CD. Answer all questions until you get
  'system identification completed' or similar. Then break out of
  suninstall.

- Mount Networker client software from a file server

- Partition/newfs disk

- Mount all partitions to be restored under /a (/ as /a, /usr as /a/usr,
  etc)

- Run nwrecover. Select all partitions to be restored. Relocate restore to
  /a

- Run installboot
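
The newfs and mount steps above can be sketched as a dry run that prints
the commands in the order they would be needed. The slice names and the
list of file systems are hypothetical and must be replaced with the real
layout of the machine being restored; the bootblk path is the one from
Mark Fromm's installboot example.

```shell
#!/bin/sh
# Dry run: prints the commands instead of executing them.
# The slice-to-mount-point table is hypothetical; list / first so that
# /a is mounted before /a/usr and /a/opt are created under it.
LAYOUT="c0t3d0s0 /
c0t3d0s6 /usr
c0t3d0s5 /opt"

plan() {
    # Create a fresh file system on every slice.
    echo "$LAYOUT" | while read slice mp; do
        echo "newfs /dev/rdsk/$slice"
    done
    # Mount root on /a, then everything else beneath it.
    echo "$LAYOUT" | while read slice mp; do
        if [ "$mp" = "/" ]; then
            echo "mount /dev/dsk/$slice /a"
        else
            echo "mkdir -p /a$mp"
            echo "mount /dev/dsk/$slice /a$mp"
        fi
    done
    echo "# run nwrecover here, relocating the restore to /a"
    echo 'installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t3d0s0'
}

plan
```

The ordering matters because a freshly created root file system has no
/usr or /opt directory until it is mounted on /a and the mount points are
made inside it.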

This has worked perfectly for me. I have even restored SunOS 4.x systems
using the same method. When restoring a SunOS 4.x machine, use a Solaris
2.x CD and this procedure, but run the 4.x installboot from the restored
disk as the last step.

With this method you are back to exactly the same / file system as well.
And you save a lot of time by not first installing the OS just to restore.

Birger

===================================================

MY ORIGINAL MESSAGE:

We're currently running Solstice Networker 4.2.6 on an Ultra-Enterprise
4000 with an external single DLT tape drive and an external DLT4700
jukebox. We have about 8 Sun (Solaris 2.4 - 2.5.1) clients backing up to
this machine. All backups work perfectly, 100% every time. The problem
comes when restoring a system.

For instance, I get in a new machine (Sparc Server 5) and install Solaris
2.5.1 and the Networker client onto it. I'll shut down the original "Box
A" machine, and make this new machine "Box A". I then go into nwrecover
and select "Box A" as the client, and last night's backup as the time
frame I want for the files. I then select all relevant file systems, such
as /usr, /usr/openwin, /etc, /opt, etc. After the client has finished
restoring all files, I then type "reboot".

Now the problems start: First, when the system is shutting down, it stops
when it's saying "syslogd: terminating on signal 15" (or something similar)
- the system just hangs. I waited at least 10 minutes: no drive activity,
nothing. I attempted a Stop-A, and it didn't even respond. I power cycled
the machine. It now comes up with the following errors:

not found:pageout_reserve
not found:pageout_reserve
not found:po_share
not found:po_share
not found:po_share
krtld: error during initial load/link phase
Memory Address not Aligned
Type help for more information
ok

And that's it. I've tried mounting the /etc directory after booting from
cdrom into single user mode. I thought the vfstab file had been restored
from the old machine, but it was safe (and correct). I also checked the
/etc/system file - nothing in there at all (as it should be), so it's not
trying to load modules or metadevice information.

The Networker manual (the 4.2.6 version at least) isn't very informative
when it comes to Disaster Recovery; it covers only file-related restores,
not whole systems. This seems a huge issue for our DR plan. I've tested
the memory in the boot prom and it checks out ok.

Thanks for any help that can be offered. Will summarize immediately.

--Damon
