doing the impossible data recovery

2007-12-25 12:00:00

My question was:

-------------------- Begin Original Post --------------------

I've got a user who managed to blow away his entire home directory *and*
the online backup copy he had of it. The tape backup doesn't have his
critical files because of a permission problem. (My fault and his.)
He's hurtin'. Like 6-12 months of lost software development work.

On his online backup, he blew it away with an 'rm -r' on a local Solaris
2.5 file system. He hasn't touched that disk, since. It hasn't been
unmounted. Is there any way in hell of recovering those files, even if
all the file names are lost? I know this is a reach but I have to try.
What if I use dd and extract the raw data off the entire disk/partition,
dump it into a giant file, and sort through it, later? Any better
ideas? I think I've heard of places you can send disks to for emergency
recoveries like this ... ?

-------------------- End Original Post --------------------

I will append all the responses I got to the end of this message. But
here's a summary of what happened:

The #1 place people pointed me to for data recovery was OnTrack Data
Recovery (800-752-7557), so I gave them a call, before I did anything.
They said that chances of recovery from a UNIX "rm -r" were very slim,
but it might be worth a shot. After further discussion and a question
directed at one of their engineers, their response was that all they
could give me would be an ASCII dump of all the data on the disk, for me
to sort through. I could do that, myself, so they said to go for it.

Things looked pretty grim. I went over some possible options with the
user. The data that had to be recovered was a bunch of C++ source
files. I asked him if he had any printouts. No. Had he emailed any of
it to his peers for review? No. Had he made a distribution tape and
sent that off to a customer? No. Was an [old] copy of any of it on
another disk or server from any type of backup or cache or anything? No.

This all started when he went to make his own backup of his home
directory to his backup disk, "/home1". He would go into the Sun File
Manager and drag-n-drop his home directory over to /home1. But since he
also uses a PC, and can never remember which one moves/copies
with/without the "Control" key held down, he managed to MOVE his home
directory instead of COPY it, this time.

Well, when he went over to /home1, he noticed that the backups he had
been making this way for months had been stacking up on one another.
Say his home dir name was "someuser". /home/someuser was what he was
copying. What he had was /home1/someuser/someuser/someuser... each one
had its own subdirectory tree containing older and older backups.

He saw that mess, thought to himself, "This is ugly! I don't need all
these backups!" And proceeded to remove all the copies through File
Manager. It wouldn't let him, or it asked him for too many confirmations,
so he got sick of that and went to the shell and did the dreaded
"rm -rf /home1/someuser". Poof. Then he went to make a second, clean
backup copy of /home/someuser, only to discover it was gone -- since it
had been moved instead of copied.

He panicked, but figured he would be saved by the tape backup system
that ran every night. But alas, when I restored from tape, we
discovered that most of the important stuff wasn't there! This requires
a diversion in the story, so bear with me. When I set up this
department's backup system, we had a long meeting to go over the
convenience/security aspects, and they [including this user!] decided it
was important to be able to easily EXCLUDE things from being backed up,
because at least two people on the network had sensitive projects they
were backing up themselves, and they didn't want those things ending up
on an easily-stealable tape down the hall. Also, having /.rhosts was
too insecure. And we had tried some fancier backup packages that used
their own net protocol, and hated them all. (That's yet another story.)

So the solution we all agreed upon was to use NFS. Not blazing fast,
but not too bad for doing reads. (NFS writes are what's really slow.)
Each client on the network exported all its partitions to the backup
server, with standard "ro" (read only) privileges. A little more
secure than /.rhosts. I wrote a script on the backup server that, for
each client partition, mounted it, backed it up, and unmounted it. Easy
enough. This was all done as root, but of course root doesn't have
special read privileges on NFS mounts unless that's explicitly allowed.
Which it wasn't. So the caveat I drilled into their heads was that THEY
HAD TO HAVE DIRS AND FILES WORLD READABLE IN ORDER FOR THEM TO BE BACKED
UP. They all said they understood. I felt nervous.
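
The world-readable rule lends itself to an automated check. What follows
is a minimal sketch in Python, not the backup script itself:
`not_world_readable` is a hypothetical helper that walks a tree and
reports anything the "other" permission class cannot read. That
approximates what a root user without special NFS privileges would miss,
assuming the identity it is mapped to matches neither the owner nor the
group of the files.

```python
import os
import stat


def not_world_readable(top):
    """Yield paths under `top` that lack 'other' read access.

    Directories also need the 'other' execute (search) bit, otherwise
    nothing beneath them can be reached. `top` itself is not checked.
    """
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # a symlink's own mode bits are not meaningful
            needed = stat.S_IROTH
            if stat.S_ISDIR(mode):
                needed |= stat.S_IXOTH
            if mode & needed != needed:
                yield path
```

Run by the owner (or by local root) against a home directory, it lists
exactly the paths that would have to be opened up, or deliberately left
out, before trusting the tape.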

My backup script happens to print an error for every file it hits "permission
denied" on. So one of the first things I intended to do was send each
user an email message containing a list of all the dirs/files that were
not being backed up, for them to double check. But I got sidetracked
and never did this. My bad. Adding to the regret factor was that just
days before this whole disaster happened, I also intended to change the
"ro" to "root" in /etc/dfs/dfstab everywhere, because this whole
situation of needing to exclude certain data had vanished. But, alas, I
was too late.
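
The per-user report that never got written could look something like the
sketch below. It is only an illustration: the log line format
(`<path>: Permission denied`), the `/home/<user>/...` layout, and the
function name `denied_by_user` are all assumptions, since the backup
script and its output are not shown here.

```python
from collections import defaultdict


def denied_by_user(log_lines, home_root="/home"):
    """Group 'permission denied' paths by the home directory they sit in.

    Assumes each relevant line looks like '<path>: Permission denied'
    and that home directories live directly under `home_root`.
    """
    prefix = home_root.rstrip("/") + "/"
    report = defaultdict(list)
    for line in log_lines:
        path, sep, message = line.strip().rpartition(": ")
        if not sep or message.lower() != "permission denied":
            continue  # not a permission error line
        if not path.startswith(prefix):
            continue  # outside the home tree; no obvious user to notify
        user = path[len(prefix):].split("/", 1)[0]
        if user:
            report[user].append(path)
    return dict(report)
```

Each list in the result could then be mailed to the matching user after
every nightly run.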

Anyway, as you've guessed by now, this user's critical C++ source dirs
were not world readable. So they weren't on the backup tape.

Our only saving grace was that when this user did the massive delete, he
had a bunch of text editors open on his screen, each one with one of his
crucial C++ files in it which he was working on. He was bright enough
to "Save As" those files to a safe place, not on /home1, and that saved
his bacon. He estimated that alone bumped the lost work down from 12
months to 2 months. Now it was up to us to pull as much as possible
from the unwiped data on the /home1 disk. That was his other saving
grace -- that /home1 disk was inactive, and with four to five copies of
every file (remember the nested backups) on there, we had a good chance
of pulling something useful off it.

By the time I got to the user's machine, the disk in question had been
mounted for over 24 hours, and he had been all over it (with reads
only). Some people [see below] told me to power off the disk, and
others said to unmount it as normal. I figured at this point, I should
unmount it because it had been on so long, anyway, and I didn't want
fsck running on it during reboot, which might do god-knows-what to the
critical bits on that disk.

So I umount'd the disk, and went to work. It's a 1 GB disk, one big
partition. I did 'dd if=/dev/rdsk/c0t1d0s2 of=/remov/home1.raw' (/remov
is a 4GB removable SCSI-2 hard drive with nothing on it.) This is on a
Sun Ultra, and it took an hour! My next step was to 'split -b' the .raw
file into 64MB chunks, a more manageable size. This split command was
going to take an hour, also! I couldn't understand why it was going so
slow (around 300 KB/sec for local disk access?!?). For kicks, I ftp'd
the .raw file over to an SGI, and did the split there. Ran 10 TIMES as
fast, finishing in 7 minutes. But then my next step, which was to run
"strings" on each piece, started taking forever on the SGI (a new
Indigo2)! So I hauled it back over to the Ultra, which blitzed through
the "strings" process in 10 minutes or less. Go figure.
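
For anyone repeating this without "split" and "strings" at hand, the
filtering step can be approximated in Python. This is a sketch of the
general technique, not what was run in 1996: `extract_strings` is a
hypothetical helper, the 4-character minimum is an assumed threshold, and
tabs are treated as non-printable to match the tab-stripping mentioned
later in this write-up (other "strings" implementations may keep them).
Running "strings" on separately split pieces can cut a string in two at
each boundary; the sketch carries an unfinished run over to the next
chunk instead.

```python
import re

# Runs of at least 4 printable ASCII characters (space through tilde).
# Tabs and newlines are deliberately left out, so they end a run.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def extract_strings(image_path, chunk_size=1 << 20):
    """Yield printable runs from a raw image, reading it in chunks."""
    tail = b""
    with open(image_path, "rb") as image:
        while True:
            chunk = image.read(chunk_size)
            data = tail + chunk
            if not chunk:
                # End of file: whatever is left over is complete.
                for match in PRINTABLE_RUN.finditer(data):
                    yield match.group().decode("ascii")
                return
            # Hold back a trailing printable run; it may continue
            # in the next chunk.
            cut = len(data)
            while cut > 0 and 0x20 <= data[cut - 1] <= 0x7E:
                cut -= 1
            for match in PRINTABLE_RUN.finditer(data[:cut]):
                yield match.group().decode("ascii")
            tail = data[cut:]
```

Writing each yielded run on its own line gives a text file comparable to
the "strings" output described above, one that can be opened in an editor
and searched.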

What we ended up with was 16 files of 12MB to 40MB each, full of lines
of text to wade through. I opened emacs full screen and we went for it.
The key thing to search for turned out to be "//", which hit the
occasional URL, but also hit all C++ comments. This user had a habit of
putting the file name in a comment as the first line of every file,
which turned out to be very helpful. It gave us a BOF marker as well as
a filename. We found several of his files fairly soon, and a Makefile
or two. Cluttering the mess was a huge source tree for a CAD package he
has a source license for, which is also all in C++. But almost all of
the CAD source had TABS for indentation, and he used spaces. The tabs got
filtered out by "strings", so all the CAD source stuck out because it
was all crammed up against the left margin. This made it much easier to
human-scan for his source, because it looked so different.
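
The hunt for first-line filename comments is easy to mechanise. The
sketch below is illustrative only: `find_file_markers` is a hypothetical
helper, and both the comment layout (`// name.ext` alone on a line) and
the list of extensions are assumptions, since the user's actual header
comments are not reproduced here.

```python
import re

# A line that starts with "//" followed by something that looks like a
# C++ file name. The extension list is a guess; adjust it to the project.
MARKER = re.compile(r"^//\s*([\w./-]+\.(?:cc|cpp|cxx|C|h|hh|hpp))\s*$")


def find_file_markers(lines):
    """Return (line_number, filename) for each line that looks like a
    first-line filename comment. Line numbers start at 1."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = MARKER.match(line.rstrip("\n"))
        if match:
            hits.append((number, match.group(1)))
    return hits
```

The resulting line numbers act as candidate beginning-of-file positions;
with several nested copies of every file on the disk, the same name is
expected to turn up more than once.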

I believe the Solaris/SYSV block size is 8KB, which is pretty big
compared to source code. So we got several complete files. There was
also a huge swap file on that disk which added to the clutter. (I now
want to learn how to fill a swap file full of zeros, which could have
been done before the raw dump.) We spent three 4-hour afternoons, a
total of 12 long hours. When we were all done, he estimated he was down
to 2 weeks or less of lost work. We saved the "strings" files for him
to use later if he wanted to search for something explicitly that we
might have missed.

I came out of this with a much clearer understanding of just how much
data 1 gigabyte is. And an even stronger appreciation for the
importance of good backups. I quickly did the "ro-->root" change
everywhere so the backups are working properly, now. Life goes on.

-------------------------------------------------------------------------
Paul Caskey mailto:pcaskey@swcp.com http://www.swcp.com/pcaskey
-------------------------------------------------------------------------
"Even if you're on the right track,
 you'll get run over if you just sit there." --Will Rogers (1879-1935)

---------------------- Responses Follow [Edited] ----------------------

>From bonzo@swcp.com Wed Dec 18 17:19 MST 1996
Date: Wed, 18 Dec 1996 16:19:21 -0800
From: Bonzo Amin <bonzo@swcp.com>
Subject: Re: doing the impossible data recovery

Paul Caskey wrote:
>
> I've got a user who managed to blow away his entire home directory *and*
> the online backup copy he had of it. The tape backup doesn't have his
> critical files because of a permission problem. (My fault and his.)
> He's hurtin'. Like 6-12 months of lost software development work.

Ow ow ow ow ow ow ow!!!

> On his online backup, he blew it away with an 'rm -r' on a local Solaris
> 2.5 file system. He hasn't touched that disk, since. It hasn't been
> unmounted. Is there any way in hell of recovering those files, even if
> all the file names are lost? I know this is a reach but I have to try.
> What if I use dd and extract the raw data off the entire disk/partition,
> dump it into a giant file, and sort through it, later? Any better
> ideas?

If I were considering such things, I'd probably power off the disk first
and think about umounting it or shutting the system down later. In
reality, that probably wouldn't matter much because the system probably
synced that disk as much as it's going to no later than fifteen seconds
after the damage was done. In any case, you should probably take that
disk offline or at least mount -r -o remount as soon as possible if you
want to recover anything from it. I'd be thoroughly paranoid of some
random background process scribbling some trifle on the disk and blowing
the whole gig.

I think there used to be tools out there like fsdb or somesuch that
would help you pick through disk internals, but I've never messed with
them. I don't know if any of them know how to deal with UFS disks, but
it's probably worth a look. I wish you could get fsck to do something
useful in this case.

> I think I've heard of places you can send disks to for emergency
> recoveries like this ... ?

I've seen them listed in the Computer Shopper, but haven't heard
anything about any of them. Chances are Excite will find you at least
a couple. There's probably quite a few around, but at this point in
time most of them probably specialize in Novell server disks. If you
can find one to do the job, it'll probably save you a lot of personal
grief in that you won't have to pull your hair out trying to learn file
system internals while under fire.

Good luck. These things happen, but they suck.

(:.:)

>From janice@pinata.West.Sun.COM Wed Dec 18 17:20 MST 1996
Date: Wed, 18 Dec 1996 17:18:38 -0700 (MST)
From: "Janice Anthes [Sun New Mexico SE]" <janice@pinata.West.Sun.COM>
Reply-To: "Janice Anthes [Sun New Mexico SE]" <janice@pinata.West.Sun.COM>
Subject: Re: doing the impossible data recovery
To: pcaskey@bassetbyte.com

Before you panic. Go to Sun's online Catalyst
catalog. http://cataylst.sun.com
and do a search on Data Recovery.

I found several companies who claim they provide this
service.

Good luck.

Janice

=============================================
       Janice Anthes, Systems Engineer
         janice.anthes@west.sun.COM
           Albuquerque, New Mexico
Phone: (505) 262-5204 FAX: (505) 268-5264
=============================================

>From mstier@mindspring.com Wed Dec 18 18:07 MST 1996
X-Sender: mstier@mindspring.com
Mime-Version: 1.0
Date: Wed, 18 Dec 1996 20:09:04 -0500
To: pcaskey@bassetbyte.com (Paul Caskey)
From: Matthew Stier <mstier@mindspring.com>
Subject: Re: doing the impossible data recovery
Content-Type: text/plain; charset="us-ascii"
Content-Length: 1252

DD'ing seems to be the only solution. However be prepared to do a LOT of
scavenging.


--
Matthew Stier
mstier@mindspring.com
http://www.mindspring.com/~mstier

>From Ian_MacPhedran@mackenzie.usask.ca Wed Dec 18 18:57 MST 1996
X-Authentication-Warning: imhotep.USask.Ca: macphed owned process doing -bs
Date: Wed, 18 Dec 1996 19:57:22 -0600 (CST)
From: Ian MacPhedran <Ian_MacPhedran@mackenzie.usask.ca>
X-Sender: macphed@imhotep
To: Paul Caskey <pcaskey@bassetbyte.com>
Subject: Re: doing the impossible data recovery
In-Reply-To: <199612182345.QAA17145@sherlock.cmc.sandia.gov>

Well, if no one else is writing to that disk, it may still be okay. (It
might have been better to dismount it immediately to protect the data on
it.)

You can try to sort out stuff from a copy of the raw disk, it might work
for you.

As you say, there are places which do this sort of thing - you might want
to do a search through the web:
http://www.datarec.com/ - Data Recovery Labs
http://www.cbltech.com/ - CBL Data Recovery Technologies
http://www.vantagetech.com/ - VANTAGE Technologies, Inc
http://www.mind.net/adr/adr.htm - Advanced Data Recovery Inc
etc.
(Note: I have not used the services of any of these vendors - I just did a
search, and these came back as ones offering this service. You don't say
where you are located - you will want to find a vendor close to you.)

Ian.
----------------------------------------------------------------------------
Ian MacPhedran, Engineering Computer Centre, 2B13 Engineering Building,
University of Saskatchewan, 57 Campus Drive, Saskatoon SK S7N 5A9, CANADA
Phone: (306)966-4832 Fax: (306)966-5205 Email: Ian_MacPhedran@engr.USask.CA

>From danno@fv.com Wed Dec 18 19:02 MST 1996
X-Authentication-Warning: mailrus.fv.com: danno owned process doing -bs
Date: Wed, 18 Dec 1996 21:04:24 -0500 (EST)
From: Dan Pritts <danno@fv.com>
To: Paul Caskey <pcaskey@bassetbyte.com>
Subject: Re: doing the impossible data recovery
In-Reply-To: <199612182345.QAA17145@sherlock.cmc.sandia.gov>

On Wed, 18 Dec 1996, Paul Caskey wrote:

call ontrack data recovery. I don't know their phone number but I bet it's
in 800 information.

If the disk has not been touched since this happened, the chances
are very good that they can get stuff back. It won't be cheap, but it
will be cheaper than 6 months of lost work.

I am afraid i don't know whether to suggest that you umount the disk, or
just power off the system. Call ontrack quick, though.

dan pritts
Unix System Admin First Virtual Holdings, Inc.
danno@fv.com 313-213-3791

>From shifter@portal.stwing.upenn.edu Wed Dec 18 19:59 MST 1996
From: Shifter <shifter@portal.stwing.upenn.edu>
Subject: Re: doing the impossible data recovery
To: pcaskey@bassetbyte.com
Date: Wed, 18 Dec 1996 22:01:51 -0500 (EST)
In-Reply-To: <199612182345.QAA17145@sherlock.cmc.sandia.gov> from "Paul Caskey" at Dec 18, 96 04:45:40 pm

If the disk hasn't been written to since, I would do a dd on
the raw device, and put that into a file (on a separate
filesystem). Then you could use emacs and go thru the
tedious cutting and pasting things out of that big file and
into smaller files.

-John

--

Shifter
shifter@portal.stwing.upenn.edu

>From 6@swcp.com Wed Dec 18 21:36 MST 1996
Date: Wed, 18 Dec 1996 20:38:17 -0800
From: 6 <6@swcp.com>
Organization: Light Dreams
MIME-Version: 1.0
To: Paul Caskey <pcaskey@bassetbyte.com>
Subject: Re: doing the impossible data recovery
References: <199612182345.QAA17146@sherlock.cmc.sandia.gov>

Paul Caskey wrote:
>
> What if I use dd and extract the raw data off the entire disk/partition,
> dump it into a giant file, and sort through it, later? Any better
> ideas? I think I've heard of places you can send disks to for emergency
> recoveries like this ... ?

If it is text there is a pretty good chance at salvaging ALOT of it.

First TURN OFF THE DISK; do not unmount, get it out from under OS control.

Next, DD the raw disk device out to tape.

You might be able to get away with just going through the DD and
extracting text files from there. Other options are to go at it sector
by sector Pulling the information out.

>From jefi@kat.ina.de Thu Dec 19 01:53 MST 1996
From: Jens Fischer <jefi@kat.ina.de>
Date: Thu, 19 Dec 1996 09:55:06 +0100
To: pcaskey@bassetbyte.com
Subject: Re: doing the impossible data recovery
X-Sun-Charset: US-ASCII

Hi Paul,

have a look at man fsdb and man fsdb_ufs. fsdb is a tool for examining
and reconstruction of damaged filesystems. However, it will not be easy
to reconstruct your data as you need alot of knowledge about filesystem
structures.

Hope that helps

Regards - Jens Fischer

>From harvey@iotek.ns.ca Thu Dec 19 08:02 MST 1996
Date: Thu, 19 Dec 1996 11:08:29 -0400 (AST)
From: Harvey Wamboldt <harvey@iotek.ns.ca>
To: Paul Caskey <pcaskey@bassetbyte.com>
Subject: Re: doing the impossible data recovery
In-Reply-To: <199612182345.QAA17145@sherlock.cmc.sandia.gov>

On Wed, 18 Dec 1996, Paul Caskey wrote:

> I've got a user who managed to blow away his entire home directory
> *and* the online backup copy he had of it.

> ...

> What if I use dd and extract the raw data off the entire disk/partition,
> dump it into a giant file, and sort through it, later? Any better
> ideas? I think I've heard of places you can send disks to for emergency
> recoveries like this ... ?

I'm no expert on Unix data recovery, but on PCs it's fairly trivial to
read a disk block, then run it through a filter which decides if it is
text, and if it is, write that text block into a file. Then later,
with a text editor, you can visually stitch your text files back
together. This doesn't work for binaries though, and you have to
handle the partial text blocks at the end of files intelligently. It
is fairly simple to "score" a block of text based on letter pair
frequencies, ie pairs such as "th", "ed", "es" score high while "3x",
"u{" etc score low. These blocks can even be sorted on "long words"
to move related information closer to each other. I don't have any
programs to help with this, but the crypto guys might.

Best of luck,

-H-

Harvey M Wamboldt ^ E-Mail: harvey@iotek.ns.ca
MDA Inc 1000 Windmill Rd. Suite 60 ^ Fax: (902)468-2278
Dartmouth NS, B3B 1L7, Canada ^ Phone: (902)481-3531
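
Harvey's block-scoring idea can be sketched in a few lines of Python.
This is an illustration under loud assumptions: the set of common English
letter pairs is a small hand-picked sample rather than a real frequency
table, `score_block` is a hypothetical name, and source code will score
differently from English prose, so any cut-off would have to be found by
trial.

```python
# A small hand-picked set of common English letter pairs; a real tool
# would use a frequency table built from known-good text.
COMMON_PAIRS = frozenset(
    "th he in er an re ed on es st en at to nt nd".split()
)


def score_block(block):
    """Return the fraction of adjacent character pairs in `block` (bytes)
    that are common English letter pairs; 0.0 for blocks under 2 bytes."""
    text = block.decode("latin-1").lower()
    if len(text) < 2:
        return 0.0
    hits = sum(
        1 for i in range(len(text) - 1) if text[i:i + 2] in COMMON_PAIRS
    )
    return hits / (len(text) - 1)
```

Reading a raw image one file system block at a time and sorting the
blocks by this score would push likely text toward the top of the pile
and binary clutter toward the bottom.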

>From iv08480@issc02.mdc.com Thu Dec 19 09:47 MST 1996
Date: Thu, 19 Dec 1996 10:48:35 -0600
From: iv08480@issc02.mdc.com (Colin Melville)
To: pcaskey@bassetbyte.com
Subject: Re: doing the impossible data recovery
X-Sun-Charset: US-ASCII
Content-Type: text
Content-Length: 2218

Paul,

Don't know about the dd stuff, sounds like a real long-shot.

Search the web for data recovery, I know they're out there. If you can't
find anything, send me a note, I'll call our local HP engineer. He
mentioned a disk recovery service he had to use for a platter crash
once...very expensive (multi K$!!).

Good luck,
Colin

%)====================================================%)
%) Colin Melville | cmelville@bigfoot.com %)
%) UNIX Systems Administrator| NTS Technical Services %)
%) UNIX Server Support Team | http://www.ntstech.com %)
%) %)
%) Views expressed are my own. %)
%) %)
%) Supporting: McDonnell Douglas Aircraft Corp. %)
%) http://www.mdc.com %)
%)====================================================%)

>From foster@bial1.ucsd.edu Thu Dec 19 10:31 MST 1996
From: foster@bial1.ucsd.edu
Date: Thu, 19 Dec 1996 09:32:54 +0800
To: pcaskey@bassetbyte.com
Subject: Re: doing the impossible data recovery
X-Sun-Charset: US-ASCII
Content-Type: text
Content-Length: 302

Try OnTrack Data Recovery. It's a bit expensive, but not when you're
talking about 6mo. of software development.

800-752-7557

They were able to recover all files from an optical disk that had its
file table trashed! I think they could recover your files for you.

Dave Foster
foster@bial1.ucsd.edu

>From jk@stallion.ee Fri Dec 20 12:42 MST 1996
Date: Fri, 20 Dec 1996 21:44:09 +0200 (EET)
From: Jyri Kaljundi <jk@stallion.ee>
X-Sender: jk@nebula
To: Paul Caskey <pcaskey@bassetbyte.com>
Subject: Re: doing the impossible data recovery
In-Reply-To: <199612182345.QAA17145@sherlock.cmc.sandia.gov>

Hi Paul,

On Wed, 18 Dec 1996, Paul Caskey wrote:

> I've got a user who managed to blow away his entire home directory *and*
> the online backup copy he had of it. The tape backup doesn't have his
> critical files because of a permission problem. (My fault and his.)
> He's hurtin'. Like 6-12 months of lost software development work.

There is one great company in Norway that does data recovery. Have a look
at their web site at http://www.ibas.no/ or e-mail ibas@ibas.no

Usually they recover disks after crashes and disasters, but rm -rf might
have a solution also. It is not cheap, beginning from 1000-2000 US
dollars. But then 6-12 months work is not cheap either.

I hope they can help you,

Juri
