WANTED: Unix File Expiration Utility

2007-12-25 8:14:00

REPLIES/RECOMMENDATIONS/SUGGESTIONS:

1) Use: quot -v

        or quot -av

        or a diskhogs script that uses quot and massages its output.

> You can massage quot to report

> which users haven't accessed so many blocks of storage

> in 30, 60, 90 days.

        --

> Try the "quot -av" command. It returns three columns of

> disk capacity not accessed in the previous 30, 60 and 90 days

> associated with each user. We then "publish" the list of major

> offenders and let peer pressure take care of deleting the stale

> files. In our case, assuming that a stale file should be

> automatically deleted would get me into hot water!

>> This doesn't really do what we need.

2) Use: find . -size +1000 -exec ls -l {} \;

> findbig

>> Neither does this.

3) Use: find /usr/dir -atime -90 -print | xargs rm

        This reply suggested...

> You need to use a -90, not a +90

> xargs optimizes the command line to rm, exec spawns an rm for each

> file.

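>> Contrary to their suggestion, we found +90 to be correct: +90 selects files

>> last accessed more than 90 days ago, while -90 selects files accessed since then.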

>> Their recommendation to use xargs was helpful.
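
A minimal sketch of the +90 sweep discussed in item 3, assuming a find that
supports the POSIX "-exec ... {} +" form (older systems may only have xargs).
The helper names list_stale and purge_stale and the scratch directory are
hypothetical, for illustration only; point the functions at a real tree only
after reviewing the hit list.

```shell
#!/bin/sh
# list_stale and purge_stale are hypothetical helper names, not existing tools.

# Print regular files under DIR ($1) whose last access is more than DAYS ($2) days old.
list_stale() {
    find "$1" -type f -atime "+${2:-90}" -print
}

# Remove the same set of files. "-exec ... {} +" batches many paths into
# each rm invocation, much as xargs does, instead of one rm per file.
purge_stale() {
    find "$1" -type f -atime "+${2:-90}" -exec rm -f {} +
}

# Demo in a scratch directory so nothing real is touched.
demo=$(mktemp -d)
touch "$demo/fresh"
touch -a -t 202001010000 "$demo/stale"   # back-date the access time only

list_stale "$demo" 90     # review the hit list first
purge_stale "$demo" 90    # then delete
ls "$demo"
```

Listing before deleting keeps the "publish the offenders" option from item 1
open; the purge step can simply be left out.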

4) Use: find /usr/dir -mtime +90 -ls

> When I "sweep" our file systems I use "-mtime +NN".

> I want to know what's been modified not accessed.

>> Last accessed time is what we need.

>> Last modified would give us files that were modified long ago

>> but continue to be accessed/used daily.
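
The difference between the two tests can be seen with a small experiment. This
is only a sketch: the scratch directory and the file name old_but_used are made
up for the demonstration.

```shell
#!/bin/sh
# Scratch directory and file name are hypothetical, for illustration only.
demo=$(mktemp -d)

# Create a file with an old modification time; its access time stays current.
touch -m -t 202001010000 "$demo/old_but_used"

by_mtime=$(find "$demo" -type f -mtime +90 -print)   # selects the file
by_atime=$(find "$demo" -type f -atime +90 -print)   # selects nothing

echo "mtime sweep: ${by_mtime:-none}"
echo "atime sweep: ${by_atime:-none}"
```

A sweep by -mtime would flag this file for expiration even though it is still
in use, which is the case we want to avoid.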

5) Try: perl with some of the examples in the perl book.

> You might want to look at files & file access attributes with perl,

> but that will require more programming time.

>> We've not yet tried this one, but have been meaning to delve into

>> using perl at some point.

6) Buy: Epoch's nifty file cascade backup.

> If a file hasn't been accessed in 30 days

> it gets moved to a floptical

> with a pointer in the old file to the new location.

> After 90 days the file is moved to tape.

>> Interesting, but probably not. At least, not yet.

7) Get: lfu

> Don't know what exactly you're up to, but a utility we use here is

> called lfu. Here's the scenario: We have a central file server with a

> complete /usr/local tree (actually a couple, mounted with amd).

> Client workstations have local /usr/local directories and NFS mount

> the central /usr/local. When a file is accessed, it gets cached to

> the local /usr/local. When it doesn't get accessed for a while it

> gets removed and a symlink to the NFS mounted version is put in its

> place. If you start accessing it over the symlink it will eventually

> get brought back into the cache to replace the symlink. Has

> configuration to maintain free disk space, etc. A bit buggy at the

> moment, but someone here is working with the developer to iron out

> some of the bugs.

> The above scenario of course assumes you're trying to weed out

> replication and not completely rm the file out of online existence.

>> Not sure this really addresses what we need or whether this would be useful

>> for any purpose(s) at our site.

8) Maybe:

> > It seems that using "find" with the -atime switch/option

> > doesn't give us as much of a hit list as we might expect.

        --

> Possibly due to (from the manual page):

> find does not follow symbolic links to other files or

> directories; it applies the selection criteria to the symbolic

> links themselves, as if they were ordinary files.

>> This does not apply in our case since we run find on the actual hard mounted

>> local disk filesystem(s).

9) Maybe:

> Could it be that you're accessing more files than you expect? Perhaps

> a security sweep that does "file" or checksums, etc. which is reading

> the files? Someone grepping through everything in a directory?


--
> My experience (at least with SunOS 4.x) is that find works
> correctly. You just don't have many files that have not been accessed
> in the last 90 days. Maybe your users actually use their files :-)
> You might also check to see if you are doing some other kind of
> system sweep that accesses files (e.g. a grep through all files).
> Also, some users might be doing this to subvert your efforts to get
> rid of old files.

>> It appears that find does work as expected.
>> The culprit appears to be Backup Copilot, which we use to do dumps for
>> unattended backups: it changes the access datetimes every time we run it
>> via crontab (daily incrementals/weekly level zero dumps).
>> Does this match other people's experiences with Backup Copilot?
>> I will send out a request for more information regarding this question.
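
One rough way to check whether a backup or other sweep is resetting access
times is to count how many files show an access within the last day, before
and after the job runs. The sketch below assumes nothing about Backup Copilot
itself; the helper name recently_accessed and the scratch directory are
hypothetical.

```shell
#!/bin/sh
# recently_accessed is a hypothetical helper name, not an existing tool.

# Count regular files under DIR ($1) accessed less than one day ago.
recently_accessed() {
    # Arithmetic expansion strips any padding some wc implementations add.
    echo $(( $(find "$1" -type f -atime -1 -print | wc -l) ))
}

# Demo: one file just created, one with a back-dated access time.
demo=$(mktemp -d)
touch "$demo/read_today"
touch -a -t 202001010000 "$demo/untouched"

count=$(recently_accessed "$demo")
echo "files accessed in the last day: $count"
```

If the count on a real filesystem jumps to nearly every file right after the
nightly dump, the dump is the likely reason -atime +90 finds so little.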

ORIGINAL REQUEST:

--- Forwarded mail from espiritu@cgi.com (Rex Espiritu)

>From espiritu@cgi.com Fri Apr 16 15:00:30 1993
>From sun-managers-relay@ra.mcs.anl.gov Fri Apr 16 22:42:48 1993
To: Sun Managers Mailing List <sun-managers@eecs.nwu.edu>
Subject: WANTED: Unix File Expiration Utility

We're attempting to establish a regular (monthly) sweep of our Unix filesystems
to determine what file(s) (hierarchy/directories/subdirectories) have not been
accessed for a "long" time.

It seems that using "find" with the -atime switch/option doesn't give us as much
of a hit list as we might expect. We're currently using a shell script with
something similar to the following:

find /usr/dir -atime +90 -ls
...
find /usr/dir -atime +90 -exec rm {} \;

Are there any utilities available which would help us accomplish this?

Any suggestions on how better to use find and/or recommendations would be
greatly appreciated.

Thanks in advance.

--
M. Rex Espiritu, Jr. Carnegie Group, Inc.
espiritu@cgi.com 5 PPG Place
Voice: 412 642-6900 x233 FAX: -6906 Pittsburgh, PA 15222

--- End of forwarded message from espiritu@cgi.com (Rex Espiritu)

THANKS TO:

David Fetrow <fetrow@biostat.washington.edu>
Michael G. Harrington <mgh@bihobl2.bih.harvard.edu>
Daniel Trinkle <trinkle@cs.purdue.edu>
Steve Holmes <sjh@math.purdue.edu>
David T. Bath <dtb@otto.bf.rmit.oz.au>
John Marsh <john@rod.mitre.org>
John A. Murphy <jam@philabs.philips.com>
Paul Begley <peb@sandoz.ueci.com>
Bert Robbins <bert@penril.com>
Lewis E. Wolfgang <wolfgang@sunspot.nosc.mil>
danny@ews7.dseg.ti.com
Fuat C. Baran <fuat@watsun.cc.columbia.edu>
Mike Robinson <mike@castle.edinburgh.ac.uk>
