utmp(x) and wtmp(x) history

2007-12-25 12:00:00

After reading the responses, I decided the best way to repair the wtmpx
file was to write a (simple) Perl script to read in the records, cut out
the garbage (only 164 bytes of crap was screwing everything up), and
close the space back up (I should be a surgeon). This worked, and I also
found out lots of interesting stuff about the files in question. Thank
you to these people, who gave me sample scripts and a good history of
the [uw]tmpx? files:

BILLY <billy@student.adelaide.edu.au>

Jean-Philippe.LEROY@st.com

Jim Harmon <jharmon@telecnnct.com>

Chris_Marble@hmc.edu

jsdy@cais.com

Aleksandar Milivojevic <alex@srce.hr>

"Karl E. Vogel" <vogelke@c17mis.region2.wpafb.af.mil>

Original question:

====================

In short, can everyone tell me all they know about the utmp, wtmp,
utmpx, and wtmpx files?

I have read the man pages for [uw]tmpx? and fwtmp, know how to truncate

them and know how to rotate them, and realize that the "x" files are

extended versions of the non-"x" files, but I wonder:

Why are all four files necessary, i.e. why is all accounting info not
kept in one huge file? Is this so older programs can still read the old
format of the utmp and wtmp?

Why is there both a U-tmp(x) and a W-tmp(x) (emphasis on first letter)?
Again, I am curious why four files are needed.

What function does each serve (sure they help commands like who and
write, but more specifically what commands rely on which files and why)?

What is the history behind the files (since the x files are
"extensions", I assume that at some point there were only the utmp and
wtmp)?

Is there an equivalent to fwtmp that can read the wtmpx file and write
it out in ASCII so I can try to repair my wtmpx file? This is the real
reason for this message: my wtmpx file is messed up somehow, because a
"last" command only lists people up to Dec 6. It looks as though no one
has logged in since then.

I could just truncate the file and get on with life, but I need to keep
the information intact (I do analysis on the connections to this
machine). The file is still growing, so the new logins are still getting
written, but there must be a bad spot in the file that "last" chokes on.
I might eventually write a C program to do what I want, but I wanted to
understand the history, structure, and uses for these files first (also
maybe there is already a program out there). I will check if there are
any good backups after Dec 6 of the wtmpx file (maybe it just recently
got hosed), but I would still like to know this stuff just to be more
educated.

Thank you.

====================

RESPONSES:

----------

=> why is there both a U-tmp(x) and a W-tmp(x)(emphasis on first letter)?

=> what function each serves(sure they help commands like who and write,
=> but more specifically what commands rely on which files and why)

utmp(x) contains the current state, and is used by things like
finger(1), write(1) and who(1)

wtmp(x) contains the login history, and is used by things like last(1)

=> is there an equivalent to fwtmp that can read the wtmpx file and write
=> it out in ascii so I can try to repair my wtmpx file?

not that i know of... but [uw]tmp(x) manipulators are easy to write...
in perl, you'd want something like this to read [uw]tmpx:

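# 372 is the record size this unpack template assumes; check utmpx.h
open(UTMPX, "/var/adm/utmpx") or die "can't open utmpx: $!"; # or whatever
binmode(UTMPX);
# stop at the first short read so a truncated trailing record is not unpacked
while (read(UTMPX, $utmpx, 372) == 372) {
    ($user, $id, $line, $pid, $type, $exit_1, $exit_2, $tv_1, $tv_2,
     $session, $pad_1, $pad_2, $pad_3, $pad_4, $pad_5, $syslen, $host)
        = unpack('A32 A4 A32 l s ss xx ll l lllll s A257', $utmpx);
    # do stuff here
}
close(UTMPX);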

have a peek through /usr/include/utmp.h and utmpx.h to get an idea of
the structures and functions available... if it helps, i can send you a
perl hack i wrote (from which i pulled the code above) that basically
duplicates "finger|sort"...

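Before writing a full parser, it can help to eyeball a single record.
The sketch below is one way to do that from the shell with dd and od.
dump_record is a hypothetical helper name, and the 372-byte record size
is simply the one the Perl snippet above reads, so check utmpx.h on your
own system before trusting it.

```sh
#!/bin/sh
# dump_record FILE RECSIZE INDEX
# Print record number INDEX (counting from 0) of FILE as characters and
# escapes, treating FILE as a sequence of fixed-size RECSIZE-byte records.
# dump_record is a hypothetical helper, not a standard command.
dump_record() {
    dd if="$1" bs="$2" skip="$3" count=1 2>/dev/null | od -c
}

# Example (372 is the record size assumed by the Perl snippet):
#   dump_record /var/adm/wtmpx 372 0
```

A healthy record should show a recognizable user name and terminal line
near the start; a run of \0 or random bytes in those positions suggests
the record boundary has slipped.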
----------

From a unix administration book (accounting chapter):

"First of all, utmp is created by the init daemon when it runs for the
first time. wtmp must be created by the administrator. Each record is
written in utmp by a terminal: for example, login writes the user name,
the remote node (if any) and the connection time. When the connection
ends, the init process will clean this information. So the file size is
more or less stable and proportional to the number of terminals. The
records are similar in wtmp, but it will contain two records per
session: one for the beginning and one for the end date. This file needs
to be cleaned periodically, based on the number of connections (number
of terminals and users)..."

To clean wtmp you just need to "cp /dev/null /var/adm/wtmp".
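
If the history matters, a copy can be kept before emptying the file. A
minimal sketch, assuming a hypothetical helper name archive_and_truncate
and a ".old" suffix chosen only for illustration. Like the cp above, it
empties the file in place, so the owner and permissions stay as they
were.

```sh
#!/bin/sh
# archive_and_truncate FILE
# Copy FILE to FILE.old (keeping mode and timestamps with -p), then empty
# FILE in place so its owner and permissions are unchanged.
# The helper name and the ".old" suffix are illustrative assumptions.
archive_and_truncate() {
    cp -p "$1" "$1.old" && : > "$1"
}

# Example:
#   archive_and_truncate /var/adm/wtmp
```

Any record appended between the copy and the truncation is lost, so this
is best run when the machine is quiet.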

----------

There's an administrative command called "wtmpfix" that will probably do
what you're looking for. Look for it in the (1m) section of the
Answerbook Manpages. It should be in the /usr/lib/acct dir.

----------

We wrote a program here to read in and trim the files as desired.
We didn't want to simply truncate, but to retain the last login date
for each user no matter how old. Our program's written in perl
and should be readable and modifiable. Hope it helps.

[http://www3.hmc.edu/docs/coolstuff/wtmpx]

----------

As you clearly have deduced, "tradition" accounts for a lot of this.

In the beginning, there were just the utmp and wtmp files, in /etc/.

The utmp file, as now, contains structures for those who are currently

logged in. With the introduction of System V, certain other processes

logged themselves into the utmp file [notably 'init'], and the locations

of terminal lines became fixed in the file - no longer would a new login

just insert itself in the first empty slot. This meant, too, that many

programs began to depend on the format of utmp and wtmp.

Meanwhile, again as from the beginning, "wtmp" was just the
concatenation of 'utmp' structures to indicate when users had logged in
and out and when other system events (notably time changes and reboots)
had happened. No attempt was made to verify whether the file was intact
before appending another 'wtmp' record. This is of especial importance
to you, as we will see.

But along came networked logins, X-windows sessions, and other things

that needed to be logged along with a 'utmp'/'wtmp' entry. Different

groups have reacted to this in different ways. Sun decided to add the

utmpx and wtmpx files. Some of the information is mirrored; but the

string lengths are notably longer. Other information is added, and

other information is omitted.

So, now, when a Sun program needs to get all of the information for a

given current login, it looks in both the utmp and the utmpx files. For

historical information, it looks in both the wtmp and wtmpx files.

I've had the problem you describe, when a 'wtmp' structure was partially

written to the "wtmp" file just when the machine went down. You need to

re-synch the file, by reading as many good records as you can, skipping

over the bad record, and repeating. You have a particular problem with

the Sun solution, in that you might want to maintain consistency between

"wtmp" and "wtmpx". I did this by doing a 'who wtmp', 'dd'ing the

appropriate number of records, using 'dd' again to skip over the mangled

record, etc. This may or may not be more onerous when synchronizing

with the 'utmpx' structures in "wtmpx".

----------

Sometimes after a crash you'll get a messed-up wtmpx file (because it
was not cleanly closed). If you look at wtmpx, you'll see that there is
some garbage (usually lots of zeros) that confuses commands like last.
Since wtmpx is a binary file, it will be hard to repair it by hand.
But you can write a small program (similar to last) that will read the
file and ignore errors in it.
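
A quick first check for this kind of damage is whether the file length
is still a whole number of records. A sketch, assuming the 372-byte
record size used by the Perl snippet earlier in this thread;
leftover_bytes is a hypothetical helper name.

```sh
#!/bin/sh
# leftover_bytes FILE RECSIZE
# Print how many bytes of FILE are left over after dividing its length
# into RECSIZE-byte records. 0 means the length is record-aligned.
# leftover_bytes is a hypothetical helper, not a standard command.
leftover_bytes() {
    size=$(wc -c < "$1")
    echo $((size % $2))
}

# Example:
#   leftover_bytes /var/adm/wtmpx 372
```

A nonzero result means something other than whole records was written
at some point. A zero result does not prove the file is clean, since the
garbage can happen to be a multiple of the record size.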

----------

Is /usr/lib/utmpd running? That should be started in
/etc/rc2.d/S88utmpd.

It's supposed to correct distortions in the utmp and utmpx files, but
it can misbehave. The current version of utmpd seems to work quite
well as long as the defaults are set properly in /etc/default/utmpd:

        SCAN_PERIOD=300

        MAX_FDS = 0

   These values come from

        http://remus.rutgers.edu/~adrian/solaris/problems.html

   We use the values

        SCAN_PERIOD=30

        MAX_FDS = 3

If none of this helps, try modifying the S88utmpd script to remove the
utmp and utmpx files from /var/adm, and then create new ones before
starting utmpd:

        rm /var/adm/utmp /var/adm/utmpx

        cp /dev/null /var/adm/utmp

        cp /dev/null /var/adm/utmpx

        chown root /var/adm/utmp*

        chgrp bin /var/adm/utmp*

        chmod 644 /var/adm/utmp*

----------
