Pseudo Terminal Weirdness

2007-12-25 7:53:00

Sorry about the delay in posting this summary but it was a fairly low

priority and it's taken a while to have time to try out some of the

options and play about.

The original problem:

>We have an SS2 running 4.1.1. It uses xdm from X11R5 to run the

>console (OW3.0 xnews - yuck) and one Xterminal. Today the users

>complained that they couldn't start a new xterminal. When started the

>csh in the term complained about reading EOF about 20 times (the `Use

>"logout" to logout.' message) before giving up and dying. Obviously it

>was reading EOF from the pty (/dev/?typ4). Starting 2 xterms at once

>allowed the second xterm to succeed on /dev/?typ5. I've even removed

>and re-mknoded /dev/?typ4 with no effect. It looks like the kernel's

>pty driver for this device has screwed up. I can't see any processes

>that have it open and the file permissions are `crw-rw-rw- 1 root'.

>I'd rather not reboot until I find out what's going on even though I'm

>90% certain that that will fix things up. Any clues?
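
The `/dev/?typ4' shorthand in the problem report covers both halves of a BSD-style pty pair: the master /dev/ptyp4 and the slave /dev/ttyp4. As a small illustration of that naming, here is a Python sketch. The helper names pty_pair and pty_units are made up for this note, and the bank letters "pqrs" are an assumption; the set of ptys actually configured on the SS2 is not stated in the thread.

```python
def pty_pair(unit, dev_dir="/dev"):
    """Return (master, slave) paths for a BSD-style pty unit such as 'p4'."""
    if len(unit) != 2:
        raise ValueError("unit must be a bank letter plus a hex digit, e.g. 'p4'")
    return f"{dev_dir}/pty{unit}", f"{dev_dir}/tty{unit}"


def pty_units(banks="pqrs", digits="0123456789abcdef"):
    """Yield unit names in ascending order: p0, p1, ..., pf, q0, ...

    The default bank letters are an assumption, not taken from the thread.
    """
    for bank in banks:
        for digit in digits:
            yield bank + digit


if __name__ == "__main__":
    master, slave = pty_pair("p4")
    print(master, slave)
```

With units tried in this order, p5 is the one immediately after p4, which fits the report that a second xterm started at the same time ended up on /dev/?typ5.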

There were various suggestions but mostly people felt this was a long

standing bug that popped up occasionally. There was a mention of a

patch that might work but I haven't tried it out mainly because we'll

soon be moving to 4.1.3. I have since confirmed that a reboot will fix

the problem. I did manage to bring one pty back to life on a running

system. I was trying to find out why rlogin failed to open the pty

that xterm succeeded with and also reading and writing the tty and

pty. Suddenly I noticed that a user had successfully opened the pty

for use! I've not yet managed to find a repeatable procedure. One

person suggested the use of fuser to find a process holding on to the

pty in some strange mode but each time I've tried, fuser finds no

processes. The old `kill -1 1' was suggested but was ineffective.
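
For readers on a current system, the fuser check described above can be approximated by hand. What follows is a minimal Python sketch, not a description of what fuser does on SunOS 4.1.1: it assumes a Linux-style /proc filesystem, which SunOS 4 does not have, and the function name holders is made up for illustration. For character devices it matches on the device number rather than the inode, so a process still holding a node that was later removed and re-created with mknod would also be reported.

```python
import os
import stat


def holders(path, proc_root="/proc"):
    """Return sorted PIDs that have `path` open, found via /proc/<pid>/fd.

    Assumes a Linux-style /proc; returns [] if the path or /proc is missing.
    """
    try:
        target = os.stat(path)
        entries = os.listdir(proc_root)
    except OSError:
        return []
    is_chr = stat.S_ISCHR(target.st_mode)
    found = set()
    for entry in entries:
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission to look
        for fd in fds:
            try:
                st = os.stat(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if is_chr:
                # Same major/minor means same driver instance, whatever the inode.
                same = stat.S_ISCHR(st.st_mode) and st.st_rdev == target.st_rdev
            else:
                same = (st.st_dev, st.st_ino) == (target.st_dev, target.st_ino)
            if same:
                found.add(int(entry))
                break
    return sorted(found)


if __name__ == "__main__":
    for dev in ("/dev/ptyp4", "/dev/ttyp4"):
        print(dev, holders(dev))
```

Run as an ordinary user, the scan silently skips processes whose descriptor directories it cannot read, so an empty result is only meaningful when run as root.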

And now the responses:

>Date: Wed, 12 Aug 92 02:40:53 PDT

>From: iapsd!seri!glenn@uunet.uu.net (Glenn Herteg)

>

>I have seen this problem for years, at least back to SunOS 4.0.1,

>running SunView. I have at various times thought it might be solved

>by SunOS patches, though I never got around to installing them. I

>have not seen the problem lately, though we're not exercising our

>systems the same way we used to. If you find out what the problem

>is, I'm intensely curious myself ...

>

>One way around the problem is "set ignoreeof" in your .cshrc file.

>You start a window, notice the problem, but it doesn't cause any

>significant problems per se to the rest of the system. Now you

>just close up the window and tuck it away in the corner of your

>screen, where it ties up the pseudo-terminal and you can create

>other windows at will.

>

>The only thing I've ever seen clear this problem is a reboot.

>

>Now here's an oddball thought. A few days ago I reset my system

>clock about a year back. Suddenly the csh in that window started

>scrolling prompts interminably, effectively locking up the workstation

>to local accesses. I cleared the problem with a remote login and

>reset the time forward by a few days (still nearly a year back

>from wall clock time). Since the symptom of apparently reading

>an incessant stream of useless data is so similar, it occurs to

>me that maybe the pty driver is somehow getting confused about

>the time... though why this should make any difference is unclear.

>

>Glenn Herteg

>glenn%iapsd@uunet.uu.net

------------------------------------------------------------------

>From: kalli!glenn@fourx.Aus.Sun.COM (Glenn Satchell)

>Date: Wed, 12 Aug 1992 12:30:59 EST

>

>Yes, rebooting will fix it for now... You should also try installing

>Patch 100188-02: One of the bugs fixed is "Process not letting go of a

>pty (bugID 1040722)". Note that this supersedes patch 100414-01.

>

>regards,

>

>Glenn Satchell

>Unix Professional Services (Sydney Australia)

>kalli!glenn@fourx.aus.sun.com

------------------------------------------------------------------

>From: Brent Alan Wiese <brent@crick.ssctr.bcm.tmc.edu>

>Date: Tue, 11 Aug 92 10:16:40 CDT

>

>... It is interesting to note that rlogin and telnet will not

>pick up these EOF-pty's, only xterm seems to.

------------------------------------------------------------------

>Date: Tue, 11 Aug 92 09:24:13 CDT

>From: Mike Raffety <miker@sbcoc.com>

>

>Try fuser on the master and slave pty devices; I'll bet SOMETHING's

>still got it open in a funny mode.

------------------------------------------------------------------

>From: Steve_Kilbane@gec-epl.co.uk

>Date: Tue, 11 Aug 92 08:35:36 BST

>

>In article <9208102345.AA23976@gorton.anu.edu.au> you write:

>> When started the

>>csh in the term complained about reading EOF about 20 times (the `Use

>>"logout" to logout.' message) before giving up and dying. Obviously it

>>was reading EOF from the pty (/dev/?typ4)

>

>Not necessarily. I've seen this behaviour on normal terminal lines, where a

>program has set NDELAY, then died. The csh then gets no bytes from the

>terminal, and treats it as EOF. In our case, it was cleared by logging out

>(in fact, the login csh bombs out, as you discovered), and init resets

>the terminal line. In the case of ptys, things tend to be a bit more screwed,

>because init isn't resetting them.

>

>I don't know if this is your problem, though, because NDELAY is a

>characteristic of the open file table entry, rather than of a device or an

>inode, so this should be cleared when the file is closed.

>

>> I can't see any processes

>>that have it open and the file permissions are `crw-rw-rw- 1 root'.

>>I'd rather not reboot until I find out what's going on even though I'm

>>90% certain that that will fix things up. Any clues?

>

>Should do. It'll certainly close the file:-).

>

>Hope this is of some help...

>

>Steve
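
Steve's point that NDELAY belongs to the open file table entry, rather than to the device or inode, can be checked on a modern POSIX system. The Python sketch below uses O_NONBLOCK, the present-day relative of NDELAY, on an ordinary temporary file; it says nothing about how the SunOS 4.1.1 pty driver behaves. The helper names has_nonblock and demo are made up for illustration. A descriptor made with dup() shares the open file table entry and therefore sees the flag, while a second, independent open() of the same file does not.

```python
import fcntl
import os
import tempfile


def has_nonblock(fd):
    """Report whether O_NONBLOCK is set in the file status flags of fd."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)


def demo():
    """Return the O_NONBLOCK state of (original, dup'ed copy, fresh open)."""
    with tempfile.NamedTemporaryFile() as tmp:
        first = os.open(tmp.name, os.O_RDONLY)
        shared = os.dup(first)                  # same open file table entry
        fresh = os.open(tmp.name, os.O_RDONLY)  # new entry, same inode
        try:
            flags = fcntl.fcntl(first, fcntl.F_GETFL)
            fcntl.fcntl(first, fcntl.F_SETFL, flags | os.O_NONBLOCK)
            return has_nonblock(first), has_nonblock(shared), has_nonblock(fresh)
        finally:
            for fd in (first, shared, fresh):
                os.close(fd)


if __name__ == "__main__":
    print(demo())
```

The flag shows up on the original and the dup'ed descriptor but not on the fresh open, which is why a stale NDELAY setting would normally be expected to vanish once every descriptor sharing that entry has been closed.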

------------------------------------------------------------------

>Date: Tue, 11 Aug 92 16:55:10 +1000

>From: Chris Keane <chris@rufus.state.COM.AU>

>

>It's probable that some program has it open and the EXCL open performed

>by login isn't working. This happens sometimes.

>You can find out which program it is by using /etc/fuser /dev/ttyp4

>

>A quick dirty solution is to chmod 000 /dev/ttyp4 until the next time

>you reboot.

>

>Chris.

------------------------------------------------------------------

>Date: Tue, 11 Aug 92 13:30:34 EST

>From: ivan@fac.anu.edu.au (Ivan Dean)

>

>...

>You might look for processes that have '?' listed as their pty. In at least one

>case, a process like that was affecting someone else's pty. Otherwise, have you

>tried HUPing the init process, with kill -HUP 1 ?

>

>Ivan

______________________________________________________________________________

James Ashton System Administrator

                                             Department of Systems Engineering

Voice +61 6 249 0681 Research School of Physical Sciences and Engineering

FAX +61 2 249 2698 Australian National University

Email James.Ashton@syseng.anu.edu.au GPO Box 4 Canberra ACT 2601 Australia
