sockets stuck in CLOSE_WAIT

2007-12-24 21:39:00

We have a WebLogic application running on two Sun 3500s running Solaris 8
with a patch set that I downloaded in December. This app was ported just
last week from two 4500s, also running Solaris 8; the most recent patch
set on those servers was from October. WebLogic uses two ports for HTTP
access: 7001 is the primary port and 7002 is a secure access port. The way
I understand it, users log in to the URL on 7001, it boots them over to
7002 for a secure login, then sends them back to 7001. We are using
WebLogic 5.10 and we have 4 WebLogic instances on each server.

The problem: ever since we ported the app, one of the WebLogic instances
will start accumulating sockets in the CLOSE_WAIT state on port 7002. The
count grows until the instance hits its maximum number of file
descriptors, and then the instance crashes. The problem has switched back
and forth between the two servers (they receive their traffic from a load
balancer that is set up to round-robin across all 8 instances, 4 on each
box). This is what I'm currently seeing in netstat:

weblogic1:/usr/local# netstat -an | grep 7002 | more
172.31.1.165.7002  *.*                   0  0  32768  0  LISTEN
172.31.1.164.7002  *.*                   0  0  32768  0  LISTEN
172.31.1.166.7002  *.*                   0  0  32768  0  LISTEN
172.31.1.162.7002  *.*                   0  0  32768  0  LISTEN
172.31.1.162.7002  12.60.208.54.1456  8760  0  33580  0  CLOSE_WAIT
172.31.1.162.7002  12.60.208.54.1458  8760  0  33580  0  CLOSE_WAIT
172.31.1.162.7002  12.60.208.54.1563  8760  0  33580  0  CLOSE_WAIT
172.31.1.162.7002  12.60.208.54.1566  8760  0  33580  0  CLOSE_WAIT
172.31.1.162.7002  12.60.208.54.1075  8760  0  33580  0  CLOSE_WAIT
172.31.1.162.7002  12.60.208.54.1085  8760  0  33580  0  CLOSE_WAIT
172.31.1.162.7002  12.60.208.54.1090  8760  0  33580  0  CLOSE_WAIT
172.31.1.162.7002  12.60.208.54.1108  8760  0  33580  0  CLOSE_WAIT
172.31.1.162.7002  12.60.208.54.1124  8760  0  33580  0  CLOSE_WAIT
172.31.1.162.7002  12.60.208.54.1132  8760  0  33580  0  CLOSE_WAIT
172.31.1.162.7002  12.60.208.54.1139  8760  0  33580  0  CLOSE_WAIT
172.31.1.162.7002  12.60.208.56.1174  8760  0  33580  0  CLOSE_WAIT
172.31.1.162.7002  12.60.208.53.1230  8760  0  33580  0  ESTABLISHED
172.31.1.162.7002  12.60.208.54.1184  8760  0  33580  0  CLOSE_WAIT
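
To watch how quickly these pile up and which remote hosts they come from,
the netstat output could be tallied with something like the sketch below.
It is only an illustration, not part of our setup: the function name
tally_states and the sample text are made up, and it assumes the Solaris 8
column layout shown above (local address first, remote address second,
state last, possibly wrapped onto its own line by the terminal).

```python
from collections import Counter

# TCP states that may appear in the last column of `netstat -an`.
STATES = {
    "LISTEN", "ESTABLISHED", "CLOSE_WAIT", "TIME_WAIT", "FIN_WAIT_1",
    "FIN_WAIT_2", "SYN_SENT", "SYN_RCVD", "LAST_ACK", "CLOSING",
    "IDLE", "BOUND",
}


def tally_states(netstat_text, port="7002"):
    """Count (remote host, state) pairs for sockets on the given local port.

    Hypothetical helper: handles rows where the state sits at the end of
    the line as well as rows where it wrapped onto the following line.
    """
    counts = Counter()
    pending_remote = None  # remote host of a row still waiting for its state
    for line in netstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # A wrapped row leaves the state alone on the next line.
        if len(fields) == 1 and fields[0] in STATES:
            if pending_remote is not None:
                counts[(pending_remote, fields[0])] += 1
                pending_remote = None
            continue
        pending_remote = None
        # Skip the shell prompt and sockets on other local ports.
        if len(fields) < 2 or not fields[0].endswith("." + port):
            continue
        remote_host = fields[1].rsplit(".", 1)[0]  # drop the remote port
        if fields[-1] in STATES:
            counts[(remote_host, fields[-1])] += 1
        else:
            pending_remote = remote_host
    return counts


# Made-up sample in the same layout as the listing above.
sample = """\
172.31.1.162.7002 *.* 0 0 32768 0 LISTEN
172.31.1.162.7002 12.60.208.54.1456 8760 0 33580 0
CLOSE_WAIT
172.31.1.162.7002 12.60.208.54.1458 8760 0 33580 0 CLOSE_WAIT
172.31.1.162.7002 12.60.208.53.1230 8760 0 33580 0 ESTABLISHED
"""

for (host, state), n in tally_states(sample).most_common():
    print(host, state, n)
```

Run periodically against saved netstat -an output, a tally like this would
show whether the CLOSE_WAIT sockets come mostly from a single remote
address and how fast the count climbs toward the file descriptor limit.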

We can't figure out what is going wrong. Is it the patch level? Something
wrong on the 12.60.208 subnet? Something in WebLogic? I'm completely stuck
at this point. All I can recommend to my application support team is to
kill the process and restart it, after which it runs smoothly until the
problem happens again.

Thanks for any help,
Elizabeth Jones
