unhelpful error messages

2007-12-25 8:46:00

I recently asked:

>Last night one of my systems, a SS10 with SunOS4.1.3,

>crashed with the following errors:

>mb_mapfree: MR == 0!!!

>mb_mapfree: MR == 0!!!

>mb_mapfree: MR == 0!!!

>panic on cpu0: tmp_rename

>Anyone able to decipher this? The system uses tmpfs, my suspicion

>is that /tmp filled up and bumped into the system trying to swap.

I got one reply:

        From: Joe Silva <jsilva@com.polaris1.rocknroll>

        Message-Id: <9405041919.AA05384@rocknroll.polaris1.com>

        To: se@uk.ac.lancs.comp

        Subject: Re: unhelpful error messages

        Cc: jsilva@COM.DMC

        Content-Type: X-sun-attachment

        Sender: postmaster@uk.ac.nsfnet-relay

        Status: RO

        Steve,

        It appears you may have a software bug. See attached.

        Joe

         Bug Id: 1029783

         Category: kernel

         Subcategory: driver

         Release summary: 4.1prebeta, 4.0.3

         Synopsis: mb_mapalloc can call back while driver 'protected' by spl.

                 Integrated in releases: 4.1

         Summary:

        There is a potential problem for drivers using mb_mapalloc() directly

        and expecting a callback in the case where resources aren't immediately

        available.

        The callback can occur while the driver is in its interrupt handler.

        This is because the callback routine is called by mb_mapfree, which

        can be called from the interrupt handler from another driver which

        has a higher priority.

        The scenario:

                Driver A calls mb_mapalloc(), DVMA is not available, so driver A's

                callback routine is queued.

                

                Driver A gets an interrupt.

                During the driver A interrupt handler, driver B gets an interrupt,

                this can happen since driver B interrupts are higher priority.

                Driver B calls mb_mapfree() which calls driver A's callback routine.

                

                Driver A didn't expect to be re-entered this way and something bad

                happens, such as a request list being damaged.
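
To make the re-entry in the scenario above concrete, here is a toy user-space model in Python. It is only a sketch and not SunOS kernel code: DvmaMap, DriverA and run_scenario are invented names, the single DVMA slot is an assumption, and interrupt nesting is imitated by an ordinary function call made from inside driver A's handler.

```python
# Toy model of the re-entrancy scenario. All names here are hypothetical;
# mapalloc/mapfree only imitate the behaviour described in the bug report.

class DvmaMap:
    """Stand-in for the DVMA map: a slot counter plus a queue of callbacks."""

    def __init__(self, slots):
        self.free = slots
        self.waiters = []

    def mapalloc(self, callback):
        if self.free > 0:
            self.free -= 1
            return True
        self.waiters.append(callback)  # no resources: queue the callback
        return False

    def mapfree(self):
        self.free += 1
        if self.waiters:
            # The queued callback runs in whatever context called mapfree.
            self.waiters.pop(0)()


class DriverA:
    def __init__(self, dvma):
        self.dvma = dvma
        self.in_handler = False
        self.reentered = False

    def start_io(self):
        return self.dvma.mapalloc(self.callback)

    def callback(self):
        # Record whether we were entered while our own handler was active.
        if self.in_handler:
            self.reentered = True
        self.dvma.mapalloc(self.callback)

    def interrupt(self, nested=None):
        self.in_handler = True
        if nested is not None:
            nested()  # a higher-priority interrupt arrives mid-handler
        self.in_handler = False


def run_scenario(nested):
    dvma = DvmaMap(slots=1)
    dvma.mapalloc(lambda: None)   # driver B already holds the only slot
    a = DriverA(dvma)
    got_mapping = a.start_io()    # fails, so driver A's callback is queued
    if nested:
        a.interrupt(nested=dvma.mapfree)  # B frees during A's handler
    else:
        a.interrupt()
        dvma.mapfree()                    # B frees after A's handler returns
    return got_mapping, a.reentered
```

Calling run_scenario(True) reports that driver A's callback ran while its handler was still active; run_scenario(False) does not, because the free happens after the handler has returned.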

The observant will note that the bug above was fixed in SunOS 4.1.

As I stated, we use 4.1.3. A chat with a Sun engineer over the phone

and a quick look at Sun's online database pointed me to patch #100507-05.

Keywords: tmpfs crash fail assertion leaks anonymous tmp_rename panic sparse files

Synopsis: SunOS 4.1.1, 4.1.2, 4.1.3: tmpfs jumbo patch

SunOS release: 4.1.1 4.1.2 4.1.3 4.1.3C

Topic: fixes for several tmpfs bugs

So I installed the patch. The user who was running a large simulation

when the system crashed ran it again last night, and everything

was OK.

Steve
