From nobody  Wed Jan  1 00:29:02 1997
Received: (from nobody@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id AAA18663;
          Wed, 1 Jan 1997 00:29:02 -0800 (PST)
Message-Id: <199701010829.AAA18663@freefall.freebsd.org>
Date: Wed, 1 Jan 1997 00:29:02 -0800 (PST)
From: andrew@ugh.net.au
To: freebsd-gnats-submit@freebsd.org
Subject: sysinstall: ppp: recursive call in malloc()
X-Send-Pr-Version: www-1.0

>Number:         2347
>Category:       bin
>Synopsis:       sysinstall: ppp: recursive call in malloc()
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    brian
>State:          closed
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Jan  1 00:30:01 PST 1997
>Closed-Date:    Wed Jan 29 15:34:06 PST 1997
>Last-Modified:  Wed Jan 29 15:42:32 PST 1997
>Originator:     Andrew Stevenson
>Release:        2.2-BETA
>Organization:
>Environment:
FreeBSD sally.ugh.net.au 2.2-BETA_A FreeBSD 2.2-BETA_A #0: Tue Dec 24 03:41:49
1996	jkh@time.cdrom.com:/usr/src/sys/compile/GENERIC i386
>Description:
When installing via PPP I got the message on vty2 (the PPP screen anyway):
ppp in malloc(): warning: recursive call.
I typed "show mem" to see what was wrong and things froze: the modem stopped
receiving and ppp didn't respond. I went to vty4 (the emergency holographic
shell one) to do a ps and see what was going on. I typed ps and nothing
happened. I switched to vty0 and nothing had changed, and I could
no longer go back to vty4 (it just beeped when I tried). I pressed ^C and
was asked if I was sure, then the whole machine locked up. I pressed
ctrl-alt-delete. On reboot from floppy the machine hung at the "rootfs is
1440 Kbyte compiled in MFS" message.

Since getting the system installed by floppy, PPP still dies the same way
after about an hour's "hard use". I have to kill it from another window.
If I restart it, it just hangs, so I have to reboot the box.
>How-To-Repeat:
Try getting PPP to transfer a lot of data for an hour or two - It seems
to happen when the modem has been flat chat at 28.8 for an hour or so.
>Fix:

>Release-Note:
>Audit-Trail:

From: Poul-Henning Kamp <phk@critter.dk.tfs.com>
To: andrew@ugh.net.au
Cc: freebsd-gnats-submit@freebsd.org
Subject: Re: bin/2347: sysinstall: ppp: recursive call in malloc() 
Date: Wed, 01 Jan 1997 10:55:42 +0100

 In message <199701010829.AAA18663@freefall.freebsd.org>, andrew@ugh.net.au writ
 
 >When installing via PPP I got the message on vty2 (the PPP screen anyway):
 >ppp in malloc(): warning: recursive call.
 
 This is actually a pretty devastating thing.
 
 It means that a call to malloc(3), free(3) or realloc(3) got interrupted
 by a SIGALRM and the signal handler called one of them again.
 
 I'm not definitively sure what POSIX & friends say about the reentrancy
 of malloc(3) but I believe it is not required to be reentrant.
 
 Anybody know definitively what POSIX says here?
 
 I could add sigprocmask(2) calls around malloc and friends, but that would
 mean adding two syscalls per malloc/free/realloc calls, so I don't like
 the idea.  It could be an option that the program could select of course.
 
 Alternatively any program like ppp that uses SIGALRM for a state machine
 is severely limited in its implementation.
 
 --
 Poul-Henning Kamp           | phk@FreeBSD.ORG       FreeBSD Core-team.
 http://www.freebsd.org/~phk | phk@login.dknet.dk    Private mailbox.
 whois: [PHK]                | phk@tfs.com           TRW Financial Systems, Inc.
 Power and ignorance is a disgusting cocktail.

From: Garrett Wollman <wollman@lcs.mit.edu>
To: Poul-Henning Kamp <phk@critter.dk.tfs.com>
Cc: freebsd-gnats-submit@freefall.freebsd.org
Subject: Re: bin/2347: sysinstall: ppp: recursive call in malloc() 
Date: Wed, 1 Jan 1997 14:25:20 -0500

 <<On Wed, 1 Jan 1997 02:00:02 -0800 (PST), Poul-Henning Kamp <phk@critter.dk.tfs.com> said:
 
 >  I'm not definitively sure what POSIX & friends say about the reentrancy
 >  of malloc(3) but I believe it is not required to be reentrant.
  
 >  Anybody know definitively what POSIX says here?
  
 Well, I can't tell you what POSIX says, but I can tell you what ANSI
 X3J11 said: the only thing you can do inside a signal handler is set a
 variable of type `volatile sig_atomic_t'.
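 
 A minimal sketch of a handler that stays within that rule, using only
 ISO C facilities (signal(), raise() and a `volatile sig_atomic_t'
 flag).  The names got_signal, on_signal() and flag_set_after_raise()
 are hypothetical, and SIGINT is used only because it is one of the
 signals ISO C itself defines.

```c
#include <signal.h>

/* The only object the handler touches. */
static volatile sig_atomic_t got_signal = 0;

static void
on_signal(int sig)
{
	(void)sig;
	got_signal = 1;		/* set the flag and return, nothing else */
}

/*
 * Hypothetical helper: install the handler, raise the signal
 * synchronously, and report whether the flag was set.
 * Returns -1 if the handler could not be installed.
 */
int
flag_set_after_raise(void)
{
	got_signal = 0;
	if (signal(SIGINT, on_signal) == SIG_ERR)
		return (-1);
	raise(SIGINT);		/* handler has run by the time raise() returns */
	return (got_signal);
}
```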
 
 Hearsay: one of the UNIX standards (maybe P1003.n, maybe XPGn, maybe
 Spec 1170) specifies a list of *system calls* which may be called from
 signal handlers.  I think one or more of them may also require that
 longjmp() work from a signal handler, but not necessarily that
 anything work after doing that.
 
 -GAWollman
 
 --
 Garrett A. Wollman   | O Siem / We are all family / O Siem / We're all the same
 wollman@lcs.mit.edu  | O Siem / The fires of freedom 
 Opinions not those of| Dance in the burning flame
 MIT, LCS, ANA, or NSA|                     - Susan Aglukark and Chad Irschick

From: J Wunsch <j@uriah.heep.sax.de>
To: andrew@ugh.net.au
Cc: freebsd-gnats-submit@freebsd.org
Subject: Re: bin/2347: sysinstall: ppp: recursive call in malloc()
Date: Wed, 1 Jan 1997 19:25:38 +0100 (MET)

 As andrew@ugh.net.au wrote:
 
 > When installing via PPP I got the message on vty2 (the PPP screen anyway):
 > ppp in malloc(): warning: recursive call.
 
 So this is the cause of the problem you're describing in PR
 bin/2345.  The two are duplicates, but this one is more detailed.
 
 For no reason known to me, sysinstall uses an /etc/malloc.conf
 symlink pointing to "A", meaning malloc should abort (signal 6) any
 program that causes a malloc warning.
 
 You could probably work around this by typing "rm /etc/malloc.conf"
 early in the emergency holographic shell.  Hmm, no, you gotta type it
 _before_ you start PPP, so i think the only way to achieve this
 is to first use the `fixit' floppy.  Download the fixit image, copy it
 to a floppy, then select `fixit' as the first step of your
 installation session, before installing anything.  Remove said symlink
 from the fixit shell.
 
 Ick.  ISTR that the fixit process was broken in BETA, i'm not sure.  I
 could provide you with an alternate (more recent) boot floppy, where
 you can even start the Emergency Holographic Shell before (but i think
 you need the fixit session anyway, since there's no rm command
 available that early).
 
 >  I typed show mem to see what was wrong and things froze - modem stopped
 > receiving, ppp didn't respond - I went to vty4 (the emergency holographic
 
 (signal 6)
 
 > shell one) to do a ps and see what was going on - I typed ps and nothing
 > happened. I switched to vty0 and nothing had changed - I could
 > no longer go back to vty4 (it just beeped when I tried). I pressed ^C and
 
 There are basically two known bugs here.  One is broken by design :),
 in that the console driver prevents you from switching to a screen
 when no process has this VTY open.  This bugfeature has been copied
 from SCO, but it's IMHO a very silly one.
 
 The second is that syscons seems to have serious trouble with VT
 switching these days.  This is also basically known (but i don't think
 anybody is working on a fix for this right now).  It renders the
 Emergency Holographic Shell fairly unusable. :-(  The only workaround
 i've found for this is starting yet another shell on top of the EHS
 early during installation.  Apparently, this keeps yet another process
 running on that VTY, and this seems to tell syscons that switching to
 that screen is possible.
 
 > Try getting PPP to transfer a lot of data for an hour or two - It seems
 > to happen when the modem has been flat chat at 28.8 for an hour or so.
 
 Hmm, this would be interesting to find.  Maybe it's something related
 to your setup, i'm not sure.  I'm using PPP a lot, and i also have
 malloc.conf set to `Abort'.  Yet, i have not seen this phenomenon.  If
 you can trigger it later, once your system has already been installed,
 it would be great if you could help us out with a stacktrace from the
 coredump.  I'm afraid this PR will remain open until then.  In order
 to do this, you also have to type
 
 	ln -s AJ /etc/malloc.conf
 
 Errm, nope. :-((( For security reasons, setuid binaries don't generate
 coredumps now, not even if your real UID matches the UID of the
 process.  So the only chance to get a core is to remove the suid and
 sgid bits from /usr/sbin/ppp, and run it as root.  It would be really
 great if you could trigger the bug again.  (If you do, send the
 stacktrace as mail to freebsd-gnats-submit@freebsd.org, with a subject
 line of just ``bin/2347''.  This way, your mail will be appended to
 the audit-trail of this PR.)
 
 -- 
 cheers, J"org
 
 joerg_wunsch@uriah.heep.sax.de -- http://www.sax.de/~joerg/ -- NIC: JW11-RIPE
 Never trust an operating system you don't have sources for. ;-)
Responsible-Changed-From-To: freebsd-bugs->brian 
Responsible-Changed-By: jkh 
Responsible-Changed-When: Fri Jan 17 02:35:03 PST 1997 
Responsible-Changed-Why:  
Brian has taken over ppp support now, and I've proven that this is 
a generic ppp problem, not a ppp-running-from-sysinstall problem 
(I can make it happen while running multi-user).  I believe that phk
has also commented on this problem and that some investigation of the
offending signal handler has been done (in case Brian wants to follow up on that).
State-Changed-From-To: open->closed 
State-Changed-By: brian 
State-Changed-When: Wed Jan 29 15:34:06 PST 1997 
State-Changed-Why:  
The timer handling code is no longer called directly from the signal handler. 
Instead, the signal handler sets a 'request' variable which is inspected 
after the select() call at the top level. 
>Unformatted:
