From stolz@i2.informatik.rwth-aachen.de  Fri Aug 20 13:26:49 2004
Return-Path: <stolz@i2.informatik.rwth-aachen.de>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 06B9D16A4CE
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 20 Aug 2004 13:26:49 +0000 (GMT)
Received: from atlas.informatik.rwth-aachen.de (atlas.informatik.RWTH-Aachen.DE [137.226.194.2])
	by mx1.FreeBSD.org (Postfix) with ESMTP id D956D43D2D
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 20 Aug 2004 13:26:47 +0000 (GMT)
	(envelope-from stolz@i2.informatik.rwth-aachen.de)
Received: from i2.informatik.rwth-aachen.de (menelaos.informatik.RWTH-Aachen.DE [137.226.194.73])
	by atlas.informatik.rwth-aachen.de (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id i7KDQkkL029097
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 20 Aug 2004 15:26:46 +0200
Received: (from stolz@localhost)
	by i2.informatik.rwth-aachen.de (8.12.10/8.12.10/Submit) id i7KDQkKv078315;
	Fri, 20 Aug 2004 15:26:46 +0200 (CEST)
	(envelope-from stolz)
Message-Id: <200408201326.i7KDQkKv078315@i2.informatik.rwth-aachen.de>
Date: Fri, 20 Aug 2004 15:26:46 +0200 (CEST)
From: Volker Stolz <vs@freebsd.org>
Reply-To: Volker Stolz <stolz@i2.informatik.rwth-aachen.de>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: gcore/procfs not finding /proc/pid/file on repeated invocations when running from NFS
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         70708
>Category:       kern
>Synopsis:       [nfs] gcore/procfs not finding /proc/pid/file on repeated invocations when running from NFS
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Aug 20 13:30:13 GMT 2004
>Closed-Date:    
>Last-Modified:  Tue Nov 30 17:20:17 GMT 2004
>Originator:     Volker Stolz
>Release:        FreeBSD 4.10-STABLE i386
>Organization:
Lehrstuhl für Informatik II; RWTH Aachen Universität
>Environment:

System: FreeBSD menelaos.informatik.rwth-aachen.de 4.10-STABLE FreeBSD 4.10-STABLE #17: Tue Jun 22 17:00:16 CEST 2004 root@menelaos.informatik.rwth-aachen.de:/usr/obj/usr/src/sys/MENELAOS i386
Effect: Can't find /proc/pid/file on repeated invocations

System: FreeBSD monster.theater.foldr.org 5.2.1-RELEASE FreeBSD 5.2.1-RELEASE #0: Mon Feb 23 20:45:55 GMT 2004     root@wv1u.btc.adaptec.com:/usr/obj/usr/src/sys/GENERIC  i386
Effect: gcore fails to restart the program(?)

>Description:
The attached program invokes /usr/bin/gcore via system() to dump its own core.
On 4.10, when run from an NFS-mounted directory, this works only the first time,
and then only again after 'changing' the directory. It does not happen on a local drive!
See How-To-Repeat below. For some reason the procfs file entry (/proc/pid/file)
gets set to "unknown" in between.

On 5.2.1, gcore 'fails to wake up the process' despite the -s (in quotes because
I'm not sure what's going on).

vs@monster [15:00:42]> ./a.out
I'm 50559
Calling gcore -s 50559

[1]+  Stopped                 ./a.out
vs@monster [15:12:52]>
...
vs@monster [15:13:00]> jobs
[1]+  Stopped                 ./a.out


>How-To-Repeat:
Compile the attached program in an NFS-mounted directory. Note that I was only able to
reliably reproduce this when /invoking/via/long/nfs/mounted/path/to/a.out!
It takes longer to happen when I'm simply in NFS-mounted $HOME and doesn't seem to
happen at all on a local drive. Notice how an intermediate "cd ." helps.

pthread@menelaos [15:18:45]> ./a.out
I'm 78260
Calling gcore -s 78260

[1]+  Stopped                 ./a.out
pthread@menelaos [15:18:46]> ./a.out
I'm 78263
Calling gcore -s 78263
gcore: /proc/78263/file: No such file or directory
system failed: 256
lr-xr-xr-x  1 stolz  info2  7 Aug 20 15:18 /proc/78263/file -> unknown
[1]+  Done                    ./a.out
pthread@menelaos [15:18:46]> ./a.out
I'm 78268
Calling gcore -s 78268
gcore: /proc/78268/file: No such file or directory
system failed: 256
lr-xr-xr-x  1 stolz  info2  7 Aug 20 15:18 /proc/78268/file -> unknown
pthread@menelaos [15:18:48]> cd .
pthread@menelaos [15:18:49]> ./a.out
I'm 78273
Calling gcore -s 78273

[1]+  Stopped                 ./a.out
pthread@menelaos [15:18:51]> ./a.out
I'm 78276
Calling gcore -s 78276
gcore: /proc/78276/file: No such file or directory
system failed: 256
lr-xr-xr-x  1 stolz  info2  7 Aug 20 15:18 /proc/78276/file -> unknown
[1]+  Done                    ./a.out

=================================================================
#include <sys/types.h>
#include <unistd.h>

#include <signal.h>
#include <stdlib.h>
#include <stdio.h>

static void gencore(char *);

static char buf[256];
static char lsbuf[256];

static pid_t pid;

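int main(int argc, char** argv) {

        pid = getpid();
        fprintf(stderr, "I'm %i\n", (int)pid);
        /* Build the gcore command and the diagnostic ls command for this PID. */
        snprintf(buf, sizeof(buf), "gcore -s %i", (int)pid);
        snprintf(lsbuf, sizeof(lsbuf), "ls -l /proc/%i/file", (int)pid);
        gencore(buf);
        exit(0);        /* not reached: gencore() always exits */
}

/* Run the given gcore command; on failure, show where /proc/pid/file points. */
void gencore(char *buf) {
        int res;

        fprintf(stderr, "Calling %s\n", buf);
        res = system(buf);
        if (res != 0) {
                fprintf(stderr, "system failed: %i\n", res);
                system(lsbuf);
        }
        exit(0);
}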
=================================================================

>Fix:

	


>Release-Note:
>Audit-Trail:

From: David Schultz <das@FreeBSD.ORG>
To: Volker Stolz <stolz@i2.informatik.rwth-aachen.de>
Cc: FreeBSD-gnats-submit@FreeBSD.ORG
Subject: Re: kern/70708: gcore/procfs not finding /proc/pid/file on repeated invocations when running from NFS
Date: Tue, 30 Nov 2004 12:14:39 -0500

 This is due to a bug in procfs.  Specifically, procfs uses
 vn_fullpath() to generate the appropriate symbolic link for
 /proc/$pid/file, and vn_fullpath() fails if one or more pathname
 components is missing from the name cache.  We cannot solve this
 problem in procfs without a fully functional getcwd()
 implementation in the kernel.
 
 However, for this particular problem, it may be possible to work
 around the issue in gcore, since the only reason gcore needs
 procfs is to determine whether the binary is ELF.  Possible
 solutions:
 
 - Extend ptrace(2) to make it possible to open a traced process'
   text file without procfs.  This is generally a good idea anyway,
   because /proc/$pid/file will never work for processes whose text
   files have been unlinked from the file system.
 
 - Don't bother checking whether the binary is ELF or not, since
   gcore no longer supports a.out.  This would result in invalid
   dumps for non-ELF processes, and merely defers the problem until
   someone invents an ELF++ dump format.
 
 I have no time to work on this right now, so it would be great if
 someone could pick this up.
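
The reply above notes that gcore needs procfs only to decide whether the
binary is ELF. As an illustration only, here is a minimal sketch of such a
check on a buffer holding the first bytes of the text file. The name
looks_like_elf() is hypothetical and does not come from the gcore source.
How the bytes are obtained (procfs, or a ptrace(2) extension as suggested
above) is deliberately left open.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the buffer starts with the four-byte
 * ELF magic (0x7f 'E' 'L' 'F'), 0 otherwise. It checks the magic only and
 * does not validate the rest of the ELF header.
 */
int
looks_like_elf(const unsigned char *hdr, size_t len)
{
	static const unsigned char elf_magic[4] = { 0x7f, 'E', 'L', 'F' };

	assert(hdr != NULL);
	if (len < sizeof(elf_magic))
		return (0);
	return (memcmp(hdr, elf_magic, sizeof(elf_magic)) == 0);
}
```

A caller would read at least four bytes from the start of the executable
and pass them in. A fuller check would also examine the class, data
encoding, and version fields of the header.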
>Unformatted:
