From tegge@not.fast.no Sat Jul 24 20:13:01 1999
Return-Path: <tegge@not.fast.no>
Received: from midten.fast.no (midten.fast.no [195.139.251.11])
	by hub.freebsd.org (Postfix) with ESMTP id 7951014D5A
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 24 Jul 1999 20:12:55 -0700 (PDT)
	(envelope-from tegge@not.fast.no)
Received: from not.fast.no (IDENT:tegge@not.fast.no [195.139.251.12])
	by midten.fast.no (8.9.3/8.9.3) with ESMTP id FAA22010
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 25 Jul 1999 05:10:47 +0200 (CEST)
Received: (from tegge@localhost)
	by not.fast.no (8.9.3/8.8.8) id FAA03328;
	Sun, 25 Jul 1999 05:10:47 +0200 (CEST)
	(envelope-from tegge@not.fast.no)
Message-Id: <199907250310.FAA03328@not.fast.no>
Date: Sun, 25 Jul 1999 05:10:47 +0200 (CEST)
From: Tor Egge <tegge@not.fast.no>
Reply-To: tegge@not.fast.no
To: FreeBSD-gnats-submit@freebsd.org
Subject: buffer leak in cluster_wbuild
X-Send-Pr-Version: 3.2

>Number:         12800
>Category:       kern
>Synopsis:       buffer leak in cluster_wbuild
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    tegge
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Jul 24 20:20:00 PDT 1999
>Closed-Date:    Thu Nov 15 18:19:36 2001
>Last-Modified:  Thu Nov 15 18:21:20 PST 2001
>Originator:     Tor Egge
>Release:        FreeBSD 4.0-CURRENT i386
>Organization:
Fast Search & Transfer ASA
>Environment:

FreeBSD not.fast.no 4.0-CURRENT FreeBSD 4.0-CURRENT #0: Fri Jul 23 02:23:08 CEST 1999     root@not.fast.no:/usr/src/sys/compile/NOT_SMP  i386

>Description:

If a candidate buffer for clustering contains a page marked BUSY,
cluster_wbuild fails to return the buffer to the appropriate previous state.
This causes processes to get stuck in getblk, because the buffer
remains marked busy with no one around to unbusy it.

When running the program below to reproduce the problem, some other
symptoms were also present:

	- Spurious SIGBUS signals.

	  mmap returned success, but the address area returned by mmap
	  is apparently not always accessible, causing the SIGBUS during
	  the memset operation.

	- Corrupt coredumps.  Probably due to problems accessing the
	  area returned by mmap().

>How-To-Repeat:

On a machine with 512MB of memory and more than 1GB of swap:

  As root:

	sysctl -w vm.swap_enabled=0
	sysctl -w kern.corefile='%N.core.%P'

  As a normal user, on a file system with lots of free space:

	cc -g -O2 -static -o badwrite badwrite.c
	ulimit -c unlimited
	./badwrite 400
	
  Wait 10-15 minutes, then use Ctrl-C to stop the program.

  There should be some corrupt core files in the current directory, and
  some stuck processes on the machine.

-------------- start of badwrite.c -----
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <stdio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define PAGESIZE 4096

#define TRASHSIZE (2*1024*1024)

void writeit(void);

int main(int argc, char **argv)
{
  int childcnt, i;
  pid_t pid;

  if (argc  >= 2)
    childcnt = atoi(argv[1]);
  else
    childcnt = 20;

  if (childcnt < 0)
    childcnt = 1;

  if (childcnt > 800)
    childcnt = 800;

  for (i = 0; i < childcnt; i++) {
    pid = fork();
    if (pid < 0)
      exit(1);
    if (pid == 0) {
      writeit();
      exit(0);
    }
  }

  while (wait(NULL) >= 0) {
  }
  exit(0);
}


void writeit(void)
{
  char buf[100];
  pid_t pid;
  int fd;
  int tbuf[8192];
  char *trashmem;
  int trashpos;
  char *mapos;
  off_t fd_off, map_off;
  int pageoffset;
  
  int wgot, wtry;

  pid = getpid();
  sprintf(buf, "tmpfile.%d", (int) pid);

  srandom(time(NULL) + pid);
  
  fd = open(buf, O_RDWR | O_CREAT | O_TRUNC | O_APPEND, 0666);
  if (fd < 0)
    exit(1);

  unlink(buf);

  trashmem = malloc(TRASHSIZE);

  memset(tbuf, 0, sizeof(tbuf));

  fd_off = 0;

  while (1) {
    wtry = 1 + (random() % sizeof(tbuf));
    
    wgot = write(fd, tbuf, wtry);
    
    if (wgot != wtry)
      exit(1);

    usleep(random() % 65536);

#if 1
    trashpos = (random() % (TRASHSIZE / PAGESIZE)) * PAGESIZE;
    trashmem[trashpos]++;
#endif

    fd_off += wgot;

    pageoffset = fd_off & (PAGESIZE - 1);

    map_off = fd_off - pageoffset;

    mapos = mmap(NULL, PAGESIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE,
		 fd, map_off);

    if (mapos == MAP_FAILED)
      exit(1);
    
    memset(mapos + pageoffset, 0, PAGESIZE - pageoffset);

    munmap(mapos, PAGESIZE);
  }
  
}
----------------

>Fix:

>Release-Note:
>Audit-Trail:

From: Tor.Egge@fast.no
To: tegge@not.fast.no
Cc: FreeBSD-gnats-submit@FreeBSD.ORG
Subject: Re: kern/12800: buffer leak in cluster_wbuild
Date: Sun, 25 Jul 1999 07:23:18 +0200

 With this patch installed, the problem with processes getting stuck in
 getblk disappeared.
 
 The spurious SIGBUSes were due to mmap allowing us to map memory
 completely after the end of the file.  When accessing the pages that
 weren't even partially backed by the file, the result was a SIGBUS.
 
 The coredump routines need some more robustness against the program
 having performed incorrect mmap() operations.
 
 ---------------
 Index: vfs_cluster.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/kern/vfs_cluster.c,v
 retrieving revision 1.87
 diff -u -r1.87 vfs_cluster.c
 --- vfs_cluster.c	1999/07/08 06:05:53	1.87
 +++ vfs_cluster.c	1999/07/25 05:08:52
 @@ -774,6 +774,20 @@
  					splx(s);
  					break;
  				}
 +				if (tbp->b_flags & B_VMIO) {
 +					vm_page_t m;
 +
 +					for (j = 0; 
 +					     j < tbp->b_npages; j += 1) {
 +						m = tbp->b_pages[j];
 +						if (m->flags & PG_BUSY) {
 +							BUF_UNLOCK(tbp);
 +							splx(s);
 +							goto finishcluster;
 +						}
 +					}
 +				}
 +					
  				/*
  				 * Ok, it's passed all the tests,
  				 * so remove it from the free list
 @@ -798,7 +812,7 @@
  					for (j = 0; j < tbp->b_npages; j += 1) {
  						m = tbp->b_pages[j];
  						if (m->flags & PG_BUSY)
 -							goto finishcluster;
 +							panic("cluster_wbuild: PG_BUSY: m=%p, tbp=%p\n", m, tbp);
  					}
  				}
  					
 
 ---------------
 

From: Tor.Egge@fast.no
To: freebsd-gnats-submit@freebsd.org
Cc: dillon@apollo.backplane.com
Subject: Re: kern/12800: buffer leak in cluster_wbuild
Date: Mon, 30 Aug 1999 21:00:22 +0200

 On FreeBSD it is legal to mmap() regions beyond the end of the backing file.
 The supplied test program tried to access pages after the end of the 
 backing file.  That was a bug in the test program, and SIGBUS is the
 normal expected behavior on FreeBSD when accessing those pages.
 
 Matt Dillon has suggested the following patch which is better than
 the one previously suggested by me.
 
 Index: vfs_cluster.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/kern/vfs_cluster.c,v
 retrieving revision 1.88
 diff -u -r1.88 vfs_cluster.c
 --- vfs_cluster.c	1999/08/28 00:46:23	1.88
 +++ vfs_cluster.c	1999/08/30 03:49:02
 @@ -797,8 +797,10 @@
  				if (i != 0) { /* if not first buffer */
  					for (j = 0; j < tbp->b_npages; j += 1) {
  						m = tbp->b_pages[j];
 -						if (m->flags & PG_BUSY)
 +						if (m->flags & PG_BUSY) {
 +							bqrelse(tbp);
  							goto finishcluster;
 +						}
  					}
  				}
  					
 
 
 The problem with corrupt coredumps still remains.
 
 - Tor Egge
 
Responsible-Changed-From-To: freebsd-bugs->tegge 
Responsible-Changed-By: mike 
Responsible-Changed-When: Fri Jul 20 16:57:53 PDT 2001 
Responsible-Changed-Why:  

Originator is a committer.  Tor, feel free to send this back to 
freebsd-bugs, if you can't deal with it. 

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=12800

State-Changed-From-To: open->closed
State-Changed-By: tegge
State-Changed-When: Thu Nov 15 18:19:36 2001
State-Changed-Why:
Buffer leak fixed in revision 1.89 of sys/kern/vfs_cluster.c.
>Unformatted:
