From ade@FreeBSD.org  Sun Aug 14 09:06:38 2005
Return-Path: <ade@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 2D25016A41F
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 14 Aug 2005 09:06:38 +0000 (GMT)
	(envelope-from ade@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
	by mx1.FreeBSD.org (Postfix) with ESMTP id E945843D46
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 14 Aug 2005 09:06:37 +0000 (GMT)
	(envelope-from ade@FreeBSD.org)
Received: from freefall.freebsd.org (ade@localhost [127.0.0.1])
	by freefall.freebsd.org (8.13.3/8.13.3) with ESMTP id j7E96bvq018882
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 14 Aug 2005 09:06:37 GMT
	(envelope-from ade@freefall.freebsd.org)
Received: (from ade@localhost)
	by freefall.freebsd.org (8.13.3/8.13.1/Submit) id j7E96bPI018881;
	Sun, 14 Aug 2005 09:06:37 GMT
	(envelope-from ade)
Message-Id: <200508140906.j7E96bPI018881@freefall.freebsd.org>
Date: Sun, 14 Aug 2005 09:06:37 GMT
From: Ade Lovett <ade@FreeBSD.org>
Reply-To: Ade Lovett <ade@FreeBSD.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: Incorrect initialization of nswbuf
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         84903
>Category:       kern
>Synopsis:       Incorrect initialization of nswbuf
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    ade
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Aug 14 09:10:11 GMT 2005
>Closed-Date:    Tue Aug 16 18:56:32 GMT 2005
>Last-Modified:  Tue Aug 16 18:56:32 GMT 2005
>Originator:     Ade Lovett
>Release:        All FreeBSD > 5.0
>Organization:
Supernews
>Environment:

	Any FreeBSD system (RELENG_5, RELENG_6, and HEAD) after
	revision 1.132 of sys/vm/vnode_pager.c (4 years, 1 month ago)

>Description:

Whilst attempting to nail down some serious performance issues (compared
with 4.x) in preparation for a 6.x rollout here, we've come across
something of a fundamental bug.

In this particular environment (a Usenet transit server, so very high
network and disk I/O) we observed that processes were spending a
considerable amount of time in state 'wswbuf', traced back to getpbuf()
in vm/vm_pager.c.

To cut a long story short, the order in which nswbuf is being
initialized is completely, totally, and utterly wrong -- this was
introduced by revision 1.132 of vm/vnode_pager.c just over 4 years ago.

In vnode_pager.c we find:

static void
vnode_pager_init(void)
{
	vnode_pbuf_freecnt = nswbuf / 2 + 1;
}

Unfortunately, nswbuf has not been assigned at this point.  It is still
zero (in all cases), so vnode_pbuf_freecnt is set to 0 / 2 + 1 = 1, and
the kernel believes that there is only ever *one* swap buffer available.

kern_vfs_bio_buffer_alloc() in kern/vfs_bio.c, which actually does the
calculation and assignment, is called rather later in the startup
sequence, by which time the damage has been done.
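
The ordering problem can be reproduced in a small user-space model.
This is only a sketch, not kernel code: the sim_* names are
hypothetical, and the value 256 is an assumption borrowed from the
0x100 used in the local hack described in this report.  Only the
arithmetic in sim_vnode_pager_init() is taken from vnode_pager_init()
as quoted above.

```c
#include <assert.h>

/* Hypothetical user-space stand-ins for the kernel globals in the report. */
static int nswbuf;		/* static storage: zero until assigned */
static int vnode_pbuf_freecnt;

/* Same arithmetic as vnode_pager_init() quoted in the report. */
static void
sim_vnode_pager_init(void)
{
	vnode_pbuf_freecnt = nswbuf / 2 + 1;
}

/*
 * Stand-in for kern_vfs_bio_buffer_alloc().  The real function computes
 * nswbuf from system parameters; 256 (0x100) is an assumed result.
 */
static void
sim_buffer_alloc(void)
{
	nswbuf = 256;
}

/* Run both steps in the chosen order and return the resulting free count. */
static int
sim_boot(int pager_first)
{
	nswbuf = 0;
	vnode_pbuf_freecnt = 0;
	if (pager_first) {
		sim_vnode_pager_init();	/* reads nswbuf while it is still 0 */
		sim_buffer_alloc();
	} else {
		sim_buffer_alloc();
		sim_vnode_pager_init();	/* reads the assigned value */
	}
	assert(vnode_pbuf_freecnt >= 1);
	return (vnode_pbuf_freecnt);
}
```

In this model sim_boot(1), the order described above, returns 1, while
sim_boot(0) returns 129.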

The net result is that *any* calls involving getpbuf() will be
unconditionally serialized, completely destroying any kind of
concurrency (and performance).
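
To illustrate why a free count of 1 serializes everything, here is a
simplified, non-blocking model of a counted buffer pool.  It is a
sketch under stated assumptions and not the getpbuf() implementation:
struct pbuf_pool, pool_try_get() and concurrent_holders() are
hypothetical names, and where the kernel would put the caller to sleep
(the 'wswbuf' state), the model just counts a waiter.

```c
#include <assert.h>

/* Hypothetical model of a pool guarded by a free counter. */
struct pbuf_pool {
	int freecnt;	/* buffers still available to this consumer */
	int waiters;	/* requests that would have had to sleep */
};

/* Take one buffer if the counter allows it; return 1 on success. */
static int
pool_try_get(struct pbuf_pool *p)
{
	if (p->freecnt == 0) {
		p->waiters++;	/* the kernel would sleep here instead */
		return (0);
	}
	p->freecnt--;
	return (1);
}

/*
 * Issue 'requests' simultaneous requests against a pool that starts
 * with 'freecnt' buffers and report how many hold a buffer at once.
 */
static int
concurrent_holders(int freecnt, int requests)
{
	struct pbuf_pool p = { freecnt, 0 };
	int held = 0;
	int i;

	assert(freecnt >= 0 && requests >= 0);
	for (i = 0; i < requests; i++)
		held += pool_try_get(&p);
	return (held);
}
```

With eight simultaneous requests, concurrent_holders(1, 8) yields 1
(seven requests wait), whereas concurrent_holders(129, 8) yields 8.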

Given the memory footprint of our machines, we've hacked in a simple:

	nswbuf = 0x100;

into vnode_pager_init(), since the calculation ends up giving us the
maximum number anyway.  There are a number of possible 'correct' fixes
in terms of re-ordering the startup sequence.

With the aforementioned hack, we're now seeing considerably better
machine operation, certainly as good as similar 4.10-STABLE boxes.

As per $SUBJECT, this affects all of RELENG_5, RELENG_6, and HEAD, and
should, IMO, be considered an absolutely required fix for 6.0-RELEASE.

>How-To-Repeat:

	N/A
>Fix:

	We have implemented a local hack as above, given that the
	memory footprint of the machines would result in the
	maximal value of nswbuf being assigned in any case.

	This is not a real fix, however.

	A solution has been offered by Alexander Kabaev <kabaev@gmail.com>
	as follows, which appears to do the right thing, at least on
	RELENG_6/i386, which is the only type of machine I have easy
	access to for testing purposes.

	In my opinion, it would be a fatal error to release 6.0 in
	any shape or form without addressing this issue.

Index: vm_init.c
===================================================================
RCS file: /home/ncvs/src/sys/vm/vm_init.c,v
retrieving revision 1.46
diff -u -r1.46 vm_init.c
--- vm_init.c	25 Apr 2005 19:22:05 -0000	1.46
+++ vm_init.c	9 Aug 2005 01:59:12 -0000
@@ -124,7 +124,7 @@
 	vm_map_startup();
 	kmem_init(virtual_avail, virtual_end);
 	pmap_init();
-	vm_pager_init();
+	/* vm_pager_init(); */
 }
 
 void
Index: vm_pager.c
===================================================================
RCS file: /home/ncvs/src/sys/vm/vm_pager.c,v
retrieving revision 1.105
diff -u -r1.105 vm_pager.c
--- vm_pager.c	18 May 2005 20:45:33 -0000	1.105
+++ vm_pager.c	9 Aug 2005 01:59:55 -0000
@@ -202,6 +202,8 @@
 	struct buf *bp;
 	int i;
 
+	vm_pager_init();
+
 	mtx_init(&pbuf_mtx, "pbuf mutex", NULL, MTX_DEF);
 	bp = swbuf;
 	/*
>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->closed 
State-Changed-By: ade 
State-Changed-When: Tue Aug 16 18:55:22 GMT 2005 
State-Changed-Why:  
Fixed in revision 1.223 of sys/vm/vnode_pager.c et al. 
Subsequently MFC'd to RELENG_6 (1.221.2.2) and RELENG_5 (1.196.2.8) 


Responsible-Changed-From-To: freebsd-bugs->ade 
Responsible-Changed-By: ade 
Responsible-Changed-When: Tue Aug 16 18:55:22 GMT 2005 
Responsible-Changed-Why:  
Fixed in revision 1.223 of sys/vm/vnode_pager.c et al. 
Subsequently MFC'd to RELENG_6 (1.221.2.2) and RELENG_5 (1.196.2.8) 

http://www.freebsd.org/cgi/query-pr.cgi?pr=84903 
>Unformatted:
