From nobody@FreeBSD.org  Tue Nov 27 05:19:44 2001
Return-Path: <nobody@FreeBSD.org>
Received: from freefall.freebsd.org (freefall.FreeBSD.org [216.136.204.21])
	by hub.freebsd.org (Postfix) with ESMTP id 559E837B419
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 27 Nov 2001 05:19:43 -0800 (PST)
Received: (from nobody@localhost)
	by freefall.freebsd.org (8.11.6/8.11.6) id fARDJhS90471;
	Tue, 27 Nov 2001 05:19:43 -0800 (PST)
	(envelope-from nobody)
Message-Id: <200111271319.fARDJhS90471@freefall.freebsd.org>
Date: Tue, 27 Nov 2001 05:19:43 -0800 (PST)
From: Derren Lu <derrenl@synology.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: system panic in quotaoff 
X-Send-Pr-Version: www-1.0

>Number:         32331
>Category:       kern
>Synopsis:       system panic in quotaoff
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Nov 27 05:20:05 PST 2001
>Closed-Date:    Wed Aug 21 16:21:41 PDT 2002
>Last-Modified:  Wed Aug 21 16:21:41 PDT 2002
>Originator:     Derren Lu
>Release:        4.1-20000812-STABLE
>Organization:
Synology Inc
>Environment:
FreeBSD derrenl-dev.synology.com 4.1-20000812-STABLE 4.1-20000812-STABLE #8 Tue Nov 10 11:53:07 CST 2001 root@derrenl-dev.synology.com:/usr/src/sys/compile/MYKERNEL  i386
>Description:
On a machine with several quota-enabled file systems, you can trigger a panic with the following steps:
1. unmount some of these file systems,
2. do some quota operations (e.g. setquota) on the file systems that are still mounted,
3. re-mount the unmounted file systems and enable quota on them,
4. repeat steps 1 ~ 3 until the system panics in quotaoff.

The call stack is:
dqflush()
quotaoff()
ffs_flushfiles()
ffs_unmount()
dounmount()
unmount()
syscall2()
Xint0x80_syscall()


The line marked with "=>" is the exact source of the panic. It takes a page fault because the value of dq->dq_ump is 0.
static void
dqflush()
{
...
    for (dqh = ...) {
        for (dq = ...) {
            nextdq = dq->dq_hash.le_next;
=>          if (dq->dq_ump->um_quotas[dq->dq_type] != vp)
                continue;
            ...
        }
    }
...
}


>How-To-Repeat:
On a machine with several quota-enabled file systems, you can trigger a panic with the following steps:
1. quotaoff some of these file systems,
2. do some quota operations (e.g. setquota),
3. quotaon them again,
4. repeat steps 1 ~ 3 until the system panics in quotaoff.
>Fix:
I traced the code and found that the problem may be in dqget() in sys/ufs/ufs/ufs_quota.c:

static int
dqget()
{
...
if (numdquot < desireddquot) {
    dq = ...
    bzero((char *) dq, sizeof *dq);
    numdquot++;
} else {
    if ((dq = dqfreelist.tqh_first) == NULL) {
        ...
    }
    if (dq->dq_cnt || (dq->dq_flags & DQ_MOD))
        panic("...");
    TAILQ_REMOVE(&dqfreelist, dq, dq_freelist);
=>  LIST_REMOVE(dq, dq_hash);
}
...
}

In my opinion, "LIST_REMOVE(dq, dq_hash)" is buggy because it assumes that a free dquot taken from the free list is still linked on a dqhash list. That assumption is incorrect. Once a dquot has been freed by quotaoff(), dqflush() removes it from its dqhash list, so when dqget() later picks it up it is no longer linked there. However, dqflush() does not reset the list fields of the dquot. The second "LIST_REMOVE(dq, dq_hash)" therefore follows stale pointers and may relink the dquot's former next element into the dqhash list. Since that next element may itself be a freed dquot, its dq_ump field may be 0, and this can make the system panic in the next dqflush().
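
The relinking described above can be reproduced outside the kernel. The following is only a minimal user-space sketch, not the kernel code: struct toy_dquot, struct toy_head and the toy_* functions are hypothetical stand-ins that copy the pointer updates of the classic LIST_INSERT_HEAD and LIST_REMOVE macros, and the two-element chain is an assumed scenario.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dquot: hash linkage and dq_ump only. */
struct toy_dquot {
    struct toy_dquot  *le_next;   /* next element on the hash chain */
    struct toy_dquot **le_prev;   /* address of the previous next-pointer */
    void              *dq_ump;    /* NULL once the dquot has been flushed */
};

struct toy_head {
    struct toy_dquot *lh_first;
};

/* Same pointer updates as a classic LIST_INSERT_HEAD. */
static void
toy_insert_head(struct toy_head *h, struct toy_dquot *dq)
{
    if ((dq->le_next = h->lh_first) != NULL)
        h->lh_first->le_prev = &dq->le_next;
    h->lh_first = dq;
    dq->le_prev = &h->lh_first;
}

/* Same pointer updates as a classic LIST_REMOVE; dq's own fields stay as-is. */
static void
toy_remove(struct toy_dquot *dq)
{
    if (dq->le_next != NULL)
        dq->le_next->le_prev = dq->le_prev;
    *dq->le_prev = dq->le_next;
}

/*
 * Returns 1 if a second toy_remove() on an already-flushed dquot puts
 * its stale successor back on the chain, 0 otherwise.
 */
static int
toy_double_remove_relinks(void)
{
    struct toy_head chain = { NULL };
    struct toy_dquot a = { NULL, NULL, NULL };
    struct toy_dquot b = { NULL, NULL, NULL };
    int dummy_mount;                   /* placeholder for a ufsmount */

    a.dq_ump = b.dq_ump = &dummy_mount;
    toy_insert_head(&chain, &b);
    toy_insert_head(&chain, &a);       /* chain: a -> b */

    /* Roughly what dqflush() does: unlink each dquot, clear dq_ump. */
    toy_remove(&a); a.dq_ump = NULL;   /* a.le_next still points at b */
    toy_remove(&b); b.dq_ump = NULL;
    assert(chain.lh_first == NULL);    /* chain is empty, as intended */

    /* Roughly what dqget() does when it recycles a from the free list. */
    toy_remove(&a);

    /* b is reachable again even though its dq_ump is NULL. */
    return chain.lh_first == &b && chain.lh_first->dq_ump == NULL;
}
```

Calling toy_double_remove_relinks() walks through one flush followed by one recycle. In this model the flushed element b ends up back at the head of the chain with a null dq_ump, which is the state that the marked line in dqflush() would then dereference.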

To solve this problem, I added a new quota flag, DQ_FLUSHED:

#define DQ_FLUSHED  0x40

I also modified the functions dqget() and dqflush(). The lines marked with "=>" are the ones I added.

dqget()
{
...
if (numdquot < desireddquot) {
    dq = ...
    bzero((char *) dq, sizeof *dq);
    numdquot++;
} else {
    if ((dq = dqfreelist.tqh_first) == NULL) {
        ...
    }
    if (dq->dq_cnt || (dq->dq_flags & DQ_MOD))
        panic("...");
    TAILQ_REMOVE(&dqfreelist, dq, dq_freelist);
    /*
     * Only remove the dquot from the hash list if it is still
     * linked on a dqhash list.
     */
=>  if ((dq->dq_flags & DQ_FLUSHED) == 0)
        LIST_REMOVE(dq, dq_hash);
}
...
}


dqflush()
{
...
    for (dqh = ...) {
        for (dq = ...) {
            ...
            dq->dq_ump = (struct ufsmount *)0;
            /*
             * Mark this dquot so that dqget() will not remove it from
             * the dqhash list a second time.
             */
=>          dq->dq_flags |= DQ_FLUSHED;
        }
    }
...
}
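
The effect of the DQ_FLUSHED guard can be checked with a small user-space model. This is a sketch under assumptions, not the patched kernel code: the toy_* types and functions are hypothetical, TOY_DQ_FLUSHED simply reuses the 0x40 value proposed above, and the list helpers copy the pointer updates of the classic LIST macros.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_DQ_FLUSHED 0x40   /* same value as the proposed DQ_FLUSHED */

/* Hypothetical stand-in for struct dquot. */
struct toy_dquot {
    struct toy_dquot  *le_next;
    struct toy_dquot **le_prev;
    void              *dq_ump;
    int                dq_flags;
};

struct toy_head {
    struct toy_dquot *lh_first;
};

/* Same pointer updates as a classic LIST_INSERT_HEAD. */
static void
toy_insert_head(struct toy_head *h, struct toy_dquot *dq)
{
    if ((dq->le_next = h->lh_first) != NULL)
        h->lh_first->le_prev = &dq->le_next;
    h->lh_first = dq;
    dq->le_prev = &h->lh_first;
}

/* Same pointer updates as a classic LIST_REMOVE. */
static void
toy_remove(struct toy_dquot *dq)
{
    if (dq->le_next != NULL)
        dq->le_next->le_prev = dq->le_prev;
    *dq->le_prev = dq->le_next;
}

/* dqflush()-like step: unlink, clear the mount pointer, mark as flushed. */
static void
toy_flush_one(struct toy_dquot *dq)
{
    toy_remove(dq);
    dq->dq_ump = NULL;
    dq->dq_flags |= TOY_DQ_FLUSHED;
}

/* dqget()-like recycle step: unlink only if still on a hash chain. */
static void
toy_recycle(struct toy_dquot *dq)
{
    if ((dq->dq_flags & TOY_DQ_FLUSHED) == 0)
        toy_remove(dq);
}

/* Returns 1 if the chain is still empty after recycling a flushed dquot. */
static int
toy_guarded_recycle_keeps_chain_empty(void)
{
    struct toy_head chain = { NULL };
    struct toy_dquot a = { NULL, NULL, NULL, 0 };
    struct toy_dquot b = { NULL, NULL, NULL, 0 };
    int dummy_mount;                 /* placeholder for a ufsmount */

    a.dq_ump = b.dq_ump = &dummy_mount;
    toy_insert_head(&chain, &b);
    toy_insert_head(&chain, &a);     /* chain: a -> b */

    toy_flush_one(&a);
    toy_flush_one(&b);
    toy_recycle(&a);                 /* skipped: a is marked flushed */

    return chain.lh_first == NULL;
}
```

With the guard in place, recycling the flushed element leaves the chain empty in this model. The sketch does not model putting a recycled dquot back on a hash chain; the flag has to be cleared at that point so that a later recycle of the same dquot unlinks it again.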


>Release-Note:
>Audit-Trail:

From: Brad Laue <brad@brad-x.com>
To: freebsd-gnats-submit@FreeBSD.org, derrenl@synology.com
Cc:  
Subject: Re: kern/32331: system panic in quotaoff
Date: Tue, 01 Jan 2002 12:54:19 -0500

 I'd like to confirm this same set of circumstances as of 4.2-STABLE and 
 4.4-STABLE - I don't know whether this patch is still relevant or 
 whether it worked, but this problem still exists.
 
 I was unable to get the output of a crash report (the kernel message 
 appearing onscreen), but I can provide other info if it is needed.
 
State-Changed-From-To: open->feedback 
State-Changed-By: phk 
State-Changed-When: Thu Jan 10 07:03:10 PST 2002 
State-Changed-Why:  
Can you please try the patch I committed to sys/ufs/ufs/ufs_quota.c 
in version 1.50 ? 

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/ufs/ufs/ufs_quota.c.diff?r1=1.49&r2=1.50 

Thanks! 

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=32331 
State-Changed-From-To: feedback->closed 
State-Changed-By: keramida 
State-Changed-When: Wed Aug 21 16:21:28 PDT 2002 
State-Changed-Why:  
Feedback timeout. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=32331 
>Unformatted:
