From nobody@FreeBSD.org  Thu Nov 13 15:04:32 2008
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6FD3B106567F
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 13 Nov 2008 15:04:32 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 621498FC1B
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 13 Nov 2008 15:04:32 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id mADF4Vh5059925
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 13 Nov 2008 15:04:31 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id mADF4Vmd059924;
	Thu, 13 Nov 2008 15:04:31 GMT
	(envelope-from nobody)
Message-Id: <200811131504.mADF4Vmd059924@www.freebsd.org>
Date: Thu, 13 Nov 2008 15:04:31 GMT
From: Andrew Gierth <andrew@tao11.riddles.org.uk>
To: freebsd-gnats-submit@FreeBSD.org
Subject: page fault under load with igb/LRO
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         128840
>Category:       kern
>Synopsis:       [igb] page fault under load with igb/LRO
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    jfv
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Nov 13 15:10:01 UTC 2008
>Closed-Date:    
>Last-Modified:  Mon Aug 23 17:58:28 UTC 2010
>Originator:     Andrew Gierth
>Release:        FreeBSD 7.1-PRERELEASE (2008-11-09)
>Organization:
>Environment:
FreeBSD redacted.example.com 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #0: Mon Nov 10 20:41:49 UTC 2008     root@:/usr/obj/usr/src/sys/REDACTED  amd64

>Description:
Kernel page fault: tcp_lro_flush passes a NULL mbuf pointer to ether_input (frame #7, m=0x0):

(kgdb) where
#0  doadump () at pcpu.h:195
#1  0xffffffff80281888 in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xffffffff80281cec in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3  0xffffffff803c91c3 in trap_fatal (frame=0xc, eva=Variable "eva" is not available.
)
    at /usr/src/sys/amd64/amd64/trap.c:764
#4  0xffffffff803c95a4 in trap_pfault (frame=0xfffffffface6f9f0, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:680
#5  0xffffffff803c9efa in trap (frame=0xfffffffface6f9f0)
    at /usr/src/sys/amd64/amd64/trap.c:449
#6  0xffffffff803aee3e in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:209
#7  0xffffffff8031eadf in ether_input (ifp=0xffffff00010bb800, m=0x0)
    at /usr/src/sys/net/if_ethersubr.c:531
#8  0xffffffff8034779b in tcp_lro_flush (cntl=0xffffff000120a258, 
    lro=0xffffff00036e4000) at /usr/src/sys/netinet/tcp_lro.c:168
#9  0xffffffff801c4d07 in igb_rxeof (rxr=0xffffff000120a258, count=73)
    at /usr/src/sys/dev/e1000/if_igb.c:4018
#10 0xffffffff801c4ffb in igb_handle_rx (context=0xffffff000120a200, pending=Variable "pending" is not available.
)
    at /usr/src/sys/dev/e1000/if_igb.c:1337
#11 0xffffffff802b796d in taskqueue_run (queue=0xffffff0002400800)
    at /usr/src/sys/kern/subr_taskqueue.c:282
#12 0xffffffff802b7c32 in taskqueue_thread_loop (arg=Variable "arg" is not available.
)
    at /usr/src/sys/kern/subr_taskqueue.c:401
#13 0xffffffff8025ec2f in fork_exit (
    callout=0xffffffff802b7bc0 <taskqueue_thread_loop>, 
    arg=0xffffff00011584d8, frame=0xfffffffface6fc80)
    at /usr/src/sys/kern/kern_fork.c:804
#14 0xffffffff803af20e in fork_trampoline ()
    at /usr/src/sys/amd64/amd64/exception.S:455

[...]

#8  0xffffffff8034779b in tcp_lro_flush (cntl=0xffffff000120a258, 
    lro=0xffffff00036e4000) at /usr/src/sys/netinet/tcp_lro.c:168
168             (*ifp->if_input)(cntl->ifp, lro->m_head);
(kgdb) print lro
$1 = (struct lro_entry *) 0xffffff00036e4000
(kgdb) print *lro
$2 = {next = {sle_next = 0xffffff00036e3c80}, m_head = 0x0, 
  m_tail = 0xffffff0003a96900, timestamp = 0, ip = 0xffffff0003aac810, 
  tsval = 87166632, tsecr = 1844070041, source_ip = 4124597842, 
  dest_ip = 4107820626, next_seq = 2241419788, ack_seq = 1871884633, 
  len = 122, data_csum = 53193, window = 22336, source_port = 24564, 
  dest_port = 14357, append_cnt = 0, mss = 56}

Note that lro->m_head == NULL although m_tail is still set. That NULL is
handed to if_input as the mbuf (m=0x0 in frame #7), hence the crash.
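
To illustrate the failure mode, here is a minimal userland sketch in C. It
is not the kernel code and not a proposed fix: the types are hypothetical
stand-ins, and fake_ether_input, lro_flush_checked and the delivered
counter are names invented for this example. It only shows that a flush
routine which checks m_head before calling if_input would not dereference
NULL. How m_head became NULL in the first place is not established by this
report, so a guard like this would at best hide the symptom.

```c
#include <stddef.h>

/* Hypothetical userland stand-ins; NOT the real kernel definitions. */
struct mbuf {
    int len;
};

struct ifnet;
typedef void (*if_input_fn)(struct ifnet *, struct mbuf *);

struct ifnet {
    if_input_fn   if_input;   /* input handler, as called at tcp_lro.c:168 */
    unsigned long delivered;  /* invented counter, for this example only */
};

struct lro_entry {
    struct mbuf *m_head;
    struct mbuf *m_tail;
};

/* Stand-in for ether_input: it dereferences m, so m == NULL would fault. */
static void
fake_ether_input(struct ifnet *ifp, struct mbuf *m)
{
    ifp->delivered += (unsigned long)m->len;
}

/*
 * Flush one LRO entry, but refuse to hand a NULL chain to if_input.
 * Returns 1 if the chain was delivered, 0 if the entry was skipped.
 */
static int
lro_flush_checked(struct ifnet *ifp, struct lro_entry *lro)
{
    if (lro->m_head == NULL) {
        /* Inconsistent entry (the state seen in the dump): drop it. */
        lro->m_tail = NULL;
        return 0;
    }
    (*ifp->if_input)(ifp, lro->m_head);
    lro->m_head = NULL;
    lro->m_tail = NULL;
    return 1;
}
```

Calling lro_flush_checked on an entry with m_head == NULL and a non-NULL
m_tail (the state printed by kgdb above) returns 0 without invoking the
input handler; an entry with a valid m_head is delivered once and then
cleared.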

>How-To-Repeat:
I got this while running PostgreSQL's "pgbench" benchmark with 100
concurrent connections from a remote host (over a gigabit Ethernet
network). This is a request/response workload with relatively small
requests and responses; the crash occurred after several minutes of load.
The machine that crashed was the server, not the remote pgbench host.

Repeating the same workload (and heavier versions of it) with
hw.igb.enable_lro=0 did not produce any crashes.

>Fix:


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-net 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Sat Nov 15 11:46:24 UTC 2008 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=128840 
Responsible-Changed-From-To: freebsd-net->jfv 
Responsible-Changed-By: andre 
Responsible-Changed-When: Mon Aug 23 17:58:01 UTC 2010 
Responsible-Changed-Why:  
Over to maintainer. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=128840 
>Unformatted:
