From peter@spinner.DIALix.COM  Fri Nov  1 01:06:08 1996
Received: from spinner.DIALix.COM (peter@spinner.DIALix.COM [192.203.228.67])
          by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id BAA09767
          for <FreeBSD-gnats-submit@freebsd.org>; Fri, 1 Nov 1996 01:06:04 -0800 (PST)
Received: (from peter@localhost)
          by spinner.DIALix.COM (8.8.2/8.8.2) id RAA09028;
          Fri, 1 Nov 1996 17:06:00 +0800 (WST)
Message-Id: <199611010906.RAA09028@spinner.DIALix.COM>
Date: Fri, 1 Nov 1996 17:06:00 +0800 (WST)
From: Peter Wemm <peter@spinner.DIALix.COM>
Reply-To: peter@spinner.DIALix.COM
To: FreeBSD-gnats-submit@freebsd.org
Subject: TCP doesn't time out of FIN_WAIT_1 and floods packets.
X-Send-Pr-Version: 3.2

>Number:         1940
>Category:       kern
>Synopsis:       TCP doesn't time out of FIN_WAIT_1 and floods packets.
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Nov  1 01:10:01 PST 1996
>Closed-Date:    Sun Apr 26 10:02:04 PDT 1998
>Last-Modified:  Sun Apr 26 10:02:37 PDT 1998
>Originator:     Peter Wemm
>Release:        FreeBSD 2.2-CURRENT i386
>Organization:
not.
>Environment:

Machine built from -current sources within the last week.  The machine is
running a web proxy server (squid-1.1.beta).

>Description:

An unusual jump in outbound network traffic turned out to be caused
by a -current machine with a stuck connection.

The proxy server had initiated the connection:
Active Internet connections
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
[..]
tcp        0    366  fermi.29485            kaos.http  FIN_WAIT_1
[..]

I am not sure of the sequence of events that led to the connection getting
into this state, but it kept exchanging this sequence of packets, even after
squid had closed the connection (and even after squid had been killed):

peter@fermi[8:27am]/home/squid-125# tcpdump -s 1500 -N 
tcpdump: listening on ed0
08:28:26.419290 kaos.http > fermi.29485: . ack 2272139504 win 9112 (DF)
08:28:26.419459 fermi.29485 > kaos.http: F 0:0(0) ack 4294967090 win 17280 <nop,nop,timestamp 371690 324020,nop,nop,cc 21426> (DF)
08:28:26.426922 kaos.http > fermi.29485: . ack 1 win 9112 (DF)
08:28:26.427038 fermi.29485 > kaos.http: F 0:0(0) ack 4294967090 win 17280 <nop,nop,timestamp 371690 324020,nop,nop,cc 21426> (DF)
08:28:26.435747 kaos.http > fermi.29485: . ack 1 win 9112 (DF)
08:28:26.435877 fermi.29485 > kaos.http: F 0:0(0) ack 4294967090 win 17280 <nop,nop,timestamp 371690 324020,nop,nop,cc 21426> (DF)
08:28:26.445638 kaos.http > fermi.29485: . ack 1 win 9112 (DF)
08:28:26.445750 fermi.29485 > kaos.http: F 0:0(0) ack 4294967090 win 17280 <nop,nop,timestamp 371690 324020,nop,nop,cc 21426> (DF)
08:28:26.454399 kaos.http > fermi.29485: . ack 1 win 9112 (DF)
08:28:26.454548 fermi.29485 > kaos.http: F 0:0(0) ack 4294967090 win 17280 <nop,nop,timestamp 371690 324020,nop,nop,cc 21426> (DF)
[...]

I presume this is a half-close; what is meant to happen here?

>How-To-Repeat:

Unknown, but I hope it doesn't.. :-]

>Fix:
	
It seems that the tcp connection managed to get into a state where both
ends were out of sync.  I don't know my tcp states well enough to
understand what is meant to happen here, but hanging in FIN_WAIT_1 forever
doesn't seem nice.
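
For comparison, here is a minimal sketch of the behaviour one would expect
from a bounded retransmit timer: an unacknowledged FIN is resent with
exponential backoff and the connection is dropped once the retry budget is
used up. The backoff table, the retry limit and the 1-second base timeout are
assumptions modelled loosely on the 4.4BSD timer code, not values taken from
this report, and the function name is hypothetical.

```python
# Assumed values, modelled loosely on the 4.4BSD retransmit timer.
# They are illustrative and do not come from the report.
BACKOFF = [1, 2, 4, 8, 16, 32, 64, 64, 64, 64, 64, 64, 64]
MAX_RXT_SHIFT = 12


def retransmit_fin(acked_after, base_rto=1.0):
    """Model a FIN waiting for an acceptable ACK in FIN_WAIT_1.

    acked_after is the number of retransmit timeouts that expire before
    the peer acknowledges the FIN, or None if it never does.
    Returns (outcome, timeouts_expired, elapsed_seconds).
    """
    elapsed = 0.0
    for shift in range(MAX_RXT_SHIFT + 1):
        if acked_after is not None and shift == acked_after:
            # The FIN was acknowledged: FIN_WAIT_1 moves on to FIN_WAIT_2.
            return ("FIN_WAIT_2", shift, elapsed)
        # Timer expired without an acceptable ACK: back off and resend.
        elapsed += base_rto * BACKOFF[shift]
    # Retry budget exhausted: give up instead of retrying forever.
    return ("dropped", MAX_RXT_SHIFT + 1, elapsed)


acked = retransmit_fin(3)
never_acked = retransmit_fin(None)
```

With these assumed numbers a FIN that is never acknowledged is abandoned
after 13 timeouts. The model is simplified: it ignores clamping of the
timeout and any other event that might restart the timer.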

>Release-Note:
>Audit-Trail:

From: David Greenman <dg@root.com>
To: peter@spinner.DIALix.COM
Cc: FreeBSD-gnats-submit@FreeBSD.ORG
Subject: Re: kern/1940: TCP doesn't time out of FIN_WAIT_1 and floods packets. 
Date: Fri, 01 Nov 1996 01:35:02 -0800

 >I am not sure of the sequence of events that led to the connection getting
 >into this state, but it kept exchanging this sequence of packets, even after
 >squid had closed the connection (and even after squid had been killed):
 >
 >peter@fermi[8:27am]/home/squid-125# tcpdump -s 1500 -N 
 >tcpdump: listening on ed0
 >08:28:26.419290 kaos.http > fermi.29485: . ack 2272139504 win 9112 (DF)
 >08:28:26.419459 fermi.29485 > kaos.http: F 0:0(0) ack 4294967090 win 17280 <nop,nop,timestamp 371690 324020,nop,nop,cc 21426> (DF)
 >08:28:26.426922 kaos.http > fermi.29485: . ack 1 win 9112 (DF)
 >08:28:26.427038 fermi.29485 > kaos.http: F 0:0(0) ack 4294967090 win 17280 <nop,nop,timestamp 371690 324020,nop,nop,cc 21426> (DF)
 >08:28:26.435747 kaos.http > fermi.29485: . ack 1 win 9112 (DF)
 >08:28:26.435877 fermi.29485 > kaos.http: F 0:0(0) ack 4294967090 win 17280 <nop,nop,timestamp 371690 324020,nop,nop,cc 21426> (DF)
 >08:28:26.445638 kaos.http > fermi.29485: . ack 1 win 9112 (DF)
 >08:28:26.445750 fermi.29485 > kaos.http: F 0:0(0) ack 4294967090 win 17280 <nop,nop,timestamp 371690 324020,nop,nop,cc 21426> (DF)
 >08:28:26.454399 kaos.http > fermi.29485: . ack 1 win 9112 (DF)
 >08:28:26.454548 fermi.29485 > kaos.http: F 0:0(0) ack 4294967090 win 17280 <nop,nop,timestamp 371690 324020,nop,nop,cc 21426> (DF)
 >[...]
 >
 >I presume this is a half-close; what is meant to happen here?
 >
 >>How-To-Repeat:
 >
 >Unknown, but I hope it doesn't.. :-]
 >
 >>Fix:
 >	
 >It seems that the tcp connection managed to get into a state where both
 >ends were out of sync.  I don't know my tcp states well enough to
 >understand what is meant to happen here, but hanging in FIN_WAIT_1 forever
 >doesn't seem nice.
 
    Looking at the sequence number in fermi's ack (being only 206 away from the
 32-bit boundary), it looks like the sequence number wrapped around and the code
 didn't deal with it correctly. Just a guess, but this might be a new bug
 caused by the change of len from signed to unsigned.
 
 -DG
 
 David Greenman
 Core-team/Principal Architect, The FreeBSD Project
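
The arithmetic behind the "206 away from the boundary" remark, and the kind
of comparison that has to survive a wrap, can be sketched as follows. This is
a minimal illustration of modulo-2^32 sequence comparison in general, not the
kernel's code; the helper names and the "300 bytes later" value are
hypothetical.

```python
MOD = 1 << 32  # TCP sequence numbers are 32 bits wide and wrap modulo 2**32


def seq_diff(a, b):
    """Signed distance from b to a in 32-bit sequence space."""
    d = (a - b) % MOD
    return d - MOD if d >= MOD // 2 else d


def seq_lt(a, b):
    """Wraparound-safe test for 'a comes before b'."""
    return seq_diff(a, b) < 0


ack_seen = 4294967090      # ack value printed for fermi in the trace
to_wrap = MOD - ack_seen   # distance to the 32-bit boundary: 206

# Hypothetical sequence number 300 bytes later, i.e. just past the wrap.
after_wrap = (ack_seen + 300) % MOD

naive_before = ack_seen < after_wrap        # False: plain comparison breaks
safe_before = seq_lt(ack_seen, after_wrap)  # True: modular comparison holds
```

A plain integer comparison reports the pre-wrap value as the later one,
which is the kind of mistake a signed/unsigned mix-up could produce.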

From: Peter Wemm <peter@spinner.DIALix.COM>
To: dg@root.com
Cc: FreeBSD-gnats-submit@FreeBSD.ORG
Subject: Re: kern/1940: TCP doesn't time out of FIN_WAIT_1 and floods 
 packets.
Date: Fri, 01 Nov 1996 18:08:03 +0800

 David Greenman wrote:
 
 > >08:28:26.419290 kaos.http > fermi.29485: . ack 2272139504 win 9112 (DF)
 > >08:28:26.419459 fermi.29485 > kaos.http: F 0:0(0) ack 4294967090 win 17280 <nop,nop,timestamp 371690 324020,nop,nop,cc 21426> (DF)
 [..]
 >   Looking at the sequence number in fermi's ack (being only 206 away from the
 > 32-bit boundary), it looks like the sequence number wrapped around and the code
 > didn't deal with it correctly. Just a guess, but this might be a new bug
 > caused by the change of len from signed to unsigned.
 > 
 > -DG
 
 Hmm, isn't that the other way around?  The remote machine is the one with 
 the large sequence number, we're attempting to send a fin and acking at 
 their (2^32 - 206) sequence number..
 
 Interestingly, we're stuck with 366 un-acked bytes in the queue, but that 
 would be relative to our sequence number of 2272139504.  The remote server 
 is firewalled to the hilt, I can't tell what they are running, except that 
 it's something that runs Netscape's Enterprise server version 2.0a (at a 
 guess, NT).
 
 What bothers me is that we kept on trying and didn't time out, even though 
 we didn't seem to get an acceptable ack to our fin.  It's a little hard to 
 see for sure, since tcpdump had made the seq/acks relative..
 
 Cheers,
 -Peter
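
The caveat about relative numbers can be made concrete. A minimal sketch,
assuming the printed ack is an offset from a per-connection base value rather
than an absolute sequence number: read as a signed 32-bit offset, 4294967090
is -206, that is, 206 bytes behind the base. The base used below is
hypothetical, since the real one does not appear in the trace, and the helper
names are made up for the illustration.

```python
MOD = 1 << 32  # 32-bit sequence space


def as_signed_offset(rel):
    """Interpret a relative seq/ack value as a signed 32-bit offset."""
    rel %= MOD
    return rel - MOD if rel >= MOD // 2 else rel


def to_absolute(base, rel):
    """Recover an absolute value from a base and a relative offset."""
    return (base + rel) % MOD


fermi_ack_rel = 4294967090
offset = as_signed_offset(fermi_ack_rel)  # -206

example_base = 1000000  # hypothetical base, not taken from the trace
absolute = to_absolute(example_base, fermi_ack_rel)  # 206 below the base
```

Running tcpdump with -S prints absolute sequence numbers and avoids the
ambiguity altogether.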
 
 
State-Changed-From-To: open->closed 
State-Changed-By: phk 
State-Changed-When: Sun Apr 26 10:02:04 PDT 1998 
State-Changed-Why:  
timed out 
>Unformatted:
