From nobody@FreeBSD.org  Sun Aug 11 10:33:14 2002
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 1046F37B400
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 11 Aug 2002 10:33:14 -0700 (PDT)
Received: from www.freebsd.org (www.FreeBSD.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id CAAE443E65
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 11 Aug 2002 10:33:13 -0700 (PDT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.12.4/8.12.4) with ESMTP id g7BHXCOT061466
	for <freebsd-gnats-submit@FreeBSD.org>; Sun, 11 Aug 2002 10:33:12 -0700 (PDT)
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.12.4/8.12.4/Submit) id g7BHXCRb061465;
	Sun, 11 Aug 2002 10:33:12 -0700 (PDT)
Message-Id: <200208111733.g7BHXCRb061465@www.freebsd.org>
Date: Sun, 11 Aug 2002 10:33:12 -0700 (PDT)
From: "G.P. de Boer" <g.p.de.boer@st.hanze.nl>
To: freebsd-gnats-submit@FreeBSD.org
Subject: TCP timers' sysctl's overflow
X-Send-Pr-Version: www-1.0

>Number:         41552
>Category:       kern
>Synopsis:       TCP timers' sysctl's overflow
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    jdp
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Aug 11 10:40:01 PDT 2002
>Closed-Date:    Fri Aug 23 18:11:44 PDT 2002
>Last-Modified:  Fri Aug 23 18:11:44 PDT 2002
>Originator:     G.P. de Boer
>Release:        4.6.1-RELEASE-p10
>Organization:
none
>Environment:
FreeBSD stranraer 4.6.1-RELEASE-p10 FreeBSD 4.6.1-RELEASE-p10 #4: Sun Aug 11 16:06:11 CEST 2002     root@stranraer:/usr/obj/usr/src/sys/KERNEL-12-06-2002  i386     
>Description:
When setting sysctls such as net.inet.tcp.keepidle on a system with a clock-tick frequency (HZ) above 1000 Hz, an integer overflow is triggered, resulting in TCP timeouts that are at best inaccurate and sometimes negative. Keep-alive may then not work as expected, or not at all, which could be exploited to DoS a host.
>How-To-Repeat:
root@stranraer:~$ sysctl -w net.inet.tcp.keepidle=7200000
net.inet.tcp.keepidle: 720000 -> 757549

This happens on systems with a clock-tick frequency (HZ) above 1000 Hz.
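
The arithmetic behind this output can be sketched in a few lines of C. This is a model, not the kernel source: it assumes the conversion between milliseconds and ticks is done in 32-bit signed integers, and it assumes HZ=2000, a value the report does not state but which reproduces the 757549 shown above. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a milliseconds -> ticks conversion carried out in
 * 32-bit arithmetic. The multiply is done unsigned so that the wraparound
 * is well defined in C; the result is then reinterpreted as a signed
 * 32-bit value, which on common two's-complement compilers (such as those
 * for i386) gives the same bits a wrapped signed product would have.
 */
static int32_t
msec_to_ticks_32(int32_t msec, int32_t hz)
{
	assert(hz > 0);
	uint32_t product = (uint32_t)msec * (uint32_t)hz;	/* wraps mod 2^32 */
	return (int32_t)product / 1000;
}

/* Same model for the reverse direction, used when the value is read back. */
static int32_t
ticks_to_msec_32(int32_t ticks, int32_t hz)
{
	assert(hz > 0);
	uint32_t product = (uint32_t)ticks * 1000u;		/* wraps mod 2^32 */
	return (int32_t)product / hz;
}
```

Under these assumptions, msec_to_ticks_32(7200000, 2000) wraps to 1515098 ticks, which reads back through ticks_to_msec_32() as 757549 ms. With hz = 1500 the wrapped product is negative, which corresponds to the negative timeouts mentioned in the description.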
>Fix:
As already merged into RELENG_4: http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_timer.c?r1=1.34.2.12
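
A common remedy for this class of overflow is to widen the intermediate product to 64 bits before dividing. The sketch below illustrates that idea only; it is not a copy of the referenced revision, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical overflow-safe conversions. The product is computed in
 * 64 bits, so msec * hz cannot wrap; only the final quotient is narrowed
 * back to 32 bits. The caller must still ensure the quotient itself fits
 * in an int32_t.
 */
static int32_t
msec_to_ticks_64(int32_t msec, int32_t hz)
{
	assert(hz > 0);
	return (int32_t)((int64_t)msec * hz / 1000);
}

static int32_t
ticks_to_msec_64(int32_t ticks, int32_t hz)
{
	assert(hz > 0);
	return (int32_t)((int64_t)ticks * 1000 / hz);
}
```

With hz = 2000, msec_to_ticks_64(7200000, 2000) yields 14400000 ticks, and converting that back returns the original 7200000 ms.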

   
>Release-Note:
>Audit-Trail:

From: "G.P. de Boer" <g.p.de.boer@st.hanze.nl>
To: freebsd-gnats-submit@FreeBSD.org, g.p.de.boer@st.hanze.nl
Cc:  
Subject: Re: kern/41552: TCP timers' sysctl's overflow
Date: Sun, 11 Aug 2002 20:41:36 +0200

 Just a little follow-up to raise another issue. I had a growing number of
 TCP connections sitting idle in the LAST_ACK state. They didn't time out. I
 found somebody who had this problem on 4.2 and applied his patch to the 4.6.1
 source. That solved the issue.
 
 I looked at the TCP sources a bit, but since it's not really the easiest
 protocol on earth I couldn't find out whether there was already some kind of
 timeout for LAST_ACK.
 My question: does the problem with 'net.inet.tcp.keepidle' have the side
 effect that connections in the LAST_ACK state never time out, or is there
 another issue? I can hardly believe there's no timeout for LAST_ACK
 anywhere, but I'm just curious.
 
 Here's a link to the original post about the LAST_ACK problem on 4.2:
 http://archives.neohapsis.com/archives/freebsd/2001-03/0363.html
 
 With regards,
 G.P. de Boer
 

From: Garrett Wollman <wollman@lcs.mit.edu>
To: "G.P. de Boer" <g.p.de.boer@st.hanze.nl>
Cc: freebsd-gnats-submit@FreeBSD.ORG
Subject: kern/41552: TCP timers' sysctl's overflow
Date: Mon, 12 Aug 2002 14:40:57 -0400 (EDT)

 <<On Sun, 11 Aug 2002 10:33:12 -0700 (PDT), "G.P. de Boer" <g.p.de.boer@st.hanze.nl> said:
 
 > When setting syscontrols like net.inet.tcp.keepidle on a system with
 > clocktick-granularity above 1000 Hz, there's an overflow triggered,
 > resulting in at least inaccurate, but sometimes negative TCP
 > timeouts.
 
 1 kHz timers are just barely within spec for TCP (using the 32-bit
 fields in RFC 1323).
 
 -GAWollman
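
To put rough numbers on the 32-bit remark: if the timestamp clock advances once per clock tick (an assumption here), the time until the sign bit of a 32-bit timestamp wraps shrinks in proportion to the tick rate. The helper below is a hypothetical illustration of that arithmetic, not code from any TCP stack.

```c
#include <assert.h>

/*
 * Hypothetical helper: days until bit 31 of a 32-bit counter flips when
 * the counter advances hz times per second.
 * 2^31 ticks / hz = seconds; 86400 seconds = one day.
 */
static double
ts_sign_wrap_days(double hz)
{
	assert(hz > 0.0);
	return 2147483648.0 / hz / 86400.0;
}
```

At 1000 ticks per second this gives a little under 25 days; doubling the rate to 2000 halves it to about 12.4 days.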
 

From: "G.P. de Boer" <g.p.de.boer@st.hanze.nl>
To: Garrett Wollman <wollman@lcs.mit.edu>
Cc: freebsd-gnats-submit@FreeBSD.ORG
Subject: kern/41552: TCP timers' sysctl's overflow
Date: Mon, 12 Aug 2002 21:21:33 +0200

 At 20:40 12-8-2002, you wrote:
 
 > > When setting syscontrols like net.inet.tcp.keepidle on a system with
 > > clocktick-granularity above 1000 Hz, there's an overflow triggered,
 > > resulting in at least inaccurate, but sometimes negative TCP
 > > timeouts.
 >
 >1 kHz timers are just barely within spec for TCP (using the 32-bit
 >fields in RFC 1323).
 
 Well, LINT says 1000 Hz is advisable for dummynet use, and for polling
 1000 or even 2000 Hz is advised. If this is a problem with RFC 1323,
 which strikes me as odd, then there's more to this problem than meets
 the eye. A setting in LINT shouldn't break anything as fundamental as
 TCP.
 
 Anyway, it's an integer overflow and it breaks stuff in nasty ways. It's
 possible to DoS a host with malfunctioning keep-alives: I already had
 more than 400 hanging connections (in LAST_ACK state) within a few days
 on a moderately loaded server. The fix is there already; I just think it
 should be in -RELEASE too.
 
 With regards,
 Pieter
 

From: Garrett Wollman <wollman@lcs.mit.edu>
To: "G.P. de Boer" <g.p.de.boer@st.hanze.nl>
Cc: freebsd-gnats-submit@FreeBSD.ORG
Subject: kern/41552: TCP timers' sysctl's overflow
Date: Mon, 12 Aug 2002 15:45:22 -0400 (EDT)

 <<On Mon, 12 Aug 2002 21:21:33 +0200, "G.P. de Boer" <g.p.de.boer@st.hanze.nl> said:
 
 > Well.. since LINT says 1000Hz is advisable for dummynet use. For
 > polling 1000 or even 2000Hz is advised. IF this is a problem with
 > RFC1323, which strikes me as odd, then there's more to this problem
 > than meets the eye. A setting in LINT shouldn't break anything so
 > fundamental as TCP.
 
 RFC 1323 specifies that the timestamp clock is to have a period
 between 1 ms and 1 s.  (See page 21, third paragraph from the bottom.)
 The timestamp clock in FreeBSD is the system variable `ticks', which
 is incremented once for every clock interrupt, so its period is
 approximately 1/HZ s.  In order to support higher clock frequencies, a
 scaling factor would need to be introduced.
 
 -GAWollman
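
The scaling factor mentioned above could look roughly like the following. This is a minimal sketch under assumptions: the helper name is hypothetical, the target period of 1 ms is chosen because it is the fastest period in the range quoted from the RFC, and a real implementation would also have to cope with the tick counter itself wrapping, which this sketch ignores.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: derive a timestamp value in milliseconds from a
 * tick counter that advances hz times per second. The product is formed
 * in 64 bits so that ticks * 1000 cannot overflow.
 */
static uint32_t
ticks_to_ts_ms(uint32_t ticks, uint32_t hz)
{
	assert(hz > 0);
	return (uint32_t)((uint64_t)ticks * 1000u / hz);
}
```

With hz = 2000, two ticks advance the timestamp by one, so the timestamp period stays at 1 ms however high hz is set. With hz = 100, each tick advances it by ten, giving an effective period of 10 ms.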
 

From: "G.P. de Boer" <g.p.de.boer@st.hanze.nl>
To: Bruce Evans <bde@zeta.org.au>
Cc: freebsd-gnats-submit@FreeBSD.ORG
Subject: kern/41552: TCP timers' sysctl's overflow
Date: Tue, 13 Aug 2002 00:05:12 +0200

 At 23:43 12-8-2002, Bruce Evans wrote:
 
 > >  Anyway.. it's an integer overflow and it breaks stuff in nasty ways. It's
 > >  possible to DoS a host with malfunctioning keep-alives: I already had
 > >  more than 400 hanging connections (in LAST_ACK state) in a few days
 > >  on a moderately loaded server. The fix is there already, I just think it
 > >  should be in -RELEASE too.
 >
 >The overflow was fixed by jdp a couple of weeks ago in -current and
 >RELENG_4.  It is not fixed in any of the security branches.  Do you
 >want it there?  I think the "fix" for most security bugs caused by
 >unusual options is to not use unusual options.
 
 Of course, unless you haven't got better things to do, which is not ever
 the case.
 
 Now the question arises whether setting HZ -is- unusual. I can imagine that
 there are many admins around who turned on polling for extra
 performance/robustness and tuned option HZ because LINT says so.
 
 As a non-corporate user I can't tell how many people actually did that, but
 to me it sounds logical to use polling on heavily loaded networking servers,
 which comes with increasing the number of clock interrupts per second. On
 such servers a bug like this is even more dangerous than on my simple
 cable-modem gateway, given that these systems handle many
 connections.
 
 In conclusion: in my opinion this should be fixed soon, in the security
 branches. But if you think/know there aren't many people having trouble
 with this, because setting HZ isn't very usual, we'll just have to patch it
 ourselves if need be, or wait for 4.7 :)
 
 -- Pieter
 
 
 

From: Bruce Evans <bde@zeta.org.au>
To: "G.P. de Boer" <g.p.de.boer@st.hanze.nl>
Cc: freebsd-gnats-submit@FreeBSD.ORG
Subject: Re: kern/41552: TCP timers' sysctl's overflow
Date: Tue, 13 Aug 2002 07:43:26 +1000 (EST)

 On Mon, 12 Aug 2002, G.P. de Boer wrote:
 
 [Garrett Wollman wrote]
 >  > > When setting syscontrols like net.inet.tcp.keepidle on a system with
 >  > > clocktick-granularity above 1000 Hz, there's an overflow triggered,
 >  > > resulting in at least inaccurate, but sometimes negative TCP
 >  > > timeouts.
 >  >
 >  >1 kHz timers are just barely within spec for TCP (using the 32-bit
 >  >fields in RFC 1323).
 
 Um, that is for the TCP timers.  I think these have nothing to do with
 HZ except that setting HZ to a large value breaks the scaling for them.
 
 >  Anyway.. it's an integer overflow and it breaks stuff in nasty ways. It's
 >  possible to DoS a host with malfunctioning keep-alives: I already had
 >  more than 400 hanging connections (in LAST_ACK state) in a few days
 >  on a moderately loaded server. The fix is there already, I just think it
 >  should be in -RELEASE too.
 
 The overflow was fixed by jdp a couple of weeks ago in -current and
 RELENG_4.  It is not fixed in any of the security branches.  Do you
 want it there?  I think the "fix" for most security bugs caused by
 unusual options is to not use unusual options.
 
 Bruce
 

From: Bruce Evans <bde@zeta.org.au>
To: "G.P. de Boer" <g.p.de.boer@st.hanze.nl>
Cc: freebsd-gnats-submit@FreeBSD.ORG
Subject: Re: kern/41552: TCP timers' sysctl's overflow
Date: Tue, 13 Aug 2002 22:20:22 +1000 (EST)

 On Tue, 13 Aug 2002, G.P. de Boer wrote:
 
 > At 23:43 12-8-2002, Bruce Evans wrote:
 >
 > > >  Anyway.. it's an integer overflow and it breaks stuff in nasty ways. It's
 > > >  possible to DoS a host with malfunctioning keep-alives: I already had
 > > >  more than 400 hanging connections (in LAST_ACK state) in a few days
 > > >  on a moderately loaded server. The fix is there already, I just think it
 > > >  should be in -RELEASE too.
 > >
 > >The overflow was fixed by jdp a couple of weeks ago in -current and
 > >RELENG_4.  It is not fixed in any of the security branches.  Do you
 > >want it there?  I think the "fix" for most security bugs caused by
 > >unusual options is to not use unusual options.
 >
 > Of course, unless you haven't got better things to do, which is not ever
 > the case.
 >
 > Now the question pops up if setting HZ -is- unusual. I can imagine that
 > there are many admins around who turned on polling for extra
 > performance/robustness and tuned option HZ because LINT says so.
 
 Garrett clarified that setting hz to more than 1000 breaks more than the
 TCP timer sysctls.  It violates an RFC.  So such setting should be very
 unusual.  But hasn't hz = 1024 been the normal setting on alphas for many
 years now?  As I understand it, you can't set hz using the HZ option on
 alphas -- the boot firmware decides the clock interrupt frequency and
 FreeBSD only scales this for stathz, not for hz (which is sort of backwards
 since a large value for stathz is more useful than a large value of hz
 except on systems that do too much polling).
 
 > As a non-corporate user I can't tell how much people actually did that, but
 > to me it sounds logical to use polling on heavily loaded networking servers,
 > which comes with increasing the number of clock-interrupts per second. On
 > such servers a bug like this is even more dangerous than on my simple
 > cable-modeming gateway, granted that these systems handle many
 > connections.
 
 I hope using polling is not usual.  I think polling is only best in unusual
 configurations.  "Heavily loaded networking servers" might qualify.  I have
 no experience with them, but I have a lot of experience making (non-network)
 interrupt handlers efficient and have never found interrupt overhead per
 sec to be the main bottleneck in carefully written drivers.
 
 Bruce
 

From: John Polstra <jdp@polstra.com>
To: bug-followup@freebsd.org
Cc: bde@zeta.org.au
Subject: Re: kern/41552: TCP timers' sysctl's overflow
Date: Tue, 13 Aug 2002 09:24:19 -0700 (PDT)

 In article <200208131220.g7DCK5eQ076224@freefall.freebsd.org>,
 Bruce Evans  <bde@zeta.org.au> wrote:
 >  On Tue, 13 Aug 2002, G.P. de Boer wrote:
 >  > Now the question pops up if setting HZ -is- unusual. I can imagine that
 >  > there are many admins around who turned on polling for extra
 >  > performance/robustness and tuned option HZ because LINT says so.
 >  
 >  Garrett clarified that setting hz to more than 1000 breaks more than the
 >  TCP timer sysctls.  It violates an RFC.
 
 I don't think that's a valid argument.  Our hz value is an
 implementation detail which shouldn't have anything to do with RFCs.
 The nugget of truth buried in what Garrett said is that increasing hz
 tickles a _bug_ in our TCP implementation, which in turn causes the
 RFC to be violated.  (The bug is the direct use of hz rather than the
 use of a scaled version of it.)  The proper solution is to fix that
 bug, not to restrict the value of hz artificially.
 
 In my opinion, 100 ticks per second is a ridiculously low value for
 hz on modern systems.  Even a PII/400 can run at hz=10000 without
 significant overhead.  (I.e., the overhead can hardly be measured.)
 There are plenty of reasonable applications that utterly rely on
 elevated hz values.  Dummynet is just one example.
 
 Note, I don't think the fix referenced in this PR should be merged
 into the security branches anyway, since it is not security related.
 
 John

From: serkoon <serkoon@thedarkside.nl>
To: freebsd-gnats-submit@FreeBSD.org, g.p.de.boer@st.hanze.nl
Cc:  
Subject: kern/41552: TCP timers' sysctl's overflow
Date: Thu, 15 Aug 2002 22:53:18 +0200

  >Note, I don't think the fix referenced in this PR should be merged
  >into the security branches anyway, since it is not security related.
 
 IMO a bug which makes a host vulnerable to a DoS attack by using up
 all available sockets/file descriptors -is- a security bug. I guess you'll
 agree on that.
 
 Then why don't you feel that way on this particular occasion? Is it that
 there just aren't many people around with HZ set at 1000 or up, so this
 bug, although it may be a security bug, isn't that important because
 there are many higher-priority things to fix?
 
 Pieter
 

From: John Polstra <jdp@polstra.com>
To: serkoon@thedarkside.nl
Cc: bug-followup@freebsd.org
Subject: Re: kern/41552: TCP timers' sysctl's overflow
Date: Fri, 16 Aug 2002 12:21:58 -0700 (PDT)

 In article <200208152100.g7FL04jL011288@freefall.freebsd.org>,
 serkoon  <serkoon@thedarkside.nl> wrote:
 >   >Note, I don't think the fix referenced in this PR should be merged
 >   >into the security branches anyway, since it is not security related.
 >  
 >  Imo a bug which makes a host vulnerable to a DoS-attack by using up
 >  all available sockets/filedescriptors -is- a security-bug. I guess you'll
 >  agree on that.
 
 Yes, but this one only happens when you use a rather unusual kernel
 configuration.  You could set NMBCLUSTERS to 5, and that would open up
 a DoS attack too.  But I don't think FreeBSD's urgent-security-fixes
 branch should address either of those potential problems.
 
 >  Then, why don't you feel that way in this particular occasion? Is it that
 >  there just aren't many people around with HZ set at 1000 or up, so this
 >  bug, although it may be a security-bug, isn't that important because
 >  there are many higher prioritized things to fix?
 
 It's not a matter of priorities.  It's just that the purpose of the
 security branches is to achieve maximum stability by including only
 the most essential security-related fixes.  The more stuff you put
 into those branches, the less stable they will become.  We have seen
 that in real life in the -stable branches, and in fact that is the
 reason the security branches were created in the first place.
 
 In this case I believe you should either maintain the patch locally
 until 4.7 comes out (October 1), or else follow the -stable branch
 rather than the security branch.
 
 John
State-Changed-From-To: open->patched 
State-Changed-By: njl 
State-Changed-When: Fri Aug 23 17:00:53 PDT 2002 
State-Changed-Why:  
Patch was MFCed. 


Responsible-Changed-From-To: freebsd-bugs->jdp 
Responsible-Changed-By: njl 
Responsible-Changed-When: Fri Aug 23 17:00:53 PDT 2002 
Responsible-Changed-Why:  
jdp can close this if he wants 

http://www.freebsd.org/cgi/query-pr.cgi?pr=41552 
State-Changed-From-To: patched->closed 
State-Changed-By: njl 
State-Changed-When: Fri Aug 23 18:10:36 PDT 2002 
State-Changed-Why:  
Patch was merged into 4-STABLE and will be in a future release.  It will NOT 
be merged into the RELENG_4_6 security branch. 

>Unformatted:
