From pornin@bolet.org  Sun Aug 31 05:39:32 2003
Return-Path: <pornin@bolet.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 8B68816A4BF
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 31 Aug 2003 05:39:32 -0700 (PDT)
Received: from gnah.bolet.org (gnah.bolet.org [80.65.226.87])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 6C02044001
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 31 Aug 2003 05:39:30 -0700 (PDT)
	(envelope-from pornin@bolet.org)
Received: from gnah.bolet.org (localhost [127.0.0.1])
	by gnah.bolet.org (8.12.9/8.12.9) with ESMTP id h7VCdSlJ021919
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 31 Aug 2003 14:39:28 +0200 (CEST)
	(envelope-from pornin@bolet.org)
Received: (from pornin@localhost)
	by gnah.bolet.org (8.12.9/8.12.9/Submit) id h7VCdRDg021918;
	Sun, 31 Aug 2003 14:39:27 +0200 (CEST)
Message-Id: <200308311239.h7VCdRDg021918@gnah.bolet.org>
Date: Sun, 31 Aug 2003 14:39:27 +0200 (CEST)
From: Thomas Pornin <pornin@bolet.org>
Reply-To: Thomas Pornin <pornin@bolet.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: IPsec tunnel (ESP) over IPv6: MTU computation is wrong
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         56233
>Category:       kern
>Synopsis:       IPsec tunnel (ESP) over IPv6: MTU computation is wrong
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-net
>State:          analyzed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Aug 31 05:40:14 PDT 2003
>Closed-Date:    
>Last-Modified:  Tue Jun 15 17:48:15 UTC 2010
>Originator:     Thomas Pornin
>Release:        FreeBSD 4.9-PRERELEASE alpha
>Organization:
>Environment:
System: FreeBSD gnah.bolet.org 4.9-PRERELEASE FreeBSD 4.9-PRERELEASE #0: Sat Aug 30 06:24:21 CEST 2003 root@gnah.bolet.org:/drive/obj/drive/src/sys/GNAH alpha

(problem actually occurs on i386 as well)


>Description:

The maximum ESP header size computation, function esp_hdrsiz(), in
sys/netinet6/esp_output.c, is wrong when used with a 16-byte block
cipher such as Rijndael. It is also wrong with an 8-byte block cipher,
but in the other direction, hence inducing no particular problem except
a possible slight performance hit.


>How-To-Repeat:

My office network and my own home network use the same ISP, which
provides IPv4 and IPv6 connectivity through ADSL links. Both networks
have a router/firewall running FreeBSD (4-STABLE from the end of August
2003).

I set up an IPsec tunnel between the two routers; the setkey
configuration file is simple:

spdadd [home-network-IP]/48 [office-network]/48 any -P out ipsec esp/tunnel/[home-router-IP]-[office-router-IP]/require ;
spdadd [office-network-IP]/48 [home-network]/48 any -P in ipsec esp/tunnel/[office-router-IP]-[home-router-IP]/require ;
add [home-router-IP] [office-router-IP] esp 0x10001 -m tunnel -E rijndael-cbc [symmetric-cipher-key] -A hmac-sha1 [authentication-key] ;
add [office-router-IP] [home-router-IP] esp 0x10002 -m tunnel -E rijndael-cbc [symmetric-cipher-key] -A hmac-sha1 [authentication-key] ;

(This is the configuration file on my home router; on the office router,
a similar configuration file is used.)

If I call N a machine of my home network, H my home router, O the office
router, and T a machine on the office network, I can send IPv6 packets
from N to T and back, and some tcpdumps show that those packets are duly
encrypted and authenticated between H and O.

Problems come when I try to send long packets. Typically, I establish
a TCP connection (through rlogin or ssh) and begin viewing a text with
an editor. The machine T wants to send a quite big packet to N. T has a
local MTU with O of 1500 (ethernet link) but the MTU is smaller between
O and H, so O sends to T an ICMP packet "too big" stating that T should
lower its MTU for this connection to 1407. Which T does. And there is
the true problem: 1407 is too big, the maximum being 1406. T and O enter
a loop where T repeatedly sends its 1407-byte packet, and O repeatedly
rejects the packets and sends ICMPs stating that the MTU is 1407.


Details on the MTU computation:
External connectivity uses PPPoE and implies an MTU of 1492 (this is
important! With an MTU of 1500, the problem is much less of a nuisance).
The external IPv6 header uses 40 bytes. The ESP header uses 8 bytes
before the encrypted part, and 12 after (for the HMAC-SHA1 truncated
to 96 bits, as described in RFC 2404). So there remains 1432 bytes for
the encrypted part. The first 16 bytes are for the CBC initial value,
and there are 1416 bytes for data. However, since Rijndael uses 16-byte
blocks, the encrypted part must have a length multiple of 16. So the
real maximum encrypted data size is 1408 bytes. Since the Pad-Length
and Next-Header fields are mandatory, only 1406 bytes of data are
available.

Hence, the packets T sends to N through O and H must not exceed 1406
bytes, including their own IPv6 header.
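
The arithmetic above can be sketched as a small Python function. This
is only an illustration of the computation described in this report:
the function name and its parameters are hypothetical and are not taken
from the kernel source. It assumes CBC mode (IV of one cipher block),
an 8-byte ESP header and a 40-byte outer IPv6 header without extension
headers.

```python
def max_inner_packet(link_mtu, block_size, auth_len,
                     outer_hdr=40, esp_hdr=8):
    """Largest inner packet that fits in one ESP tunnel packet.

    Hypothetical helper: follows the computation in the text,
    not the kernel's esp_hdrsiz().
    """
    # Bytes left after the outer IPv6 header, the ESP header
    # (SPI + sequence number) and the trailing authentication data.
    room = link_mtu - outer_hdr - esp_hdr - auth_len
    # The CBC initial value takes one cipher block.
    room -= block_size
    # The encrypted part must be a multiple of the block size.
    room -= room % block_size
    # Pad-Length and Next-Header take one byte each.
    return room - 2


# PPPoE link, HMAC-SHA1 truncated to 96 bits (12 bytes).
print(max_inner_packet(1492, 16, 12))  # Rijndael, 16-byte blocks
print(max_inner_packet(1492, 8, 12))   # 8-byte block cipher
```

With a link MTU of 1492 this gives 1406 for a 16-byte block cipher and
1422 for an 8-byte block cipher.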

>Fix:

The immediate work-around is to use an 8-byte block cipher such as
Blowfish. With such a block size, the MTU becomes 1422. In my setting, O
sends ICMP packets requesting an MTU of 1415, which is wrong again, but
in the safe direction: T sends packets shorter than needed, but data
flows.

A quick fix in the source code would be to change the code of
esp_hdrsiz() in sys/netinet6/esp_output.c, lines 139 and 153. This
function uses an estimate along the following lines:

ESP header length + IV length + 9 + Authlen

where the ESP header length is 8 bytes, the IV length is equal to the
cipher block length, and Authlen is the authentication algorithm
output length, here 12 for HMAC-SHA-1-96. The "9" is described in
a comment on lines 149 and 150: it is the maximum padding length,
including the Pad-Length and Next-Header fields. This value is
correct for 8-byte block ciphers such as Blowfish and 3DES, but
should be "17" for 16-byte block ciphers.

So a quick fix would be to replace the "9" values in both lines 139
and 153 by "17".
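
As a rough check of the estimate formula against the exact computation,
the following Python sketch compares both for a link MTU of 1492. The
function names are hypothetical. Subtracting the 40-byte outer IPv6
header from the link MTU in addition to the estimate is an assumption
about how the result of esp_hdrsiz() is used; it is chosen because it
reproduces the advertised values of 1407 and 1415 observed above.

```python
OUTER_IPV6_HDR = 40  # outer IPv6 header, no extension headers
ESP_HDR = 8          # SPI + sequence number


def estimated_mtu(link_mtu, block_size, auth_len, max_pad):
    """MTU obtained from: ESP header + IV + max_pad + Authlen.

    Assumption: the outer IPv6 header is subtracted as well.
    """
    overhead = ESP_HDR + block_size + max_pad + auth_len
    return link_mtu - OUTER_IPV6_HDR - overhead


def exact_mtu(link_mtu, block_size, auth_len):
    """Largest inner packet that really fits (CBC, IV of one block)."""
    room = link_mtu - OUTER_IPV6_HDR - ESP_HDR - auth_len - block_size
    # Round down to a block multiple, then remove the
    # Pad-Length and Next-Header bytes.
    return room - room % block_size - 2


for block_size, max_pad in ((16, 9), (16, 17), (8, 9)):
    est = estimated_mtu(1492, block_size, 12, max_pad)
    real = exact_mtu(1492, block_size, 12)
    verdict = "ok" if est <= real else "TOO BIG"
    print(block_size, max_pad, est, real, verdict)
```

For 16-byte blocks, a constant of 9 yields 1407 (above the real limit of
1406), while 17 yields 1399, which is safe. For 8-byte blocks, 9 yields
1415, below the real limit of 1422.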

A complete fix would require a more exact computation of the header
length, but it depends on the outer MTU and the cipher block length,
and I don't know to which extent that data is available at that
place in the kernel code.
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->bms 
Responsible-Changed-By: bms 
Responsible-Changed-When: Wed Jun 16 09:12:31 GMT 2004 
Responsible-Changed-Why:  
I'll take this 

http://www.freebsd.org/cgi/query-pr.cgi?pr=56233 
State-Changed-From-To: open->feedback 
State-Changed-By: bms 
State-Changed-When: Wed Jun 16 09:12:45 GMT 2004 
State-Changed-Why:  
May have been fixed by recent commits in this area, can you test them? 

http://www.freebsd.org/cgi/query-pr.cgi?pr=56233 

From: Thomas Pornin <pornin@bolet.org>
To: Bruce M Simpson <bms@FreeBSD.org>
Cc:  
Subject: Re: kern/56233: IPsec tunnel (ESP) over IPv6: MTU computation is wrong
Date: Wed, 16 Jun 2004 12:35:42 +0200

 On Wed, Jun 16, 2004 at 09:13:11AM +0000, Bruce M Simpson wrote:
 > May have been fixed by recent commits in this area, can you test them?
 
 Not immediately: one of the machines involved is a very small Alpha and
 it takes three days to recompile the world.
 
 I now begin the update procedure; I may be able to test things next
 Sunday, or at some point next week (if the update is successful, that
 is).
 
 
 	--Thomas
 

From: Thomas Pornin <pornin@bolet.org>
To: Bruce M Simpson <bms@FreeBSD.org>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/56233: IPsec tunnel (ESP) over IPv6: MTU computation is wrong
Date: Wed, 30 Jun 2004 12:13:59 +0200

 On Wed, Jun 16, 2004 at 09:13:11AM +0000, Bruce M Simpson wrote:
 > Synopsis: IPsec tunnel (ESP) over IPv6: MTU computation is wrong
 > 
 > State-Changed-From-To: open->feedback
 > State-Changed-By: bms
 > State-Changed-When: Wed Jun 16 09:12:45 GMT 2004
 > State-Changed-Why: 
 > May have been fixed by recent commits in this area, can you test them?
 > 
 > http://www.freebsd.org/cgi/query-pr.cgi?pr=56233
 
 I have finally been able to test it. The Alpha machine took 42 hours
 to recompile the world, and I had to do it twice because at some point
 the compile stopped, apparently due to some bug in the filesystem code
 (all processes entering a specific directory blocked forever).
 
 In brief: I am sorry to report that the problem is still there.
 
 
 With details: the two routers involved are, respectively:
 1. a PC, running -STABLE from Jun 16 10:59:00 GMT 2004
 2. an Alpha, running -STABLE from Jun 20 09:58:00 GMT 2004
 
 An IPv6 ESP tunnel is established between those two routers, using
 Rijndael as symmetric encryption cipher (with a 128-bit key) and
 HMAC/SHA-1 for integrity check. That tunnel is established over a link
 which provides a MTU of 1492 (ADSL link with PPPoE at both ends).
 In this configuration, it can be computed that encapsulated traffic
 (packets which go through the tunnel) can be at most 1406 bytes long
 (accounting for the outer IPv6 header, including ESP/AH headers).
 
 I am transferring a big file with rcp from a third machine (a PC, which
 uses router 1) to the Alpha (router 2). The machine 3 is connected to
 router 1 through a standard ethernet link with MTU 1500. When the file
 is sent (after initial rcp work), the machine 3 first attempts to send
 a 1500-byte packet to router 1. The packet is too big to go through
 the tunnel. As is mandated by IPv6, router 1 does _not_ fragment the
 packet, but instead reports the problem to machine 3 with an ICMPv6
 packet "too big". That packet should contain the maximum MTU of 1406 (or
 some slightly lower number). It contains 1407, which is wrong; machine
 3 then sends a 1407-byte packet, which is rejected with the same ICMPv6
 packet, and this process loops forever.
 
 
 I am willing to make other tests if needed, but:
 -- my Alpha machine is really slow to recompile things;
 -- I will be on vacation for the next three weeks.
 
 I think that the problem could be exhibited with a simpler setup
 (ethernet LAN) by arbitrarily reducing the MTU of some of the interfaces
 ("ifconfig xl0 mtu 1492"). I have not tried, nor will I for the next
 three weeks.
 
 
 	--Thomas Pornin
Responsible-Changed-From-To: bms->freebsd-net 
Responsible-Changed-By: bms 
Responsible-Changed-When: Sat Sep 23 16:28:40 UTC 2006 
Responsible-Changed-Why:  
I must focus on more specific areas. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=56233 
Responsible-Changed-From-To: freebsd-net->gnn 
Responsible-Changed-By: bms 
Responsible-Changed-When: Sun Sep 24 08:57:37 UTC 2006 
Responsible-Changed-Why:  
by request 

http://www.freebsd.org/cgi/query-pr.cgi?pr=56233 
State-Changed-From-To: feedback->analyzed 
State-Changed-By: linimon 
State-Changed-When: Sat Mar 1 20:12:50 UTC 2008 
State-Changed-Why:  
Feedback was received, stating that the problem still exists. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=56233 
Responsible-Changed-From-To: gnn->freebsd-net 
Responsible-Changed-By: gnn 
Responsible-Changed-When: Tue Jun 15 17:47:41 UTC 2010 
Responsible-Changed-Why:  
I'm not working on IPSec at the moment, handing this one back. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=56233 
>Unformatted:
