From jdc@koitsu.dyndns.org  Tue Nov 24 19:18:50 2009
Return-Path: <jdc@koitsu.dyndns.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A5720106566B
	for <freebsd-gnats-submit@freebsd.org>; Tue, 24 Nov 2009 19:18:50 +0000 (UTC)
	(envelope-from jdc@koitsu.dyndns.org)
Received: from QMTA01.emeryville.ca.mail.comcast.net (qmta01.emeryville.ca.mail.comcast.net [76.96.30.16])
	by mx1.freebsd.org (Postfix) with ESMTP id 8F5168FC0A
	for <freebsd-gnats-submit@freebsd.org>; Tue, 24 Nov 2009 19:18:50 +0000 (UTC)
Received: from OMTA18.emeryville.ca.mail.comcast.net ([76.96.30.74])
	by QMTA01.emeryville.ca.mail.comcast.net with comcast
	id 96ah1d0071bwxycA174usq; Tue, 24 Nov 2009 19:04:54 +0000
Received: from koitsu.dyndns.org ([98.248.46.159])
	by OMTA18.emeryville.ca.mail.comcast.net with comcast
	id 97E41d00F3S48mS8e7E4Av; Tue, 24 Nov 2009 19:14:05 +0000
Received: by icarus.home.lan (Postfix, from userid 1000)
	id 3A0A81E3035; Tue, 24 Nov 2009 11:05:39 -0800 (PST)
Message-Id: <20091124190539.3A0A81E3035@icarus.home.lan>
Date: Tue, 24 Nov 2009 11:05:39 -0800 (PST)
From: Jeremy Chadwick <freebsd@jdc.parodius.com>
Reply-To: Jeremy Chadwick <freebsd@jdc.parodius.com>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: libfetch: fetchParseURL(3) returns success with invalid URLs
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         140835
>Category:       kern
>Synopsis:       [libfetch] fetchParseURL(3) returns success with invalid URLs
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    des
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Nov 24 19:20:02 UTC 2009
>Closed-Date:    
>Last-Modified:  Fri Sep  9 04:40:04 UTC 2011
>Originator:     Jeremy Chadwick
>Release:        FreeBSD 8.0-PRERELEASE amd64
>Organization:
>Environment:
System: FreeBSD icarus.home.lan 8.0-PRERELEASE FreeBSD 8.0-PRERELEASE #0: Tue Nov 17 20:07:21 PST 2009 root@icarus.home.lan:/usr/obj/usr/src/sys/X7SBA_RELENG_8_amd64 amd64
>Description:
libfetch contains a function, fetchParseURL(3), whose man page
states the following:

     fetchParseURL() takes a URL in the form of a null-terminated string and
     splits it into its components function according to the Common Internet
     Scheme Syntax detailed in RFC1738.  A regular expression which produces
     this syntax is:

         <scheme>:(//(<user>(:<pwd>)?@)?<host>(:<port>)?)?/(<document>)?

     If the URL does not seem to begin with a scheme name, the following syn-
     tax is assumed:

         ((<user>(:<pwd>)?@)?<host>(:<port>)?)?/(<document>)?

     Note that some components of the URL are not necessarily relevant to all
     URL schemes.  For instance, the file scheme only needs the <scheme> and
     <document> components.

     .....

     fetchParseURL() returns a pointer to a struct url containing the individ-
     ual components of the URL.  If it is unable to allocate memory, or the
     URL is syntactically incorrect, fetchParseURL() returns a NULL pointer.

But when passed a URL such as the one below (note that the delimiter is
colon-slash, not colon-slash-slash):

	http:/www.somesite.com/

fetchParseURL(3) returns a pointer to a struct with the following
data:

	url->scheme = http
	url->user   = <null>
	url->pwd    = <null>
	url->host   = <null>
	url->port   = 0
	url->doc    = /www.somesite.com/

Given the documentation, fetchParseURL(3) should return NULL in this
scenario. It was able to work out the scheme by itself, which implies
that the documentation's paragraph on RFC 1738 compliance should
apply strictly.
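
The behaviour above can be illustrated with a minimal, self-contained
sketch. It is not the libfetch source: split_scheme(), SCHEME_MAX,
check_examples() and the strict flag are hypothetical names introduced
here, and the logic only assumes that the parser looks for ":/" and
treats a single slash as the start of the document, as the output above
suggests. With strict set, a single slash is accepted only for the file
scheme, which is one possible way to reject the URL in question.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SCHEME_MAX 16	/* assumed buffer size, for this sketch only */

/*
 * Hypothetical helper, not part of libfetch.  Copies the scheme name
 * into `scheme` and returns a pointer to the rest of the URL, or NULL
 * if the URL is rejected.  With `strict` set, "<scheme>:/<stuff>" is
 * accepted only when the scheme is "file".
 */
static const char *
split_scheme(const char *url, char *scheme, size_t len, int strict,
    int *has_host)
{
	const char *p;

	scheme[0] = '\0';
	*has_host = 0;
	if ((p = strstr(url, ":/")) == NULL)
		return (url);		/* no scheme name found */
	snprintf(scheme, len, "%.*s", (int)(p - url), url);
	p++;				/* skip the colon */
	if (p[1] == '/') {
		*has_host = 1;		/* two slashes: host follows */
		return (p + 2);
	}
	if (strict && strcmp(scheme, "file") != 0)
		return (NULL);		/* one slash, not "file": reject */
	return (p);			/* one slash: document starts here */
}

/* Exercises the lenient and strict modes on the URLs discussed above. */
static void
check_examples(void)
{
	char scheme[SCHEME_MAX];
	const char *rest;
	int has_host;

	/* Lenient mode: scheme is found, no host, slash stays in the doc. */
	rest = split_scheme("http:/www.somesite.com/", scheme,
	    sizeof(scheme), 0, &has_host);
	assert(rest != NULL && strcmp(rest, "/www.somesite.com/") == 0);
	assert(strcmp(scheme, "http") == 0 && !has_host);

	/* Strict mode: the same URL is rejected. */
	rest = split_scheme("http:/www.somesite.com/", scheme,
	    sizeof(scheme), 1, &has_host);
	assert(rest == NULL);

	/* Two slashes: the host part follows in either mode. */
	rest = split_scheme("http://www.somesite.com/", scheme,
	    sizeof(scheme), 1, &has_host);
	assert(rest != NULL && strcmp(rest, "www.somesite.com/") == 0);
	assert(has_host);
}
```

Calling check_examples() from a test program runs the three cases; none
of the assertions should fire if the sketch behaves as described.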

This issue came to light on freebsd-stable:

http://lists.freebsd.org/pipermail/freebsd-stable/2009-November/052969.html
http://lists.freebsd.org/pipermail/freebsd-stable/2009-November/052971.html
http://lists.freebsd.org/pipermail/freebsd-stable/2009-November/052972.html
http://lists.freebsd.org/pipermail/freebsd-stable/2009-November/052973.html
>How-To-Repeat:
$ fetch http:/www.somesite.com/
fetch: http:/www.somesite.com/: No address record
$ fetch http:/localhost/
fetch: http:/localhost/: No address record
>Fix:
None known.

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->des 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Wed Feb 23 22:30:37 UTC 2011 
Responsible-Changed-Why:  
des, is this still your territory? 

http://www.freebsd.org/cgi/query-pr.cgi?pr=140835 

From: Mark <markjdb@gmail.com>
To: bug-followup@FreeBSD.org, freebsd@jdc.parodius.com
Cc:  
Subject: Re: kern/140835: [libfetch] fetchParseURL(3) returns success with
 invalid URLs
Date: Fri, 9 Sep 2011 00:30:01 -0400

 --bcaec5485fca73cb1f04ac7aa18b
 Content-Type: text/plain; charset=ISO-8859-1
 
 It looks like this behaviour is intentional: fetchParseURL() seems to
 treat <scheme>:/<stuff> as shorthand for <scheme>://localhost/<stuff>,
 so that one can write, for example, "fetch file:/home/mark/foo". It
 turns out that Firefox and curl do this too, but only for the "file"
 scheme. Since this special syntax doesn't seem to be mentioned anywhere
 in RFC 1738, I've made a patch which changes fetchParseURL() to accept
 this syntax for only the file scheme (and return NULL otherwise). The
 patch also fixes a couple of style bugs.
 
 I'm not sure if the "proper" fix for this is to drop support for that
 syntax completely, since it's not in the RFC and it's not documented
 anywhere in fetch/libfetch. Then again, maybe having a
 strictly-conforming RFC 1738 implementation isn't important - fetch(1)
 already does things like treat "ftp.freebsd.org" as
 "ftp://ftp.freebsd.org" and so on. Any thoughts?
 
 -Mark
 
 --bcaec5485fca73cb1f04ac7aa18b
 Content-Type: text/plain; charset=US-ASCII; name="libfetch_URLparse2.patch.txt"
 Content-Disposition: attachment; filename="libfetch_URLparse2.patch.txt"
 Content-Transfer-Encoding: 7bit
 X-Attachment-Id: f_gscnywe70
 
 diff --git a/lib/libfetch/fetch.c b/lib/libfetch/fetch.c
 index 5044fe3..48e194f 100644
 --- a/lib/libfetch/fetch.c
 +++ b/lib/libfetch/fetch.c
 @@ -309,15 +309,16 @@ fetchParseURL(const char *URL)
  
  	/* scheme name */
  	if ((p = strstr(URL, ":/"))) {
 -		snprintf(u->scheme, URL_SCHEMELEN+1,
 +		snprintf(u->scheme, URL_SCHEMELEN + 1,
  		    "%.*s", (int)(p - URL), URL);
  		URL = ++p;
  		/*
 -		 * Only one slash: no host, leave slash as part of document
 -		 * Two slashes: host follows, strip slashes
 +		 * Treat "file:/<file>" as shorthand for "file:///<file>".
  		 */
  		if (URL[1] == '/')
  			URL = (p += 2);
 +		else if (strcmp(u->scheme, "file") != 0)
 +			goto ouch;
  	} else {
  		p = URL;
  	}
 diff --git a/usr.bin/fetch/fetch.c b/usr.bin/fetch/fetch.c
 index 7553bd8..1c51de9 100644
 --- a/usr.bin/fetch/fetch.c
 +++ b/usr.bin/fetch/fetch.c
 @@ -342,11 +342,11 @@ fetch(char *URL, const char *path)
  	/* parse URL */
  	url = NULL;
  	if (*URL == '\0') {
 -		warnx("empty URL");
 +		warnx("Empty URL");
  		goto failure;
  	}
  	if ((url = fetchParseURL(URL)) == NULL) {
 -		warnx("%s: parse error", URL);
 +		warnx("%s: Invalid URL", URL);
  		goto failure;
  	}
  
 --bcaec5485fca73cb1f04ac7aa18b--
>Unformatted:
