From oleg@lath.rinet.ru  Wed Jan  7 07:26:17 2004
Return-Path: <oleg@lath.rinet.ru>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 11D5C16A4D0; Wed,  7 Jan 2004 07:26:17 -0800 (PST)
Received: from lath.rinet.ru (lath.rinet.ru [195.54.192.90])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id A0DD343D2D; Wed,  7 Jan 2004 07:26:13 -0800 (PST)
	(envelope-from oleg@lath.rinet.ru)
Received: from lath.rinet.ru (localhost [127.0.0.1])
	by lath.rinet.ru (8.12.9p2/8.12.9) with ESMTP id i07FQB3O023444
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 7 Jan 2004 18:26:11 +0300 (MSK)
	(envelope-from oleg@lath.rinet.ru)
Received: (from oleg@localhost)
	by lath.rinet.ru (8.12.9p2/8.12.9/Submit) id i07FQB7S023443;
	Wed, 7 Jan 2004 18:26:11 +0300 (MSK)
	(envelope-from oleg)
Message-Id: <200401071526.i07FQB7S023443@lath.rinet.ru>
Date: Wed, 7 Jan 2004 18:26:11 +0300 (MSK)
From: Oleg Bulyzhin <oleg@rinet.ru>
Reply-To: Oleg Bulyzhin <oleg@rinet.ru>
To: FreeBSD-gnats-submit@freebsd.org
Cc: gshapiro@freebsd.org
Subject: [PATCH] wrong tokenization of unstructured data
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         61019
>Category:       bin
>Synopsis:       [PATCH] wrong tokenization of unstructured data
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    gshapiro
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Jan 07 07:30:21 PST 2004
>Closed-Date:    Wed Feb 18 19:26:57 PST 2004
>Last-Modified:  Wed Feb 18 19:26:57 PST 2004
>Originator:     Oleg Bulyzhin
>Release:        FreeBSD 4.9-RELEASE-p1 i386
>Organization:
Cronyx Plus LLC
>Environment:
System: FreeBSD lath.rinet.ru 4.9-RELEASE-p1 FreeBSD 4.9-RELEASE-p1 #1: Thu Dec 11 14:25:00 MSK 2003 root@lath.rinet.ru:/lh/obj/lh/src/sys/lath i386

	All sendmail versions are affected (8.12.* 8.11.* 8.9.*)
	
>Description:
	Sendmail uses the prescan() function for data tokenization. This
	function performs some implicit checks and conversions (such as
	checks for unbalanced parentheses, angle brackets, etc.).
	When prescan() is used to tokenize 'unstructured' data (mail headers,
	for example), the global variable SuprErrs is set to 'true' and all
	those error messages are suppressed, but the 'syntax enforcing' still
	takes place (stripping of an unbalanced '>', for example).

	Because of this prescan() behaviour, certain symbols are 'invisible'
	to sendmail. This can lead to incorrect mail filtering (and maybe
	other ugly things).

	
>How-To-Repeat:
	Add the following to sendmail.cf:

	Ksyslog syslog
	HSubject: $>+log_subject
	Slog_subject
	R$*		$: $(syslog "Subject: " $1 $)

	Restart sendmail, then run the following:

	root@lath# echo | mail -s '-->bug<--' postmaster@localhost
	root@lath# grep "Subject:" /var/log/maillog
	Jan  7 17:59:19 lath sm-mta[23337]: i07ExJ3O023337: Subject: --bug<-->
	root@lath#

	The subject '-->bug<--' was converted to '--bug<-->':
	the first '>' symbol was unbalanced, so prescan() stripped it. Then
	prescan() found the unbalanced '<' and added an extra '>' symbol.
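
	The transformation above can be reproduced with a small toy model.
	This is only a sketch of the angle-bracket handling as described in
	this report, not sendmail's prescan(); toy_scan() and its 'fixup'
	parameter are hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model of the angle-bracket handling described above.
 * This is NOT sendmail's prescan(); toy_scan() and "fixup" are
 * hypothetical names used for illustration only.
 *
 * Copies src to dst (at most dstsize - 1 bytes plus the NUL).
 * With fixup == true, an unbalanced '>' is dropped and one '>' is
 * appended for every '<' still open at the end of the input.
 * With fixup == false, the input is copied unchanged.
 * Returns the number of bytes stored, excluding the NUL.
 */
size_t
toy_scan(const char *src, char *dst, size_t dstsize, bool fixup)
{
	size_t n = 0;
	int anglecnt = 0;

	if (dstsize == 0)
		return 0;

	for (; *src != '\0' && n + 1 < dstsize; src++)
	{
		char c = *src;

		if (c == '<')
			anglecnt++;
		else if (c == '>')
		{
			if (anglecnt > 0)
				anglecnt--;
			else if (fixup)
				continue;	/* strip the unbalanced '>' */
		}
		dst[n++] = c;
	}

	/* close every '<' that is still open */
	while (fixup && anglecnt-- > 0 && n + 1 < dstsize)
		dst[n++] = '>';

	dst[n] = '\0';
	return n;
}
```

	Called on '-->bug<--' with fixup set to true, this model yields
	'--bug<-->', the string seen in the log above; with fixup set to
	false the subject is copied through unchanged.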

	
>Fix:
	Well, to my mind there is a design flaw: there should be two
	different functions, one for tokenization only and another for syntax
	checks. My sendmail knowledge is not deep enough, though, so maybe
	I'm wrong.
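
	As a rough sketch of that separation (an assumption about how it
	could look, not code from sendmail), the syntax check could live in
	a helper that only reports problems and never modifies the data.
	check_angles() is a hypothetical name.

```c
#include <stddef.h>

/*
 * Hypothetical check-only helper: counts unbalanced angle brackets
 * without modifying the input. Not part of sendmail.
 *
 * *stray_close receives the number of '>' with no open '<' before it;
 * *unclosed_open receives the number of '<' that are never closed.
 * Either pointer may be NULL if the caller does not need the count.
 * Returns 0 if the brackets are balanced, -1 otherwise.
 */
int
check_angles(const char *s, int *stray_close, int *unclosed_open)
{
	int open = 0;
	int stray = 0;

	for (; *s != '\0'; s++)
	{
		if (*s == '<')
			open++;
		else if (*s == '>')
		{
			if (open > 0)
				open--;
			else
				stray++;
		}
	}

	if (stray_close != NULL)
		*stray_close = stray;
	if (unclosed_open != NULL)
		*unclosed_open = open;
	return (stray == 0 && open == 0) ? 0 : -1;
}
```

	A caller parsing addresses could reject or repair the input based
	on the result, while a caller handling unstructured headers could
	skip the check and pass the tokens through untouched.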

	Anyway, here is a small patch (it does not alter the rest of the
	sendmail sources) for sendmail 8.12.9p2:

--- parseaddr.c.orig	Thu Sep 25 08:53:37 2003
+++ parseaddr.c	Wed Dec 31 17:49:47 2003
@@ -721,6 +721,8 @@
 			c = (*p++) & 0x00ff;
 			if (c == '\0')
 			{
+				if (SuprErrs) break;
+
 				/* diagnose and patch up bad syntax */
 				if (state == QST)
 				{
@@ -748,7 +750,7 @@
 					break;
 
 				/* special case for better error management */
-				if (delim == ',' && !route_syntax)
+				if (delim == ',' && !route_syntax && !SuprErrs)
 				{
 					usrerr("553 Unbalanced '<'");
 					c = '>';
@@ -824,7 +826,7 @@
 				if (anglecnt <= 0)
 				{
 					usrerr("553 Unbalanced '>'");
-					c = NOCHAR;
+					if (!SuprErrs) c = NOCHAR;
 				}
 				else
 					anglecnt--;


	


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->gshapiro 
Responsible-Changed-By: ceri 
Responsible-Changed-When: Wed Jan 7 11:35:14 PST 2004 
Responsible-Changed-Why:  
Over to Mr. sendmail. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=61019 
State-Changed-From-To: open->feedback 
State-Changed-By: gshapiro 
State-Changed-When: Mon Feb 16 12:06:01 PST 2004 
State-Changed-Why:  
This isn't a change we can introduce in 8.12 as it changes the   
behavior.  However, I can probably get it added to 8.13.  Please 
test the patch below (which will probably apply against 8.12 as 
well) to see if it meets your needs.  It turns off address syntax 
checking in header checks if $>+ is used (as your example does). 

diff -ur sendmailx/envelope.c sendmail/envelope.c 
--- sendmailx/envelope.c	Mon Feb 16 11:42:36 2004 
+++ sendmail/envelope.c	Mon Feb 16 11:55:16 2004 
@@ -1131,7 +1131,7 @@ 
**	links in the net. 
*/ 

-	pvp = prescan(from, delimchar, pvpbuf, sizeof pvpbuf, NULL, NULL); 
+	pvp = prescan(from, delimchar, pvpbuf, sizeof pvpbuf, NULL, NULL, true); 
if (pvp == NULL) 
{ 
/* don't need to give error -- prescan did that already */ 
diff -ur sendmailx/headers.c sendmail/headers.c 
--- sendmailx/headers.c	Mon Feb 16 11:42:36 2004 
+++ sendmail/headers.c	Mon Feb 16 11:55:16 2004 
@@ -1804,7 +1804,7 @@ 
char pvpbuf[PSBUFSIZE]; 

res = prescan(p, oldstyle ? ' ' : ',', pvpbuf, 
-				      sizeof pvpbuf, &oldp, NULL); 
+				      sizeof pvpbuf, &oldp, NULL, true); 
p = oldp; 
#if _FFR_IGNORE_BOGUS_ADDR 
/* ignore addresses that can't be parsed */ 
diff -ur sendmailx/main.c sendmail/main.c 
--- sendmailx/main.c	Mon Feb 16 11:42:36 2004 
+++ sendmail/main.c	Mon Feb 16 11:45:26 2004 
@@ -4508,8 +4508,8 @@ 
register char **pvp; 
char pvpbuf[PSBUFSIZE]; 

-		pvp = prescan(++p, ',', pvpbuf, sizeof pvpbuf, 
-			      &delimptr, ConfigLevel >= 9 ? TokTypeNoC : NULL); 
+		pvp = prescan(++p, ',', pvpbuf, sizeof pvpbuf, &delimptr, 
+			      ConfigLevel >= 9 ? TokTypeNoC : NULL, true); 
if (pvp == NULL) 
continue; 
p = q; 
diff -ur sendmailx/mime.c sendmail/mime.c 
--- sendmailx/mime.c	Mon Feb 16 11:42:36 2004 
+++ sendmail/mime.c	Mon Feb 16 11:55:16 2004 
@@ -132,7 +132,7 @@ 
p = hvalue("Content-Transfer-Encoding", header); 
if (p == NULL || 
(pvp = prescan(p, '\0', pvpbuf, sizeof pvpbuf, NULL,
-			   MimeTokenTab)) == NULL || 
+			   MimeTokenTab, true)) == NULL || 
pvp[0] == NULL) 
{ 
cte = NULL; 
@@ -154,7 +154,7 @@ 
} 
if (p != NULL && 
(pvp = prescan(p, '\0', pvpbuf, sizeof pvpbuf, NULL,
-			   MimeTokenTab)) != NULL && 
+			   MimeTokenTab, true)) != NULL && 
pvp[0] != NULL) 
{ 
if (tTd(43, 40)) 
@@ -985,7 +985,7 @@ 
p = hvalue("Content-Transfer-Encoding", header); 
if (p == NULL || 
(pvp = prescan(p, '\0', pvpbuf, sizeof pvpbuf, NULL,
-			   MimeTokenTab)) == NULL || 
+			   MimeTokenTab, true)) == NULL || 
pvp[0] == NULL) 
{ 
/* "can't happen" -- upper level should have caught this */ 
diff -ur sendmailx/parseaddr.c sendmail/parseaddr.c 
--- sendmailx/parseaddr.c	Mon Feb 16 11:42:36 2004 
+++ sendmail/parseaddr.c	Mon Feb 16 11:58:46 2004 
@@ -90,7 +90,7 @@ 
if (delimptr == NULL) 
delimptr = &delimptrbuf; 

-	pvp = prescan(addr, delim, pvpbuf, sizeof pvpbuf, delimptr, NULL); 
+	pvp = prescan(addr, delim, pvpbuf, sizeof pvpbuf, delimptr, NULL, true); 
if (pvp == NULL) 
{ 
if (tTd(20, 1)) 
@@ -460,6 +460,7 @@ 
**			terminating delimiter. 
**		toktab -- if set, a token table to use for parsing. 
**			If NULL, use the default table. 
+**		fixup -- if true, fixup unbalanced addresses 
** 
**	Returns: 
**		A pointer to a vector of tokens. 
@@ -611,13 +612,14 @@ 
#define NOCHAR		(-1)	/* signal nothing in lookahead token */ 

char ** 
-prescan(addr, delim, pvpbuf, pvpbsize, delimptr, toktab) 
+prescan(addr, delim, pvpbuf, pvpbsize, delimptr, toktab, fixup) 
char *addr; 
int delim; 
char pvpbuf[]; 
int pvpbsize; 
char **delimptr; 
unsigned char *toktab; 
+	bool fixup; 
{ 
register char *p; 
register char *q; 
@@ -721,7 +723,9 @@ 
if (c == '\0')
{ 
/* diagnose and patch up bad syntax */ 
-				if (state == QST) 
+				if (!fixup) 
+					break; 
+				else if (state == QST) 
{ 
usrerr("553 Unbalanced '\"'");
c = '"'; 
@@ -747,7 +751,7 @@ 
break; 

/* special case for better error management */ 
-				if (delim == ',' && !route_syntax) 
+				if (delim == ',' && !route_syntax && fixup) 
{ 
usrerr("553 Unbalanced '<'"); 
c = '>'; 
@@ -799,7 +803,8 @@ 
if (cmntcnt <= 0) 
{ 
usrerr("553 Unbalanced ')'"); 
-					c = NOCHAR; 
+					if (fixup) 
+						c = NOCHAR; 
} 
else 
cmntcnt--; 
@@ -823,7 +828,8 @@ 
if (anglecnt <= 0) 
{ 
usrerr("553 Unbalanced '>'"); 
-					c = NOCHAR; 
+					if (fixup) 
+						c = NOCHAR; 
} 
else 
anglecnt--; 
@@ -1358,7 +1364,7 @@ 
/* scan the new replacement */ 
xpvp = prescan(mval, '\0', pvpbuf,
sizeof pvpbuf, NULL, 
-						       NULL); 
+						       NULL, true); 
if (xpvp == NULL) 
{ 
/* prescan pre-printed error */ 
@@ -1530,7 +1536,7 @@ 
{ 
/* scan the new replacement */ 
xpvp = prescan(replac, '\0', pvpbuf,
-					       sizeof pvpbuf, NULL, NULL); 
+					       sizeof pvpbuf, NULL, NULL, true); 
if (xpvp == NULL) 
{ 
/* prescan already printed error */ 
@@ -2559,7 +2565,7 @@ 
**	domain will be appended. 
*/

-	pvp = prescan(name, '\0', pvpbuf, sizeof pvpbuf, NULL, NULL);
+	pvp = prescan(name, '\0', pvpbuf, sizeof pvpbuf, NULL, NULL, true);
if (pvp == NULL) 
return name; 
if (REWRITE(pvp, 3, e) == EX_TEMPFAIL) 
@@ -2686,7 +2692,7 @@ 
sm_dprintf("maplocaluser: "); 
printaddr(sm_debug_file(), a, false);
}
-	pvp = prescan(a->q_user, '\0', pvpbuf, sizeof pvpbuf, NULL, NULL);
+	pvp = prescan(a->q_user, '\0', pvpbuf, sizeof pvpbuf, NULL, NULL, true);
if (pvp == NULL) 
{ 
if (tTd(29, 9)) 
@@ -2985,7 +2991,8 @@ 
SuprErrs = true; 
QuickAbort = false; 
pvp = prescan(buf, '\0', pvpbuf, sizeof pvpbuf, NULL,
-			      bitset(RSF_RMCOMM, flags) ? NULL : TokTypeNoC); 
+			      bitset(RSF_RMCOMM, flags) ? NULL : TokTypeNoC, 
+			      bitset(RSF_RMCOMM, flags) ? true : false); 
SuprErrs = saveSuprErrs; 
if (pvp == NULL) 
{ 
@@ -3192,7 +3199,7 @@ 
{ 
SuprErrs = true; 
QuickAbort = false;
-		*pvp = prescan(buf, '\0', pvpbuf, size, NULL, NULL);
+		*pvp = prescan(buf, '\0', pvpbuf, size, NULL, NULL, true);
if (*pvp != NULL) 
rstat = rewrite(*pvp, rsno, 0, e, size); 
else 
diff -ur sendmailx/readcf.c sendmail/readcf.c 
--- sendmailx/readcf.c	Mon Feb 16 11:42:36 2004 
+++ sendmail/readcf.c	Mon Feb 16 11:55:16 2004 
@@ -206,7 +206,8 @@ 
expand(&bp[1], exbuf, sizeof exbuf, e);
rwp->r_lhs = prescan(exbuf, '\t', pvpbuf,
sizeof pvpbuf, NULL, 
-					     ConfigLevel >= 9 ? TokTypeNoC : NULL); 
+					     ConfigLevel >= 9 ? TokTypeNoC : NULL, 
+					     true); 
nfuzzy = 0; 
if (rwp->r_lhs != NULL) 
{ 
@@ -293,7 +294,8 @@ 
expand(q, exbuf, sizeof exbuf, e);
rwp->r_rhs = prescan(exbuf, '\t', pvpbuf,
sizeof pvpbuf, NULL, 
-					     ConfigLevel >= 9 ? TokTypeNoC : NULL); 
+					     ConfigLevel >= 9 ? TokTypeNoC : NULL, 
+					     true); 
if (rwp->r_rhs != NULL) 
{ 
register char **ap; 
diff -ur sendmailx/sendmail.h sendmail/sendmail.h 
--- sendmailx/sendmail.h	Mon Feb 16 11:42:36 2004 
+++ sendmail/sendmail.h	Mon Feb 16 11:55:16 2004 
@@ -404,7 +404,7 @@ 
extern bool	invalidaddr __P((char *, char *, bool)); 
extern ADDRESS	*parseaddr __P((char *, ADDRESS *, int, int, char **, 
ENVELOPE *, bool)); 
-extern char	**prescan __P((char *, int, char[], int, char **, unsigned char *)); 
+extern char	**prescan __P((char *, int, char[], int, char **, unsigned char *, bool)); 
extern void	printaddr __P((SM_FILE_T *, ADDRESS *, bool)); 
extern ADDRESS	*recipient __P((ADDRESS *, ADDRESS **, int, ENVELOPE *)); 
extern char	*remotename __P((char *, MAILER *, int, int *, ENVELOPE *)); 



From: Oleg Bulyzhin <oleg@rinet.ru>
To: Gregory Neil Shapiro <gshapiro@FreeBSD.org>
Cc: freebsd-gnats-submit@FreeBSD.org, Oleg Bulyzhin <oleg@rinet.ru>
Subject: Re: bin/61019: [PATCH] wrong tokenization of unstructured data
Date: Wed, 18 Feb 2004 18:43:54 +0300 (MSK)

 On Mon, 16 Feb 2004, Gregory Neil Shapiro wrote:
 
 > Synopsis: [PATCH] wrong tokenization of unstructured data
 >
 > State-Changed-From-To: open->feedback
 > State-Changed-By: gshapiro
 > State-Changed-When: Mon Feb 16 12:06:01 PST 2004
 > State-Changed-Why:
 > This isn't a change we can introduce in 8.12 as it changes the
 > behavior.  However, I can probably get it added to 8.13.  Please
 > test the patch below (which will probably apply against 8.12 as
 > well) to see if it meets your needs.  It turns off address syntax
 > checking in header checks if $>+ is used (as your example does).
 
 Thanks. The patch works fine (it can be applied to 8.12.9p2; only the
 sendmail.h hunk failed, and I patched that by hand).
 
 But I have one more question: because readcf() uses prescan() with syntax
 checks enabled, it's not possible to use an RHS or LHS with unbalanced '>'
 or '<' symbols. Is that a bug or a feature?
 
 -- 
 Oleg.
 
 ================================================================
 === Oleg Bulyzhin -- OBUL-RIPN -- OBUL-RIPE -- oleg@rinet.ru ===
 ================================================================
 
State-Changed-From-To: feedback->closed 
State-Changed-By: gshapiro 
State-Changed-When: Wed Feb 18 19:26:05 PST 2004 
State-Changed-Why:  
I made the change so the rulesets will work as well in 8.13. 

I'm going to close this bug as it isn't really a FreeBSD bug and 
will be fixed when 8.13.0 is released and imported into FreeBSD. 

Thanks for the fix! 

>Unformatted:
