From jr@opal.com  Thu Jul 13 15:11:28 2006
Return-Path: <jr@opal.com>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 0465E16A4DE
	for <FreeBSD-gnats-submit@freebsd.org>; Thu, 13 Jul 2006 15:11:28 +0000 (UTC)
	(envelope-from jr@opal.com)
Received: from smtp.vzavenue.net (smtp.vzavenue.net [66.171.59.140])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 1606743DBE
	for <FreeBSD-gnats-submit@freebsd.org>; Thu, 13 Jul 2006 15:11:26 +0000 (GMT)
	(envelope-from jr@opal.com)
Received: from 118.79.171.66.subscriber.vzavenue.net (HELO linwhf.opal.com) ([66.171.79.118])
  by smtp.vzavenue.net with ESMTP; 13 Jul 2006 11:11:24 -0400
Received: from linwhf.opal.com (localhost [127.0.0.1])
	by linwhf.opal.com (8.13.6/8.13.6) with ESMTP id k6DFBMqq093904
	for <FreeBSD-gnats-submit@freebsd.org>; Thu, 13 Jul 2006 11:11:22 -0400 (EDT)
	(envelope-from jr@opal.com)
Received: from 127.0.0.1 ([127.0.0.1] helo=linwhf.opal.com) by ASSP-nospam;
	13 Jul 2006 11:11:22 -0400
Received: (from jr@localhost)
	by linwhf.opal.com (8.13.6/8.13.6/Submit) id k6DFBMQM093903;
	Thu, 13 Jul 2006 11:11:22 -0400 (EDT)
	(envelope-from jr)
Message-Id: <200607131511.k6DFBMQM093903@linwhf.opal.com>
Date: Thu, 13 Jul 2006 11:11:22 -0400 (EDT)
From: "J.R. Oldroyd" <fbsd@opal.com>
Reply-To: "J.R. Oldroyd" <fbsd@opal.com>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: fix to hexdump(1)/od(1) following "UTF-8 zero-width character patch"
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         100215
>Category:       bin
>Synopsis:       [patch] fix to hexdump(1)/od(1) following "UTF-8 zero-width character patch"
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    jkoshy
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Thu Jul 13 15:20:12 GMT 2006
>Closed-Date:    Mon Jul 31 14:19:11 GMT 2006
>Last-Modified:  Mon Jul 31 14:19:11 GMT 2006
>Originator:     J.R. Oldroyd
>Release:        FreeBSD 6.1-STABLE i386
>Organization:
>Environment:
System: FreeBSD linwhf.opal.com 6.1-STABLE FreeBSD 6.1-STABLE #1: Thu May 18 16:03:24 EDT 2006 xxx@linwhf.opal.com:/usr/obj/usr/src/sys/LINWHF i386

>Description:
This patch fixes a crash in hexdump(1)/od(1) that appears once the
"UTF-8 zero-width character patch" has been applied:
	http://www.freebsd.org/cgi/query-pr.cgi?pr=misc/100212

hexdump's conv.c asserts that the width of a character is > 0.  With
that patch applied this no longer always holds, because some
characters are actually zero width.
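
To make the failing condition concrete, the following is a minimal C
sketch, not code taken from conv.c.  The helper names are hypothetical.
The value wcwidth() reports for a given character depends on the
current locale and on the libc's width tables, so the predicate takes
the width as a parameter instead of assuming a particular result.

```c
#define _XOPEN_SOURCE 700	/* exposes wcwidth() on some systems */
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Non-aborting form of the condition that conv.c asserts. */
static int
old_check_holds(int width)
{
	return (width > 0);
}

/* Mirrors the assertion itself; aborts when width is 0 or negative. */
static void
original_assertion(int width)
{
	assert(width > 0);
}

/*
 * Print what wcwidth() reports for wc in the current locale and
 * return that value.  The result is locale dependent, so callers
 * would normally run setlocale(LC_ALL, "") first.
 */
static int
report_width(wchar_t wc)
{
	int width = wcwidth(wc);

	printf("U+%04lX: wcwidth = %d, old check %s\n",
	    (unsigned long)wc, width,
	    old_check_holds(width) ? "holds" : "fails");
	return (width);
}
```

Calling original_assertion() with a width of 0 aborts the program,
which is the core dump described in this report.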

>How-To-Repeat:
Apply the "UTF-8 zero-width character patch", misc/100212, then run
od -c on a file containing zero-width characters, such as the
utf8demo.txt file mentioned in that patch.

The assertion will fail and od will dump core.
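
If utf8demo.txt is not at hand, a small test file can be generated
instead.  The sketch below is only an illustration: the function name
and the choice of character are assumptions, not part of this PR.  It
writes "e" followed by U+0301 COMBINING ACUTE ACCENT (UTF-8 bytes
CC 81) and a newline.  Whether a given libc reports width 0 for that
character depends on its width tables.

```c
#include <assert.h>
#include <stdio.h>

/* "e", then U+0301 COMBINING ACUTE ACCENT as UTF-8, then a newline. */
static const char sample[] = "e\xCC\x81\n";

/*
 * Write the sample bytes to path.
 * Returns 0 on success and -1 on any I/O error.
 */
static int
write_sample(const char *path)
{
	FILE *fp;
	size_t len = sizeof(sample) - 1;	/* drop the trailing NUL */

	assert(path != NULL);
	fp = fopen(path, "wb");
	if (fp == NULL)
		return (-1);
	if (fwrite(sample, 1, len, fp) != len) {
		(void)fclose(fp);
		return (-1);
	}
	return (fclose(fp) == 0 ? 0 : -1);
}
```

Running od -c on the resulting file in a UTF-8 locale exercises the
code path containing the assertion.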
>Fix:
--- /usr/src/usr.bin/hexdump/conv.orig	Fri Jul 16 07:07:07 2004
+++ /usr/src/usr.bin/hexdump/conv.c	Tue Jun 27 15:20:51 2006
@@ -134,7 +134,10 @@
 			*pr->cchar = 'C';
 			assert(strcmp(pr->fmt, "%3C") == 0);
 			width = wcwidth(wc);
-			assert(width > 0);
+			if (width == 0) {
+				(void)printf(" ");
+				width = 1;
+			}
 			pad = 3 - width;
 			if (pad < 0)
 				pad = 0;


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->jkoshy 
Responsible-Changed-By: jkoshy 
Responsible-Changed-When: Sun Jul 16 11:37:34 UTC 2006 
Responsible-Changed-Why:  
Take this PR. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=100215 

From: "J.R. Oldroyd" <fbsd@opal.com>
To: Joseph Koshy <jkoshy@FreeBSD.ORG>
Cc: freebsd-gnats-submit@FreeBSD.ORG
Subject: Re: bin/100215: [patch] fix to hexdump(1)/od(1) following "UTF-8 zero-width character patch"
Date: Fri, 28 Jul 2006 09:10:08 -0400

 wcwidth() can return -1 for non-printable characters, and a change
 to misc/100212 will make use of this, so the patch posted here
 should be changed to:
 
 --- conv.orig   Fri Jul 16 07:07:07 2004
 +++ conv.c      Tue Jun 27 15:20:51 2006
 @@ -134,7 +134,10 @@
                         *pr->cchar = 'C';
                         assert(strcmp(pr->fmt, "%3C") == 0);
                         width = wcwidth(wc);
 -                       assert(width > 0);
 +                       if (width <= 0) {
 +                               (void)printf(" ");
 +                               width = 1;
 +                       }
                         pad = 3 - width;
                         if (pad < 0)
                                 pad = 0;

From: jkoshy@FreeBSD.ORG (Joseph Koshy)
To: "J.R. Oldroyd" <fbsd@opal.com>
Cc: Joseph Koshy <jkoshy@FreeBSD.ORG>, freebsd-gnats-submit@FreeBSD.ORG,
Subject: Re: bin/100215: [patch] fix to hexdump(1)/od(1) following "UTF-8 
 zero-width character patch"
Date: Mon, 31 Jul 2006 09:09:47 +0000 (UTC)

 From a reading of the code, the following more minimal change
 appears to suffice, as `printf(3)` is defined to use spaces for
 padding.
 
 Index: conv.c
 ===================================================================
 RCS file: /cvs/FreeBSD/src/usr.bin/hexdump/conv.c,v
 retrieving revision 1.8
 diff -u -r1.8 conv.c
 --- conv.c	16 Jul 2004 11:07:07 -0000	1.8
 +++ conv.c	31 Jul 2006 08:27:27 -0000
 @@ -134,7 +134,7 @@
  			*pr->cchar = 'C';
  			assert(strcmp(pr->fmt, "%3C") == 0);
  			width = wcwidth(wc);
 -			assert(width > 0);
 +			assert(width >= 0);
  			pad = 3 - width;
  			if (pad < 0)
  				pad = 0;
 
 
 The `wcwidth(wc)` invocation is in a code path that is taken only
 if `iswprint(wc)` is true, so the assert() can check for `width >= 0`.
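
The padding arithmetic can be sketched in isolation as follows.  This
is an illustration under assumptions, not the conv.c code: the helper
names are hypothetical, and it assumes the cell is emitted as `pad`
spaces followed by the character, which the diff context does not
show.  The glyph is passed as a byte string to keep the sketch
independent of the locale.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same arithmetic as the diff context: pad = 3 - width, clamped at 0. */
static int
pad_for_width(int width)
{
	int pad;

	assert(width >= 0);	/* the relaxed check proposed above */
	pad = 3 - width;
	if (pad < 0)
		pad = 0;
	return (pad);
}

/*
 * Format one output cell into buf: pad spaces, then the glyph bytes.
 * "%*s" applied to an empty string emits exactly pad spaces, so a
 * zero-width glyph still fills a three-column cell.
 * Returns the number of bytes snprintf() would have written.
 */
static int
format_cell(char *buf, size_t size, int width, const char *glyph)
{
	return (snprintf(buf, size, "%*s%s", pad_for_width(width), "",
	    glyph));
}
```

With a width of 0 the pad is 3, so no separate printf(" ") call is
needed to keep the columns aligned.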
 
 Could you please review?
 
 Regards,
 Koshy
 <jkoshy@freebsd.org>
State-Changed-From-To: open->closed 
State-Changed-By: jkoshy 
State-Changed-When: Mon Jul 31 14:17:28 UTC 2006 
State-Changed-Why:  
Fixed in rev 1.9 of "src/usr.bin/hexdump/conv.c". 

Thank you for contributing. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=100215 
>Unformatted:
