From cejkar@dcse.fee.vutbr.cz  Mon Dec 13 05:01:13 1999
Return-Path: <cejkar@fit.vutbr.cz>
Received: from boco.fee.vutbr.cz (boco.fee.vutbr.cz [147.229.9.11])
	by hub.freebsd.org (Postfix) with ESMTP id 0D31B15324
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 13 Dec 1999 05:01:07 -0800 (PST)
	(envelope-from cejkar@dcse.fee.vutbr.cz)
Received: from kazi.dcse.fee.vutbr.cz (kazi.dcse.fee.vutbr.cz [147.229.8.12])
	by boco.fee.vutbr.cz (8.9.3/8.9.3) with ESMTP id OAA40445
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 13 Dec 1999 14:01:03 +0100 (CET)
Received: (from cejkar@localhost)
	by kazi.dcse.fee.vutbr.cz (8.9.3/8.9.3) id OAA28722;
	Mon, 13 Dec 1999 14:01:02 +0100 (CET)
Message-Id: <199912131301.OAA28722@kazi.dcse.fee.vutbr.cz>
Date: Mon, 13 Dec 1999 14:01:02 +0100 (CET)
From: cejkar@fit.vutbr.cz
Reply-To: cejkar@fit.vutbr.cz
To: FreeBSD-gnats-submit@freebsd.org
Subject: sort(1) doesn't sort correctly in some cases
X-Send-Pr-Version: 3.2

>Number:         15458
>Category:       bin
>Synopsis:       sort(1) doesn't sort correctly in some cases
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    gabor
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Dec 13 05:10:02 PST 1999
>Closed-Date:    Sat Mar 17 21:07:59 GMT 2007
>Last-Modified:  Sat Mar 17 21:07:59 GMT 2007
>Originator:     Rudolf Cejka
>Release:        FreeBSD 4.0-CURRENT i386
>Organization:
Brno University of Technology, FEE&CS, Czech Republic
>Environment:

It may not be relevant, but this is 4.0-CURRENT as of Dec 10 1999.

>Description:

sort(1) orders some strings incorrectly in some locales. In cs_CZ.ISO_8859-2
(which will be committed shortly; a similar problem may show up with es_ES)
the collation definition contains:

	(H,h);\
	(CH,Ch,ch);\
	(I,i);\

So sort(1) should order "h" and "ch" as "h", "ch". Instead it puts them
in the wrong order: "ch", "h". If I sort, for example, "ha" and "ch",
the result is correct: "ha", "ch".

The problem is in an "optimization": before strcoll() is called, both
strings are cut down to the length of the shorter one, so only those
prefixes are compared. This cannot work for languages in which a
collating element can be longer than one character.

>How-To-Repeat:
>Fix:

Here is my patch for /usr/src/gnu/usr.bin/sort/sort.c:

--- sort.c.orig	Mon Dec 13 13:21:26 1999
+++ sort.c	Mon Dec 13 13:25:33 1999
@@ -208,6 +208,7 @@
 	return strcoll(s[0], s[1]);
 }
 
+#if 0
 static int
 collcmp(char *a, char *b, int mini)
 {
@@ -223,6 +224,7 @@
 	b[mini] = sb;
 	return r;
 }
+#endif
 #endif /* __FreeBSD__ */
 
 static void
@@ -1153,7 +1155,7 @@
 	  }
       else
 #ifdef __FreeBSD__
-	diff = collcmp (texta, textb, min (lena, lenb));
+	diff = strcoll (texta, textb);
 #else
 	diff = memcmp (texta, textb, min (lena, lenb));
 #endif
@@ -1203,7 +1205,7 @@
 	{
 #endif
 #ifdef __FreeBSD__
-	  diff = collcmp (ap, bp, mini);
+	  diff = strcoll (ap, bp);
 #else
 	  diff = memcmp (ap, bp, mini);
 #endif


>Release-Note:
>Audit-Trail:

From: "Andrey A. Chernov" <ache@freebsd.org>
To: cejkar@dcse.fee.vutbr.cz
Cc: FreeBSD-gnats-submit@freebsd.org
Subject: Re: bin/15458: sort(1) doesn't sort correctly in some cases
Date: Tue, 21 Dec 1999 17:06:47 -0800

 On Mon, Dec 13, 1999 at 02:01:02PM +0100, cejkar@dcse.fee.vutbr.cz wrote:
 > sort(1) orders some strings incorrectly in some locales. In cs_CZ.ISO_8859-2
 > (which will be committed shortly; a similar problem may show up with es_ES)
 > the collation definition contains:
 > 
 > 	(H,h);\
 > 	(CH,Ch,ch);\
 > 	(I,i);\
 
 > Here is my patch for /usr/src/gnu/usr.bin/sort/sort.c:
 
 This is a general problem in GNU sort, which compares strings character by
 character. Your patch does not help if, for example, the ignore-case or
 skip-blanks flags are given. A correct fix requires a big redesign of sort.
 Try contacting the GNU sort maintainers first and ask them to fix this bug
 in a future version of sort.
 
 -- 
 Andrey A. Chernov
 http://nagual.pp.ru/~ache/
 MTH/SH/HE S-- W-- N+ PEC>+ D A a++ C G>+ QH+(++) 666+>++ Y
 

From: Cejka Rudolf <cejkar@dcse.fee.vutbr.cz>
To: "Andrey A. Chernov" <ache@FreeBSD.ORG>
Cc: FreeBSD-gnats-submit@FreeBSD.ORG
Subject: Re: bin/15458: sort(1) doesn't sort correctly in some cases
Date: Wed, 22 Dec 1999 15:18:22 +0100

 Andrey A. Chernov wrote (1999/12/21):
 > This is a general problem in GNU sort, which compares strings character by
 > character. Your patch does not help if, for example, the ignore-case or
 > skip-blanks flags are given.
 
 On this point you are right.
 
 > A correct fix requires a big redesign of sort. Try contacting the GNU sort
 > maintainers first and ask them to fix this bug in a future version of sort.
 
 We should not contact the GNU sort maintainers, because this is a
 FreeBSD-specific problem: our source tree contains a very old, patched
 sort-1.14, while they already have sort-2.0. And sort-2.0 works much
 better and does not have this problem.
 
 So the best solution would be to import sort-2.0 from textutils-2.0.
 I have tried this and it appears to work. We have to configure textutils-2.0
 with "configure --with-catgets" and copy out these files: COPYING,
 intl/cat-compat.c, po/cat-id-tbl.c, lib/closeout.[ch], config.h,
 lib/error.[ch], lib/getopt.[ch], lib/getopt1.c, lib/hard-locale.[ch],
 intl/libgettext.h, intl/libintl.h, lib/long-options.[ch], lib/memcoll.[ch],
 man/sort.1, src/sort.c, src/sys2.h, src/system.h, lib/version-etc.[ch],
 lib/xalloc.h and lib/xmalloc.c. Then, in the Makefile, we have to list
 all *.c files in SRCS and add -DLOCALEDIR=\"/usr/share/nls\", after which
 it works.
 
 But I expect further problems to show up there ;-)
 
 -- 
 Rudolf Cejka   (cejkar@dcse.fee.vutbr.cz;  http://www.fee.vutbr.cz/~cejkar)
 Brno University of Technology, Faculty of El. Engineering and Comp. Science
 Bozetechova 2, 612 66  Brno, Czech Republic
 
State-Changed-From-To: open->analyzed 
State-Changed-By: ache 
State-Changed-When: Wed Dec 22 12:57:30 PST 1999 
State-Changed-Why:  
I agree that we need to switch to sort-2.0 
State-Changed-From-To: analyzed->feedback 
State-Changed-By: ache 
State-Changed-When: Sat Jun 8 13:05:01 PDT 2002 
State-Changed-Why:  
We switched to the latest GNU sort in -current. Does the problem in this PR still exist with it?

http://www.freebsd.org/cgi/query-pr.cgi?pr=15458 
State-Changed-From-To: feedback->patched 
State-Changed-By: ache 
State-Changed-When: Mon Jun 10 04:45:02 PDT 2002 
State-Changed-Why:  
Problem fixed in -current by upgrading to new GNU sort 

State-Changed-From-To: patched->closed 
State-Changed-By: le 
State-Changed-When: Thu Jul 22 13:25:43 GMT 2004 
State-Changed-Why:  
As the problem seems to be fixed long ago, close this PR. 

State-Changed-From-To: closed->patched 
State-Changed-By: le 
State-Changed-When: Thu Jul 22 13:38:47 GMT 2004 
State-Changed-Why:  
Re-open this PR as submitter says problem still exists in -stable. 

State-Changed-From-To: patched->feedback 
State-Changed-By: gabor 
State-Changed-When: Fri Mar 16 21:49:27 UTC 2007 
State-Changed-Why:  
Dear Submitter, 

could you check if it's still an issue on a recent release, please? 

Thanks in advance, 
Gabor 


Responsible-Changed-From-To: freebsd-bugs->gabor 
Responsible-Changed-By: gabor 
Responsible-Changed-When: Fri Mar 16 21:49:27 UTC 2007 
Responsible-Changed-Why:  
Track. 

State-Changed-From-To: feedback->closed 
State-Changed-By: gabor 
State-Changed-When: Sat Mar 17 21:06:36 UTC 2007 
State-Changed-Why:  
Submitter agreed that this can be closed now.

>Unformatted:
