From nobody@FreeBSD.org  Fri Sep 23 04:33:44 2005
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id DF7C516A41F
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 23 Sep 2005 04:33:44 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [216.136.204.117])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 9D6A643D45
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 23 Sep 2005 04:33:44 +0000 (GMT)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.13.1/8.13.1) with ESMTP id j8N4Xiem066468
	for <freebsd-gnats-submit@FreeBSD.org>; Fri, 23 Sep 2005 04:33:44 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.13.1/8.13.1/Submit) id j8N4XiA8066467;
	Fri, 23 Sep 2005 04:33:44 GMT
	(envelope-from nobody)
Message-Id: <200509230433.j8N4XiA8066467@www.freebsd.org>
Date: Fri, 23 Sep 2005 04:33:44 GMT
From: Toby Peterson <toby@apple.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: [PATCH] hexdump -s speedup on /dev
X-Send-Pr-Version: www-2.3

>Number:         86485
>Category:       bin
>Synopsis:       [patch] hexdump(1): hexdump -s speedup on /dev
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Sep 23 04:40:09 GMT 2005
>Closed-Date:    
>Last-Modified:  Sun Jun 24 20:57:55 UTC 2012
>Originator:     Toby Peterson
>Release:        HEAD
>Organization:
Apple Computer, Inc.
>Environment:
n/a
>Description:
hexdump -s is very slow on devices when the offset is large, because it reads and discards every skipped byte instead of trying to fseeko() past them.
>How-To-Repeat:
Run hexdump -s <large offset> on a readable device.
>Fix:
Patch: http://lamancha.opendarwin.org/~toby/freebsd/hexdump.diff
>Release-Note:
>Audit-Trail:

From: "Garrett Cooper" <yanefbsd@gmail.com>
To: bug-followup@FreeBSD.org, toby@apple.com
Cc:  
Subject: Re: bin/86485: [PATCH] hexdump(1): hexdump -s speedup on /dev
Date: Sat, 21 Jun 2008 13:47:13 -0700

 Hi,
 Could you please resubmit the patch (link is broken)?
 Thanks,
 -Garrett

From: Toby Peterson <toby@apple.com>
To: Garrett Cooper <yanefbsd@gmail.com>
Cc: bug-followup@FreeBSD.org
Subject: Re: bin/86485: [PATCH] hexdump(1): hexdump -s speedup on /dev
Date: Sat, 21 Jun 2008 19:05:21 -0700

 --Apple-Mail-7--430079199
 Content-Type: text/plain;
 	charset=US-ASCII;
 	format=flowed;
 	delsp=yes
 Content-Transfer-Encoding: 7bit
 
 On Jun 21, 2008, at 1:47 PM, Garrett Cooper wrote:
 
 > Hi,
 > Could you please resubmit the patch (link is broken)?
 > Thanks,
 > -Garrett
 
 
 That webserver is long defunct, but the attached patch should resolve  
 the issue.
 
 - Toby
 
 --Apple-Mail-7--430079199
 Content-Disposition: attachment;
 	filename=hexdump-fseeko.diff
 Content-Type: application/octet-stream;
 	x-unix-mode=0644;
 	name="hexdump-fseeko.diff"
 Content-Transfer-Encoding: 7bit
 
 ? hexdump
 ? hexdump.1.gz
 ? od.1.gz
 Index: Makefile
 ===================================================================
 RCS file: /home/ncvs/src/usr.bin/hexdump/Makefile,v
 retrieving revision 1.9
 diff -u -r1.9 Makefile
 --- Makefile	22 Jul 2004 13:14:42 -0000	1.9
 +++ Makefile	22 Jun 2008 02:01:24 -0000
 @@ -7,6 +7,6 @@
  MLINKS=	hexdump.1 hd.1
  LINKS=	${BINDIR}/hexdump ${BINDIR}/od
  LINKS+=	${BINDIR}/hexdump ${BINDIR}/hd
 -WARNS?=	6
 +WARNS?=	1
  
  .include <bsd.prog.mk>
 Index: display.c
 ===================================================================
 RCS file: /home/ncvs/src/usr.bin/hexdump/display.c,v
 retrieving revision 1.22
 diff -u -r1.22 display.c
 --- display.c	4 Aug 2004 02:47:32 -0000	1.22
 +++ display.c	22 Jun 2008 02:01:24 -0000
 @@ -44,6 +44,7 @@
  
  #include <ctype.h>
  #include <err.h>
 +#include <errno.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
 @@ -384,12 +385,13 @@
  			return;
  		}
  	}
 -	if (S_ISREG(sb.st_mode)) {
 -		if (fseeko(stdin, skip, SEEK_SET))
 -			err(1, "%s", fname);
 +	/* try to seek first; fall back on ESPIPE */
 +	if (fseeko(stdin, skip, SEEK_SET) == 0) {
  		address += skip;
  		skip = 0;
  	} else {
 +		if (errno != ESPIPE)
 +			err(1, "%s", fname);
  		for (cnt = 0; cnt < skip; ++cnt)
  			if (getchar() == EOF)
  				break;
 
 --Apple-Mail-7--430079199
 Content-Type: text/plain;
 	charset=US-ASCII;
 	format=flowed
 Content-Transfer-Encoding: 7bit
 
 
 
 --Apple-Mail-7--430079199--

From: Garrett Cooper <yaneurabeya@gmail.com>
To: "bug-followup@FreeBSD.org" <bug-followup@FreeBSD.org>,
 "toby@apple.com" <toby@apple.com>
Cc:  
Subject: Re: bin/86485: [PATCH] hexdump(1): hexdump -s speedup on /dev
Date: Sat, 27 Dec 2008 02:37:53 -0800

 I'm actually seeing a 20- to 30-fold performance _decrease_ after
 applying this patch with hexdump on Tiger when accessing /dev/zero.
 Could you submit better reproduction steps / criteria?
 Thanks!
 -Garrett

From: Alexander Best <alexbestms@wwu.de>
To: <bug-followup@FreeBSD.org>
Cc: Garrett Cooper <yaneurabeya@gmail.com>,
 Toby Peterson <toby@apple.com>
Subject: Re: bin/86485: [patch] hexdump(1): hexdump -s speedup on /dev
Date: Sun, 28 Feb 2010 17:29:51 +0100 (CET)

 i can't verify the performance decrease garrett mentioned when running HEAD
 (r204383). here are some benchmarks (hexdump being the unpatched binary and
 ./hexdump including toby's patches):
 
 time hexdump -n 2 -s 999999999 /dev/zero  4,45s user 0,43s system 98% cpu
 4,938 total
 
 time ./hexdump -n 2 -s 999999999 /dev/zero  0,00s user 0,00s system 89% cpu
 0,005 total
 
 however while the unpatched hexdump binary succeeds doing
 
 hexdump -n 2 -s 99999999 /dev/ada0  0,52s user 0,52s system 19% cpu 5,418
 total
 
 the patched binary outputs a warning
 
 hexdump: /dev/ada0: Invalid argument
 5f5e0ff
 ./hexdump -n 2 -s 99999999 /dev/ada0  0,00s user 0,00s system 89% cpu 0,006
 total
 
 to me the patch doesn't look right however, because
 
 1. if a file is not seekable, fseeko() shouldn't be used. so the
 
 "if (S_ISREG(sb.st_mode))"
 
 statement should stay. removing it causes the warning i got during
 benchmarking, because fseeko() itself outputs a warning if it's being run on a
 non-seekable file.
 2. the real cause for the slowdown on non-seekable files is the use of
 getchar() which is testing whether EOF has been reached using a blocksize of 1
 byte.
 
 dd is much faster when dealing with non-seekable files. the difference however
 is that dd won't accept a seek value which is bigger than the filesize.
 hexdump on the other hand will accept any seek value. if it's bigger than the
 filesize it outputs the last byte(s) before the EOF.
 
 the dd code dealing with non-seekable files can be found in
 /usr/src/bin/dd/position.c:pos_in()
 
 maybe it's possible to use some of it to get rid of getchar().
 
 cheers.
 alex
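
The suggestion above, replacing the per-byte getchar() loop with block
reads, can be sketched as follows. This is a minimal illustration only:
skip_by_read() is a hypothetical helper, not code from hexdump(1) or from
dd's pos_in(), and it assumes the input is an ordinary stdio stream.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: skip up to `want` bytes of `fp` by reading
 * BUFSIZ-sized blocks instead of calling getchar() once per byte.
 * Returns the number of bytes actually skipped, which is smaller
 * than `want` when EOF or a read error is hit first.
 */
off_t
skip_by_read(FILE *fp, off_t want)
{
	char buf[BUFSIZ];
	off_t done = 0;

	while (done < want) {
		size_t chunk = sizeof(buf);
		size_t got;

		/* Never read past the requested skip distance. */
		if ((off_t)chunk > want - done)
			chunk = (size_t)(want - done);
		got = fread(buf, 1, chunk, fp);
		done += (off_t)got;
		if (got < chunk)
			break;		/* EOF or read error */
	}
	return (done);
}
```

Like the existing loop, this stops quietly at EOF, so a skip value larger
than the file is still accepted and the caller can adjust address and skip
from the return value.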
State-Changed-From-To: open->analyzed 
State-Changed-By: arundel 
State-Changed-When: Sun Aug 29 12:01:55 UTC 2010 
State-Changed-Why:  
The cause for this issue is the use of getchar() which tests every character 
against EOF. This causes huge overhead as can be seen in this comparison 
between the BSD and Linux hexdump versions: 

FreeBSD:	Linux: 
real 44,85	real 0.00 
user 4,51	user 0.00 
sys 38,76	sys 0.00 

The command used for this was 
'time -p hexdump -n 100 -s 1000000000 /dev/random'. Higher values for -s would 
simply take too much time on FreeBSD. ;) 


Responsible-Changed-From-To: freebsd-bugs->arundel 
Responsible-Changed-By: arundel 
Responsible-Changed-When: Sun Aug 29 12:01:55 UTC 2010 
Responsible-Changed-Why:  
Assign to me. Although i don't have commit rights to src i'm working on this 
issue atm. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=86485 

From: Alexander Best <arundel@freebsd.org>
To: bug-followup@freebsd.org
Cc:  
Subject: Re: bin/86485: [patch] hexdump(1): hexdump -s speedup on /dev
Date: Wed, 16 Nov 2011 19:20:45 +0000

 --y0ulUmNC+osPPQO6
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 
 the following patch should fix the issue correctly and without any hacks.
 
 cheers.
 alex
 
 --y0ulUmNC+osPPQO6
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename="hexdump-stat.h.diff"
 
 diff --git a/sys/sys/stat.h b/sys/sys/stat.h
 index 1b03bd2..ac2aeab 100644
 --- a/sys/sys/stat.h
 +++ b/sys/sys/stat.h
 @@ -245,6 +245,8 @@ struct nstat {
  #endif
  #if __BSD_VISIBLE
  #define	S_ISWHT(m)	(((m) & 0170000) == 0160000)	/* whiteout */
 +#define S_ISSEEK(m)	((((m) & 0170000) != 0140000) && \
 +			(((m) & 0170000) != 0010000))	/* seekable file */
  #endif
  
  #if __BSD_VISIBLE
 diff --git a/usr.bin/hexdump/display.c b/usr.bin/hexdump/display.c
 index 991509d..831bd9e 100644
 --- a/usr.bin/hexdump/display.c
 +++ b/usr.bin/hexdump/display.c
 @@ -380,7 +380,7 @@ doskip(const char *fname, int statok)
  			return;
  		}
  	}
 -	if (S_ISREG(sb.st_mode)) {
 +	if (S_ISSEEK(sb.st_mode)) {
  		if (fseeko(stdin, skip, SEEK_SET))
  			err(1, "%s", fname);
  		address += skip;
 
 --y0ulUmNC+osPPQO6--

From: Alexander Best <arundel@freebsd.org>
To: bug-followup@freebsd.org
Cc:  
Subject: Re: bin/86485: [patch] hexdump(1): hexdump -s speedup on /dev
Date: Wed, 16 Nov 2011 22:04:18 +0000

 --zYM0uCDKw75PZbzx
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 
 here are some statistics i did. the output was gathered from running 5 times
 
 '/usr/bin/time -p hexdump -n 100 -s 100m /dev/random'
 
 both with the unpatched and patched version of hexdump(1). as you can see the
 improvement in speed is quite dramatic.
 
 cheers.
 alex
 
 --zYM0uCDKw75PZbzx
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename="ministat.log"
 
 x patched
 + unpatched
 +------------------------------------------------------------+
 |x    +                                                      |
 |x    +                                                      |
 |x    +                  ++    +                             |
 |x   ++                  +++   +             +        +     +|
 |A                                                           |
 |       |________________AM________________|                 |
 +------------------------------------------------------------+
     N           Min           Max        Median           Avg        Stddev
 x  15             0             0             0             0             0
 +  15          0.39          5.92          2.48     2.4286667     1.7736035
 Difference at 80.0% confidence
 	2.42867 +/- 0.601278
 	inf% +/- inf%
 	(Student's t, pooled s = 1.25413)
 
 --zYM0uCDKw75PZbzx--

From: Alexander Best <arundel@freebsd.org>
To: bug-followup@freebsd.org
Cc:  
Subject: Re: bin/86485: [patch] hexdump(1): hexdump -s speedup on /dev
Date: Wed, 16 Nov 2011 22:26:37 +0000

 ... and the rusage stats. ;)
 
 patched:
 
 real 0,00
 user 0,00
 sys  0,00
          0  maximum resident set size
          0  average shared memory size
          0  average unshared data size
          0  average unshared stack size
        125  page reclaims
          0  page faults
          0  swaps
          0  block input operations
          0  block output operations
          0  messages sent
          0  messages received
          0  signals received
          2  voluntary context switches
          0  involuntary context switches
 
 unpatched:
 
 real 2,98
 user 0,47
 sys  2,47
       1080  maximum resident set size
         24  average shared memory size
       2038  average unshared data size
        128  average unshared stack size
        131  page reclaims
          0  page faults
          0  swaps
          0  block input operations
          0  block output operations
          0  messages sent
          0  messages received
          0  signals received
         16  voluntary context switches
        341  involuntary context switches
 
 cheers.
 alex

From: Alexander Best <arundel@freebsd.org>
To: bug-followup@freebsd.org
Cc:  
Subject: Re: bin/86485: [patch] hexdump(1): hexdump -s speedup on /dev
Date: Sun, 20 Nov 2011 16:18:14 +0000

 the real issue is that lseek() (and thus fseeko()) doesn't guarantee that a seek
 has been successful. for devices such as tape drives, zero gets returned,
 although the seek did not succeed.
 
 so although seeking is not possible on fifos, pipes and sockets, that doesn't
 mean that lseek() will work on all other file types.
 
 this means that the provided patch is too simple. hexdump(1) along with all
 other userspace applications that want to use lseek() (e.g. via fseeko), need
 to handle all possible cases themselves.
 
 a revised patch will be submitted shortly.
 
 cheers.
 alex
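
A minimal sketch of what an lseek() probe can and cannot tell a caller,
under the assumptions described above. fd_reports_seekable() is a
hypothetical helper: it separates descriptors that reject seeking with
ESPIPE from those that accept the call, but a successful return says
nothing about whether the device really repositioned.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if lseek() accepts a no-op seek on fd,
 * 0 if it fails with ESPIPE (pipe, FIFO or socket), and -1 on any other
 * error. A return of 1 only means the call was not rejected; devices
 * such as tape drives may report success without actually seeking.
 */
int
fd_reports_seekable(int fd)
{
	if (lseek(fd, (off_t)0, SEEK_CUR) != (off_t)-1)
		return (1);
	return (errno == ESPIPE ? 0 : -1);
}
```

This is why a caller still needs an explicit device-type check for the
cases where the return value cannot be trusted.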

From: Alexander Best <arundel@freebsd.org>
To: bug-followup@freebsd.org
Cc:  
Subject: Re: bin/86485: [patch] hexdump(1): hexdump -s speedup on /dev
Date: Mon, 21 Nov 2011 21:17:45 +0000

 --RnlQjJ0d97Da+TV1
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 
 here's a revised patch. basically the new logic, when to seek and when to use
 getchar() is:
 
 1) if the file argument is a fifo, pipe or socket   --  goto 4)
 2) if the file argument is a tape drive             --  goto 4)
 3) for all other cases try fseeko(), if that fails  --  goto 4)
 
 4) use getchar()
 
 it might also be a good idea to mention that hexdump will not fail in case it
 is being run against a device without a medium (DVD or Blu-ray) inserted.
 
 it's questionable, whether this behavior is correct or not. strictly speaking,
 hexdump does what it has been asked for: skip over some amount of data and
 then print what's there.
 
 cheers.
 alex
 
 --RnlQjJ0d97Da+TV1
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename="hexdump.diff"
 
 diff --git a/usr.bin/hexdump/display.c b/usr.bin/hexdump/display.c
 index 991509d..8c8b065 100644
 --- a/usr.bin/hexdump/display.c
 +++ b/usr.bin/hexdump/display.c
 @@ -35,8 +35,10 @@ static char sccsid[] = "@(#)display.c	8.1 (Berkeley) 6/6/93";
  #include <sys/cdefs.h>
  __FBSDID("$FreeBSD$");
  
 +#include <sys/ioctl.h>
  #include <sys/param.h>
  #include <sys/stat.h>
 +#include <sys/conf.h>
  
  #include <ctype.h>
  #include <err.h>
 @@ -368,7 +370,7 @@ next(char **argv)
  void
  doskip(const char *fname, int statok)
  {
 -	int cnt;
 +	int type;
  	struct stat sb;
  
  	if (statok) {
 @@ -380,16 +382,38 @@ doskip(const char *fname, int statok)
  			return;
  		}
  	}
 -	if (S_ISREG(sb.st_mode)) {
 -		if (fseeko(stdin, skip, SEEK_SET))
 +	if (S_ISFIFO(sb.st_mode) || S_ISSOCK(sb.st_mode)) {
 +		noseek();
 +		return;
 +	}
 +	if (S_ISCHR(sb.st_mode) || S_ISBLK(sb.st_mode)) {
 +		if (ioctl(fileno(stdin), FIODTYPE, &type))
  			err(1, "%s", fname);
 -		address += skip;
 -		skip = 0;
 -	} else {
 -		for (cnt = 0; cnt < skip; ++cnt)
 -			if (getchar() == EOF)
 -				break;
 -		address += cnt;
 -		skip -= cnt;
 +		/*
 +		 * Most tape drives don't support seeking,
 +		 * yet fseeko() would succeed.
 +		 */
 +		if (type & D_TAPE) {
 +			noseek();
 +			return;
 +		}
 +        }
 +	if (fseeko(stdin, skip, SEEK_SET)) {
 +		noseek();
 +		return;
  	}
 +	address += skip;
 +	skip = 0;
 +}
 +
 +void
 +noseek(void)
 +{
 +	int count;
 +
 +	for (count = 0; count < skip; ++count)
 +		if (getchar() == EOF)
 +			break;
 +	address += count;
 +	skip -= count;
  }
 diff --git a/usr.bin/hexdump/hexdump.h b/usr.bin/hexdump/hexdump.h
 index be85bd9..1d4bb85 100644
 --- a/usr.bin/hexdump/hexdump.h
 +++ b/usr.bin/hexdump/hexdump.h
 @@ -97,6 +97,7 @@ u_char	*get(void);
  void	 newsyntax(int, char ***);
  int	 next(char **);
  void	 nomem(void);
 +void	 noseek(void);
  void	 oldsyntax(int, char ***);
  size_t	 peek(u_char *, size_t);
  void	 rewrite(FS *);
 
 --RnlQjJ0d97Da+TV1--
Responsible-Changed-From-To: arundel->eadler 
Responsible-Changed-By: eadler 
Responsible-Changed-When: Sat Jan 7 05:05:07 UTC 2012 
Responsible-Changed-Why:  
arundel has a patch but can't commit. I'll take this as a reminder to 
bug someone about this (and maybe commit it) 

State-Changed-From-To: analyzed->open 
State-Changed-By: eadler 
State-Changed-When: Mon Feb 13 05:20:48 UTC 2012 
State-Changed-Why:  
don't like this state 

Responsible-Changed-From-To: eadler->freebsd-bugs 
Responsible-Changed-By: eadler 
Responsible-Changed-When: Sun Jun 24 20:57:54 UTC 2012 
Responsible-Changed-Why:  
not dealing with this for a while 

>Unformatted:
