From nobody@FreeBSD.org  Thu May  9 20:59:34 2013
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115])
	by hub.freebsd.org (Postfix) with ESMTP id 93FCC397
	for <freebsd-gnats-submit@FreeBSD.org>; Thu,  9 May 2013 20:59:34 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from oldred.FreeBSD.org (oldred.freebsd.org [8.8.178.121])
	by mx1.freebsd.org (Postfix) with ESMTP id 859EC17B
	for <freebsd-gnats-submit@FreeBSD.org>; Thu,  9 May 2013 20:59:34 +0000 (UTC)
Received: from oldred.FreeBSD.org ([127.0.1.6])
	by oldred.FreeBSD.org (8.14.5/8.14.5) with ESMTP id r49KxXqQ077415
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 9 May 2013 20:59:33 GMT
	(envelope-from nobody@oldred.FreeBSD.org)
Received: (from nobody@localhost)
	by oldred.FreeBSD.org (8.14.5/8.14.5/Submit) id r49KxXG3077414;
	Thu, 9 May 2013 20:59:33 GMT
	(envelope-from nobody)
Message-Id: <201305092059.r49KxXG3077414@oldred.FreeBSD.org>
Date: Thu, 9 May 2013 20:59:33 GMT
From: Jason Keller <jkeller@bbiinternational.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: Optimized Checksum Code for ZFS
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         178467
>Category:       kern
>Synopsis:       [zfs] [request] Optimized Checksum Code for ZFS
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-fs
>State:          suspended
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Thu May 09 21:00:00 UTC 2013
>Closed-Date:    
>Last-Modified:  Thu Aug 15 03:44:03 UTC 2013
>Originator:     Jason Keller
>Release:        9.1
>Organization:
BBI International
>Environment:
FreeBSD chewy 9.1-RELEASE-p3 FreeBSD 9.1-RELEASE-p3 #0: Mon Apr 29 18:27:25 UTC 2013     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
>Description:
This isn't so much a problem as it is an RFE.  Basically, the SHA256
checksum code within ZFS looks like it could use a helping hand.
In my limited testing, Solaris 11 appears to have at least a
20-25% edge in efficiency when doing SHA256 checksumming for ZFS.

IANAP, but it would be extremely nice to have the same (or
better) efficiency for ZFS on FreeBSD.  I have not done specific testing
with Fletcher4, but that also seemed to be slightly better tuned in
Solaris 11.
>How-To-Repeat:

>Fix:


>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->suspended 
State-Changed-By: linimon 
State-Changed-When: Fri May 10 03:35:38 UTC 2013 
State-Changed-Why:  
assign, and note that someone will need to provide a patch. 


Responsible-Changed-From-To: freebsd-bugs->freebsd-fs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Fri May 10 03:35:38 UTC 2013 
Responsible-Changed-Why:  

http://www.freebsd.org/cgi/query-pr.cgi?pr=178467 
Responsible-Changed-From-To: freebsd-fs->zfs-devel 
Responsible-Changed-By: delphij 
Responsible-Changed-When: Fri May 10 06:16:03 UTC 2013 
Responsible-Changed-Why:  
Assign to zfs-devel@ 

http://www.freebsd.org/cgi/query-pr.cgi?pr=178467 

From: "Steven Hartland" <smh@freebsd.org>
To: <bug-followup@freebsd.org>,
	<jkeller@bbiinternational.com>
Cc:  
Subject: Re: kern/178467: [request] Optimized Checksum Code for ZFS
Date: Fri, 10 May 2013 09:47:59 +0100

 We would need something more to go on than "looks like", I'm afraid.
 
 Also, Fletcher4 is the default checksum, which achieves ~4GB/s
 per core in hashing performance, whereas SHA-256 even with
 hand-written assembly manages less than 1/10th of that performance,
 so if you're looking for performance for checksums, use the
 default Fletcher4 instead of SHA-256.
 
 That said new processors do have HW support which could be
 used to accelerate SHA-256 support, details of this can be
 found here:-
 http://download.intel.com/embedded/processor/whitepaper/327457.pdf
 
 These sorts of core feature enhancements should be discussed
 and implemented upstream at illumos.
 
     Regards
     Steve

From: Jason Keller <jkeller@bbiinternational.com>
To: Steven Hartland <smh@freebsd.org>, "bug-followup@freebsd.org"
	<bug-followup@freebsd.org>
Cc:  
Subject: RE: kern/178467: [request] Optimized Checksum Code for ZFS
Date: Fri, 10 May 2013 13:21:26 +0000

 Ok, thank you Steven - I'll gather up more detailed information when I
 have my test environment fully fleshed out so I have absolute apples to
 apples numbers and can fully constrain my testing to one hardware
 platform (the platforms were slightly different, same processors and
 memory though).  I'll file that as an RFE with Illumos if that's what
 you think is best.  I just wanted to put that out there, since I
 certainly noticed the difference in my many weeks of testing different
 platforms here (OmniOS, Solaris 11.0, Solaris 11.1, FreeBSD 9.1,
 Nexenta 4 CE).  Didn't really know where I should file that particular
 RFE, so I figured I'd start with the kernel team.  I didn't think that
 the SHA256 implementation in FreeBSD was taken exactly from Illumos.
 

From: "Steven Hartland" <smh@freebsd.org>
To: <bug-followup@freebsd.org>,
	<jkeller@bbiinternational.com>
Cc:  
Subject: Re: kern/178467: [request] Optimized Checksum Code for ZFS
Date: Fri, 10 May 2013 15:29:11 +0100

 Actually after double checking it looks like FreeBSD doesn't use the
 same SHA-256 implementation in ZFS as illumos so there may well be
 something to look at there.
 
 It would be good to know the difference in performance between FreeBSD
 and OpenIndiana (an illumos distribution).
 
     Regards
     Steve

From: Jason Keller <jkeller@bbiinternational.com>
To: Steven Hartland <smh@freebsd.org>, "bug-followup@freebsd.org"
	<bug-followup@freebsd.org>
Cc:  
Subject: RE: kern/178467: [request] Optimized Checksum Code for ZFS
Date: Fri, 10 May 2013 14:35:10 +0000

 I'll gather some numbers together comparing OmniOS (Illumos) vs FreeBSD
 and get back some numbers for you.  I can use my Xeon E3-1240 at home
 for the benchmarking, it'll just take me some time to gather everything
 together to do it.  Are there any specific tools you'd like me to run,
 or just basic zpool iostat and mpstat / top -P ?
 

From: Jason Keller <jkeller@bbiinternational.com>
To: Steven Hartland <smh@freebsd.org>, "bug-followup@freebsd.org"
	<bug-followup@freebsd.org>
Cc:  
Subject: RE: kern/178467: [request] Optimized Checksum Code for ZFS
Date: Fri, 10 May 2013 16:26:33 +0000

 Steven,
 
 It also looks as if kern/125738 is related to hardware acceleration of
 SHA256 in ZFS where it's available - PJD took this one, but doesn't
 look like he had time to work on it.  So they are similar, though this
 request is a bit more broad.
 
 Also, is there any way to scrub my mobile number off there in the
 ticket details?  Totally spaced out that it's in my default signature
 here at work.
 
 --Jason
 

From: Alan Somers <asomers@freebsd.org>
To: bug-followup@FreeBSD.org, jkeller@bbiinternational.com
Cc:  
Subject: Re: kern/178467: [zfs] [request] Optimized Checksum Code for ZFS
Date: Mon, 24 Jun 2013 09:28:08 -0600

 FWIW, I spent a full day trying to accelerate Fletcher-4 using SIMD
 instructions (tested on Sandy Bridge and Nehalem).  I was unable to
 improve on the current code; the Fletcher-4 hash is very fast and
 doesn't vectorize well.  However, I believe that AVX-2 will probably
 be able to beat the non-vectorized version.  I plan to try it out as
 soon as I can get my hands on a Haswell CPU.  I've also spent several
 weeks analyzing the strength of Fletcher-4, and concluded that it's
 really quite good.  Good enough for every non-cryptographic
 application, certainly.  My recommendation is that all ZFS users
 should prefer Fletcher-4 over SHA-256.  I haven't tried vectorizing
 SHA-256 and don't plan to.
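
 For reference, a minimal sketch of the scalar Fletcher-4 loop as it is
 commonly described for ZFS: four 64-bit running sums over 32-bit input
 words.  The names fletcher4_sums and fletcher4_sketch are hypothetical
 and are not taken from the FreeBSD or illumos sources; byte-order
 handling is omitted.  Each sum feeds the next one and carries over to
 the next iteration, which is one plausible reason the loop is hard to
 vectorize.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: the four running sums of Fletcher-4. */
typedef struct {
    uint64_t a, b, c, d;
} fletcher4_sums;

/*
 * Illustrative scalar Fletcher-4 over an array of 32-bit words in
 * native byte order.  This is a sketch, not the in-kernel code.
 */
static fletcher4_sums
fletcher4_sketch(const uint32_t *words, size_t nwords)
{
    uint64_t a = 0, b = 0, c = 0, d = 0;

    for (size_t i = 0; i < nwords; i++) {
        a += words[i];  /* first-order sum of the data */
        b += a;         /* second-order sum: depends on a */
        c += b;         /* third-order sum: depends on b */
        d += c;         /* fourth-order sum: depends on c */
    }

    return (fletcher4_sums){ a, b, c, d };
}
```

 For the input words {1, 2, 3, 4} this yields the sums 10, 20, 35
 and 56.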

From: Jason Keller <jkeller@bbiinternational.com>
To: Alan Somers <asomers@freebsd.org>, "bug-followup@FreeBSD.org"
	<bug-followup@FreeBSD.org>
Cc:  
Subject: RE: kern/178467: [zfs] [request] Optimized Checksum Code for ZFS
Date: Mon, 24 Jun 2013 16:20:35 +0000

 Thank you very much for following up on this.  Any further optimization
 of checksums is gravy, for everyone that uses FreeBSD/ZFS.  Sure
 Fletcher4 is pretty light, but every little bit helps.  Fewer CPU
 cycles used = less latency = more win.  I think for crypto/dedup
 applications, perhaps effort should be focused on an optimized
 implementation of Keccak (SHA3 winner) instead of SHA256?
Responsible-Changed-From-To: zfs-devel->freebsd-fs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Thu Aug 15 03:43:25 UTC 2013 
Responsible-Changed-Why:  
apparently there is no such alias.  reassign to freebsd-fs. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=178467 
>Unformatted:
