From nobody@FreeBSD.org  Wed Jun 20 17:41:29 2007
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 54B2B16A421
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 20 Jun 2007 17:41:29 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [69.147.83.33])
	by mx1.freebsd.org (Postfix) with ESMTP id 442A813C45B
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 20 Jun 2007 17:41:29 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.13.1/8.13.1) with ESMTP id l5KHfTb0053103
	for <freebsd-gnats-submit@FreeBSD.org>; Wed, 20 Jun 2007 17:41:29 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.13.1/8.13.1/Submit) id l5KHfScQ053102;
	Wed, 20 Jun 2007 17:41:28 GMT
	(envelope-from nobody)
Message-Id: <200706201741.l5KHfScQ053102@www.freebsd.org>
Date: Wed, 20 Jun 2007 17:41:28 GMT
From: Mykola Zubach <zuborg@advancedhosters.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: improved gmirror balance algorithm
X-Send-Pr-Version: www-3.0

>Number:         113885
>Category:       kern
>Synopsis:       [gmirror] [patch] improved gmirror balance algorithm
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    mav
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Jun 20 17:50:01 GMT 2007
>Closed-Date:    Tue Dec 08 23:36:44 UTC 2009
>Last-Modified:  Mon Mar 22 00:10:03 UTC 2010
>Originator:     Mykola Zubach
>Release:        6.2-RELEASE
>Organization:
AdvancedHosters
>Environment:
FreeBSD DS021 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Wed Jan 17 18:10:49 UTC 2007     adm@ahb30:/usr/obj/usr/src/sys/Z  i386
>Description:
gmirror has poor read-balance algorithms, which prevent it from fully
utilizing the performance of the underlying disks.

The 'prefer' algorithm is suitable only for non-symmetrical environments,
such as one local and one network-mounted disk.

'load' is broken - the naive 'round-robin' shows better results in all benchmarks.

'split' is completely senseless - it uses 'round-robin' for small
requests and splits big requests across providers. But one disk will
clearly take less time to complete one big request (equal to several
small sequential requests) than two (or more) drives take to complete
those several small requests, because even when there is no other
activity, a single disk does not spend time moving its actuator or
waiting for platter rotation.

'round-robin' is the best of them, but it is naive and does not try to
select the disk that would take the least time to complete a request: a
single sequential read thread uses both providers at half speed (total
throughput is the same as for a single drive); two simultaneous
sequential read threads fight each other and use only 10% of the
throughput, while each could use a separate provider at full speed;
additional threads aggravate the situation.
>How-To-Repeat:
It is reproducible on all gmirror installations.
>Fix:
I have rewritten the 'load' algorithm.

It remembers the bio_offset value of the last operation (read or write)
for each provider and assumes that the preferable disk is the one with
the smallest offset distance:
|bio_offset - disk->last_offset|

It also remembers the last time each disk was used, and tries to pick
the disk that has been idle the longest.

From all providers it selects the one with the smallest value of:
offset_distance / (tunable_parameter + use_delay)

This algorithm is well suited to multithreaded environments (web
servers, DB servers, ...). For example, it allows a web server to run at
almost full speed while awstats (a log analyzer) is running, whereas
'round-robin' causes a 50% slowdown (bandwidth drops from 100 Mbit/s to
50 Mbit/s and stays at that level until awstats finishes its job).
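A minimal, self-contained sketch of the selection metric described above (hypothetical names and sample values; this models only the metric, not the kernel code - the two ratios are compared by cross-multiplication so everything stays in integer arithmetic):

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model: pick the disk with the smallest
 * offset_distance / (plusdelay + ticks_since_last_use) ratio. */
struct disk_state {
	int64_t  last_offset;	/* byte offset of the last I/O on this disk */
	uint64_t idle_ticks;	/* time since last use, in 1/65536 s */
};

static int
pick_disk(const struct disk_state *d, int ndisks, int64_t bio_offset,
    uint64_t plusdelay)
{
	int best = -1;
	int64_t best_dist = -1;
	uint64_t best_delay = 0;

	for (int i = 0; i < ndisks; i++) {
		int64_t dist = llabs(d[i].last_offset - bio_offset);
		uint64_t delay = plusdelay + d[i].idle_ticks;
		/* dist/delay < best_dist/best_delay
		 *   <=>  dist*best_delay < best_dist*delay */
		if (best == -1 ||
		    (uint64_t)dist * best_delay < (uint64_t)best_dist * delay) {
			best = i;
			best_dist = dist;
			best_delay = delay;
		}
	}
	return (best);
}
```

With two disks at equal distance, the one that has been idle longer wins; with equal idle times, the one whose last offset is closer to the request wins.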

Patch attached with submission follows:

--- /sys/geom/mirror/g_mirror.h	Wed May 10 10:10:03 2006
+++ g_mirror.h	Wed Jun 20 18:24:21 2007
@@ -131,6 +131,7 @@
 	int		 d_state;	/* Disk state. */
 	u_int		 d_priority;	/* Disk priority. */
 	struct bintime	 d_delay;	/* Disk delay. */
+	off_t		 last_offset;	/* LBA of last operation. */
 	struct bintime	 d_last_used;	/* When disk was last used. */
 	uint64_t	 d_flags;	/* Additional flags. */
 	u_int		 d_genid;	/* Disk's generation ID. */
--- /sys/geom/mirror/g_mirror.c	Tue Sep 19 14:16:14 2006
+++ g_mirror.c	Wed Jun 20 19:45:30 2007
@@ -45,7 +45,6 @@
 #include <sys/sched.h>
 #include <geom/mirror/g_mirror.h>
 
-
 static MALLOC_DEFINE(M_MIRROR, "mirror_data", "GEOM_MIRROR Data");
 
 SYSCTL_DECL(_kern_geom);
@@ -71,6 +70,11 @@
 TUNABLE_INT("kern.geom.mirror.sync_requests", &g_mirror_syncreqs);
 SYSCTL_UINT(_kern_geom_mirror, OID_AUTO, sync_requests, CTLFLAG_RDTUN,
     &g_mirror_syncreqs, 0, "Parallel synchronization I/O requests.");
+static u_int g_mirror_plusdelay = 60000;
+TUNABLE_INT("kern.geom.mirror.plusdelay", &g_mirror_plusdelay);
+SYSCTL_UINT(_kern_geom_mirror, OID_AUTO, plusdelay, CTLFLAG_RW,
+   &g_mirror_plusdelay, 0, "Delay in 1/64k s to plus.");
+
 
 #define	MSLEEP(ident, mtx, priority, wmesg, timeout)	do {		\
 	G_MIRROR_DEBUG(4, "%s: Sleeping %p.", __func__, (ident));	\
@@ -453,6 +457,7 @@
 	disk->d_priority = md->md_priority;
 	disk->d_delay.sec = 0;
 	disk->d_delay.frac = 0;
+	disk->last_offset = 0;
 	binuptime(&disk->d_last_used);
 	disk->d_flags = md->md_dflags;
 	if (md->md_provider[0] != '\0')
@@ -861,7 +866,7 @@
 static void
 g_mirror_update_delay(struct g_mirror_disk *disk, struct bio *bp)
 {
-
+	return;
 	if (disk->d_softc->sc_balance != G_MIRROR_BALANCE_LOAD)
 		return;
 	binuptime(&disk->d_delay);
@@ -1423,8 +1428,11 @@
 	struct g_consumer *cp;
 	struct bio *cbp;
 	struct bintime curtime;
+	off_t  bio_offset = bp->bio_offset;
+	long long int	dist, best_dist = -1;
+	long long int	use_delay = 0, best_use_delay = 0;
 
-	binuptime(&curtime);
+	getbinuptime(&curtime);
 	/*
 	 * Find a disk which the smallest load.
 	 */
@@ -1432,16 +1440,23 @@
 	LIST_FOREACH(dp, &sc->sc_disks, d_next) {
 		if (dp->d_state != G_MIRROR_DISK_STATE_ACTIVE)
 			continue;
-		/* If disk wasn't used for more than 2 sec, use it. */
-		if (curtime.sec - dp->d_last_used.sec >= 2) {
-			disk = dp;
-			break;
-		}
-		if (disk == NULL ||
-		    bintime_cmp(&dp->d_delay, &disk->d_delay) < 0) {
+
+		dist = dp->last_offset - bio_offset;
+		if (dist < 0)
+			dist = -dist;
+
+/* TODO: please rewrite this code using '<<' and '>>' */
+		use_delay = g_mirror_plusdelay + (curtime.sec - dp->d_last_used.sec)*65536 + (curtime.frac - dp->d_last_used.frac)/65536/65536/65536;
+
+/* select disk with lesser (dist/(g_mirror_plusdelay + real_use_delay)) ratio - its heads looks to be close to [bio_offset] and disk was being used enough long time ago */
+		if (best_dist == -1 || dist * best_use_delay < best_dist * use_delay) {
 			disk = dp;
+			best_dist = dist;
+			best_use_delay = use_delay;
 		}
 	}
+//G_MIRROR_DEBUG(0, "%li.%03hu %lld %s(%lld) %lld/%lld", (long int)curtime.sec, (unsigned short) (curtime.frac/65536/65536/65536/65.536), bp->bio_offset/512, disk->d_name, disk->last_offset/512, best_dist, use_delay);
+
 	KASSERT(disk != NULL, ("NULL disk for %s.", sc->sc_name));
 	cbp = g_clone_bio(bp);
 	if (cbp == NULL) {
@@ -1456,7 +1471,8 @@
 	cp = disk->d_consumer;
 	cbp->bio_done = g_mirror_done;
 	cbp->bio_to = cp->provider;
-	binuptime(&disk->d_last_used);
+	disk->d_last_used = curtime;
+	disk->last_offset = bio_offset;
 	G_MIRROR_LOGREQ(3, cbp, "Sending request.");
 	KASSERT(cp->acr >= 1 && cp->acw >= 1 && cp->ace >= 1,
 	    ("Consumer %s not opened (r%dw%de%d).", cp->provider->name, cp->acr,
@@ -1610,6 +1626,7 @@
 				g_io_deliver(bp, bp->bio_error);
 				return;
 			}
+			disk->last_offset = bp->bio_offset;
 			bioq_insert_tail(&queue, cbp);
 			cbp->bio_done = g_mirror_done;
 			cp = disk->d_consumer;


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-geom 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Thu Jun 21 01:18:50 UTC 2007 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=113885 

From: "Will Andrews" <will@firepipe.net>
To: bug-followup@freebsd.org
Cc: "Mykola Zubach" <zuborg@advancedhosters.com>
Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance algorithm
Date: Tue, 2 Dec 2008 14:50:29 -0700

 I have attached what I believe is a better version of your patch.  It:
 
 1) Fixes the type ambiguity of the new use_delay/best_use_delay and
 dist/best_dist variables, to match the variables used in their calculations;
 they should be uint64_t and off_t, respectively.
 2) Uses bit shifts instead of multiplication/division in the use delay and
 distance calculations.  The precision loss should be acceptable in this
 situation.
 3) Cleans up the style of the code; adds more and better comments.
 4) Gets rid of the g_mirror_disk.d_delay variable since it is no longer
 used, along with the function the original patch short-circuited.
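The shift-based conversion in item 2 can be sketched as follows. This is a standalone model, assuming a bintime-like structure of whole seconds plus a 64-bit binary fraction (units of 2^-64 s); my_bintime and delta_64k are illustrative names, not the kernel API. It folds in a borrow when the fraction wraps across a second boundary:

```c
#include <stdint.h>

/* Model of struct bintime: whole seconds plus a 64-bit binary fraction,
 * where frac counts units of 2^-64 s. */
struct my_bintime {
	int64_t  sec;
	uint64_t frac;
};

/* Express the elapsed time (now - then) in units of 1/65536 s: whole
 * seconds contribute (delta << 16); of the 64-bit fraction, only the top
 * 16 bits are >= 2^-16 s, so shift the fraction delta right by 48. */
static uint64_t
delta_64k(struct my_bintime now, struct my_bintime then)
{
	uint64_t dsec = (uint64_t)(now.sec - then.sec);
	uint64_t dfrac = now.frac - then.frac;	/* wraps modulo 2^64 */

	if (now.frac < then.frac)
		dsec--;				/* borrow one second */
	return ((dsec << 16) + (dfrac >> 48));
}
```

For example, half a second (frac = 2^63) comes out as 32768 ticks, i.e. 32768/65536 s.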
 
 In my testing, with 16 simultaneous processes performing the same test at
 the same time, random read/write throughput improved by about 35% (low
 variance), while sequential read/write throughput improved by 100-400% (high
 variance).  IOs also increased proportionally.  Testing was done using
 "rawio -a -p 16 /dev/mirror/testa", where the test mirror was composed of two
 160GB Seagate SATA disks and the system was a dual Opteron 246 with 1.5GB of
 RAM and no other load, running 8.0-CURRENT as of 12/1/2008.  CPU usage
 impact vs. the old load algorithm appears negligible as well.
 
 Regards,
 Will.
 
 --- gmirror.patch begins here ---
 Index: g_mirror.c
 ===================================================================
 --- g_mirror.c	(revision 185567)
 +++ g_mirror.c	(working copy)
 @@ -25,7 +25,7 @@
   */
  
  #include <sys/cdefs.h>
 -__FBSDID("$FreeBSD$");
 +__FBSDID("$FreeBSD: src/sys/geom/mirror/g_mirror.c,v 1.94 2007/10/20 23:23:19 julian Exp $");
  
  #include <sys/param.h>
  #include <sys/systm.h>
 @@ -45,7 +45,6 @@
  #include <sys/sched.h>
  #include <geom/mirror/g_mirror.h>
  
 -
  static MALLOC_DEFINE(M_MIRROR, "mirror_data", "GEOM_MIRROR Data");
  
  SYSCTL_DECL(_kern_geom);
 @@ -71,7 +70,12 @@
  TUNABLE_INT("kern.geom.mirror.sync_requests", &g_mirror_syncreqs);
  SYSCTL_UINT(_kern_geom_mirror, OID_AUTO, sync_requests, CTLFLAG_RDTUN,
      &g_mirror_syncreqs, 0, "Parallel synchronization I/O requests.");
 +static u_int g_mirror_plusdelay = 60000;
 +TUNABLE_INT("kern.geom.mirror.plusdelay", &g_mirror_plusdelay);
 +SYSCTL_UINT(_kern_geom_mirror, OID_AUTO, plusdelay, CTLFLAG_RW,
 +    &g_mirror_plusdelay, 0, "Additional load delay in 1/65536ths of a second.");
  
 +
  #define	MSLEEP(ident, mtx, priority, wmesg, timeout)	do {		\
  	G_MIRROR_DEBUG(4, "%s: Sleeping %p.", __func__, (ident));	\
  	msleep((ident), (mtx), (priority), (wmesg), (timeout));		\
 @@ -451,8 +455,7 @@
  	disk->d_id = md->md_did;
  	disk->d_state = G_MIRROR_DISK_STATE_NONE;
  	disk->d_priority = md->md_priority;
 -	disk->d_delay.sec = 0;
 -	disk->d_delay.frac = 0;
 +	disk->last_offset = 0;
  	binuptime(&disk->d_last_used);
  	disk->d_flags = md->md_dflags;
  	if (md->md_provider[0] != '\0')
 @@ -863,16 +866,6 @@
  }
  
  static void
 -g_mirror_update_delay(struct g_mirror_disk *disk, struct bio *bp)
 -{
 -
 -	if (disk->d_softc->sc_balance != G_MIRROR_BALANCE_LOAD)
 -		return;
 -	binuptime(&disk->d_delay);
 -	bintime_sub(&disk->d_delay, &bp->bio_t0);
 -}
 -
 -static void
  g_mirror_done(struct bio *bp)
  {
  	struct g_mirror_softc *sc;
 @@ -904,8 +897,6 @@
  		g_topology_lock();
  		g_mirror_kill_consumer(sc, bp->bio_from);
  		g_topology_unlock();
 -	} else {
 -		g_mirror_update_delay(disk, bp);
  	}
  
  	pbp->bio_inbed++;
 @@ -1472,25 +1463,45 @@
  	struct g_consumer *cp;
  	struct bio *cbp;
  	struct bintime curtime;
 +	off_t  bio_offset = bp->bio_offset;
 +	off_t  best_dist = -1, dist;
 +	uint64_t best_use_delay = 0, use_delay = 0;
  
 -	binuptime(&curtime);
 +	getbinuptime(&curtime);
  	/*
 -	 * Find a disk which the smallest load.
 +	 * Find the disk which has the smallest ratio of distance to use
 +	 * delay, i.e. its head looks closest to bio_offset and it was used
 +	 * least recently.
  	 */
  	disk = NULL;
  	LIST_FOREACH(dp, &sc->sc_disks, d_next) {
  		if (dp->d_state != G_MIRROR_DISK_STATE_ACTIVE)
  			continue;
 -		/* If disk wasn't used for more than 2 sec, use it. */
 -		if (curtime.sec - dp->d_last_used.sec >= 2) {
 +
 +		dist = dp->last_offset - bio_offset;
 +		if (dist < 0)
 +			dist = -dist;
 +
 +		/*
 +		 * Calculate the use delay as follows: Add the sysctl
 +		 * configured delay, then convert the bintime structure
 +		 * in terms of 1/65536ths of a second before adding its
 +		 * components.  So multiply seconds difference by 65536
 +		 * and drop all but the 16 most significant bits in the
 +		 * fraction, since they're all greater than 1/65536.
 +		 */
 +		use_delay = g_mirror_plusdelay;
 +		use_delay += ((curtime.sec - dp->d_last_used.sec) << 16);
 +		use_delay += ((curtime.frac - dp->d_last_used.frac) >> 48);
 +
 +		if (best_dist == -1 ||
 +		    dist * best_use_delay < best_dist * use_delay) {
  			disk = dp;
 -			break;
 +			best_dist = dist;
 +			best_use_delay = use_delay;
  		}
 -		if (disk == NULL ||
 -		    bintime_cmp(&dp->d_delay, &disk->d_delay) < 0) {
 -			disk = dp;
 -		}
  	}
 +
  	KASSERT(disk != NULL, ("NULL disk for %s.", sc->sc_name));
  	cbp = g_clone_bio(bp);
  	if (cbp == NULL) {
 @@ -1505,7 +1516,8 @@
  	cp = disk->d_consumer;
  	cbp->bio_done = g_mirror_done;
  	cbp->bio_to = cp->provider;
 -	binuptime(&disk->d_last_used);
 +	disk->d_last_used = curtime;
 +	disk->last_offset = bio_offset;
  	G_MIRROR_LOGREQ(3, cbp, "Sending request.");
  	KASSERT(cp->acr >= 1 && cp->acw >= 1 && cp->ace >= 1,
  	    ("Consumer %s not opened (r%dw%de%d).", cp->provider->name, cp->acr,
 @@ -1659,6 +1671,7 @@
  				g_io_deliver(bp, bp->bio_error);
  				return;
  			}
 +			disk->last_offset = bp->bio_offset;
  			bioq_insert_tail(&queue, cbp);
  			cbp->bio_done = g_mirror_done;
  			cp = disk->d_consumer;
 Index: g_mirror.h
 ===================================================================
 --- g_mirror.h	(revision 185567)
 +++ g_mirror.h	(working copy)
 @@ -23,7 +23,7 @@
   * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
   * SUCH DAMAGE.
   *
 - * $FreeBSD$
 + * $FreeBSD: src/sys/geom/mirror/g_mirror.h,v 1.24 2006/11/01 22:51:49 pjd Exp $
   */
  
  #ifndef	_G_MIRROR_H_
 @@ -133,7 +133,7 @@
  	struct g_mirror_softc	*d_softc; /* Back-pointer to softc. */
  	int		 d_state;	/* Disk state. */
  	u_int		 d_priority;	/* Disk priority. */
 -	struct bintime	 d_delay;	/* Disk delay. */
 +	off_t		 last_offset;	/* LBA of last operation. */
  	struct bintime	 d_last_used;	/* When disk was last used. */
  	uint64_t	 d_flags;	/* Additional flags. */
  	u_int		 d_genid;	/* Disk's generation ID. */
 
 --- gmirror.patch ends here ---
 

Responsible-Changed-From-To: freebsd-geom->pjd 
Responsible-Changed-By: pjd 
Responsible-Changed-When: Thu Dec 4 19:51:01 2008 UTC 
Responsible-Changed-Why:  
I'll handle this one. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=113885 
Responsible-Changed-From-To: pjd->freebsd-geom 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Thu May 28 22:15:15 UTC 2009 
Responsible-Changed-Why:  
pjd is not actively working on GEOM at the moment. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=113885 

From: Emil Mikulic <emikulic@gmail.com>
To: bug-followup@FreeBSD.org
Cc: will@firepipe.net
Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance
	algorithm
Date: Thu, 16 Jul 2009 17:56:19 +1000

 Will Andrews' patch is *fantastic*
 
 With this patch and gmirror set to "load" style balancing, I can run two
 long sequential reads in parallel and get practically linear scaling on
 a two-disk mirror.
 
 Could someone please commit this?
 
 --Emil

From: freebsdpr <freebsdpr@satin.sensation.net.au>
To: bug-followup@FreeBSD.org
Cc: freebsdpr <freebsdpr@sensation.net.au>
Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance algorithm
Date: Tue, 21 Jul 2009 17:45:37 +1000 (EST)

 I was also surprised to discover that gmirror, regardless of the algorithm 
 used, does not seem to offer either random or sequential read performance 
 any better than a single drive. I have a new SATA backplane with 
 individual drive activity indicators - with these you can easily see that 
 the "load" algorithm seems to be selecting (and staying on) only a single 
 drive at a time, for anywhere between 0.1 - 1 seconds. Some simple testing 
 confirmed that there is no discernible read performance benefit between 1 
 or >1 drives - so much for my 4 drive RAID1 idea!
 
 In comparison, a 5 drive graid3 array offers a sequential read speed of 
 nearly 4 times a single drive... with read verify ON.
 
 ----
 
 Onto the "load" patch above - it doesn't seem to work for me. I thought it 
 may have been because I had 4 drives in the array, but even after dropping 
 back to 2 it still only reads from a *single* drive. Any ideas? I'm using 
 7.1R-amd64.
 
 Geom name: db0
 State: COMPLETE
 Components: 2
 Balance: load  <--- ***

From: Ivan Voras <ivoras@freebsd.org>
To: bug-followup@freebsd.org, zuborg@advancedhosters.com
Cc:  
Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance algorithm
Date: Tue, 25 Aug 2009 11:11:12 +0200

 The patch will not increase streaming read performance beyond what's 
 possible with a single drive; it will improve random read performance in 
 certain cases where reads are localized in such a way that serving some 
 of them from one drive and others from the other drive helps.
 
 The reason why there is no scalability with streaming read performance 
 vs what can be achieved with RAID0/3/5 is that there is no striping 
 here. For example: if you need to read 4 striped blocks from a RAID0 of 
 two drives, blocks 0 and 2 will be sequentially stored on the first 
 drive, blocks 1 and 3 will be sequentially stored on the second drive. 
 Thus reading the 4 blocks will result in two sequential reads per drive. 
 OTOH, with RAID1, blocks 0 and 2 will be stored with a "gap" between 
 them, containing block 1, and cannot be read sequentially, but a seek is 
 needed. This is why e.g. the "split" method (which effectively does 
 striping on the request level) doesn't help much with performance.
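The layout argument can be made concrete with a toy model (purely illustrative names; a 2-drive array at block granularity):

```c
/* Toy layout model: for logical block b on a 2-drive array, return the
 * drive and the physical block read from that drive. */

/* RAID0: blocks alternate between drives, so the share each drive
 * serves during a sequential run is itself contiguous. */
static int raid0_drive(int b)  { return (b % 2); }
static int raid0_pblock(int b) { return (b / 2); }

/* RAID1: every drive holds the full image, so splitting a sequential
 * run between two drives leaves a one-block gap (a seek) on each. */
static int raid1_pblock(int b) { return (b); }
```

For logical blocks 0-3: under RAID0, drive 0 reads physical blocks 0 and 1 (sequential); under RAID1, splitting the same run hands drive 0 blocks 0 and 2 - a gap that costs a seek.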
 

From: freebsdpr <freebsdpr@satin.sensation.net.au>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance algorithm
Date: Tue, 22 Sep 2009 01:13:05 +1000 (EST)

 > The patch will not increase streaming read performance beyond what's
 > possible with a single drive[...]
 
 I meant that after applying the patch it literally only reads from a 
 single drive. "iostat -w 1 -x" shows it consistently using the last drive 
 of the set for all reads.

From: Alex Keda <admin@lissyara.su>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance algorithm
Date: Fri, 23 Oct 2009 09:20:00 +0400

 So, where is the maintainer?
 Maybe somebody could commit it?
 It is a very useful feature.

From: Maxim Sobolev <sobomax@FreeBSD.org>
To: bug-followup@FreeBSD.org, zuborg@advancedhosters.com
Cc:  
Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance algorithm
Date: Wed, 02 Dec 2009 16:03:52 -0800

 Here is another patch, which implements a different approach. Basically it 
 looks at the recently served requests and also at the queue length to 
 decide where to send the next request.
 
 http://sobomax.sippysoft.com/~sobomax/geom_mirror.diff
 
 Regards,
 -- 
 Maksym Sobolyev
 Sippy Software, Inc.
 Internet Telephony (VoIP) Experts
 T/F: +1-646-651-1110
 Web: http://www.sippysoft.com
 MSN: sales@sippysoft.com
 Skype: SippySoft

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/113885: commit references a PR
Date: Thu,  3 Dec 2009 21:48:01 +0000 (UTC)

 Author: mav
 Date: Thu Dec  3 21:47:51 2009
 New Revision: 200086
 URL: http://svn.freebsd.org/changeset/base/200086
 
 Log:
   Change 'load' balancing mode algorithm:
   - Instead of measuring last request execution time for each drive and
   choosing one with smallest time, use averaged number of requests, running
   on each drive. This information is more accurate and timely. It allows to
   distribute load between drives in more even and predictable way.
   - For each drive track offset of the last submitted request. If new request
   offset matches previous one or close for some drive, prefer that drive.
   It allows to significantly speedup simultaneous sequential reads.
   
   PR:		kern/113885
   Reviewed by:	sobomax
 
 Modified:
   head/sys/geom/mirror/g_mirror.c
   head/sys/geom/mirror/g_mirror.h
 
 Modified: head/sys/geom/mirror/g_mirror.c
 ==============================================================================
 --- head/sys/geom/mirror/g_mirror.c	Thu Dec  3 21:44:41 2009	(r200085)
 +++ head/sys/geom/mirror/g_mirror.c	Thu Dec  3 21:47:51 2009	(r200086)
 @@ -451,9 +451,6 @@ g_mirror_init_disk(struct g_mirror_softc
  	disk->d_id = md->md_did;
  	disk->d_state = G_MIRROR_DISK_STATE_NONE;
  	disk->d_priority = md->md_priority;
 -	disk->d_delay.sec = 0;
 -	disk->d_delay.frac = 0;
 -	binuptime(&disk->d_last_used);
  	disk->d_flags = md->md_dflags;
  	if (md->md_provider[0] != '\0')
  		disk->d_flags |= G_MIRROR_DISK_FLAG_HARDCODED;
 @@ -863,16 +860,6 @@ bintime_cmp(struct bintime *bt1, struct 
  }
  
  static void
 -g_mirror_update_delay(struct g_mirror_disk *disk, struct bio *bp)
 -{
 -
 -	if (disk->d_softc->sc_balance != G_MIRROR_BALANCE_LOAD)
 -		return;
 -	binuptime(&disk->d_delay);
 -	bintime_sub(&disk->d_delay, &bp->bio_t0);
 -}
 -
 -static void
  g_mirror_done(struct bio *bp)
  {
  	struct g_mirror_softc *sc;
 @@ -904,8 +891,6 @@ g_mirror_regular_request(struct bio *bp)
  		g_topology_lock();
  		g_mirror_kill_consumer(sc, bp->bio_from);
  		g_topology_unlock();
 -	} else {
 -		g_mirror_update_delay(disk, bp);
  	}
  
  	pbp->bio_inbed++;
 @@ -1465,30 +1450,35 @@ g_mirror_request_round_robin(struct g_mi
  	g_io_request(cbp, cp);
  }
  
 +#define TRACK_SIZE  (1 * 1024 * 1024)
 +#define LOAD_SCALE	256
 +#define ABS(x)		(((x) >= 0) ? (x) : (-(x)))
 +
  static void
  g_mirror_request_load(struct g_mirror_softc *sc, struct bio *bp)
  {
  	struct g_mirror_disk *disk, *dp;
  	struct g_consumer *cp;
  	struct bio *cbp;
 -	struct bintime curtime;
 +	int prio, best;
  
 -	binuptime(&curtime);
 -	/*
 -	 * Find a disk which the smallest load.
 -	 */
 +	/* Find a disk with the smallest load. */
  	disk = NULL;
 +	best = INT_MAX;
  	LIST_FOREACH(dp, &sc->sc_disks, d_next) {
  		if (dp->d_state != G_MIRROR_DISK_STATE_ACTIVE)
  			continue;
 -		/* If disk wasn't used for more than 2 sec, use it. */
 -		if (curtime.sec - dp->d_last_used.sec >= 2) {
 -			disk = dp;
 -			break;
 -		}
 -		if (disk == NULL ||
 -		    bintime_cmp(&dp->d_delay, &disk->d_delay) < 0) {
 +		prio = dp->load;
 +		/* If disk head is precisely in position - highly prefer it. */
 +		if (dp->d_last_offset == bp->bio_offset)
 +			prio -= 2 * LOAD_SCALE;
 +		else
 +		/* If disk head is close to position - prefer it. */
 +		if (ABS(dp->d_last_offset - bp->bio_offset) < TRACK_SIZE)
 +			prio -= 1 * LOAD_SCALE;
 +		if (prio <= best) {
  			disk = dp;
 +			best = prio;
  		}
  	}
  	KASSERT(disk != NULL, ("NULL disk for %s.", sc->sc_name));
 @@ -1505,12 +1495,18 @@ g_mirror_request_load(struct g_mirror_so
  	cp = disk->d_consumer;
  	cbp->bio_done = g_mirror_done;
  	cbp->bio_to = cp->provider;
 -	binuptime(&disk->d_last_used);
  	G_MIRROR_LOGREQ(3, cbp, "Sending request.");
  	KASSERT(cp->acr >= 1 && cp->acw >= 1 && cp->ace >= 1,
  	    ("Consumer %s not opened (r%dw%de%d).", cp->provider->name, cp->acr,
  	    cp->acw, cp->ace));
  	cp->index++;
 +	/* Remember last head position */
 +	disk->d_last_offset = bp->bio_offset + bp->bio_length;
 +	/* Update loads. */
 +	LIST_FOREACH(dp, &sc->sc_disks, d_next) {
 +		dp->load = (dp->d_consumer->index * LOAD_SCALE +
 +		    dp->load * 7) / 8;
 +	}
  	g_io_request(cbp, cp);
  }
  
 
 Modified: head/sys/geom/mirror/g_mirror.h
 ==============================================================================
 --- head/sys/geom/mirror/g_mirror.h	Thu Dec  3 21:44:41 2009	(r200085)
 +++ head/sys/geom/mirror/g_mirror.h	Thu Dec  3 21:47:51 2009	(r200086)
 @@ -133,8 +133,8 @@ struct g_mirror_disk {
  	struct g_mirror_softc	*d_softc; /* Back-pointer to softc. */
  	int		 d_state;	/* Disk state. */
  	u_int		 d_priority;	/* Disk priority. */
 -	struct bintime	 d_delay;	/* Disk delay. */
 -	struct bintime	 d_last_used;	/* When disk was last used. */
 +	u_int		 load;		/* Averaged queue length */
 +	off_t		 d_last_offset;	/* Last read offset */
  	uint64_t	 d_flags;	/* Additional flags. */
  	u_int		 d_genid;	/* Disk's generation ID. */
  	struct g_mirror_disk_sync d_sync;/* Sync information. */
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
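The committed 'load' update above maintains an exponentially weighted moving average of each consumer's outstanding-request count. A standalone sketch of just that arithmetic (update_load is an illustrative name, not the kernel function):

```c
#define LOAD_SCALE	256

/* EWMA of queue length as in the committed code: the new sample (the
 * current number of outstanding requests, scaled by LOAD_SCALE) is
 * mixed 1:7 with the previous value. */
static unsigned int
update_load(unsigned int load, unsigned int outstanding)
{
	return ((outstanding * LOAD_SCALE + load * 7) / 8);
}
```

Starting from 0 with a steady queue of one request, the value converges toward LOAD_SCALE; when the queue drains, it decays back toward 0, so one busy burst does not pin a drive's priority forever.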
 
State-Changed-From-To: open->patched 
State-Changed-By: mav 
State-Changed-When: Thu Dec 3 21:51:12 UTC 2009 
State-Changed-Why:  
Patch committed to the HEAD. 


Responsible-Changed-From-To: freebsd-geom->mav 
Responsible-Changed-By: mav 
Responsible-Changed-When: Thu Dec 3 21:51:12 UTC 2009 
Responsible-Changed-Why:  
Patch committed to the HEAD. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=113885 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/113885: commit references a PR
Date: Tue,  8 Dec 2009 23:24:04 +0000 (UTC)

 Author: mav
 Date: Tue Dec  8 23:23:45 2009
 New Revision: 200285
 URL: http://svn.freebsd.org/changeset/base/200285
 
 Log:
   MFC r200086:
   Change 'load' balancing mode algorithm:
   - Instead of measuring last request execution time for each drive and
   choosing one with smallest time, use averaged number of requests, running
   on each drive. This information is more accurate and timely. It allows to
   distribute load between drives in more even and predictable way.
   - For each drive track offset of the last submitted request. If new request
   offset matches previous one or close for some drive, prefer that drive.
   It allows to significantly speedup simultaneous sequential reads.
   
   PR:             kern/113885
 
 Modified:
   stable/8/sys/geom/mirror/g_mirror.c
   stable/8/sys/geom/mirror/g_mirror.h
 Directory Properties:
   stable/8/sys/   (props changed)
   stable/8/sys/amd64/include/xen/   (props changed)
   stable/8/sys/cddl/contrib/opensolaris/   (props changed)
   stable/8/sys/contrib/dev/acpica/   (props changed)
   stable/8/sys/contrib/pf/   (props changed)
   stable/8/sys/dev/xen/xenpci/   (props changed)
 
 Modified: stable/8/sys/geom/mirror/g_mirror.c
 ==============================================================================
 --- stable/8/sys/geom/mirror/g_mirror.c	Tue Dec  8 23:15:48 2009	(r200284)
 +++ stable/8/sys/geom/mirror/g_mirror.c	Tue Dec  8 23:23:45 2009	(r200285)
 @@ -451,9 +451,6 @@ g_mirror_init_disk(struct g_mirror_softc
  	disk->d_id = md->md_did;
  	disk->d_state = G_MIRROR_DISK_STATE_NONE;
  	disk->d_priority = md->md_priority;
 -	disk->d_delay.sec = 0;
 -	disk->d_delay.frac = 0;
 -	binuptime(&disk->d_last_used);
  	disk->d_flags = md->md_dflags;
  	if (md->md_provider[0] != '\0')
  		disk->d_flags |= G_MIRROR_DISK_FLAG_HARDCODED;
 @@ -863,16 +860,6 @@ bintime_cmp(struct bintime *bt1, struct 
  }
  
  static void
 -g_mirror_update_delay(struct g_mirror_disk *disk, struct bio *bp)
 -{
 -
 -	if (disk->d_softc->sc_balance != G_MIRROR_BALANCE_LOAD)
 -		return;
 -	binuptime(&disk->d_delay);
 -	bintime_sub(&disk->d_delay, &bp->bio_t0);
 -}
 -
 -static void
  g_mirror_done(struct bio *bp)
  {
  	struct g_mirror_softc *sc;
 @@ -904,8 +891,6 @@ g_mirror_regular_request(struct bio *bp)
  		g_topology_lock();
  		g_mirror_kill_consumer(sc, bp->bio_from);
  		g_topology_unlock();
 -	} else {
 -		g_mirror_update_delay(disk, bp);
  	}
  
  	pbp->bio_inbed++;
 @@ -1465,30 +1450,35 @@ g_mirror_request_round_robin(struct g_mi
  	g_io_request(cbp, cp);
  }
  
 +#define TRACK_SIZE  (1 * 1024 * 1024)
 +#define LOAD_SCALE	256
 +#define ABS(x)		(((x) >= 0) ? (x) : (-(x)))
 +
  static void
  g_mirror_request_load(struct g_mirror_softc *sc, struct bio *bp)
  {
  	struct g_mirror_disk *disk, *dp;
  	struct g_consumer *cp;
  	struct bio *cbp;
 -	struct bintime curtime;
 +	int prio, best;
  
 -	binuptime(&curtime);
 -	/*
 -	 * Find a disk which the smallest load.
 -	 */
 +	/* Find a disk with the smallest load. */
  	disk = NULL;
 +	best = INT_MAX;
  	LIST_FOREACH(dp, &sc->sc_disks, d_next) {
  		if (dp->d_state != G_MIRROR_DISK_STATE_ACTIVE)
  			continue;
 -		/* If disk wasn't used for more than 2 sec, use it. */
 -		if (curtime.sec - dp->d_last_used.sec >= 2) {
 -			disk = dp;
 -			break;
 -		}
 -		if (disk == NULL ||
 -		    bintime_cmp(&dp->d_delay, &disk->d_delay) < 0) {
 +		prio = dp->load;
 +		/* If disk head is precisely in position - highly prefer it. */
 +		if (dp->d_last_offset == bp->bio_offset)
 +			prio -= 2 * LOAD_SCALE;
 +		else
 +		/* If disk head is close to position - prefer it. */
 +		if (ABS(dp->d_last_offset - bp->bio_offset) < TRACK_SIZE)
 +			prio -= 1 * LOAD_SCALE;
 +		if (prio <= best) {
  			disk = dp;
 +			best = prio;
  		}
  	}
  	KASSERT(disk != NULL, ("NULL disk for %s.", sc->sc_name));
 @@ -1505,12 +1495,18 @@ g_mirror_request_load(struct g_mirror_so
  	cp = disk->d_consumer;
  	cbp->bio_done = g_mirror_done;
  	cbp->bio_to = cp->provider;
 -	binuptime(&disk->d_last_used);
  	G_MIRROR_LOGREQ(3, cbp, "Sending request.");
  	KASSERT(cp->acr >= 1 && cp->acw >= 1 && cp->ace >= 1,
  	    ("Consumer %s not opened (r%dw%de%d).", cp->provider->name, cp->acr,
  	    cp->acw, cp->ace));
  	cp->index++;
 +	/* Remember last head position */
 +	disk->d_last_offset = bp->bio_offset + bp->bio_length;
 +	/* Update loads. */
 +	LIST_FOREACH(dp, &sc->sc_disks, d_next) {
 +		dp->load = (dp->d_consumer->index * LOAD_SCALE +
 +		    dp->load * 7) / 8;
 +	}
  	g_io_request(cbp, cp);
  }
  
 
 Modified: stable/8/sys/geom/mirror/g_mirror.h
 ==============================================================================
 --- stable/8/sys/geom/mirror/g_mirror.h	Tue Dec  8 23:15:48 2009	(r200284)
 +++ stable/8/sys/geom/mirror/g_mirror.h	Tue Dec  8 23:23:45 2009	(r200285)
 @@ -133,8 +133,8 @@ struct g_mirror_disk {
  	struct g_mirror_softc	*d_softc; /* Back-pointer to softc. */
  	int		 d_state;	/* Disk state. */
  	u_int		 d_priority;	/* Disk priority. */
 -	struct bintime	 d_delay;	/* Disk delay. */
 -	struct bintime	 d_last_used;	/* When disk was last used. */
 +	u_int		 load;		/* Averaged queue length */
 +	off_t		 d_last_offset;	/* Last read offset */
  	uint64_t	 d_flags;	/* Additional flags. */
  	u_int		 d_genid;	/* Disk's generation ID. */
  	struct g_mirror_disk_sync d_sync;/* Sync information. */
 
State-Changed-From-To: patched->closed 
State-Changed-By: mav 
State-Changed-When: Tue Dec 8 23:36:07 UTC 2009 
State-Changed-Why:  
Fix merged down to 7/8-STABLE. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=113885 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/113885: commit references a PR
Date: Tue,  8 Dec 2009 23:34:47 +0000 (UTC)

 Author: mav
 Date: Tue Dec  8 23:34:34 2009
 New Revision: 200286
 URL: http://svn.freebsd.org/changeset/base/200286
 
 Log:
   MFC r200086:
   Change 'load' balancing mode algorithm:
   - Instead of measuring last request execution time for each drive and
   choosing one with smallest time, use averaged number of requests, running
   on each drive. This information is more accurate and timely. It allows to
   distribute load between drives in more even and predictable way.
   - For each drive track offset of the last submitted request. If new request
   offset matches previous one or close for some drive, prefer that drive.
   It allows to significantly speedup simultaneous sequential reads.
   
   PR:             kern/113885
 
 Modified:
   stable/7/sys/geom/mirror/g_mirror.c
   stable/7/sys/geom/mirror/g_mirror.h
 Directory Properties:
   stable/7/sys/   (props changed)
   stable/7/sys/contrib/pf/   (props changed)
 
 Modified: stable/7/sys/geom/mirror/g_mirror.c
 ==============================================================================
 --- stable/7/sys/geom/mirror/g_mirror.c	Tue Dec  8 23:23:45 2009	(r200285)
 +++ stable/7/sys/geom/mirror/g_mirror.c	Tue Dec  8 23:34:34 2009	(r200286)
 @@ -451,9 +451,6 @@ g_mirror_init_disk(struct g_mirror_softc
  	disk->d_id = md->md_did;
  	disk->d_state = G_MIRROR_DISK_STATE_NONE;
  	disk->d_priority = md->md_priority;
 -	disk->d_delay.sec = 0;
 -	disk->d_delay.frac = 0;
 -	binuptime(&disk->d_last_used);
  	disk->d_flags = md->md_dflags;
  	if (md->md_provider[0] != '\0')
  		disk->d_flags |= G_MIRROR_DISK_FLAG_HARDCODED;
 @@ -863,16 +860,6 @@ bintime_cmp(struct bintime *bt1, struct 
  }
  
  static void
 -g_mirror_update_delay(struct g_mirror_disk *disk, struct bio *bp)
 -{
 -
 -	if (disk->d_softc->sc_balance != G_MIRROR_BALANCE_LOAD)
 -		return;
 -	binuptime(&disk->d_delay);
 -	bintime_sub(&disk->d_delay, &bp->bio_t0);
 -}
 -
 -static void
  g_mirror_done(struct bio *bp)
  {
  	struct g_mirror_softc *sc;
 @@ -904,8 +891,6 @@ g_mirror_regular_request(struct bio *bp)
  		g_topology_lock();
  		g_mirror_kill_consumer(sc, bp->bio_from);
  		g_topology_unlock();
 -	} else {
 -		g_mirror_update_delay(disk, bp);
  	}
  
  	pbp->bio_inbed++;
 @@ -1465,30 +1450,35 @@ g_mirror_request_round_robin(struct g_mi
  	g_io_request(cbp, cp);
  }
  
 +#define TRACK_SIZE  (1 * 1024 * 1024)
 +#define LOAD_SCALE	256
 +#define ABS(x)		(((x) >= 0) ? (x) : (-(x)))
 +
  static void
  g_mirror_request_load(struct g_mirror_softc *sc, struct bio *bp)
  {
  	struct g_mirror_disk *disk, *dp;
  	struct g_consumer *cp;
  	struct bio *cbp;
 -	struct bintime curtime;
 +	int prio, best;
  
 -	binuptime(&curtime);
 -	/*
 -	 * Find a disk which the smallest load.
 -	 */
 +	/* Find a disk with the smallest load. */
  	disk = NULL;
 +	best = INT_MAX;
  	LIST_FOREACH(dp, &sc->sc_disks, d_next) {
  		if (dp->d_state != G_MIRROR_DISK_STATE_ACTIVE)
  			continue;
 -		/* If disk wasn't used for more than 2 sec, use it. */
 -		if (curtime.sec - dp->d_last_used.sec >= 2) {
 -			disk = dp;
 -			break;
 -		}
 -		if (disk == NULL ||
 -		    bintime_cmp(&dp->d_delay, &disk->d_delay) < 0) {
 +		prio = dp->load;
 +		/* If disk head is precisely in position - highly prefer it. */
 +		if (dp->d_last_offset == bp->bio_offset)
 +			prio -= 2 * LOAD_SCALE;
 +		else
 +		/* If disk head is close to position - prefer it. */
 +		if (ABS(dp->d_last_offset - bp->bio_offset) < TRACK_SIZE)
 +			prio -= 1 * LOAD_SCALE;
 +		if (prio <= best) {
  			disk = dp;
 +			best = prio;
  		}
  	}
  	KASSERT(disk != NULL, ("NULL disk for %s.", sc->sc_name));
 @@ -1505,12 +1495,18 @@ g_mirror_request_load(struct g_mirror_so
  	cp = disk->d_consumer;
  	cbp->bio_done = g_mirror_done;
  	cbp->bio_to = cp->provider;
 -	binuptime(&disk->d_last_used);
  	G_MIRROR_LOGREQ(3, cbp, "Sending request.");
  	KASSERT(cp->acr >= 1 && cp->acw >= 1 && cp->ace >= 1,
  	    ("Consumer %s not opened (r%dw%de%d).", cp->provider->name, cp->acr,
  	    cp->acw, cp->ace));
  	cp->index++;
 +	/* Remember last head position */
 +	disk->d_last_offset = bp->bio_offset + bp->bio_length;
 +	/* Update loads. */
 +	LIST_FOREACH(dp, &sc->sc_disks, d_next) {
 +		dp->load = (dp->d_consumer->index * LOAD_SCALE +
 +		    dp->load * 7) / 8;
 +	}
  	g_io_request(cbp, cp);
  }
  
 
 Modified: stable/7/sys/geom/mirror/g_mirror.h
 ==============================================================================
 --- stable/7/sys/geom/mirror/g_mirror.h	Tue Dec  8 23:23:45 2009	(r200285)
 +++ stable/7/sys/geom/mirror/g_mirror.h	Tue Dec  8 23:34:34 2009	(r200286)
 @@ -133,8 +133,8 @@ struct g_mirror_disk {
  	struct g_mirror_softc	*d_softc; /* Back-pointer to softc. */
  	int		 d_state;	/* Disk state. */
  	u_int		 d_priority;	/* Disk priority. */
 -	struct bintime	 d_delay;	/* Disk delay. */
 -	struct bintime	 d_last_used;	/* When disk was last used. */
 +	u_int		 load;		/* Averaged queue length */
 +	off_t		 d_last_offset;	/* Last read offset */
  	uint64_t	 d_flags;	/* Additional flags. */
  	u_int		 d_genid;	/* Disk's generation ID. */
  	struct g_mirror_disk_sync d_sync;/* Sync information. */
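For reference, the disk-selection and load-averaging logic introduced by the patch can be condensed into a self-contained user-space sketch. The struct disk type and the select_disk/update_loads names are illustrative stand-ins, not the kernel's g_mirror_disk or its functions; the inflight field stands in for the d_consumer->index in-flight counter, and LOAD_SCALE/TRACK_SIZE take their values from the diff:

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define TRACK_SIZE	(1 * 1024 * 1024)	/* offsets closer than this count as "near" */
#define LOAD_SCALE	256			/* fixed-point scale for the averaged queue length */

/* Simplified stand-in for struct g_mirror_disk. */
struct disk {
	unsigned	load;		/* averaged in-flight requests, times LOAD_SCALE */
	long long	last_offset;	/* end offset of the last request sent here */
	unsigned	inflight;	/* stand-in for d_consumer->index */
};

/* Pick the disk with the lowest effective priority for a read at 'offset'. */
static struct disk *
select_disk(struct disk *disks, int ndisks, long long offset)
{
	struct disk *best_disk = NULL;
	int best = INT_MAX, prio, i;

	for (i = 0; i < ndisks; i++) {
		prio = (int)disks[i].load;
		if (disks[i].last_offset == offset)
			prio -= 2 * LOAD_SCALE;	/* head exactly in position */
		else if (llabs(disks[i].last_offset - offset) < TRACK_SIZE)
			prio -= 1 * LOAD_SCALE;	/* head close to position */
		if (prio <= best) {
			best_disk = &disks[i];
			best = prio;
		}
	}
	return (best_disk);
}

/* After dispatching, remember the head position and decay the load averages. */
static void
update_loads(struct disk *disks, int ndisks, struct disk *chosen,
    long long offset, long long length)
{
	int i;

	chosen->inflight++;
	chosen->last_offset = offset + length;
	for (i = 0; i < ndisks; i++)
		disks[i].load = (disks[i].inflight * LOAD_SCALE +
		    disks[i].load * 7) / 8;
}
```

Because an exact head match earns a 2 * LOAD_SCALE bonus, a disk already serving a sequential stream keeps winning the selection until its averaged queue length exceeds an idle disk's by more than two requests, while distant random reads still drain toward the least-loaded disk.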
 

From: freebsdpr <freebsdpr@satin.sensation.net.au>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance algorithm
Date: Tue, 5 Jan 2010 19:48:35 +1100 (EST)

 I'm still not sure this is working as expected. For a sequential read it 
 gets "stuck" on one drive for several seconds before switching (I counted 
 over 30 seconds in one instance), rather than swapping back and forth as 
 the drives leapfrog each other. Why is the drive at 95% utilisation 
 preferred over the one at 0% util? Is there perhaps a way to tell the OS 
 that we're about to do a sequential read and position all drives in the 
 set at identical positions to force them to all be used?
 
 For random reads it is switching between the drives more rapidly (eg, 
 "iostat -w 1 -x" shows activity on both drives for every second), but 
 their respective loads never go above 50%, which is pretty much the same 
 situation before I applied the patches... aggregate performance is no 
 better than a single drive.
 
 FYI I applied patch-5.diff to a 7.0R amd64 system, but had similar results 
 with earlier versions on a 7.1R amd64 system.
 
 Even though it doesn't seem to be working here - I appreciate the people 
 who are looking into this. Thanks. :)

From: Alexander Motin <mav@FreeBSD.org>
To: freebsdpr <freebsdpr@satin.sensation.net.au>
Cc: bug-followup <bug-followup@FreeBSD.org>
Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance algorithm
Date: Tue, 05 Jan 2010 11:08:43 +0200

 freebsdpr wrote:
 >  I'm still not sure this is working as expected. For a sequential read it 
 >  gets "stuck" on one drive for several seconds before switching (I counted 
 >  over 30 seconds in one instance), rather than swapping back and forth as 
 >  each drive leapfrogs each other. Why is the drive at 95% utilisation 
 >  preferred over the one at 0% util? Is there perhaps a way to tell the OS 
 >  that we're about to do a sequential read and position all drives in the 
 >  set at identical positions to force them to all be used?
 
 It works exactly as expected: a sequential read uses only one drive at a
 time to get the best effect from the drive's read-ahead cache. Jumping
 between drives would not increase performance but reduce it in most cases:
 instead of reading, each drive would have to wait for a platter revolution
 over data already read by the others. With several simultaneous read
 streams the benefits of the current scheme are much more significant.
 
 If you wish to make all drives busy, you may try the 'split' method, but
 for regular HDDs you won't see a benefit.
 
 >  For random reads it is switching between the drives more rapidly (eg, 
 >  "iostat -w 1 -x" shows activity on both drives for every second), but 
 >  their respective loads never go above 50%, which is pretty much the same 
 >  situation before I applied the patches... aggregate performance is no 
 >  better than a single drive.
 
 You are probably reading in a single stream. The more requests you run
 simultaneously, the better the results will be. To utilize all drives you
 should have at least as many outstanding requests as there are drives in
 the array.
 
 -- 
 Alexander Motin
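The "stuck on one drive" behaviour follows directly from the priority arithmetic in the patch. A back-of-envelope check (the effective_prio helper and the sample load values are illustrative, not from the kernel; LOAD_SCALE is from the diff):

```c
#include <assert.h>

#define LOAD_SCALE	256

/* Effective priority as computed in g_mirror_request_load(): lower wins.
 * 'bonus' is 2 for an exact head-position match, 1 for a near miss,
 * 0 otherwise. */
static int
effective_prio(int load, int bonus)
{
	return (load - bonus * LOAD_SCALE);
}
```

A drive streaming sequentially with an averaged queue length of 1.5 requests scores 1.5 * LOAD_SCALE - 2 * LOAD_SCALE = -128, while a completely idle drive with no positional bonus scores 0; the busy drive keeps the stream until its averaged queue length exceeds the idle drive's by more than two requests.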

From: freebsdpr <freebsdpr@satin.sensation.net.au>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance algorithm
Date: Mon, 22 Mar 2010 11:06:02 +1100 (EST)

 On Tue, 5 Jan 2010, freebsdpr wrote:
 
 > I'm still not sure this is working as expected. For a sequential read it gets 
 > "stuck" on one drive for several seconds before switching (I counted over 30 
 ...
 
 I can confirm this *is* working and that I was using the wrong approach 
 for testing. I now have a 4 drive gmirror array for MySQL indexes (the 
 requests are predominantly random reads with occasional bursts of 
 writes), and running 6 concurrent MySQL query processes results in the 
 average utilisation of each of the four drives being 90%+.
 
 iostat -w 60 -x ad12 ad14 ad16 ad18
 
                          extended device statistics
 device     r/s   w/s    kr/s    kw/s wait svc_t  %b
 ad12      81.5  31.2  3410.0  1815.8    1 134.1  94
 ad14      80.8  31.2  3405.4  1815.8    1  72.7  95
 ad16      83.3  31.2  3488.7  1815.8    3 105.7  94
 ad18      82.8  31.2  3502.1  1815.8    2 110.4  95
 
 (writes look busier than they are - it's more like 50Mbytes/sec for 2 
 seconds out of the minute rather than 2Mbytes every single second)
 
 This is a real world application, not a synthetic test, so it is working 
 very nicely. Well done!
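For anyone wanting to reproduce these results, a rough sketch of a test setup; 'gm0' and the offsets are placeholders, not taken from this PR, and any benchmark that issues several independent read streams works as well (see gmirror(8)):

```shell
# Switch an existing mirror to the 'load' balance algorithm and verify.
gmirror configure -b load gm0
gmirror status gm0

# Generate several concurrent sequential readers at widely spaced offsets,
# so each drive in the mirror can keep a stream of its own.
for i in 0 1 2 3; do
    dd if=/dev/mirror/gm0 of=/dev/null bs=1m count=1024 skip=$((i * 4096)) &
done
wait
```

With only one reader the 'load' algorithm will, by design, keep the whole stream on a single drive; the aggregate gain appears once the number of outstanding requests is at least the number of drives.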
>Unformatted:
