From nobody@FreeBSD.org  Mon Jun  7 20:46:07 2010
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5BFB41065672
	for <freebsd-gnats-submit@FreeBSD.org>; Mon,  7 Jun 2010 20:46:07 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 30FB78FC19
	for <freebsd-gnats-submit@FreeBSD.org>; Mon,  7 Jun 2010 20:46:07 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id o57Kk60j090087
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 7 Jun 2010 20:46:06 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id o57Kk608090086;
	Mon, 7 Jun 2010 20:46:06 GMT
	(envelope-from nobody)
Message-Id: <201006072046.o57Kk608090086@www.freebsd.org>
Date: Mon, 7 Jun 2010 20:46:06 GMT
From: Sven Kirmess <sven.kirmess@kzone.ch>
To: freebsd-gnats-submit@FreeBSD.org
Subject: Booting with one component of a gmirror, then with the other leads to an inconsistent gmirror device.
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         147667
>Category:       kern
>Synopsis:       [gmirror] Booting with one component of a gmirror, then with the other leads to an inconsistent gmirror device.
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-geom
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jun 07 20:50:01 UTC 2010
>Closed-Date:    
>Last-Modified:  Tue Jan 15 08:50:01 UTC 2013
>Originator:     Sven Kirmess
>Release:        7.3 and 8.0 on i386 release from DVD
>Organization:
>Environment:
FreeBSD free1.kzone.ch 7.3-RELEASE-p1 FreeBSD 7.3-RELEASE-p1 #0: Wed May 26 04:29:05 UTC 2010     root@i386-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  i386

>Description:
Booting with one component of a gmirror, then with the other, leaves the gmirror device inconsistent. The kernel neither detects that a resync is needed nor marks one of the two components as failed.

See "How-To-Repeat" below for a detailed description.
>How-To-Repeat:
- Create a gmirror (gm0 on ad1 and ad2).

- Shut down the system and remove ad2.
- Boot the system with ad1.
- Create a file in /: $ touch /ad1.txt
- Shut down the system.

- Remove ad1 and add ad2 back into the system.
- Boot the system with ad2.
- Create a file in /: $ touch /ad2.txt
- Shut down the system.

- Add ad1 back into the system.
- Boot the system with ad1 and ad2.

The kernel happily mounts the mirror and doesn't do a resync. You'll get output like this:

$ gmirror status
      Name    Status  Components
mirror/gm0  COMPLETE  ad1
                      ad2
$

If you do an ls / you'll only see one of the two files (that's expected).

Now if you shut down the system again and boot with only ad1, you'll see /ad1.txt, and if you boot with only ad2, you'll see /ad2.txt. That's not expected.

That means the mirror is in an inconsistent state and the kernel didn't detect it.

This is what I would expect:
- Whenever gmirror adds a disk to a mirror, it records the time on the disk.
- When the driver starts a mirror, it checks that both disks carry exactly the same time. If they do not, the active disk (the one used to boot up to this state) is used to start the mirror, the other disk is marked as failed (or something similar), and the error is logged. The administrator is then forced to remove and re-add the other disk. The mirror should not resync automatically, because that would destroy data on the disk being synced to. For example, ad1 might fail temporarily and refuse to boot; on the next boot ad1 might work again and be the first boot disk the BIOS picks. The system would then boot from ad1 and sync back to ad2, overwriting everything changed on ad2 since ad1 failed.
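
The expected behavior above can be sketched as a small Python model. This is not gmirror code: the `Disk` class, its `stamp` field, and `start_mirror` are hypothetical names, and the sketch assumes the time is also rewritten on the surviving disk whenever the mirror starts degraded, since otherwise the two disks in the steps above would never carry different times. A logical clock value is passed in instead of a real time to keep the example deterministic.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    name: str
    stamp: int = 0  # hypothetical per-disk "membership time" field


def start_mirror(present, boot_disk, expected_count, now):
    """Return {disk name: state} for one start of the mirror.

    present        -- Disk objects visible at boot
    boot_disk      -- name of the disk the system booted from
    expected_count -- number of components the mirror was created with
    now            -- logical clock value used for new stamps
    """
    stamps = {d.stamp for d in present}
    if len(stamps) > 1:
        # Stamps disagree: the disks have diverged. Keep the boot disk,
        # mark the rest as failed, and never resync automatically.
        return {d.name: "ACTIVE" if d.name == boot_disk else "FAILED"
                for d in present}
    if len(present) < expected_count:
        # Assumption: a degraded start restamps the surviving disks so
        # that a later reunion with the missing disk is detectable.
        for d in present:
            d.stamp = now
    return {d.name: "ACTIVE" for d in present}


ad1, ad2 = Disk("ad1"), Disk("ad2")
start_mirror([ad1], "ad1", 2, now=1)   # boot with ad1 only
start_mirror([ad2], "ad2", 2, now=2)   # boot with ad2 only
result = start_mirror([ad1, ad2], "ad1", 2, now=3)
print(result)  # {'ad1': 'ACTIVE', 'ad2': 'FAILED'}
```

In this model the third boot leaves ad2 marked as failed instead of silently reporting the mirror as COMPLETE, and nothing is copied until the administrator intervenes.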
>Fix:
none

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-geom 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Mon Jun 14 00:01:45 UTC 2010 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=147667 

From: Alexander Motin <mav@FreeBSD.org>
To: bug-followup@FreeBSD.org, sven.kirmess@kzone.ch
Cc:  
Subject: Re: kern/147667: [gmirror] Booting with one component of a gmirror,
 then with the other leads to an inconsistent gmirror device.
Date: Tue, 15 Jan 2013 10:46:40 +0200

 The described situation is predictable with the existing gmirror metadata
 format. gmirror compares disks only by their generation IDs. If you
 had rebooted one extra time with one of the disks in the experiment, that
 disk would have won and the other would have been resynced automatically.
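
A toy Python model of a comparison based only on generation IDs shows why the two single-disk boots cancel out. It is a sketch of the reasoning in the reply, not the gmirror implementation: `start_mirror`, the `genids` dictionary, and the rule that every degraded start advances the counter on the disks that are present are all assumptions made for illustration.

```python
def start_mirror(genids, present, expected_count):
    """Toy model: compare components by generation ID only.

    genids         -- dict mapping disk name to its stored generation ID
    present        -- names of the disks visible at this boot
    expected_count -- number of components in the mirror
    Returns the list of disks judged stale (candidates for resync).
    """
    newest = max(genids[d] for d in present)
    stale = [d for d in present if genids[d] < newest]
    if len(present) < expected_count:
        # Assumption: a degraded start advances the counter on the
        # disks that are present.
        for d in present:
            genids[d] += 1
    return stale


genids = {"ad1": 0, "ad2": 0}
start_mirror(genids, ["ad1"], 2)   # ad1 alone: its ID becomes 1
start_mirror(genids, ["ad2"], 2)   # ad2 alone: its ID becomes 1
# Both IDs are equal, so nothing looks stale (copy keeps genids intact).
undetected = start_mirror(dict(genids), ["ad1", "ad2"], 2)

start_mirror(genids, ["ad2"], 2)   # one extra boot with ad2 only
detected = start_mirror(genids, ["ad1", "ad2"], 2)
print(undetected, detected)  # [] ['ad1']
```

With one boot on each disk the counters tie and no component is considered stale; the extra boot on ad2 breaks the tie, so ad1 is the one that would be resynced.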
 
 Intel Matrix RAID, supported by the new graid module, handles this situation
 by giving each disk information about every other disk. That makes one disk
 lose the challenge and be rebuilt in every situation.
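
The idea of each disk carrying information about every other disk can be illustrated with another toy Python model. It does not reproduce the Intel Matrix RAID metadata layout or graid's logic: `solo_boot`, `pick_loser`, `fresh_views`, the "OK"/"FAILED" states, and the tie-break rule are all hypothetical, and the sketch only covers a two-disk mirror.

```python
def solo_boot(views, disk):
    """Running on `disk` alone: mark every absent peer FAILED in its metadata."""
    for other in views[disk]:
        if other != disk:
            views[disk][other] = "FAILED"


def pick_loser(views, present):
    """Return the disk to rebuild, or None when no peer is accused.

    Two-disk sketch only. A disk is accused when the other disk's
    metadata marks it FAILED.
    """
    accused = sorted(
        d for d in present
        if any(views[p][d] == "FAILED" for p in present if p != d)
    )
    if not accused:
        return None
    # Mutual accusation means both disks ran alone. The tie-break here
    # (alphabetically last name loses) is a placeholder assumption.
    return accused[-1]


def fresh_views():
    """Each disk's stored view of every member, all healthy."""
    return {"ad1": {"ad1": "OK", "ad2": "OK"},
            "ad2": {"ad1": "OK", "ad2": "OK"}}


views = fresh_views()
solo_boot(views, "ad1")   # ad1's metadata now says ad2 failed
solo_boot(views, "ad2")   # ad2's metadata now says ad1 failed
loser = pick_loser(views, ["ad1", "ad2"])
print(loser)  # ad2
```

Because each disk records its own view of the other, the split history is always visible when the disks meet again, and exactly one of them is chosen for rebuild regardless of how many times either disk was booted alone.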
 
 -- 
 Alexander Motin
>Unformatted:
