From nobody@FreeBSD.org  Thu Aug 23 15:06:42 2001
Return-Path: <nobody@FreeBSD.org>
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
	by hub.freebsd.org (Postfix) with ESMTP id 55F3237B409
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 23 Aug 2001 15:06:42 -0700 (PDT)
	(envelope-from nobody@FreeBSD.org)
Received: (from nobody@localhost)
	by freefall.freebsd.org (8.11.4/8.11.4) id f7NM6gh65191;
	Thu, 23 Aug 2001 15:06:42 -0700 (PDT)
	(envelope-from nobody)
Message-Id: <200108232206.f7NM6gh65191@freefall.freebsd.org>
Date: Thu, 23 Aug 2001 15:06:42 -0700 (PDT)
From: Marc van Woerkom <3d@FreeBSD.org>
To: freebsd-gnats-submit@FreeBSD.org
Subject: This document should be translated, commented and added
X-Send-Pr-Version: www-1.0

>Number:         30008
>Category:       docs
>Synopsis:       [patch] French softupdates document should be translated, commented and added
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-doc
>State:          closed
>Quarter:        
>Keywords:       handbook
>Date-Required:  
>Class:          wish
>Submitter-Id:   current-users
>Arrival-Date:   Thu Aug 23 15:10:00 PDT 2001
>Closed-Date:    Sat Feb 04 16:25:12 UTC 2012
>Last-Modified:  Sat Feb 04 16:25:12 UTC 2012
>Originator:     Marc van Woerkom
>Release:        
>Organization:
>Environment:
>Description:
Softupdates could use some promotion/explanation.
I stumbled on a nice French document that might help:

http://www.freebsd-fr.org/docs/fr/others/systeme-fichier/


>How-To-Repeat:
http://daily.daemonnews.org/view_story.php3?story_id=2327
>Fix:
If it has not been translated yet, please mail me and I'll
do it.

>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->analyzed 
State-Changed-By: dd 
State-Changed-When: Mon Sep 3 08:59:09 PDT 2001 
State-Changed-Why:  
As far as I can tell, it has not been translated, and nobody other 
than you (the originator) has volunteered to do it. 

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=30008 

From: Salvo Bartolotta <bartequi@neomedia.it>
To: Marc van Woerkom <3d@FreeBSD.org>
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: docs/30008: This document should be translated, commented and added
Date: Thu, 04 Oct 2001 10:45:07 +0200 (CEST)

 If you haven't finished yet, I might try to give you a hand; well, it's going 
 to be a nice exercise. :-)
 
 -- Salvo

From: Marc van Woerkom <3d@hub.freebsd.org>
To: bartequi@neomedia.it
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: docs/30008: This document should be translated, commented and added
Date: Mon,  8 Oct 2001 09:43:23 -0700 (PDT)

 Hi,
 
 no I had no time yet.
 Should we split it?
 It is easy to start with a raw translation
 by babelfish or that GIST service from NS6/Mozilla.
 
 Regards,
 Marc

From: Salvo Bartolotta <bartequi@neomedia.it>
To: 3d@FreeBSD.org, marc.vanwoerkom@fernuni-hagen.de
Cc: freebsd-gnats-submit@FreeBSD.org
Subject: Re: docs/30008: This document should be translated, commented and added
Date: Mon, 08 Oct 2001 19:18:29 +0200 (CEST)

 Hello Marc,
 
 Yes, we might split it into a couple of parts for translation purposes.  Just 
 let me know on what part(s) you have begun to work.  I am not sure whether I 
 will make use of Babelfish.  I read French fairly fluently -- and I have even 
 been to France a few times.  The main problem for me will probably be to 
 *properly* translate into English -- hopefully decent, or academic-like, 
 prose. I am not sure Babelfish can help with this fine-tuning. :-)
 
 As far as the "credits" (if I may say so) are concerned, we could either say 
 "translated by X & Y", or something along the lines of "Part A translated by X; 
 Part B translated by Y".  No problems for me.  Not going to enter a copyright 
 war. :-))
 
 At any rate, we should discuss possible terminology choices/problems (ie if 
 any arise), so that the article will sound uniform.
 
 -- Salvo

From: Marc van Woerkom <3d@hub.freebsd.org>
To: bartequi@neomedia.it
Cc: marc.vanwoerkom@fernuni-hagen.de,
	freebsd-gnats-submit@FreeBSD.org
Subject: Re: docs/30008: This document should be translated, commented and added
Date: Mon,  8 Oct 2001 10:28:22 -0700 (PDT)

 This sounds like your French is better than mine.
  
 Then I propose the following: could you do a first translation
 while I try to proofread the English and fill the gaps or
 the parts you have no fun doing?
 
 This would yield a "translated by X with a little help from Y" :-)
 
 Regards,
 mArc
 

From: Salvo Bartolotta <bartequi@neomedia.it>
To: 3d@FreeBSD.org, Marc van Woerkom <3d@hub.freebsd.org>
Cc: bartequi@neomedia.it, marc.vanwoerkom@fernuni-hagen.de,
	freebsd-gnats-submit@FreeBSD.org
Subject: Re: docs/30008: This document should be translated, commented and added
Date: Mon, 08 Oct 2001 20:11:24 +0200 (CEST)

 This sounds as if you are pessimistic.  Translating part(s) of the article may 
 be a pleasant exercise.  Nobody is urging you on.  No patent or $$$ involved.  
 :-)
 
 IIUC, you have read [part of] the article, but you haven't yet started 
 translating.  I might begin to attack the first (?) part in the next few days.
 
 Well, spare time is the main issue here *sigh*
 
 -- Salvo

From: Marc Ernst Eddy van Woerkom <Marc.Vanwoerkom@FernUni-Hagen.de>
To: bartequi@neomedia.it
Cc: 3d@FreeBSD.org, 3d@hub.freebsd.org, bartequi@neomedia.it,
	marc.vanwoerkom@FernUni-Hagen.de, freebsd-gnats-submit@FreeBSD.org
Subject: Re: docs/30008: This document should be translated, commented and
         added
Date: Tue, 9 Oct 2001 14:08:03 +0200 (MET DST)

 Starting is the problem. :)
 So I take part two. 
 
 
 By the way, have you seen
 
 http://www.osnews.com/story.php?news_id=153
 
 This is another bit on soft updates that should be kept as well.
 I ask them for reproduction.
 
 Regards,
 Marc

From: Salvo Bartolotta <bartequi@neomedia.it>
To: freebsd-gnats-submit@FreeBSD.org, 3d@FreeBSD.org
Cc:  
Subject: Re: docs/30008: This document should be translated, commented and added
Date: Mon, 22 Apr 2002 21:04:11 +0200 (CEST)

 This message is in MIME format.
 
 ---MOQ1019502250527e5a8eeb16cdbdce7846a30fa5d69c
 Content-Type: text/plain; charset=ISO-8859-1
 Content-Transfer-Encoding: 8bit
 
 Dear FreeBSD doc'ers,
 
 I've translated about one half of the article -- infinite shame on me[*] -- 
 and I'm working on the second half.  Meanwhile I submit this first draft for 
 your review/comments/flames/whatsoever.
 
 [*] See time(6).
 
 ---MOQ1019502250527e5a8eeb16cdbdce7846a30fa5d69c
 Content-Type: text/html; name="index.html"; charset=ISO-8859-1
 Content-Transfer-Encoding: 8bit
 Content-Disposition: inline; filename="index.html"
 
 
 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
 <html>
   <head>
     <meta name="generator" content="HTML Tidy, see www.w3.org">
     <title>Softupdates and Journaling Filesystems</title>
     <meta name="GENERATOR" content=
     "Modular DocBook HTML Stylesheet Version 1.71 ">
     <link rel="NEXT" title="Write Caching and Reboot"
     href="x30.html">
   </head>
 
   <body class="ARTICLE" bgcolor="#FFFFFF" text="#000000" link=
   "#0000FF" vlink="#840084" alink="#0000FF">
     <div class="ARTICLE">
       <div class="TITLEPAGE">
         <h1 class="TITLE"><a name="AEN2">Softupdates and Journaling 
 	  Filesystems
         </a></h1>
 
         <div class="AUTHORGROUP">
           <a name="AEN4"></a>
 
           <h3 class="AUTHOR"><a name="AEN5">Thomas Pornin</a></h3>
 
           <div class="AFFILIATION">
             <div class="ADDRESS">
               <p class="ADDRESS">thomas.pornin@ens.fr</p>
             </div>
           </div>
         </div>
 
         <div>
           <div class="ABSTRACT">
             <a name="AEN12"></a>
 
             <p>5 May 2000</p>
           </div>
         </div>
 
         <div>
           <div class="ABSTRACT">
             <a name="AEN14"></a>
 
             
               <p>This is an introductory paper on the principles of
 		softupdates and filesystem journaling.  It deals 
 		mostly with Linux and the free BSD systems, but it can
 		apply to other operating systems. This is not a reference
 		text.  I wrote it after I had gained insight into the
 		problem; if I made any mistakes anywhere, send me an 
 		e-mail message, and I'll make corrections.  Contact me
 		for release permission.  The original is available (in 
 		html) here:
                     <a href="http://www.di.ens.fr/~pornin/jfs.html"
                     target=
                     "_top">http://www.di.ens.fr/~pornin/jfs.html</a>
             </p>
           </div>
         </div>
         <hr>
       </div>
 
       <div class="SECT1">
         <h1 class="SECT1"><a name="AEN17">1. Introduction</a></h1>
 
         <p>Accidents will happen. Kernel bugs, hardware failures, 
 	 power failures, students fooling around: there are a good number of
 	 causes, and not all of them can be made negligible.  When you manage
 	 a filesystem, you would like to reconcile the following (fairly 
 	 conflicting) goals:</p>
 
         <ol type="1">
           <li>
             <p>it should be fast</p>
           </li>
 
           <li>
             <p>in case of crash, you should lose the least possible 
 	     data</p>
           </li>
 
           <li>
             <p>in case of crash, it should recover as quickly as possible</p>
           </li>
 
           <li>
             <p>in case of crash, it should recover automatically, 
 	     without human intervention (certain sysadmins sleep at 
 	     night)</p>
           </li>
         </ol>
 
         <p>Let's lay our cards on the table: ext2 only fulfils 1.
 	Run in synchronous mode, it fulfils 2 and 4, but not 1 at all.
         The traditional ufs/ffs (BSD, Solaris...)
         fulfils 2 and 4, and does not do very well on 1 in 
 	certain cases (but this is a far less serious limitation
 	than ext2 running in synchronous mode).  Ffs with softupdates 
 	fulfils 1, 2 and 4, that is, it remains safe while running almost 
 	as fast as ext2.  Point 3 is potentially attainable but it is
         still theoretical.  Ext3, or more generally journaling
 	filesystems, fulfil 1-4 naturally, but the cost
         in performance (with respect to point 1) is a little
         higher than that of softupdates, yet it remains acceptable.
         Note that ext3 is still under development: its journaling 
 	of data (see below) halves performance in certain 
 	cases.</p>
       </div>
     </div>
 
     <div class="NAVFOOTER">
       <hr align="LEFT" width="100%">
 
       <table summary="Footer navigation table" width="100%" border=
       "0" cellpadding="0" cellspacing="0">
         <tr>
           <td width="33%" align="left" valign="top">&nbsp;</td>
 
           <td width="34%" align="center" valign="top">&nbsp;</td>
 
           <td width="33%" align="right" valign="top"><a href=
           "x30.html" accesskey="N">Next</a></td>
         </tr>
 
         <tr>
           <td width="33%" align="left" valign="top">&nbsp;</td>
 
           <td width="34%" align="center" valign="top">&nbsp;</td>
 
           <td width="33%" align="right" valign="top">Write Caching 
           and Reboot</td>
         </tr>
       </table>
     </div>
   </body>
 </html>
 
 
 ---MOQ1019502250527e5a8eeb16cdbdce7846a30fa5d69c
 Content-Type: text/html; name="x30.html"; charset=ISO-8859-1
 Content-Transfer-Encoding: 8bit
 Content-Disposition: inline; filename="x30.html"
 
 
 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
 <html>
   <head>
     <meta name="generator" content="HTML Tidy, see www.w3.org">
     <title>Write Caching and Reboot</title>
     <meta name="GENERATOR" content=
     "Modular DocBook HTML Stylesheet Version 1.71 ">
     <link rel="HOME" title=
     "Softupdates and Journaling Filesystems" href=
     "index.html">
     <link rel="PREVIOUS" title=
     "Softupdates and Journaling Filesystems" href=
     "index.html">
     <link rel="NEXT" title=
     "Advanced Fault Tolerant Methods" href=
     "x47.html">
   </head>
 
   <body class="SECT1" bgcolor="#FFFFFF" text="#000000" link=
   "#0000FF" vlink="#840084" alink="#0000FF">
     <div class="NAVHEADER">
       <table summary="Header navigation table" width="100%" border=
       "0" cellpadding="0" cellspacing="0">
         <tr>
           <th colspan="3" align="center">Softupdates and Journaling Filesystems
           </th>
         </tr>
 
         <tr>
           <td width="10%" align="left" valign="bottom"><a href=
           "index.html" accesskey=
           "P">Previous</a></td>
 
           <td width="80%" align="center" valign="bottom">
           </td>
 
           <td width="10%" align="right" valign="bottom"><a href=
           "x47.html" accesskey="N">Next</a></td>
         </tr>
       </table>
       <hr align="LEFT" width="100%">
     </div>
 
     <div class="SECT1">
       <h1 class="SECT1"><a name="AEN30">2. Write Caching and Reboot</a></h1>
 
       <p>When discussing filesystem operation, it is convenient to 
 	distinguish two things: data and metadata. By "data" we mean the
 	content of files.  By "metadata" we mean the content of 
 	directories, block allocation structures, and everything else
 	related to bookkeeping.  Losing metadata is very painful,
 	because it compromises the very structure of the filesystem: 
 	the loss of data may then be significant (failure at point
 	2), and recovery is not necessarily automatic and may require human
 	intervention (failure at point 4, and at point 3 as well, 
 	since human intervention takes time).</p>
 
       <p>For performance reasons, reads and writes should be cached in 
 	memory, that is:</p>
 
       <ul>
         <li>
           <p>every piece of data being read passes through an otherwise
 	   unused area of memory; this way, if the data needs to be read
 	   again and it is still in memory, a disk access is avoided</p>
         </li>
 
         <li>
           <p>every piece of data being written passes through an otherwise
 	   unused area of memory first, which allows the system to group
 	   writes to adjacent areas together; in practice, this speeds
 	   things up significantly.</p>
         </li>
       </ul>
 
       <p>What concerns us here is write caching.  One of 
 	its side effects is that, in case of sudden crash,
 	the last writes (scheduled but not yet performed) are lost,
 	since memory contents are not preserved across reboots.
 	We may thus lose data (annoying, but not too annoying)
 	and metadata (which can be really painful).</p>
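 
       <p>To make the trade-off concrete, here is a minimal sketch in
 	Python of a write-back cache.  The WriteCache class and its method
 	names are invented for this illustration and do not correspond to
 	any kernel interface; the sketch only models the two effects
 	described above, namely writes grouped in block order and pending
 	writes lost on a crash.</p>
 
 <pre class="PROGRAMLISTING">
```python
class WriteCache:
    """Toy write-back cache: writes are buffered, then flushed in block order."""

    def __init__(self):
        self.disk = {}      # block number to contents; survives a crash
        self.pending = {}   # buffered writes; lost on a crash

    def write(self, block, data):
        # Returns at once: nothing has reached the disk yet.
        self.pending[block] = data

    def flush(self):
        # Writing in block order keeps adjacent blocks together.
        order = sorted(self.pending)
        for block in order:
            self.disk[block] = self.pending[block]
        self.pending.clear()
        return order

    def crash(self):
        # Memory contents are not preserved across a reboot.
        self.pending.clear()
```
 </pre>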
 
       <p>Various ways of countering this effect have been developed
 	on different systems.  Two traditional methods first:</p>
 
       <ul>
         <li>
           <p>&Agrave; l'ext2: that's "Linus Torvalds'" way.
 	   
           The problem is not dealt with.  Data are written to disk in
 	  large blocks in order to achieve maximum speed.
           The rest is immaterial: it is performance (and the benchmarks 
 	  published in "Wired" or "PC Expert") that matters.
           This does achieve high performance, and the code 
 	  involved is small and easy to debug.        
 
           When a crash occurs, the question is not "will fsck work?"
           but "Where have I put my backups?" (Linus himself dixit:
 	  "Nobody sensible would think of fsck as an alternative to 
 	  backups"). That is very acceptable for workstation use,
           in which case users are sitting in front of their machines
 	  when they are switched on, and they are accustomed to 
 	  reinstalling every now and then anyway.  Those who 
           seek safety may wish to consider synchronous mode,
           i.e. without write caching.  It has considerable reboot 
 	  tolerance (yet it needs fsck all the same), but each write 
 	  operation laaaaabo(u)rs to complete.</p>
         </li>
 
         <li>
           <p>&Agrave; l'ufs (ffs is the acronym of the latest version 
           of the Unix FileSystem, the first 'f' standing for 'fast',
           according to the method of the much-missed &Eacute;mile 
 	  Cou&eacute;): the system distinguishes data writes from 
 	  metadata writes; the latter are synchronous (without write
 	  caching).  This makes it possible to write a (single) file
 	  as quickly as ext2 does, but creating or removing many 
 	  small files is laborious.  This is the standard behaviour under
 	  Solaris; it is clearly visible when unpacking a source archive
 	  containing a large number of files.  This method is called
           "metadata synchronous update".</p>
         </li>
       </ul>
 
       <p>With both traditional methods, after a crash it is
 	necessary to check that the whole filesystem is consistent,
 	hence an fsck at boot time; and this fsck has to cover the entire
 	filesystem, which takes time on a disk of many GBs.</p>
     </div>
 
     <div class="NAVFOOTER">
       <hr align="LEFT" width="100%">
 
       <table summary="Footer navigation table" width="100%" border=
       "0" cellpadding="0" cellspacing="0">
         <tr>
           <td width="33%" align="left" valign="top"><a href=
           "index.html" accesskey=
           "P">Previous</a></td>
 
           <td width="34%" align="center" valign="top"><a href=
           "index.html" accesskey="H">Summary</a></td>
 
           <td width="33%" align="right" valign="top"><a href=
           "x47.html" accesskey="N">Next</a></td>
         </tr>
 
         <tr>
           <td width="33%" align="left" valign="top">Softupdates and
           Journaling Filesystems</td>
 
           <td width="34%" align="center" valign="top">&nbsp;</td>
 
           <td width="33%" align="right" valign="top">
           Advanced Fault Tolerant Methods</td>
         </tr>
       </table>
     </div>
   </body>
 </html>
 
 
 ---MOQ1019502250527e5a8eeb16cdbdce7846a30fa5d69c--

From: Salvo Bartolotta <bartequi@neomedia.it>
To: freebsd-gnats-submit@FreeBSD.org, 3d@FreeBSD.org
Cc:  
Subject: Re: docs/30008: This document should be translated, commented and added
Date: Sat, 25 May 2002 17:29:10 +0200 (CEST)

 This message is in MIME format.
 
 ---MOQ1022340550bbe564e284cb9c4c0461b687f576a955
 Content-Type: text/plain; charset=ISO-8859-1
 Content-Transfer-Encoding: 8bit
 
 Dear FreeBSD doc'ers,
 
 I've translated the central part (i.e. part III) of the document.  This draft, 
 which I submit for your review/comments/flames/whatever, will (hopefully) give 
 you the gist of Pornin's article.
 
 Although I have benefited from a number of effective suggestions from Giorgos 
 (very kind and helpful, as always), I am nevertheless fully to blame for 
 anything wrong/queer/inconsistent.  Shame on me (if any :-)
 ---MOQ1022340550bbe564e284cb9c4c0461b687f576a955
 Content-Type: text/html; name="x47.html"; charset=ISO-8859-1
 Content-Transfer-Encoding: 8bit
 Content-Disposition: inline; filename="x47.html"
 
 
 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
 <html>
   <head>
     <meta name="generator" content="HTML Tidy, see www.w3.org">
     <title>Advanced Fault Tolerant Methods</title>
     <meta name="GENERATOR" content=
     "Modular DocBook HTML Stylesheet Version 1.71 ">
     <link rel="HOME" title=
     "Softupdates and Journaling Filesystems" href=
     "index.html">
     <link rel="PREVIOUS" title=
     "Write Caching and Reboot" href="x30.html">
     <link rel="NEXT" title="Different Questions" href="x95.html">
   </head>
 
   <body class="SECT1" bgcolor="#FFFFFF" text="#000000" link=
   "#0000FF" vlink="#840084" alink="#0000FF">
     <div class="NAVHEADER">
       <table summary="Header navigation table" width="100%" border=
       "0" cellpadding="0" cellspacing="0">
         <tr>
           <th colspan="3" align="center">Softupdates and Journaling Filesystems</th>
         </tr>
 
         <tr>
           <td width="10%" align="left" valign="bottom"><a href=
           "x30.html" accesskey="P">Previous</a></td>
 
           <td width="80%" align="center" valign="bottom">
           </td>
 
           <td width="10%" align="right" valign="bottom"><a href=
           "x95.html" accesskey="N">Next</a></td>
         </tr>
       </table>
       <hr align="LEFT" width="100%">
     </div>
 
     <div class="SECT1">
       <h1 class="SECT1"><a name="AEN47">3. Advanced Fault Tolerant Methods</a></h1>
 
       <p>Let us specify, incidentally, what the hardware can guarantee:
 	each write of a sector (512 bytes) is atomic, i.e. once it has 
 	been started, it is completed even if the power goes down,
 	the kernel crashes and the processor catches fire.</p>
 
       <div class="SECT2">
         <h2 class="SECT2"><a name="AEN50">3.1. Deferred ordered
         write</a></h2>
 
         <p>First of all, people proposed the "deferred ordered write": 
 	metadata updates are asynchronous, but they are performed in
 	the correct order.  That is, the system quickly returns
 	control to applications, saying "ok, everything is all right,
 	writes have been carried out", but it performs the writes in the 
 	background at disk speed, paying attention to order; e.g. the 
 	creation of numerous files in the same directory actually
 	involves numerous updates of the same disk portion, and the 
 	system can group them together and carry them out in one single
 	access.  Yet metadata updates are ordered, that is, there 
 	are dependencies between various updates: when a file is created
 	in a directory A and then another file is created in the same
 	directory, the second operation needs to take place at the same
 	time as the first one, or after it -- certainly not before.</p>
 
         <p>Deferred ordered writes pose the following problem: it is
 	easy to create cyclic dependencies, which block the system or   
 	else require a non-atomic update, and so a crash at the "wrong"
 	moment puts us in a delicate position.  This is rare, but Murphy 
 	arranges for it to happen.  Typical example: I move a file from
 	directory A to directory B, and, almost simultaneously, I 
 	move a file from directory B to directory A.</p>
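 
         <p>The cycle in this example can be sketched as follows.  This
 	is an illustrative model only.  It assumes, hypothetically, that
 	each directory occupies a single disk block (named dir_A and dir_B
 	here) and that a move must write the new entry in the destination
 	before removing the old entry from the source, so that the file is
 	never lost.  The write_order function is not taken from any real
 	filesystem code; it is a plain topological sort that gives up when
 	it meets a cycle.</p>
 
 <pre class="PROGRAMLISTING">
```python
from collections import deque

def write_order(deps):
    """Return an order in which blocks can be written, or None on a cycle.

    deps maps each block to the set of blocks that must reach disk first.
    """
    blocks = set(deps)
    for before in deps.values():
        blocks |= before
    remaining = {b: set(deps.get(b, ())) for b in blocks}
    # Blocks with nothing left to wait for can be written right away.
    ready = deque(sorted(b for b in blocks if not remaining[b]))
    order = []
    while ready:
        done = ready.popleft()
        order.append(done)
        for b in sorted(blocks):
            if done in remaining[b]:
                remaining[b].discard(done)
                if not remaining[b]:
                    ready.append(b)
    # If some blocks never became ready, they wait on each other: a cycle.
    return order if len(order) == len(blocks) else None

# Moving a file from A to B: dir_B (new entry) must be written before dir_A.
one_move = {"dir_A": {"dir_B"}}
# Adding the opposite move: dir_A must also be written before dir_B.
two_moves = {"dir_A": {"dir_B"}, "dir_B": {"dir_A"}}

print(write_order(one_move))    # ['dir_B', 'dir_A']
print(write_order(two_moves))   # None: no valid order exists
```
 </pre>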
       </div>
 
       <div class="SECT2">
         <h2 class="SECT2"><a name="AEN54">3.2. Softupdates</a></h2>
 
         <p>To get around this, people developed "softupdates".
 	This is derived from a paper by Ganger and Patt (from the 
 	University of Michigan).  The *BSD implementation comes from
 	a certain Mr. McKusick (a key player in the original BSD 
 	project).  As far as I understand, the work was 
 	sponsored by Sun (which is interested in including it in 
 	Solaris), and an agreement was reached: once the code 
 	has been debugged, it will pass to the BSD license, which is 
 	not yet the case at present.  As soon as the change of 
 	license has taken place, FreeBSD will include the code in its
 	kernel by default; currently, it is necessary to recompile the 
 	kernel in order to get softupdates.  I don't know what NetBSD 
 	and OpenBSD will do. Probably the same.</p>
 
         <p>The principle of softupdates consists in maintaining a two-level
 	wait queue; updates arrive in a wait buffer first, and then    
 	they pass, one by one, to a second buffer, where dependencies
 	are checked.  If an update completes a dependency loop, it is
 	sent back to the wait buffer to await better times; the rest
 	of the cycle passes to a list with higher update priority.
 	This algorithm is similar to what CVS does in order to merge
 	various modifications of the same file.</p>
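 
         <p>The two-buffer scheme described above can be sketched as a toy
 	model.  This follows the simplified description in this article,
 	not the actual softupdates implementation; the function names and
 	the update names (a_to_b, b_to_a, touch_c) are invented for the
 	illustration, and the higher-priority list for the rest of the
 	cycle is left out.</p>
 
 <pre class="PROGRAMLISTING">
```python
def closes_loop(update, staged, deps):
    """True if adding `update` to `staged` would complete a dependency cycle."""
    members = set(staged) | {update}
    seen = set()
    stack = [d for d in deps.get(update, ()) if d in members]
    while stack:
        node = stack.pop()
        if node == update:
            return True              # we came back to where we started
        if node in seen:
            continue
        seen.add(node)
        stack.extend(d for d in deps.get(node, ()) if d in members)
    return False

def stage(wait, deps):
    """Move updates from the wait buffer to the second buffer, one by one."""
    staged, deferred = [], []
    for update in wait:
        if closes_loop(update, staged, deps):
            deferred.append(update)  # back to the wait buffer for a later pass
        else:
            staged.append(update)
    return staged, deferred

deps = {"a_to_b": {"b_to_a"}, "b_to_a": {"a_to_b"}, "touch_c": set()}
print(stage(["a_to_b", "touch_c", "b_to_a"], deps))
# (['a_to_b', 'touch_c'], ['b_to_a'])
```
 </pre>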
 
         <p>In fact, softupdates entails this:</p>
 
         <ul>
           <li>
             <p>Good filesystem performance, even when unpacking
 	    numerous small files.  My benchmarks show that unpacking 
 	    the egcs sources takes 10% more time than on ext2, on 
 	    the same machine and on the same portion of the disk -- your
 	    mileage may vary, as the 'Mericans say, with your
 	    hardware.</p>
           </li>
 
           <li>
             <p>Excellent crash tolerance. I would even say that fsck
 	    is guaranteed to recover all by itself, unless the crash
 	    is due to the disk itself (in which case whatever it does
 	    before stopping is immaterial; however, no filesystem
 	    can tolerate that).</p>
           </li>
 
           <li>
             <p>Specifically, the FFS implementation ensures upward 
 	    compatibility: the filesystem is unmounted, then it is 
 	    remounted without softupdates, and this works perfectly. 
 	    That's a painless upgrade.</p>
           </li>
 
           <li>
             <p>Fsck takes a long time, since it has to traverse the entire
 	     filesystem.</p>
           </li>
 
           <li>
             <p>When a file is deleted, its space is not immediately 
 	     freed for reuse; this can take as much as 30 seconds.
 	     This is because the wait buffer is unsorted; therefore the
 	     system does not search the information contained therein
 	     when seeking free blocks for its files; when a file is
 	     deleted, its blocks can thus be reallocated only when  
 	     the related update reaches the second-level buffer.
 	     In practice, it is not a big deal, but you might run into
 	     trouble when you do a "make world" (recompilation and 
 	     reinstallation of the base system, on every good BSD 
 	     system): since the whole system is reinstalled in a short
 	     time, the binaries in /bin, in particular, are deleted, 
 	     and new ones are immediately placed there again, which 
 	     produces "frictional occupation" [Cf. "frictional
 	     unemployment": here English parallels French.  N.o.T.]. If
 	     saturation point is reached, it means trouble. I have run
 	     into this case myself; it is not fiction.  This is 
 	     typical of systems with a small / partition, since it is 
 	     separate from /var, /usr, and /tmp.</p>
           </li>
         </ul>
 
         <p>Let us note that in the case of fsck it is theoretically 
 	possible to accelerate recovery significantly.  Essentially,
  	it would be a matter of performing updates in such a way that,
 	in case of crash, the only inconvenience would be missing 
 	blocks, that is, blocks that had not yet been returned to the free 
 	block pool although they are no longer referenced anywhere in the 
 	filesystem.  In this case, the filesystem could be used again
 	immediately, and fsck could be run in the background.  This 
 	is what would fulfil point 3.  This possibility has been suggested; 
 	I do not know whether it will be carried out, but it clearly should
 	[The feature is already implemented in FreeBSD 5-CURRENT.  N.o.T.].</p>
 
         <p>As a whole, softupdates is a fine mechanism, elegant and 
 	effective. Cf. <a href=
         "http://www.ece.cmu.edu/~ganger/papers/CSE-TR-254-95/"
         target=
         "_top">http://www.ece.cmu.edu/~ganger/papers/CSE-TR-254-95/</a></p>
       </div>
 
       <div class="SECT2">
         <h2 class="SECT2"><a name="AEN73">3.3. Log-structured
         filesystems</a></h2>
 
         <p>There are also "log-structured filesystems".  The idea is 
 	simple: all writes (data and metadata) are done in an
 	uninterrupted flow of operations.  Effort is shifted onto  
 	reading, since finding a piece of data may be rather complicated
 	in such a scheme.  Actually, it is necessary to "garbage-collect"
 	the flow of operations (the log) retrospectively in order to 
 	find the requisite information.  There exist some more or less
 	prototypal implementations for BSD and Linux, named LFS
         (cf <a href=
         "http://collective.cpoint.net/prof/lfs/" target=
         "_top">http://collective.cpoint.net/prof/lfs/</a> for Linux). 
 
         In a filesystem, reads are usually more frequent than writes.
 	This is not the case for what lives in /var/log, where LFS
 	can be practical.  Nevertheless, the use of LFS is marginal.</p>
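 
         <p>The asymmetry between cheap writes and costly reads can be
 	shown with a toy in-memory model.  The names lfs_write and lfs_read
 	are hypothetical, and the sketch ignores everything a real
 	log-structured filesystem does to index and clean its log; it only
 	shows that a write is a single append while a read has to search
 	the log for the newest record.</p>
 
 <pre class="PROGRAMLISTING">
```python
def lfs_write(log, name, data):
    """Every write, data or metadata, is one sequential append."""
    log.append((name, data))

def lfs_read(log, name):
    """Search the log from its end: the newest record for a name wins."""
    for entry_name, data in reversed(log):
        if entry_name == name:
            return data
    return None

log = []
lfs_write(log, "messages", "boot")
lfs_write(log, "maillog", "queue empty")
lfs_write(log, "messages", "boot, login")
print(lfs_read(log, "messages"))   # boot, login
```
 </pre>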
       </div>
 
       <div class="SECT2">
         <h2 class="SECT2"><a name="AEN77">3.4.
         Journaling</a></h2>
 
         <p>Finally, there is journaling.  Journaling is, as it were,
 	"transactional": when the system wants to make a series of
 	updates, it builds a new version of the related metadata in a
 	different place in the filesystem; then, when this new version
 	(called "transaction", a concept connected with databases) is 
 	ready, it switches to the new version in one "atomic" 
 	operation [atomic relates to "atomos", a Greek word meaning 
 	"indivisible".  Here it indicates that the system switches to
 	the new version (when it is ready) in one single operation, 
 	therefore preventing any possible data corruption or "intermediate" 
 	states.  N.o.T.].  Thus the filesystem is always in a consistent
 	state.</p>
 
         <p>To be more precise [warning: several technical details follow. 
 	N.o.T.]: when metadata updates need to be performed, the new 
 	version is built in a particular region of the disk, namely the
 	journal.</p>  
 
 	<p>Incidentally, in ext3, the journal is a file like 
 	any other, referenced by a special superblock field.  The 
 	final version of ext3 will automatically create the journal
 	if it is not present, and will not show it in the filesystem,
 	which will prevent its accidental deletion.</p>
 
 	<p>The preparation of the new version entails the inclusion of
 	all the requisite items; in particular, if there are any circular
 	dependencies, the whole cycle is included.  Once the new version
 	is ready, a commit operation is performed: the "good" sector 
 	is modified so as to point to the new version instead of the 
 	old one.  Next, the new version is copied over the old one,
 	and a second commit is performed to free its place in the 
 	journal.</p>
 
 	<p>As a side note, you could simply consider marking the journal
 	modified, but this would fragment it too much; since it is 
 	always used, this is not desirable at all.</p>
 
         <p>In case of accidental crash, recovery is necessary, 
 	which consists in traversing the journal in order to:</p>
 
         <ul>
           <li>
             <p>discard transactions not yet finished.</p>
           </li>
 
           <li>
             <p>finish copying transactions for which the first commit, 
             but not the second, has been performed.</p>
           </li>
         </ul>
 
         <p>Since the journal is typically 100 times smaller than the 
 	filesystem, recovery is very fast (it's the difference between
 	30 seconds and an hour).</p>
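 
         <p>The recovery pass described above can be sketched as follows.
 	This is a toy model under stated assumptions: the disk is a
 	Python dictionary, and each journal entry is a dictionary whose
 	keys ("blocks", "first_commit", "second_commit") are invented for
 	the illustration.  It does not reflect the on-disk format of ext3
 	or of any other journaling filesystem.</p>
 
 <pre class="PROGRAMLISTING">
```python
def recover(disk, journal):
    """Replay a toy journal after a crash and return the repaired disk.

    disk maps block numbers to contents.  Each journal entry has:
      "blocks"         new contents, keyed by block number
      "first_commit"   True once the new version is complete
      "second_commit"  True once it has been copied over the old version
    """
    for txn in journal:
        if not txn["first_commit"]:
            continue                      # unfinished transaction: discard
        if not txn["second_commit"]:
            disk.update(txn["blocks"])    # finish the copy (safe to repeat)
    journal.clear()                       # all journal space is free again
    return disk
```
 </pre>
 
         <p>Only the journal is traversed, never the whole disk, which is
 	why the cost of this pass depends on the size of the journal.</p>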
 
         <p>You will notice that, at the end of the process, each piece
 	of metadata is written twice (and read once, but memory buffers 
 	are nevertheless useful in this instance).  In the case of 
 	metadata, that is not a serious issue, since metadata is small,
 	so it is the time to move the disk heads [i.e. seek latency] 
 	that is important.  Since everything works asynchronously 
 	between two commits, the kernel optimizes this sort of thing
 	very well. That's why a journaling filesystem is (nearly) as 
 	fast as an FFS with softupdates (in fact, it can be shown 
 	that softupdates remains faster so long as the system has a 
 	good amount of memory, but the reverse applies when it swaps
 	heavily), the difference in speed being very small, smaller
 	than that between ext2 and ffs/softupdates.</p>
 
         <p>In its present form (0.0.2d), ext3 is also a journaling 
         filesystem.  In its case, the problem of double writes is 
         noticeable [ext3 journals both data and metadata.  N.o.T.], 
         and solving it would halve the time needed to write a file.  
         This problem should be solved in a later version (0.0.4 in 
         theory; in fact, the code already exists, but has not been 
         tested enough to be enabled with reasonable safety).  There 
         are various safety issues to take into account.  When a 
         transaction that has not yet been committed is rejected, 
         problems may arise if its blocks have already begun to fill 
         with data from another file.  It is rather difficult to 
         recover pieces of a privileged file from within another.  
         Stephen Tweedie (the developer of ext3) says that he has 
         thought about this, and that the necessary framework has 
         already been put in place.</p>
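
         <p>A rough calculation shows why the double write matters for
         data but hardly at all for metadata.  The sketch below assumes
         a toy journal in which every journaled byte is written twice
         (once to the journal, once to its final location); the file
         size and the amount of metadata are made-up figures, not
         measurements of ext3.</p>

```python
def bytes_written(data_bytes, metadata_bytes, journal_data):
    """Total bytes sent to disk for one file under a toy two-copy journal.

    Journaled items are written twice (journal, then final location);
    everything else is written once.
    """
    journaled = metadata_bytes + (data_bytes if journal_data else 0)
    return data_bytes + metadata_bytes + journaled


data = 10 * 1024 * 1024   # a 10 MiB file (assumed size)
meta = 8 * 1024           # 8 KiB of metadata updates (assumed size)

baseline = data + meta    # no journal at all
full = bytes_written(data, meta, journal_data=True)
meta_only = bytes_written(data, meta, journal_data=False)

print(f"data and metadata journaled: {full / baseline:.2f} times the baseline")
print(f"metadata only journaled:     {meta_only / baseline:.4f} times the baseline")
```

         <p>With these figures, journaling the data as well doubles the
         volume written, while journaling only the metadata adds well
         under one percent.</p>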
 
         <p>There are other journaling filesystems apart from ext3.
         Linux has ReiserFS, whose latest version includes a journaling
         layer that handles only metadata.  Reiser, its author, has been 
         heard ranting about a new form of super-journaling which cleans
         all this up; it is to be included in the next version of 
         ReiserFS.  Apart from those, at least two other operating 
         systems have had journaling filesystems in their "production" 
         versions for some time: Tru64 (the former OSF, Digital/Compaq's 
         Unix for Alpha) has advfs, which works rather well, and 
         Windows NT has ntfs.  The latter has been around for at least 
         five years and is really robust [fortunately, since NT has a 
         tendency to crash often - N.o.A.].  Furthermore, SGI is porting 
         its journaling filesystem (XFS) to Linux and is beginning to 
         distribute the code under the GPL; IBM has joined in as well, 
         with its JFS (which comes from AIX).</p>
 
         <p>Journaling makes it possible to attain points 1 to 4.
         In addition, ext3 remains compatible with ext2: an ext3
         filesystem can be unmounted and then remounted as ext2, and 
         this works seamlessly.  In my opinion, ext3 will be superior 
         to softupdates once pure metadata journaling has been 
         implemented, unless "background fsck" has been set up for 
         softupdates [it actually is, under FreeBSD 5.0-CURRENT.  
         N.o.T.].  There might be other factors that make a difference, 
         though.  For example, ext2/3 is simpler and requires less CPU 
         and code in order to run; but ffs has a better directory 
         structure (binary tree instead of a linear list), which speeds 
         up write access to directories containing a large number of 
         files (e.g. a traditional news spool).  Even here, the OS plays
         an important role, since Linux tends to smooth over certain 
         difficulties thanks to the dcache [Linux's VFS layer maintains 
         a cache of currently active and recently used names.  This 
         cache is referred to as the dcache.  N.o.T.].</p>
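
         <p>The general point about large directories can be 
         illustrated by counting comparisons.  The sketch below is not 
         the on-disk format of either filesystem: it simply compares a 
         scan of an unsorted-style linear list with a binary search 
         over sorted names, using a made-up spool directory of 10,000 
         entries.</p>

```python
def linear_lookup(names, target):
    """Scan the entries in order and count the comparisons made."""
    comparisons = 0
    for name in names:
        comparisons += 1
        if name == target:
            return True, comparisons
    return False, comparisons


def sorted_lookup(sorted_names, target):
    """Binary search over sorted entries, counting the probes made."""
    lo, hi, probes = 0, len(sorted_names), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_names[mid] == target:
            return True, probes
        if sorted_names[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return False, probes


# Zero-padded names, so the list is already in sorted order.
names = [f"article{n:05d}" for n in range(10_000)]
target = names[-1]  # worst case for the linear scan

found_linear, linear_cost = linear_lookup(names, target)
found_sorted, sorted_cost = sorted_lookup(names, target)
print(f"linear list:  {linear_cost} comparisons")
print(f"sorted index: {sorted_cost} probes")
```

         <p>The linear scan has to look at every entry to reach the last 
         one, whereas the binary search needs no more than 14 probes 
         for a directory of this size.</p>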
 
         <p>Journaling is also an elegant means of not losing one's 
         metadata.  I am very fond of its transactional features.  
         Besides, I have been using ext3 for all my partitions 
         (except /tmp) for several months without any problems.  
         Here too, <i class="EMPHASIS">your mileage may vary</i>.</p>
       </div>
     </div>
 
   </body>
 </html>
 
 
 ---MOQ1022340550bbe564e284cb9c4c0461b687f576a955--
State-Changed-From-To: analyzed->closed 
State-Changed-By: eadler 
State-Changed-When: Sat Feb 4 16:25:11 UTC 2012 
State-Changed-Why:  
This text has changed considerably since the PR was filed. If it 
still needs translation, updates, etc., please contact the French 
Documentation team and email doc@ with offers to help. The bug tracker 
doesn't work for things like this. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=30008 
>Unformatted:
