From: Miles Bader
To: Catalin Marinas
Cc: gnu-arch-users@gnu.org, Paul Mundt, Cliff Brake, Miika Komu, miles@gnu.org
Reply-To: snogglethorpe@gmail.com, miles@gnu.org
Date: Sat, 5 Feb 2005 20:52:01 +0900
Subject: Re: [Gnu-arch-users] Re: arch performance with large trees

On Sat, 05 Feb 2005 09:06:10 +0000, Catalin Marinas wrote:
> Ah, I thought I would see something like old_file -> new_file in the
> log. The patch available through BKCVS (and
> http://linux.bkbits.net:8080/linux-2.5/user=miles/gnupatch@3f182a18fK70hG0Z8Oi4JmJ34CkUZA)
> does a full delete/add of these files so the information would need to
> be retrieved from bkbits.net. Larry McVoy stated that it is OK to
> retrieve some meta information but not the patch itself. You would
> actually need to access the file diff from bkbits.net to get this
> information

Hmmm.... Looking at that file, there seem to be enough hints in the
header comment to accurately guess rename info without looking at
bkbits:

Truly deleted files look like this:

   # BitKeeper/deleted/.del-rte_ma1_cb-ksram.ld~f045845fc65842ff
   # 2002/12/20 22:18:52-08:00 miles@lsi.nec.co.jp +0 -0
   # Delete: arch/v850/rte_ma1_cb-ksram.ld
   # ...
   diff -Nru a/arch/v850/rte_ma1_cb-ksram.ld b/arch/v850/rte_ma1_cb-ksram.ld
   --- a/arch/v850/rte_ma1_cb-ksram.ld 2005-02-05 03:33:05 -08:00
   +++ /dev/null Wed Dec 31 16:00:00 196900
   @@ -1,157 +0,0 @@

Truly added files look like this:

   # include/asm-v850/v850e_uarta.h
   # 2003/07/18 10:10:42-07:00 miles@lsi.nec.co.jp +0 -0
   # BitKeeper file /home/torvalds/v2.5/linux/include/asm-v850/v850e_uarta.h
   #
   # include/asm-v850/v850e_uarta.h
   # 2003/07/18 10:10:42-07:00 miles@lsi.nec.co.jp +278 -0
   ...
   diff -Nru a/include/asm-v850/v850e_uarta.h b/include/asm-v850/v850e_uarta.h
   --- /dev/null Wed Dec 31 16:00:00 196900
   +++ b/include/asm-v850/v850e_uarta.h 2005-02-05 01:13:48 -08:00
   @@ -0,0 +1,278 @@

Note there are two header comment entries for the same file, and the
line count in the diff's single hunk matches the count in one of them
(the other is zero).

Renamed files look like this:

   # drivers/serial/v850e_uart.c
   # 2003/07/16 19:04:06-07:00 miles@lsi.nec.co.jp +215 -276
   # Refactor v850 UART driver
   ...
   diff -Nru a/drivers/serial/v850e_uart.c b/drivers/serial/v850e_uart.c
   --- /dev/null Wed Dec 31 16:00:00 196900
   +++ b/drivers/serial/v850e_uart.c 2005-02-05 01:13:48 -08:00
   @@ -0,0 +1,549 @@
   ...
   diff -Nru a/drivers/serial/nb85e_uart.c b/drivers/serial/nb85e_uart.c
   --- a/drivers/serial/nb85e_uart.c 2005-02-05 01:13:48 -08:00
   +++ /dev/null Wed Dec 31 16:00:00 196900
   @@ -1,610 +0,0 @@

Note that you don't get the old name of the file, but the line count
difference (549 - 610 = -61) is the same as the sum of the +/- numbers
in the new file's header comment entry (+215 + -276 = -61).

So you can at least distinguish renamed files from added/deleted
files, and know the new name of renamed files; you can dramatically
narrow the set of candidates for the old name of renamed files by
looking at the line counts.

I suspect in practice the only time you're going to get multiple
candidates for the old name of a particular renamed file is when
multiple identical files (e.g.
boilerplate) are being renamed in the same patch; in that case a
simple heuristic like "closest path prefix" will probably do a good
job of finding the right one (e.g., that will correctly handle the
case where the same file in multiple architectures is being renamed
identically).

-Miles
--
Do not taunt Happy Fun Ball.
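
P.S. A rough sketch of the matching step described above, in Python.
It assumes the header comment entries and the diffs have already been
parsed into plain dictionaries; the function names and the input
layout are made up for illustration and are not part of BitKeeper,
BKCVS, or arch. Treat it as a starting point, not a tested importer.

```python
from typing import Dict, Optional, Tuple


def common_prefix_len(a: str, b: str) -> int:
    """Count the leading path components shared by two paths."""
    n = 0
    for x, y in zip(a.split("/"), b.split("/")):
        if x != y:
            break
        n += 1
    return n


def guess_renames(
    header_counts: Dict[str, Tuple[int, int]],
    created: Dict[str, int],
    deleted: Dict[str, int],
) -> Dict[str, Optional[str]]:
    """Guess the old name of each file whose diff starts from /dev/null.

    header_counts: path -> (added, removed) from the header comment
                   (hypothetical layout; for truly added files, pass
                   the non-zero entry of the two)
    created:       path -> line count of the all-'+' hunk
    deleted:       path -> line count of the all-'-' hunk
    Returns path -> old path, or None when no rename is detected.
    """
    result: Dict[str, Optional[str]] = {}
    unclaimed = dict(deleted)  # each deleted file is matched at most once
    for new_path, new_lines in created.items():
        added, removed = header_counts[new_path]
        if removed == 0 and added == new_lines:
            result[new_path] = None  # truly added file
            continue
        delta = added - removed
        # Old-name candidates: deleted files whose size differs from
        # the new file by exactly the header's net line change.
        candidates = [
            old for old, old_lines in unclaimed.items()
            if new_lines - old_lines == delta
        ]
        if not candidates:
            result[new_path] = None  # nothing matches; leave unresolved
            continue
        # Tie-break with the "closest path prefix" heuristic.
        best = max(candidates, key=lambda old: common_prefix_len(old, new_path))
        result[new_path] = best
        del unclaimed[best]
    return result


# Numbers taken from the examples in this message.
renames = guess_renames(
    header_counts={
        "drivers/serial/v850e_uart.c": (215, 276),
        "include/asm-v850/v850e_uarta.h": (278, 0),
    },
    created={
        "drivers/serial/v850e_uart.c": 549,
        "include/asm-v850/v850e_uarta.h": 278,
    },
    deleted={
        "drivers/serial/nb85e_uart.c": 610,
        "arch/v850/rte_ma1_cb-ksram.ld": 157,
    },
)
print(renames)
```

With the numbers above, v850e_uart.c pairs with nb85e_uart.c because
549 - 610 equals 215 - 276, and v850e_uarta.h is left as a plain add.
A file that is both renamed and rewritten so that "removed" is zero
and "added" equals its full length would be misread as an add; that
case seems unlikely enough to ignore in a first pass.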