Path: g2news1.google.com!news3.google.com!news4.google.com!newshub.sdsu.edu!
erode.bofh.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Kernel SCM saga..
Date: Wed, 06 Apr 2005 17:50:10 +0200
Message-ID: <3QkX8-7i5-9@gated-at.bofh.it>
X-Original-To: Kernel Mailing List <linux-ker...@vger.kernel.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 79
Organization: linux.* mail to news gateway
X-Original-Date: Wed, 6 Apr 2005 08:42:08 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504060800280.2215@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org


Ok,
 as a number of people are already aware (and in some cases have been
aware over the last several weeks), we've been trying to work out a
conflict over BK usage over the last month or two (and it feels like
longer ;). That hasn't been working out, and as a result, the kernel team
is looking at alternatives.

[ And apparently this just hit slashdot too, so by now _everybody_ knows ]

It's not like my choice of BK has been entirely conflict-free ("No,
really? Do tell! Oh, you mean the gigabytes upon gigabytes of flames we
had?"), so in some sense this was inevitable, but I sure had hoped that it
would have happened only once there was a reasonable open-source
alternative. As it is, we'll have to scramble for a while.

Btw, don't blame BitMover, even if that's probably going to be a very
common reaction. Larry in particular really did try to make things work
out, but it got to the point where I decided that I don't want to be in
the position of trying to hold two pieces together that would need as much
glue as it seemed to require.

We've been using BK for three years, and in fact, the biggest problem
right now is that a number of people have gotten very very picky about
their tools after having used the best. Me included, but in fact the
people that got helped most by BitKeeper usage were often the people
_around_ me who had a much easier time merging with my tree and sending
their trees to me.

Of course, there's also probably a ton of people who just used BK as a
nicer (and much faster) "anonymous CVS" client. We'll get that sorted out,
but the immediate problem is that I'm spending most of my time trying to see
what the best way to co-operate is.

NOTE! BitKeeper isn't going away per se. Right now, the only real thing
that has happened is that I've decided to not use BK mainly because I need
to figure out the alternatives, and rather than continuing "things as
normal", I decided to bite the bullet and just see what life without BK
looks like. So far it's a gray and bleak world ;)

So don't take this to mean anything more than it is. I'm going to be
effectively off-line for a week (think of it as a normal "Linus went on a
vacation" event) and I'm just asking that people who continue to maintain
BK trees at least try to also make sure that they can send me the result
as (individual) patches, since I'll eventually have to merge some other
way.

That "individual patches" is one of the keywords, btw. One thing that BK 
has been extremely good at, and that a lot of people have come to like 
even when they didn't use BK, is how we've been maintaining a much finer- 
granularity view of changes. That isn't going to go away. 

In fact, one impact BK has had is to very fundamentally make us (and me in
particular) change how we do things. That ranges from the fine-grained
changeset tracking to just how I ended up trusting submaintainers with
much bigger things, and not having to work on a patch-by-patch basis any
more. So the three years with BK are definitely not wasted: I'm convinced 
it caused us to do things in better ways, and one of the things I'm 
looking at is to make sure that those things continue to work.

So I just wanted to say that I'm personally very happy with BK, and with 
Larry. It didn't work out, but it sure as hell made a big difference to 
kernel development. And we'll work out the temporary problem of having to 
figure out a set of tools to allow us to continue to do the things that BK 
allowed us to do.

Let the flames begin.

		Linus

PS. Don't bother telling me about subversion. If you must, start reading
up on "monotone". That seems to be the most viable alternative, but don't
pester the developers so much that they don't get any work done. They are
already aware of my problems ;)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: g2news1.google.com!news1.google.com!proxad.net!213.200.89.82.MISMATCH!
tiscali!newsfeed1.ip.tiscali.net!news.cid.net!bofh.it!news.nic.it!robomod
From: Greg KH <g...@kroah.com>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Wed, 06 Apr 2005 18:10:09 +0200
Message-ID: <3Qlgt-7FZ-5@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.8i
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 24
Organization: linux.* mail to news gateway
X-Original-Cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Wed, 6 Apr 2005 09:00:41 -0700
X-Original-Message-ID: <20050406160041.GA28839@kroah.com>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

On Wed, Apr 06, 2005 at 08:42:08AM -0700, Linus Torvalds wrote:
> 
> So I just wanted to say that I'm personally very happy with BK, and with 
> Larry. It didn't work out, but it sure as hell made a big difference to 
> kernel development. And we'll work out the temporary problem of having to 
> figure out a set of tools to allow us to continue to do the things that BK 
> allowed us to do.

I'd also like to publicly say that BK has helped out immensely in the
past few years with kernel development, and has been one of the main
reasons we have been able to keep up such a high patch rate over such a
long period of time.  Larry, and his team, have been nothing but great
in dealing with all of the crap that we have been flinging at him due to
the very odd demands such a large project as the kernel has caused.  And
I definitely owe him a beer the next time I see him.

thanks,

greg k-h

Path: g2news1.google.com!news4.google.com!news.glorb.com!nntp-server.pubsub.com!
bofh.it!news.nic.it!robomod
From: Paul Mackerras <pau...@samba.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 09:00:30 +0200
Message-ID: <3Qza6-3P7-17@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: VM 7.19 under Emacs 21.4.1
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 39
Organization: linux.* mail to news gateway
X-Original-Cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Thu, 7 Apr 2005 16:51:23 +1000
X-Original-Message-ID: <16980.55403.190197.751840@cargo.ozlabs.ibm.com>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

Linus,

> That "individual patches" is one of the keywords, btw. One thing that BK 
> has been extremely good at, and that a lot of people have come to like 
> even when they didn't use BK, is how we've been maintaining a much finer- 
> granularity view of changes. That isn't going to go away. 

Are you happy with processing patches + descriptions, one per mail?
Do you have it automated to the point where processing emailed patches
involves little more overhead than doing a bk pull?  If so, then your
mailbox (or patch queue) becomes a natural serialization point for the
changes, and the need for a tool that can handle a complex graph of
changes is much reduced.

> In fact, one impact BK has had is to very fundamentally make us (and me in
> particular) change how we do things.

From my point of view, the benefits that flowed from your using BK
were:

* Visibility into what you had accepted and committed to your
  repository
* Lower latency of patches going into your repository
* Much reduced rate of patches being dropped

Those things are what have enabled us PPC developers to move away from
having our own trees (with all the synchronization problems that
entailed) and work directly with your tree.  I don't see that it is
the distinctive features of BK (such as the ability to do merges
between peer repositories) that are directly responsible for producing
those benefits, so I have hope that things can work just as well with
some other system.

Paul.

Path: g2news1.google.com!news1.google.com!proxad.net!news.newsland.it!
area.cu.mi.it!bofh.it!news.nic.it!robomod
From: David Woodhouse <dw...@infradead.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 09:30:30 +0200
Message-ID: <3QzD8-4AX-31@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
Content-Type: text/plain
MIME-Version: 1.0
X-Mailer: Evolution 2.2.1.1 (2.2.1.1-2) 
Content-Transfer-Encoding: 7bit
X-Spam-Score: 0.0 (/)
X-Srs-Rewrite: SMTP reverse-path rewritten from <dw...@infradead.org> 
by pentafluge.infradead.org
	See http://www.infradead.org/rpr.html
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 19
Organization: linux.* mail to news gateway
X-Original-Cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Thu, 07 Apr 2005 08:18:50 +0100
X-Original-Message-ID: <1112858331.6924.17.camel@localhost.localdomain>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

On Wed, 2005-04-06 at 08:42 -0700, Linus Torvalds wrote:
> PS. Don't bother telling me about subversion. If you must, start reading
> up on "monotone". That seems to be the most viable alternative, but don't
> pester the developers so much that they don't get any work done. They are
> already aware of my problems ;)

One feature I'd want to see in a replacement version control system is
the ability to _re-order_ patches, and to cherry-pick patches from my
tree to be sent onwards. The lack of that capability is the main reason
I always hated BitKeeper.

-- 
dwmw2


Path: g2news1.google.com!news3.google.com!newshub.sdsu.edu!erode.bofh.it!
bofh.it!news.nic.it!robomod
From: Andrew Morton <a...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 11:00:17 +0200
Message-ID: <3QB21-6pm-19@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QzD8-4AX-31@gated-at.bofh.it>
X-Original-To: David Woodhouse <dw...@infradead.org>
X-Mailer: Sylpheed version 1.0.0 (GTK+ 1.2.10; i386-vine-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 20
Organization: linux.* mail to news gateway
X-Original-Cc: torva...@osdl.org, linux-ker...@vger.kernel.org
X-Original-Date: Thu, 7 Apr 2005 01:50:19 -0700
X-Original-Message-ID: <20050407015019.4563afe0.akpm@osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
	<1112858331.6924.17.ca...@localhost.localdomain>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

David Woodhouse <dw...@infradead.org> wrote:
>
> One feature I'd want to see in a replacement version control system is
>  the ability to _re-order_ patches, and to cherry-pick patches from my
>  tree to be sent onwards.

You just described quilt & patch-scripts.

The problem with those is letting other people get access to it.  I guess
that could be fixed with a bit of scripting and rsyncing.

(I don't do that for -mm because -mm basically doesn't work for 99% of the
time.  Takes 4-5 hours to get a release out assuming that nothing's busted,
and usually something is).


Path: g2news1.google.com!news3.google.com!news2.google.com!news.maxwell.syr.edu!
newsfeed.icl.net!newsfeed.fjserv.net!feed.news.tiscali.de!newsfeed01.sul.t-online.de!
newsfeed00.sul.t-online.de!t-online.de!bofh.it!news.nic.it!robomod
From: Paul Mackerras <pau...@samba.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 11:30:20 +0200
Message-ID: <3QBv6-6Sr-7@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QzD8-4AX-31@gated-at.bofh.it> 
<3QB21-6pm-19@gated-at.bofh.it>
X-Original-To: Andrew Morton <a...@osdl.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: VM 7.19 under Emacs 21.4.1
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 23
Organization: linux.* mail to news gateway
X-Original-Cc: David Woodhouse <dw...@infradead.org>, torva...@osdl.org,
	linux-ker...@vger.kernel.org
X-Original-Date: Thu, 7 Apr 2005 19:20:04 +1000
X-Original-Message-ID: <16980.64324.87931.513333@cargo.ozlabs.ibm.com>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
	<1112858331.6924.17.ca...@localhost.localdomain>
	<20050407015019.4563afe0.a...@osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

Andrew Morton writes:

> The problem with those is letting other people get access to it.  I guess
> that could be fixed with a bit of scripting and rsyncing.

Yes.

> (I don't do that for -mm because -mm basically doesn't work for 99% of the
> time.  Takes 4-5 hours to get a release out assuming that nothing's busted,
> and usually something is).

With -mm we get those nice little automatic emails saying you've put
the patch into -mm, which removes one of the main reasons for wanting
to be able to get an up-to-date image of your tree.  The other reason,
of course, is to be able to see if a patch I'm about to send conflicts
with something you have already taken, and rebase it if necessary.

Paul.

Path: g2news1.google.com!news3.google.com!newshub.sdsu.edu!erode.bofh.it!
bofh.it!news.nic.it!robomod
From: David Woodhouse <dw...@infradead.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 11:30:26 +0200
Message-ID: <3QBvc-6Sr-25@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QzD8-4AX-31@gated-at.bofh.it> 
<3QB21-6pm-19@gated-at.bofh.it>
X-Original-To: Andrew Morton <a...@osdl.org>
Content-Type: text/plain
MIME-Version: 1.0
X-Mailer: Evolution 2.0.4 (2.0.4-1.dwmw2.1) 
Content-Transfer-Encoding: 7bit
X-Spam-Score: 0.0 (/)
X-Srs-Rewrite: SMTP reverse-path rewritten from <dw...@infradead.org> 
by pentafluge.infradead.org
	See http://www.infradead.org/rpr.html
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 21
Organization: linux.* mail to news gateway
X-Original-Cc: torva...@osdl.org, linux-ker...@vger.kernel.org
X-Original-Date: Thu, 07 Apr 2005 10:25:18 +0100
X-Original-Message-ID: <1112865919.24487.442.camel@hades.cambridge.redhat.com>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
	 <1112858331.6924.17.ca...@localhost.localdomain>
	 <20050407015019.4563afe0.a...@osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

On Thu, 2005-04-07 at 01:50 -0700, Andrew Morton wrote:
> (I don't do that for -mm because -mm basically doesn't work for 99% of
> the time.  Takes 4-5 hours to get a release out assuming that
> nothing's busted, and usually something is).

On the subject of -mm: are you going to keep doing the BK imports to
that for the time being, or would it be better to leave the BK trees
alone now and send you individual patches?

For that matter, will there be a brief amnesty after 2.6.12 where Linus
will use BK to pull those trees which were waiting for that, or will we
all need to export from BK manually?

-- 
dwmw2


Path: g2news1.google.com!news1.google.com!proxad.net!news.newsland.it!
news.ngi.it!bofh.it!news.nic.it!robomod
From: Andrew Morton <a...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 12:00:19 +0200
Message-ID: <3QBY7-7bC-7@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QzD8-4AX-31@gated-at.bofh.it> 
<3QB21-6pm-19@gated-at.bofh.it> <3QBv6-6Sr-7@gated-at.bofh.it>
X-Original-To: Paul Mackerras <pau...@samba.org>
X-Mailer: Sylpheed version 1.0.0 (GTK+ 1.2.10; i386-vine-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 40
Organization: linux.* mail to news gateway
X-Original-Cc: dw...@infradead.org, torva...@osdl.org,
	linux-ker...@vger.kernel.org
X-Original-Date: Thu, 7 Apr 2005 02:46:05 -0700
X-Original-Message-ID: <20050407024605.35515dcc.akpm@osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
	<1112858331.6924.17.ca...@localhost.localdomain>
	<20050407015019.4563afe0.a...@osdl.org>
	<16980.64324.87931.513...@cargo.ozlabs.ibm.com>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

Paul Mackerras <pau...@samba.org> wrote:
>
> With -mm we get those nice little automatic emails saying you've put
>  the patch into -mm, which removes one of the main reasons for wanting
>  to be able to get an up-to-date image of your tree.

Should have done that ages ago..

>  The other reason,
>  of course, is to be able to see if a patch I'm about to send conflicts
>  with something you have already taken, and rebase it if necessary.

<hack, hack>

How's this?


This is a note to let you know that I've just added the patch titled

     ppc32: Fix AGP and sleep again

to the -mm tree.  Its filename is

     ppc32-fix-agp-and-sleep-again.patch

Patches currently in -mm which might be from yourself are

add-suspend-method-to-cpufreq-core.patch
ppc32-fix-cpufreq-problems.patch
ppc32-fix-agp-and-sleep-again.patch
ppc32-fix-errata-for-some-g3-cpus.patch
ppc64-fix-semantics-of-__ioremap.patch
ppc64-improve-mapping-of-vdso.patch
ppc64-detect-altivec-via-firmware-on-unknown-cpus.patch
ppc64-remove-bogus-f50-hack-in-promc.patch

Path: g2news1.google.com!news3.google.com!newshub.sdsu.edu!erode.bofh.it!
bofh.it!news.nic.it!robomod
From: Andrew Morton <a...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 12:00:24 +0200
Message-ID: <3QBYc-7bC-33@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QzD8-4AX-31@gated-at.bofh.it> 
<3QB21-6pm-19@gated-at.bofh.it> <3QBvc-6Sr-25@gated-at.bofh.it>
X-Original-To: David Woodhouse <dw...@infradead.org>
X-Mailer: Sylpheed version 1.0.0 (GTK+ 1.2.10; i386-vine-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 26
Organization: linux.* mail to news gateway
X-Original-Cc: torva...@osdl.org, linux-ker...@vger.kernel.org
X-Original-Date: Thu, 7 Apr 2005 02:49:12 -0700
X-Original-Message-ID: <20050407024912.1c8c445b.akpm@osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
	<1112858331.6924.17.ca...@localhost.localdomain>
	<20050407015019.4563afe0.a...@osdl.org>
	<1112865919.24487.442.ca...@hades.cambridge.redhat.com>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

David Woodhouse <dw...@infradead.org> wrote:
>
> On Thu, 2005-04-07 at 01:50 -0700, Andrew Morton wrote:
> > (I don't do that for -mm because -mm basically doesn't work for 99% of
> > the time.  Takes 4-5 hours to get a release out assuming that
> > nothing's busted, and usually something is).
> 
> On the subject of -mm: are you going to keep doing the BK imports to
> that for the time being, or would it be better to leave the BK trees
> alone now and send you individual patches?

I really don't know - I'll continue to pull the bk trees for a while, until
we work out what the new (probably interim) regime looks like.

> For that matter, will there be a brief amnesty after 2.6.12 where Linus
> will use BK to pull those trees which were waiting for that, or will we
> all need to export from BK manually?
> 

I think Linus has stopped using bk already.


Path: g2news1.google.com!news1.google.com!proxad.net!news.newsland.it!
area.cu.mi.it!bofh.it!news.nic.it!robomod
From: Russell King <rmk+l...@arm.linux.org.uk>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 12:10:10 +0200
Message-ID: <3QC7E-7sg-17@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QzD8-4AX-31@gated-at.bofh.it> 
<3QB21-6pm-19@gated-at.bofh.it> <3QBvc-6Sr-25@gated-at.bofh.it>
X-Original-To: David Woodhouse <dw...@infradead.org>
Mail-Followup-To: David Woodhouse <dw...@infradead.org>,
	Andrew Morton <a...@osdl.org>, torva...@osdl.org,
	linux-ker...@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 34
Organization: linux.* mail to news gateway
X-Original-Cc: Andrew Morton <a...@osdl.org>, torva...@osdl.org,
	linux-ker...@vger.kernel.org
X-Original-Date: Thu, 7 Apr 2005 10:55:31 +0100
X-Original-Message-ID: <20050407105531.A19605@flint.arm.linux.org.uk>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org> 
<1112858331.6924.17.ca...@localhost.localdomain> 
<20050407015019.4563afe0.a...@osdl.org> 
<1112865919.24487.442.ca...@hades.cambridge.redhat.com>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

On Thu, Apr 07, 2005 at 10:25:18AM +0100, David Woodhouse wrote:
> On Thu, 2005-04-07 at 01:50 -0700, Andrew Morton wrote:
> > (I don't do that for -mm because -mm basically doesn't work for 99% of
> > the time.  Takes 4-5 hours to get a release out assuming that
> > nothing's busted, and usually something is).
> 
> On the subject of -mm: are you going to keep doing the BK imports to
> that for the time being, or would it be better to leave the BK trees
> alone now and send you individual patches?
> 
> For that matter, will there be a brief amnesty after 2.6.12 where Linus
> will use BK to pull those trees which were waiting for that, or will we
> all need to export from BK manually?

Linus indicated (maybe privately) that the end of his BK usage would
be immediately after the -rc2 release.  I'm taking that to mean "no
more BK usage from Linus, period."

Thinking about it a bit, if you're asking Linus to pull your tree,
Linus would then have to extract the individual change sets as patches
to put into his newfangled patch management system.  Is that a
reasonable expectation?

However, it's ultimately up to Linus to decide. 8)

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core

Path: g2news1.google.com!news3.google.com!news.glorb.com!news.newsland.it!
news.cdlan.net!erode.bofh.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 17:20:19 +0200
Message-ID: <3QGXN-35R-39@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3Qza6-3P7-17@gated-at.bofh.it>
X-Original-To: Paul Mackerras <pau...@samba.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 100
Organization: linux.* mail to news gateway
X-Original-Cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Thu, 7 Apr 2005 08:10:21 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504070747580.28951@ppc970.osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
 <16980.55403.190197.751...@cargo.ozlabs.ibm.com>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Thu, 7 Apr 2005, Paul Mackerras wrote:
> 
> Are you happy with processing patches + descriptions, one per mail?

Yes. That's going to be my interim, I was just hoping that with 2.6.12-rc2 
out the door, and us in a "calming down" period, I could afford to not 
even do that for a while.

The real problem with the email thing is that it ends up piling up: what 
BK did in this respect was that anything that piled up in a BK repository 
ended up still being there, and a single "bk pull" got it anyway - so if 
somebody got ignored because I was busy with something else, it didn't add 
any overhead. The queue didn't get "congested".

And that's a big thing. It comes from the "Linus pulls" model where people 
just told me that they were ready, instead of the "everybody pushes to 
Linus" model, where the destination gets congested at times.

So I do not want the "send Linus email patches" (whether mboxes or a 
single patch per email) to be a very long-term strategy. We can handle it 
for a while (in particular, I'm counting on it working up to the real 
release of 2.6.12, since we _should_ be in the calm period for the next 
month anyway), but it doesn't work in the long run.

> Do you have it automated to the point where processing emailed patches
> involves little more overhead than doing a bk pull?

It's more overhead, but not a lot. Especially nice numbered sequences like
Andrew sends (where I don't have to manually try to get the dependencies
right by trying to figure them out and hope I'm right, but instead just
sort by Subject: line) is not a lot of overhead. I can process a hundred
emails almost as easily as one, as long as I trust the maintainer (which,
when it's used as a BK replacement, I obviously do).
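
As an illustration of ordering a numbered series by its Subject: line, here
is a minimal sketch. The "[PATCH n/m]" prefix, the regular expression and the
function names are assumptions made for this example; the thread does not say
what subject format or tooling is actually used. A numeric key is used so
that patch 10 sorts after patch 2 even when the numbers are not zero-padded.

```python
import re

# Assumed subject format "[PATCH n/m] title" (hypothetical, not from the thread).
PATCH_RE = re.compile(r"\[PATCH\s+(\d+)/(\d+)\]")


def series_key(subject):
    """Return a sort key that puts numbered patches first, in numeric order."""
    match = PATCH_RE.search(subject)
    if match:
        return (0, int(match.group(1)))
    # Unnumbered subjects sort last; sorted() is stable, so they keep
    # their original relative order.
    return (1, 0)


def order_series(subjects):
    """Order a list of Subject: lines into patch-application order."""
    return sorted(subjects, key=series_key)


subjects = [
    "[PATCH 10/12] ppc64: improve mapping of vdso",
    "[PATCH 2/12] ppc32: fix cpufreq problems",
    "[PATCH 1/12] add suspend method to cpufreq core",
]
for subject in order_series(subjects):
    print(subject)
```

A plain string sort would place "10/12" before "2/12" here, which is why the
sketch extracts the number instead of comparing the raw subject text.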

However, the SCM's I've looked at make this hard. One of the things (the
main thing, in fact) I've been working at is to make that process really
_efficient_. If it takes half a minute to apply a patch and remember the
changeset boundary etc (and quite frankly, that's _fast_ for most SCM's
around for a project the size of Linux), then a series of 250 emails
(which is not unheard of at all when I sync with Andrew, for example)  
takes two hours. If one of the patches in the middle doesn't apply, things
are bad bad bad.

Now, BK wasn't a speed demon either (actually, compared to everything
else, BK _is_ a speed demon, often by one or two orders of magnitude),
and took about 10-15 seconds per email when I merged with Andrew. HOWEVER,
with BK that wasn't as big of an issue, since the BK<->BK merges were so
easy, so I never had the slow email merges with any of the other main
developers. So a patch-application-based SCM "merger" actually would need
to be _faster_ than BK is. Which is really really really hard.

So I'm writing some scripts to try to track things a whole lot faster.  
Initial indications are that I should be able to do it almost as quickly
as I can just apply the patch, but quite frankly, I'm at most half done,
and if I hit a snag maybe that's not true at all. Anyway, the reason I can
do it quickly is that my scripts will _not_ be an SCM, they'll be a very
specific "log Linus' state" kind of thing. That will make the linear patch
merge a lot more time-efficient, and thus possible.

(If a patch apply takes three seconds, even a big series of patches is not
a problem: if I get notified within a minute or two that it failed
half-way, that's fine, I can then just fix it up manually. That's why 
latency is critical - if I'd have to do things effectively "offline", 
I'd by definition not be able to fix it up when problems happen).
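
The per-patch timings quoted above can be turned into totals for a 250-patch
series with a few lines of arithmetic. This is only a worked calculation; the
function name is made up for the example, and the only inputs are the figures
given in the message (30 seconds, 10-15 seconds, and 3 seconds per patch).

```python
def series_minutes(n_patches, seconds_per_patch):
    """Total wall-clock minutes to apply a series one patch at a time."""
    return n_patches * seconds_per_patch / 60


# Figures taken from the message above, for a 250-email series.
print(series_minutes(250, 30))  # 125.0 minutes: the "two hours" case
print(series_minutes(250, 15))  # 62.5 minutes: upper end of the BK email merge
print(series_minutes(250, 3))   # 12.5 minutes: the three-second target
```

At three seconds per patch, a failure half-way through the series shows up
after roughly six minutes rather than an hour, which is the latency argument
made in the paragraph above.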

> If so, then your mailbox (or patch queue) becomes a natural
> serialization point for the changes, and the need for a tool that can
> handle a complex graph of changes is much reduced.

Yes. In the short term. See above why I think the congestion issue will 
really mean that we want to have parallel merging in the not _too_ 
distant future.

NOTE! I detest the centralized SCM model, but if push comes to shove, and
we just _can't_ get a reasonable parallel merge thing going in the short
timeframe (i.e. a month or two), I'll use something like SVN on a trusted site
with just a few committers, and at least try to distribute the merging out
over a few people rather than making _me_ be the throttle.

The reason I don't really want to do that is once we start doing it that
way, I suspect we'll have a _really_ hard time stopping. I think it's a
broken model. So I'd much rather try to have some pain in the short run 
and get a better model running, but I just wanted to let people know that 
I'm pragmatic enough that I realize that we may not have much choice.

> * Visibility into what you had accepted and committed to your
>   repository
> * Lower latency of patches going into your repository
> * Much reduced rate of patches being dropped

Yes. 

		Linus

Path: g2news1.google.com!news3.google.com!news.glorb.com!news.newsland.it!
news.ngi.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 17:40:15 +0200
Message-ID: <3QHh5-3kA-39@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QzD8-4AX-31@gated-at.bofh.it>
X-Original-To: David Woodhouse <dw...@infradead.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 95
Organization: linux.* mail to news gateway
X-Original-Cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Thu, 7 Apr 2005 08:32:04 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504070810270.28951@ppc970.osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
 <1112858331.6924.17.ca...@localhost.localdomain>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Thu, 7 Apr 2005, David Woodhouse wrote:
>
> On Wed, 2005-04-06 at 08:42 -0700, Linus Torvalds wrote:
> > PS. Don't bother telling me about subversion. If you must, start reading
> > up on "monotone". That seems to be the most viable alternative, but don't
> > pester the developers so much that they don't get any work done. They are
> > already aware of my problems ;)
> 
> One feature I'd want to see in a replacement version control system is
> the ability to _re-order_ patches, and to cherry-pick patches from my
> tree to be sent onwards. The lack of that capability is the main reason
> I always hated BitKeeper.

I really disliked that in BitKeeper too originally. I argued with Larry
about it, but Larry (correctly, I believe) argued that efficient and
reliable distribution really requires the concept of "history is
immutable". It makes replication much easier when you know that the known
subset _never_ shrinks or changes - you only add on top of it.

And that implies no cherry-picking.

Also, there's actually a second reason why I've decided that cherry-
picking is wrong, and it's non-technical. 

The thing is, cherry-picking very much implies that the people "up" the 
foodchain end up editing the work of the people "below" them. The whole 
reason you want cherry-picking is that you want to fix up somebody else's 
mistakes, ie something you disagree with.

That sounds like an obviously good thing, right? Yes it does.

The problem is, it actually results in the wrong dynamics and psychology 
in the system. First off, it makes the implicit assumption that there is 
an "up" and "down" in the food-chain, and I think that's wrong. It's 
increasingly a "network" in the kernel. I'm less and less "the top", as 
much as a "fairly central" person. And that is how it should be. I used to 
think of kernel development as a hierarchy, but I long since switched to 
thinking about it as a fairly arbitrary network.

The other thing it does is that it implicitly puts the burden of quality 
control at the upper-level maintainer ("I'll pick the good things out of 
your tree"), while _not_ being able to cherry-pick means that there is 
pressure in both directions to keep the tree clean.

And that is IMPORTANT. I realize that not cherry-picking means that people
who want to merge upstream (or sideways or anything) are now forced to do
extra work in trying to keep their tree free of random crap. And that's a
HUGELY IMPORTANT THING! It means that the pressure to keep the tree clean
flows in all directions, and takes pressure off the "central" point. In
other words it distributes the pain of maintenance.

In other words, somebody who can't keep their act together, and creates 
crappy trees because he has random pieces of crud in it, quite 
automatically gets actively shunned by others. AND THAT IS GOOD! I've 
pushed back on some BK users to clean up their trees, to the point where 
we've had a number of "let's just re-do that" over the years. That's 
WONDERFUL. People are irritated at first, but I've seen what the end 
result is, and the end result is a much better maintainer. 

Some people actually end up doing the cleanup different ways. For example,
Jeff Garzik kept many separate trees, and had a special merge thing.
Others just kept a messy tree for development, and when they are happy,
they throw the messy tree away and re-create a cleaner one. Either is fine
- the point is, different people like to work different ways, and that's
fine, but making _everybody_ work at being clean means that there is no
train wreck down the line when somebody is forced to try to figure out
what to cherry-pick.

So I've actually changed from "I want to cherry-pick" to "cherry-picking
between maintainers is the wrong workflow". Now, as part of cleaning up,
people may end up exporting the "ugly tree" as patches and re-importing it
into the clean tree as the fixed clean series of patches, and that's
"cherry-picking", but it's not between developers.

NOTE! The "no cherry-picking" model absolutely also requires a model of 
"throw-away development trees". The two go together. BK did both, and an 
SCM that does one but not the other would be horribly broken.

(This is my only real conceptual gripe with "monotone". I like the model,
but they make it much harder than it should be to have throw-away trees
due to the fact that they seem to be working on the assumption of "one
database per developer" rather than "one database per tree". You don't 
have to follow that model, but it seems to be what the setup is geared 
for, and together with their "branches" it means that I think a monotone 
database easily gets very cruddy. The other problem with monotone is 
just performance right now, but that's hopefully not _too_ fundamental).

		Linus

Path: g2news1.google.com!news3.google.com!news2.google.com!proxad.net!
news.newsland.it!news.cdlan.net!erode.bofh.it!bofh.it!news.nic.it!robomod
From: Rik van Riel <r...@redhat.com>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 18:50:11 +0200
Message-ID: <3QImK-4fb-15@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3Qlgt-7FZ-5@gated-at.bofh.it>
X-Original-To: Greg KH <g...@kroah.com>
X-X-Sender: r...@chimarrao.boston.redhat.com
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 20
Organization: linux.* mail to news gateway
X-Original-Cc: Linus Torvalds <torva...@osdl.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Thu, 7 Apr 2005 12:40:42 -0400 (EDT)
X-Original-Message-ID: <Pine.LNX.4.61.0504071239490.12298@chimarrao.boston.redhat.com>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
 <20050406160041.GA28...@kroah.com>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

On Wed, 6 Apr 2005, Greg KH wrote:

> the very odd demands such a large project as the kernel has caused.  And
> I definitely owe him a beer the next time I see him.

Seconded.  Besides, now that the code won't be on bkbits
any more, it's safe to get Larry drunk ;)

Larry, thanks for the help you have given us by making
bitkeeper available for all these years.

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan

Path: g2news1.google.com!news1.google.com!newsread.com!news-xfer.newsread.com!
nntp.abs.net!news-FFM2.ecrc.net!newsfeed00.sul.t-online.de!t-online.de!
bofh.it!news.nic.it!robomod
From: Daniel Phillips <phill...@istop.com>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 19:10:08 +0200
Message-ID: <3QIG4-4Ee-5@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3Qza6-3P7-17@gated-at.bofh.it> 
<3QGXN-35R-39@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
User-Agent: KMail/1.7
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 30
Organization: linux.* mail to news gateway
X-Original-Cc: Paul Mackerras <pau...@samba.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Thu, 7 Apr 2005 13:00:51 -0400
X-Original-Message-ID: <200504071300.51907.phillips@istop.com>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org> 
<16980.55403.190197.751...@cargo.ozlabs.ibm.com> 
<Pine.LNX.4.58.0504070747580.28...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

On Thursday 07 April 2005 11:10, Linus Torvalds wrote:
> On Thu, 7 Apr 2005, Paul Mackerras wrote:
> > Do you have it automated to the point where processing emailed patches
> > involves little more overhead than doing a bk pull?
>
> It's more overhead, but not a lot. Especially nice numbered sequences like
> Andrew sends (where I don't have to manually try to get the dependencies
> right by trying to figure them out and hope I'm right, but instead just
> sort by Subject: line)...

Hi Linus,

In that case, a nice refinement is to put the sequence number at the end of 
the subject line so patch sequences don't interleave:

   Subject: [PATCH] Unbork OOM Killer (1 of 3)
   Subject: [PATCH] Unbork OOM Killer (2 of 3)
   Subject: [PATCH] Unbork OOM Killer (3 of 3)
   Subject: [PATCH] Unbork OOM Killer (v2, 1 of 3)
   Subject: [PATCH] Unbork OOM Killer (v2, 2 of 3)
   ...

Regards,

Daniel

Path: g2news1.google.com!news1.google.com!news3.google.com!news.glorb.com!
news.newsland.it!news.ngi.it!bofh.it!news.nic.it!robomod
From: Al Viro <v...@parcelfarce.linux.theplanet.co.uk>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 19:20:13 +0200
Message-ID: <3QIPP-4L0-27@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QzD8-4AX-31@gated-at.bofh.it> 
<3QHh5-3kA-39@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.1i
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 51
Organization: linux.* mail to news gateway
X-Original-Cc: David Woodhouse <dw...@infradead.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Thu, 7 Apr 2005 18:10:07 +0100
X-Original-Message-ID: <20050407171006.GF8859@parcelfarce.linux.theplanet.co.uk>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org> 
<1112858331.6924.17.ca...@localhost.localdomain> 
<Pine.LNX.4.58.0504070810270.28...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

On Thu, Apr 07, 2005 at 08:32:04AM -0700, Linus Torvalds wrote:
> Also, there's actually a second reason why I've decided that cherry-
> picking is wrong, and it's non-technical. 
> 
> The thing is, cherry-picking very much implies that the people "up" the 
> foodchain end up editing the work of the people "below" them. The whole 
> reason you want cherry-picking is that you want to fix up somebody else's 
> mistakes, ie something you disagree with.

No.  There's another reason - when you are cherry-picking and reordering
*your* *own* *patches*.  That's what I had been unable to explain to
Larry and that's what made BK unusable for me.

As for the immutable history...  Ever had to read or grade students'
homework?
	* the dumbest kind: "here's an answer <expression>, whaddya
mean 'where's the solution'?".
	* next one: "here's how I've solved the problem: <pages of text
documenting the attempts, with many 'oops, there had been a mistake,
here's how we fix it'>".  
	* what you really want to see: series of steps leading to answer,
with clean logical structure that lets the reader understand what's being
done and verify correctness.

The first corresponds to "here's a half-meg of patch, it fixes everything".
The second is chronological history (aka "this came from our CVS, all bugs
are fixed by now, including those introduced in the middle of it; see
CVS history for details").  The third is a decent patch series.

And to get from "here's how I came up with the solution" to "here's a clean
way to reach the solution" you _have_ to reorder.  There's also "here are
misc notes from today, here are misc notes from yesterday, etc." and to
get that into sane shape you will need to split, reorder and probably
collapse several into a combined delta (possibly getting an empty delta
as the result, if later ones negate the prior).
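
The "empty delta" case can be shown with a small sketch. It uses
Python's standard difflib purely as a stand-in for whatever diff tool is
at hand; the file contents and the helper name delta are made up for
illustration:

```python
import difflib


def delta(old, new):
    """Return the unified diff between two versions as a list of lines."""
    return list(difflib.unified_diff(old, new, lineterm=""))


v0 = ["int x = 0;"]
v1 = ["int x = 1;"]  # an early delta changes the line
v2 = ["int x = 0;"]  # a later delta changes it back

step1 = delta(v0, v1)     # non-empty: part of the chronological history
step2 = delta(v1, v2)     # non-empty: part of the chronological history
combined = delta(v0, v2)  # the two steps collapsed into one delta

print(len(step1) > 0, len(step2) > 0, combined)
```

The chronological history holds two real changes, while the collapsed
series holds none, which is the gap between "history" and "publishable
result".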

The point being, both history and, well, a publishable result can be
expressed as a series of small steps, but they are not the same thing.  So
far all I've seen in the area (and that includes BK) is heavily biased
towards the history part, and attempts to use this stuff for manipulating
patch series turn into fighting the tool.

I'd *love* to see something that can handle both - preferably with
history of reordering, etc. being kept.  IOW, not just a tree of changesets
but a lattice - with multiple paths leading to the same node.  So far
I've seen nothing of that kind ;-/

Path: g2news1.google.com!news1.google.com!proxad.net!proxad.net!
194.117.148.138.MISMATCH!pe2.news.blueyonder.co.uk!blueyonder!
border2.nntp.ams.giganews.com!border1.nntp.ams.giganews.com!nntp.giganews.com!
feeder2.cambrium.nl!feed.tweaknews.nl!newsfeed-0.progon.net!progon.net!bofh.it!
news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 19:40:27 +0200
Message-ID: <3QJ9o-4Zc-37@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3Qza6-3P7-17@gated-at.bofh.it> 
<3QGXN-35R-39@gated-at.bofh.it> <3QIG4-4Ee-5@gated-at.bofh.it>
X-Original-To: Daniel Phillips <phill...@istop.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 57
Organization: linux.* mail to news gateway
X-Original-Cc: Paul Mackerras <pau...@samba.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Thu, 7 Apr 2005 10:38:06 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504071023190.28951@ppc970.osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
 <16980.55403.190197.751...@cargo.ozlabs.ibm.com>
 <Pine.LNX.4.58.0504070747580.28...@ppc970.osdl.org> 
 <200504071300.51907.phill...@istop.com>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Thu, 7 Apr 2005, Daniel Phillips wrote:
> 
> In that case, a nice refinement is to put the sequence number at the end of 
> the subject line so patch sequences don't interleave:

No. That makes it unsortable, and also much harder to pick out which part 
of the subject line is the explanation, and which part is just metadata 
for me.

So my preference is _overwhelmingly_ for the format that Andrew uses (which 
is partly explained by the fact that I am used to it, but also by the fact 
that I've asked for Andrew to make trivial changes to match my usage).

That canonical format is:

	Subject: [PATCH 001/123] [<area>:] <explanation>

together with the first line of the body being a

	From: Original Author <or...@email.com>

followed by an empty line and then the body of the explanation.

After the body of the explanation comes the "Signed-off-by:" lines, and 
then a simple "---" line, and below that comes the diffstat of the patch 
and then the patch itself.
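
The layout above is regular enough to split mechanically. The following
is a minimal sketch under that assumption, not the BK/tools scripts
themselves (which are not shown here); parse_patch_mail, the dictionary
keys and the sample mail are all hypothetical:

```python
def parse_patch_mail(subject, body):
    """Split a patch mail in the layout described above into its parts."""
    lines = body.splitlines()

    # Optional first body line naming the original author.
    author = None
    if lines and lines[0].startswith("From: "):
        author = lines[0][len("From: "):]
        lines = lines[1:]

    # The first line that is exactly "---" ends the explanation; the
    # diffstat and the patch follow it.  Diff headers such as
    # "--- a/file.c" are longer than "---" and so never match here.
    try:
        cut = lines.index("---")
    except ValueError:
        cut = len(lines)
    message, rest = lines[:cut], lines[cut + 1:]

    signoffs = [l for l in message if l.startswith("Signed-off-by:")]
    text = [l for l in message if not l.startswith("Signed-off-by:")]
    return {
        "subject": subject,
        "author": author,
        "explanation": "\n".join(text).strip(),
        "signoffs": signoffs,
        "diffstat_and_patch": "\n".join(rest),
    }


sample_body = """From: Original Author <author@example.com>

Explain what the change does and why.

Signed-off-by: Original Author <author@example.com>
---
 file.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/file.c
+++ b/file.c
@@ -1 +1 @@
-old
+new
"""

parsed = parse_patch_mail("[PATCH 001/123] area: explanation", sample_body)
print(parsed["author"])
print(parsed["signoffs"])
```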

That's the "canonical email format", and it's that because my normal
scripts (in BK/tools, but now I'm working on making them more generic)  
take input that way. It's very easy to sort the emails alphabetically by
subject line - pretty much any email reader will support that - because
the sequence number is zero-padded, so the numerical and alphabetic sort
orders are the same.
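
A short check of the zero-padding claim, as a sketch: the subject lines
are invented and seq_number is a hypothetical helper, but the comparison
is just plain string sorting versus sorting by the parsed number.

```python
import re


def seq_number(subject):
    """Extract N from a leading "[PATCH N/M]" or "[PATCHv2 N/M]" tag."""
    return int(re.match(r"\[PATCH\w* (\d+)/\d+\]", subject).group(1))


numbers = (10, 2, 100, 1)
padded = [f"[PATCH {n:03d}/123] area: change {n}" for n in numbers]
unpadded = [f"[PATCH {n}/123] area: change {n}" for n in numbers]

# Alphabetic sort (what a mail reader does) versus numeric sort.
padded_ok = sorted(padded) == sorted(padded, key=seq_number)
unpadded_ok = sorted(unpadded) == sorted(unpadded, key=seq_number)

print(padded_ok, unpadded_ok)
```

Without padding the alphabetic order comes out as 1, 10, 100, 2, so the
two sorts disagree.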

If you send several sequences, you either send a simple explaining email
before the second sequence (hey, it's not like I'm a machine - I can use
my brains too, and in particular if the final number of patches in each
sequence is different, even if the sequences got re-ordered and are
overlapping, I can still just extract one from the other by selecting for
"/123] " in the subject line), or you modify the Subject: line subtly to
still sort uniquely and alphabetically in-order, ie the subject lines for
the second series might be

	Subject: [PATCHv2 001/207] x86: fix eflags tracking
	...

All very unambiguous, and my scripts already remove everything inside the 
brackets and will just replace it with "[PATCH]" in the final version.
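
Both steps (pulling one series out of a mixed mailbox by its "/NNN] "
marker, then replacing the bracketed tag) can be sketched as below. This
is an illustration under stated assumptions, not the real scripts:
select_series and final_subject are hypothetical names, and all subject
lines except the eflags one are invented.

```python
import re


def select_series(subjects, total):
    """Keep the subjects of the series with `total` patches, in order."""
    marker = f"/{total}] "
    return sorted(s for s in subjects if marker in s)


def final_subject(subject):
    """Replace the leading "[...]" tag with a plain "[PATCH]"."""
    return re.sub(r"^\[[^\]]*\]\s*", "[PATCH] ", subject, count=1)


mailbox = [
    "[PATCHv2 002/207] area: second change",
    "[PATCH 001/123] area: first change",
    "[PATCHv2 001/207] x86: fix eflags tracking",
    "[PATCH 002/123] area: another change",
]

second_series = select_series(mailbox, 207)
cleaned = [final_subject(s) for s in second_series]
print(cleaned)
```

Selecting on the total only separates the series when their totals
differ, which is the caveat given above.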

		Linus

Path: g2news1.google.com!news1.google.com!news.maxwell.syr.edu!
newsfeed.icl.net!newsfeed.fjserv.net!news.mailgate.org!erode.bofh.it!
bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 19:50:21 +0200
Message-ID: <3QJiZ-573-31@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QzD8-4AX-31@gated-at.bofh.it> 
<3QHh5-3kA-39@gated-at.bofh.it> <3QIPP-4L0-27@gated-at.bofh.it>
X-Original-To: Al Viro <v...@parcelfarce.linux.theplanet.co.uk>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 48
Organization: linux.* mail to news gateway
X-Original-Cc: David Woodhouse <dw...@infradead.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Thu, 7 Apr 2005 10:47:18 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504071038320.28951@ppc970.osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
 <1112858331.6924.17.ca...@localhost.localdomain>
 <Pine.LNX.4.58.0504070810270.28...@ppc970.osdl.org>
 <20050407171006.GF8...@parcelfarce.linux.theplanet.co.uk>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Thu, 7 Apr 2005, Al Viro wrote:
> 
> No.  There's another reason - when you are cherry-picking and reordering
> *your* *own* *patches*.

Yes. I agree. There should be some support for cherry-picking in between a
temporary throw-away tree and a "cleaned-up-tree". However, it should be
something you really do need to think about, and in most cases it really
does boil down to "export as patch, re-import from patch". Especially
since you potentially want to edit things in between anyway when you
cherry-pick.

(I do that myself: If I have been a messy boy, and committed mixed-up 
things as one commit, I export it as a patch, and then I split the patch 
by hand into two or more pieces - sometimes by just editing the patch 
directly, but sometimes with a combination of applying it, editing 
the result, and then re-exporting it as the new version).

And in the cases where this happens, you in fact often have unrelated
changes to the _same_file_, so you really do end up having that middle 
step.

In other words, this cherry-picking can generally be scripted and done
"outside" the SCM (you can trivially have a script that takes a revision
from one tree and applies it to the other). I don't believe that the SCM
needs to support it in any fundamentally inherent manner. After all, why 
should it, when it really boils down to 

	(cd old-tree ; scm export-as-patch-plus-comments) |
		(cd new-tree ; scm import-patch-plus-comments)

where the "patch-plus-comments" part is just basically an extended patch
(including rename information etc, not just the comments).

Btw, this method of cherry-picking again requires two _separate_ active 
trees at the same time. BK is great at that, and really, that's what 
distributed SCM's should be all about anyway. It's not just distributed 
between different machines, it's literally distributed even on the same 
machine, and it's actively _used_ that way.

		Linus

Path: g2news1.google.com!news1.google.com!proxad.net!news.newsland.it!
news.ngi.it!bofh.it!news.nic.it!robomod
From: Daniel Phillips <phill...@istop.com>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Thu, 07 Apr 2005 20:00:17 +0200
Message-ID: <3QJsB-5fF-25@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QHh5-3kA-39@gated-at.bofh.it> 
<3QIPP-4L0-27@gated-at.bofh.it>
X-Original-To: Al Viro <v...@parcelfarce.linux.theplanet.co.uk>
User-Agent: KMail/1.7
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 49
Organization: linux.* mail to news gateway
X-Original-Cc: Linus Torvalds <torva...@osdl.org>,
	David Woodhouse <dw...@infradead.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Thu, 7 Apr 2005 13:54:34 -0400
X-Original-Message-ID: <200504071354.34581.phillips@istop.com>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org> 
<Pine.LNX.4.58.0504070810270.28...@ppc970.osdl.org> 
<20050407171006.GF8...@parcelfarce.linux.theplanet.co.uk>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

On Thursday 07 April 2005 13:10, Al Viro wrote:
> The point being, both history and well, publishable result can be expressed
> as series of small steps, but they are not the same thing.  So far all I've
> seen in the area (and that includes BK) is heavily biased towards history
> part and attempts to use this stuff for manipulating patch series turn into
> fighting the tool.
>
> I'd *love* to see something that can handle both - preferably with
> history of reordering, etc. being kept.  IOW, not just a tree of changesets
> but a lattice - with multiple paths leading to the same node.  So far
> I've seen nothing of that kind ;-/

Which is a perfect demonstration of why the scm tool has to be free/open 
source.  We should never have had to plead with BitMover to extend BK in a 
direction like that, but instead, just get the source and make it do it, like 
any other open source project.

Three years ago, there was no fully working open source distributed scm code 
base to use as a starting point, so extending BK would have been the only 
easy alternative.  But since then the situation has changed.  There are now 
several working code bases to provide a good starting point: Monotone, Arch, 
SVK, Bazaar-ng and others.

Sure, there are quibbles about all of those, but right now is not the time for 
quibbling, because a functional replacement for BK is needed in roughly two 
months, capable of losslessly importing the kernel version graph.  It only 
has to support a subset of BK functionality, e.g., pulling and cloning.  It 
is ok to be a little slow so long as it is not pathetically slow.  The 
purpose of the interim solution is just to get the patch flow process back 
online.

The key is the _lossless_ part.  So long as the interim solution imports the 
metadata losslessly, we have the flexibility to switch to a better solution 
later, on short notice and without much pain.

So I propose that everybody who is interested, pick one of the above projects 
and join it, to help get it to the point of being able to losslessly import 
the version graph.  Given the importance, I think that _all_ viable 
alternatives need to be worked on in parallel, so that two months from now we 
have several viable options.

Regards,

Daniel

Re: Kernel SCM saga..
From: Paul P Komkoff Jr 
Date: Wed Apr 06 2005 - 14:40:18 EST 

Replying to Linus Torvalds:
> Ok,
> as a number of people are already aware (and in some cases have been

Actually, I'm very disappointed that things have gone in such a
counter-productive way. All through this history I was against Larry's
opponents, but in the end, they are right. That's a pity. To quote Vin
Diesel's character Riddick, "there's no such word as friend", or something.

Anyway, it seems that folks at Canonical were aware of it, and here's
the result of this awareness: http://bazaar-ng.org/
This needs some testing though, along with the really hard part -
transferring all the history, nonlinear ... I don't know how anyone can do
this by 1 Jul 2005, sorry :(

> PS. Don't bother telling me about subversion. If you must, start reading
> up on "monotone". That seems to be the most viable alternative, but don't
> pester the developers so much that they don't get any work done. They are
> already aware of my problems ;)

Monotone is good, but I don't really know the limits of sqlite3 wrt the
kernel case. And again, there's what we need to do to retain history ...


-- 
Paul P 'Stingray' Komkoff Jr // http://stingr.net/key <- my pgp key
This message represents the official view of the voices in my head

Re: Kernel SCM saga..
From: Martin Pool 
Date: Wed Apr 06 2005 - 20:42:40 EST 

On Wed, 06 Apr 2005 23:39:11 +0400, Paul P Komkoff Jr wrote:

> http://bazaar-ng.org/

I'd like bazaar-ng to be considered too. It is not ready for adoption
yet, but I am working (more than) full time on it and hope to have it
be usable in a couple of months. 

bazaar-ng is trying to integrate a lot of the work done in other systems
to make something that is simple to use but also fast and powerful enough
to handle large projects.

The operations that are already done are pretty fast: ~60s to import a
kernel tree, ~10s to import a new revision from a patch. 

Please check it out and do pester me with any suggestions about whatever
you think it needs to suit your work.

-- 
Martin



Re: Kernel SCM saga..
From: Jeff Garzik 
Date: Wed Apr 06 2005 - 20:48:49 EST 

On Thu, Apr 07, 2005 at 11:40:23AM +1000, Martin Pool wrote:
> On Wed, 06 Apr 2005 23:39:11 +0400, Paul P Komkoff Jr wrote:
> 
> > http://bazaar-ng.org/
> 
> I'd like bazaar-ng to be considered too. It is not ready for adoption
> yet, but I am working (more than) full time on it and hope to have it
> be usable in a couple of months. 
> 
> bazaar-ng is trying to integrate a lot of the work done in other systems
> to make something that is simple to use but also fast and powerful enough
> to handle large projects.
> 
> The operations that are already done are pretty fast: ~60s to import a
> kernel tree, ~10s to import a new revision from a patch. 

By "importing", are you saying that importing all 60,000+ changesets of
the current kernel tree took only 60 seconds?

Jeff




Re: Kernel SCM saga..
From: Martin Pool 
Date: Wed Apr 06 2005 - 21:27:37 EST 

On Wed, 06 Apr 2005 21:47:27 -0400, Jeff Garzik wrote:

>> The operations that are already done are pretty fast: ~60s to import a
>> kernel tree, ~10s to import a new revision from a patch.
> 
> By "importing", are you saying that importing all 60,000+ changesets of
> the current kernel tree took only 60 seconds?

Now that would be impressive.

No, I mean this:

% bzcat ../linux.pkg/patch-2.5.14.bz2| patch -p1 

% time bzr add -v . 
(find any new non-ignored files; deleted files automatically noticed) 
6.06s user 0.41s system 89% cpu 7.248 total

% time bzr commit -v -m 'import 2.5.14'
7.71s user 0.71s system 65% cpu 12.893 total

(OK, a bit slower in this case but it wasn't all in core.)

This is only v0.0.3, but I think the interface simplicity and speed
compare well.

I haven't tested importing all 60,000+ changesets of the current bk tree,
partly because I don't *have* all those changesets. (Larry said
previously that someone (not me) tried to pull all of them using bkclient,
and he considered this abuse and blacklisted them.)

I have been testing pulling in release and rc patches, and it scales to
that level. It probably could not handle 60,000 changesets yet, but there
is a plan to get there. In the interim, although it cannot handle the
whole history forever it can handle large trees with moderate numbers of
commits -- perhaps as many as you might deal with in developing a feature
over a course of a few months.

The most sensible place to try out bzr, if people want to, is as a way to
keep your own revisions before mailing a patch to linus or the subsystem
maintainer.

-- 
Martin



Re: Kernel SCM saga..
From: David Lang 
Date: Wed Apr 06 2005 - 21:35:24 EST

On Thu, 7 Apr 2005, Martin Pool wrote:


>I haven't tested importing all 60,000+ changesets of the current bk tree,
>partly because I don't *have* all those changesets. (Larry said
>previously that someone (not me) tried to pull all of them using bkclient,
>and he considered this abuse and blacklisted them.)


Pull the patches from the BK2CVS server. Yes, some patches are combined, 
but it will get you in the ballpark.

David Lang


>I have been testing pulling in release and rc patches, and it scales to
>that level. It probably could not handle 60,000 changesets yet, but there
>is a plan to get there. In the interim, although it cannot handle the
>whole history forever it can handle large trees with moderate numbers of
>commits -- perhaps as many as you might deal with in developing a feature
>over a course of a few months.
>
>The most sensible place to try out bzr, if people want to, is as a way to
>keep your own revisions before mailing a patch to linus or the subsystem
>maintainer.



--
There are two ways of constructing a software design. One way is to make it 
so simple that there are obviously no deficiencies. And the other way is to 
make it so complicated that there are no obvious deficiencies.
-- C.A.R. Hoare

Re: Kernel SCM saga..
From: Martin Pool 
Date: Thu Apr 07 2005 - 00:41:20 EST 

On Wed, 2005-04-06 at 19:32 -0700, David Lang wrote:
> On Thu, 7 Apr 2005, Martin Pool wrote:
> 
> > I haven't tested importing all 60,000+ changesets of the current bk tree,
> > partly because I don't *have* all those changesets. (Larry said
> > previously that someone (not me) tried to pull all of them using bkclient,
> > and he considered this abuse and blacklisted them.)
> 
> pull the patches from the BK2CVS server. yes some patches are combined, 
> but it will get you in the ballpark.

OK, I just tried that. I know there are scripts to resynthesize
changesets from the CVS info but I skipped that for now and just pulled
each day's work into a separate bzr revision. It's up to the end of
March and still running.

Importing the first snapshot (2004-01-01) took 41.77s user, 1:23.79
total. Each subsequent day takes about 10s user, 30s elapsed to commit
into bzr. The speeds are comparable to CVS or a bit faster, and may be
faster than other distributed systems. (This on a laptop with a 5400rpm
disk.) Pulling out a complete copy of the tree as it was on a previous
date takes about 14 user, 60s elapsed.

I don't want to get too distracted by benchmarks now because there are
more urgent things to do and anyhow there is still lots of scope for
optimization. I wouldn't be at all surprised if those times could be
more than halved. I just wanted to show it is in (I hope) the right
ballpark.

-- 
Martin

Path: g2news1.google.com!news3.google.com!news2.google.com!proxad.net!
news.newsland.it!news.ngi.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 01:40:10 +0200
Message-ID: <3QOLv-1qG-7@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QoHo-2b1-21@gated-at.bofh.it> 
<3QujJ-6NQ-7@gated-at.bofh.it> <3QujM-6NQ-23@gated-at.bofh.it> 
<3QuWy-7pk-23@gated-at.bofh.it> <3Qv69-7ve-13@gated-at.bofh.it> 
<3Qy4a-2jy-21@gated-at.bofh.it>
X-Original-To: Martin Pool <m...@sourcefrog.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 21
Organization: linux.* mail to news gateway
X-Original-Cc: "linux-ker...@vger.kernel.org" <linux-ker...@vger.kernel.org>,
	David Lang <dl...@digitalinsight.com>
X-Original-Date: Thu, 7 Apr 2005 16:27:44 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504071626290.28951@ppc970.osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org> 
 <20050406193911.GA11...@stingr.stingr.net>  
 <pan.2005.04.07.01.40.20.998...@sourcefrog.net>
 <20050407014727.GA17...@havoc.gtf.org>  
 <pan.2005.04.07.02.25.56.501...@sourcefrog.net>
 <Pine.LNX.4.62.0504061931560.10...@qynat.qvtvafvgr.pbz> 
 <1112852302.29544.75.camel@hope>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Thu, 7 Apr 2005, Martin Pool wrote:
> 
> Importing the first snapshot (2004-01-01) took 41.77s user, 1:23.79
> total.  Each subsequent day takes about 10s user, 30s elapsed to commit
> into bzr.  The speeds are comparable to CVS or a bit faster, and may be
> faster than other distributed systems. (This on a laptop with a 5400rpm
> disk.)  Pulling out a complete copy of the tree as it was on a previous
> date takes about 14 user, 60s elapsed.

If you have an exportable tree, can you just make it pseudo-public, tell
me where to get a buildable system that works well enough, point me to
some documentation, and maybe I can get some feel for it?

		Linus

Path: g2news1.google.com!news3.google.com!news.glorb.com!news.newsland.it!
area.cu.mi.it!bofh.it!news.nic.it!robomod
From: Chris Wedgwood <c...@f00f.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 06:20:07 +0200
Message-ID: <3QT8r-4R3-5@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 28
Organization: linux.* mail to news gateway
X-Original-Cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Thu, 7 Apr 2005 21:13:41 -0700
X-Original-Message-ID: <20050408041341.GA8720@taniwha.stupidest.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

On Wed, Apr 06, 2005 at 08:42:08AM -0700, Linus Torvalds wrote:

> PS. Don't bother telling me about subversion. If you must, start reading
> up on "monotone". That seems to be the most viable alternative, but don't
> pester the developers so much that they don't get any work done. They are
> already aware of my problems ;)

I'm playing with monotone right now.  Superficially it looks like it
has tons of gee-whiz neato stuff...  however, it's *agonizingly* slow.
I mean glacial.  A heavily sedated sloth with no legs is probably
faster.

Using monotone to pull itself took over 2 hours wall-time and 71
minutes of CPU time.

Arguably brand-new CPUs are probably about 2x the speed of what I have
now and there might have been networking funnies --- but that's still
35 minutes to get ~40MB of data.

The kernel is ten times larger, so does that mean to do a clean pull
of the kernel we are looking at (71/2*10) ~ 355 minutes or 6 hours of
CPU time?
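
The estimate above can be written out as a short calculation. This is only a
sketch: it assumes CPU time scales linearly with tree size, and the helper name
estimate_cpu_minutes is made up for illustration.

```python
# Back-of-the-envelope estimate using the figures quoted above.
cpu_minutes_measured = 71   # CPU time to pull monotone's own tree
cpu_speedup = 2             # assumed speedup on a brand-new CPU
size_ratio = 10             # kernel assumed to be ~10x larger

def estimate_cpu_minutes(measured, speedup, ratio):
    """Linear extrapolation: scale by tree size, divide by CPU speedup."""
    return measured / speedup * ratio

kernel_minutes = estimate_cpu_minutes(cpu_minutes_measured, cpu_speedup, size_ratio)
kernel_hours = kernel_minutes / 60
print(kernel_minutes, kernel_hours)
```

If the cost grows faster than linearly with the number of files, the real
figure would be worse than this.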


Path: g2news1.google.com!news3.google.com!news.glorb.com!news.newsland.it!
area.cu.mi.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 06:50:18 +0200
Message-ID: <3QTBE-5d5-95@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QT8r-4R3-5@gated-at.bofh.it>
X-Original-To: Chris Wedgwood <c...@f00f.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 44
Organization: linux.* mail to news gateway
X-Original-Cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Thu, 7 Apr 2005 21:42:04 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504072127250.28951@ppc970.osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
 <20050408041341.GA8...@taniwha.stupidest.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Thu, 7 Apr 2005, Chris Wedgwood wrote:
> 
> I'm playing with monotone right now.  Superficially it looks like it
> has tons of gee-whiz neato stuff...  however, it's *agonizingly* slow.
> I mean glacial.  A heavily sedated sloth with no legs is probably
> faster.

Yes. The silly thing is, at least in my local tests it doesn't actually
seem to be _doing_ anything while it's slow (there are no system calls
except for a few memory allocations and de-allocations). It seems to have
some exponential function on the number of pathnames involved etc.

I'm hoping they can fix it, though. The basic notions do not sound wrong.

In the meantime (and because monotone really _is_ that slow), here's a
quick challenge for you, and any crazy hacker out there: if you want to
play with something _really_ nasty (but also very _very_ fast), take a
look at kernel.org:/pub/linux/kernel/people/torvalds/.

First one to send me the changelog tree of sparse-git (and a tool to
commit and push/pull further changes) gets a gold star, and an honorable
mention. I've put a hell of a lot of clues in there (*).

I've worked on it (and little else) for the last two days. Time for 
somebody else to tell me I'm crazy.

		Linus

(*) It should be easier than it sounds. The database is designed so that
you can do the equivalent of a nonmerging (ie pure superset) push/pull
with just plain rsync, so replication really should be that easy (if
somewhat bandwidth-intensive due to the whole-file format).

Never mind merging. It's not an SCM, it's a distribution and archival
mechanism. I bet you could make a reasonable SCM on top of it, though.
Another way of looking at it is to say that it's really a content-
addressable filesystem, used to track directory trees.
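
To make "content-addressable" concrete, here is a minimal sketch of the idea.
It is not the actual format found in the directory mentioned above: the choice
of SHA-1, the two-character fan-out directories, and the names put_object and
get_object are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile

def object_path(root, name):
    # Assumed layout: fan out on the first two hex digits of the name.
    return os.path.join(root, name[:2], name[2:])

def put_object(root, data):
    """Store data under the hex SHA-1 of its content and return that name."""
    name = hashlib.sha1(data).hexdigest()
    path = object_path(root, name)
    if not os.path.exists(path):          # identical content is stored once
        os.makedirs(os.path.dirname(path), exist_ok=True)
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(data)
        os.replace(tmp, path)             # rename into place in one step
    return name

def get_object(root, name):
    """Read an object back and check that it still matches its name."""
    with open(object_path(root, name), "rb") as f:
        data = f.read()
    if hashlib.sha1(data).hexdigest() != name:
        raise ValueError("corrupt object: " + name)
    return data

store = tempfile.mkdtemp()
first = put_object(store, b"hello kernel\n")
second = put_object(store, b"hello kernel\n")
print(first == second, get_object(store, first))
```

Because an object's name is fixed by its content, files are only ever added,
never modified, which is why a plain file copy is enough to replicate such a
store.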

Path: g2news1.google.com!news3.google.com!news.glorb.com!news.newsland.it!
news.cdlan.net!erode.bofh.it!bofh.it!news.nic.it!robomod
From: Martin Pool <m...@sourcefrog.net>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 08:00:12 +0200
Message-ID: <3QUHi-6n4-3@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QoHo-2b1-21@gated-at.bofh.it> 
<3QujJ-6NQ-7@gated-at.bofh.it> <3QujM-6NQ-23@gated-at.bofh.it> 
<3QuWy-7pk-23@gated-at.bofh.it> <3Qv69-7ve-13@gated-at.bofh.it> 
<3Qy4a-2jy-21@gated-at.bofh.it> <3QOLv-1qG-7@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
Content-Type: multipart/signed; micalg=pgp-sha1; 
protocol="application/pgp-signature"; boundary="=-pB6dJqRpscvxIzVxlN/n"
MIME-Version: 1.0
X-Mailer: Evolution 2.2.1.1 
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 76
Organization: linux.* mail to news gateway
X-Original-Cc: "linux-ker...@vger.kernel.org" <linux-ker...@vger.kernel.org>,
	David Lang <dl...@digitalinsight.com>
X-Original-Date: Fri, 08 Apr 2005 15:56:09 +1000
X-Original-Message-ID: <1112939769.29544.161.camel@hope>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
	 <20050406193911.GA11...@stingr.stingr.net>
	 <pan.2005.04.07.01.40.20.998...@sourcefrog.net>
	 <20050407014727.GA17...@havoc.gtf.org>
	 <pan.2005.04.07.02.25.56.501...@sourcefrog.net>
	 <Pine.LNX.4.62.0504061931560.10...@qynat.qvtvafvgr.pbz>
	 <1112852302.29544.75.camel@hope>
	 <Pine.LNX.4.58.0504071626290.28...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org


On Thu, 2005-04-07 at 16:27 -0700, Linus Torvalds wrote:
>
> On Thu, 7 Apr 2005, Martin Pool wrote:
> >
> > Importing the first snapshot (2004-01-01) took 41.77s user, 1:23.79
> > total.  Each subsequent day takes about 10s user, 30s elapsed to commit
> > into bzr.  The speeds are comparable to CVS or a bit faster, and may be
> > faster than other distributed systems. (This on a laptop with a 5400rpm
> > disk.)  Pulling out a complete copy of the tree as it was on a previous
> > date takes about 14 user, 60s elapsed.
>
> If you have an exportable tree, can you just make it pseudo-public, tell
> me where to get a buildable system that works well enough, point me to
> some documentation, and maybe I can get some feel for it?

Hi,

There is a "stable" release here:
http://www.bazaar-ng.org/pkg/bzr-0.0.3.tgz

All you should need to do is unpack that and symlink bzr onto your path.

You can get the current bzr development tree, stored in itself, by
rsync:

  rsync -av ozlabs.org::mbp/bzr/dev ~/bzr.dev

Inside that directory you can run 'bzr info', 'bzr status --all', 'bzr
unknowns', 'bzr log', 'bzr ignored'.

Repeated rsyncs will bring you up to date with what I've done -- and
will of course overwrite any local changes.

If someone was going to do development on this then the method would
typically be to have two copies of the tree, one tracking my version and
another for your own work -- much as with bk.  In your own tree, you can
do 'bzr add', 'bzr remove', 'bzr diff', 'bzr commit'.

At the moment all you can do is diff against the previous revision, or
manually diff the two trees, or use quilt, so it is just an archival
system not a full SCM system.  In the near future there will be some
code to extract the differences as changesets to be mailed off.

I have done a rough-as-guts import from bkcvs into this, and I can
advertise that when it's on a server that can handle the load.

At a glance this looks very similar to git -- I can go into the
differences and why I did them the other way if you want.

-- 
Martin



Path: g2news1.google.com!news1.google.com!proxad.net!news.newsland.it!
area.cu.mi.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 08:50:09 +0200
Message-ID: <3QVtD-71j-19@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QoHo-2b1-21@gated-at.bofh.it> 
<3QujJ-6NQ-7@gated-at.bofh.it> <3QujM-6NQ-23@gated-at.bofh.it> 
<3QuWy-7pk-23@gated-at.bofh.it> <3Qv69-7ve-13@gated-at.bofh.it> 
<3Qy4a-2jy-21@gated-at.bofh.it> <3QOLv-1qG-7@gated-at.bofh.it> 
<3QUHi-6n4-3@gated-at.bofh.it>
X-Original-To: Martin Pool <m...@sourcefrog.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 38
Organization: linux.* mail to news gateway
X-Original-Cc: "linux-ker...@vger.kernel.org" <linux-ker...@vger.kernel.org>,
	David Lang <dl...@digitalinsight.com>
X-Original-Date: Thu, 7 Apr 2005 23:41:29 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504072334310.28951@ppc970.osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org> 
 <20050406193911.GA11...@stingr.stingr.net>  
 <pan.2005.04.07.01.40.20.998...@sourcefrog.net>
  <20050407014727.GA17...@havoc.gtf.org>  
 <pan.2005.04.07.02.25.56.501...@sourcefrog.net>
  <Pine.LNX.4.62.0504061931560.10...@qynat.qvtvafvgr.pbz>  
 <1112852302.29544.75.camel@hope>
  <Pine.LNX.4.58.0504071626290.28...@ppc970.osdl.org> 
 <1112939769.29544.161.camel@hope>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Fri, 8 Apr 2005, Martin Pool wrote:
> 
> You can get the current bzr development tree, stored in itself, by
> rsync:

I was thinking more of an exportable kernel tree in addition to the tool.

The reason I mention that is just that I know several SCM's bog down under 
load horribly, so it actually matters what the size of the tree is.

And I'm absolutely _not_ asking you for the 60,000 changesets that are in
the BK tree, I'd be perfectly happy with a 2.6.12-rc2-based one for
testing.

I know I can import things myself, but the reason I ask is because I've
got several SCM's I should check out _and_ I've been spending the last two
days writing my own fallback system so that I don't get screwed if nothing
out there works right now. 

Which is why I'd love to hear from people who have actually used various 
SCM's with the kernel. There's bound to be people who have already tried.

I've gotten a lot of email of the kind "I love XYZ, you should try it 
out", but so far I've not seen anybody say "I've tracked the kernel with 
XYZ, and it does ..."

So, this is definitely not a "Martin Pool should do this" kind of issue: 
I'd like many people to test out many alternatives, to get a feel for 
where they are especially for a project the size of the kernel..

		Linus

Path: g2news1.google.com!news3.google.com!news2.google.com!proxad.net!
news.newsland.it!news.cdlan.net!erode.bofh.it!bofh.it!news.nic.it!robomod
From: Andrea Arcangeli <and...@suse.de>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 09:30:19 +0200
Message-ID: <3QW6v-7zx-41@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QT8r-4R3-5@gated-at.bofh.it> 
<3QTBE-5d5-95@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
X-Gpg-Key: 1024D/68B9CB43 13D9 8355 295F 4823 7C49  C012 DFA1 686E 68B9 CB43
User-Agent: Mutt/1.5.9i
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 128
Organization: linux.* mail to news gateway
X-Original-Cc: Chris Wedgwood <c...@f00f.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Fri, 8 Apr 2005 09:14:28 +0200
X-Original-Message-ID: <20050408071428.GB3957@opteron.random>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org> 
<20050408041341.GA8...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504072127250.28...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

On Thu, Apr 07, 2005 at 09:42:04PM -0700, Linus Torvalds wrote:
> play with something _really_ nasty (but also very _very_ fast), take a
> look at kernel.org:/pub/linux/kernel/people/torvalds/.

Why not to use sql as backend instead of the tree of directories? That solves
userland journaling too (really one still has to be careful to know the
read-committed semantics of sql, which is not obvious stuff, but 99% of
common cases like this one just works safe automatically since all
inserts/delete/update are always atomic).

You can keep the design of your db exactly the same and even the command line
of your script the same, except you won't have to deal with the implementation of
it anymore, and the end result may run even faster with proper btrees and you
won't have scalability issues if the directory of hashes fills up, and it'll
get userland journaling, live backups, runtime analyses of your queries with
genetic algorithms (pgsql 8 seems to have it) etc...

I seem to recall there's a way to do delayed commits too, so you won't
be synchronous, but you'll still have journaling. You clearly don't care
about synchronous writes; all you care about is that the commit is
either committed completely or not committed at all (i.e. not a half
write of the patch that leaves your db corrupt).

Example:

CREATE TABLE patches (
	patch			BIGSERIAL	PRIMARY KEY,

	commiter_name		VARCHAR(32)	NOT NULL CHECK(commiter_name != ''),
	commiter_email		VARCHAR(32)	NOT NULL CHECK(commiter_email != ''),

	md5			CHAR(32)	NOT NULL CHECK(md5 != ''),
	len			INTEGER		NOT NULL CHECK(len > 0),
	UNIQUE(md5, len),

	payload			BYTEA		NOT NULL,

	timestamp		TIMESTAMP	NOT NULL
);
CREATE INDEX patches_md5_index ON patches (md5);
CREATE INDEX patches_timestamp_index ON patches (timestamp);

s/md5/sha1/, no difference.

This will automatically spawn fatal errors if there are hash collisions and it
enforces a bit of checking.
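
The effect of the UNIQUE(md5, len) constraint can be tried without a pgsql
server. The sketch below is an assumption-laden stand-in: it uses Python's
bundled sqlite3 module instead of pgsql, a cut-down version of the table above
with SQLite column types, and a made-up helper called insert_patch.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE patches (
    patch   INTEGER PRIMARY KEY,
    md5     TEXT    NOT NULL CHECK(md5 != ''),
    len     INTEGER NOT NULL CHECK(len > 0),
    payload BLOB    NOT NULL,
    UNIQUE(md5, len))""")

def insert_patch(data):
    """Insert one payload; the database enforces the constraints."""
    conn.execute("INSERT INTO patches (md5, len, payload) VALUES (?, ?, ?)",
                 (hashlib.md5(data).hexdigest(), len(data), data))

insert_patch(b"first patch")
try:
    insert_patch(b"first patch")     # same (md5, len) pair as before
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM patches").fetchone()[0]
print(rejected, row_count)
```

Note that the constraint fires for any repeated (md5, len) pair, so inserting
the very same data twice is rejected as well as a true collision; the caller
has to compare payloads to tell the two cases apart.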

Then you need a few lines of python to insert/lookup. Example for psycopg2:

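import hashlib, os, pwd, socket

[..]

# Dictionary keys match the column names of the patches table above.
pw = pwd.getpwuid(os.getuid())
patch = {'commiter_name': pw[4],
         'commiter_email': pw[0] + '@' + socket.getfqdn(),
         'md5': hashlib.md5(data).hexdigest(), 'len': len(data),
         'payload': data, 'timestamp': 'now'}
# Name the columns explicitly so the BIGSERIAL "patch" key is auto-assigned.
curs.execute("""INSERT INTO patches
                  (commiter_name, commiter_email, md5, len, payload, timestamp)
                  VALUES (%(commiter_name)s, %(commiter_email)s,
                  %(md5)s, %(len)s, %(payload)s, %(timestamp)s)""", patch)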

('now' will be evaluated by the sql server, who knows about the time too)

The speed I don't know for sure, but especially with lots of data the sql way
should at least not be significantly slower, pgsql scales with terabytes
without apparent problems (modulo the annoyance of running vacuum once per day
in cron, to avoid internal sequence number overflows after >4 giga
commits, and once per day the analyser too so it learns about your
usage patterns and can optimize the disk format for it).

For sure the python part isn't going to be noticeable, you can still write it
in C if you prefer (it'll clearly run faster if you want to run tons of
inserts for a benchmark), so then everything will run at bare-hardware
speed and there will be no time wasted interpreting bytecode (only the
sql commands have to be interpreted).

The backup should also be tiny (runtime size is going to be somewhat larger due
to the extra data structures it has, how much larger I don't know). I know for sure
this kind of setup works like a charm on ppc64 (32bit userland), and x86 (32bit
and 64bit userland).

monotone using sqlite sounds like a good idea in fact (IMHO they could use a real
dbms too, so that you also get parallelism and you could attach another app to
the backing store at the same time, or you could run a live backup and
get all the other high end performance features).

If you feel this is too bloated feel free to ignore this email of course! If
instead you'd like to give this a spin, let me know and I can help to
set it up quick (either today or from Monday).

I also like quick dedicated solutions and I was about to write a backing
store with a tree of dirs + hashes similar to yours for a similar
problem, but I gave it up while planning the userland journaling part
and even worse the userland fs locking with live backups, when a DBMS
gets everything right including live backups (and it provides async
interface too via sockets). OTOH for this usage journaling and locking
aren't a big issue since you may have the patch to hash by hand to find
any potentially half-corrupted bit after reboot and you probably run it
serially.

About your compression of the data, I don't think you want to do that.
The size of the live image isn't the issue, the issue is the size of the
_backups_ and you want to compress a huge thing (i.e. the tarball of
the cleartext, or the sql cleartext backup), not many tiny patches.

Comparing the size of the repositories isn't interesting, the
interesting thing is to compare the size of the backups.
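
The point about compressing one huge thing rather than many tiny patches can
be checked with zlib. This is a rough sketch on synthetic data: the 200
generated "patches" are hypothetical stand-ins, and real patches will give
different numbers.

```python
import zlib

# Hypothetical stand-ins for many small, similar patches.
patches = [
    ("--- a/drivers/file%d.c\n+++ b/drivers/file%d.c\n"
     "@@ -1,3 +1,3 @@\n-int x = %d;\n+int x = %d;\n" % (i, i, i, i + 1)).encode()
    for i in range(200)
]

# Compress each patch on its own, as a per-object store would.
individual_total = sum(len(zlib.compress(p)) for p in patches)

# Compress one big backup made of all the patches concatenated.
backup_size = len(zlib.compress(b"".join(patches)))

print(individual_total, backup_size)
```

The single stream can reuse text shared between patches, which compressing
each small patch separately cannot do.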

BTW, this fixed compilation for my system.

--- ./Makefile.orig	2005-04-08 09:07:17.000000000 +0200
+++ ./Makefile	2005-04-08 08:52:35.000000000 +0200
@@ -8,7 +8,7 @@ all: $(PROG)
 install: $(PROG)
 	install $(PROG) $(HOME)/bin/
 
-LIBS= -lssl
+LIBS= -lssl -lz
 
 init-db: init-db.o
 

Thanks.

Path: g2news1.google.com!news3.google.com!news.glorb.com!news.newsland.it!
news.ngi.it!bofh.it!news.nic.it!robomod
From: Andrea Arcangeli <and...@suse.de>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 10:50:10 +0200
Message-ID: <3QXlM-89-11@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QoHo-2b1-21@gated-at.bofh.it> 
<3QujJ-6NQ-7@gated-at.bofh.it> <3QujM-6NQ-23@gated-at.bofh.it> 
<3QuWy-7pk-23@gated-at.bofh.it> <3Qv69-7ve-13@gated-at.bofh.it> 
<3Qy4a-2jy-21@gated-at.bofh.it> <3QOLv-1qG-7@gated-at.bofh.it> 
<3QUHi-6n4-3@gated-at.bofh.it> <3QVtD-71j-19@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
X-Gpg-Key: 1024D/68B9CB43 13D9 8355 295F 4823 7C49  C012 DFA1 686E 68B9 CB43
User-Agent: Mutt/1.5.9i
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 40
Organization: linux.* mail to news gateway
X-Original-Cc: Martin Pool <m...@sourcefrog.net>,
	"linux-ker...@vger.kernel.org" <linux-ker...@vger.kernel.org>,
	David Lang <dl...@digitalinsight.com>
X-Original-Date: Fri, 8 Apr 2005 10:38:39 +0200
X-Original-Message-ID: <20050408083839.GC3957@opteron.random>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org> 
<20050406193911.GA11...@stingr.stingr.net> 
<pan.2005.04.07.01.40.20.998...@sourcefrog.net> 
<20050407014727.GA17...@havoc.gtf.org> 
<pan.2005.04.07.02.25.56.501...@sourcefrog.net> 
<Pine.LNX.4.62.0504061931560.10...@qynat.qvtvafvgr.pbz> 
<1112852302.29544.75.camel@hope> 
<Pine.LNX.4.58.0504071626290.28...@ppc970.osdl.org> 
<1112939769.29544.161.camel@hope> <Pine.LNX.4.58.0504072334310.28...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

On Thu, Apr 07, 2005 at 11:41:29PM -0700, Linus Torvalds wrote:
> I know I can import things myself, but the reason I ask is because I've
> got several SCM's I should check out _and_ I've been spending the last two
> days writing my own fallback system so that I don't get screwed if nothing
> out there works right now. 

I tend to like bzr too (and I tend to like too many things ;), but even
if the export of the data would be available it seems still too early in
development to be able to help you this week, it seems to miss any form
of network export too.

> I'd like many people to test out many alternatives, to get a feel for 
> where they are especially for a project the size of the kernel..

The huge number of changesets is the crucial point, there are good
distributed SCM already but they are apparently not efficient enough at
handling 60k changesets.

We'd need a regenerated coherent copy of BKCVS to pipe into those SCM to
evaluate how well they scale.

OTOH if your git project already allows storing the data in there,
that looks nice ;). I don't yet fully understand how the algorithms of
the trees are meant to work (I only understand well the backing store
and I tend to prefer DBMS over tree of dirs with hashes). So I've no
idea how it can plug in well for a SCM replacement or how you want to
use it. It seems a kind of fully lockless thing where you can merge from
one tree to the other without locks and where you can make quick diffs.
It looks similar to a diff -ur of two hardlinked trees, except this one
can save a lot of bandwidth to copy with rsync (i.e.  hardlinks become
worthless after using rsync in the network, but hashes not). Clearly the
DBMS couldn't use the rsync binary to distribute the objects, but a
network protocol could do the same thing rsync does.
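
What such a protocol would need to do for a non-merging pull can be sketched
as a set difference on object names. This is an illustration only: the stores
are modelled as plain dicts, and the names missing_objects and pull are made
up here, not part of git or rsync.

```python
def missing_objects(local_names, remote_names):
    """Object names the remote side has and the local side lacks."""
    return sorted(set(remote_names) - set(local_names))

def pull(local, remote):
    """Copy the missing objects from remote to local; return names copied."""
    copied = missing_objects(local, remote)
    for name in copied:
        local[name] = remote[name]
    return copied

# Hypothetical stores: dicts mapping object name -> content.
mine = {"aa11": b"tree v1"}
theirs = {"aa11": b"tree v1", "bb22": b"tree v2", "cc33": b"blob"}
first_pull = pull(mine, theirs)
second_pull = pull(mine, theirs)
print(first_pull, second_pull)
```

Since objects are named by their hash, an object already present never needs
to be sent again, so a repeated pull transfers nothing.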

Thanks.

Path: g2news1.google.com!news1.google.com!proxad.net!news.newsland.it!
nntp.infostrada.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 16:30:36 +0200
Message-ID: <3R2Fe-4KO-11@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QT8r-4R3-5@gated-at.bofh.it> 
<3QTBE-5d5-95@gated-at.bofh.it> <3QW6v-7zx-41@gated-at.bofh.it>
X-Original-To: Andrea Arcangeli <and...@suse.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 17
Organization: linux.* mail to news gateway
X-Original-Cc: Chris Wedgwood <c...@f00f.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Fri, 8 Apr 2005 07:26:08 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504080724550.28951@ppc970.osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
 <20050408041341.GA8...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504072127250.28...@ppc970.osdl.org>
 <20050408071428.GB3...@opteron.random>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Fri, 8 Apr 2005, Andrea Arcangeli wrote:
> 
> Why not to use sql as backend instead of the tree of directories?

Because it sucks? 

I can come up with millions of ways to slow things down on my own. Please 
come up with ways to speed things up instead.

		Linus

Path: g2news1.google.com!news3.google.com!news.glorb.com!news.newsland.it!
news.cdlan.net!erode.bofh.it!bofh.it!news.nic.it!robomod
From: Matthias-Christian Ott <matthias.christ...@tiscali.de>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 18:30:19 +0200
Message-ID: <3R4x5-6lQ-39@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QT8r-4R3-5@gated-at.bofh.it> 
<3QTBE-5d5-95@gated-at.bofh.it> <3QW6v-7zx-41@gated-at.bofh.it> 
<3R2Fe-4KO-11@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050406)
X-Accept-Language: en-us, en
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 32
Organization: linux.* mail to news gateway
X-Original-Cc: Andrea Arcangeli <and...@suse.de>, Chris Wedgwood <c...@f00f.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Fri, 08 Apr 2005 18:15:09 +0200
X-Original-Message-ID: <4256AE0D.201@tiscali.de>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org> 
<20050408041341.GA8...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504072127250.28...@ppc970.osdl.org> 
<20050408071428.GB3...@opteron.random> 
<Pine.LNX.4.58.0504080724550.28...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

Linus Torvalds wrote:

>On Fri, 8 Apr 2005, Andrea Arcangeli wrote:
>
>>Why not to use sql as backend instead of the tree of directories?
>
>Because it sucks? 
>
>I can come up with millions of ways to slow things down on my own. Please 
>come up with ways to speed things up instead.
>
>		Linus

SQL Databases like SQLite aren't slow.
But maybe a Berkeley Database v.4 is a better solution.

Matthias-Christian Ott

Path: g2news1.google.com!news3.google.com!news.glorb.com!solnet.ch!
solnet.ch!news-zh.switch.ch!switch.ch!newsfeed-0.progon.net!progon.net!
bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 19:20:09 +0200
Message-ID: <3R5jj-71G-15@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QT8r-4R3-5@gated-at.bofh.it> 
<3QTBE-5d5-95@gated-at.bofh.it> <3QW6v-7zx-41@gated-at.bofh.it> 
<3R2Fe-4KO-11@gated-at.bofh.it> <3R4x5-6lQ-39@gated-at.bofh.it>
X-Original-To: Matthias-Christian Ott <matthias.christ...@tiscali.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 20
Organization: linux.* mail to news gateway
X-Original-Cc: Andrea Arcangeli <and...@suse.de>, Chris Wedgwood <c...@f00f.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Fri, 8 Apr 2005 10:14:22 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504081010540.28951@ppc970.osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
 <20050408041341.GA8...@taniwha.stupidest.org> 
 <Pine.LNX.4.58.0504072127250.28...@ppc970.osdl.org>
 <20050408071428.GB3...@opteron.random> 
 <Pine.LNX.4.58.0504080724550.28...@ppc970.osdl.org>
 <4256AE0D....@tiscali.de>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Fri, 8 Apr 2005, Matthias-Christian Ott wrote:
>
> SQL Databases like SQLite aren't slow.

After applying a patch, I can do a complete "show-diff" on the kernel tree
to see the effect of it in about 0.15 seconds.

Also, I can use rsync to efficiently replicate my database without having 
to re-send the whole crap - it only needs to send the new stuff.

You do that with an sql database, and I'll be impressed.

		Linus

Path: g2news1.google.com!news3.google.com!newshub.sdsu.edu!erode.bofh.it!
bofh.it!news.nic.it!robomod
From: Chris Wedgwood <c...@f00f.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 19:30:19 +0200
Message-ID: <3R5t9-78n-29@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QT8r-4R3-5@gated-at.bofh.it> 
<3QTBE-5d5-95@gated-at.bofh.it> <3QW6v-7zx-41@gated-at.bofh.it> 
<3R2Fe-4KO-11@gated-at.bofh.it> <3R4x5-6lQ-39@gated-at.bofh.it> 
<3R5jj-71G-15@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
X-Orbl: [68.120.153.162]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 12
Organization: linux.* mail to news gateway
X-Original-Cc: Matthias-Christian Ott <matthias.christ...@tiscali.de>,
	Andrea Arcangeli <and...@suse.de>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Fri, 8 Apr 2005 10:15:18 -0700
X-Original-Message-ID: <20050408171518.GA4201@taniwha.stupidest.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org> 
<20050408041341.GA8...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504072127250.28...@ppc970.osdl.org> 
<20050408071428.GB3...@opteron.random> 
<Pine.LNX.4.58.0504080724550.28...@ppc970.osdl.org> 
<4256AE0D....@tiscali.de> <Pine.LNX.4.58.0504081010540.28...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

On Fri, Apr 08, 2005 at 10:14:22AM -0700, Linus Torvalds wrote:

> After applying a patch, I can do a complete "show-diff" on the kernel tree
> to see the effect of it in about 0.15 seconds.

How does that work?  Can you stat the entire tree in that time?  I
measure it as being higher than that.

Path: g2news1.google.com!news1.google.com!proxad.net!news.newsland.it!
area.cu.mi.it!bofh.it!news.nic.it!robomod
From: Matthias-Christian Ott <matthias.christ...@tiscali.de>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 19:40:06 +0200
Message-ID: <3R5CC-7ev-1@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QT8r-4R3-5@gated-at.bofh.it> 
<3QTBE-5d5-95@gated-at.bofh.it> <3QW6v-7zx-41@gated-at.bofh.it> 
<3R2Fe-4KO-11@gated-at.bofh.it> <3R4x5-6lQ-39@gated-at.bofh.it> 
<3R5jj-71G-15@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050406)
X-Accept-Language: en-us, en
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 35
Organization: linux.* mail to news gateway
X-Original-Cc: Andrea Arcangeli <and...@suse.de>, Chris Wedgwood <c...@f00f.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Fri, 08 Apr 2005 19:25:17 +0200
X-Original-Message-ID: <4256BE7D.5040308@tiscali.de>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org> 
<20050408041341.GA8...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504072127250.28...@ppc970.osdl.org> 
<20050408071428.GB3...@opteron.random> 
<Pine.LNX.4.58.0504080724550.28...@ppc970.osdl.org> 
<4256AE0D....@tiscali.de> <Pine.LNX.4.58.0504081010540.28...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

Linus Torvalds wrote:

>On Fri, 8 Apr 2005, Matthias-Christian Ott wrote:
>  
>
>>SQL Databases like SQLite aren't slow.
>>    
>>
>
>After applying a patch, I can do a complete "show-diff" on the kernel tree
>to see the effect of it in about 0.15 seconds.
>
>Also, I can use rsync to efficiently replicate my database without having 
>to re-send the whole crap - it only needs to send the new stuff.
>
>You do that with an sql database, and I'll be impressed.
>
>		Linus
>
>  
>
Ok, but if you want to search for information in such big text files it 
is slow, because you do a linear search -- most databases use faster search 
algorithms like hashing. And if you have multiple files (I don't know if 
your system uses multiple files (like bitkeeper) or not), each of which 
needs a system call to be opened, this will be very slow, because system 
calls themselves are slow. And using rsync is also possible, because most 
databases store their information as plain text with meta information.

Matthias-Christian Ott

Path: g2news1.google.com!news3.google.com!news2.google.com!proxad.net!
news.newsland.it!area.cu.mi.it!bofh.it!news.nic.it!robomod
From: Jeff Garzik <jgar...@pobox.com>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 19:50:23 +0200
Message-ID: <3R5Mz-7lv-27@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QT8r-4R3-5@gated-at.bofh.it> 
<3QTBE-5d5-95@gated-at.bofh.it> <3QW6v-7zx-41@gated-at.bofh.it> 
<3R2Fe-4KO-11@gated-at.bofh.it> <3R4x5-6lQ-39@gated-at.bofh.it> 
<3R5jj-71G-15@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) 
Gecko/20050328 Fedora/1.7.6-1.2.5
X-Accept-Language: en-us, en
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Warning: 24.25.22.197 is listed at orbz.gst-group.uk.com
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 30
Organization: linux.* mail to news gateway
X-Original-Cc: Matthias-Christian Ott <matthias.christ...@tiscali.de>,
	Andrea Arcangeli <and...@suse.de>, Chris Wedgwood <c...@f00f.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Fri, 08 Apr 2005 13:35:52 -0400
X-Original-Message-ID: <4256C0F8.6030008@pobox.com>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org> 
<20050408041341.GA8...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504072127250.28...@ppc970.osdl.org> 
<20050408071428.GB3...@opteron.random> 
<Pine.LNX.4.58.0504080724550.28...@ppc970.osdl.org> 
<4256AE0D....@tiscali.de> <Pine.LNX.4.58.0504081010540.28...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

Linus Torvalds wrote:
> 
> On Fri, 8 Apr 2005, Matthias-Christian Ott wrote:
> 
>>SQL Databases like SQLite aren't slow.
> 
> 
> After applying a patch, I can do a complete "show-diff" on the kernel tree
> to see the effect of it in about 0.15 seconds.
> 
> Also, I can use rsync to efficiently replicate my database without having 
> to re-send the whole crap - it only needs to send the new stuff.
> 
> You do that with an sql database, and I'll be impressed.

Well...  it took me over 30 seconds just to 'rm -rf' the unpacked 
tarballs of git and sparse-git, over my LAN's NFS.

Granted that this sort of stuff works well with (a) rsync and (b) 
hardlinks, but it's still punishment on the i/dcache.

	Jeff




Path: g2news1.google.com!news3.google.com!news2.google.com!proxad.net!
news.newsland.it!news.ngi.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 20:00:20 +0200
Message-ID: <3R5Wc-7sj-53@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QT8r-4R3-5@gated-at.bofh.it> 
<3QTBE-5d5-95@gated-at.bofh.it> <3QW6v-7zx-41@gated-at.bofh.it> 
<3R2Fe-4KO-11@gated-at.bofh.it> <3R4x5-6lQ-39@gated-at.bofh.it> 
<3R5jj-71G-15@gated-at.bofh.it> <3R5t9-78n-29@gated-at.bofh.it>
X-Original-To: Chris Wedgwood <c...@f00f.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 35
Organization: linux.* mail to news gateway
X-Original-Cc: Matthias-Christian Ott <matthias.christ...@tiscali.de>,
	Andrea Arcangeli <and...@suse.de>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Fri, 8 Apr 2005 10:46:40 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504081037310.28951@ppc970.osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
 <20050408041341.GA8...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504072127250.28...@ppc970.osdl.org>
 <20050408071428.GB3...@opteron.random> 
<Pine.LNX.4.58.0504080724550.28...@ppc970.osdl.org>
 <4256AE0D....@tiscali.de> <Pine.LNX.4.58.0504081010540.28...@ppc970.osdl.org>
 <20050408171518.GA4...@taniwha.stupidest.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Fri, 8 Apr 2005, Chris Wedgwood wrote:
> On Fri, Apr 08, 2005 at 10:14:22AM -0700, Linus Torvalds wrote:
> 
> > After applying a patch, I can do a complete "show-diff" on the kernel tree
> > to see the effect of it in about 0.15 seconds.
> 
> How does that work?  Can you stat the entire tree in that time?  I
> measure it as being higher than that.

I can indeed stat the entire tree in that time (assuming it's in memory,
of course, but my kernel trees are _always_ in memory ;), but in order to
do so, I have to be good at finding the names to stat.

In particular, you have to be extremely careful. You need to make sure 
that you don't stat anything you don't need to. We're not talking just 
blindly recursing the tree here, and that's exactly the point. You have to 
know what you're doing, but the whole point of keeping track of directory 
contents is that dammit, that's your whole job.

Anybody who can't list the files they work on _instantly_ is doing 
something damn wrong. 

"git" is really trivial, written in four days. Most of that was not 
actually spent coding, but thinking about the data structures.

			Linus



Path: g2news1.google.com!news1.google.com!newsread.com!news-xfer.newsread.com!
logbridge.uoregon.edu!xmission!nntp.infostrada.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 20:20:07 +0200
Message-ID: <3R6fl-7Qs-1@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QT8r-4R3-5@gated-at.bofh.it> 
<3QTBE-5d5-95@gated-at.bofh.it> <3QW6v-7zx-41@gated-at.bofh.it> 
<3R2Fe-4KO-11@gated-at.bofh.it> <3R4x5-6lQ-39@gated-at.bofh.it> 
<3R5jj-71G-15@gated-at.bofh.it> <3R5CC-7ev-1@gated-at.bofh.it>
X-Original-To: Matthias-Christian Ott <matthias.christ...@tiscali.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 89
Organization: linux.* mail to news gateway
X-Original-Cc: Andrea Arcangeli <and...@suse.de>, Chris Wedgwood <c...@f00f.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Fri, 8 Apr 2005 11:14:11 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504081047200.28951@ppc970.osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
 <20050408041341.GA8...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504072127250.28...@ppc970.osdl.org>
 <20050408071428.GB3...@opteron.random> 
<Pine.LNX.4.58.0504080724550.28...@ppc970.osdl.org>
 <4256AE0D....@tiscali.de> <Pine.LNX.4.58.0504081010540.28...@ppc970.osdl.org>
 <4256BE7D.5040...@tiscali.de>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Fri, 8 Apr 2005, Matthias-Christian Ott wrote:
>
> Ok, but if you want to search for information in such big text files it 
> is slow, because you do a linear search 

No I don't. I don't search for _anything_. I have my own
content-addressable filesystem, and I guarantee you that it's faster than
mysql, because it depends on the kernel doing the right thing (which it
does).

I never do a single "readdir". It's all direct data lookup, no "searching"  
anywhere.

Databases aren't magical. Quite the reverse. They easily end up being
_slower_ than doing it by hand, simply because they have to solve a much
more generic issue. If you design your data structures and abstractions
right, a database is pretty much guaranteed to only incur overhead.

The advantage of a database is the abstraction and management it gives 
you. But I did my own special-case abstraction in git.

Yeah, I bet "git" might suck if your OS sucks. I definitely depend on name
caching at an OS level so that I know that opening a file is fast. In
other words, there _is_ an indexing and caching database in there, and
it's called the Linux VFS layer and the dentry cache.

The proof is in the pudding. git is designed for _one_ thing, and one 
thing only: tracking a series of directory states in a way that can be 
replicated. It's very very fast at that. A database with a more flexible 
abstraction might be faster at other things, but the fact is, you do take a 
hit.

The problems with databases are:

 - they are damn hard to just replicate wildly and without control. The 
   database backing file inherently has a lot of internal state. You may 
   be able to "just copy it", but you have to copy the whole damn thing.

   In "git", the data is all there in immutable blobs that you can just 
   rsync. In fact, you don't even need rsync: you can just look at the 
   filenames, and anything new you copy. No need for any fancy "read the 
   files to see that they match". They _will_ match, or you can tell 
   immediately that a file is corrupt.

   Look at this:

	torvalds@ppc970:~/git> sha1sum .dircache/objects/e7/bfaadd5d2331123663a8f14a26604a3cdcb678
	e7bfaadd5d2331123663a8f14a26604a3cdcb678  .dircache/objects/e7/bfaadd5d2331123663a8f14a26604a3cdcb678

   see a pattern anywhere? Imagine that you know the list of files you 
   have, and the list of files the other side has (never mind the 
   contents), and how _easy_ it is to synchronize. Without ever having to 
   even read the remote files that you know you already have.

   How do you replicate your database incrementally? I've given you enough 
   clues to do it for "git" in probably five lines of perl.

 - they tend to take time to set up and prime.

   In contrast, the filesystem is always there. Sure, you effectively have 
   to "prime" that one too, but the thing is, if your OS is doing its job, 
   you basically only need to prime it once per reboot. No need to prime 
   it for each process you start or play games with connecting to servers 
   etc. It's just there. Always. 
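
[ As a rough illustration of the synchronization idea in the first point
above, here is a minimal Python sketch. It is not git's actual code: the
helper names list_objects, verify_object and sync_objects are made up for
this example, and it assumes that both sides are plain local directories
laid out like .dircache/objects/, with every file named by the SHA1 of
its own bytes, as in the sha1sum output shown. ]

```python
import hashlib
import os
import shutil


def list_objects(root):
    """Return the set of object names (relative paths like 'e7/bfaa...') under root."""
    names = set()
    for sub in os.listdir(root):
        subdir = os.path.join(root, sub)
        if os.path.isdir(subdir):
            for name in os.listdir(subdir):
                names.add(sub + "/" + name)
    return names


def verify_object(root, name):
    """An object is intact if the SHA1 of the file's bytes equals its name."""
    with open(os.path.join(root, name), "rb") as f:
        digest = hashlib.sha1(f.read()).hexdigest()
    return digest == name.replace("/", "")


def sync_objects(src, dst):
    """Copy every object present in src but absent from dst; return the names copied."""
    missing = sorted(list_objects(src) - list_objects(dst))
    for name in missing:
        target = os.path.join(dst, name)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copyfile(os.path.join(src, name), target)
    return missing
```

Only the two name lists are compared. Objects that already exist on the
destination are never opened, and a copied object can be checked
afterwards with verify_object.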

So if you think of the filesystem as a database, you're all set. If you 
design your data structure so that there is just one index, you make that 
the name, and the kernel will do all the O(1) hashed lookups etc for you. 
You do have to limit yourself in some ways. 
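
[ To make the "one index, and that index is the name" idea concrete,
here is a small Python sketch of a content-addressed store. It is an
illustration under stated assumptions, not git's implementation: the
function names object_path, put_object and get_object are hypothetical,
and the object is assumed to be the zlib-compressed data, named by the
SHA1 of the bytes that end up on disk. ]

```python
import hashlib
import os
import zlib


def object_path(root, name):
    """Map a 40-character hex name to root/xx/yyyy...; the name is the only index."""
    return os.path.join(root, name[:2], name[2:])


def put_object(root, data):
    """Store data compressed; the name is the SHA1 of the bytes written to disk."""
    compressed = zlib.compress(data)
    name = hashlib.sha1(compressed).hexdigest()
    path = object_path(root, name)
    if not os.path.exists(path):      # objects are immutable, so never rewrite one
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(compressed)
    return name


def get_object(root, name):
    """Direct lookup by name: one open(), no directory scan, no search."""
    with open(object_path(root, name), "rb") as f:
        compressed = f.read()
    if hashlib.sha1(compressed).hexdigest() != name:
        raise ValueError("corrupt object: " + name)
    return zlib.decompress(compressed)
```

The lookup in get_object builds the path straight from the name and lets
the kernel's name cache do the rest, which is the limitation mentioned
above: you can only find an object if you already know its name.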

Oh, and you have to be willing to waste diskspace. "git" is _not_
space-efficient. The good news is that it is cache-friendly, since it is
also designed to never actually look at any old files that aren't part of
the immediate history, so while it wastes diskspace, it does not waste the
(much more precious) page cache.

IOW big file-sets are always bad for performance if you need to traverse
them to get anywhere, but if you index things so that you only read the 
stuff you really really _need_ (which git does), big file-sets are just an 
excuse to buy a new disk ;)

			Linus

Path: g2news1.google.com!news3.google.com!news.glorb.com!news.newsland.it!
area.cu.mi.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 20:50:10 +0200
Message-ID: <3R6Iq-89Y-21@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QT8r-4R3-5@gated-at.bofh.it> 
<3QTBE-5d5-95@gated-at.bofh.it> <3QW6v-7zx-41@gated-at.bofh.it> 
<3R2Fe-4KO-11@gated-at.bofh.it> <3R4x5-6lQ-39@gated-at.bofh.it> 
<3R5jj-71G-15@gated-at.bofh.it> <3R5Mz-7lv-27@gated-at.bofh.it>
X-Original-To: Jeff Garzik <jgar...@pobox.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 114
Organization: linux.* mail to news gateway
X-Original-Cc: Matthias-Christian Ott <matthias.christ...@tiscali.de>,
	Andrea Arcangeli <and...@suse.de>, Chris Wedgwood <c...@f00f.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Fri, 8 Apr 2005 11:47:10 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504081114220.28951@ppc970.osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
 <20050408041341.GA8...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504072127250.28...@ppc970.osdl.org>
 <20050408071428.GB3...@opteron.random> 
<Pine.LNX.4.58.0504080724550.28...@ppc970.osdl.org>
 <4256AE0D....@tiscali.de> <Pine.LNX.4.58.0504081010540.28...@ppc970.osdl.org>
 <4256C0F8.6030...@pobox.com>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Fri, 8 Apr 2005, Jeff Garzik wrote:
> 
> Well...  it took me over 30 seconds just to 'rm -rf' the unpacked 
> tarballs of git and sparse-git, over my LAN's NFS.

Don't use NFS for development. It sucks for BK too. 

That said, normal _use_ should actually be pretty efficient even over NFS.  
It will "stat" a hell of a lot of files to do the "show-diff", but that
part you really can't avoid unless you depend on all the tools marking
their changes somewhere. Which BK does, actually, but that was pretty
painful, and means that bk needed to re-implement all the normal ops (ie
"bk patch").

What's also nice is that exactly because "git" depends on totally 
immutable files, they actually cache very well over NFS. Even if you were 
to share a database across machines (which is _not_ what git is meant to 
do, but it's certainly possible). 

So I actually suspect that if you actually _work_ with a tree in "git", 
you will find performance very good indeed. The fact that it creates a 
number of files when you pull in a new repository is a different thing.

> Granted that this sort of stuff works well with (a) rsync and (b) 
> hardlinks, but it's still punishment on the i/dcache.

Actually, it's not. Not once it is set up. Exactly because "git" doesn't
actually access those files unless it literally needs the data in one
file, and then it's always set up so that it needs either none or _all_ of
the file. There is no data sharing anywhere, so you are never in the
situation where it needs "ten bytes from file X" and "25 bytes from file
Y".

For example, if you don't have any changes in your tree, there is exactly 
_one_ file that a "show-diff" will read: the .dircache/index file. That's 
it. After that, it will "stat()" exactly the files you are tracking, and 
nothing more. It will not touch any internal "git" data AT ALL. That 
"stat" will be somewhat expensive unless your client caches stat data too, 
but that's it.

And if it turns out that you have changed a file (or even just touched it, 
so that the data is the same, but the index file can no longer guarantee 
it with just a single "stat()"), then git will open exactly _one_ 
file (no searching, no messing around), which contains absolutely nothing 
except for the compressed (and SHA1-signed) old contents of the file. It 
obviously _has_ to do that, because in order to know whether you've 
changed it, it needs to now compare it to the original.

IOW, "git" will literally touch the minimum IO necessary, and absolutely 
minimum cache-footprint.
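
[ The stat-only check described above can be sketched in a few lines of
Python. This is an illustration, not the real show-diff: the in-memory
dictionary stands in for the .dircache/index file, the names snapshot,
build_index and possibly_changed are made up, and comparing just mtime
and size is an assumption about which stat fields get recorded. ]

```python
import os


def snapshot(path):
    """Record the stat fields used to decide whether a file may have changed."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def build_index(paths):
    """Hypothetical in-memory index: tracked path -> stat snapshot."""
    return {path: snapshot(path) for path in paths}


def possibly_changed(index):
    """stat() exactly the tracked files; report those whose stat data differs."""
    changed = []
    for path in sorted(index):
        try:
            current = snapshot(path)
        except FileNotFoundError:
            changed.append(path)      # tracked file was removed
            continue
        if current != index[path]:
            changed.append(path)      # only these need a content comparison
    return changed
```

Nothing is read unless a stat mismatch shows up. Only the paths returned
by possibly_changed would need their stored object opened and compared.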

The fact is, when tracking the 17,000 files in the kernel directory, most
of them are never actually changed. They literally are "free". They aren't
brought into the cache by "git" - not the file itself, not the backing
store. You set up the index file once, and you never ever touch them
again.  You could put the sha1 files on a tape, for all git cares.

The one exception obviously being when you actually instantiate the git 
archive for the first time (or when you throw it away). At that time you 
do touch all of the data, but that should be the only time.

THAT is what git is good at. It is optimized for the "not a lot of changes"
case, and pretty much all the operations are O(n) in the "size of
change", not in "size of repo".

That includes even things like "give me the diff between the top of tree
and the tree 10 days ago". If you know what your head was 10 days ago,
"git" will open exactly _four_ small files for this operation (the current
"top"  commit, the commit file of ten days ago, and the two "tree" files
associated with those). It will then need to open the backing store files 
for the files that are different between the two versions, but IT WILL 
NEVER EVEN LOOK at the files that it immediately sees are the same.

And that's actually true whether we're talking about the top-of-tree or
not. If I had the kernel history in git format (I don't - I estimate that
it would be about 1.5GB - 2GB in size, and would take me about ten days to
extract from BK ;), I could do a diff between _any_ tagged version (and I
mention "tagged" only as a way to look up the commit ID - it doesn't have
to be tagged if you know it some other way) in O(n) where 'n' is the
number of files that have changed between the revisions.

Number of changesets doesn't matter. Number of files doesn't matter. The
_only_ thing that matters is the size of the change.

Btw, I don't actually have a git command to do this yet. A bit of
scripting required to do it, but it's pretty trivial: you open the two
"commit" files that are the beginning/end of the thing, you look up what
the tree state was at each point, you open up the two tree files involved,
and you ignore all entries that match.

Since the tree files are already sorted, that "ignoring matches" is
basically free (technically that's O(n) in the number of files described,
but we're talking about something that even a slow machine can do so fast
you probably can't even time it with a stop-watch). You now have the 
complete list of files that have been changed (removed, added or "exists 
in both trees, but different contents"), and you can thus trivially create 
the whole tree by opening up _only_ the indexes for those files.
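
[ The "ignore all entries that match" step is a plain merge walk over
two sorted lists. A minimal Python sketch follows. It assumes each tree
has already been read into a list of (path, sha1) pairs sorted by path,
which is a simplification for illustration and not the on-disk tree
format, and the function name diff_trees is made up. ]

```python
def diff_trees(old, new):
    """Merge-walk two lists of (path, sha1) pairs, each sorted by path.

    Returns (status, path) pairs, where status is 'removed', 'added' or 'changed'.
    """
    result = []
    i = j = 0
    while i < len(old) and j < len(new):
        old_path, old_sha = old[i]
        new_path, new_sha = new[j]
        if old_path == new_path:
            if old_sha != new_sha:
                result.append(("changed", old_path))
            i += 1
            j += 1
        elif old_path < new_path:
            result.append(("removed", old_path))
            i += 1
        else:
            result.append(("added", new_path))
            j += 1
    # Whatever is left on either side exists in only one of the trees.
    result.extend(("removed", path) for path, _ in old[i:])
    result.extend(("added", path) for path, _ in new[j:])
    return result
```

Entries whose name and SHA1 both match are skipped without touching any
object, so only the paths in the result need their backing files opened.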

Ergo: O(n) in size of change. Both in work and in disk/cache access (where
the latter is often the more important one). Absolutely _zero_ indirection
anywhere apart from the initial stage to go from "commit" to "tree", so
there's no seeking except to actually read the files once you know what
they are (and since you know them up-front and there are no dependencies
at that point, you could even tell the OS to prefetch them if you really
cared about getting minimal disk seeks).

		Linus

Path: g2news1.google.com!news1.google.com!proxad.net!news.newsland.it!
news.ngi.it!bofh.it!news.nic.it!robomod
From: Matthias-Christian Ott <matthias.christ...@tiscali.de>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 21:20:08 +0200
Message-ID: <3R7bq-aV-7@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QT8r-4R3-5@gated-at.bofh.it> 
<3QTBE-5d5-95@gated-at.bofh.it> <3QW6v-7zx-41@gated-at.bofh.it> 
<3R2Fe-4KO-11@gated-at.bofh.it> <3R4x5-6lQ-39@gated-at.bofh.it> 
<3R5jj-71G-15@gated-at.bofh.it> <3R5CC-7ev-1@gated-at.bofh.it> 
<3R6fl-7Qs-1@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050406)
X-Accept-Language: en-us, en
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 134
Organization: linux.* mail to news gateway
X-Original-Cc: Andrea Arcangeli <and...@suse.de>, Chris Wedgwood <c...@f00f.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Fri, 08 Apr 2005 21:16:12 +0200
X-Original-Message-ID: <4256D87C.5090207@tiscali.de>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org> 
<20050408041341.GA8...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504072127250.28...@ppc970.osdl.org> 
<20050408071428.GB3...@opteron.random> 
<Pine.LNX.4.58.0504080724550.28...@ppc970.osdl.org> 
<4256AE0D....@tiscali.de> <Pine.LNX.4.58.0504081010540.28...@ppc970.osdl.org> 
<4256BE7D.5040...@tiscali.de> <Pine.LNX.4.58.0504081047200.28...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

Linus Torvalds wrote:

>On Fri, 8 Apr 2005, Matthias-Christian Ott wrote:
>  
>
>>Ok, but if you want to search for information in such big text files it 
>>is slow, because you do a linear search 
>>    
>>
>
>No I don't. I don't search for _anything_. I have my own
>content-addressable filesystem, and I guarantee you that it's faster than
>mysql, because it depends on the kernel doing the right thing (which it
>does).
>
>  
>
I'm not talking about mysql, I'm talking about fast databases like 
sqlite or db4.

>I never do a single "readdir". It's all direct data lookup, no "searching"  
>anywhere.
>
>Databases aren't magical. Quite the reverse. They easily end up being
>_slower_ than doing it by hand, simply because they have to solve a much
>more generic issue. If you design your data structures and abstractions
>right, a database is pretty much guaranteed to only incur overhead.
>
>The advantage of a database is the abstraction and management it gives 
>you. But I did my own special-case abstraction in git.
>
>Yeah, I bet "git" might suck if your OS sucks. I definitely depend on name
>caching at an OS level so that I know that opening a file is fast. In
>other words, there _is_ an indexing and caching database in there, and
>it's called the Linux VFS layer and the dentry cache.
>
>The proof is in the pudding. git is designed for _one_ thing, and one 
>thing only: tracking a series of directory states in a way that can be 
>replicated. It's very very fast at that. A database with a more flexible 
>abstraction might be faster at other things, but the fact is, you do take a 
>hit.
>
>The problems with databases are:
>
> - they are damn hard to just replicate wildly and without control. The 
>   database backing file inherently has a lot of internal state. You may 
>   be able to "just copy it", but you have to copy the whole damn thing.
>  
>
This is _not_ true for every database (especially plain-text databases 
with meta information).

>   In "git", the data is all there in immutable blobs that you can just 
>   rsync. In fact, you don't even need rsync: you can just look at the 
>   filenames, and anything new you copy. No need for any fancy "read the 
>   files to see that they match". They _will_ match, or you can tell 
>   immediately that a file is corrupt.
>
>   Look at this:
>
>	torvalds@ppc970:~/git> sha1sum .dircache/objects/e7/bfaadd5d2331123663a8f14a26604a3cdcb678
>	e7bfaadd5d2331123663a8f14a26604a3cdcb678  .dircache/objects/e7/bfaadd5d2331123663a8f14a26604a3cdcb678
>
>   see a pattern anywhere? Imagine that you know the list of files you 
>   have, and the list of files the other side has (never mind the 
>   contents), and how _easy_ it is to synchronize. Without ever having to 
>   even read the remote files that you know you already have.
>   How do you replicate your database incrementally? I've given you enough 
>   clues to do it for "git" in probably five lines of perl.
>  
>
I replicate my database incrementally by using a hash list, like you do: 
the client sends its hash list, the server compares the lists and tells 
the client after which entry (entry = hash + data) the new data has to be 
added. This is like your solution -- you also submit the data and the 
location (you have directories too, right?). A database is in some cases 
(like this one) like a filesystem, but it's built on top of a better 
filesystem like xfs, reiser4 or ext3, which supports features like LVM, 
quotas or journaling. (Is your filesystem also built on top of an existing 
filesystem? I don't think so, because you're talking about VFS operations 
on the filesystem.)

> - they tend to take time to set up and prime.
>
>   In contrast, the filesystem is always there. Sure, you effectively have 
>   to "prime" that one too, but the thing is, if your OS is doing its job, 
>   you basically only need to prime it once per reboot. No need to prime 
>   it for each process you start or play games with connecting to servers 
>   etc. It's just there. Always.
>  
>
The database -- single file (sqlite or db4) -- is always there too 
because it's on the filesystem and doesn't need a server.

>So if you think of the filesystem as a database, you're all set. If you 
>design your data structure so that there is just one index, you make that 
>the name, and the kernel will do all the O(1) hashed lookups etc for you. 
>You do have to limit yourself in some ways. 
>  
>
But as mentioned you need to _open_ each file (It doesn't matter if it's 
cached (this speeds up only reading it) -- you need a _slow_ system call 
and _very slow_ hardware access anyway).
Have a look at this comparison:
If you have one big chest and lots of small chests containing the same 
amount of gold, it's more work to collect the gold from the small chests 
than from the big one (which would contain as many compartments as there 
are small chests). You can find your gold faster because you don't have 
to walk to the other chests, and you don't have to open as many lids, 
which also saves time.

>Oh, and you have to be willing to waste diskspace. "git" is _not_
>space-efficient. The good news is that it is cache-friendly, since it is
>also designed to never actually look at any old files that aren't part of
>the immediate history, so while it wastes diskspace, it does not waste the
>(much more precious) page cache.
>
>IOW big file-sets are always bad for performance if you need to traverse
>them to get anywhere, but if you index things so that you only read the 
>stuff you really really _need_ (which git does), big file-sets are just an 
>excuse to buy a new disk ;)
>
>			Linus
>
>  
>
I hope my idea/opinion is clear now.

Matthias-Christian
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Path: g2news1.google.com!news1.google.com!proxad.net!news.newsland.it!
news.ngi.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 21:40:09 +0200
Message-ID: <3R7uN-pj-17@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QT8r-4R3-5@gated-at.bofh.it> 
<3QTBE-5d5-95@gated-at.bofh.it> <3QW6v-7zx-41@gated-at.bofh.it> 
<3R2Fe-4KO-11@gated-at.bofh.it> <3R4x5-6lQ-39@gated-at.bofh.it> 
<3R5jj-71G-15@gated-at.bofh.it> <3R5CC-7ev-1@gated-at.bofh.it> 
<3R6fl-7Qs-1@gated-at.bofh.it> <3R7bq-aV-7@gated-at.bofh.it>
X-Original-To: Matthias-Christian Ott <matthias.christ...@tiscali.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 20
Organization: linux.* mail to news gateway
X-Original-Cc: Andrea Arcangeli <and...@suse.de>, Chris Wedgwood <c...@f00f.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Fri, 8 Apr 2005 12:32:10 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504081231130.28951@ppc970.osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
 <20050408041341.GA8...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504072127250.28...@ppc970.osdl.org>
 <20050408071428.GB3...@opteron.random> 
<Pine.LNX.4.58.0504080724550.28...@ppc970.osdl.org>
 <4256AE0D....@tiscali.de> <Pine.LNX.4.58.0504081010540.28...@ppc970.osdl.org>
 <4256BE7D.5040...@tiscali.de> <Pine.LNX.4.58.0504081047200.28...@ppc970.osdl.org>
 <4256D87C.5090...@tiscali.de>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Fri, 8 Apr 2005, Matthias-Christian Ott wrote:
>
> But as mentioned you need to _open_ each file (It doesn't matter if it's 
> cached (this speeds up only reading it) -- you need a _slow_ system call 
> and _very slow_ hardware access anyway).

Nope. System calls aren't slow. What crappy OS are you running?
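
One way to put a number on this claim, as a rough sketch rather than a
proper benchmark: time a loop of open()/close() on a file that is already
cached. The helper name time_open_close and the iteration count are made up
for illustration, and the result depends entirely on the machine and OS.

```python
import os
import tempfile
import time


def time_open_close(path, iterations=10000):
    """Average wall-clock cost, in microseconds, of one open()+close()
    pair on `path`. Hypothetical helper, not part of git."""
    start = time.perf_counter()
    for _ in range(iterations):
        fd = os.open(path, os.O_RDONLY)
        os.close(fd)
    elapsed = time.perf_counter() - start
    return elapsed / iterations * 1e6


if __name__ == "__main__":
    # Create a small scratch file, measure, then clean up.
    fd, scratch = tempfile.mkstemp()
    os.write(fd, b"x")
    os.close(fd)
    try:
        print("%.2f us per open+close" % time_open_close(scratch))
    finally:
        os.remove(scratch)
```

The loop measures only the system-call path on a warm cache; it says
nothing about cold-disk access, which is a separate cost.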

> I hope my idea/opinion is clear now.

Numbers talk. I've got something that you can test ;)

		Linus

Path: g2news1.google.com!news3.google.com!news.glorb.com!news.newsland.it!
area.cu.mi.it!bofh.it!news.nic.it!robomod
From: Matthias-Christian Ott <matthias.christ...@tiscali.de>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 21:50:07 +0200
Message-ID: <3R7Er-vu-7@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QT8r-4R3-5@gated-at.bofh.it> 
<3QTBE-5d5-95@gated-at.bofh.it> <3QW6v-7zx-41@gated-at.bofh.it> 
<3R2Fe-4KO-11@gated-at.bofh.it> <3R4x5-6lQ-39@gated-at.bofh.it> 
<3R5jj-71G-15@gated-at.bofh.it> <3R5CC-7ev-1@gated-at.bofh.it> 
<3R6fl-7Qs-1@gated-at.bofh.it> <3R7bq-aV-7@gated-at.bofh.it> 
<3R7uN-pj-17@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050406)
X-Accept-Language: en-us, en
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 44
Organization: linux.* mail to news gateway
X-Original-Cc: Andrea Arcangeli <and...@suse.de>, Chris Wedgwood <c...@f00f.org>,
	Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Fri, 08 Apr 2005 21:44:39 +0200
X-Original-Message-ID: <4256DF27.5060607@tiscali.de>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org> 
<20050408041341.GA8...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504072127250.28...@ppc970.osdl.org> 
<20050408071428.GB3...@opteron.random> 
<Pine.LNX.4.58.0504080724550.28...@ppc970.osdl.org> 
<4256AE0D....@tiscali.de> <Pine.LNX.4.58.0504081010540.28...@ppc970.osdl.org> 
<4256BE7D.5040...@tiscali.de> <Pine.LNX.4.58.0504081047200.28...@ppc970.osdl.org> 
<4256D87C.5090...@tiscali.de> <Pine.LNX.4.58.0504081231130.28...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

Linus Torvalds wrote:

>On Fri, 8 Apr 2005, Matthias-Christian Ott wrote:
>  
>
>>But as mentioned you need to _open_ each file (It doesn't matter if it's 
>>cached (this speeds up only reading it) -- you need a _slow_ system call 
>>and _very slow_ hardware access anyway).
>>    
>>
>
>Nope. System calls aren't slow. What crappy OS are you running?
>
>  
>
But they're still slower, because there are several layers checking them.

>>I hope my idea/opinion is clear now.
>>    
>>
>
>Numbers talk. I've got something that you can test ;)
>  
>
That doesn't mean it's better just because you had the time to develop it 
;). But anyhow, people need something they can test to see whether it's 
good or not; most don't believe in concepts.

>		Linus
>
>  
>
We will see which solution wins the "race". But I think your solution 
will "win", because you're Linus Torvalds -- the "Boss" of Linux -- and 
you have to work with this system every day (usually people use what 
they have developed :)), and I don't have the time to develop a 
database-based solution (maybe someone else is interested in developing 
it).

Matthias-Christian

Path: g2news1.google.com!news3.google.com!news.glorb.com!news.newsland.it!
area.cu.mi.it!bofh.it!news.nic.it!robomod
From: L...@unix-os.sc.intel.com, Tony <tony.l...@intel.com>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 23:00:24 +0200
Message-ID: <3R8Kr-1qK-67@gated-at.bofh.it>
References: <3QT8r-4R3-5@gated-at.bofh.it> <3QTBE-5d5-95@gated-at.bofh.it> 
<3QW6v-7zx-41@gated-at.bofh.it> <3R2Fe-4KO-11@gated-at.bofh.it> 
<3R4x5-6lQ-39@gated-at.bofh.it> <3R5jj-71G-15@gated-at.bofh.it> 
<3R5t9-78n-29@gated-at.bofh.it> <3R5Wc-7sj-53@gated-at.bofh.it> 
<3R65O-7K6-29@gated-at.bofh.it> <3R71L-4S-1@gated-at.bofh.it> 
<3R7bt-aV-19@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
X-Scanned-By: MIMEDefang 2.44
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 13
Organization: linux.* mail to news gateway
X-Original-Cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Fri, 8 Apr 2005 13:50:04 -0700
X-Original-Message-ID: <200504082050.j38Ko4r19673@unix-os.sc.intel.com>
X-Original-References: <20050408041341.GA8...@taniwha.stupidest.org>
 <Pine.LNX.4.58.0504072127250.28...@ppc970.osdl.org> 
<20050408071428.GB3...@opteron.random>
 <Pine.LNX.4.58.0504080724550.28...@ppc970.osdl.org> 
<4256AE0D....@tiscali.de>
 <Pine.LNX.4.58.0504081010540.28...@ppc970.osdl.org>
 <20050408171518.GA4...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504081037310.28...@ppc970.osdl.org>
 <20050408180540.GA4...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504081149010.28...@ppc970.osdl.org>
 <20050408191638.GA5...@taniwha.stupidest.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

It looks like an operation like "show me the history of mm/memory.c" will
be pretty expensive using git.  I'd need to look at the current tree, and
then trace backwards through all 60,000 changesets to see which ones had
actual changes to this file.  Could you expand the tuple in the tree object
to include a back pointer to the previous tree in which the tuple changed?
Or does adding history to the tree violate other goals of the tree type?

-Tony

Path: g2news1.google.com!news3.google.com!news2.google.com!proxad.net!
news.newsland.it!news.ngi.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Fri, 08 Apr 2005 23:40:08 +0200
Message-ID: <3R9mU-22G-9@gated-at.bofh.it>
References: <3QT8r-4R3-5@gated-at.bofh.it> <3QTBE-5d5-95@gated-at.bofh.it> 
<3QW6v-7zx-41@gated-at.bofh.it> <3R2Fe-4KO-11@gated-at.bofh.it> 
<3R4x5-6lQ-39@gated-at.bofh.it> <3R5jj-71G-15@gated-at.bofh.it> 
<3R5t9-78n-29@gated-at.bofh.it> <3R5Wc-7sj-53@gated-at.bofh.it> 
<3R65O-7K6-29@gated-at.bofh.it> <3R71L-4S-1@gated-at.bofh.it> 
<3R7bt-aV-19@gated-at.bofh.it> <3R8Kr-1qK-67@gated-at.bofh.it>
X-Original-To: L...@unix-os.sc.intel.com, Tony <tony.l...@intel.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 27
Organization: linux.* mail to news gateway
X-Original-Cc: Kernel Mailing List <linux-ker...@vger.kernel.org>
X-Original-Date: Fri, 8 Apr 2005 14:27:38 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504081412190.28951@ppc970.osdl.org>
X-Original-References: <20050408041341.GA8...@taniwha.stupidest.org>
 <Pine.LNX.4.58.0504072127250.28...@ppc970.osdl.org> 
<20050408071428.GB3...@opteron.random>
 <Pine.LNX.4.58.0504080724550.28...@ppc970.osdl.org> <4256AE0D....@tiscali.de>
 <Pine.LNX.4.58.0504081010540.28...@ppc970.osdl.org>
 <20050408171518.GA4...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504081037310.28...@ppc970.osdl.org>
 <20050408180540.GA4...@taniwha.stupidest.org> 
<Pine.LNX.4.58.0504081149010.28...@ppc970.osdl.org>
 <20050408191638.GA5...@taniwha.stupidest.org> 
<200504082050.j38Ko4r19...@unix-os.sc.intel.com>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Fri, 8 Apr 2005 L...@unix-os.sc.intel.com wrote:
>
> It looks like an operation like "show me the history of mm/memory.c" will
> be pretty expensive using git.

Yes.  Per-file history is expensive in git, because of the way it is 
indexed. Things are indexed by tree and by changeset, and there are no 
per-file indexes.

You could create per-file _caches_ (*) on top of git if you wanted to make
it behave more like a real SCM, but yes, it's all definitely optimized for
the things that _I_ tend to care about, which is the whole-repository
operations.

		Linus

(*) Doing caching on that level is probably fine, especially since most
people really tend to want it for just the relatively few files that they
work on anyway. Limiting the caches to a subset of the tree should be
quite effective.
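
To make the cost concrete, here is a toy sketch of the walk described
above, with an optional per-file cache layered on top. It is not git's code:
the commit table is an in-memory dict, each commit has at most one parent,
and each tree is flattened to a {path: blob sha1} mapping. All of those are
simplifying assumptions, as are the names file_history and cache.

```python
def file_history(commits, head, path, cache=None):
    """Return the ids of commits, newest first, in which `path` differs
    from the parent commit. Without a cache every call walks the whole
    chain of commits, which is why per-file history is expensive."""
    key = (head, path)
    if cache is not None and key in cache:
        return cache[key]

    changed = []
    commit_id = head
    while commit_id is not None:
        commit = commits[commit_id]
        parent_id = commit["parent"]
        parent_tree = commits[parent_id]["tree"] if parent_id else {}
        # Comparing blob names is enough; file contents are never read.
        if commit["tree"].get(path) != parent_tree.get(path):
            changed.append(commit_id)
        commit_id = parent_id

    if cache is not None:
        cache[key] = changed
    return changed


# Tiny made-up history: c1 <- c2 <- c3
commits = {
    "c1": {"parent": None, "tree": {"mm/memory.c": "aaa", "Makefile": "m1"}},
    "c2": {"parent": "c1", "tree": {"mm/memory.c": "aaa", "Makefile": "m2"}},
    "c3": {"parent": "c2", "tree": {"mm/memory.c": "bbb", "Makefile": "m2"}},
}
cache = {}
print(file_history(commits, "c3", "mm/memory.c", cache))  # ['c3', 'c1']
```

Keeping the cache keyed by (head, path) limits it to the files somebody
actually asked about, which is the "subset of the tree" idea.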

Path: g2news1.google.com!news3.google.com!news.glorb.com!news.newsland.it!
news.panservice.it!erode.bofh.it!bofh.it!news.nic.it!robomod
From: Rajesh Venkatasubramanian <vraj...@umich.edu>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Sat, 09 Apr 2005 00:50:09 +0200
Message-ID: <3RasF-2Wl-9@gated-at.bofh.it>
X-Original-To: torva...@osdl.org, linux-ker...@vger.kernel.org
User-Agent: Mozilla Thunderbird 1.0 (X11/20041206)
X-Accept-Language: en-us, en
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 19
Organization: linux.* mail to news gateway
X-Original-Date: Fri, 08 Apr 2005 18:27:38 -0400
X-Original-Message-ID: <4257055A.7010908@umich.edu>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

Linus wrote:
>> It looks like an operation like "show me the history of mm/memory.c" will
>> be pretty expensive using git.
>
> Yes.  Per-file history is expensive in git, because of the way it is 
> indexed. Things are indexed by tree and by changeset, and there are no 
> per-file indexes.

Although directory changes are tracked using change-sets, there 
seems to be no easy way to answer "give me the diff corresponding to
the commit (change-set) object <sha1>".  That will be really helpful to
review the changes.

Rajesh

Path: g2news1.google.com!news3.google.com!news2.google.com!proxad.net!
news.newsland.it!news.ngi.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Sat, 09 Apr 2005 01:30:14 +0200
Message-ID: <3Rb5s-3uM-27@gated-at.bofh.it>
References: <3RasF-2Wl-9@gated-at.bofh.it>
X-Original-To: Rajesh Venkatasubramanian <vraj...@umich.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 77
Organization: linux.* mail to news gateway
X-Original-Cc: linux-ker...@vger.kernel.org
X-Original-Date: Fri, 8 Apr 2005 16:29:09 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504081613180.28951@ppc970.osdl.org>
X-Original-References: <4257055A.7010...@umich.edu>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Fri, 8 Apr 2005, Rajesh Venkatasubramanian wrote:
> 
> Although directory changes are tracked using change-sets, there 
> seems to be no easy way to answer "give me the diff corresponding to
> the commit (change-set) object <sha1>".  That will be really helpful to
> review the changes.

Actually, it is very easy indeed. Here's what you do:

 - look up the commit object ("cat-file commit <sha1>")

   This object starts out with "tree <sha1>", followed by a list of
   parent commit objects: "parent <sha1>"

   Remember the tree object (it defines what the tree looks like at
   the time of the commit). Pick the parent object you want to diff
   against (normally the first one).

   Also, print the checkin messages at the end of the commit object.

 - look up the parent object ("cat-file commit <parentsha1>")

   Here you have the same kind of object, but this time you don't care
   about going deeper, you just pick up the tree <sha1> that describes
   the tree at the parent.

 - look up the two tree objects. Unlike a commit object, a tree object
   is a binary data blob, but the format is an _extremely_ simple table
   of these guys:

	<ascii octal filemode> <space> <pathname> <NUL character> <20-byte sha1>

  and the reason it's binary is really that that way "git" doesn't end
  up having any issues with strange pathnames. If you want to have spaces
  and newlines in your pathname, go wild.

  In particular, the tree object is also _sorted_ by the pathname. This 
  makes things simple, because you now have two sorted trees, and the 
  first thing you do is just walk the two trees in lock-step, which is 
  trivial thanks to the sorted nature of the tree "array".

  So now you have three cases:
	- you have the same name, and the same sha1

	  ignore it - the file didn't change, you don't even have to look 
	  at the contents (although if the file mode changed you might
	  want to note that)

	- you have the same name in parent and child tree lists, but the
	  sha1 differs. Now you just need to do a "cat-file" on both of the 
	  SHA1 values, and do a "diff -u" between them.

	- you have the filename in only parent or only child. Do a 
	  "create" or "delete" diff with the content of the sha1 file.

See? Very efficient. For any files that didn't change, you didn't have to 
do anything at all - you didn't even have to look at their data.
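
The tree walk above can be sketched in a few lines of Python. This is an
illustration of the described entry format and the lock-step comparison, not
git's own code; the function names (encode_entry, parse_tree, diff_trees)
are invented here, and mode-only changes are ignored for brevity.

```python
import hashlib


def encode_entry(mode, path, sha1):
    """Build one entry: <ascii octal mode> <space> <path> <NUL> <20-byte sha1>."""
    return ("%o " % mode).encode("ascii") + path + b"\0" + sha1


def parse_tree(data):
    """Split a binary tree object body into (path, mode, sha1) tuples."""
    entries = []
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)        # end of the octal mode
        nul = data.index(b"\0", space + 1)   # end of the pathname
        mode = int(data[pos:space], 8)
        path = data[space + 1:nul]           # may contain spaces or newlines
        sha1 = data[nul + 1:nul + 21]        # fixed 20 raw bytes
        entries.append((path, mode, sha1))
        pos = nul + 21
    return entries


def diff_trees(parent, child):
    """Walk two path-sorted entry lists in lock-step and report changes."""
    changes = []
    i = j = 0
    while i < len(parent) or j < len(child):
        if j == len(child) or (i < len(parent) and parent[i][0] < child[j][0]):
            changes.append(("delete", parent[i][0]))   # only in parent
            i += 1
        elif i == len(parent) or child[j][0] < parent[i][0]:
            changes.append(("create", child[j][0]))    # only in child
            j += 1
        else:
            if parent[i][2] != child[j][2]:            # same name, new sha1
                changes.append(("modify", parent[i][0]))
            i += 1
            j += 1
    return changes


def blob_name(content):
    return hashlib.sha1(content).digest()


parent = parse_tree(
    encode_entry(0o100644, b"Makefile", blob_name(b"all:\n"))
    + encode_entry(0o100644, b"README", blob_name(b"v1\n"))
    + encode_entry(0o100644, b"old.c", blob_name(b"int x;\n"))
)
child = parse_tree(
    encode_entry(0o100644, b"Makefile", blob_name(b"all:\n"))
    + encode_entry(0o100644, b"README", blob_name(b"v2\n"))
    + encode_entry(0o100644, b"new.c", blob_name(b"int y;\n"))
)
print(diff_trees(parent, child))
```

Unchanged entries (Makefile here) are skipped by comparing 20-byte names
alone, so no file data is touched for them.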

Also note that the above algorithm really works for _any_ two commit 
points (apart from the first two steps, which are obviously all about 
finding the parent tree when you want to diff against a predecessor). 

It doesn't have to be parent and child. Pick any commit you have. And pick
them in the other order, and you'll automatically get the reverse diff.
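
The first two steps (getting from a commit to its tree and its parents) are
just text parsing. A minimal sketch, assuming the commit object is plain
text with "tree" and "parent" header lines and that a blank line separates
the headers from the message at the end; that separator, the function name
parse_commit and the sample hashes are assumptions made for illustration.

```python
def parse_commit(text):
    """Return (tree sha1, list of parent sha1s, message) for a commit
    object given as text."""
    header, _, message = text.partition("\n\n")
    tree = None
    parents = []
    for line in header.split("\n"):
        key, _, value = line.partition(" ")
        if key == "tree":
            tree = value
        elif key == "parent":
            parents.append(value)
    return tree, parents, message


sample = (
    "tree 1111111111111111111111111111111111111111\n"
    "parent 2222222222222222222222222222222222222222\n"
    "\n"
    "Fix a typo\n"
)
tree, parents, message = parse_commit(sample)
print(tree, parents, message)
```

Diffing any two commits then reduces to calling this twice and comparing the
two tree names it returns.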

You can even do diffs between unrelated projects this way if you use the
shared sha1 directory model, although that obviously doesn't tend to be
all that sensible ;)

		Linus

Path: g2news1.google.com!news3.google.com!news.glorb.com!news.newsland.it!
area.cu.mi.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Sat, 09 Apr 2005 02:20:08 +0200
Message-ID: <3RbRK-47p-11@gated-at.bofh.it>
References: <3QkX8-7i5-9@gated-at.bofh.it> <3QoHo-2b1-21@gated-at.bofh.it> 
<3QujJ-6NQ-7@gated-at.bofh.it> <3QujM-6NQ-23@gated-at.bofh.it> 
<3QuWy-7pk-23@gated-at.bofh.it> <3Qv69-7ve-13@gated-at.bofh.it> 
<3Qy4a-2jy-21@gated-at.bofh.it> <3QOLv-1qG-7@gated-at.bofh.it> 
<3QUHi-6n4-3@gated-at.bofh.it> <3QVtD-71j-19@gated-at.bofh.it> 
<3QXlM-89-11@gated-at.bofh.it>
X-Original-To: Andrea Arcangeli <and...@suse.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 69
Organization: linux.* mail to news gateway
X-Original-Cc: Martin Pool <m...@sourcefrog.net>,
	"linux-ker...@vger.kernel.org" <linux-ker...@vger.kernel.org>,
	David Lang <dl...@digitalinsight.com>
X-Original-Date: Fri, 8 Apr 2005 17:12:49 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504081647510.28951@ppc970.osdl.org>
X-Original-References: <Pine.LNX.4.58.0504060800280.2...@ppc970.osdl.org>
 <20050406193911.GA11...@stingr.stingr.net> 
<pan.2005.04.07.01.40.20.998...@sourcefrog.net>
 <20050407014727.GA17...@havoc.gtf.org> 
<pan.2005.04.07.02.25.56.501...@sourcefrog.net>
 <Pine.LNX.4.62.0504061931560.10...@qynat.qvtvafvgr.pbz> 
<1112852302.29544.75.camel@hope>
 <Pine.LNX.4.58.0504071626290.28...@ppc970.osdl.org> 
<1112939769.29544.161.camel@hope>
 <Pine.LNX.4.58.0504072334310.28...@ppc970.osdl.org> 
<20050408083839.GC3...@opteron.random>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Fri, 8 Apr 2005, Andrea Arcangeli wrote:
> 
> We'd need a regenerated coherent copy of BKCVS to pipe into those SCM to
> evaluate how well they scale.

Yes, that makes most sense, I believe. Especially as BKCVS does the 
linearization that makes other SCM's _able_ to take the data in the first 
place. Few enough SCM's really understand the BK merge model, although the 
distributed ones obviously have to do something similar.

> OTOH if your git project already allows storing the data in there,
> that looks nice ;).

I can express the data, and I did a sparse .git archive to prove the 
concept. It doesn't even try to save BK-specific details, but as far as I 
can tell, my git-conversion did capture all the basic things (ie not just 
the actual source tree, but hopefully all the "who did what" parts too).

Of course, my git visualization tools are so horribly crappy that it is 
hard to make sure ;)

Also, I suspect that BKCVS actually bothers to get more details out of a
BK tree than I cared about. People have pestered Larry about it, so BKCVS
exports a lot of the nitty-gritty (per-file comments etc) that just
doesn't actually _matter_, but people whine about. Me, I don't care. My
sparse-conversion just took the important parts.

> I don't yet fully understand how the algorithms of the trees are meant
> to work

Well, things like actually merging two git trees is not even something git
tries to do. It leaves that to somebody else - you can see what the
relationship is, and you can see all the data, but as far as I'm
concerned, git is really a "filesystem". It's a way of expressing
revisions, but it's not a way of creating them.

> It looks similar to a diff -ur of two hardlinked trees

Yes. You could really think of it that way. It's not really about
hardlinking, but the fact that objects are named by their content does
mean that two objects (regardless of their type) can be seen as
"hardlinked" whenever their contents match.

But the more interesting part is the hierarchical virtual format it has,
ie it is not only hardlinked, but it also has the three different levels
of "views" into those hardlinked objects ("blob", "tree", "revision").

So even though the hash tree looks flat in the _physical_ filesystem, it 
definitely isn't flat in its own virtual world. It's just flattened to fit 
in a normal filesystem ;)
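
A toy model may help picture the "flat outside, hierarchical inside" point.
The sketch below keeps every object in one flat dict keyed by the sha1 of
its bytes, and builds a blob, a tree and a revision that refer to each other
by name. The byte layout used for hashing (kind + NUL + payload) and the
class name ObjectStore are illustrative assumptions, not git's actual
on-disk format.

```python
import hashlib


class ObjectStore:
    """Flat, content-named store: one namespace, names derived from data."""

    def __init__(self):
        self.objects = {}

    def put(self, kind, payload):
        data = kind.encode("ascii") + b"\0" + payload
        name = hashlib.sha1(data).hexdigest()
        self.objects[name] = data      # storing the same bytes twice is a no-op
        return name


store = ObjectStore()

# Two files with identical contents collapse into one object ("hardlinked").
blob_a = store.put("blob", b"hello\n")
blob_b = store.put("blob", b"hello\n")

# A tree names blobs; a revision names a tree. The hierarchy is virtual.
tree = store.put("tree", ("100644 a.txt %s\n100644 b.txt %s\n" % (blob_a, blob_b)).encode("ascii"))
revision = store.put("revision", ("tree %s\n" % tree).encode("ascii"))

print(blob_a == blob_b, len(store.objects))
```

Physically there are just three entries in one flat namespace; the
blob/tree/revision levels exist only in how the objects point at each other.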

[ There's also a fourth level view in "trust", but that one hasn't been
  implemented yet since I think it might as well be done at a higher
  level. ]

Btw, the sha1 file format isn't actually designed for "rsync", since rsync 
is really a hell of a lot more capable than my format needs. The format is 
really designed for something like an offline http grabber, in that you can 
just grab files purely by filename (and verify that you got them right by 
running sha1sum on the resulting local copy). So think "wget".
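
As a rough sketch of that "grab by filename, verify with sha1sum" model:
compare the two name lists, fetch only what is missing, and check each
fetched file against its own name. The fetch callable stands in for whatever
transport is used (wget, rsync, a copy) and is hypothetical, as are the
function names. The sketch assumes, as in the sha1sum example earlier in the
thread, that an object's name is the sha1 of the file's bytes as stored.

```python
import hashlib


def missing_objects(local_names, remote_names):
    """Names the remote side has and we do not. No file is read."""
    return sorted(set(remote_names) - set(local_names))


def verify_object(name, data):
    """An object is intact exactly when its sha1 equals its name."""
    return hashlib.sha1(data).hexdigest() == name


def sync(local, remote_names, fetch):
    """Copy every missing object into `local` (a name -> bytes dict),
    refusing anything whose contents do not match its name."""
    for name in missing_objects(local, remote_names):
        data = fetch(name)
        if not verify_object(name, data):
            raise ValueError("corrupt object: " + name)
        local[name] = data
    return local


# In-memory stand-in for a remote object directory.
remote = {hashlib.sha1(d).hexdigest(): d for d in (b"one", b"two", b"three")}
names = sorted(remote)
local = {names[0]: remote[names[0]]}

sync(local, remote.keys(), remote.__getitem__)
print(len(local))
```

Objects are immutable, so a name already present locally never has to be
re-read or re-compared.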

				Linus

Path: g2news1.google.com!news3.google.com!news2.google.com!proxad.net!
news.newsland.it!news.cdlan.net!erode.bofh.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Sat, 09 Apr 2005 02:30:11 +0200
Message-ID: <3Rc1t-4dI-19@gated-at.bofh.it>
References: <3RasF-2Wl-9@gated-at.bofh.it> <3Rb5s-3uM-27@gated-at.bofh.it>
X-Original-To: Rajesh Venkatasubramanian <vraj...@umich.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 30
Organization: linux.* mail to news gateway
X-Original-Cc: linux-ker...@vger.kernel.org
X-Original-Date: Fri, 8 Apr 2005 17:29:31 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504081718570.28951@ppc970.osdl.org>
X-Original-References: <4257055A.7010...@umich.edu> 
<Pine.LNX.4.58.0504081613180.28...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Fri, 8 Apr 2005, Linus Torvalds wrote:
> 
> Also note that the above algorithm really works for _any_ two commit 
> points (apart from the first two steps, which are obviously all about 
> finding the parent tree when you want to diff against a predecessor). 

Btw, if you want to try this, you should get an updated copy. I've pushed 
a "raw" git archive of both git and sparse (the latter is much more 
interesting from an archive standpoint, since it actually has 1400 
changesets in it) to kernel.org, but I'm not convinced it gets mirrored 
out. I think the mirror scripts may mirror only things they understand.

I've also added a partial "fsck" for the "git filesystem". It doesn't do
the connectivity analysis yet, but that should be pretty straightforward
to add - it already parses all the data, it just doesn't save it away (and
the connectivity analysis will automatically show how many "root"
changesets you have, and what the different HEADs are).

I'll make a tar-file (git-0.03), although at this point I've actually been 
maintaining it in itself, so to some degree it's almost getting easier if 
I'd just have a place to rsync it..

		Linus

Path: g2news1.google.com!news3.google.com!news2.google.com!newsread.com!
news-xfer.newsread.com!nntp.abs.net!news-FFM2.ecrc.net!newsfeed00.sul.t-online.de!
t-online.de!bofh.it!news.nic.it!robomod
From: Andrea Arcangeli <and...@suse.de>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Sat, 09 Apr 2005 04:30:06 +0200
Message-ID: <3RdTw-5GL-1@gated-at.bofh.it>
References: <3QujJ-6NQ-7@gated-at.bofh.it> <3QujM-6NQ-23@gated-at.bofh.it> 
<3QuWy-7pk-23@gated-at.bofh.it> <3Qv69-7ve-13@gated-at.bofh.it> 
<3Qy4a-2jy-21@gated-at.bofh.it> <3QOLv-1qG-7@gated-at.bofh.it> 
<3QUHi-6n4-3@gated-at.bofh.it> <3QVtD-71j-19@gated-at.bofh.it> 
<3QXlM-89-11@gated-at.bofh.it> <3RbRK-47p-11@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
X-Gpg-Key: 1024D/68B9CB43 13D9 8355 295F 4823 7C49  C012 DFA1 686E 68B9 CB43
User-Agent: Mutt/1.5.9i
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 19
Organization: linux.* mail to news gateway
X-Original-Cc: Martin Pool <m...@sourcefrog.net>,
	"linux-ker...@vger.kernel.org" <linux-ker...@vger.kernel.org>,
	David Lang <dl...@digitalinsight.com>
X-Original-Date: Sat, 9 Apr 2005 04:27:01 +0200
X-Original-Message-ID: <20050409022701.GA14085@opteron.random>
X-Original-References: <pan.2005.04.07.01.40.20.998...@sourcefrog.net> 
<20050407014727.GA17...@havoc.gtf.org> 
<pan.2005.04.07.02.25.56.501...@sourcefrog.net> 
<Pine.LNX.4.62.0504061931560.10...@qynat.qvtvafvgr.pbz> 
<1112852302.29544.75.camel@hope> 
<Pine.LNX.4.58.0504071626290.28...@ppc970.osdl.org> 
<1112939769.29544.161.camel@hope> 
<Pine.LNX.4.58.0504072334310.28...@ppc970.osdl.org> 
<20050408083839.GC3...@opteron.random> 
<Pine.LNX.4.58.0504081647510.28...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

On Fri, Apr 08, 2005 at 05:12:49PM -0700, Linus Torvalds wrote:
> really designed for something like an offline http grabber, in that you can 
> just grab files purely by filename (and verify that you got them right by 
> running sha1sum on the resulting local copy). So think "wget".

I'm not entirely convinced wget is going to be an efficient way to
synchronize and fetch your tree, its simplicity is great though. It's a
tradeoff between optimizing and re-using existing tools (like webservers).
Perhaps that's why you were compressing the stuff too? It sounds better
not to compress the stuff on-disk, and to synchronize with a rsync-like
protocol (rsync server would make it) that handles the compression in
the network protocol itself, and in turn that can apply compression to a
large blob (i.e. the diff between the trees), and not to the single tiny
files.
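
The size side of that tradeoff is easy to illustrate. The sketch below
compresses a set of small, similar files one by one and then as a single
concatenated stream, using zlib. The sample data and the function name
compressed_sizes are made up for the example, and real repository contents
will behave differently in detail.

```python
import zlib


def compressed_sizes(files):
    """Return (total size when each file is compressed separately,
    size when all files are compressed as one stream)."""
    separate = sum(len(zlib.compress(f)) for f in files)
    together = len(zlib.compress(b"".join(files)))
    return separate, together


# 200 tiny, similar "source files" (made-up content).
files = [("int f%d(void) { return %d; }\n" % (i, i)).encode("ascii")
         for i in range(200)]
separate, together = compressed_sizes(files)
print(separate, together)
```

One large stream lets the compressor reuse matches across files, while
per-file compression pays a fixed overhead per object but keeps every object
independently readable.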

Path: g2news1.google.com!news3.google.com!news2.google.com!news.maxwell.syr.edu!
newsfeed.icl.net!newsfeed.fjserv.net!news.mailgate.org!nntp.infostrada.it!
bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Sat, 09 Apr 2005 07:50:06 +0200
Message-ID: <3Rh14-80n-3@gated-at.bofh.it>
References: <3QujJ-6NQ-7@gated-at.bofh.it> <3QujM-6NQ-23@gated-at.bofh.it> 
<3QuWy-7pk-23@gated-at.bofh.it> <3Qv69-7ve-13@gated-at.bofh.it> 
<3Qy4a-2jy-21@gated-at.bofh.it> <3QOLv-1qG-7@gated-at.bofh.it> 
<3QUHi-6n4-3@gated-at.bofh.it> <3QVtD-71j-19@gated-at.bofh.it> 
<3QXlM-89-11@gated-at.bofh.it> <3RbRK-47p-11@gated-at.bofh.it> 
<3RdTw-5GL-1@gated-at.bofh.it>
X-Original-To: Andrea Arcangeli <and...@suse.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 31
Organization: linux.* mail to news gateway
X-Original-Cc: Martin Pool <m...@sourcefrog.net>,
	"linux-ker...@vger.kernel.org" <linux-ker...@vger.kernel.org>,
	David Lang <dl...@digitalinsight.com>
X-Original-Date: Fri, 8 Apr 2005 22:45:18 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504082240460.28951@ppc970.osdl.org>
X-Original-References: <pan.2005.04.07.01.40.20.998...@sourcefrog.net>
 <20050407014727.GA17...@havoc.gtf.org> 
<pan.2005.04.07.02.25.56.501...@sourcefrog.net>
 <Pine.LNX.4.62.0504061931560.10...@qynat.qvtvafvgr.pbz> 
<1112852302.29544.75.camel@hope>
 <Pine.LNX.4.58.0504071626290.28...@ppc970.osdl.org> 
<1112939769.29544.161.camel@hope>
 <Pine.LNX.4.58.0504072334310.28...@ppc970.osdl.org> 
<20050408083839.GC3...@opteron.random>
 <Pine.LNX.4.58.0504081647510.28...@ppc970.osdl.org> 
<20050409022701.GA14...@opteron.random>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Sat, 9 Apr 2005, Andrea Arcangeli wrote:
> 
> I'm not entirely convinced wget is going to be an efficient way to
> synchronize and fetch your tree

I don't think it's efficient per se, but I think it's important that 
people can just "pass the files along". Ie it's a huge benefit if any 
everyday mirror script (whether rsync, wget, homebrew or whatever) will 
just automatically do the right thing. 

> Perhaps that's why you were compressing the stuff too? It sounds better
> not to compress the stuff on-disk

I much prefer to waste some CPU time to save disk cache. Especially since 
the compression is "free" if you do it early on (ie it's done only once, 
since the files are stable). Also, if the difference is a 1.5GB kernel 
repository or a 3GB kernel repository, I know which one I'll pick ;)

Also, I don't want people editing repository files by hand. Sure, the 
sha1 catches it, but still... I'd rather force the low-level ops to use 
the proper helper routines. Which is why it's a raw zlib compressed blob, 
not a gzipped file.
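
The distinction between a raw zlib blob and a gzipped file can be
sketched in a few lines of Python. This is only an illustration, not
git's code: the helper names store() and load() are hypothetical, and
the sketch assumes the sha1 is taken over the uncompressed content,
which the message does not specify.

```python
import gzip
import hashlib
import zlib

def store(content):
    """Return (sha1 hex name, raw zlib stream) for a piece of content."""
    name = hashlib.sha1(content).hexdigest()
    return name, zlib.compress(content)

def load(name, blob):
    """Inflate a blob and refuse it if the sha1 no longer matches its name."""
    content = zlib.decompress(blob)
    if hashlib.sha1(content).hexdigest() != name:
        raise ValueError("sha1 mismatch: object was modified")
    return content

name, blob = store(b"int main(void) { return 0; }\n")

# A gzip file starts with the magic bytes 1f 8b. A raw zlib stream does
# not, so gunzip-style tools will not recognise it as something to open.
gzip_magic = gzip.compress(b"x")[:2]
is_gzip = blob[:2] == gzip_magic

print("looks like gzip:", is_gzip)
print("round trip ok:", load(name, blob) == b"int main(void) { return 0; }\n")
```

Anything that rewrites the blob without going through the helper ends
up with content whose hash no longer matches its name, which is the
check the sha1 provides.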

		Linus

Path: g2news1.google.com!news4.google.com!news.glorb.com!news.newsland.it!
news.cdlan.net!erode.bofh.it!bofh.it!news.nic.it!robomod
From: "David S. Miller" <da...@davemloft.net>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Sun, 10 Apr 2005 01:00:12 +0200
Message-ID: <3Rx5W-4dB-7@gated-at.bofh.it>
References: <3QujJ-6NQ-7@gated-at.bofh.it> <3QujM-6NQ-23@gated-at.bofh.it> 
<3QuWy-7pk-23@gated-at.bofh.it> <3Qv69-7ve-13@gated-at.bofh.it> 
<3Qy4a-2jy-21@gated-at.bofh.it> <3QOLv-1qG-7@gated-at.bofh.it> 
<3QUHi-6n4-3@gated-at.bofh.it> <3QVtD-71j-19@gated-at.bofh.it> 
<3QXlM-89-11@gated-at.bofh.it> <3RbRK-47p-11@gated-at.bofh.it> 
<3RdTw-5GL-1@gated-at.bofh.it> <3Rh14-80n-3@gated-at.bofh.it>
X-Original-To: Linus Torvalds <torva...@osdl.org>
X-Mailer: Sylpheed version 1.0.4 (GTK+ 1.2.10; sparc-unknown-linux-gnu)
X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=
.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 22
Organization: linux.* mail to news gateway
X-Original-Cc: and...@suse.de, m...@sourcefrog.net, linux-ker...@vger.kernel.org,
	dl...@digitalinsight.com
X-Original-Date: Sat, 9 Apr 2005 15:55:11 -0700
X-Original-Message-ID: <20050409155511.7432d5c7.davem@davemloft.net>
X-Original-References: <pan.2005.04.07.01.40.20.998...@sourcefrog.net>
	<20050407014727.GA17...@havoc.gtf.org>
	<pan.2005.04.07.02.25.56.501...@sourcefrog.net>
	<Pine.LNX.4.62.0504061931560.10...@qynat.qvtvafvgr.pbz>
	<1112852302.29544.75.camel@hope>
	<Pine.LNX.4.58.0504071626290.28...@ppc970.osdl.org>
	<1112939769.29544.161.camel@hope>
	<Pine.LNX.4.58.0504072334310.28...@ppc970.osdl.org>
	<20050408083839.GC3...@opteron.random>
	<Pine.LNX.4.58.0504081647510.28...@ppc970.osdl.org>
	<20050409022701.GA14...@opteron.random>
	<Pine.LNX.4.58.0504082240460.28...@ppc970.osdl.org>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org

On Fri, 8 Apr 2005 22:45:18 -0700 (PDT)
Linus Torvalds <torva...@osdl.org> wrote:

> Also, I don't want people editing repository files by hand. Sure, the 
> sha1 catches it, but still... I'd rather force the low-level ops to use 
> the proper helper routines. Which is why it's a raw zlib compressed blob, 
> not a gzipped file.

I understand the arguments for compression, but I hate it for one
simple reason: recovery is more difficult when you corrupt some
file in your repository.

It's happened to me more than once and I did lose data.

Without compression, I might be able to recover if something
causes a block of zeros to be written to the middle of some
repository file.  With compression, you pretty much just lose.
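
The failure mode described here is easy to reproduce. The sketch below
is an illustration under assumptions: the sample data, the 512-byte
block size and the helper name zero_block() are all made up for the
example. It zeroes one block in the middle of a plain file and of a
single-stream zlib copy of the same file.

```python
import hashlib
import zlib

BLOCK = 512  # assumed size of the damaged region

# Deterministic sample text: 2000 lines of hex digests.
text = b"".join(
    hashlib.sha1(b"%d" % i).hexdigest().encode() + b"\n" for i in range(2000)
)

def zero_block(data, offset, length=BLOCK):
    """Overwrite `length` bytes at `offset` with zeros."""
    return data[:offset] + bytes(length) + data[offset + length:]

# Plain file: only the zeroed bytes are lost.
damaged_plain = zero_block(text, len(text) // 2)
surviving = sum(a == b for a, b in zip(text, damaged_plain))

# One continuous zlib stream: the decoder cannot get past the damage.
packed = zlib.compress(text)
damaged_packed = zero_block(packed, len(packed) // 2)
try:
    recovered = zlib.decompress(damaged_packed) == text
except zlib.error:
    recovered = False

print("plain bytes surviving:", surviving, "of", len(text))
print("compressed copy recovered:", recovered)
```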

Path: g2news1.google.com!news3.google.com!news.glorb.com!news.newsland.it!
news.ngi.it!bofh.it!news.nic.it!robomod
From: Linus Torvalds <torva...@osdl.org>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Sun, 10 Apr 2005 01:20:09 +0200
Message-ID: <3Rxpf-4A7-15@gated-at.bofh.it>
References: <3QujJ-6NQ-7@gated-at.bofh.it> <3QujM-6NQ-23@gated-at.bofh.it> 
<3QuWy-7pk-23@gated-at.bofh.it> <3Qv69-7ve-13@gated-at.bofh.it> 
<3Qy4a-2jy-21@gated-at.bofh.it> <3QOLv-1qG-7@gated-at.bofh.it> 
<3QUHi-6n4-3@gated-at.bofh.it> <3QVtD-71j-19@gated-at.bofh.it> 
<3QXlM-89-11@gated-at.bofh.it> <3RbRK-47p-11@gated-at.bofh.it> 
<3RdTw-5GL-1@gated-at.bofh.it> <3Rh14-80n-3@gated-at.bofh.it> 
<3Rx5W-4dB-7@gated-at.bofh.it>
X-Original-To: "David S. Miller" <da...@davemloft.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Mimedefang-Filter: osdl$Revision: 1.106 $
X-Scanned-By: MIMEDefang 2.36
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 23
Organization: linux.* mail to news gateway
X-Original-Cc: and...@suse.de, m...@sourcefrog.net, linux-ker...@vger.kernel.org,
	dl...@digitalinsight.com
X-Original-Date: Sat, 9 Apr 2005 16:13:51 -0700 (PDT)
X-Original-Message-ID: <Pine.LNX.4.58.0504091611570.1267@ppc970.osdl.org>
X-Original-References: <pan.2005.04.07.01.40.20.998...@sourcefrog.net>
 <20050407014727.GA17...@havoc.gtf.org> 
<pan.2005.04.07.02.25.56.501...@sourcefrog.net>
 <Pine.LNX.4.62.0504061931560.10...@qynat.qvtvafvgr.pbz> 
<1112852302.29544.75.camel@hope>
 <Pine.LNX.4.58.0504071626290.28...@ppc970.osdl.org> 
<1112939769.29544.161.camel@hope>
 <Pine.LNX.4.58.0504072334310.28...@ppc970.osdl.org> 
<20050408083839.GC3...@opteron.random>
 <Pine.LNX.4.58.0504081647510.28...@ppc970.osdl.org> 
<20050409022701.GA14...@opteron.random>
 <Pine.LNX.4.58.0504082240460.28...@ppc970.osdl.org>
 <20050409155511.7432d5c7.da...@davemloft.net>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org



On Sat, 9 Apr 2005, David S. Miller wrote:
> 
> I understand the arguments for compression, but I hate it for one
> simple reason: recovery is more difficult when you corrupt some
> file in your repository.

Trust me, the way git does things, you'll have so much redundancy that 
you'll have to really _work_ at losing data.

That's the good news.

The bad news is that this is obviously why it does eat a lot of disk. 
Since it saves full-file commits, you're going to have a lot of 
(compressed) full files around.
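
A rough sketch of that trade-off, under assumptions: the in-memory
dict `objects` and the helper commit_file() are hypothetical stand-ins
and not git's actual on-disk format. Every version of a file is kept
as its own complete compressed copy, so a one-line change costs a
whole new object, but each version can be read back without any other.

```python
import hashlib
import zlib

objects = {}  # hypothetical object store: sha1 hex -> compressed full file

def commit_file(content):
    """Store a complete compressed copy of `content` and return its name."""
    name = hashlib.sha1(content).hexdigest()
    objects.setdefault(name, zlib.compress(content))
    return name

v1 = b"".join(b"static int handler_%d(void);\n" % i for i in range(500))
v2 = v1.replace(b"handler_250(", b"handler_two_fifty(")  # one-line change

id1 = commit_file(v1)
id2 = commit_file(v2)
id1_again = commit_file(v1)  # identical content maps to the same object

stored_bytes = sum(len(blob) for blob in objects.values())
print("objects:", len(objects), "compressed bytes stored:", stored_bytes)
```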

		Linus

Path: g2news1.google.com!news1.google.com!proxad.net!news.newsland.it!
area.cu.mi.it!bofh.it!news.nic.it!robomod
From: Ingo Molnar <mi...@elte.hu>
Newsgroups: linux.kernel
Subject: Re: Kernel SCM saga..
Date: Sun, 10 Apr 2005 13:40:08 +0200
Message-ID: <3RIXl-5cJ-5@gated-at.bofh.it>
References: <3Qv69-7ve-13@gated-at.bofh.it> <3Qy4a-2jy-21@gated-at.bofh.it> 
<3QOLv-1qG-7@gated-at.bofh.it> <3QUHi-6n4-3@gated-at.bofh.it> 
<3QVtD-71j-19@gated-at.bofh.it> <3QXlM-89-11@gated-at.bofh.it> 
<3RbRK-47p-11@gated-at.bofh.it> <3RdTw-5GL-1@gated-at.bofh.it> 
<3Rh14-80n-3@gated-at.bofh.it> <3Rx5W-4dB-7@gated-at.bofh.it>
X-Original-To: "David S. Miller" <da...@davemloft.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.2.1i
X-Elte-Spamversion: MailScanner 4.31.6-itk1 (ELTE 1.2) SpamAssassin 2.63 ClamAV 0.73
X-Elte-Virusstatus: clean
X-Elte-Spamcheck: no
X-Elte-Spamcheck-Details: score=-4.9, required 5.9,
	autolearn=not spam, BAYES_00 -4.90
X-Elte-Spamscore: -4
Sender: robo...@news.nic.it
X-Mailing-List: linux-kernel@vger.kernel.org
Approved: robo...@news.nic.it
Lines: 56
Organization: linux.* mail to news gateway
X-Original-Cc: Linus Torvalds <torva...@osdl.org>, and...@suse.de,
	m...@sourcefrog.net, linux-ker...@vger.kernel.org,
	dl...@digitalinsight.com, Paul Jackson <p...@engr.sgi.com>
X-Original-Date: Sun, 10 Apr 2005 13:33:36 +0200
X-Original-Message-ID: <20050410113336.GA8103@elte.hu>
X-Original-References: <Pine.LNX.4.62.0504061931560.10...@qynat.qvtvafvgr.pbz> 
<1112852302.29544.75.camel@hope> 
<Pine.LNX.4.58.0504071626290.28...@ppc970.osdl.org> 
<1112939769.29544.161.camel@hope> 
<Pine.LNX.4.58.0504072334310.28...@ppc970.osdl.org> 
<20050408083839.GC3...@opteron.random> 
<Pine.LNX.4.58.0504081647510.28...@ppc970.osdl.org> 
<20050409022701.GA14...@opteron.random> 
<Pine.LNX.4.58.0504082240460.28...@ppc970.osdl.org> 
<20050409155511.7432d5c7.da...@davemloft.net>
X-Original-Sender: linux-kernel-ow...@vger.kernel.org


* David S. Miller <da...@davemloft.net> wrote:

> On Fri, 8 Apr 2005 22:45:18 -0700 (PDT)
> Linus Torvalds <torva...@osdl.org> wrote:
> 
> > Also, I don't want people editing repository files by hand. Sure, the 
> > sha1 catches it, but still... I'd rather force the low-level ops to use 
> > the proper helper routines. Which is why it's a raw zlib compressed blob, 
> > not a gzipped file.
> 
> I understand the arguments for compression, but I hate it for one
> simple reason: recovery is more difficult when you corrupt some
> file in your repository.
> 
> It's happened to me more than once and I did lose data.
> 
> Without compression, I might be able to recover if something
> causes a block of zeros to be written to the middle of some
> repository file.  With compression, you pretty much just lose.

that depends on how you compress. You are perfectly right that with 
default zlib compression, where you start the compression stream and 
stop it at the end of the file, recovery in case of damage is very hard 
for the portion that comes _after_ the damaged section. You'd have to 
reconstruct the compression state which is akin to breaking a key.

But with zlib you can 'flush' the compression state every couple of 
blocks and basically get the same recovery properties, at some very 
minimal extra space cost (because when you flush out compression state 
you get some extra padding bytes).

Flushing has another advantage as well: a small delta (even if it 
increases/decreases the file size!) in the middle of a larger file will 
still be compressed to the same output both before and after the change 
area (modulo flush block size), which rsync can pick up just fine. (IIRC 
that is one of the reasons why Debian, when compressing .deb's, does 
zlib-flushes every couple of blocks, so that rsync/apt-get can pick up 
partial .deb's as well.)
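
A sketch of the rsync-friendliness argument, with two assumptions that
are not in the message. First, it uses Z_FULL_FLUSH, since that is the
mode zlib documents as resetting the compression state. Second, the
flush points are content-defined (the hypothetical rule here cuts after
any line whose CRC-32 is divisible by 32), because flushing at fixed
offsets would shift every later block once an edit changes the file
size. With those choices, only the segment around the edit should
differ between the two compressed outputs.

```python
import hashlib
import zlib

def split_content_defined(data, modulus=32):
    """Cut after any line whose CRC-32 is divisible by `modulus`."""
    pieces, current = [], bytearray()
    for line in data.splitlines(keepends=True):
        current += line
        if zlib.crc32(line) % modulus == 0:
            pieces.append(bytes(current))
            current = bytearray()
    if current:
        pieces.append(bytes(current))
    return pieces

def compress_segments(data):
    """Return (list of compressed segments, stream trailer)."""
    c = zlib.compressobj()
    segments = [
        c.compress(piece) + c.flush(zlib.Z_FULL_FLUSH)
        for piece in split_content_defined(data)
    ]
    return segments, c.flush()  # final empty block plus Adler-32

def common_prefix(a, b):
    """Number of leading items that two lists share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

lines = [hashlib.sha1(b"%d" % i).hexdigest().encode() + b"\n"
         for i in range(2000)]
old_text = b"".join(lines)
new_text = b"".join(lines[:1000] + [b"a freshly inserted line\n"] + lines[1000:])

old_segs, old_trailer = compress_segments(old_text)
new_segs, new_trailer = compress_segments(new_text)

same_before = common_prefix(old_segs, new_segs)
same_after = common_prefix(old_segs[::-1], new_segs[::-1])
print(len(old_segs), "segments,", same_before, "unchanged before the edit,",
      same_after, "unchanged after it")
```

The trailer still differs because the Adler-32 covers the whole file,
but that is only a handful of bytes at the very end.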

the zlib option is i think Z_PARTIAL_FLUSH, i'm using it in Tux to do 
chunks of compression. The flushing cost is max 12 bytes or so, so if 
it's done every 4K we cap the cost at about 0.3%.
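
The bound is plain division, 12 bytes per 4096-byte block, and it can
be compared against a measurement. The sketch below is an illustration
under assumptions: it flushes every 4096 bytes of input (the message
does not say whether "every 4K" means input or output), it uses
Z_SYNC_FLUSH for the measurement, and compressed_size() is a
hypothetical helper. The measured figure depends on the data and also
includes the cost of starting a new deflate block at each flush.

```python
import hashlib
import zlib

BLOCK = 4096      # flush interval, bytes of input (assumption)
FLUSH_COST = 12   # bytes, the rough upper bound quoted in the message

worst_case = FLUSH_COST / BLOCK  # fraction of each block spent on flushing

def compressed_size(data, flush_every=None):
    """Size of the zlib stream, optionally flushing every `flush_every` bytes."""
    c = zlib.compressobj()
    total = 0
    step = flush_every or len(data)
    for i in range(0, len(data), step):
        total += len(c.compress(data[i:i + step]))
        if flush_every:
            total += len(c.flush(zlib.Z_SYNC_FLUSH))
    return total + len(c.flush())

sample = b"".join(
    hashlib.sha1(b"%d" % i).hexdigest().encode() + b"\n" for i in range(2000)
)
plain = compressed_size(sample)
flushed = compressed_size(sample, BLOCK)
measured = (flushed - plain) / len(sample)

print(f"quoted bound: {worst_case:.2%} per block")
print(f"measured extra size on this sample: {measured:.2%} of the input")
```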

so flushing is both rsync-friendly and recovery-friendly.

(recovery isn't as simple as with plaintext, as you have to find the next 
'block' and the block length will be inevitably variable. But it should 
be pretty predictable, and tools might even exist.)
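
A minimal sketch of such a recovery tool, under stated assumptions. It
uses Z_FULL_FLUSH rather than Z_PARTIAL_FLUSH, because zlib documents
full flush as the mode after which decompression can restart if
earlier data is damaged. The helper names, the 4096-byte interval and
the 512-byte damaged region are made up for the example. Each flush
leaves the bytes 00 00 ff ff in the stream, so the tool looks for the
first such marker past the damage and inflates from there as raw
deflate.

```python
import hashlib
import zlib

CHUNK = 4096                   # flush interval, bytes of input (assumption)
MARKER = b"\x00\x00\xff\xff"   # empty stored block that a flush emits

def compress_with_flushes(data, chunk=CHUNK):
    """One zlib stream with a full flush after every `chunk` input bytes."""
    c = zlib.compressobj()
    out = bytearray()
    for i in range(0, len(data), chunk):
        out += c.compress(data[i:i + chunk])
        out += c.flush(zlib.Z_FULL_FLUSH)
    out += c.flush()
    return bytes(out)

def recover_tail(stream, damage_end):
    """Resume at the first usable flush point at or after `damage_end`."""
    pos = stream.find(MARKER, damage_end)
    while pos != -1:
        d = zlib.decompressobj(-zlib.MAX_WBITS)  # raw deflate, no header
        try:
            return d.decompress(stream[pos + len(MARKER):])
        except zlib.error:
            # The four bytes can also occur by chance; try the next hit.
            pos = stream.find(MARKER, pos + 1)
    return b""

text = b"".join(
    hashlib.sha1(b"%d" % i).hexdigest().encode() + b"\n" for i in range(2000)
)
stream = compress_with_flushes(text)
start = len(stream) // 2
damaged = stream[:start] + bytes(512) + stream[start + 512:]

tail = recover_tail(damaged, start + 512)
print("recovered", len(tail), "of", len(text), "bytes after the damage")
```

The data in front of the damage can be salvaged separately by
inflating the stream from the start until the decoder reports an
error, so the loss is limited to the blocks the damage touches.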

	Ingo
