Larry McVoy Interview

Posted by jeremy
KernelTrap

May 28, 2002

KernelTrap has spoken with Larry McVoy, BitMover founder and primary BitKeeper author. BitKeeper, a distributed source control system, has been adopted by Linux kernel creator Linus Torvalds and condemned by free software icon Richard Stallman.

In this interview, Larry looks back through the years, describing his exposure to computers and Linux. He also discusses the history of BitKeeper, from writing NSElite for Sun (which turned into Teamware, their still-used SCM), to his desire to keep Linus from burning out, to the present-day solution. The choice not to license BitKeeper under the GPL is also explained.

Larry discusses much beyond BitKeeper as well, exploring some of his other interests. For the full interview, read on.

Jeremy Andrews: Please share a little about yourself and your background.

Larry McVoy: I'm 40 years old, I work in San Francisco. I have a BS and an MS from the University of Wisconsin at Madison (great CS school, they work you hard), and a tiny bit of work on a PhD at the University of Arizona. I've taught OS classes at Stanford, published some OS papers, sat on a number of technical conference committees, was program chair for Linux Expo '99, etc. I used to be a lot more active in the conference scene than I am now; raising kids and running a company takes up most of my time. I have two sons, Travis is 3 and Dylan is 8 months. My wife Beth does most of the kid work, she's a saint, I wouldn't get any work done without her help.

JA: Are you still planning on getting a PhD?

Larry McVoy: No, I'm past that point of my life. It's far more likely that I'll relearn all the basics when my kids are growing up, so I can help with their education. I find the current state of public schools in America depressing, and the ones in California are appalling.

JA: How and when did you get started working with computers?

Larry McVoy: I was an Art History major specializing in Greek art, pottery and sculpture, when I figured out that was really cool but not going to support me. I looked at my past course work and I think that a CS class was one of the few in which I got an A. So I switched to that after a discussion with my Dad in which he said "look, you can get excited about just about any subject if you study it enough. Deep knowledge is what makes a topic interesting." So off to the CS department I went.

It was great, I took to programming quickly, loved the intensity, loved the control over the machine, and especially loved how quickly you could go from an idea to a working program. I bought a CP/M machine for $2000 in 1982 or 83, and programmed that instead of sharing a VAX with 100 other students. That was a blast. Writing assembler versions of cp/ls/rm so that they fit in one sector of a floppy and loaded fast. Sick, but fun.

JA: When did you first become involved with Linux?

Larry McVoy: Around version 0.97 or so, in 1992; that's the first record of it that I can find. I posted to comp.os.linux that it booted up on a small laptop at Fry's. I suspect I had been playing with it a bit before then but that's the first record I can find on the net (gotta love Google Groups).

JA: You've gotten a lot of attention with BitKeeper, your distributed source management system. However, before we talk about that, what other involvement have you had with kernel development?

Larry McVoy: My first job was porting Unix to a supercomputer. Weird machine, pointers pointed at bits not bytes, it was an MP but the memory was shared via bcopy(), etc. Fun job, I got to do tons of stuff and it was a great education. Then I went to Sun as a contractor to do POSIX P1003.1 conformance, which most people would hate, but I was a young green guy and smart enough to realize that this was more education. It's also the reason that BitKeeper exists, believe it or not, but that's a longer story. I stayed on at Sun doing performance work, made the file system go at the platter speed, fixed the VM system to free up memory fast enough (wrote 17 different pageout daemons before I admitted that paging on pages is just plain stupid), did some networking stuff. I also wrote something called NSElite, which turned into Teamware, under the guise of fixing an awful SCM system called the NSE. NSElite is sort of the grandparent of BitKeeper, they share a lot of similarities. I moved over to the hardware side of Sun and designed and shipped their first cluster based server.

After that, I went to SGI and did more performance work, I bolted the fast path of XFS to the fast path of TCP and made NFS servers which could sustain over a GByte/sec of NFS traffic (this was on R4400 MIPS based machines, roughly the performance of the original Pentium at maybe 200MHz at best). I also designed their new name server, nsd.

I was the software lead/architect at Cobalt for a while, and I was also the 4th person at Google. But by this time it was clear I wanted to do my own thing so BitMover was starting to be a reality.

I haven't done very much work on Linux but I've encouraged others to do useful work on it. I worked pretty closely with Erik Troan when he was the dude at Red Hat, sent him IRIX's chkconfig man page, and he did that. I posted the SunOS 4.x loadable module man pages and a few weeks later Linux had loadable modules. And I'm probably best known in the Linux space for LMbench, a benchmarking suite which Linus and some of the other kernel hackers liked and used to make sure Linux didn't turn into a bloated mess like most commercial Unix offerings.

JA: Is LMbench still actively developed?

Larry McVoy: Sure is, http://lmbench.bkbits.net/ is where to go see what is happening. As you can see from the changelogs, Carl Staelin is doing all the development these days. It could certainly use some help though, we have piles and piles of results that we should organize and publish but I simply don't have the time.

JA: The Linux VM has received a fair amount of attention during the 2.4 kernel development process. How did your VM system compare to the Linux solutions?

Larry McVoy: It wasn't "my VM system" at all, it was the SunOS 4.x VM system. I was just tinkering with it, and I make no claims about the overall design. Smarter people than I did that VM system and it was a joy to behold. You could see how all of it fit together, how the file systems and the VM system worked together, it was elegant.

I can't really comment on the current Linux VM systems, I haven't spent enough time looking at them to be fair. I did notice that I had performance problems with 2.4.9-13 as shipped by Red Hat, and I switched my servers over to 2.4.19-pre6-rmap12i (Rik Van Riel's tree) and have been much happier with that. It throws stuff out when it is old and there is memory pressure and the only way I could get 2.4.9-13 to throw things out was to reboot.

I think one way to see if the VM systems have advanced is to see if they scan pages or inodes. The mistake I made was to try and do the work by scanning each page's reference bits. It was a mistake back then because there were too many pages to scan, which means it is an even bigger mistake now: there are far more pages to scan. Note that the page size was 4K then and is 4K now. You can't get any decent view of what is going on by looking at physical pages because you have no idea which ones belong together. Besides, if you have a 6GB machine, that's 1.5 million pages to look at. Try and design an algorithm that will tell you which pages to throw out by looking at the pages, it's just not going to work very well if you are looking at physical pages, you don't have enough information. That's what writing all those pageout daemons taught me and I believe it is still true today.

I believe that the right answer is to have a backpointer from each page to the inode and to store statistics in the inode about the state of the pages. So if I modify a page, that information goes into the inode; it doesn't tell me what page, it just tells me there is a dirty page. Then you scan all the inodes to try and decide what to do, and in the inodes you know things like how many pages the file has cached and how many of them are dirty.

Then start writing your pageout daemon with that information. I think you will get much better results.

If any of the VM rewrites have started down this path, it would be fun to watch, I believe that they could come up with some real improvements in how systems behave under load. The basic idea is that you are gathering a large number of statistics into a smaller number of buckets, and all the data in each bucket is related. So you can do a fast scan to find read only inodes which have lots of pages that have not been accessed recently and toss those first, for example.
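The scheme he describes can be sketched in a few lines (a toy model, not kernel code; every field and function name here is hypothetical): pages fold their events into per-inode counters, and the pageout scan walks the much smaller set of inodes.

```python
import time
from dataclasses import dataclass

@dataclass
class Inode:
    name: str
    npages: int = 0        # pages cached for this file
    ndirty: int = 0        # dirty pages (we know *that*, not *which*)
    last_access: float = 0.0

    def page_touched(self, dirty: bool) -> None:
        # A per-page event is folded into per-inode statistics; the
        # page itself only needs a backpointer to reach its inode.
        self.last_access = time.time()
        if dirty:
            self.ndirty += 1

def pageout_candidates(inodes, now, idle_secs=30.0):
    # Fast scan over inodes (thousands), not pages (millions):
    # prefer clean, idle files -- their pages can be dropped cheaply.
    clean_idle = [i for i in inodes
                  if i.ndirty == 0 and now - i.last_access > idle_secs]
    return sorted(clean_idle, key=lambda i: i.last_access)
```

A pageout daemon built this way would only walk the pages of the few inodes the scan selects, instead of scanning every physical page.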

Oh and one other thing: someone needs to write "topino" which stands for "top inodes". It's like top except it lists inodes instead of processes. You use the basename of the inode as the thing to list (I suppose you could have one that showed you inode numbers if you were worried about basename namespace collisions), and then you display all those statistics. I wrote something like this for SunOS, it was extremely good at providing you insight into what was going on in the VM system.
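As a rough illustration of the idea (toy code; the column names and layout are invented), `topino` would just sort the per-inode counters and print the busiest files, the way `top` sorts processes:

```python
def topino(stats, n=3):
    # stats: list of (basename, pages, dirty, accesses) tuples,
    # i.e. the per-inode counters the VM system would maintain.
    # Sort by access count, like top sorts processes by CPU.
    rows = sorted(stats, key=lambda s: s[3], reverse=True)[:n]
    lines = ["%-12s %6s %6s %8s" % ("NAME", "PAGES", "DIRTY", "ACCESSES")]
    for name, pages, dirty, acc in rows:
        lines.append("%-12s %6d %6d %8d" % (name, pages, dirty, acc))
    return "\n".join(lines)
```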

JA: Earlier you mentioned porting UNIX to a supercomputer. What version of Unix was this?

Larry McVoy: Hmm, I'm not sure, I think it was System III. I know that it was not one of the BSD versions, the source layout was /usr/src/cmd, /usr/src/uts, etc.

JA: What was your involvement with Google? How long did you work on that project?

Larry McVoy: My involvement was pretty minor. I worked a little on the design for the layer that virtualized "small" (32 bit) files into large files, but I don't know if they used that design or not. I was only there for a few months, it was pretty clear at that point that I really wanted to be working on BitKeeper and creating my own company.

We eventually parted on good terms and I'm extremely happy with Google, it has improved my life, it even helped avoid a rather nasty situation during the birth of my first son. Ask Sergey or Larry about it some time, I think they like the story. Google rocks.

JA: Is Sun still using Teamware?

Larry McVoy: Sure are. All of Solaris development has been done under it. We like that, it means Sun trains people in the peer to peer SCM model, which turns into a major source of revenue for us. People leave Sun, want Teamware, ask around, and then call us. BitKeeper is Teamware on steroids. Teamware didn't have changesets, wasn't reproducible, and wasn't as distributed as BK (it only worked over NFS), didn't have the import/export stuff, etc. But the basic idea of replicated repositories is the same.

JA: What prompted you to begin writing BitKeeper? (and when)

Larry McVoy: Having built a good system at Sun and seeing how well it worked, I expected that the commercial SCM folk would have copied it. They didn't, and the commercial systems leave a lot to be desired. So it was always in the back of my mind. I started tinkering around in the spring of '97 or so; the earliest recorded checkin is in May '97.

It didn't get really serious until it became clear that Linus was getting burnt out because of the enormous workload placed on him. I posted in Sep '98 that we needed to do something. I had been around OS development for a long time by then and I'd never seen anything like Linux or met anyone like Linus. In my opinion, then and now, Linus is what makes Linux great. He's the glue that holds it all together. Without him, Linux would splinter like the BSDs have. I'm actually a long time BSD fanatic and I abandoned BSD when I saw the splintering happen. I wrote a paper about my view of Unix and the future of operating systems, which you can find here: http://www.bitmover.com/lm/papers/srcos.html. Bob Young at Red Hat credits it with influencing how Red Hat was set up, he liked that paper.

Anyway, BK became a pressing issue when I saw Linus burning out, I wanted to do anything I could to help. I'm no replacement for Linus, so I focused on doing whatever I could to help offload him. And that was build an SCM system which worked.

Linus, Dave Miller, and Richard Henderson came up to my house for dinner and we drew pictures on the floor for about 3 or 4 hours, and when we were done, Linus said "yeah, that's cool, if you build it and it works like you say, I'll use it". And I foolishly said "No problem, I've done it before, 6 months or so". That would have been around the fall of '98 I think.

JA: From your story, it sounds like Linus himself was part of the original BitKeeper design process?

Larry McVoy: Hmm. I won't speak for Linus, but my memory is that it was more about getting him to see the design. The basic design of a peer to peer system was done before Linus ever released a copy of Linux, that was done back in the Teamware days. What we did at my house was go over the design, talk about how it would work, that sort of thing. There is a basic part of how BitKeeper and Teamware work that I've been drawing pictures of for 10 years, the part about how it handles parallel development without losing information when the replicas come back together. I think we spent most of the time on that. At that point and since, Linus has certainly made it clear when he doesn't think it is good enough, so he's influenced things, but the basic architecture was done a long time ago.

JA: Was there much communication between you and Linus during the first few years of BitKeeper development?

Larry McVoy: Yeah, but it was all about LMbench. Linus and I worked very closely on LMbench issues, he has the same sense of ethics that I do that the benchmark should reflect reality fairly, not be skewed toward any hardware or software, and he constantly helped straighten me out. Working with him on LMbench was a lot of fun and continued, albeit at a slower rate, during the BK development.

During BK development, we only talked about it a few times. He was definitely in the "show me the code" mode. He wasn't very interested in hearing how great it was going to be, he wanted to see it working. And he made it clear that he wasn't going to use it until "it was the best". The bummer about that was that it still isn't the best by his definition which is "it's not possible for it to be better". His view was that the other SCM systems were junk, and given his needs, he's right. So he wasn't interested in a system that was a little bit better than junk. It's like there is a scale of 0 to 1 and the best system out there was only at 1/2. He wanted 1/1, not something slightly better than 1/2. I eventually realized that it was going to be 10 years or more before we hit 1/1, so I cornered him at one of the Linux conferences and got him to see that. He eased off on "the best" definition a tad, but I still agree with him that it needs to be 1/1 and we're working on it. That said, it seems to be good enough to save him a fair bit of effort, which was the whole point, so from that point of view we've achieved some amount of success.

JA: How close is BitKeeper now to being what you'd imagined when you began developing it?

Larry McVoy: Pretty darn close. It does a lot more than what I started out thinking it would do, there are a thousand corner cases and features I didn't anticipate, but the basic idea of a peer to peer system of replicated repositories is exactly what I described to Linus & Co.

JA: As you said earlier, you've written an SCM before. What was so different about BitKeeper that it took so much longer than you expected?

Larry McVoy: I had to rewrite SCCS from scratch. SCCS is the basic engine on which everything else is built. RCS wasn't an option, RCS is an awful file format. SCCS can do lots of things that RCS can't even think of doing. A good example is "cvs annotate". To do that, they convert an RCS file to an SCCS file in memory and then run the annotate code. Seems like that should have told them something.
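Why the weave format makes annotate cheap can be shown with a toy model (a simplification for illustration, not the real SCCS grammar): every version's lines are interleaved in one file, bracketed by insert/delete blocks tagged with a serial number, so a single pass yields any version together with the serial (and hence the revision) that produced each surviving line.

```python
def annotate(weave, included):
    # weave: a flat list of ("I", serial) / ("D", serial) / ("E", serial)
    # block markers and ("line", text) tokens -- a toy stand-in for the
    # SCCS weave. included: the set of serials making up the version
    # we want. One linear pass produces the annotated version; no
    # diff replay the way RCS-style delta chains require.
    out, stack = [], []  # stack of (op, serial) blocks we are inside
    for tok in weave:
        kind = tok[0]
        if kind in ("I", "D"):
            stack.append(tok)
        elif kind == "E":
            stack.pop()  # sketch assumes well-nested blocks
        else:
            inserted = next((s for op, s in reversed(stack) if op == "I"), None)
            deleted = any(op == "D" and s in included for op, s in stack)
            if inserted in included and not deleted:
                out.append((inserted, tok[1]))
    return out
```

For example, a weave where serial 2 replaced the line "old" with "world" gives you either version, already annotated, from the same pass.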

BitKeeper is a substantially more complete product than Teamware, it handles more of the corner cases. Here's some examples: we support http as a transport. That's a pain, http has some limitations that we had to work around, and I'll bet we have at least 3 man months in that alone. Another example is the file merge technology. Wayne and I worked on that for most of last summer, which is a long time but now we have world class file merging technology, much better than Teamware's. Another example is bkbits.net, our hosting service. The PPC team had been in BK for more than a year and Cort got sick of being the guy who did all the work, so he dumped it off of FSMlabs. We wanted to support those guys, so we built bkbits to act as an ASP. It's nowhere near as polished as Sourceforge, for example, but it's useful. Some more time eaten up. The list never ends.

I didn't have Sun supporting the development. So as it grew into a bigger problem, I found myself having to build a company to support the problem. That was fun but very time consuming.

I think the summary is that we had to design and build a substantial product, a team of people, a company, a business model, a licensing model. All that takes time. I'm not complaining, it's been a blast. The only part I hate is constantly having to scramble to make sure we have money for payroll. Oh, and arguing about the license. But I don't argue anymore, I figured out about procmail :-)

Building a company is hard work and most people don't have what it takes to do it, it certainly takes a toll on you. But it is a blast. It's like someone pried your brain open and aimed a fire hose of information at you, about a million different topics, and you have to absorb all that and act on it. If you are like me, you thrive on learning new things, and building a company, well, I think that will force you to learn faster than anything.

This might be a good place to point out that I am not BitKeeper. Andrew Chang*, Andy Chittenden, Cort Dougan, Amy Graf*, Aaron Kushner*, Bill Moore, Rob Netzer*, Georg Nikodym, David Parsons*, Wayne Scott*, Rick Smith*, Harlan Stenn, Linus Torvalds, Ted Tso, Matthias Urlich, Beth Van Eman*, and Zack Weinberg* are among the people who have put time and effort into building BitKeeper up to where it is today. The starred names are people who work here or have worked here, but they all have contributed code, time, and thought to the product. Two other people are worth special mention: Bob Young (of Red Hat) is who I turn to for business advice, he's helped us out many times and I'm very grateful. Andy Bechtolsheim (Sun founder, now at Cisco) helped out in the first two years and we owe him a big thanks as well. It's easier to think of BitMover as a one man show, but that just isn't accurate.

JA: Linus and Marcelo are now both using BitKeeper to maintain the Linux 2.5 development kernel and 2.4 stable kernel. What feedback have you gotten from them?

Larry McVoy: I don't hear much from Marcelo, other than a few problems early on that we fixed (I got to log into his 8 way Xeon box, wheeee! Fast machine!).

Linus has both good and bad things to say. It's clearly saving him a substantial amount of effort, because he can now just pull in a pile of patches from each lieutenant who is using BK. It's easier to do that, and faster, and if there are merge problems, BK has great merge technology. All of this has resulted in less work for Linus, which was, after all, the whole point. As far as I can tell, people (including Linus) seem to think that it is reducing effort and enabling faster forward progress on the kernel.

On the other hand, there are various things which either aren't done yet or don't work right, and he is not shy about letting me know what they are. Some are easy to fix and we fix them fast, and some are profoundly hard and we fix them slowly or not at all. He's grumpy when we don't fix things and it's somewhat complicated by the fact that the Linux kernel team's use of BK isn't generating any money for us and we also have to take care of the paying customers; they are the people who make the kernel team's use of BK possible. So there is some tension there, but by and large it's worked out pretty well. We're not done by any means, but as things stand, BK is helping and getting better every day.

JA: There has been much controversy over the fact that BitKeeper is available under three special licenses, the BKCL, the BKL and the BKSRC. How do these licenses work, and why did you choose them?

Larry McVoy: I've never bought into the open source model as a self sustaining model for all software. It works in some places where the software is tied to some other source of revenue, such as hardware, but in general, it stinks as a business model. It's fantastic if your goal is to have a lot of free software out there, but it starts to fall apart when building that free software costs more than you can extract from it in revenue.

BitKeeper is in that camp. There is about 25 man years of effort in BitKeeper so far, with no end in sight. We pay Bay Area salaries, so our cost for an engineer is about $160K/year. That's at least 4 million dollars no matter how you look at it, and that's a lower bound. I took a hard look at the Cyclic people who tried to make a business out of supporting CVS and they pulled in $145K in their best year. It would take 27 years to make $4 million at that rate, and that assumes we stop drawing salaries today. In this product space, if people can use it for free, they will. People have tried to argue with me that BitKeeper is a better tool and it would generate more support revenue. That's nonsense, exactly because it is a better tool. At least with CVS, there are enough broken or missing features that you could generate revenue to fix them. Maybe.

So I took a hard look at the situation and decided that I wanted to maximize value to everyone. I divided the world up into 3 camps: the free users, the commercial users, and the vendor. The goals were to provide maximum value to everyone and have everyone provide value back in return. Here's how it works:

Free users: these users don't pay in money, but they do pay. They pay by using the product and pointing out bugs. BitKeeper is a dramatically better product because of the free users. The BKL, the free usage license, insists that you are running the latest images, because that's where the free users provide value. It doesn't help anyone to get bug reports on problems we've already fixed. The job of the free users is to help debug the latest.

Commercial users: these users pay in money which funds further development. As a commercial user, they can pick which release they want to run, which sometimes means they stay back for stability reasons, perceived or otherwise. They benefit from the free users running a new release first, and it's typical that they wait for the timestamps in the download area to be a few weeks old before upgrading.

Vendor: we provide value in the form of the product and support. We get the bug fix value from the free users and financial benefit from the commercial users. The money is turned right around into additional development.

While BitKeeper is hardly a get rich quick scheme, it is self supporting. We've taken no outside investment, the company is built on the backs and wallets of the people who work here, and that's cool. It means there is no outside board of directors in the form of VC's telling us to stop wasting time giving it away. I know that giving it away has helped make it a better product, which is good for everyone, but I'd hate to be in the position of having to justify that decision to a VC before the fact. It's easy to see that things worked after the fact, it's much harder to see that they will work ahead of time.

The bottom line on the licensing scheme is that it was designed to give as much and get as much as possible to and from all parties. Licenses such as the GPL give more to the free users, but give dramatically less to both the original author and to the commercial users. Using GPLed software for everything is like living in a world where the answer for when you have an illness is "here are the plans for the hospital, you can finish building it and check yourself in. Oh, and here's the medical instruments you'll need, you can slice yourself open and poke around. You can do it, good luck!".

Licensed software is more like the insurance model. Nobody pays what it costs to develop the software; that's way too expensive. So everyone pays a little bit and the cost load is spread out. Yeah, for consumer applications like what Microsoft ships, they can get very rich because there is a very large market. But for applications like BitKeeper, it's a tiny market, about a million seats world wide, and there are about 300 different SCM tools out there. Hardly the area to go try and do a free product and hope that support revenue will work. It's just not realistic. There is absolutely no chance that BitKeeper would be anywhere near as good as it is today if we had chosen to GPL it.

JA: However, GPLed software comprises the majority of the quite successful GNU/Linux operating system. I have found that if I have problems with a program I'm using I don't have to fix them myself, instead once I track down the development team (often just one person) and report bugs, they are quickly fixed. Free of charge.

Larry McVoy: That really depends on what it is that you are reporting. I've certainly found that to be true for some projects, but go take a look at all the projects on sourceforge and view their activity. The vast majority of them are the first pass of the code, it sort of works, and now no one is working on it.

The problem is that coding is fun while it is fun, but when it isn't fun, there has to be some other reason to keep doing it. For a lot of projects, it's painfully clear that it became "not fun" and the project stopped. Browse the CVS repositories and you'll see what I mean.

JA: Richard Stallman recently offered a rather negative view of Linus and his use of BitKeeper. Have you any reply to the statements RMS made?

Larry McVoy: Hmm, what can I say? I could point out that we made very sure that Linus could both accept and produce traditional diff based patches (and he does). I could point out, as others have, that Richard himself said that you can use a non-free tool when there is no good free alternative. But I think all of that really misses the point.

The point is that more and more people are coming to see the BitKeeper licensing model for what it is: a well thought out compromise which gives as much value as possible to all interested parties. Is it free? Not the way Richard wants, not even close. Is the product better than any free alternative? Yes. Could we have built the product if it were GPLed? No. Is it helping Linus be more productive? Yes.

Which is better: a free tool such as CVS which doesn't work all the time or a somewhat free tool such as BitKeeper which does work all the time? That really depends on your point of view. If you want the world to consist of nothing but free software, BitKeeper doesn't help that goal. If you want a tool to get a job done and done well, then BitKeeper helps much more than the free alternatives. I understand Richard's point of view, he wants a Utopian world where everything is both free and world class, but that's not realistic. The reality is that you aren't going to get everything you want for free, you're going to have to make some choices.

Richard might want to consider the fact that developing new software is extremely expensive. He's very proud of the collection of free software, but that's a collection of reimplementations, with no profoundly new ideas or products. Free software is very cool, it's useful, I use it, and I'm grateful, but it has one big problem. What if the free software model simply can't support the costs of developing new ideas? Realize that for every good new idea that you hear about, there are at least 100 that were funded, developed, and failed before you ever saw them. The naive reaction is "well, they were stupid". That's nonsense, history has shown over and over that we find new ideas amongst the insight we gain by building the bad ideas. Without doing that, we don't learn what was bad and we don't recognize what is good. So the problem is that all those bad development projects cost a lot of money. Does free software generate that kind of money? Not a chance. Go look at the software R&D budgets for Microsoft, Sun, IBM, etc. You can take all the free software revenue in the world and it doesn't begin to make a dent in what those guys spend, let alone what they earn.

So what does an all free software world look like? In my opinion, it could turn out to be a pretty dull place. Little or no new ideas, products, or innovations. We need the profit motive to keep the gears turning, those gears crank out the new stuff. It's great that free software gives us free versions of existing products, but who is going to pay for the next generation of new products? If Richard can answer that question, I'm with him 100%, but so far, I have seen no answer, only zealotry.

JA: I think it's safe to say that there are no other freely available source control systems as powerful or well designed as BitKeeper. When challenged, you've said, "Anyone and everyone is welcome to try and build a better SCM system". Do you see this eventually happening?

Larry McVoy: Yes and no. I suspect someone will replicate some of the features of BK in an open source system. So you'll be able to use something that claims to do what BK does. What I don't see happening is an open source project that becomes as well polished as BK is today, let alone tomorrow. There is a team of expensive people working every day on fixing the corner cases in BK. That's definitely in the "not fun" category. Add to that the fact that BK is a true distributed system, with no limitations imposed, and if you start to think about it hard you'll see that there are literally hundreds if not thousands of places where you have to do things differently in a distributed system. It's harder, and there isn't any room for cutting corners; it blows up in your face.

I will predict that you will never see a centralized system evolve into a distributed system. So CVS/Subversion/ClearCase/Perforce/etc will all stay with the centralized client/server architecture. They may try to replicate distributed systems and it will sort of work, but all the corner cases will not work. You need to design a distributed system to be distributed from day one.

I don't know how to explain how profoundly different it is to work on a distributed system, the only thing that I can say is that once you "get it", it's a lot like getting how to write multithreaded kernels. If you really know how to do that, you think about things differently. Distributed systems are the same way, once you get it, you think about problems differently. Centralized systems are easy, you have all the data you need to decide what to do. A distributed system means that other instances of your data are changing right now, all over the world, in any and every way, and you have to do the right thing when the data comes back together. If you are a file system person, think about writing a file system where the directory slot you want is occupied by a different inode in a different instance, and resolving all variations of that.
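His file-system example can be made concrete with a toy merge (all names hypothetical; real BK resolution is far more involved): two replicas have each, independently, put a different file into the same directory slot, and the merge must keep both without losing either.

```python
def merge_dirs(local, remote):
    # local/remote: dict mapping filename -> id of the file created on
    # that replica. In a centralized system the second create would
    # simply fail; in a distributed one both creates already happened,
    # so the merge has to do the right thing after the fact.
    merged, conflicts = dict(local), []
    for name, fid in remote.items():
        if name in merged and merged[name] != fid:
            # Same directory slot, different file: keep both under
            # distinct names and record the conflict for the user.
            merged[name + "~remote"] = fid
            conflicts.append(name)
        else:
            merged[name] = fid
    return merged, conflicts
```

The point of the sketch is that neither side can be told "no" at create time, so every such collision, in every variation, has to be handled when the replicas come back together.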

JA: Who are some companies that have chosen BitKeeper as a source control solution?

Larry McVoy: Well, I can tell you about startups who are using it, like 3pardata, Soma Networks, Bluearc, Starent, Intrinsyc, Intransa, Storigen (lots of I/O places), but the big companies tend to be cagey about us using their name. So I'll stay away from that. You can go browse the openlogging tree because that's open by definition, and you'll find places like Apple, Red Hat, Akamai, John Deere, HP, Intel, IBM, Fujitsu, and lots of lesser known companies.

JA: Are you aware of any companies that have switched from using ClearCase to using BitKeeper?

Larry McVoy: Sure. We compete almost exclusively with ClearCase. We've never lost a new sale to ClearCase. That's not as impressive as it sounds; Rational does about $350M/year in ClearCase/ClearQuest, and we're nowhere near that. But we'll get there some day. The reason is that the total cost of ownership is about 5x lower for BitKeeper than for ClearCase. You don't need ClearCase admins, nor do you need $300K Sun servers, to run BitKeeper. We have one customer with more than 100 developers maintaining 2-3GB of source (a Linux distribution) on one dual-processor 750 MHz rackmount server. They say the server was overpowered for the load. Price out what it would take to do the same thing with ClearCase and you start to see why we don't lose sales.

We have had a small number of sites which have switched, as opposed to choosing BK instead of ClearCase, but those are rare. ClearCase is expensive and the costs are front loaded so management is reluctant to move off of something after they have dumped a large pile of money into it. But we're starting to get some traction even in those situations.

JA: How evolved is the BitKeeper GUI?

Larry McVoy: It's pretty advanced in terms of functionality; you can do things with it that you can't do with anything else. It really helps you debug, for example, which is not something you would typically expect from an SCM system. Whether you like the GUI on first glance depends on who you are. If you want a pretty GUI like Win/XP or KDE, the BK GUI is not going to wow you right away; it's strong on features, not on eye candy. Linus complained about the fact that Tcl/Tk, which is what we use for the GUI toolkit, doesn't have anti-aliased fonts, for example.

If you want an integrated IDE, where you never hit the command line, we haven't shipped that yet either. We have a start at it and you can see some screenshots here: http://www.bitkeeper.com/repo/ but it isn't ready for prime time.

All that said, if what you need is to get your job done, the BK GUI is quite powerful. One kernel hacker described the BK GUI as "the only kernel hacker GUI I've ever seen", which we think is a compliment :) The GUI tools are useful, they can do things nothing else can, and they make your day go faster. In particular, the three-way file merge is the best in the industry; it works in a way that is much more deterministic than the normal three-way diff, and revtool makes it easy to track down bugs. Add to that a side-by-side diff viewer, a graphical changeset (patch) viewer, a graphical checkin tool which shows you changes while you type in the comments, a two-way file merge for simple merges, and a graphical hyperlinked helptool, and things start to look pretty good. The GUI suite currently augments the command line; some people don't like that, they just don't want to hit the command line at all (we fondly describe them as the "hunt and click" crowd). We're working on an IDE interface which will give them what they want, but in the meantime, it seems that people agree that the GUI tools are powerful, functional, and save a lot of time.
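The classic three-way rule that merge tools build on can be sketched in a few lines: a side that differs from the common ancestor "wins"; if both sides changed the same line differently, it's a conflict. This minimal line-by-line sketch assumes equal-length inputs for brevity; BitKeeper's merge is considerably more sophisticated than this.

```python
# Minimal three-way merge over parallel lines.
# base: the common ancestor; local/remote: the two divergent versions.

def merge3(base, local, remote):
    out = []
    for b, l, r in zip(base, local, remote):
        if l == r:
            out.append(l)            # both agree (or neither changed)
        elif l == b:
            out.append(r)            # only remote changed
        elif r == b:
            out.append(l)            # only local changed
        else:
            out.append(f"<<< {l} ||| {b} >>> {r}")  # true conflict
    return out

base   = ["a", "b", "c"]
local  = ["a", "B", "c"]
remote = ["a", "b", "C"]
print(merge3(base, local, remote))   # ['a', 'B', 'C'] -- no conflict
```

Because every decision is made against the ancestor, the outcome is deterministic: the same three inputs always merge the same way, which is the property Larry is pointing at.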

JA: Bug tracking is an important part of software development. Does BitKeeper integrate with existing bug tracking solutions, or even offer its own solution?

Larry McVoy: We haven't shipped it yet, but we have a bug tracking system we wrote from scratch. It's based on BK, we've added database functions to BK, you can do things like

bk query 'select where STATE =~ /open/ AND SEVERITY =~ /[12]/'
and it will dump out the list of matching records. It's pretty cool. And when we ship it, it will be cross linked with changes, so when you are doing a checkin, you can search the bugdb for the bug report and say "this changeset closes this bug" and it will make the cross links.
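The semantics of a query like that are straightforward to sketch: each `FIELD =~ /regex/` clause must match the corresponding field of a record, and matching records are returned. The record layout and evaluator below are invented for illustration; they are not BitKeeper's actual bug database format.

```python
import re

# Sketch of evaluating: select where STATE =~ /open/ AND SEVERITY =~ /[12]/
# Every (field, pattern) clause must match for a record to be selected.

def matches(record, clauses):
    return all(re.search(pat, record.get(field, ""))
               for field, pat in clauses.items())

bugs = [
    {"ID": "1", "STATE": "open",   "SEVERITY": "1"},
    {"ID": "2", "STATE": "closed", "SEVERITY": "2"},
    {"ID": "3", "STATE": "open",   "SEVERITY": "4"},
]

hits = [b["ID"] for b in bugs
        if matches(b, {"STATE": "open", "SEVERITY": "[12]"})]
print(hits)  # ['1'] -- the only open bug at severity 1 or 2
```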

JA: If you one day tire of developing BitKeeper, what will become of it?

Larry McVoy: If I personally tire of it and the company keeps going, it will stay as it is now, subject to change by the board of directors. I'm chairman of the board and I don't intend to give that up for at least another 20 years, so I don't anticipate any major changes in how BK is licensed. What I do want to do is educate the people who come in after me so that they can see that BitMover benefits from the current licensing model. It's much harder for other SCM systems to compete with us because we have much of the open source advantage: when we release a new image, it gets tested by a lot of people very quickly. Unless the other SCM people adopt our model, it's tough for them to catch up. If I can make the new management who backfill me understand that, then I suspect that in 100 years, it will still be the same model. The model has worked well for us; it did what we wanted.

If the company were to go under, then BitKeeper becomes GPLed.

JA: I read on your home page that after working on distributed source management systems (BitKeeper) you intend to work on Linux clusters. What sort of work on clusters do you intend?

Larry McVoy: I want to cluster multiple instances of the operating system on a single machine to get SMP scaling on the cheap. I hate multithreading as an approach, it complicates things way too much, and makes the source base more or less unmaintainable. It really raises the bar on how good you have to be to work in the kernel. All the kernel jocks want to prove how studly they are, so they don't care about raising that bar, but give them time, they will.

You can read some slides about this here: http://www.bitmover.com/cc-pitch

JA: The slides in this link are notes talking about a single server "cache coherent" Linux cluster. What is "cache coherency"?

Larry McVoy: I mean the same thing that you would get on a normal SMP. If I have N processors banging on the same data, there are copies of that data in each cache. The SMP hardware, working with the OS, has to make sure that you never have two processors modifying the same data at the same time to different values. That's cache coherency.
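The invariant can be shown with a toy write-invalidate model: each CPU holds a private cached copy of a value, and a write by one CPU invalidates every other copy, so no two processors ever observe different values for the same data. Real protocols (MESI and friends) are far more involved; this sketch only demonstrates the invariant Larry states.

```python
# Toy write-invalidate coherency model: N per-CPU caches over one memory.

class Cache:
    def __init__(self):
        self.line = None     # None means "not cached / invalid"

def write(cpu, caches, memory, value):
    for i, c in enumerate(caches):
        if i != cpu:
            c.line = None    # invalidate every other copy first
    caches[cpu].line = value
    memory["x"] = value      # write through to memory, for simplicity

def read(cpu, caches, memory):
    if caches[cpu].line is None:
        caches[cpu].line = memory["x"]   # miss: fetch from memory
    return caches[cpu].line

memory = {"x": 0}
caches = [Cache(), Cache()]
read(0, caches, memory)          # CPU 0 caches x = 0
write(1, caches, memory, 7)      # CPU 1 writes; CPU 0's copy is invalidated
print(read(0, caches, memory))   # 7 -- CPU 0 re-fetches, never sees stale 0
```

The hard part of a cache-coherent cluster is getting this same guarantee without shared-bus hardware doing the invalidation for you.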

JA: I'm very curious about how this would actually work. For example, what would the boot process be like?

Larry McVoy: The boot process is probably the nastiest part, with interrupt dispatch being a close second. The boot process would be a lot like LILO is now, you have a kernel which starts another kernel, except that in this case, the first kernel would start N kernels telling them what resources are theirs.

JA: Would the system tune itself, automatically launching additional Linux instances? Is there then a single controlling entity among the many instances?

Larry McVoy: Personally, I'd say no, at least at first. DEC/Compaq/HP has a patent on migrating CPUs between instances, look at their Galaxy stuff.

I've always advocated doing N OS instances where each OS instance is a 2-4 way SMP. So you get static load balancing when you exec a process (it may exec remotely on a less loaded instance) and dynamic load balancing out of your little SMP.

JA: When scaled, wouldn't there be serious issues with resource contention?

Larry McVoy: There could be. Dave Miller was someone who raised this point in the past. Consider a file system namei() (convert path to inode). If everyone is pounding on the same directories, it won't scale, or that's the claim.

I did some work about 10 years back which addresses this in a cute way. You stripe at the file system level. Imagine an NFS client which stripes over N NFS servers. If you are careful about how you hash the inodes onto the NFS servers, you can scale much, much farther with no contention.
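The core of the striping idea is a deterministic placement function: every client hashes a path (or inode number) to the same server without any coordination, so hot directories spread across N machines instead of pounding on one. The hash choice and modulo placement below are illustrative assumptions, not details of McVoy's actual design.

```python
import zlib

# Deterministically map each path onto one of n_servers.
# Any client computing this independently agrees on the placement.

def server_for(path, n_servers):
    return zlib.crc32(path.encode()) % n_servers

paths = ["/usr/src/linux/fs/namei.c",
         "/usr/src/linux/kernel/sched.c",
         "/home/lm/bk/ChangeSet"]

for p in paths:
    print(p, "-> server", server_for(p, 4))
```

Because placement needs no shared state, a namei() lookup on one server never contends with lookups that hashed elsewhere, which is why the scheme sidesteps the contention objection.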

So I believe that it can be solved. And the solution space would apply to things outside of an SMP cluster.

JA: Would a smaller SMP server, with perhaps 2 or 4 CPUs, gain anything from this solution?

Larry McVoy: Yes, you bet it would. In fact, it would be cool on a single CPU. Look at all the virtual hardware stuff like IBM's VM, vmware, UML, etc. Now imagine running 2 copies of Linux on a single CPU. You can start playing all sorts of interesting games when the system fails. The kernel can panic and keep going, there are two of them. Or three, or more.

JA: What efforts have been made to bring these clustering ideas into reality?

Larry McVoy: Very little. I talk about them every chance I get but I'm very busy with BK. I had hoped that we would be farther along with BK by now so I could focus on this, but we're not. I've started talking to Cort Dougan about this stuff, he's thinking about implementing a first pass as part of his PhD. Jeff of UML fame has looked at it and seen how UML could be used to prototype it. It looks like the interest in the idea is increasing, and I'm hopeful that Cort and others will pick it up and make it a reality. I told Cort I'd write the controlling tty and process group code, so maybe he'll do it just to make me do some useful work in the kernel again.

JA: How do you enjoy spending your time when you're not working on your computer?

Larry McVoy: I like playing with my kids. It's true what they say, kids give you a second chance at childhood. It's really cool.

I like backpacking and fly fishing, though that's fallen off a bit as I have gotten busier.

I have a cabinet shop in my back yard, and am fanatical about old woodworking tools. My wife says I should tell people I have a shop full of tools so I can repair my tools and there is some truth to that. I have a 1940's era metal working lathe that would make any geek drool and I'd love to get a mill.

Hiking along the coast is something that is fun and I actually do on a regular basis. I also play a little roller hockey. Oh, yeah, I watch the Sharks.

And my wife, when reviewing this, said to point out that I read books like crazy. I read very quickly, which is not always a plus, it means I go through books like there is no tomorrow. The house is full of overflowing bookshelves.

JA: How did you get interested in woodworking tools?

Larry McVoy: Part of it is that I need a break from hi tech. I love what computers have done for me and I like working on them. But it is really cool to go out and grab a hand plane and make some shavings. It smells good, it sounds good, it feels good, and it's all hand work. Gives me a connection to the past.

Part of it is that I build things, I'm an engineer. I've always liked old things, I like how they smell, I like the history, I like trying to imagine how people lived when they used these tools. The oldest tool I have has an 1837 patent date. It's cool to think about the generations of people who used that tool.

Part of it is that I like the way hand built furniture looks. It's a little rough, it's not perfect. Machined furniture looks like it was injection molded, and that's a miserable way to treat a nice piece of wood. Furniture should have, in my opinion, personality. That makes it interesting. Imagine what it would be like if we all had exactly the same perfect facial features. Boring as hell, yet that is what the modern machines are doing to furniture. It's inevitable, the cost of labor makes it so, but that doesn't mean I have to like it.

It's also amazing to see that tool values have remained pretty constant. I have the ultimate Rube Goldberg hand plane, a Stanley 55. I paid $600 for it. In the Sears 1897 catalog, that plane was 3 bucks. I flipped through the catalog and compared prices of things that we can still get now to what they were then, and it mapped pretty closely to the $600.

JA: Reluctantly bringing the conversation back to the topic of this webpage, have you any advice for readers who are interested in getting involved in kernel development?

Larry McVoy: You bet. Learn how to think in C++ but don't ever program in it. If you can't close your eyes and list all the objects and their methods in the kernel, you aren't a kernel programmer by my definition. The best people are the ones who are both architects, in that they can see and design the big picture of the kernel, and programmers. You want someone who can both see it and build it. It's rare to find that.

Strive for minimalism. Less is more. If you can delete code and have it still work, do so. Strive for consistency. Perl sort of sucks in some ways, but it is cool in that it builds on the knowledge you already have like C, sh, sed, awk. So where possible, try and make your new thing reuse old things, even if it is only old ideas. Old ideas are almost always better than new ideas, I can't tell you how many times I've seen the same bad ideas come back. The ones that are still here are here for a reason.

Go look at RT/Linux for points for orthogonal thinking. That's a really cool way to solve the problem. I like it because it solves the problem in a simple way, it is almost 100% non-invasive, and it works much better than ill-advised approaches taken by those who want to add real time to multi-user operating systems. People who take the latter approach haven't learned that real time and multi-user throughput are mutually exclusive.

JA: Thank you for taking the time to answer my questions! I'm impressed with how much thought has gone into BitKeeper as a program, and BitMover as a company. It seems Linux development stands to gain much from your efforts.

Larry McVoy: Hey, thanks to you for the chance to ramble on, it's fun to think back on the trip, and what a trip it's been (apologies to Ken Kesey, may he rest in peace).

About the interviewer: Jeremy Andrews was born and raised in Southeast Alaska. Currently he lives and works in South Florida. He maintains KernelTrap as a hobby.

Copyright 2002