Torvalds Gives Inside Skinny on Git

By Steven J. Vaughan-Nichols
eWeek

April 22, 2005

Linus Torvalds didn't want to change software configuration management tools; however, business and open-source philosophy problems left the Linux founder with no choice but to abandon BitKeeper and create his own system: Git.

SCM programs are used to control the flow of updates and track program changes. In a project as large as Linux—more than 17,000 files—this can be very difficult and very slow.

Because most SCMs, such as CVS (Concurrent Versions System), were too slow for him, Torvalds built his own.

He describes Git as "a stupid (but extremely fast) directory content manager. It doesn't do a whole lot, but what it does do is track directory contents efficiently."

It also can't be used with BitMover Inc.'s BitKeeper, the controversial and proprietary SCM that Torvalds had used to manage Linux kernel development.

"Git has a totally different model of representing the source tree," said Torvalds in an exclusive interview with Ziff Davis Media Internet News.

The name itself really doesn't have a meaning. Torvalds joked that it can be a "random three-letter combination that is pronounceable, and not actually used by any common Unix command. The fact that it is a mispronunciation of 'get' may or may not be relevant." Or, "'stupid. contemptible and despicable. simple.' Take your pick from the dictionary of slang." Or, "global information tracker: [if] you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room."

Git has already been used for its first Linux release: the 2.6.12-rc3 release candidate. But Torvalds admits that Git is still a work in progress.

"The roughness really comes from two things," said Torvalds. "It's a young project, and it just takes time for things to mature. That will go on for years, assuming none of the other open-source SCMs just eventually show themselves to be capable enough that we just end up deciding that Git was a good temporary bridge."

Also, Git does some things very differently from traditional source management, Torvalds said.

"BK [BitKeeper] did that too, but in many ways Git is even more different," he said. "You literally cannot track single files individually. Git always works on a 'collection of files' model, and there's no way to get the history of just one file without also getting the history of all other files; you can then ignore the other files, of course."
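The "collection of files" model Torvalds describes can be illustrated with a toy sketch. This is not Git's actual object format or code: the helper names (blob_id, snapshot, commit_id) and the sample file contents are hypothetical, chosen only to show that each version records the whole collection under one identifier, with no per-file revision record.

```python
import hashlib
from typing import Dict, Optional


def blob_id(content: bytes) -> str:
    """Name a file's content by its SHA-1 digest (toy version, not Git's real format)."""
    return hashlib.sha1(content).hexdigest()


def snapshot(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map every path in the collection to the id of its content."""
    return {path: blob_id(data) for path, data in sorted(files.items())}


def commit_id(tree: Dict[str, str], parent: Optional[str]) -> str:
    """Derive a single id covering the whole collection plus the parent commit."""
    h = hashlib.sha1()
    h.update((parent or "").encode())
    for path, bid in sorted(tree.items()):
        h.update(f"{path}\0{bid}\n".encode())
    return h.hexdigest()


# Two hypothetical versions of a tiny source tree; only one file differs.
v1_files = {"Makefile": b"all:\n", "kernel/sched.c": b"int x;\n"}
v2_files = {"Makefile": b"all:\n", "kernel/sched.c": b"int x = 1;\n"}

tree1, tree2 = snapshot(v1_files), snapshot(v2_files)
c1 = commit_id(tree1, None)
c2 = commit_id(tree2, c1)

# The unchanged file keeps the same content id in both snapshots,
# but the version itself is identified only as a whole collection.
print(tree1["Makefile"] == tree2["Makefile"])
print(c1 != c2)
```

In this sketch a single file has no history of its own. The only thing that is versioned is the full path-to-content mapping, which matches the point that one file's history comes along with everyone else's.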

It's also, Torvalds believes, a matter of perception.

"Some of the 'roughness' is just that people who are used to some things working certain ways will just be surprised by the Git model," he said.

"Git is just incredibly fast at handling big collections of files. The kernel is 17,000+ files, and Git can show the difference between two different kernel versions in small fractions of a second.

"But," Torvalds continued, "if you ask it, 'When did this file change last?' Git will have to think about that, exactly because it doesn't do things on a file level, and it will have to look at all [the] changes."
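The trade-off in these two remarks can be sketched in a few lines. The following is an illustrative toy under stated assumptions, not Git's implementation: snapshots are plain dictionaries from path to a placeholder content id, the history is a simple linear list, and the names diff_trees and last_changed are hypothetical. Comparing two snapshots is a direct id comparison, while asking when one file last changed means walking back through whole-collection versions.

```python
from typing import Dict, List, Optional, Tuple

Tree = Dict[str, str]  # path -> content id (placeholder strings in this sketch)


def diff_trees(old: Tree, new: Tree) -> List[str]:
    """Return paths whose content id differs (added, removed, or modified)."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


def last_changed(history: List[Tuple[str, Tree]], path: str) -> Optional[str]:
    """Walk from newest to oldest, comparing each snapshot with its predecessor."""
    for i in range(len(history) - 1, -1, -1):
        label, tree = history[i]
        previous = history[i - 1][1] if i > 0 else {}
        if tree.get(path) != previous.get(path):
            return label
    return None


# A hypothetical three-version history, oldest first.
history = [
    ("c1", {"Makefile": "a1", "kernel/sched.c": "b1"}),
    ("c2", {"Makefile": "a1", "kernel/sched.c": "b2"}),
    ("c3", {"Makefile": "a1", "kernel/sched.c": "b2", "mm/slab.c": "d1"}),
]

print(diff_trees(history[0][1], history[2][1]))  # ['kernel/sched.c', 'mm/slab.c']
print(last_changed(history, "kernel/sched.c"))   # c2
```

The diff touches only the two snapshots being compared, however long the history is. The last-change query has no per-file record to consult, so in the worst case it inspects every version, which is the "thinking" Torvalds refers to.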

That may not be for everyone.

"Now, that model is very appropriate for me, and I much prefer the Git model over traditional SCMs. But others will quite possibly hate it with a passion. 'Different' often means 'rough' to people," admitted Torvalds.

Torvalds also isn't sure that Git will move far beyond its use with Linux.

"The most superficial roughness will have been fixed in a month or two. … You certainly could use it for other projects. I bet kernel people will, just because they get used to working with Git," he said.

Still, "it's a different mentality, and a lot of the things that it does well are probably not horribly relevant to many other projects."

There's a good reason for this, he said. "Most other projects just don't have tens of thousands of files and hundreds of patches a day, so they don't have the kind of performance requirements that the kernel has.

"Also, most other projects simply don't use the same distributed development that the kernel uses. They have a single central repository, and people work with that, and while you can certainly use Git that way too, you just won't see a lot of advantages to Git if you use it in a centralized manner," Torvalds said.

"So we'll see," said Torvalds pragmatically.

"It's entirely possible that people will start using Git more widely, but it's absolutely not a done deal. Let's face it, most new projects end up being failures, and I won't be terribly upset if Git just ends up being the thing that gets us kernel developers working well until the point where some other SCM ends up being good enough."

Copyright 2005