We like to think that there is a lot about technology that is objective, but I can't count the number of times that I thought that particular design choices made no sense until I understood why they were made. Before I understand, design choices often look awful. After I understand, my annoyance is often transformed into the realization that I'm probably getting more than I expected.
When I read My Problem with Git: No Abstraction this morning, I felt like it had the first part down. The author bemoaned many of the design choices in git. He's not alone. On any random day, I complain about the git CLI 3-5 times. The thing that I do appreciate, though, is exactly the thing that the post was complaining about: the fact that, to use git effectively, you have to understand its internal model.
Let me backtrack for a minute.
A few years ago, I was at a conference and I heard a keynote by Simon Peyton-Jones of Haskell fame. It was an odd keynote. Often keynotes at technical conferences stick to light topics. You rarely see code, and you are rarely asked to think very hard. That's usually reserved for the sessions. Peyton-Jones' keynote, however, took no prisoners. He started with the notion of classes in OO and then proceeded to show us how type classes are implemented in GHC, the Haskell compiler that Simon does extensive work on. What we had at the end of the keynote was a good operational understanding of type classes. It was nice, concise, and a far more effective approach to the subject than any I've seen in Haskell books. Haskell books often try to give you the semantics of the constructs, helping you build the mental model you need as a programmer in order to use them. But that definitely isn't a very complete picture.
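I don't have the keynote's code, but the core idea, that the compiler turns a type class into a record of functions (a "dictionary") and threads it through as a hidden extra argument, can be sketched in any language. Here it is in Python; all the names are mine, illustrating the translation rather than GHC's actual internals:

```python
# "class Show a where show :: a -> String" becomes a record of methods.
# Each instance is one such record (a "dictionary").
show_int = {"show": lambda x: str(x)}  # instance Show Int

# An instance with a constraint (instance Show a => Show [a]) becomes a
# function from a dictionary to a dictionary:
def show_list(d):
    return {"show": lambda xs: "[" + ",".join(d["show"](x) for x in xs) + "]"}

# An overloaded function "display :: Show a => a -> String" becomes a plain
# function that receives the dictionary explicitly; the compiler, not the
# programmer, picks which dictionary to pass at each call site.
def display(show_dict, value):
    return show_dict["show"](value)

print(display(show_int, 42))                  # compiler would pass show_int
print(display(show_list(show_int), [1, 2]))   # and build the [Int] dictionary
```

Once you've seen that, questions like "why can't I have two instances for the same type?" or "what does an ambiguous type error mean?" stop being mysterious; they're questions about which dictionary gets passed.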
At the conference, I turned to a friend of mine and said "you know, more people should explain languages this way: in terms of an idealized implementation. The understanding is much deeper."
Perhaps it's me. When I learned C, I learned it from a book which explained memory and showed how pointer manipulation affected memory using little drawings of grids and arrows. When I learned C++, I was glad to learn about vtables. Understanding them and how they operated in single and multiple inheritance made it much easier to develop an intuitive sense of whether something was or was not possible in the language, or in other languages.
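Those vtables can be mimicked directly. Here's a toy sketch (the names are mine, not any compiler's) of what a C++ compiler roughly arranges: each class gets a table of function pointers, each object carries a reference to its class's table, and an override is just a replaced slot:

```python
# Hand-rolled vtables: each "class" is a table of function pointers.
def animal_speak(self):
    return "..."

def dog_speak(self):
    return "woof"

AnimalVTable = {"speak": animal_speak}
# Deriving copies the base table; overriding replaces a slot.
DogVTable = {**AnimalVTable, "speak": dog_speak}

class Obj:
    def __init__(self, vtable):
        self.vtable = vtable  # the "hidden pointer" every polymorphic object carries

    def call(self, name):
        # Virtual dispatch: look the method up in the object's table at runtime.
        return self.vtable[name](self)

print(Obj(DogVTable).call("speak"))     # dispatches through Dog's table
print(Obj(AnimalVTable).call("speak"))  # dispatches through Animal's
```

With this picture in hand, things like "why do virtual calls cost an indirection?" or "why does multiple inheritance complicate object layout?" become questions you can reason about rather than trivia to memorize.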
I feel that git has the same quality. Linus could have chosen to hide the implementation model more deeply. He could have made it as opaque as other version control systems are, and I think it is nice that he didn't. After all, we are programmers. We don't need models that protect us from internal technology. On the contrary, we gain leverage when our tools have consistent implementation models and they are transparent to us. They help us reason.
So, I don't think it is the case that git and the Unix commands mentioned in the post have leaky abstractions. You're expected to know the model. The model is the primary way of understanding the system.
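And the model is small enough to hold in your head. A blob, for instance, is just a file's contents with a short header, hashed; trees, commits, and tags use the same scheme. A minimal sketch of what `git hash-object` computes:

```python
import hashlib

def blob_id(data: bytes) -> str:
    """Compute the object id git assigns to a blob: SHA-1 over a
    'blob <size>\\0' header followed by the raw contents."""
    store = b"blob " + str(len(data)).encode() + b"\x00" + data
    return hashlib.sha1(store).hexdigest()

print(blob_id(b"hello\n"))
# the same id that `echo hello | git hash-object --stdin` reports:
# ce013625030ba8dba906f756967f9e9ca394464a
```

Everything in .git/objects is one of four object types stored this way (zlib-compressed on disk), and branches are just names pointing at commit ids. That consistency is exactly the leverage I'm talking about: once you know it, most "mysterious" git behavior follows from it.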
+1 all the way.
Donald Norman (in "The Design of Everyday Things") makes a big deal about this very thing; calling it a "conceptual model."
A decent image appears partway through his blog post here:
http://www.usabilitypost.com/2010/11/17/the-design-of-everyday-things/
Posted by: Mamund | April 26, 2012 at 11:38 AM
Good points. I always tell people to read http://eagain.net/articles/git-for-computer-scientists/ when they don't grok something about the git CLI.
Posted by: Scott Jacobsen | April 26, 2012 at 11:42 AM
The model that I think has worked well for some products (although targeting a different demographic/userbase) is something like what Microsoft Exchange does (bear with me, here ... :)
It comes with a GUI that both walks you through common tasks and helps visualize the current state of the system. However, when doing so, it lets you see how it's performing those actions (well, at least on the 'write' paths) by using the cmdlets that it comes with and showing you the 'script' that it wrote and will run.
Back when I admin'd bunches of AIX machines, there was a similar UI called 'smit' (and a tty version 'smitty') which similarly gave you a UI, but let you progressively learn the underlying details by constantly showing you what it was doing.
For people coming from another VCS (especially with no previous DVCS experience), I think a huge amount of clarity could be gained by just having visualization app(s) that showed you the current state of things (your local branches, remotes, etc.), especially if it could do so with 'potential changes' made (what do the various trees look like if I were to do command X?) without requiring manual use of stash and the like. :)
The bigger gain will certainly be when a UI lets you choose an 'action' entry of 'revert my local changes' but tells you what it's doing (as git commands) and why.
Of course, maybe the real answer is that there needs to be a new VCS created that just happens to use git as an underlying implementation detail, intentionally limiting many of the 'power' uses (at least without bypassing it and using git directly) but trying to solve an 80% target with simpler and less complex mental models.
Things that use git as a storage mechanism seem like a great idea - stop trying to 'fix up' git, just make a new VCS :)
Posted by: James Manning | April 26, 2012 at 02:46 PM
I think of Git a little like a squash racket for a pro player. Squash rackets for pros have a very small 'sweet spot' somewhere near the middle, and if you hit the ball right there, you get tremendous power. If you miss the sweet spot, you get a weird noise and the ball goes in a more or less random direction. Beginners' rackets have much bigger sweet spots, but never deliver the same power. Beginner players (like me) become worse, not better, by trying to use a pro racket.
Git requires you to invest a lot of time in understanding it. If you do, you get a great payoff - a VCS that does exactly what you tell it to do, no more, no less and does so with great performance. But if you're not going to be able to/want to make the investment in learning it (and keep using it daily so you don't forget), Git is probably not a good choice.
I think, like for squash rackets, it makes sense to have different tools for different types of users.
Posted by: Petter Måhlén | April 27, 2012 at 12:47 AM
Is there a link to a recording of Simon's keynote somewhere?
Posted by: ashic | April 27, 2012 at 12:49 AM
I agree completely. For instance, the shop where I'm consulting right now decided to start using git-flow. It's a leaky abstraction on top of git, and until you understand what the abstraction is doing in terms of git, it's just 'magic' that often gets you into trouble. I think it's easier to just use git, as you have to understand the git 'model' anyway.
Also, I worry about developers who have problems learning git well enough to do basic day-to-day development. If they can't get that, then what hope do we have that they can design/code well? It's pretty much the same level of abstraction.
Posted by: Greybeardedgeek | April 27, 2012 at 05:39 AM
I guess I may not have used git enough to have run into trouble but so far... It just works. Things work as expected, and I am loving the push/pull functionality. Merging changes really is conceptually trivial unless there are conflicts, and conflicts are best resolved locally anyway - you don't let your vcs do something "smart" first because that will inevitably lead to a mess. That approach worked for cvs, svn, and now also git.
I have not done complicated vcs ops on git but I doubt it could possibly be as byzantine as svn. And besides, if you run into complicated version control system issues, 99% of the time, you're doing it wrong.
Posted by: Nk | April 28, 2012 at 05:40 PM