Relearn

merging

Social experiments with Git

Git (http://git-scm.com/) is software that makes it possible to keep track of different versions of files and to exchange them. Even if such a tool sounds utilitarian, its highly distributed nature has profoundly changed the way software developers work together. Git was initially developed by Linus Torvalds, the original developer of the Linux kernel, who is a controversial figure. The social implications of this tool were a subject of discussion during Relearn and afterwards.


Femke Snelting:
Two weeks have passed since Relearn ended. In an attempt to prepare this page for publication, I finally added this rather long note:

In the session documented on these pages, we set out to have a closer look at the way Git (a tool for distributed version control and source code management) detects the kinds of conflicts that arise when many people work on the same files, and moreover how it assists in resolving them. We had only scratched its surface two days earlier when experimenting with MetaPost and Git through a digital cadavre exquis. PierreM suggested that it could be interesting to start from Diff and Patch, two much older tools that make up Git’s merging mechanism. And so we did.

While these pages were being prepared for publication, Eric brought our bold statement that ‘Git comes from a non-collaborative mindset’ up for discussion on the Relearn mailing list: ‘It is for me quite clearly a tool for collaboration’ (Eric), ‘Not all collaboration is horizontal’ (Eleanor), ‘How to work alone but in a group?’ (Vincent). Of course all of that is true. As you can hopefully deduce from the notes that Anne took during the session (afterwards completed by Loraine and Eric), we felt that Git first of all started out as a management tool for and by a specific person, rather than as a platform for shared development. By ‘non-collaborative mindset’ we meant that Git was not developed with the idea of social coding in mind, even if, a few years later, it is being used in mainly collaborative situations.

Trying to understand the initial mindset of Git is just one way to think through how it functions in collaborations. Speaking about collaboration while avoiding the opposition of individualist versus collaborative (as suggested by Eleanor on the mailing list) might be another. To me it seems necessary to ask questions about Git precisely because of the enthusiasm this tool generates and the central role it is starting to play in many workflows. The way it has distribution and forkability built in has consequences for the types of collaboration that it supports, and I keep wondering about the ethics of sharing and caring related to it.

Missing from these notes are stories that PierreM told us about using Diff and Patch in code development for Scribus. One of the Scribus maintainers would never look at any code. He spent all day reviewing Diff files. Anne’s experiences with preserving NetArt (keeping a log of all changes made) made us think through the value of differences that matter, and also whether it meant that things that do not change will automatically fade into the background. This connected to an experiment with decision-making processes in the context of Eleanor’s Consentsus project, where all but extreme opinions would be ignored. The man pages for Patch (just type ‘man patch’ into your terminal) are interesting in that respect. They clearly reflect the idea that conflict and difference are simply to be expected in Free Software development; they just need to be dealt with as part of daily work. Finally we started experimenting with merging files through Diff and Patch, and first our notes and then our minds drifted off.

++++++++++++++++++++++++++++++++++++++++++++++++

Eric Schrijver:

Dear Femke,

Writing on the Etherpad, I wonder how far we can go in editing one another. I would like to remove the last line of the final paragraph. I find it slightly sentimental, and the idea of ‘our notes and our minds drifting off’ takes away from the interesting oppositions of ideas.

Your updated description of Git as a ‘management tool for and by a specific person’ is correct in the sense that it was developed as a management tool for a specific project, the Linux kernel. Yet I still feel it is not complete, because your description still seems to imply that this project was not a collaboration. Yet the development of the Linux kernel is a prime example of collaboration in the world of free and open source software.

Of course you are right to state that Git as a tool is impregnated with the specific model of exchange its creator envisions. It is an individualistic vision: one starts working on a project by getting one’s own version of the code. You work on whatever you need, you make the changes on your end, and then you propose to the original maintainer that these changes be merged back in. One part of the social contract of Free and Open Source software becomes very apparent: if for whatever reason the original maintainer does not want your changes, it is easier than ever to start your own version, your own ‘fork’.

I imagine something is lost in this approach. As contributing has become more anonymous, there is the notion of ‘drive-by commits’. Before, I imagine the role of negotiation and seeking consensus was more important for FLOSS projects: before getting the right to commit to a centralised source repository, one would have needed to spend time on the project mailing list, arguing for one’s position, maybe even going to a meet-up. Even if it becomes easier to contribute to projects, I imagine it becomes harder to create a real social tissue around such projects.

For me, I am happy that it has become easier to contribute to projects. I think there is still the possibility to get socially involved if one wants. And there is another aspect to Git that I find highly interesting. The individualistic nature of distributed versioning seems to map very well to the way the various actors work together in the field of culture: a field that only exists because something is shared, a field that is rife with exchange, and yet a field that propagates itself in the form of individual expressions.

I wrote an article about this called ‘I like tight pants and no-one starts from scratch: type design and the logic of the fork’ (http://i.liketightpants.net/and/no-one-starts-from-scratch-type-design-and-the-logic-of-the-fork). Two seminal typefaces of post-war graphic design are described by their own creators as improvements upon existing fonts. The great thing about culture is that we can actually keep all the forks around and be happy for the diversity. I am not sure it works like this with software.

++++++++++++++++++++++++++++++++++++++++++++++++

Following the Multi-drawing/Git merging on Wednesday within the Gesturing Paths worksession, inquiring into Git (history, structure, use).

Git was made to deal with a really specific situation. 

Linus Torvalds doesn’t like to work with other people. He didn’t like existing versioning systems: before creating Git he used BitKeeper, a distributed revision control system which was proprietary. When access to BitKeeper stopped being free, he worked on his own program in 2005 and chose on purpose the derogatory term ‘git’, which means an unpleasant or contemptible person (also, the man page of Git called it ‘the stupid content tracker’). It’s not a collaborative tool: it was conceived to keep his own source tree clean, and to accept (or not) patches sent to him by email. Not many people have access to the main Linux server. Instead, you are supposed to work on your own copy of the repository, and when you want to contribute, let Git generate an email which can be sent to the maintainer (Linus). He then decides whether it is accepted or not.

Git comes from a non-collaborative mindset [see notes above], but a few years later, it is also being used in mainly collaborative situations, like this school and OSP. Yet Git’s peculiar model of collaboration as championed by Torvalds, a sort of collaboration for individualists, remains the most popular way of using it. Today the manual of Git is on the GitHub platform, among many other free and open source projects. GitHub is the most popular site for the hosting of source code. The collaboration model of GitHub is individual first: to contribute to a project, you fork that version of the project.

Git’s way of preventing (or avoiding) conflict happens through the model of branches. You don’t work on the same file: you work on your own version of history. If you try to push your version of history to someone else’s Git server, it won’t work if the histories have diverged. It is then up to you to make the two histories work together again, by merging them. Git has advanced algorithms that try to do that for you, including the Octopus merge. If Git cannot merge automatically, one has to intervene oneself and resolve the problem outside of Git. Wednesday’s experiment was interesting in that respect (see Gesturing Paths notes on Multi-drawing and Git).
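To make the idea of diverged histories a little more concrete, here is a minimal sketch in Python of a three-way merge. It is not Git's actual algorithm: it assumes that the common ancestor and both edited copies have the same number of lines (lines are changed in place, never inserted or deleted), and the function name and the sample sentences are invented for illustration. It only shows the basic rule: a line changed on one side merges automatically, a line changed differently on both sides is a conflict that a person has to settle.

```python
def three_way_merge(base, ours, theirs):
    """Merge two edited copies of `base`, line by line.

    Simplifying assumption (not how Git works internally): all three
    versions have the same number of lines.
    Returns (merged_lines, conflicts); conflicts holds the 0-based
    numbers of the lines that both sides changed differently.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for index, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree (same change, or no change)
            merged.append(o)
        elif o == b:      # only their side changed this line
            merged.append(t)
        elif t == b:      # only our side changed this line
            merged.append(o)
        else:             # both changed it differently: someone must decide
            # One-line marker, loosely imitating Git's conflict markers.
            merged.append(f"<<<<<<< {o} ======= {t} >>>>>>>")
            conflicts.append(index)
    return merged, conflicts


base = ["draw a circle", "draw a square", "draw a line"]
ours = ["draw a red circle", "draw a square", "draw a line"]
theirs = ["draw a circle", "draw a square", "draw a dotted line"]
clash = ["draw a blue circle", "draw a square", "draw a line"]

# Different lines were touched: the merge is automatic.
print(three_way_merge(base, ours, theirs))
# The same line was touched on both sides: line 0 is reported as a conflict.
print(three_way_merge(base, ours, clash)[1])
```

In the first call each side touched a different line, so nothing needs to be negotiated. In the second call both sides rewrote the first line, which is the situation where contact becomes conflict.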


What tools have the possibilities of contact (i.e. conflict)? 
Working through contact: how can you play with the tool? There are ways to resolve a conflict automatically: if a file has blank spaces for someone, the contact will mean that the software automerges. Some of the ways to merge that we experienced during the multi-drawing Metafont experiment: recursive merging, aborting… Colm suggested using “include” in Metafont to work collaboratively.

Subversion manual (Subversion is another revision control system): the software is not supposed to resolve conflicts. Subversion is centralized: the project is seen as important. Issues with collaboration come from wanting to be the guardian of a project.
Git is decentralized in theory. Potentially, every computer could become the server. In practice, the most prominent branches would pull off everything.

Git: commit anxiety, in Linus’s words.
Different options (3? 6?) for conflict in Subversion.

All those programs delegate the merging to other programs: diff + patch (diff records the differences between two files, patch applies those differences to a file). You can read the intermediate object.
What counts as a difference? 
XML file in the diff?
Potential interesting situation: commenting on a book using a git repository.
Everything that is binary cannot be meaningfully diffed: PDFs and raster images cannot be put in diff form.
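The division of labour between diff and patch, and the readable intermediate object that passes between them, can be sketched with Python's standard difflib module. This is only an illustration under our own assumptions: make_delta and apply_delta are hypothetical helper names, the delta format is invented for this sketch and is not the format of the real diff and patch programs, and the sample sentences are made up.

```python
import difflib


def make_delta(old, new):
    """The 'diff' step: record only what changed between two lists of lines."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # "Replace old[i1:i2] with these lines taken from the new version."
            delta.append((i1, i2, new[j1:j2]))
    return delta


def apply_delta(old, delta):
    """The 'patch' step: rebuild the new version from the old one plus the delta."""
    result = list(old)
    # Apply from the end so that earlier positions stay valid.
    for i1, i2, replacement in reversed(delta):
        result[i1:i2] = replacement
    return result


old = ["Git is a tool.", "It tracks files.", "The end."]
new = ["Git is a tool.", "It tracks versions of files.", "The end."]

delta = make_delta(old, new)
print(delta)  # the intermediate object: small, and readable on its own

# The same difference in the unified format familiar from `diff -u`.
print("\n".join(difflib.unified_diff(old, new, "old.txt", "new.txt", lineterm="")))

# Someone who only has the old file and the delta can recover the new file.
print(apply_delta(old, delta) == new)
```

The delta is much smaller than either file, which is why a maintainer can review changes without reading the whole code.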

Git as a preservation tool (keeping track of all the modifications of the code of a digital artwork, allowing for reversibility). It is true that the logic of commits fits quite well with the current state of art restoration (making documented, reversible interventions). I have personally tried this in creating a contemporary version of a 1990s icon collection.

Git used as a convenience.
The diff file is generated on the user’s machine, not on the server where the Git repository is. If a different diff argument is used, for instance, the patch would not apply.


Assumptions with diff: the line is the basic unit, the difference is between the latest file and the one just before (the latest supposedly being better).
Possible but very complex to implement different diff programs (by changing the arguments and the default settings).
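Both assumptions can be observed with Python's difflib, used here as a stand-in for the diff program (an assumption of this sketch; the sample sentences and the helper name unified are invented). Changing a single word makes the whole line count as removed and added again, and changing one argument, the number of context lines, produces a different diff text for the same pair of files.

```python
import difflib

before = ["The quick brown fox", "jumps over the lazy dog"]
after = ["The quick brown fox", "jumps over the lazy cat"]


def unified(a, b, context):
    """Unified diff of two line lists, with `context` unchanged lines around each change."""
    return list(difflib.unified_diff(a, b, "before", "after", n=context, lineterm=""))


with_context = unified(before, after, context=3)
without_context = unified(before, after, context=0)

# Only one word differs, yet the line is the unit: it is removed and re-added whole.
removed = [l for l in with_context if l.startswith("-") and not l.startswith("---")]
added = [l for l in with_context if l.startswith("+") and not l.startswith("+++")]
print(removed)
print(added)

# Same two files, different argument, different diff text.
print("\n".join(with_context))
print("\n".join(without_context))
print(with_context != without_context)
```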


You commit the diff.

The diff and patch programs were not made at the same moment (diff in 1974, patch in 1985). Interesting also to see the differences in the style of the two man pages.

diff = delta http://en.wikipedia.org/wiki/Delta_encoding

Beginning an experiment (not finished): all participants modify a common text (part of the man patch file) and we try to push it, but it’s not possible for all of us.

What we say

The latest 10 commits, excluding the merges. There is a delay of an hour…