Saturday, July 7, 2012

Adventures in Collaborative Coding With Common Lisp

Update: I've posted a post mortem of the team's attempt this year.

In anticipation of the upcoming ICFP contest (by the way, still looking for team mates, we could use a handful more before I will feel we will saturate our workload), I started looking into collaborative tools for coding. I am aware of a large set of tools that might be useful. This post will describe some these and how we might use them. I am looking at using some subset of Emacs (of course), Slime, Rudel, Mumble, Google+ hangouts, VNC, X11 forwarding (perhaps using XPra), Dropbox, perhaps Git, and naturally ssh to tie it all together.

Collaborative Editing Topology

The basic topology will be like this. I don't know if this helps anybody, but it looks pretty.

Communication between collaborators happens via Mumble and Google+. Google+ has the nice feature that whatever happens in the Hangout will be mirrored to a live Youtube stream and will be saved for future viewing. Files can be exchanged using Dropbox. Rudel allows us to quickly work together and see what others are doing.

The production server is communicated with via Slime/Swank, X forwarding and/or VNC, all via an ssh tunnel. We need X forwarding or VNC in order to make any sort of graphical stuff painless (well, less painful). After experimentation, this is still quite painful. Still looking for a good solution here.

At the end of this post is a pair of videos of a collaborative session I had with one other person on Thursday. The pair of videos are all together quite long and the quality of the video is quite low (much lower than during the actual Hangout), so low that you cannot actually read the text. I'm still trying to figure out how to save the session data well. This was my second attempt to get some kind of example for this post and I felt I couldn't sit on it any longer with the ICFP Contest quickly approaching.

The production server

The first step is to setup a server that can host your Lisp image. This server can really be anything, but you should keep in mind that giving users swank access to a server, is basically giving them shell access at that the Lisp image's privilege level. This means that unless you really trust your collaborators, you should be wary of using a server you care about.

I chose to host off of Amazon EC2 as you only pay per hour of use, so I can start up a fresh system, set it up, and ditch it hours or days later without paying for a month as you might in other places. In a subsequent post (to be posted soon, I hope), I will detail how to set up an EC2 instance for this purpose.

This server will host a Rudel session, a lisp image with a swank connection, optionally a VNC server and Mumble server, and be connected via Dropbox.

Slime/Swank

Most Common Lisp people are probably intimately familiar with Slime and Swank. We are going to be using Slime and Swank to set up a communal Lisp image. Multiple users are going to connect to it and awesomeness will ensue.

We can also have local Lisp images for quicker and/or dirtier work (we don't want to eval broken code on the communal server if we can help it) and anything that needs graphics to run. This is simple and Slime/Swank is ready to go using M-x slime-connect. Use the Slime selector to quickly switch between open connections.

Rudel

Rudel is a collaborative editing library for Emacs. It can use many backends, but we will be using the Obby backend as that was the only one that was easy to set up. Once Rudel is loaded, one person can host a session, which is then joined by any number of participants. We will be hosting the rudel session from the production server. Buffers within Emacs can then be published by one party and subscribed to by any number of other people. After this, that buffer on each computer will hold the same contents, updated in real time as the people code. Text edited by a particular user will be marked in his/her specific background color. There are some packages you will need:

apt-get install emacs23 gnubin-tls avahi-daemon avahi-utils

Setting up Rudel is easy so long as you get the correct version (the one from SourceForge). Once you download and extract it, just add this single line to your .emacs file.

(load-file "~/.emacs.d/rudel-0.2-4/rudel-loaddefs.el")

Note that Rudel uses "C-c c" as a prefix command, which is weird to me, so if you use "C-c c" for anything, either remove that binding, or bind that after you load Rudel, so you can effectively clobber their bindings.

With a couple exceptions (see below) Rudel is pretty painless to use, just join the session, subscribe to some buffers or publish your own, and start editing. It is a good idea to have your Rudel session hosted by an Emacs instance on the production server (so you don't have to kill your Emacs to reset any problems). This is also a good idea just so your computer isn't the single point of failure for the team. You can go to sleep and shut down your computer without effecting others.

One issue for lisp programming is that you can't share the REPL buffer (slime-repl-mode can't be simply turned on and off, nor can you insert text into it all willy-nilly like Rudel assumes it can). However, you can share a Slime scratch buffer, or any buffer that is in slime-mode, which is basically just as good. Google+ allows you to make more involved presentations at the REPL between collaborators if that is needed.

One annoying thing about Rudel is that it seems to be impossible to actually leave a session. When you attempt to leave the session via rudel-end-session, you are disconnected and unsubscribed to all of the buffers, but your login remains and the server keeps the connection open (I believe). This doesn't seem too bad until you try to join again and realize that you can't because your username (and possibly color) are currently in use. To get around this, I just append a number on the end of my username in order to make it fresh every time. Regarding colors, I just pick a garish one when logging in and then change it to something better once I have joined (once you have joined, you can have the same color as someone else). Most likely, most people will work with colors turned off anyway.

Another annoying but (logically consistent) feature of Rudel is that M-x undo will undo other peoples edits as well as you own. This is something which is sometimes desired, but often times not if there are two writing code concurrently. If kill-undo is burned into your muscle memory instead of kill-yank, then you might have some problems. I am trying to come up with a work around for that particular case. Other times can be handled by simply specifying the region and using undo within that region (see the undo help page).

Git, Rudel, and Dropbox

The summary of this section is that these tools don't work together, at all, at a fundamental level. Use Rudel. People can use Git on a person by person basis. Just give up on the idea of sharing a source directory via dropbox. It is a lost cause to try combining Dropbox, Rudel, or Git simultaneously. Be warned that Git will be crippled when using it this way, you can't do any of the good git stuff like branches, reverts, and merges, as it will mess up everybody else's Rudel buffers.. Always unsubscribe, do any fancy Git commands you like, then republish (possibly under a different name) or subscribe and replace the entire file (presumably with the approval of the people sharing the file).

While I have never participated in a short dead line contest and actually used a version control system, I am sufficiently sold on the idea of distributed version control that I would like the option of using it here. Using a version control system has kind-of been integrated into how I think development should be done in general. Development should be broken into smaller, separable tasks which should be made as commits with commit logs telling the future developers (a.k.a. you) why this was done. Developing without it would make me feel naked, or at least haphazard.

That said, there is a problem with using git and Dropbox at the same time. In fact, it is a very fundamental problem. Dropbox is attempting to make two or more directories seem to be the same, no matter what computer you are on. Git, on the other hand, explicitly works under the assumption that the directories are on two different computers and are absolutely independent.

For an example of this conflict, consider a group of people collaborating on a project using Dropbox. As one person edits files on his computer, these edits are quietly sent to the other computers (which causes you to have to constantly revert buffers in Emacs and leads to conflicts, but let's say we are okay with that). If you are using Git, any change to the repo will also be synced. This seems good at first, until you realize that the index is in the repo. This means that you can't develop like Git wants you to develop, incrementally building up the index, crafting your commit, then committing. Each developer would step on the others' toes as they add to the index. Instead you need a process where you build your index in your head, then put a freeze on development to commit, e.g. "hey, nobody do anything, I am going to commit something." This basically eliminates most of the positives that Git brings to the table.

I tried several schemes of moving the .git directory out of the Dropbox folder which fixes this whole index problem. When you do this, you get back all of that Git goodness, but you lose the idea of a synced repo, so why use Dropbox at all? In fact it is worse that that. You now have two conflicting ideas of what the merged repo will be. You cannot combine the two, only discard one and accept the other. So, I submit, that Dropbox and Git just do not mix for this purpose. It can't be done in a sane way.

Everything I just described regarding Git and Dropbox is also true of Git and Rudel. Rudel, however, comes with the extra limitation that you can't just change files on disk anymore. The buffers might be saved to your disk, but the real buffer is in the "cloud". So, changing the file on reverting to a file on disk will break Rudel. From what I have seen, your buffer will no longer be in sync with others. It is important to note that you could actually reconcile this limitation by replacing revert-buffer with something that edits the changed lines in a Rudel approved fashion. But right now, this is not supported.

I lean towards using Rudel as I feel it is more important. We will still use Dropbox for easy file transfer, and people can still try to use git so long as they don't attempt to use any commands that will change the buffers on disk. No one should try to concurrently develop in the same synced Dropbox folder, though. I might setup a backup script that will run every 5 minutes or something and take a snapshot (using rdiff-backup, for instance) of my files so that we can roll back to previous versions if everything hits the fan.

Another thing that can be done with Git is to have a single person in charge of version control of a given file. That person will watch what other people are doing and make commits as needed. That person also institute reverts, branches, and merges, but such actions really need to be done via a safer mechanism than a simple git-checkout or git-merge. The person in charge of version control should really unpublish the file and republish it once the change is made.

Google+ and Mumble

I really like Google+ hangouts, they are about as close sitting at the same desk as someone else as video chat has ever gotten. As nice as Google+ is, it is a bit annoying to have to leave that CPU/network hog running non-stop in order to communicate. This can be partially handled by Mumble, a VoIP push to talk program. It is light weight and can be left running non-stop without many issues. I'm not sure what will be better in practice.

VNC and X Forwarding

VNC has a lot of issues when it comes to connecting two peers, particularly two peers that might be using an ISP that won't allow incoming connections from the Internet. It will work well with a one computer acting as a server that is accessible from the Internet. However, ever when you have VNC working, you can usually count any OpenGL out of the equation. Attempting to use OpenGL on EC2 resulted in a crash of the Lisp system, if I remember correctly. Replacing the OpenGL drivers with a software renderer might help (in fact it seems necessary as the EC2 server has no video card for a hardware driver to make sense). But the main issue is the lag, which is pretty bad, and the general frame rate. However, you can certainly setup a GUI with buttons, combo boxes, static images, etc, and it will work fine. The only real issue is real-time graphics.

X forwarding is another option, it can be made pretty efficient with the help of XPra or NX, and with XPra, at least, the window can be detached and re-attached by someone else. But this is not really collaborative, though. To my knowledge, there is no way to use this technology (or technology like it, e.g. XPra or NX) in a collaborative way. If you do choose to use it and you are using EC2, I could only get it to work if I installed the software rendering drivers (otherwise the Lisp system crashed, this is actually not that uncommon if you are using CL-GLUT).

The Experience

This is a really neat experience. Rudel and sharing a Lisp image was a very new experience to me, and it felt like there was a lot of potential there. It will take me, and probably others, some time to actually wrap my head around all of the implications here. I often times found myself forgetting that I can edit the buffer while someone else is editing something else, or that I can evaluate that code and it will instantly become available to the other users. I can only imagine that this parallel development could scale nicely with more people. You do need to coordinate the development, but this is always true. Problems need to be broken into distinct, separable subtasks and the solutions to those subtasks need to have well defined interfaces in order to prevent breaking other peoples code, but this is, again, always part of any development with more than one person. There are also times where it is very clearly a win, for instance, when writing unit tests or debugging at the REPL.

Of course, if you are really interested in playing with this, hopefully this post and optionally my subsequent post on setting up EC2 will allow you and your friends to try this out yourself. Also, I'd once again like to put in a plug for my ICFP team, join us, it will be fun. Beyond that, at least for the time being, I will put out a standing offer that if you want to have a collaborative coding session with me, in Common Lisp, let me know and I will probably be happy to participate.

Here are the videos of the coding session. Again, I apologize for the quality and the slowness of the development (Oleg and I are still learning each other's style). The task we were setting out to accomplish was to design a program that could solve a maze.

Other things we could have used

There are tons of tools out there. I am aware that you can do a lot with communal Screen sessions if you are willing to limit yourself to the terminal. You can also just run Emacs over X11 forwarding (Emacs has the capability to spawn frames on different displays). This might work, but some have said that Emacs can freeze if one of the users drop their connection (I suppose without closing the window).

If anybody knows of any other awesome tools, or a better set up like this, please comment below, I'd love to hear about it.

2 comments :

  1. From the first glance, this toolset and integration with Emacs are looking very awesome.

    ReplyDelete
  2. Aw, this was an extremely good post. Taking the time and actual effort
    to create a good article… but what can I say… I hesitate
    a lot and never seem to get nearly anything done.

    Look into my web site ... slushy

    ReplyDelete