a Github survey

The great thing about git and other distributed version control systems is that once you clone (or fork) a repository, you have all the data. You don't have to trust that Github will preserve it; everyone who develops the project is a backup.

Github carries this principle quite far among the features they provide. But not all the way. Today I have surveyed their features, and where the data for each is stored.

  • source code -- in git, of course!
  • user and project pages and wiki -- in git
  • gists -- in git
  • issues -- in a database accessible by an API
  • notes on commits -- in a database accessible by an API
  • relationships between repos (who forked what, pull requests) -- in a database accessible by an API
  • your account details and activity -- in a database, accessible by you via an API
  • list of all projects and users -- in a closed database (AFAIK)

The two that really stand out are the issues and notes not being stored in git. This means that, if a project uses Github, it gets locked into Github to a degree. The records of bugs and features, all the planning, and communication, are locked away in a database where they cannot be cloned, where every developer is not a backup.

Github's intent here is not to control this data to lock you in (to the extent they want to lock you in, they do that by providing a proprietary UI that people rave about); it was probably only expedient to use some sort of database, rather than git, when implementing these features.

They should automatically produce git repository branches containing a project's issues, and notes, based on the contents of their database. (For notes, git notes is the obviously right storage location.) Along with ensuring every developer checkout is a backup, this would allow accessing that data while offline, which is one of the reasons we use distributed version control.
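A rough sketch of what the issue half of such an export could look like: turn each issue into a plain text file that can be committed to a branch. This is not how Github does or would do it. The function names, the `issues/` directory layout, and the assumption that each issue arrives as a decoded JSON object with `number`, `title`, `state`, `user.login` and `body` fields are all mine, for illustration only. Fetching the issues from the API and committing the result are left out.

```python
from pathlib import Path


def issue_to_file(issue: dict) -> tuple[str, str]:
    """Render one issue as (relative path, file contents).

    Assumes the dict has 'number', 'title', 'state', 'user' -> 'login',
    and optionally 'body'. That shape is an assumption of this sketch.
    """
    # Zero-pad the number so the files sort in issue order.
    path = f"issues/{issue['number']:05d}.txt"
    lines = [
        f"Title: {issue['title']}",
        f"State: {issue['state']}",
        f"Author: {issue['user']['login']}",
        "",
        (issue.get("body") or "").rstrip(),
        "",
    ]
    return path, "\n".join(lines)


def write_issues(issues: list[dict], worktree: Path) -> list[Path]:
    """Write every issue under worktree, ready to be added and committed."""
    written = []
    for issue in issues:
        rel, text = issue_to_file(issue)
        target = worktree / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")
        written.append(target)
    return written
```

One file per issue keeps diffs small when a single issue changes, and a clone of that branch can be read offline with nothing more than a pager.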

The lack of a global list of projects is problematic in a more global sense. It means that we can't make a backup of all the (public) repositories in Github (assuming that we had the bandwidth and storage to do it). I recently backed up all the repositories on Berlios.de, when it looked to be shutting down; this was only possible because they allowed enumerating them all.

People at The Internet Archive say that their archival coverage of free software is actually quite bad. We trust our version control systems to save our free software data, but while this works individually, it will result in data loss globally over time. I'd encourage Github to help the Internet Archive improve their collections by donating periodic snapshots of their public git repositories to the Archive. You're located in the same city, 5 miles apart; they have lots of hard drives (though fewer right now during the shortage than usual); this should be pretty easy to do.


Full disclosure: Github has bought me dinner and seemed like stand-up guys to me.

solar year

I've been at the cabin, on solar power, for a year now. I have a year of data!

Everything went pretty well until last month. There was an April rainy spell where power felt slightly tight. Then over the summer, plenty of power, no need to conserve. The last month though had what seemed like weeks of continual grey clouds, where I never saw the sun.

high noon today

Of course, even on a sunny day in winter, the sun does not get far above the hills, and the peak production window is only a few hours. This bad combination had my battery power dipping below the 10 volts that I consider low, down to 9, and even to 8 volts.

I use kerosine lamps in the winter. (I prefer the light anyway.) I've also started unplugging my Thecus server at night to conserve power, meaning no internet late or early. For four or so nights, I had no power to run even my laptop after sunset. On one notable day, there was no power even in the daytime.

Even when it turned sunny again, I found that the batteries would seem to charge to 12 volts during the day, but then precipitously drop to 10 and 9 volts at night. I think the problem was not damaged batteries, but that these Nicads charge most efficiently above 12 volts (14 volts is best), and there was never enough power saved up to get them full enough that they could charge really efficiently.

So, I reluctantly spent three days away this week, to let the batteries soak up sun and recover. It seems to have worked; they've been holding a 12 volt charge overnight again.
