I'm working on designing a microformat that can be used to indicate the location of VCS (git, svn, etc) repositories related to a web page.
I'd appreciate some web standards-savvy eyes on my rel-vcs microformat rfc.
If it looks good, the next steps will be making things like gitweb, viewvc, ikiwiki, etc. support it. I've already written a preliminary webcheckout tool that will download a URL, parse the microformat, and run the appropriate VCS program(s).
(Followed, with any luck, by github, ohloh, etc. using the microformat in the pages they publish and perhaps in their data importers.)
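To make the webcheckout idea concrete, here is a rough Python sketch of the parse-and-dispatch step. It is not the actual webcheckout code. It assumes the markup is a link (or a) element carrying either rel="vcs-git" style values or rel="vcs" plus a type attribute naming the VCS, which are the two forms that come up in the discussion of this rfc; the real format may differ. The names VcsLinkParser, find_repos, checkout_commands and the CHECKOUT table are made up for illustration, and fetching the page and actually running the commands are left out.

```python
from html.parser import HTMLParser

# Hypothetical mapping from a VCS name to the command that checks it out.
CHECKOUT = {
    "git": ["git", "clone"],
    "svn": ["svn", "checkout"],
    "bzr": ["bzr", "branch"],
    "darcs": ["darcs", "get"],
    "hg": ["hg", "clone"],
}


class VcsLinkParser(HTMLParser):
    """Collect (vcs, url) pairs from <link> and <a> elements."""

    def __init__(self):
        super().__init__()
        self.repos = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("link", "a"):
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel is a space-separated list of link types.
        for rel in (attr.get("rel") or "").lower().split():
            if rel.startswith("vcs-") and len(rel) > len("vcs-"):
                # Assumed form: rel="vcs-git"
                self.repos.append((rel[len("vcs-"):], href))
            elif rel == "vcs" and attr.get("type"):
                # Assumed form: rel="vcs" type="git"
                self.repos.append((attr["type"].lower(), href))


def find_repos(html):
    """Return the (vcs, url) pairs advertised by a page, in document order."""
    parser = VcsLinkParser()
    parser.feed(html)
    parser.close()
    return parser.repos


def checkout_commands(html):
    """Build one argv list per advertised repository with a known VCS."""
    return [CHECKOUT[vcs] + [url] for vcs, url in find_repos(html) if vcs in CHECKOUT]
```

Each returned list could be handed to something like subprocess.run to do the actual checkout. Repositories whose VCS is not in the table are silently skipped here; a real tool would probably want to warn about them.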
Why? Well,
- A similar approach worked great for Debian source packages with the XS-VCS-* fields.
- Pasting git urls from download pages of software projects gets old.
- I'm tired of having to do serious digging to find where to clone the source of websites like Keith Packard's blog, or cariographics.org, or St Hugh of Lincoln Primary School. Sites that I know live in a git repo, somewhere.
- With the downturn, hosting sites are going down left and right, and users who trusted their data to these sites are losing it. Examples include AOL Hometown and Ficlets, Google Lively, Journalspace, podango, etc. Even livejournal's future is looking shaky. Various people are trying to archive some of this data before it vanishes for good. I'm more interested in establishing best practices that make it easy and attractive to let all the data on your website be cloned/forked/preserved. Things that people bitten by these closures just might demand in the future. This will be one small step in that direction.
I agree about the links for CVSweb and the like. I wrote about the two types of CVS (pserver and anoncvs, or more likely, cvs-over-ssh) in http://kitenet.net/~joey/rfc/rel-vcs/discussion/ and sincerely hope nobody uses cvs-over-rsh any more (but then, pserver needs to die too…).
For svn, sure, pointing to a tag or some sort of repository root is unhelpful, but I don't think it's necessarily fair to say "always point to trunk" either - if pointing to a particular branch would be more appropriate for a page, that would be good too.
For git you'd just point to the repository anyway - there isn't a standard way to address a branch. If git was a real URI scheme then perhaps it'd have such a mechanism...
Putting things like git and svn in the type attribute seems like an abuse of the link element - type is defined to be a MIME type, and neither git nor svn is a MIME type. (The type="text/x-wiki" hack for "edit this page in a browser" is similarly abusive - it looks like a MIME type, but clearly isn't. The MIME type of a web form is still text/html, even if you're using it to edit a wiki.)
I can see the problem that it tries to solve, which is that given a http URI, you have no idea whether it points to a git, svn or bzr repository. Presumably you'd prefer webcheckout to avoid downloading $uri/info/refs to check for git, doing a PROPGET to check for svn, downloading _darcs/inventory to check for darcs, and doing whatever the check for bzr is...
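To illustrate what that guessing would cost, here is a small Python sketch that only lists the requests a tool might have to try for a bare http URI, without sending any of them. The function name probe_plan is made up for illustration. The git and darcs paths are the ones named above; for svn the sketch assumes the check would be a WebDAV PROPFIND request on the URI itself, which is my reading of the "PROPGET" mentioned above, and bzr is left out because no check for it is given.

```python
def probe_plan(uri):
    """List the (vcs, http_method, url) probes to try for an untyped http URI.

    Nothing is sent over the network; this only enumerates the guesses.
    """
    base = uri if uri.endswith("/") else uri + "/"
    return [
        # A git repository served over plain http exposes info/refs.
        ("git", "GET", base + "info/refs"),
        # Assumption: svn over http is detected with a WebDAV PROPFIND.
        ("svn", "PROPFIND", uri),
        # A darcs repository has a _darcs/inventory file.
        ("darcs", "GET", base + "_darcs/inventory"),
    ]
```

That is up to three round trips per link just to learn which program to run, which is the work an explicit VCS name in the markup avoids.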
I agree, it's mild abuse to use "git" as a MIME type.
The only alternative seems to be something like rel="vcs-git". Perhaps taking the whole vcs-* namespace like this is better?
I've adapted the Semantic Radar extension to identify rel=vcs-* tags. See my blog post for details.