
rjbs (4671)

rjbs
  (email not shown publicly)
http://rjbs.manxome.org/
AOL IM: RicardoJBSignes
Yahoo! ID: RicardoSignes

I'm a Perl coder living in Bethlehem, PA and working in Philadelphia. I'm a philosopher and theologian by training, but I was shocked to learn upon my graduation that these skills don't have many associated careers. Now I write code.

Journal of rjbs (4671)

Tuesday October 14, 2008
09:18 PM

another unproductive complaint about subversion

[ #37668 ]

I remember in 2005 or so when I first started using Subversion, I liked it so much. It was much easier to use than CVS. Everyone said it would make tagging and branching easier than CVS. In CVS, tagging was fine, but branching was such a pain that I never bothered.

Eventually, I found out that branching and merging were much easier, but still a real pain. Tagging, though, was completely insane. Tags were implemented as copies (just like branches). This sort of made sense as a cheap way for branches to work, but it made no sense for tags. Tags are labels for points in time in a repository. They shouldn't be mutable, unless maybe to let you remove a label from rev 1 and put it on rev 2.

Because they're implemented as copies, you can actually go in and alter the state of a tag, meaning that tags are useless as ... tags. It also means that if you have a standard Subversion repository with trunk, branches, and tags directories, and you check out the whole thing, you check out absolutely every file in every revision. "Copies are cheap" was a big Subversion mantra back in the day, because in the repository, only files that changed were new files on disk -- but that only goes for what's in the repository, not your checkout. In your checkout, every copy of readme.txt is its own file -- and it has to be, because even the tags are mutable. You can't say that ./tags/1.000/readme.txt is the same file as ./tags/2.000/readme.txt just because there was no change between the two releases, because you could go change either of them, and if you do, you'd change both. Oops!
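To make the "you'd change both" problem concrete, here is a minimal sketch that has nothing to do with Subversion itself. It assumes a hypothetical working copy in which the two tag directories share readme.txt through a hard link, then edits the file through one of the names. The directory layout and file contents are made up for illustration.

```shell
#!/bin/sh
# Hypothetical layout: two tag directories sharing one readme.txt on disk.
work=$(mktemp -d)
mkdir -p "$work/tags/1.000" "$work/tags/2.000"
printf 'release notes\n' > "$work/tags/1.000/readme.txt"

# Pretend the checkout saved space by hard-linking the unchanged file.
ln "$work/tags/1.000/readme.txt" "$work/tags/2.000/readme.txt"

# Edit the file in place through the 2.000 name only.
printf 'oops, edited a tag\n' >> "$work/tags/2.000/readme.txt"

# Both names point at the same data, so the 1.000 copy changed as well.
if cmp -s "$work/tags/1.000/readme.txt" "$work/tags/2.000/readme.txt"; then
    shared_edit=yes
else
    shared_edit=no
fi
echo "edit visible through both tags: $shared_edit"

rm -rf "$work"
```

Because tags are mutable, a working copy cannot safely share files this way, so each tag gets its own full copy on disk.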

This came up today because of a piece of automated deployment code that did something like this:

$ mkdir TEMP
$ cd TEMP
$ svn co $REPO/project
$ cd project/trunk
$ bump-perl-version
$ cd ..
$ svn cp trunk tags/$NEW_VERSION
$ svn ci -m "bump and tag $NEW_VERSION"

Checking out requires a whole lot of space, because it has to check out every single tag's copy of every file. Tagging the new release is also fairly space hungry. How hungry? Well... the project I'm working on right now is a web application. Let's call it New-Webapp.

If I export a copy of trunk from Subversion, getting just the files that make up the latest version of the application, it's 1.9 megabytes.

If I check out a copy of trunk, which adds all the extra working copy files, it's 5.2 megabytes.

If I check out the whole repo, getting every tag and branch (for your information, there is exactly one branch), it's 207 megabytes.

Now, keep in mind that this gives me every file from every tagged release (there are 40 releases). This does not give me the entire revision history. There are many, many revisions missing. After all, what I have is basically 42 revisions: 40 releases, trunk, and one branch. That's it.

If I use git-svn to build a git repository of the project, meaning that I have absolutely every revision, every tag, and every branch, it's 249 megabytes. That's all 1149 revisions.
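As a rough sanity check on those figures, the sketch below divides the sizes quoted above by the number of trees or revisions they cover. It uses only the numbers from this post (207 MB for 42 checked-out trees, 249 MB for 1149 revisions). The variable names are mine, and the averages are only a back-of-the-envelope comparison.

```shell
#!/bin/sh
# Figures quoted in the post.
svn_full_mb=207      # full checkout: 40 tags + trunk + 1 branch
svn_trees=42
git_repo_mb=249      # git-svn clone with complete history
git_revisions=1149

# Average cost of one checked-out tree in the Subversion working copy.
per_tree=$(awk -v s="$svn_full_mb" -v n="$svn_trees" 'BEGIN { printf "%.1f", s / n }')

# Average cost of one revision in the git repository.
per_rev=$(awk -v s="$git_repo_mb" -v n="$git_revisions" 'BEGIN { printf "%.2f", s / n }')

echo "svn working copy: $per_tree MB per tree"
echo "git repository:   $per_rev MB per revision"
```

This works out to about 4.9 MB per tree, close to the 5.2 MB trunk checkout, against about 0.22 MB per revision in git.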

I am so ready to be done with Subversion.

  • I agree, allowing tags to be mutable is silly, but has that actually been a problem for you in practice? I don't recall ever actually modifying the contents of stuff under a tag, and you can always revert it.

    As far as the checkout thing goes, that'd be annoying, but I've never done that either. When I want to make a tag I always operate directly on the repo URI:

    svn cp http://.../svn/Foo/trunk [...] http://.../svn/Foo/tags/2.0 [...]

    Yeah, distributed VCS is probably better in lots of ways, but Subversion works well enou

    • I updated the code to tag directly in the repo. The obnoxious thing is that now there are two operations: update the trunk with the new release information, then tag. It is not atomic. I have to specify what changeset to copy into the tag.

      --
      rjbs
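The two-step sequence described in the comment above might look roughly like the sketch below. It only prints the commands rather than running them, and the repository URL, version number, and revision number are placeholders, not values from the actual deployment code. The function name print_tag_steps is hypothetical.

```shell
#!/bin/sh
# Sketch only: prints the two svn commands instead of running them.
# The URL, version, and revision passed in below are placeholders.
print_tag_steps() {
    repo=$1
    version=$2
    rev=$3
    # Step 1: commit the version bump from a trunk-only working copy.
    echo "svn commit -m 'bump to $version'"
    # Step 2: server-side copy of exactly that revision into tags/.
    echo "svn copy -r $rev $repo/trunk $repo/tags/$version -m 'tag $version'"
}

steps=$(print_tag_steps "https://svn.example.org/project" "1.041" "1150")
echo "$steps"
```

Pinning the copy to the revision from step 1 with -r is what stops a commit that lands in between from slipping into the tag, but it is still two transactions rather than one.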
      • Uh, shouldn't you check in the "bump version" change to trunk _before_ you tag said version?

    • I use svnmerge.py, which is a lot like the merge tracking in svk.

      I don't really see the problem with tags. I don't edit them, and I don't ever check out from root, so I don't download any files I don't want.

      My only svn problem that I haven't found a simple solution for is the slowdown when your repo gets really huge. I expect git would handle this better, but selective update/commit commands make it workable for me, just not as fast as I want it.

      • Would you know whether that is "huge" in files, or in revisions, or both?

        I'm arguing against putting many different (unrelated) projects in the same svn repo, and this would be another reason to not do that.

    • I agree. The subversion tagging concept is perfectly acceptable because you can always revert them (using "svn revert" or "svn copy -r"), and changing the contents of the tags is unlikely. There are three similar issues with Perl:

      1. using my/our for constants [perl.org.il] - that's what I use most of the time, because it gives me most of the benefits of Readonly that Damian mentions in his book. Again, I might change them, but it's not very likely, so it doesn't matter
      2. Using a leading underscore for private methods ("
      • And a question to rjbs - I still don't understand why you want to check out the entire Subversion repository. This is usually an indication of poor design. If you want to switch from one branch/tag/trunk to another you can always use "svn switch".

        The code all gets checked out to get an atomic commit-and-tag. Without that, you have two transactions.

        Yes, that's work-around-able.

        It doesn't change the fact that Subversion is a pig or that Subversion's tags are, at best, tolerable. "A tag is a copy" is workable, sure, if you are not crazy and likely to change it. That doesn't change the fact that it's much more annoying than it needs to be. If a tag were just a name for a revision number, I could check out my "whole" project, which would be the heads of the branches and trunk, and I'd be able to list the tags for quick reference for diffing.

        Subversion makes all this annoying and slow.

        --
        rjbs
      • I might change them, but it’s not very likely

        But what’s the failure mode? More than likely, if you do happen to change a constant, you will probably have mysterious bugs that do not obviously point to the culprit. It can potentially take a huge amount of time to track down the issue. The rationale is the same as with using strict, only more polarised. Therefore, for constants, I have gotten into the habit of doing the following:

        our $ANSWER; BEGIN { *ANSWER = \42 }

        Trying to mutate $ANSWER or a

    • you can always revert it

      Unless you’ve accidentally committed the changes (which is hardly unlikely in a scenario where you’ve changed the tag’s files while thinking you were in some other directory), in which case Subversion likes to make your life as miserable as possible in trying to get that commit to go away.

  • Since svn doesn't REALLY have tags or branches (it just has copies) what really confuses me about tags is that svn DOES actually have a completely suitable methodology it should be able to use.

    Every combination of a URI path and a repository version number is unique.

    So combine the two and you could probably abuse the URI spec to do this.

    http://svn.ali.as/cpan/trunk/Config-Tiny#1234 [svn.ali.as]

    Let's call it a "pseudo tag" or ptag for short.

    Given we have nice immutable paths, why not then just make a simple text file in the r

  • Where subversion works well is in providing a centralized versioned filesystem. For many people, that is enough to allow it to be used as a source code repository given a couple conventions (i.e. "trunk, branches, tags" and "don't mess with the tags".)

    Having never been a CVS user, I find svn to be much more approachable and never felt like the tags weren't taggy enough.

    Now, for merging and distributed revision control, git certainly has lots of benefits. But I think it also loses the ease-of-use of svn fo

    • I totally agree that people explain git unproductively. I've written about that before [manxome.org]. It's definitely possible to say "just give me the trunk head," because gitweb lets you click to get a tgz of it. I don't know what the corresponding git command is, or if (maybe) it isn't exposed with one -- but it is clearly possible and easy. That would be the equivalent to svn export, though, not svn checkout.

      --
      rjbs
  • Don't forget to run git gc after git svn clone. That's reduced space significantly for me in a few cases, and sped up git to boot.
    • Good catch. That shaves about 35 MB off my git repo. Of course, the space consumed by a git repo was already pretty reasonable...

      --
      rjbs