January 17, 2011 / cdsmith

The Butterfly Effect in Cabal

This is an elaboration on a point from my previous article, and an ensuing reddit comment.

Twitter has a unique language all its own, and it would be nice to be able to understand and do something cool with all those twitter tags… you know, @cdsmithus, #haskell, etc.  The first step is to parse them from a block of text.  So I hypothetically might decide to write a very simple Haskell package called “twittertags” to do that.  Since I’d like my work to be extensible by others, I’ll export my work as parser combinators using parsec.  Then other people can use my code for the oh-so-difficult task of recognizing special punctuation as tags on Twitter.

Now, I’m well aware that while the current version of Parsec is 3.1, quite a few people are still using 2.1 because of performance concerns (or just lack of time to update their code).  Fortunately, if I import the compatibility module names (Text.ParserCombinators.Parsec and friends, which parsec 3 still provides), I can be careful and make my code work with either version of parsec.
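To make that concrete, here is a minimal sketch of what such a module might look like.  It is purely illustrative: the Tag type and the tag and tags parsers are names I’m making up for this hypothetical package, and the only real API used is the old Text.ParserCombinators.Parsec interface, which parsec 2.1 defines and parsec 3.x keeps around for compatibility.

```haskell
-- The old module name: native to parsec 2.1, a compatibility layer in 3.x.
import Text.ParserCombinators.Parsec
import Data.Maybe (catMaybes)

-- | A recognized tag (hypothetical type for this sketch).
data Tag = Mention String | Hashtag String
  deriving (Eq, Show)

-- | Parse a single tag such as @cdsmithus or #haskell.
tag :: Parser Tag
tag = mention <|> hashtag
  where
    mention = fmap Mention (char '@' >> many1 tagChar)
    hashtag = fmap Hashtag (char '#' >> many1 tagChar)
    tagChar = alphaNum <|> char '_'

-- | Collect every tag in a block of text, skipping everything else.
-- 'try' lets a lone '@' or '#' fall through to 'anyChar' instead of
-- aborting the whole parse.
tags :: Parser [Tag]
tags = fmap catMaybes (many piece)
  where
    piece = fmap Just (try tag) <|> (anyChar >> return Nothing)
```

Loaded into GHCi, parse tags "" "Thanks @cdsmithus for the #haskell post" should come back as Right [Mention "cdsmithus",Hashtag "haskell"].  Because nothing here touches the Text.Parsec names, the same source ought to compile against either parsec version, which is exactly the property that causes trouble later.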

Of course, now that I’ve done the hard work of recognizing punctuation, people come along to use my library, building various Twitter and social media applications on top of this package of mine.  In particular, let’s consider two such packages called “twitclient” and “superblog”.  They both expose their own libraries, too, for extensibility.  They want to use my parser combinators in their own, so they also depend on Parsec, but neither of these folks is going to test against two different Parsec versions!  The author of “twitclient” decides to go with parsec 2.1, because of a measurable performance hit with the newer version.  But the author of “superblog” is all for the latest versions, and depends on 3.1.  My package works with both, so there’s no problem… right?

Wrong.

The dependencies now look like this…

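Figure: The Cabal Butterfly Effect (twitclient depends on twittertags and parsec 2.1; superblog depends on twittertags and parsec 3.1; twittertags accepts either parsec version)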

Cabal allows several different versions of the same package to be installed side by side, so there’s no problem with having both parsec versions installed.  But the question is how my own library should be built.  Should it be built against parsec 2.1?  Or against parsec 3.1?  The answer is both.  We need builds of twittertags against each of the two parsec versions, or else one of the two packages that depend on it will fail.  But Cabal currently can’t do this.

What it does is certainly… suboptimal.  When you build twitclient, it will recompile twittertags against parsec-2.1, which will break superblog.  If you then reinstall superblog to fix it, Cabal will recompile twittertags against parsec-3.1, and break twitclient… and so on, ad infinitum.  What’s worse, it will break these packages without even telling you that it is doing so!
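The back-and-forth is easy to see in a toy model.  The sketch below is not how Cabal is implemented; it just assumes a package database with a single slot per package, where each entry records the parsec version that package was compiled against.  PackageDB, install and broken are hypothetical names invented for the illustration.

```haskell
import qualified Data.Map as Map

type Package = String
type Version = String

-- | Toy package database: one registered build per package name, recording
-- the parsec version it was compiled against (a simplifying assumption).
type PackageDB = Map.Map Package Version

-- | Install a client of twittertags that insists on a given parsec version.
-- twittertags is rebuilt against that version, overwriting its old build.
install :: Package -> Version -> PackageDB -> PackageDB
install client parsecVersion =
  Map.insert client parsecVersion . Map.insert "twittertags" parsecVersion

-- | Clients compiled against a different parsec than the one twittertags
-- was most recently built against.  Nothing warned us about these.
broken :: PackageDB -> [Package]
broken db =
  [ p | (p, v) <- Map.toList db
      , p /= "twittertags"
      , Map.lookup "twittertags" db /= Just v ]

-- The sequence described above: superblog is already installed, then
-- twitclient is built, then superblog is reinstalled to repair it.
step1, step2, step3 :: PackageDB
step1 = install "superblog"  "3.1" Map.empty
step2 = install "twitclient" "2.1" step1
step3 = install "superblog"  "3.1" step2
```

In this model, broken step1 is empty, broken step2 is ["superblog"], and broken step3 is ["twitclient"].  Every install repairs one client by breaking the other, because the database only has room for one build of twittertags.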

That’s what I’m now dubbing the “Cabal Butterfly Effect”, and one of the issues I brought up in the earlier post.

11 Comments

  1. Robert Massaioli / Jan 17 2011 3:17 pm

    Is that really what happens? Because if so that is dreadful. I just assumed that cabal would look out for these things, and if it noticed this sort of conflict then it would not replace the one that has already been built and would just build a local version that would not be registered… I guess not, but this explains a lot of my ghc-pkg woes. I would consider this to be a bug, so is there one in the Cabal bug list about this dependency problem?

    • cdsmith / Jan 17 2011 4:02 pm

      Cabal definitely does this for me. I am using Cabal 1.8… so not the latest version, but 1.8 is the one most people are using. If this is fixed in 1.10, I’d love to hear about it.

  2. Dominik / Jan 18 2011 1:23 am

    Jipp, that brings it to the point. I had similar problems with gitit once ;-)
    Essentially cabal would need to dynamically generate the correct “chain”, thus storing the dependencies under which the stuff was built, leading to twittertags-parsec-2.1 and twittertags-parsec-3.1

  3. gasche / Jan 18 2011 2:22 am

    Correctly handling dependencies in a vast software repository is a complex problem on which there is still active research. It’s unreasonable to hope the Cabal authors will solve it independently in a completely satisfactory manner. Worse, the current trend of every community (each free software distribution, plus each programming language) having separate tools with arcane and ad-hoc dependency handling leads to a huge duplication of effort.

    For a scientific work on the package dependencies and attempts at solutions, see the Mancoosi project : http://www.mancoosi.org/ . They seek to develop generic solutions to problems such as the update problem, etc.

    That is not to say that specific efforts are a waste of time. In particular, there are always relatively orthogonal concerns that will improve usability of such tools, such as separating compile-time and build-time dependencies. I think Cabal people are working on that.

  4. Paolo Losi / Jan 18 2011 4:05 am

    Let me put forward a sketch of a solution:

    1) libraries and projects should depend not directly on a set of “foundation”
    libraries (e.g. parsec, transformers) but on a specific version
    of the Haskell Platform.

    2) other dependencies (non-foundation libraries)
    should be handled on a per-project basis rather than on a global level.
    This means that each project should have its own compiled version of the
    dependencies.
    Narrowing down the dependency problem to the subset of libraries required
    for a project makes the dependency resolution task much easier.
    Having to recompile from source the libraries for
    each project seems cumbersome and it probably is, but it’s much less
    cumbersome than fighting against current dependency problems.

  5. Simon / Jan 21 2011 1:49 am

    This is not just a Cabal limitation, in fact, it is actually a GHC limitation too. GHC adds the package name and version to each symbol name in the object code, so that a binary can include multiple versions of the same package. However, this means that you cannot link a binary to multiple instances of the same package with the same version, but different dependencies, because the symbol names will clash.

    Ultimately we’d like to implement some ABI compatibility in GHC to help with this problem, see the end of http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/RecompilationAvoidance#Interfacestability.

    • Eyal Lotem / Feb 6 2011 9:41 am

      Wouldn’t it be better to replace the version number with a hash (e.g. SHA1) of (package name, package version, dependency hashes), so that the entire graph of dependencies behind a package is represented by the hash?

      • Simon / Feb 7 2011 3:20 am

        Yes, although that will force complete recompilation whenever one of the dependencies changes, rather than only recompiling what is necessary.
