April 23, 2010 / cdsmith

Enforcing Stability in Hackage – Some Thoughts

Some time ago, I wrote a post here about the relationship between Hackage and the Haskell Platform.  My key point was this: Hackage is not a mechanism for enforcing stability; it’s good that anyone, no matter what they write, can upload it to Hackage.  If we try to do quality control as a prerequisite to Hackage, then the only consequence will be fewer packages in Hackage.

That’s still true.  The recent discussion here made me think, though.  Is there a proper role for quality control in Hackage?  I think there might be.  I’ll start with some basic rules:

  • Nothing in such a system should make it harder to upload a package to Hackage.  This includes asking package developers to commit to future maintenance.  It wouldn’t be worth the loss of a convenient way to install packages… even packages that don’t work at all, and for which the author is merely sharing a partial proof of concept.
  • The system should avoid assigning personal responsibility unless necessary.  This means that if someone goes away, the system should still work as well as it can, modulo that the one person isn’t contributing.  If the community as a whole is waiting on someone specific to get around to something, the system is broken.
  • Such a system would have to be symbiotic with the Haskell Platform.  The Haskell Platform is working, and a great direction to follow for the future of Haskell.  Hackage shouldn’t be competing against the platform in any way.  This is to me the least justifiable of any ground rule… I’m generally not a believer in being so scared of duplication that we don’t try new things… but when it comes to quality assurance, I think a different dynamic is in play versus software development.  Quality assurance really is a bit of a chore, a lot less personal, and it’s fair to at least work at not wasting anyone’s time.

If something could be designed according to these ground rules, it could be a positive direction for Haskell to go.

I see several parts to the effort:

Part 1: Quality of Individual Packages

A lot of thought has already gone into this piece of the equation.  Everyone is looking forward to the day when we have Hackage 2.0, with new social networky features.  Here are a few of the ideas.

Community rating of software. This is the obvious thing to add, but it also may turn out to be the most difficult, by far, to get right!  Everyone wants to do a nice rating system to harness popular opinion and all that… but systems that work tend to be the result of years and years of tweaking by dedicated people who are willing to track down the answers to questions like “why is this recommending X over Y?”

We’re going to have to ask ourselves some difficult questions.  Is the quality of a Haskell package something that can just be measured on a linear scale?  Or does it depend on what packages you’re using it with?  Does it depend on your API preferences?  Or can we say that some APIs are just better than others?

Then we’re going to have to ask the hardest question: how do we get people to really rate the software on quality and stability, not level of excitement?  There are a lot of exciting Haskell projects out there, which are simply bad ideas to depend on in a stable piece of software!  In the Haskell community as it exists today, I suspect many of them would end up rated quite highly.  Unfortunately, social ranking systems are very frequently about level of interest rather than level of usefulness.  The front page of Reddit is not a source of useful, practical information.

So all in all, there are going to be a few long, hard years of asking ourselves hard questions, both about the best mathematical model to compile all of this data, and also about the social engineering needed to increase the signal to noise ratio of the rating system.

I’ll offer this one suggestion now: let’s allow users to rank projects on two separate axes: (a) whether the quality is good and the interface is stable and usable, and (b) how “awesome” the project is.  Ask the question about awesomeness first, so we get that out of the way, since that’s the place people will be more eager to give their input.  Then ask about the quality.

User-visible statistics. Adding to this subjective judgement, it would be nice to see some hard data.  A number of data points would be useful in making the decision to use a package.

  • How many people have this package installed? Not the greatest question to ask, but an easy one!  This is subject to all the same concerns as the community rating idea, though.  Packages might be widely installed for a lot of reasons: among them, that the package is useful, and also that the package seemed like a really cool thing to check out.
  • How many packages depend on this one?  How widely used are they? Definitely a much better question.  However, it’s biased in its own way.  Some packages just aren’t meant to be used by other libraries, while others are entirely meant for that purpose.
  • How frequently are there build failures in this package?  In packages that depend on this one? In packages this one depends on? This could be a very interesting source of information.
  • How quickly is the public interface for this package changing? This is a different question from the subjective idea of package stability that Hackage already reports from the package author.  Of course, it requires a distance function on public APIs of packages.
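To make that last point a bit more concrete, here’s a rough sketch of what a distance function on public APIs might look like. It assumes, purely for illustration, that an API can be summarized as a map from exported name to its type signature written out as text; the names `Api` and `apiDistance` are made up for this example and aren’t part of Hackage or Cabal. The distance is just the fraction of names that were added, removed, or changed signature between two versions.

```haskell
import qualified Data.Map.Strict as Map

-- | Hypothetical model of a public API: exported name -> type signature text.
type Api = Map.Map String String

-- | Fraction of names (across both versions) that were added, removed,
-- or changed signature. 0 means identical; 1 means nothing carried over.
apiDistance :: Api -> Api -> Double
apiDistance old new
  | Map.null allNames = 0
  | otherwise = fromIntegral changed / fromIntegral (Map.size allNames)
  where
    -- Every name exported by either version.
    allNames  = Map.union old new
    -- Names present in both versions with exactly the same signature.
    unchanged = Map.size (Map.filter id (Map.intersectionWith (==) old new))
    changed   = Map.size allNames - unchanged
```

For example, if an old version exports `f` and `g`, and the new version keeps `f` as it was, changes the type of `g`, and adds `h`, then two of the three names differ and the distance comes out to 2/3. A real measure would probably want to weight removals more heavily than additions, since only the former break existing code.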

User reviews, comments, discussions. As a Sourceforge-esque feature to help Haskell package developers, this would be great!  It would be especially great if, instead of two thousand isolated discussion forums, this functioned in some way as a Hackage-wide discussion forum, and certain projects and/or users could tap in by defining filters on tags, keywords, etc. for the discussions interesting to them.  Of course, an obvious way to see discussion forums about packages would also have some consequences for quality assurance.

Together, these techniques could hopefully make a lot of progress on the problem of staring at the five packages for doing linear algebra in Haskell and wondering exactly how they differ from each other, and which one you should use.  That’s a nice problem to solve.  It can even be done without a huge risk of violating any of the ground rules mentioned above.  All of these ideas depend on contributions from people other than the package author, so we’re never waiting on anyone in particular, and they solve a problem that’s a bit removed from the Haskell Platform (though certainly quality decisions that go into the Haskell Platform could be influenced by the statistics and input provided here).

The fact is, though, in the whole quality assurance picture, it’s the easy problem!

Part 2: The Integration/Stability Picture

The far more difficult problem is integration of packages and maintaining a stable collection of functioning packages.  Given that packages A, B, and C work independently, how can we be sure that they work together nicely?  That they don’t have incompatible version dependencies?  Or different and conflicting assumptions?  Or that their interfaces aren’t so wildly conflicting that it’s awkward to use them both at once?
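The “incompatible version dependencies” case, at least, is mechanical enough to sketch. The following is a toy model, not how Cabal actually represents or solves constraints: versions are lists of integers, each constraint is a half-open range, and the types `Range` and `Package` and the function `conflicts` are hypothetical names invented for this example. It simply intersects every range placed on the same dependency and reports the dependencies for which nothing is left.

```haskell
import qualified Data.Map.Strict as Map

-- | Simplified version number, e.g. [1,2,0]. Lists compare lexicographically.
type Version = [Int]

-- | Hypothetical half-open range: lower bound inclusive, upper bound exclusive.
data Range = Range { lower :: Version, upper :: Version }
  deriving (Eq, Show)

-- | A package name paired with the ranges it demands of its dependencies.
type Package = (String, [(String, Range)])

-- | Intersect two ranges; Nothing when no version can satisfy both.
intersectRange :: Range -> Range -> Maybe Range
intersectRange (Range l1 u1) (Range l2 u2)
  | lo < hi   = Just (Range lo hi)
  | otherwise = Nothing
  where
    lo = max l1 l2
    hi = min u1 u2

-- | Names of dependencies that the given packages constrain incompatibly.
conflicts :: [Package] -> [String]
conflicts pkgs = Map.keys (Map.filter (== Nothing) combined)
  where
    -- For each dependency, fold together every range placed on it.
    combined = Map.fromListWith merge
      [ (dep, Just r) | (_, deps) <- pkgs, (dep, r) <- deps ]
    -- Once a dependency has no satisfying version, it stays that way.
    merge a b = do { x <- a; y <- b; intersectRange x y }
```

This only catches the simplest kind of disagreement, and it says nothing about conflicting assumptions or awkward interfaces, which are the harder parts of the question.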

It is certainly tempting to back off here, and just say that the Haskell Platform will fix this.  (No offense to Don… I agree with most of what he’s saying in that link)  I think that may be a mistake, however.  The Haskell Platform encompasses a few dozen packages.  Granted, that may increase over time, but: (a) aren’t there certain kinds of packages that won’t and shouldn’t ever become a part of the Haskell Platform?  Do they not need integration testing?  Really?  Also, (b) what about the problem of whether the Haskell Platform’s processes scale to a large number of packages?

I’d argue that it might help to think, in general, about the problem of collections of Haskell packages that are independently developed, but for which there’s a community of people that want to use them together.  Then the Haskell Platform becomes the limiting case where that community of people is the entire Haskell community.  There may, though, be smaller such communities.

Okay, you’re being quite fair to ask for an example here.  The Happstack community looks like a strong candidate.  Happstack itself is written as a collection of loosely coupled packages, such as happstack-server, and happstack-state.  There’s talk of adding more on top of this: modules to handle authentication, and sessions.  Happstack also includes dependencies on other modules… some of which often break with respect to each other (generally a matter of version dependency situations that Cabal can’t or won’t resolve).  Even more to the point, look at web development in many other languages, and you’ll notice that there are even larger collections of various frameworks and libraries available, and that very few languages handle the testing of all of those web development tools and libraries as a part of the standard library for the language.

I think we should be posing and discussing these questions.

  • How much of what the Haskell Platform does is specific to it being the Haskell Platform?  How much of it could apply to any collection of packages that are developed independently, but have a community of people that want to use them together?
  • What features for Hackage might be requested by the Haskell Platform?  Can they be generalized to various different communities or package collections?

A proposal might look something like this: in addition to packages, Hackage could track aggregates, which just have a name, description, version, and a collection of other tools and libraries (or other aggregates?) of specific version numbers.  Perhaps, as in the case of the Haskell Platform, library authors would need to propose their packages to various aggregates…. perhaps one could let the aggregate owner decide that kind of policy.  Many of the part 1 features might be applied here as well.
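As a rough sketch of the data involved, an aggregate could be modeled along these lines. Everything here is hypothetical (the type and field names, and the choice to allow nested aggregates); nothing like it exists in Hackage as far as I know. The one check included looks for a package that ends up pinned at two different versions once nested aggregates are flattened.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- | Simplified version number, e.g. [0,5,1].
type Version = [Int]

-- | A member of an aggregate: a pinned package or a nested aggregate.
data Member
  = Pinned String Version
  | Nested Aggregate
  deriving Show

-- | Hypothetical aggregate record: name, description, version, members.
data Aggregate = Aggregate
  { aggName        :: String
  , aggDescription :: String
  , aggVersion     :: Version
  , aggMembers     :: [Member]
  } deriving Show

-- | Flatten an aggregate (and any nested ones) into pinned packages.
pinnedPackages :: Aggregate -> [(String, Version)]
pinnedPackages agg = concatMap expand (aggMembers agg)
  where
    expand (Pinned name ver) = [(name, ver)]
    expand (Nested inner)    = pinnedPackages inner

-- | Package names that are pinned at more than one distinct version.
inconsistentPins :: Aggregate -> [String]
inconsistentPins agg = Map.keys (Map.filter ((> 1) . Set.size) byName)
  where
    byName = Map.fromListWith Set.union
      [ (name, Set.singleton ver) | (name, ver) <- pinnedPackages agg ]
```

Whether nesting is worth the complexity, and who gets to decide membership, are exactly the policy questions an aggregate owner would have to answer.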

Or that might be complete bunk, and useless.  The point is, perhaps we need to be thinking about how specific subcommunities could do something like the Haskell Platform at the subcommunity level.  And that seems like a place where a unified set of tools in Hackage could be a very good thing.

Conclusion

In essence, I think we’re moving in the right direction here.  The answers people are giving are good ones: that Hackage is going to develop some more social moderation features, but that the Haskell Platform is ultimately the answer to stability.  But maybe we ought to be asking, not if Hackage should enforce stability or integration, but rather if it should provide more of the tools for more groups to do so.

17 Comments

  1. moonmaster9000 / Apr 24 2010 7:01 am

    hi, i’m a long-time rubyist / budding haskellist. to me, hackage needs a lot of polish, but i don’t necessarily recommend giving the hackage site all the bells and whistles you’re mentioning.

    i recommend checking out the relationship between rubygems and github. rubygems is a beautiful, simple site for finding and publishing gems (a “gem” in ruby is a library of code). gems have versions. 99% of the gems on rubygems have a corresponding repository on github, with links from the gem page on rubygems to github.

    github is where the code lives, and where the community discussion takes place.

    it’s a simple, powerful setup, delivered with style; the combination of these two sites has helped enable an unbelievable explosion of activity in the ruby community.

    • John Bender / Apr 24 2010 3:35 pm

      @moonmaster9000

      Mayhaps you mean gemcutter.org?

      Either way I share a similar sentiment. Hackage could be a package repository – only – with the code/docs/README.md living in something like github ( I’d personally vote for github ).

      @cdsmith

      does cabal-install support builds from a tarball? You can download tarballs from github based on tags (REST), which normally correspond to releases in the ruby/gems world. I’ve been tooling around in the cabal-install source to see what support there is but haven’t seen anything.

      • cdsmith / Apr 24 2010 4:26 pm

        If you have a cabalified tarball, and want to build and install the library, you just extract it, and do a ‘cabal configure’, ‘cabal build’, and ‘cabal install’ from inside the package directory. (I’m using the cabal-install binary, because I like the consistency of the commands, but this isn’t really using cabal-install at all. It’s just using plain old cabal.)

        Why I would want to, though, is a different question. If the package lives in Hackage, then there’s no need for me to determine that this particular author is using github, find github’s mechanism to download the tarball, and then build it by hand. But that’s nowhere near the worst of it… I’d also have to find and download all of the dependencies by hand, too. Hackage and cabal-install solve that problem nicely, so no reason to go back to the bad old days.

      • John Bender / Apr 24 2010 4:38 pm

        @cdsmith

        Sorry, I thought (or meant) cabal-install could resolve the dependencies using the package itself (either the tarball in the scenario I described or a package from hackage) in which case pulling a tarball from github wouldn’t necessarily be a dependency issue, just another place for the package to reside that has nicer features.

        PS I’m an avid reader of your blog, so thank you for publishing!

      • cdsmith / Apr 24 2010 4:47 pm

        Ah, gotcha. Yes, the dependency information is there. I’m not sure if you can use cabal-install to resolve and install all of the dependencies from hackage, when the package you’re doing it for is *not* on Hackage. Now that you mention it, that would be nice.

      • moonmaster9000 / Apr 24 2010 4:59 pm

        hi john. gemcutter.org became rubygems.org a few months ago.

    • cdsmith / Apr 24 2010 4:34 pm

      Hackage doesn’t currently require anything about my other development tools except that I use cabal for the build. We’re obviously not going to force a mass migration of the Haskell library community from darcs to git, and then enforce the use of git in order to participate in Hackage. That’s a non-starter, for hopefully obvious reasons (ground rule #1 above). So if we want quality control mechanisms that work across all Haskell packages, then we should build them into Hackage.

      • John Bender / Apr 24 2010 4:41 pm

        Didn’t want to imply that haskellers should move to github, just thought it would be a nice option to support.

        “I’d personally vote for github” should have been “man it would be nice to have something like github or be able to use github itself”

      • moonmaster9000 / Apr 24 2010 5:03 pm

        there’s nothing about rubygems.org that forces your source code to be hosted on github (or anywhere, for that matter). rubygems.org stores all the code that goes into each and every gem you can find on it. hackage should do the same. in the ruby world, if you “gem install “, it will search a list of sources (rubygems.org being the default source), find the gem you want to install, resolve any gem dependencies (gem dependencies are stored in the gem’s “gemspec” file), and install everything onto your local system.

      • cdsmith / Apr 24 2010 5:49 pm

        Moonmaster, yes, this is what Hackage already does.

        The discussion here is about adding features to help people: (a) choose the best Haskell package for their needs, (b) set realistic expectations about the quality and stability of the code they are using, and (c) share their experiences with using certain packages.

        Since none of those tasks are related to version control, Hackage seems like the right place to add them.

      • cdsmith / Apr 24 2010 5:57 pm

        To reply to myself and be semantic… there are several components here: Cabal records dependencies and includes a constraint solver for figuring out how to fulfill a collection of version dependencies, as well as providing a standard way to control the build process (configure, make, install…). Hackage allows you to upload packages that are maintained with Cabal, and keeps a database of them where users can look at the Cabal description of a package, and download it. Finally, cabal-install is a tool that can take a package name, use Cabal to determine its dependencies, use Hackage to download all of those dependencies, and then use Cabal again to build and install them.

        So the collection of all three tools together does what you suggest. Hackage alone is a web site that lets you view Cabal files and download packages that have been uploaded, and that’s about all right now. (Well, and view documentation built with Haddock from the source code, and see the results of build-bots that try to build the packages occasionally.)

  2. Will Donnelly / Apr 24 2010 11:12 am

    Might it be worth considering some sort of “Others who used this package also used…” feature? It seems like such a thing could be really helpful for finding groups of packages that tend to integrate well.

    • cdsmith / Apr 24 2010 4:27 pm

      Will, that sounds like an excellent idea to me! Again, it would probably require a lot of tweaking before it’s very useful.

  3. Chris Eidhof / Apr 26 2010 12:26 am

    A while ago Tom Lokhorst wrote a script that allows you to install packages directly from github:

    http://hpaste.org/fastcgi/hpaste.fcgi/view?id=5082

  4. moonmaster9000 / Apr 27 2010 1:18 pm

    for me, the only things i would change about hackage would be #1 – the style (except for the logo, the site looks like it’s from 1997), and #2 – make sure packages have links to a homepage where people could read a README of sorts. i follow hackage on twitter, and see really awesome libraries get uploaded all the time – yet when i go to the package page on hackage, about 80% of the time, i can’t find a link to anything that would provide me with a README that explains the library, how it’s used, etc.


Trackbacks

  1. State of the Hackage « Adventures in Duality
