A Hack for Approaching Haskell Libraries
One of the most difficult problems I find in learning new libraries and modules is where to start. If you start in the obvious places, there are inevitably zillions of abstractions that just don’t make sense yet. As a quick kludge, I write the following Haskell code; it examines a set of modules in a directory, and sorts them by the number of dependencies on other modules. The result is that if module A depends on module B, then B will be listed before A in the result (except for circular dependencies). If you read the source code in the order given here, there’s more chance of understanding it all.
So here’s the code to build such a list:
import System import System.Directory import IO import Data.List import Data.Char import Data.Maybe import Data.Ord {- Read a Haskell *.hs file and get a list of all the modules that it imports. -} findDeps base pkg = do let hi = base ++ (map dotToSlash pkg) ++ ".hs" ex <- doesFileExist hi if not ex then return [] else do src <- readFile hi let imps = filter isImport (lines src) return $ catMaybes $ map getTargetModule imps where dotToSlash '.' = '/' dotToSlash x = x isImport (' ':t) = isImport t isImport ('\\t':t) = isImport t isImport t = "import" `isPrefixOf` t getTargetModule s = let pre = takeWhile (/= '(') s in find (isUpper . head) (words pre) {- Find the transitive, reflexive closure of the relation defined by the findDeps function. This returns a list of ALL modules that this one uses or depends on, directly or indirectly. -} allDeps base mod = allDeps' [mod] [mod] where allDeps' (m:ms) full = do d <- findDeps base m let d' = filter (not . flip elem full) d t <- allDeps' (d' ++ ms) (d' ++ full) return (m : t) allDeps' [] _ = return [] {- Usage: OrderByComplexity = directory where source code is found. This MUST end in '/' = file that lists the modules you're interested in, one per line. This is often taken from a .cabal -} main = do [ base, pkgFile ] <- getArgs pkgStr <- readFile pkgFile let pkgs = lines pkgStr mods <- mapM (allDeps base) pkgs let deps = zip pkgs mods let deps' = sortBy (comparing fst) deps mapM_ print $ map fst $ sortBy (comparing $ length . snd) deps'
Hmm, were you aware of the UNIX ‘tsort’ utility? It would be able to automate most of this.
Just out of curiousity as a person who’s getting better at reading Haskell, but not writing it, why did you structure the code such that dotToSlash, getTargetModule, and isImport are inside a “where” clause, instead of as their own functions?
It confused me a bit when reading your code; I scanned toplevel for two of those, before looking inside the findDeps block.
Bill, I did so only because they are functions that have little intrinsic value except in that context. I felt that writing the code that way brought out the structure of what I was doing better than having six different top-level functions.
Thanks. Other than that one surprising bit, I thought it was very readable code.