Sententia cdsmithus

June 6, 2008

Another Screenshot of the Ray Tracer

Filed under: Haskell — cdsmith @ 4:30 pm

Some people are remarkably picky. They actually want to see reflections in a ray tracer. Can you believe that? Here you go then!

Ray Tracer Second Screen Shot

Ray Tracing in Haskell

Filed under: Haskell — cdsmith @ 10:03 am

I’m giving a talk at a user group meeting next week on how to program in Haskell. Rather than just listing off language constructs, I wanted to motivate the whole thing with an actual application. So that’s how I ended up spending yesterday afternoon writing a ray tracer in Haskell. I’m certainly not the first to do so, nor the best, but I am writing this blog post to explain the process I went through. I’ll try making this a literate Haskell post, so you can copy and paste it into a source file (extension .lhs) and run it. The actual code is split into three (very short) modules, but this should work.

Some imports to start us off:

> import Data.Maybe (mapMaybe)
> import Data.Function (on)
> import Data.List (minimumBy)
> import Data.Ix (range)
> import Control.Monad (forM_)
> import Graphics.UI.Gtk hiding (Color, Point, Object)
> import qualified Graphics.UI.Gtk as GTK

First, I built a couple basics of vector math. Out of a desire to demonstrate some list manipulation functions in the presentation, my Vector type is simply [Double].

> type Vector = [Double]
>
> (<+>)       = zipWith (+)
> (<->)       = zipWith (-)
> (<*>)       = zipWith (*)
> vlength u   = sqrt (sum (map (^2) u))
> normalize u = map (/ vlength u) u
> u .*. v     = sum (u <*> v)
> k #*  v     = map (*k) v

Since vectors don’t really even approximate a ring, it seems quite inappropriate to call them Num. Therefore, I’ve invented my own operators. The operators in angle brackets are component-wise, while the normal dot and scalar products are given their own operators, .*. and #*, respectively. I found this to be quite usable, though it would be equally reasonable to eschew operators, and define these as named functions to be used in infix notation.

I then define a few precedences, to make these easier to use.

> infixl 6 <+>
> infixl 6 <->
> infixl 7 <*>
> infixl 7 .*.
> infixl 7 #*

I’ll treat points as displacement vectors from an origin, and colors as vectors in a color space, but it helps to use separate names for them when writing type signatures:

> type Point = Vector
> type Color = Vector

Now for a few real data types. The idea is that an object has some general-purpose properties, plus a shape. The shape could be arbitrary, but is currently limited to only spheres and infinite planes. (Hey, this talk is only an hour and a half long!) So I’ve separated it out. The other properties I’ve got are material properties: how shiny (that is, reflective) is the surface, and what is its color. These are defined as functions from a point in space to the desired values, but in practice I’m just going to use const.

> data Object = Obj          { shape       :: Shape,
>                              objectColor :: Point -> Color,
>                              objectShine :: Point -> Double }
>
> data Shape  = Sphere       { center  :: Point, radius      :: Double }
>             | Plane        { point   :: Point, normal      :: Vector }

And lights are important to. Right now, I’ve got point lights and ambient lights. Adding directed lights (e.g. spot lights) wouldn’t be too difficult, but it is neither of critical importance nor necessary to demonstrate a Haskell language idea, so I’ve left it out.

> data Light  = PointLight   { lightPt :: Point, lightColor  :: Color  }
>             | AmbientLight { lightColor :: Color                     }

The two things we need to know about any shape are how to compute the distance to it along a given ray (specified by origin and a unit direction vector), and how to find the normal (another unit vector) at a given point on the surface of the object. These next two functions do this. They are defined for spheres and planes, but can of course be extended to new variants of Shape via additional pattern matching.

> shapeDistance :: Shape -> Point -> Vector -> Maybe Double
>
> shapeDistance (Sphere c r) orig dir =
>     let p    = orig <+> (c <-> orig) .*. dir #* dir
>         d_cp = vlength (p <-> c)
>         d    = vlength (p <-> orig) - sqrt (r^2 - d_cp^2)
>         ans | d_cp >= r                 = Nothing
>             | (p <-> orig) .*. dir <= 0 = Nothing
>             | otherwise                 = Just d
>     in  ans
>
> shapeDistance (Plane p n) orig dir
>     | dir .*. n >= 0  = Nothing
>     | otherwise       = Just (((p <-> orig) .*. n) / (dir .*. n))
>
> shapeNormal :: Shape -> Point -> Vector
> shapeNormal (Sphere c r) pt = normalize (pt <-> c)
> shapeNormal (Plane p n)  pt  = n

The math there can be derived without a whole lot of trouble, provided you’re relatively familiar with the basic ideas of vector operations. It surprises me, though, how many people only understand dot products in terms of their implementation (the sum of the component-wise products, or a b cos(theta)), and not in terms of what they mean. In particular, when one of the vectors in a dot product is a unit vector, the dot product is the component of the other vector along the direction of the unit vector. That’s a pretty powerful tool to have.

Next come the lights. For the lights, the general question is how much (in each color component) the given light illuminates a particular point with a particular normal. The object list is needed here as well, since lights can be occluded. This is all captured in a function applyLight, which is implemented for each type of light. It’s trivial for ambient lights, since by definition they ignore all the specifics of the situation; but it’s non-trivial for point lights. I also fence all light components to eliminate “negative” lights.

> fence a b x | x < a     = a
>             | x > b     = b
>             | otherwise = x
>
> applyLight :: [Object] -> Point -> Vector -> Light -> Vector
> applyLight objs pt n (PointLight lpt color) =
>     let dir   = normalize (lpt <-> pt)
>         dist  = vlength   (lpt <-> pt)
>         lval  = (n .*. dir) / dist^2 #* color
>         final = map (fence 0 9999) lval
>     in  case findNearest objs pt dir of
>             Just (_,opt)
>                 | vlength (opt <-> pt) < dist -> [0,0,0]
>                 | otherwise                   -> final
>             Nothing                           -> final
>
> applyLight objs pt n (AmbientLight color) = color

Now for the common stuff. A very common thing we want is to take our list of objects and find the nearest point of intersection for a given ray. This function does that, by calling objectDistance for each object in turn, and then choosing the minimum one. It gives back Nothing if there is no intersection, or Just (obj,point) if there is.

> findNearest :: [Object] -> Point -> Vector -> Maybe (Object, Point)
> findNearest objs orig dir =
>         let consider obj = case shapeDistance (shape obj) orig dir of
>                                 Nothing -> Nothing
>                                 Just d  -> Just (obj, d)
>             result = mapMaybe consider objs
>         in  case result of
>                 [] -> Nothing
>                 _  -> let (obj,d) = minimumBy (compare `on` snd) result
>                       in  Just (obj, orig <+> d #* dir)

And finally, the ray tracing itself. I’ve also thrown in defaultColor, which is the color assigned to rays that have either reflected back and forth beyond the recursion limit, or that proceed into infinity. For now, this is always black.

> defaultColor :: Vector -> Color
> defaultColor _ = [0, 0, 0]
>
> trace :: [Object] -> [Light] -> Point -> Vector -> Int -> Color
> trace objs lights orig dir 0     = defaultColor dir
> trace objs lights orig dir limit =
>     case findNearest objs orig dir of
>         Nothing        -> defaultColor dir
>         Just (obj, pt) ->
>             let n        = shapeNormal (shape obj) pt
>                 ndir     = dir <-> 2 * (n .*. dir) #* n
>                 refl     = trace objs lights pt ndir (limit - 1)
>                 color    = objectColor obj pt
>                 shine    = objectShine obj pt
>                 lvals    = map (applyLight objs pt n) lights
>                 lighting = (foldl (<+>) [0,0,0] lvals)
>                 in_light = shine #* refl <+> (1 - shine) #* lighting
>             in map (fence 0 1) (color <*> in_light)

Finally, there’s the GUI stuff. I won’t discuss this much, except to say that it’s horridly bad. I’m doing the actual ray tracing in an onExpose event handler. If I had even a modicum of common sense, I’d be rendering to an image or something, so that I wouldn’t have to redo all the ray tracing just because you switched over to a different window. Oh well. I’ll improve this before the talk on Tuesday.

> objs = [
>     Obj (Sphere [-50, -50, 100] 100)
>         (const [1.0, 0.3, 1.0])
>         (const 0.1),
>     Obj (Sphere [ 50,  50, 100] 100)
>         (const [0.2, 1.0, 0.2])
>         (const 0.1),
>     Obj (Plane [-100,100,0] (normalize [1,-0.2,-0.2]))
>         (const [1.0,1.0,1.0])
>         (const 0.0)
>     ]
>
> lights = [ PointLight   [175, -200, 10] [40000.0, 20000, 20000],
>            AmbientLight                [0.0, 0.05, 0.1]
>          ]
>
> drawScene d ev = do
>     dw    <- widgetGetDrawWindow d
>     (w,h) <- widgetGetSize d
>     gc    <- gcNew dw
>
>     let eye = [0, 0, -500]
>
>     forM_ (range ((0,0), (w-1, h-1))) $ \(x,y) -> do
>         let (x', y') = (fromIntegral x - fromIntegral w / 2,
>                         fromIntegral y - fromIntegral h / 2)
>         let [r,g,b]  = trace objs lights eye
>                              (normalize ([x',y',0] <-> eye)) 5
>         let fg       = GTK.Color (round (65535 * r))
>                                  (round (65535 * g))
>                                  (round (65535 * b))
>         gcSetValues gc $ newGCValues { foreground = fg }
>         drawPoint dw gc (x,y)
>     return True
>
> main = do
>     initGUI
>     w <- windowNew
>     d <- drawingAreaNew
>     windowSetTitle w "Ray Tracer"
>     containerAdd   w d
>     onExpose  d (drawScene d)
>     onDestroy w mainQuit
>     widgetShowAll w
>     mainGUI

And there you go: a ray tracer! Of course, there are many things that ought to be improved. They range from the trivial (render to an off-screen image, perhaps save to an image file, and read from a file instead of coding the scene as an immediate data structure) to the very complex (implement diffuse effects, more complex interpolated curves, etc.). But as a quick demo goes, this one didn’t turn out so bad.

Screenshot

March 7, 2008

Functions and Partial Orders

Filed under: Haskell, math — cdsmith @ 4:03 pm

In a few other blogs (especially here), there has been some discussion of the mathematical foundations of functions in programming languages, starting from topology. Just for fun, I will approach from the other direction, describing functions that can be defined in various languages in terms of partially ordered sets. Of course, for those of you who know this already, then you can go on and define topologies on these posets a la Dana Scott, so it amounts to the same thing. But this way has more pretty pictures.

Functions and Partial Information

Suppose we want to look at functions from the natural numbers to the natural numbers. (By the natural numbers, I mean non-negative integers.) There are, of course, infinitely many such functions. Let’s pick a few that are interesting:

  1. The constant function f(x) = 5.
  2. The identity function g(x) = x.
  3. The successor function h(x) = x + 1.
  4. The doubling function p(x) = 2x.
  5. The square function q(x) = x2.

That should give us a good start. These functions are all quite well-defined over the integers. Indeed, I can implement them all in my favorite programming language, and I’ll get exactly that (note: only because my favorite programming language does arbitrary precision integers, of course.)

The next question is this: what happens if I don’t know everything about the input to the function? What can I infer about the output given partial information about the input? Here’s some of the things I might ask of that sort:

x x = 2 x is even x is at least 5 x could be anything
f(x) f(x) = 5 f(x) = 5 f(x) = 5 f(x) = 5
g(x) g(x) = 2 g(x) is even g(x) is at least 5 g(x) could be anything
h(x) h(x) = 3 h(x) is odd h(x) is at least 6 h(x) is at least 1
p(x) p(x) = 4 p(x) is even p(x) is at least 10
p(x) is even
p(x) is even
q(x) q(x) = 4 q(x) is even
q(x) is a perfect square
q(x) is at least 25
q(x) is a perfect square
q(x) is a perfect square

Getting back this sort of partial information is trickier than implementing the function over totally known arguments. That’s what this blog post is about.

Partial Orders and Up-Sets

The trick is to build a partial order on the domain of the function. A partial order is some relation, ≤, that has three properties:

  • Reflexive: For all x, xx.
  • Anti-symmetric: If xy and yx, then x = y.
  • Transitive: If xy and yz, then xz.

Everyone is familiar with the ≤ that means “is no larger than.” But there are other perfectly good definitions as well. For example “is a factor of” is a perfectly good meaning for ≤. Now here’s where the pretty pictures come in. We can draw a diagram for a partial order, where there’s an upward path from x to y precisely when xy.

For example, here is part of the diagram for the familiar partial order “is no larger than” on the natural numbers:

Familiar Partial Order on the Naturals

And here is part of the diagram for “is a factor of”:

Divisibility Order on the Naturals

Each of these partial orders contains some information about the relationships between natural numbers. In particular, one can identify certain properties of the natural numbers as belonging to up-sets: sets that have the property that if they contain something, they also contain everything that it greater than it. So “at least 5″ is the up-set of naturals in the familiar order that contains 5 and everything above it. We can call this the up-set generated by 5. In the second (divisibility) order, the up-set generated by 5 can be described as “divisible by 5″.

To use these orders to get information about partial inputs to functions, we need two things:

  1. Up-sets that capture the partial information we have.
  2. Functions that preserve that order.

Let’s shoot for the first one. Looking at the kinds of partial information we had:

  • x is even. This is the up-set generated by 2 in the divisibility order.
  • x is at least 5. This is the up-set generated by 5 in the familiar order.
  • x could be anything. This is the up-set generated by the bottom element of any partial order we like, so long as it has a bottom element.
  • x is a perfect square. We’ll need a quite complicated order for this one. We can order the natural numbers by the divisibility order on the least common divisor of the exponents of their prime factorizations. It turns out this isn’t even a valid partial order (it violates anti-symmetry), but I can treat it as a partial order on equivalence classes, and things work out. Then perfect squares are the up-set generated by [2] (the equivalence class of numbers whose gcd of the exponents in their prime factorization is precisely 2) in this order.

Once we’ve chosen a partial order whose up-sets capture the information we want, we have to verify that the function actually preserves the order. That is, if we take an up-set generated by x, and pass everything in it through the function, will the result live in the up-set generated by f(x). The successor function h(x) = x + 1 that we defined earlier has this property in the familiar order. If I have a set of all numbers bigger than 4, and I add one to them all, I get all the numbers bigger than 5. However, h(x) does not preserve up-sets in the divisibility order. If I have all numbers divisible by 3, and I add one to them all, there’s no guarantee that the resulting numbers will be divisible by 4! So I have to be careful about using an order-preserving function, if I want any kind of partial information from up-sets.

What a mess! Sure, if I can define the right partial order, and then prove that the function preserves up-sets on the partial order, and sometimes pull this trick with equivalence classes, and then I get to say something about the behavior of my function on partial information. But that seems less than fulfilling somehow. It feels like I’m working too hard, and now I can’t even define some functions, depending on which order I choose.

Sorting Out the Mess

So what do we do with all of this?

The first thing to do is to pick an order and stick with it. We’ll lose some partial information, but we’ll get to keep other partial information. The logical choice for natural numbers is the familiar order. It’s fairly common for me to want to know that a number is at least 3. It’s somewhat less common for me to want to know that a number is a multiple of 3. But I won’t assume any choice, and in a future blog post, I’ll demonstrate how to build numeric data types in Haskell that parallel several possible choices, and exhibit different laziness characteristics as a result.

The next thing we want to do is recover the ability to define all functions again. The problem with defining some functions is that there’s something “greater than” the number we’re looking at that won’t stay that way once we’re done. The obvious solution is to just make sure there’s nothing “greater than” a number. We can invent a whole set of pseudo-numbers: ~0, ~1, ~2, and so on. Then we’ll put the numbers themselves above that. Here’s the new set of natural numbers with the familiar order:

The Lazy Naturals by the Familiar Order

And voila! I can define all my favorite functions with the actual numbers. What remains is to extend my function onto the pseudo-numbers as well. In this order, the pseudo-number ~n means “at least n“. There’s always one safe way to do this: just map everything to ~0, the minimal element. However, if I still want to get partial information about the function, I need to do better.

Here’s the result for the divisibility order:

Lazy Naturals with Divisibility Order

Here we have the same thing, except that “~n” now means “divisible by n“. Again, I need to extend my functions to the pseudo-numbers. I can still map everything to the bottom element (in this case, “~1″), but this throws away any partial information I may have been keeping.

Finally, one more option. This corresponds to a partial order on an equivalence class, sort of like what came out of the perfect squares example. This one, though, is the trivial equivalence class that lumps everything together. The result is the strict natural numbers, in which I must know the entire number to get any information out of any (non-constant) function.

Strict Natural Numbers

Here only a constant function could do anything except map ~0 back to ~0. Therefore, unless your function is constant, the only way to get any information back from your input is to know the precise value of the input. This is an important arrangement of the natural numbers because for efficiency reasons, in the particular case of integers, this is precisely what all real programming languages do. We’re just using natural numbers as an example, though; the ideas here apply equally well to more complex structures where laziness exists.

It turns out, now, that once we’ve chosen a partial order for the pseudo-numbers, there is sometimes a best possible way to extend any function over the naturals to include pseudo-numbers. To determine the value of any function f at ~n, we’d like to find the greatest lower bound of f(x) for all x~n. Now if the partial order we chose happens to be something called a complete meet-semilattice (all of the orders we’ve talked about are), then that greatest lower bound exists.

Connections to Programming Languages

As you may have guessed by now by my occasional references, this is all connected to laziness of functions in programming languages. I had intended to write about that now, but this has gotten too long already, so I am instead going to stop here and finish the other half at a later time. Hope you enjoyed it!

July 29, 2007

37 Reasons to Love Haskell (playing off the Ruby article)

Filed under: Haskell — cdsmith @ 2:33 pm

Here’s an article on reasons to like Ruby.  It’s actually very well written and persuasive.  So I took it as a challenge.  How well would Haskell fare when compared against this specific set of criteria, changing the points as little as possible?  The meanings will inevitably shift a bit, and readers will know that Haskell has its own set of advantages, of course.

The point of this exercise, though, is to see how Haskell does in the benefits mentioned for Ruby in that particular article.  It’s not a Haskell vs. Ruby; just a shameless theft of a set of criteria.  Should be fun.

  1. It’s object oriented.  Although it wasn’t designed for it, this paper points out that Haskell scores pretty well in support for object oriented programming features.  It’s not entirely clear that this is a good way to write code; but if it isn’t, it’s not because Haskell doesn’t provide the features or support you need to make it work.
  2. It’s pure. Haskell chooses to be pure in the functional sense, rather than the object-oriented sense.  The same ideas remain, though; there are no messy bits that don’t fit into the prevailing mode of thought.
  3. It’s a dynamic language. Yep, you heard right.  Projects like Yi and hs-plugins and lambdabot prove that it’s quite possible to write programs with runtime code loading and manipulation and other dynamic features in Haskell.  Indeed, the type system gives you a level of assurance that you haven’t made big mistakes along the way; something that’s quite likely when assembling code at runtime from many small pieces.
  4. It’s an interpreted language. Honestly, I’m having trouble understanding this point in the original; the Ruby article doesn’t ever say why this is a good thing.  It helps, of course, in that it lets you work with the program interactively, easily trying out bits at a time rather than having to write a new main method to do any testing.  Unlike Ruby, Haskell can also be compiled to handle any performance concerns; combining the advantages of both worlds.
  5. It understands regular expressions. The Text.Regex module in the Haskell standard library contains functions to do regular expression stuff.  It even uses Haskell’s operator mechanism to define operators like =~ to look more like other languages sometimes.
  6. It’s multi-platform. GHC, the major Haskell compiler, exists for Linux (many processors and distributions), Other UNIX-like systems (FreeBSD, NetBSD, OpenBSD, Solaris), many Windows variants (95 through Vista), and MacOS (Intel and PowerPC).  Nobody cares about MS-DOS :).
  7. It’s derivative. Haskell as a language borrows many of the best features from other languages, especially ML (its type system), and Miranda (its evaluation order).  It borrows libraries from lots of places.  There are Haskell bindings for wxWindows and GTK+, for example.  And yes, printf, too.
  8. It’s innovative.  If there’s any single widely used language today that has a claim to being truly innovative, it’s Haskell.  Haskell is practically a research playground, while still managing to be a practical language.  New language features have been the bread and butter of its progress: type classes, monadic I/O and computational environments; more recently: GADTs, associated types.  All of these are rare in other languages.
  9. It’s a Very High-Level Language (VHLL).  I actually doubt this term has any kind of meaning at all; but if it does, then Haskell has quite a good claim to fitting its meaning.  To the extent that the high/low level language distinction is meaningful, it’s about the ability of software to transform the program from something that’s useful to programmers into something that’s efficient and executable on machines.  More work is going on here in Haskell (e.g., see the Data Parallel Haskell project) than in any other language I’m aware of.
  10. It has a smart garbage collector.  I’m not sure this should even be worth a mention on the list.  Languages without automatic memory management are simply not candidates for writing serious application level code in the modern world.  For what it’s worth, though, yes Haskell does it.
  11. It’s a scripting language. Despite having virtually none of the properties that conventional wisdom associates with scripting languages, Haskell is quite usable for scripting.  Function composition and the monadic >>= operator provide the ability to combine pieces every bit as tersely as pipes; type inference eliminates the cost of static types.  This page talks about using Haskell for scripting, and this one describes the -e option to ghc, which lets you run Haskell code directly from the command line, and gives an example of using it.
  12. It’s versatile. As a general purpose programming language, you can do as much with Haskell as practically anything.  Scripting is simple, as described above.  Haskell also has a very advanced application server letting you write complex web applications; has been used to write a very nice window manager, is widely used in financial and other industries, and has been used to implement other languages — itself and the very first implementation of Perl 6.
  13. It’s thread-capable. And more!  There are basically two languages worth looking at for modern concurrent programming: Erlang and Haskell.  Haskell implements not just multithreading, but three different higher-level abstractions on top of multithreading: Software Transactional Memory, Data Parallel Haskell, and basic Parallel Haskell.  All three provide tools to make it easier to build correct threaded programs.  (Of course, Haskell offers less interesting traditional concurrency abstractions as well.)
  14. It’s open-source. All Haskell implementations have the source code fully available.  The major people involved hang out and regularly respond by both mailing lists (newsgroup interface available, too) and even IRC channels.  Haskell has one of the most famously open and friendly communities around, so you’ll fit right in!
  15. It’s intuitive.  Haskell doesn’t have a shallow learning curve, but that’s because you’re really learning things; not just learning a new syntax for the same programming you’ve done for years.  Haskell stays out of the way and lets your mind be expanded by the concepts you’re seeing instead of by the arbitrary choices of the language implementors.  That’s about as intuitive as you can really ask for.
  16. It handles exceptions well.  Unlike practically any other language, Haskell does the right thing for computations that have exceptional cases.  When working in a purely functional way, it gives you simple types like Either and Maybe that help to express those exceptional conditions in a functional manner.  When you’re working in an imperative way (e.g., in the IO monad, or any other MonadPlus environment) it provides exception handling, since that’s the right choice for the imperative style.
  17. It has an advanced Array type.  You don’t have to declare types at compile-time, or allocate memory in advance, or keep up with their length, or worry about indexing out of bounds (unless you explicitly choose to use unsafe operations).  Unlike many other languages, though, arrays aren’t syntactically preferred.  It’s just as easy to use a map, list, sequence, or many other things depending on what best suits your application.
  18. It’s extensible. You can write external libraries in Haskell or in C.  You can declare new instances (in other words add new behaviors to existing type signatures; gives you the core benefit of modifying existing classes) on the fly.  You can also add new operators (and I do mean new operators, not rehashing a very limited set of operators in error-prone ways like you do in C++) and use monads to define whole new computational environments if you like.
  19. It encourages literate programming. Even if you aren’t really doing literate programming, you can embed comments in your code which the Haddock tool will extract and manipulate.  You can look at type information even if it was never documented at all, simply by asking GHCi or Hugs for the type.  If you’re really into literate programming, though, major Haskell compilers understand literate source files that default to comments, and only include code delimited in specific ways (birdtracks, or latex begin/end commands).  This lets you, quite literally, compile the same document into either executable code or a details PDF user manual, without even having to do anything unusual to your latex source!
  20. It uses punctuation and capitalization creatively. Haskell enforces a punctuation scheme that makes the meaning of code clearer.  Types, classes, modules, and data constructors begin with upper-case letters.  Variables (including type variables) start with lower-case letters.  Fortunately, though, a lot of very important information is stored in the type — including, for example, whether a function result is Boolean or not, and even whether it destructively updates something!  This information is available at your fingertips either by reading documentation or just by typing :t at the GHCi command prompt, so you don’t have to repeat it over and over again in your code.
  21. (Some) reserved words aren’t. Haskell has very few reserved words compared to most other languages, because the language itself is conceptually simpler (not to be confused with easier to learn, as mentioned earlier).  A few keywords aren’t reserved, but I can’t pretend that’s a good thing.
  22. It allows iterators.  More generally, higher-order functions are useful in a large number of situations.  When combined with lazy evaluation, there are a plethora of powerful techniques for handling collections of data in Haskell.  Iterators loosely correspond to maps and folds, which are well supported and widely used.
  23. It has safety and security features. Haskell provides as advanced a set of fool-proof safety and security features as is found in basically any language.  Its type system allows you to express security constraints (even rather generalized ones) in embedded domain-specific constructs that can be enforced by the compiler, so you never even attempt to run unsafe code!  If a new kind of security issue arises, it’s generally possible to use the powerful type system to solve the problem without waiting for language support for something like tainted data.
  24. It (really) has no pointers. Unlike the trivial sense in which languages like Java are claimed to have no pointers by restricting the most dangerous operations on them, Haskell really has no pointers.  (It’s worth noting that the Java language specification wasn’t fooled into the “no pointers” myth, as a peek as the second sentence of 4.3.1 in the 3rd edition spec will make clear.)  In a pure functional language, there’s no difference between using pointers and actual values, so the compiler can make the decision based on performance concerns, rather than exposing the distinction between pointers and data to the programmer.  This is part of Haskell’s being a higher level language.
  25. It pays attention to detail. This is one of those Orwellian newspeak things where the Ruby article I’m working from claims one thing (attention to detail) and then describes the opposite (extreme sloppiness).  Haskell follows the real attention to detail picture here, though you can define your own synonyms (for values or types) if you like.
  26. It has a flexible syntax. Haskell has a truly flexible syntax in ways that Ruby can’t dream of.  It allows programmers to embed domain-specific languages that are significantly and fundamentally different from the imperative model of Ruby.  Parsers can be written by embedding context-free grammars right into the source code.  Logic processing can be added by embedding Prolog-style inference rules into the language.  An infinite supply of operators are available to facilitate these languages.  Higher-order functions and monadic environments are available to make them work well.
  27. It has a rich set of libraries. Available libraries is one place where Haskell is way above practically all languages with similar community sizes, and in the ballpark of a lot of mainstream languages.  There are libraries for practically anything, including several for GUI programming, web programming, transactional persistence, and plenty else besides.  I wouldn’t want to try to list them all, so here.
  28. It has a debugger. Okay, so this is the biggest stretch yet.  The development version of GHC (to become GHC 6.8) actually does have a debugger; but the debugging tools for released versions of Haskell are sketchy at best.  This is improving.
  29. It can be used interactively. This is true in the sense that GHCi exists.  It’s less true in the sense that someone can use it as their login shell.  Yes, it’s possible; no, it’s probably not a good idea.  This is interesting, though.
  30. It is concise. Haskell code is about comparable to many scripting languages.  Sometimes it’s a little longer.  Sometimes (generally when one can make good use of powerful abstraction techniques like monads and higher-order functions ina  program of large enough size that it makes a difference) it’s a lot shorter.  The xmonad window manager is about 500 lines of code.
  31. It is expression-oriented.  As a purely functional language, of course it’s expression-oriented.
  32. It is laced with syntactic sugar.  Haskell’s got all sorts of nice syntax in ways that really matter quite a lot: custom operators and fixity declarations, do blocks for monadic computation, etc.
  33. It has operator overloading.  In fact, operator overloading in Haskell is far nicer than in most other languages, because you can make up your own operators.  No more confusing bit shifting with I/O.  Within a given context, an operator means a specific thing; but at the same time, it can apply to your custom types (ah, the magic of type classes) and you can make up your own different operators to do different things concisely.  Reading a journal paper that uses dotted relational operators to mean something?  Great!  Use them in Haskell, too.
  34. It has infinite-precision integer arithmetic.  It also has infinite-precision rational arithmetic, and fixed-precision types in case you want those, too.  And you can build your own types somewhat concisely.
  35. It has an exponentiation operator.  Actually, it has three!  This is because there’s a difference in what the three of them mean in some cases, so you get your choice.
  36. It has powerful string handling.  It does, but more importantly it also has powerful list handling, and these list handling routines are all usable on strings.
  37. It has few exceptions to its rules.  Haskell’s semantics are basically the lambda calculus.  The whole language is remarkably consistent and behaves in consistent ways.  The semantics are even simpler and easier to understand than any imperative language, which has to do with the distinction between values and variables; or eager languages, which have to deal with the question of evaluation order.  This makes Haskell programs very easy to understand and manipulate safely.

So there you have it.

July 19, 2007

Parsing, CFGs, and Type Hacking

Filed under: Haskell — cdsmith @ 3:21 pm

This is what I have been playing with for the last day or so.

Haskell has a very nice monadic parser library for predictive parsing (parsec), and a decent lex/yacc-style parser and lexer generator suite (happy and alex).  That said, though, it’s more fun and educational to write code than to worry about what’s already been written, I set out to do something similar.  In particular, my goals are:

  • Write an LR parser, like that generated by parser generators.
  • Write readable grammars directly in Haskell source files, with no external parser generator utilities.
  • Let the type checker do its job by checking and inferring types for semantic rules.

I haven’t written the parsing code yet, but I did squeeze out something that I’m nearly happy with for the first two goals.

{-# LANGUAGE MultiParamTypeClasses     #-}
{-# LANGUAGE UndecidableInstances      #-}
{-# LANGUAGE NoMonomorphismRestriction #-}

All the language extensions I’ll be using.  This is the bare-bones list; the original list was eight or nine lines..  So, if you were wonder whether this is a good post for new Haskellers just learning the language, there’s your answer!

I need some way to represent variables (in the CFG sense).  In order to ensure that everything is well-typed, I need some way to keep track of the type of the semantic value associated with each variable.  Here’s what I did.

data Var a  = Var String

And a right-hand side of each rule will have a sequence of variables and terminals.  Again, to keep the type information around, I’ll need a sort of cons operator at the type level.  Here is that.  I defined a type, and also an operator that makes the type easier to use.

data RHS a b = RHS a b
(&) = RHS -- a convenient operator
infixr 5 &

And next, things get hard.  I’m using a multiparameter type class, in the fine tradition of Haskell type hacking, to represent a relation between types.  My relation is defined in the following comment:

{-
    (Action a b c) means the following:

    A production with a right hand side of type:    a
    may be associated with a semantic rule of type: b
    to produce a rule with semantic result type:    c
-}
class Action a b c | a b -> c, a c -> b

In other words, the Action class will be used to ensure that the result type of a grammar production, the right-hand side of the production, and the type of the associated semantic rule are all consistent with each other.  The functional dependencies simply assert that if you know the types of the right-hand side and the semantic rule, this is enough to determine what the result will be after applying the semantic rule; and that if you know the right-hand side and the result type, this is enough to determine what the type of the semantic rule needs to be.

There are a three base cases for this relation:

instance                   Action  (Var x)          (x -> y)  y

This says that if a production has the form A -> B, where A has semantic values of type y, and B has semantic values of type x, then the semantic rule must have type (x -> y).  If you think about it, this should make sense.

instance                   Action  Char             y         y

This rule says that if the right-hand side of a production is a signle character (a terminal, not a variable), the semantic rules should be a constant that matches the semantic type for the left-hand variable.

instance                   Action  ()               y         y

This describes the situation for empty productions (sometimes called epsilon or lambda productions).  Since leaving out any terms on the right-hand side of a rule isn’t an option, I use (), called “unit” to represent an empty right-hand side.

Those are the base cases.  (As a side comment, only the last one is strictly necessary; the first two are basically just syntactic convenience.  See below.)  Here’s how rules with more than one symbol on the right-hand side are handled.

instance (Action a b c) => Action  (RHS (Var x) a)  (x -> b)  c

The RHS operator defined earlier is used to build a list of sorts.  This rule says that adding a variable to the beginning of the right-hand side of a rule requires adding a parameter to the beginning of the semantic action, and that the result type stays the same.  This case handles right-hand sides that begin with a variable.

instance (Action a b c) => Action  (RHS Char    a)  b         c

Finally, this case handles right-hand sides that begin with a terminal (a character).  The types of the semantic rule and result don’t change, since a terminal is known in advance, so there’s no need for it to carry semantic information.

Some more syntactic convenience makes it easier to write grammars.  Here I abuse monads to take advantage of the special syntax.

{-
    The RuleSet monad is used to define rules of a grammar in a
    convenient 'do' notation.
-}

data Rule   = forall a b c. (Action a b c) => Rule (Var c, a, b)
data RuleSet x = RuleSet ([Rule], x)

instance Monad RuleSet where
    a >>= k  = let RuleSet (r1, v1) = a
                   RuleSet (r2, v2) = k v1
               in  RuleSet (r1 ++ r2, v2)
    return x = RuleSet ([], x)

So a rule consists of a left-hand side, a right-hand side, and a semantic rule.  They are constrained to match each other by the Action class defined above.  A RuleSet is basically just a writer monad for lists of rules, but I defined it by hand just for the fun of it.

Now time to define an operator for building rules inside the monad:

(==>) :: (Action a b c) => Var c -> a -> b -> RuleSet ()
(==>) lhs rhs sem = RuleSet ([Rule (lhs, rhs, sem)], ())

infixr 4 ==>

It took a while to pick this.  All the good arrow-like operators seems to be taken!  Nevertheless, it does the job we want fairly well.  Notice that even though I’m using an infix operator, there are three operands.  The normal usage looks like this:

lefthand ==> righthand $ semanticrule

You’ll see examples coming up.

The formal definition of a context-free grammar includes four things: a set of variables, a set of terminals, a set of productions, and a special start variable.  We’ve got three: variables are values of type Var a.  Terminals are values of type Char.  Productions are values of type Rule.  Next, I need a start symbol.  This is defined once, outside of the monadic environment in which rules are defined.  At the same time, I through away the result value of the monad, which is useless since I was just exploiting the syntax rather than building a real monad.

{-
    Grammar.  A grammar is a set of rules together with a
    start symbol.
-}

data Grammar a = Grammar (Var a) [Rule]
grammar s (RuleSet (rs, x)) = Grammar s rs

And that’s it!  Here’s a sample grammar so we can see it work.

{-
    Sample grammar.

    The parentheses in the let bindings are required: they force the rules
    to be monomorphic, which is needed for type checking to work properly.
-}
g = let ( expr     ) = Var "expression"
        ( term     ) = Var "term"
        ( termmore ) = Var "term operator"
        ( fact     ) = Var "factor"
        ( factmore ) = Var "factor operator"
        ( digit    ) = Var "digit"
        ( digits   ) = Var "digits"

    in grammar expr $ do
       expr     ==> term & termmore        $ \\x y   -> y x
       termmore ==> ()                     $ \\x     -> x
       termmore ==> '+' & term & termmore  $ \\x y z -> y (x + z)
       termmore ==> '-' & term & termmore  $ \\x y z -> y (x - z)
       term     ==> fact & factmore        $ \\x y   -> y x
       factmore ==> ()                     $ \\x     -> x
       factmore ==> '*' & fact & factmore  $ \\x y z -> y (x * z)
       factmore ==> '/' & fact & factmore  $ \\x y z -> y (x / z)
       fact     ==> '(' & expr & ')'       $ \\x     -> x
       fact     ==> digit & digits         $ \\x y   -> y x
       digits   ==> ()                     $ \\x     -> x
       digits   ==> digit & digits         $ \\x y z -> y (10*x + z)
       digit    ==> '0'                    $ 0
       digit    ==> '1'                    $ 1
       digit    ==> '2'                    $ 2
       digit    ==> '3'                    $ 3
       digit    ==> '4'                    $ 4
       digit    ==> '5'                    $ 5
       digit    ==> '6'                    $ 6
       digit    ==> '7'                    $ 7
       digit    ==> '8'                    $ 8
       digit    ==> '9'                    $ 9

(This is a modified grammar I had laying around from a set of compiler course notes.  It happens to have left recursion removed, but that’s immaterial really.)  There are no type declarations in the entire grammar.  Ask GHCi for the type of g, though, and it answers.

g :: (Fractional a) => Grammar a

It correctly inferred that the result type of expr must be Fractional.  How?  Because the third production for factmore uses the / operator.  This means that factmore must be Fractional, and the type ripples upward all the way to the start variable of expr.

The only thing I don’t like at the moment is the need to use an explicit monomorphic binding (the parentheses) to declare non-terminals.  If that’s not done, then the compiler thinks non-terminals can have different result types when used in different places, and the types it infers tend to be several pages long!  A nice solution to that would be good, but I’m happy with everything else!

Please comment if there’s something that can be improved.

July 17, 2007

A Neat Problem

Filed under: Haskell, math — cdsmith @ 11:41 pm

Here’s a neat game, with a really neat solution.  I’ll post the solution later, but this post shows a Haskell implementation of the game.  I first encountered this game as a puzzle in a game (an otherwise unremarkable one) called Keepsake by The Adventure Company.

The game looks like this.  There are five displays, which display the letters A through E.  The letters cycle through that sequence, wrapping back to A at the end.  There are also five buttons, which are numbered 1 through 5.  Each of the button changes each display my some random amount (from zero to four steps forward).  Pressing a button five times consecutively will always get things back to where they were before it was pressed. The goal is to get from the (randomly chosen) starting point to the (randomly chosen) ending point.

For example, if the starting point were ABCDE, the target were EDBDE, and button 3 advanced the first and third dials by two and the second by 1 (what I’ll call a “signature” of [2, 1, 2, 0, 0]), then the correct answer could be obtained by pushing button 3 twice.  The first press would change it to CCEDE, and the second to EDBDE (because the third dial wraps).

module Main where

import Control.Monad (replicateM)
import System.Random
import System.IO

Three fundamental pieces of information are used to define instances of this problem.  There is a starting display, a target display, and the actions of each of the five buttons.  The program chooses random values for these three, prints the target, and then starts processing moves by the user.

main :: IO ()
main = do   mat  <- buildMat
            disp <- replicateM 5 $ randomRIO ('A', 'E')
            tgt  <- replicateM 5 $ randomRIO ('A', 'E')
            putStrLn $ "Target = " ++ tgt
            continue mat disp tgt

The actions of each of the five buttons are represented in a matrix of sorts.  Each button has a row in the matrix; and the columns represent the effect on each of the letters in the display.

type Matrix = [[Int]]

Random values for the matrix are chosen as follows.

randomMat :: IO Matrix
randomMat = replicateM 5 $ replicateM 5 $ randomRIO (0,4)

There is one quirk, though.  We want the problem to have a solution.  Clearly there are some random matrices for which the problem has no solution.  (For a trivial example, imagine a matrix of all zeros; then it’s impossible to change the display.)  It turns out that, for reasons I will explain when I post the solution, it is sufficient for the matrix to have a non-zero (mod 5) determinant.  So here’s the code to calculate the determinant (by cofactor expansion along the first column)

det :: Matrix -> Int
det mat | length mat == 1 = head (head mat)
        | otherwise       = sum $ map det' [0 .. length mat - 1]
    where det' n = let x      = head (mat !! n)
                       submat = map tail (take n mat ++ drop (n + 1) mat)
                       sign   = if n `mod` 2 == 0 then 1 else -1
                   in  x * sign * det submat

The choice of a matrix, then, is performed by continuing to generate matrices until we find one with a non-zero (mod 5) determinant.

buildMat :: IO Matrix
buildMat = do m <- randomMat
              if det m `mod` 5 == 0 then buildMat else return m

Once the problem is generated, the next step is to keep asking for which button to press until the user accomplishes the goal.

continue :: [[Int]] -> [Char] -> [Char] -> IO ()
continue mat disp tgt | disp == tgt = do
    putStrLn "Congratulations!  You win."

                      | otherwise   = do
    putStrLn disp
    putStr "Enter 1 through 5: "
    hFlush stdout
    s        <- getLine
    let n     = read s
    let disp' = apply mat (n - 1) disp
    continue mat disp' tgt

Only two more functions are needed.  First, apply takes the matrix and a button number, and calculates the result of pressing that button.

apply :: Matrix -> Int -> [Char] -> [Char]
apply mat n vals = zipWith ($) (map add (mat !! n)) vals

Finally, the add function rotates a single letter of the display by a given amount.

add :: Int -> Char -> Char
add n s = iterate nxt s !! n
    where nxt 'A' = 'B' ; nxt 'B' = 'C'
          nxt 'C' = 'D' ; nxt 'D' = 'E'
          nxt 'E' = 'A'

Putting all of this together gives an implementation of the puzzle.  The puzzle can be fun to play with, but there’s also a neat straight-forward mathematical technique for finding the solution, which I’ll post in a few days.  Until then, have fun!

July 14, 2007

My Dream GHCi Session

Filed under: Haskell — cdsmith @ 12:39 pm

When I use GHCi in my dreams, here’s how it works:

$ ghci MyFile.hs
GHCi, version 9.9.02013010: http://www.haskell.org/ghc/  :? for help
Loading package base ... linking ... done.
[...]
[1 of 1] Compiling MyFile              ( MyFile.hs, interpreted )

MyFile.hs:13:5: Not in scope: `callDatabase'

MyFile.hs:112:16:
    No instance for (Fractional Integer)
    [...]

Partially failed, modules loaded: MyFile
MyFile> :t foo
Warning: foo depends on callDatabase, which is not in scope.
Warning: Inferred type may be too general.
foo :: forall a b. (SQL a, SQLData b) => a -> t -> STM [b]

MyFile> :list foo
13: foo sql db = do dat <- callDatabase db sql
14:                 readTVar dat

MyFile> :list 110-114
110:       return [a]
111:
112: bar x = (baz x) / 5
113:
114: {- superCoolFunction: Makes the world end; do not use unless

MyFile> :t baz
baz :: forall a. a -> Integer

MyFile> let bar = fromIntegral (baz x) / 5
MyFile> let callDatabase = undefined :: (SQL a, DBDriver b) => b -> a -> STM TVar
MyFile> :recomp MyFile
[1 of 1] Compiling MyFile              ( MyFile.hs, interpreted )
Ok, modules loaded: MyFile

MyFile> :t foo
foo :: forall a b. (SQL a, DBDriver t, SQLData b) => a -> t -> STM [b]

MyFile> superCoolFunction

And then I wake up! There’s more I’d like to include, such as:

  • Define data structures inside the REPL
  • Ask for the type of an expression *in context* *e.g., what’s the type of 5 in (5 + 3.0)?
  • Change defaulting from the REPL
  • Enter multiline blocks of code without having to edit a file
  • Run code that doesn’t work, and define broken bits = bottom.
  • Show inferred and expected types with :t when type checking fails

But I woke up too soon. (Or, if you prefer, posted this blog entry too soon.)

July 6, 2007

Why don’t Haskell Developers Use Windows?

Filed under: Haskell — cdsmith @ 9:27 am

Reading a recent thread on haskell-cafe, it was mentioned that there aren’t a lot of Windows Haskell programmers.  That’s true, but I think it’s misleading to say it that way.  This is one of those situations where correlation does mean causation.  Here’s my story.

I write Haskell using Windows… sort of.  Okay, not really.  What I do is use a Windows laptop to open a PuTTY window, in which I write Haskell code on a UNIX system that’s sitting on the other side of the room.  This is certainly not my ideal working environment.  I’d love to really write Haskell on Windows, but I don’t for several reasons.

I don’t write Haskell with Windows because I haven’t found a good environment for it.  I like my IDE programs, and especially Eclipse - which I already use for Java, C, and C++ programming; but that doesn’t look feasible.  The closest I could seem to come was eclipsefp, but it’s nearly unusable.  It gets syntax highlighting completely wrong.  (It’s easy to get hard cases wrong in Haskell; writing a really correct syntax highlighter is hard; but this one even gets obvious things wrong like forgetting to highlight deriving, and requiring the occasional frivolous edit to poke it into changing syntax highlighting when something changes.)  I tried to fix some things, but the Haskell code is in a darcs repository, and the only darcs plugin for Eclipse never seemed to work.  Writing Eclipse plugins in any environment except Eclipse is hideously painful; so enough of that.  Even if I did get it to work, the Eclipse support for Haskell doesn’t seem to advanced; may as well use a command line.

I don’t write Haskell with Windows because of Lambdabot.  Lambdabot and GOA (GHCi On Acid) are important tools for me.  They are far more important than the platform I develop on.  But it looks rather non-trivial to build them for Windows.  I tried to do so for several hours, before I gave up.

I don’t write Haskell with Windows because I build development versions of GHC a lot.  (I do this to at least help with testing, and partly because I have aspirations to contribute to GHC myself in the future.)  While most of the breaking of GHC seems to happen on MacOS, I notice a lot of patches floating around to fix broken builds of GHC for Windows as well.  One rarely sees a developer commit something that breaks the build for Linux, because that’s what the developers use.

There are probably more people out there in the same situation.  There seems to be a catch-22 here.  It’s not that Haskell doesn’t attract Windows.  Haskell attracts people, who subsequently realize that developing on Windows is harder than developing on Linux.  Since Haskell does not attract the sort of people who are slaves to their platform, that means there are fewer people developing Haskell with Windows.

July 5, 2007

Learning Haskell and Number Theory: The End of GCD

Filed under: Haskell, math — cdsmith @ 10:41 pm

This is the fifth installment of my series “Learning Haskell and Number Theory”, in which I work through Gareth and Mary Jones’ text, Elementary Number Theory, and translate its ideas and algorithms into Haskell.  In the last installment, I used the Haskell foldl function to convert a binary greatest common denominator into one that operates on a list.  This time, I fix a glitch in that implementation, the dreaded “space leak,” and look at strict and lazy evaluation in Haskell.

The implementation of gcdmany from last time looked like this:

gcdmany :: forall a. Integral a => [a] -> a
gcdmany = foldl gcd1 0

This is a correct implementation, but not a good one.  Before we can understand why, though, we need a diversion through tail recursion.  Everyone who’s done much functional programming knows the importance of tail recursion.  It turns O(n) space into O(1) space.  For example, this implementation of factorial:

fact 0 = 1
fact n = n * fact (n - 1)

uses O(n) space because it’s not tail recursive.  The evaluation of fact (in a strict language; I’ll talk about Haskell’s quirks later) would look like this:

fact 6
6 * (fact 5)
6 * (5 * (fact 4))
6 * (5 * (4 * (fact 3)))
6 * (5 * (4 * (3 * (fact 2))))
6 * (5 * (4 * (3 * (2 * (fact 1)))))
6 * (5 * (4 * (3 * (2 * (1 * (fact 0))))))
6 * (5 * (4 * (3 * (2 * (1 * 1)))))
6 * (5 * (4 * (3 * (2 * 1))))
6 * (5 * (4 * (3 * 2)))
6 * (5 * (4 * 6))
6 * (5 * 24)
6 * 120
720

Each line represents the set of computations that needs to be stored in computer memory at any stage of execution.  You can see that the memory needed grows with the value of n.  (It doesn’t matter much for factorial, because merely storing the answer also uses space that’s O(n) in the size of the input.  That’s beside the point.)

An alternative implementation is:

fact n = fact' 1 n
    where fact' a 0 = a
          fact' a n = fact' (a * n) (n - 1)

This one is made tail recursive, by following a very common technique of keeping an accumulator as an explicit parameter (in this case, a) with an outer setup function.  If you look back at the fourth installment, you’ll see that this is exactly what I did to turn the first implementation of gcdmany into the second one.  I introduced an explicit accumulator parameter, and then used it to convert the routine into a tail recursive one with setup.  In a strict language (like, say, the ML family of languages) that would be sufficient to turn the fact function into O(1) space.

The execution of this one (again, in a strict language) looks like this:

fact 6
fact'   1 6
fact'   6 5
fact'  30 4
fact' 120 3
fact' 360 2
fact' 720 1
fact' 720 0
720

This is much better!  In our hypothetical strict language, we’ve solved the problem.  In Haskell, though, it’s not quite good enough.  Haskell, being a lazy language, won’t even do simple multiplication until we ask for the result.  The execution, therefore, looks like this:

fact 6
fact' 1 6
fact' (1 * 6) 5
fact' (1 * 6 * 5) 4
fact' (1 * 6 * 5 * 4) 3
fact' (1 * 6 * 5 * 4 * 3) 2
fact' (1 * 6 * 5 * 4 * 3 * 2) 1
fact' (1 * 6 * 5 * 4 * 3 * 2 * 1) 0
1 * 6 * 5 * 4 * 3 * 2 * 1
6 * 5 * 4 * 3 * 2 * 1
30 * 4 * 3 * 2 * 1
120 * 3 * 2 * 1
360 * 2 * 1
720 * 1
720

And, oops!  We’re back to using O(n) space again.  This is the dreaded “space leak”.

Back to our gcdmany function.  It, too, contains such a space leak.  Instead of computing intermediate greatest common divisors as it applies arguments one by one, it actually accumulates the GCD operations for the whole list, before it starts to evaluate any of them.

We’d like to fix this.  Haskell provides a way to do so: selective strict evaluation.  In other words, we can tell Haskell that certain things must be evaluated strictly rather than put off for later.  Doing this never changes the answer to a successful computation, but it can do two things:

  1. Make a computation run faster, and with less memory.
  2. Make a computation fail or run forever, when it might otherwise have succeeded.

The first effect is what we want here.  The second, though, is interesting as well.  Consider the following code:

mylist = [1, 2, 3, undefined]
main   = print (length mylist == 4)

Running this program will print True.  In a strict language, this program would fail because the fourth element of the list is undefined.  (To elaborate more, you might design a language to avoid failing if the term is known to be undefined.  But what if teh function that computes it goes into an infinite loop?  In Haskell, it wouldn’t matter because we never use that value.  In a strict language, it would matter quite a lot!)

In the case of gcdmany, though, I can verify that I am not doing anything that might cause the system to go into an infinite loop or fail.  It is, therefore, safe, to replace it with a strict version.

The first way of doing this is standard Haskell 98.  It looks like this:

gcdmany :: forall a. Integral a => [a] -> a
gcdmany xs = gcdmany' 0 xs
    where gcdmany' p []     = p
          gcdmany' p (x:xs) = (gcd1 p x) `seq` gcdmany' (gcd1 p x) xs

The magic function seq makes this all work.  If you read the description of seq in the standard library, it’s quite confusing for beginners to Haskell.  What it really means is that is evaluates the first bit of code (so, as a side effect, if the first bit goes into an infinite loop or throws an exception, it’ll do that too), and then goes on with the second one as if the first one didn’t exist.  However, evaluating the first one means that instead of it being work put off until later, the result is already known.  That gets us back to the nice O(1) memory usage.

The second way of doing this does the same thing, but is shorter and only works in GHC with the -fallow-bang-patterns option.

gcdmany :: forall a. Integral a => [a] -> a
gcdmany xs = gcdmany' 0 xs
    where gcdmany' !p []     = p
          gcdmany' !p (x:xs) = gcdmany' (gcd1 p x) xs

The ! in front of the argument p does the same thing as seq but is much nicer to read.

Both of these require that we write out gcdmany by hand, though.  Before, we were able to make it a lot nicer using foldl.  The third option fixes that.  It’s not part of the standard Haskell 98 prelude, but it does exist in GHC.

gcdmany :: forall a. Integral a => [a] -> a
gcdmany = foldl' gcd1 0

The foldl’ function is just a strict version of foldl.  It does exactly what we want.  So with all those words, the solution was achieved by adding one small character.  (Why, though?  Because of the ability to use higher order functions to do functional abstraction!)

In general, one wants to use foldl’ when folding over an operation that produces a distinct result, and foldr when folding with a data constructor or an operation that builds more complex data structures.  In the first case, you avoid space leaks.  In the second case, you can use lazy evaluation to only build a piece at a time of the data structure you’re operating on, and thus save even more memory through lazy evaluation.

That’s all for now.  Next time, I’ll look at some cool connections between Euclid’s GCD algorithm and the Fibonacci numbers, and then move on to chapter 2.

Find the Bug

Filed under: Haskell — cdsmith @ 2:47 pm

Here’s a pretty tricky bug I hunted down yesterday, with the help of oerjan on #haskell.  If you weren’t there to see, it’s a lot of fun.  Here’s the code:

foo n = product [24 - n .. 23] `div` product [1 .. n]
bar n = length (takeWhile ((< 1000000) . foo) [1 .. n])
baz n = if bar n == n then n else bar n

And here’s the result:

cdsmith@devtools:~/haskell/projects$ ghci Strange.hs
GHCi, version 6.7.20070703: http://www.haskell.org/ghc/  :? for help
[1 of 1] Compiling Main             ( Strange.hs, interpreted )
Ok, modules loaded: Main.
*Main> bar 10
9
*Main> baz 10
10

The question: Why does bar 10 evaluate to something different from baz 10?  By looking at the definition of baz, you might convince yourself that baz and bar do the same thing.  Apparently not.

Scroll down for the answer.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

The problem is numeric overflow.  The standard library length function has a result type of Int.  The intermediate result in foo is too large to fit in the data type Int.  The bar function has type forall a. Integral a => a -> Int.  When calling bar directly, this isn’t a problem because the argument defaults to type Integer and foo is evaluated entirely in arbitrary precision numbers.  However, when baz is evaluated, there is a comparison bar n == n, so the result of bar must have the same type as its argument.  The result is known to be Int because the return type of bar is not polymorphic.  So the argument n must also be of type Int.  Then foos inferred type is Int -> Int, which causes the overflow.

All clear now?

If one wanted to take a lesson from this, it would be one of the following:

  • Be careful with types.
  • General purpose functions like length shouldn’t use dangerous types like Int.
  • Int should be made less dangerous by throwing an exception on overflow.

Let’s take a poll: which lesson do you draw from it?  Post in the comments.

Next Page »