Some people are remarkably picky. They actually want to see reflections in a ray tracer. Can you believe that? Here you go then!

Some people are remarkably picky. They actually want to see reflections in a ray tracer. Can you believe that? Here you go then!

I’m giving a talk at a user group meeting next week on how to program in Haskell. Rather than just listing off language constructs, I wanted to motivate the whole thing with an actual application. So that’s how I ended up spending yesterday afternoon writing a ray tracer in Haskell. I’m certainly not the first to do so, nor the best, but I am writing this blog post to explain the process I went through. I’ll try making this a literate Haskell post, so you can copy and paste it into a source file (extension .lhs) and run it. The actual code is split into three (very short) modules, but this should work.
Some imports to start us off:
> import Data.Maybe (mapMaybe) > import Data.Function (on) > import Data.List (minimumBy) > import Data.Ix (range) > import Control.Monad (forM_) > import Graphics.UI.Gtk hiding (Color, Point, Object) > import qualified Graphics.UI.Gtk as GTK
First, I built a couple basics of vector math. Out of a desire to demonstrate some list manipulation functions in the presentation, my Vector type is simply [Double].
> type Vector = [Double] > > (<+>) = zipWith (+) > (<->) = zipWith (-) > (<*>) = zipWith (*) > vlength u = sqrt (sum (map (^2) u)) > normalize u = map (/ vlength u) u > u .*. v = sum (u <*> v) > k #* v = map (*k) v
Since vectors don’t really even approximate a ring, it seems quite inappropriate to call them Num. Therefore, I’ve invented my own operators. The operators in angle brackets are component-wise, while the normal dot and scalar products are given their own operators, .*. and #*, respectively. I found this to be quite usable, though it would be equally reasonable to eschew operators, and define these as named functions to be used in infix notation.
I then define a few precedences, to make these easier to use.
> infixl 6 <+> > infixl 6 <-> > infixl 7 <*> > infixl 7 .*. > infixl 7 #*
I’ll treat points as displacement vectors from an origin, and colors as vectors in a color space, but it helps to use separate names for them when writing type signatures:
> type Point = Vector > type Color = Vector
Now for a few real data types. The idea is that an object has some general-purpose properties, plus a shape. The shape could be arbitrary, but is currently limited to only spheres and infinite planes. (Hey, this talk is only an hour and a half long!) So I’ve separated it out. The other properties I’ve got are material properties: how shiny (that is, reflective) is the surface, and what is its color. These are defined as functions from a point in space to the desired values, but in practice I’m just going to use const.
> data Object = Obj { shape :: Shape,
> objectColor :: Point -> Color,
> objectShine :: Point -> Double }
>
> data Shape = Sphere { center :: Point, radius :: Double }
> | Plane { point :: Point, normal :: Vector }
And lights are important to. Right now, I’ve got point lights and ambient lights. Adding directed lights (e.g. spot lights) wouldn’t be too difficult, but it is neither of critical importance nor necessary to demonstrate a Haskell language idea, so I’ve left it out.
> data Light = PointLight { lightPt :: Point, lightColor :: Color }
> | AmbientLight { lightColor :: Color }
The two things we need to know about any shape are how to compute the distance to it along a given ray (specified by origin and a unit direction vector), and how to find the normal (another unit vector) at a given point on the surface of the object. These next two functions do this. They are defined for spheres and planes, but can of course be extended to new variants of Shape via additional pattern matching.
> shapeDistance :: Shape -> Point -> Vector -> Maybe Double > > shapeDistance (Sphere c r) orig dir = > let p = orig <+> (c <-> orig) .*. dir #* dir > d_cp = vlength (p <-> c) > d = vlength (p <-> orig) - sqrt (r^2 - d_cp^2) > ans | d_cp >= r = Nothing > | (p <-> orig) .*. dir <= 0 = Nothing > | otherwise = Just d > in ans > > shapeDistance (Plane p n) orig dir > | dir .*. n >= 0 = Nothing > | otherwise = Just (((p <-> orig) .*. n) / (dir .*. n)) > > shapeNormal :: Shape -> Point -> Vector > shapeNormal (Sphere c r) pt = normalize (pt <-> c) > shapeNormal (Plane p n) pt = n
The math there can be derived without a whole lot of trouble, provided you’re relatively familiar with the basic ideas of vector operations. It surprises me, though, how many people only understand dot products in terms of their implementation (the sum of the component-wise products, or a b cos(theta)), and not in terms of what they mean. In particular, when one of the vectors in a dot product is a unit vector, the dot product is the component of the other vector along the direction of the unit vector. That’s a pretty powerful tool to have.
Next come the lights. For the lights, the general question is how much (in each color component) the given light illuminates a particular point with a particular normal. The object list is needed here as well, since lights can be occluded. This is all captured in a function applyLight, which is implemented for each type of light. It’s trivial for ambient lights, since by definition they ignore all the specifics of the situation; but it’s non-trivial for point lights. I also fence all light components to eliminate “negative” lights.
> fence a b x | x < a = a > | x > b = b > | otherwise = x > > applyLight :: [Object] -> Point -> Vector -> Light -> Vector > applyLight objs pt n (PointLight lpt color) = > let dir = normalize (lpt <-> pt) > dist = vlength (lpt <-> pt) > lval = (n .*. dir) / dist^2 #* color > final = map (fence 0 9999) lval > in case findNearest objs pt dir of > Just (_,opt) > | vlength (opt <-> pt) < dist -> [0,0,0] > | otherwise -> final > Nothing -> final > > applyLight objs pt n (AmbientLight color) = color
Now for the common stuff. A very common thing we want is to take our list of objects and find the nearest point of intersection for a given ray. This function does that, by calling objectDistance for each object in turn, and then choosing the minimum one. It gives back Nothing if there is no intersection, or Just (obj,point) if there is.
> findNearest :: [Object] -> Point -> Vector -> Maybe (Object, Point) > findNearest objs orig dir = > let consider obj = case shapeDistance (shape obj) orig dir of > Nothing -> Nothing > Just d -> Just (obj, d) > result = mapMaybe consider objs > in case result of > [] -> Nothing > _ -> let (obj,d) = minimumBy (compare `on` snd) result > in Just (obj, orig <+> d #* dir)
And finally, the ray tracing itself. I’ve also thrown in defaultColor, which is the color assigned to rays that have either reflected back and forth beyond the recursion limit, or that proceed into infinity. For now, this is always black.
> defaultColor :: Vector -> Color > defaultColor _ = [0, 0, 0] > > trace :: [Object] -> [Light] -> Point -> Vector -> Int -> Color > trace objs lights orig dir 0 = defaultColor dir > trace objs lights orig dir limit = > case findNearest objs orig dir of > Nothing -> defaultColor dir > Just (obj, pt) -> > let n = shapeNormal (shape obj) pt > ndir = dir <-> 2 * (n .*. dir) #* n > refl = trace objs lights pt ndir (limit - 1) > color = objectColor obj pt > shine = objectShine obj pt > lvals = map (applyLight objs pt n) lights > lighting = (foldl (<+>) [0,0,0] lvals) > in_light = shine #* refl <+> (1 - shine) #* lighting > in map (fence 0 1) (color <*> in_light)
Finally, there’s the GUI stuff. I won’t discuss this much, except to say that it’s horridly bad. I’m doing the actual ray tracing in an onExpose event handler. If I had even a modicum of common sense, I’d be rendering to an image or something, so that I wouldn’t have to redo all the ray tracing just because you switched over to a different window. Oh well. I’ll improve this before the talk on Tuesday.
> objs = [
> Obj (Sphere [-50, -50, 100] 100)
> (const [1.0, 0.3, 1.0])
> (const 0.1),
> Obj (Sphere [ 50, 50, 100] 100)
> (const [0.2, 1.0, 0.2])
> (const 0.1),
> Obj (Plane [-100,100,0] (normalize [1,-0.2,-0.2]))
> (const [1.0,1.0,1.0])
> (const 0.0)
> ]
>
> lights = [ PointLight [175, -200, 10] [40000.0, 20000, 20000],
> AmbientLight [0.0, 0.05, 0.1]
> ]
>
> drawScene d ev = do
> dw <- widgetGetDrawWindow d
> (w,h) <- widgetGetSize d
> gc <- gcNew dw
>
> let eye = [0, 0, -500]
>
> forM_ (range ((0,0), (w-1, h-1))) $ \(x,y) -> do
> let (x', y') = (fromIntegral x - fromIntegral w / 2,
> fromIntegral y - fromIntegral h / 2)
> let [r,g,b] = trace objs lights eye
> (normalize ([x',y',0] <-> eye)) 5
> let fg = GTK.Color (round (65535 * r))
> (round (65535 * g))
> (round (65535 * b))
> gcSetValues gc $ newGCValues { foreground = fg }
> drawPoint dw gc (x,y)
> return True
>
> main = do
> initGUI
> w <- windowNew
> d <- drawingAreaNew
> windowSetTitle w "Ray Tracer"
> containerAdd w d
> onExpose d (drawScene d)
> onDestroy w mainQuit
> widgetShowAll w
> mainGUI
And there you go: a ray tracer! Of course, there are many things that ought to be improved. They range from the trivial (render to an off-screen image, perhaps save to an image file, and read from a file instead of coding the scene as an immediate data structure) to the very complex (implement diffuse effects, more complex interpolated curves, etc.). But as a quick demo goes, this one didn’t turn out so bad.

In a few other blogs (especially here), there has been some discussion of the mathematical foundations of functions in programming languages, starting from topology. Just for fun, I will approach from the other direction, describing functions that can be defined in various languages in terms of partially ordered sets. Of course, for those of you who know this already, then you can go on and define topologies on these posets a la Dana Scott, so it amounts to the same thing. But this way has more pretty pictures.
Suppose we want to look at functions from the natural numbers to the natural numbers. (By the natural numbers, I mean non-negative integers.) There are, of course, infinitely many such functions. Let’s pick a few that are interesting:
That should give us a good start. These functions are all quite well-defined over the integers. Indeed, I can implement them all in my favorite programming language, and I’ll get exactly that (note: only because my favorite programming language does arbitrary precision integers, of course.)
The next question is this: what happens if I don’t know everything about the input to the function? What can I infer about the output given partial information about the input? Here’s some of the things I might ask of that sort:
| x | x = 2 | x is even | x is at least 5 | x could be anything |
|---|---|---|---|---|
| f(x) | f(x) = 5 | f(x) = 5 | f(x) = 5 | f(x) = 5 |
| g(x) | g(x) = 2 | g(x) is even | g(x) is at least 5 | g(x) could be anything |
| h(x) | h(x) = 3 | h(x) is odd | h(x) is at least 6 | h(x) is at least 1 |
| p(x) | p(x) = 4 | p(x) is even | p(x) is at least 10 p(x) is even |
p(x) is even |
| q(x) | q(x) = 4 | q(x) is even q(x) is a perfect square |
q(x) is at least 25 q(x) is a perfect square |
q(x) is a perfect square |
Getting back this sort of partial information is trickier than implementing the function over totally known arguments. That’s what this blog post is about.
The trick is to build a partial order on the domain of the function. A partial order is some relation, ≤, that has three properties:
Everyone is familiar with the ≤ that means “is no larger than.” But there are other perfectly good definitions as well. For example “is a factor of” is a perfectly good meaning for ≤. Now here’s where the pretty pictures come in. We can draw a diagram for a partial order, where there’s an upward path from x to y precisely when x ≤ y.
For example, here is part of the diagram for the familiar partial order “is no larger than” on the natural numbers:

And here is part of the diagram for “is a factor of”:

Each of these partial orders contains some information about the relationships between natural numbers. In particular, one can identify certain properties of the natural numbers as belonging to up-sets: sets that have the property that if they contain something, they also contain everything that it greater than it. So “at least 5″ is the up-set of naturals in the familiar order that contains 5 and everything above it. We can call this the up-set generated by 5. In the second (divisibility) order, the up-set generated by 5 can be described as “divisible by 5″.
To use these orders to get information about partial inputs to functions, we need two things:
Let’s shoot for the first one. Looking at the kinds of partial information we had:
Once we’ve chosen a partial order whose up-sets capture the information we want, we have to verify that the function actually preserves the order. That is, if we take an up-set generated by x, and pass everything in it through the function, will the result live in the up-set generated by f(x). The successor function h(x) = x + 1 that we defined earlier has this property in the familiar order. If I have a set of all numbers bigger than 4, and I add one to them all, I get all the numbers bigger than 5. However, h(x) does not preserve up-sets in the divisibility order. If I have all numbers divisible by 3, and I add one to them all, there’s no guarantee that the resulting numbers will be divisible by 4! So I have to be careful about using an order-preserving function, if I want any kind of partial information from up-sets.
What a mess! Sure, if I can define the right partial order, and then prove that the function preserves up-sets on the partial order, and sometimes pull this trick with equivalence classes, and then I get to say something about the behavior of my function on partial information. But that seems less than fulfilling somehow. It feels like I’m working too hard, and now I can’t even define some functions, depending on which order I choose.
So what do we do with all of this?
The first thing to do is to pick an order and stick with it. We’ll lose some partial information, but we’ll get to keep other partial information. The logical choice for natural numbers is the familiar order. It’s fairly common for me to want to know that a number is at least 3. It’s somewhat less common for me to want to know that a number is a multiple of 3. But I won’t assume any choice, and in a future blog post, I’ll demonstrate how to build numeric data types in Haskell that parallel several possible choices, and exhibit different laziness characteristics as a result.
The next thing we want to do is recover the ability to define all functions again. The problem with defining some functions is that there’s something “greater than” the number we’re looking at that won’t stay that way once we’re done. The obvious solution is to just make sure there’s nothing “greater than” a number. We can invent a whole set of pseudo-numbers: ~0, ~1, ~2, and so on. Then we’ll put the numbers themselves above that. Here’s the new set of natural numbers with the familiar order:

And voila! I can define all my favorite functions with the actual numbers. What remains is to extend my function onto the pseudo-numbers as well. In this order, the pseudo-number ~n means “at least n“. There’s always one safe way to do this: just map everything to ~0, the minimal element. However, if I still want to get partial information about the function, I need to do better.
Here’s the result for the divisibility order:

Here we have the same thing, except that “~n” now means “divisible by n“. Again, I need to extend my functions to the pseudo-numbers. I can still map everything to the bottom element (in this case, “~1″), but this throws away any partial information I may have been keeping.
Finally, one more option. This corresponds to a partial order on an equivalence class, sort of like what came out of the perfect squares example. This one, though, is the trivial equivalence class that lumps everything together. The result is the strict natural numbers, in which I must know the entire number to get any information out of any (non-constant) function.

Here only a constant function could do anything except map ~0 back to ~0. Therefore, unless your function is constant, the only way to get any information back from your input is to know the precise value of the input. This is an important arrangement of the natural numbers because for efficiency reasons, in the particular case of integers, this is precisely what all real programming languages do. We’re just using natural numbers as an example, though; the ideas here apply equally well to more complex structures where laziness exists.
It turns out, now, that once we’ve chosen a partial order for the pseudo-numbers, there is sometimes a best possible way to extend any function over the naturals to include pseudo-numbers. To determine the value of any function f at ~n, we’d like to find the greatest lower bound of f(x) for all x ≥ ~n. Now if the partial order we chose happens to be something called a complete meet-semilattice (all of the orders we’ve talked about are), then that greatest lower bound exists.
As you may have guessed by now by my occasional references, this is all connected to laziness of functions in programming languages. I had intended to write about that now, but this has gotten too long already, so I am instead going to stop here and finish the other half at a later time. Hope you enjoyed it!
Here’s an article on reasons to like Ruby. It’s actually very well written and persuasive. So I took it as a challenge. How well would Haskell fare when compared against this specific set of criteria, changing the points as little as possible? The meanings will inevitably shift a bit, and readers will know that Haskell has its own set of advantages, of course.
The point of this exercise, though, is to see how Haskell does in the benefits mentioned for Ruby in that particular article. It’s not a Haskell vs. Ruby; just a shameless theft of a set of criteria. Should be fun.
Text.Regex module in the Haskell standard library contains functions to do regular expression stuff. It even uses Haskell’s operator mechanism to define operators like =~ to look more like other languages sometimes.>>= operator provide the ability to combine pieces every bit as tersely as pipes; type inference eliminates the cost of static types. This page talks about using Haskell for scripting, and this one describes the -e option to ghc, which lets you run Haskell code directly from the command line, and gives an example of using it.Either and Maybe that help to express those exceptional conditions in a functional manner. When you’re working in an imperative way (e.g., in the IO monad, or any other MonadPlus environment) it provides exception handling, since that’s the right choice for the imperative style.:t at the GHCi command prompt, so you don’t have to repeat it over and over again in your code.So there you have it.
This is what I have been playing with for the last day or so.
Haskell has a very nice monadic parser library for predictive parsing (parsec), and a decent lex/yacc-style parser and lexer generator suite (happy and alex). That said, though, it’s more fun and educational to write code than to worry about what’s already been written, I set out to do something similar. In particular, my goals are:
I haven’t written the parsing code yet, but I did squeeze out something that I’m nearly happy with for the first two goals.
{-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE UndecidableInstances #-} {-# LANGUAGE NoMonomorphismRestriction #-}
All the language extensions I’ll be using. This is the bare-bones list; the original list was eight or nine lines.. So, if you were wonder whether this is a good post for new Haskellers just learning the language, there’s your answer!
I need some way to represent variables (in the CFG sense). In order to ensure that everything is well-typed, I need some way to keep track of the type of the semantic value associated with each variable. Here’s what I did.
data Var a = Var String
And a right-hand side of each rule will have a sequence of variables and terminals. Again, to keep the type information around, I’ll need a sort of cons operator at the type level. Here is that. I defined a type, and also an operator that makes the type easier to use.
data RHS a b = RHS a b (&) = RHS -- a convenient operator infixr 5 &
And next, things get hard. I’m using a multiparameter type class, in the fine tradition of Haskell type hacking, to represent a relation between types. My relation is defined in the following comment:
{- (Action a b c) means the following: A production with a right hand side of type: a may be associated with a semantic rule of type: b to produce a rule with semantic result type: c -} class Action a b c | a b -> c, a c -> b
In other words, the Action class will be used to ensure that the result type of a grammar production, the right-hand side of the production, and the type of the associated semantic rule are all consistent with each other. The functional dependencies simply assert that if you know the types of the right-hand side and the semantic rule, this is enough to determine what the result will be after applying the semantic rule; and that if you know the right-hand side and the result type, this is enough to determine what the type of the semantic rule needs to be.
There are a three base cases for this relation:
instance Action (Var x) (x -> y) y
This says that if a production has the form A -> B, where A has semantic values of type y, and B has semantic values of type x, then the semantic rule must have type (x -> y). If you think about it, this should make sense.
instance Action Char y y
This rule says that if the right-hand side of a production is a signle character (a terminal, not a variable), the semantic rules should be a constant that matches the semantic type for the left-hand variable.
instance Action () y y
This describes the situation for empty productions (sometimes called epsilon or lambda productions). Since leaving out any terms on the right-hand side of a rule isn’t an option, I use (), called “unit” to represent an empty right-hand side.
Those are the base cases. (As a side comment, only the last one is strictly necessary; the first two are basically just syntactic convenience. See below.) Here’s how rules with more than one symbol on the right-hand side are handled.
instance (Action a b c) => Action (RHS (Var x) a) (x -> b) c
The RHS operator defined earlier is used to build a list of sorts. This rule says that adding a variable to the beginning of the right-hand side of a rule requires adding a parameter to the beginning of the semantic action, and that the result type stays the same. This case handles right-hand sides that begin with a variable.
instance (Action a b c) => Action (RHS Char a) b c
Finally, this case handles right-hand sides that begin with a terminal (a character). The types of the semantic rule and result don’t change, since a terminal is known in advance, so there’s no need for it to carry semantic information.
Some more syntactic convenience makes it easier to write grammars. Here I abuse monads to take advantage of the special syntax.
{- The RuleSet monad is used to define rules of a grammar in a convenient 'do' notation. -} data Rule = forall a b c. (Action a b c) => Rule (Var c, a, b) data RuleSet x = RuleSet ([Rule], x) instance Monad RuleSet where a >>= k = let RuleSet (r1, v1) = a RuleSet (r2, v2) = k v1 in RuleSet (r1 ++ r2, v2) return x = RuleSet ([], x)
So a rule consists of a left-hand side, a right-hand side, and a semantic rule. They are constrained to match each other by the Action class defined above. A RuleSet is basically just a writer monad for lists of rules, but I defined it by hand just for the fun of it.
Now time to define an operator for building rules inside the monad:
(==>) :: (Action a b c) => Var c -> a -> b -> RuleSet () (==>) lhs rhs sem = RuleSet ([Rule (lhs, rhs, sem)], ()) infixr 4 ==>
It took a while to pick this. All the good arrow-like operators seems to be taken! Nevertheless, it does the job we want fairly well. Notice that even though I’m using an infix operator, there are three operands. The normal usage looks like this:
lefthand ==> righthand $ semanticrule
You’ll see examples coming up.
The formal definition of a context-free grammar includes four things: a set of variables, a set of terminals, a set of productions, and a special start variable. We’ve got three: variables are values of type Var a. Terminals are values of type Char. Productions are values of type Rule. Next, I need a start symbol. This is defined once, outside of the monadic environment in which rules are defined. At the same time, I through away the result value of the monad, which is useless since I was just exploiting the syntax rather than building a real monad.
{- Grammar. A grammar is a set of rules together with a start symbol. -} data Grammar a = Grammar (Var a) [Rule] grammar s (RuleSet (rs, x)) = Grammar s rs
And that’s it! Here’s a sample grammar so we can see it work.
{- Sample grammar. The parentheses in the let bindings are required: they force the rules to be monomorphic, which is needed for type checking to work properly. -} g = let ( expr ) = Var "expression" ( term ) = Var "term" ( termmore ) = Var "term operator" ( fact ) = Var "factor" ( factmore ) = Var "factor operator" ( digit ) = Var "digit" ( digits ) = Var "digits" in grammar expr $ do expr ==> term & termmore $ \\x y -> y x termmore ==> () $ \\x -> x termmore ==> '+' & term & termmore $ \\x y z -> y (x + z) termmore ==> '-' & term & termmore $ \\x y z -> y (x - z) term ==> fact & factmore $ \\x y -> y x factmore ==> () $ \\x -> x factmore ==> '*' & fact & factmore $ \\x y z -> y (x * z) factmore ==> '/' & fact & factmore $ \\x y z -> y (x / z) fact ==> '(' & expr & ')' $ \\x -> x fact ==> digit & digits $ \\x y -> y x digits ==> () $ \\x -> x digits ==> digit & digits $ \\x y z -> y (10*x + z) digit ==> '0' $ 0 digit ==> '1' $ 1 digit ==> '2' $ 2 digit ==> '3' $ 3 digit ==> '4' $ 4 digit ==> '5' $ 5 digit ==> '6' $ 6 digit ==> '7' $ 7 digit ==> '8' $ 8 digit ==> '9' $ 9
(This is a modified grammar I had laying around from a set of compiler course notes. It happens to have left recursion removed, but that’s immaterial really.) There are no type declarations in the entire grammar. Ask GHCi for the type of g, though, and it answers.
g :: (Fractional a) => Grammar a
It correctly inferred that the result type of expr must be Fractional. How? Because the third production for factmore uses the / operator. This means that factmore must be Fractional, and the type ripples upward all the way to the start variable of expr.
The only thing I don’t like at the moment is the need to use an explicit monomorphic binding (the parentheses) to declare non-terminals. If that’s not done, then the compiler thinks non-terminals can have different result types when used in different places, and the types it infers tend to be several pages long! A nice solution to that would be good, but I’m happy with everything else!
Please comment if there’s something that can be improved.
Here’s a neat game, with a really neat solution. I’ll post the solution later, but this post shows a Haskell implementation of the game. I first encountered this game as a puzzle in a game (an otherwise unremarkable one) called Keepsake by The Adventure Company.
The game looks like this. There are five displays, which display the letters A through E. The letters cycle through that sequence, wrapping back to A at the end. There are also five buttons, which are numbered 1 through 5. Each of the button changes each display my some random amount (from zero to four steps forward). Pressing a button five times consecutively will always get things back to where they were before it was pressed. The goal is to get from the (randomly chosen) starting point to the (randomly chosen) ending point.
For example, if the starting point were ABCDE, the target were EDBDE, and button 3 advanced the first and third dials by two and the second by 1 (what I’ll call a “signature” of [2, 1, 2, 0, 0]), then the correct answer could be obtained by pushing button 3 twice. The first press would change it to CCEDE, and the second to EDBDE (because the third dial wraps).
module Main where import Control.Monad (replicateM) import System.Random import System.IO
Three fundamental pieces of information are used to define instances of this problem. There is a starting display, a target display, and the actions of each of the five buttons. The program chooses random values for these three, prints the target, and then starts processing moves by the user.
main :: IO () main = do mat <- buildMat disp <- replicateM 5 $ randomRIO ('A', 'E') tgt <- replicateM 5 $ randomRIO ('A', 'E') putStrLn $ "Target = " ++ tgt continue mat disp tgt
The actions of each of the five buttons are represented in a matrix of sorts. Each button has a row in the matrix; and the columns represent the effect on each of the letters in the display.
type Matrix = [[Int]]
Random values for the matrix are chosen as follows.
randomMat :: IO Matrix randomMat = replicateM 5 $ replicateM 5 $ randomRIO (0,4)
There is one quirk, though. We want the problem to have a solution. Clearly there are some random matrices for which the problem has no solution. (For a trivial example, imagine a matrix of all zeros; then it’s impossible to change the display.) It turns out that, for reasons I will explain when I post the solution, it is sufficient for the matrix to have a non-zero (mod 5) determinant. So here’s the code to calculate the determinant (by cofactor expansion along the first column)
det :: Matrix -> Int det mat | length mat == 1 = head (head mat) | otherwise = sum $ map det' [0 .. length mat - 1] where det' n = let x = head (mat !! n) submat = map tail (take n mat ++ drop (n + 1) mat) sign = if n `mod` 2 == 0 then 1 else -1 in x * sign * det submat
The choice of a matrix, then, is performed by continuing to generate matrices until we find one with a non-zero (mod 5) determinant.
buildMat :: IO Matrix buildMat = do m <- randomMat if det m `mod` 5 == 0 then buildMat else return m
Once the problem is generated, the next step is to keep asking for which button to press until the user accomplishes the goal.
continue :: [[Int]] -> [Char] -> [Char] -> IO () continue mat disp tgt | disp == tgt = do putStrLn "Congratulations! You win." | otherwise = do putStrLn disp putStr "Enter 1 through 5: " hFlush stdout s <- getLine let n = read s let disp' = apply mat (n - 1) disp continue mat disp' tgt
Only two more functions are needed. First, apply takes the matrix and a button number, and calculates the result of pressing that button.
apply :: Matrix -> Int -> [Char] -> [Char] apply mat n vals = zipWith ($) (map add (mat !! n)) vals
Finally, the add function rotates a single letter of the display by a given amount.
add :: Int -> Char -> Char add n s = iterate nxt s !! n where nxt 'A' = 'B' ; nxt 'B' = 'C' nxt 'C' = 'D' ; nxt 'D' = 'E' nxt 'E' = 'A'
Putting all of this together gives an implementation of the puzzle. The puzzle can be fun to play with, but there’s also a neat straight-forward mathematical technique for finding the solution, which I’ll post in a few days. Until then, have fun!
When I use GHCi in my dreams, here’s how it works:
$ ghci MyFile.hs GHCi, version 9.9.02013010: http://www.haskell.org/ghc/ :? for help Loading package base ... linking ... done. [...] [1 of 1] Compiling MyFile ( MyFile.hs, interpreted ) MyFile.hs:13:5: Not in scope: `callDatabase' MyFile.hs:112:16: No instance for (Fractional Integer) [...] Partially failed, modules loaded: MyFile MyFile> :t foo Warning: foo depends on callDatabase, which is not in scope. Warning: Inferred type may be too general. foo :: forall a b. (SQL a, SQLData b) => a -> t -> STM [b] MyFile> :list foo 13: foo sql db = do dat <- callDatabase db sql 14: readTVar dat MyFile> :list 110-114 110: return [a] 111: 112: bar x = (baz x) / 5 113: 114: {- superCoolFunction: Makes the world end; do not use unless MyFile> :t baz baz :: forall a. a -> Integer MyFile> let bar = fromIntegral (baz x) / 5 MyFile> let callDatabase = undefined :: (SQL a, DBDriver b) => b -> a -> STM TVar MyFile> :recomp MyFile [1 of 1] Compiling MyFile ( MyFile.hs, interpreted ) Ok, modules loaded: MyFile MyFile> :t foo foo :: forall a b. (SQL a, DBDriver t, SQLData b) => a -> t -> STM [b] MyFile> superCoolFunction
And then I wake up! There’s more I’d like to include, such as:
But I woke up too soon. (Or, if you prefer, posted this blog entry too soon.)
Reading a recent thread on haskell-cafe, it was mentioned that there aren’t a lot of Windows Haskell programmers. That’s true, but I think it’s misleading to say it that way. This is one of those situations where correlation does mean causation. Here’s my story.
I write Haskell using Windows… sort of. Okay, not really. What I do is use a Windows laptop to open a PuTTY window, in which I write Haskell code on a UNIX system that’s sitting on the other side of the room. This is certainly not my ideal working environment. I’d love to really write Haskell on Windows, but I don’t for several reasons.
I don’t write Haskell with Windows because I haven’t found a good environment for it. I like my IDE programs, and especially Eclipse - which I already use for Java, C, and C++ programming; but that doesn’t look feasible. The closest I could seem to come was eclipsefp, but it’s nearly unusable. It gets syntax highlighting completely wrong. (It’s easy to get hard cases wrong in Haskell; writing a really correct syntax highlighter is hard; but this one even gets obvious things wrong like forgetting to highlight deriving, and requiring the occasional frivolous edit to poke it into changing syntax highlighting when something changes.) I tried to fix some things, but the Haskell code is in a darcs repository, and the only darcs plugin for Eclipse never seemed to work. Writing Eclipse plugins in any environment except Eclipse is hideously painful; so enough of that. Even if I did get it to work, the Eclipse support for Haskell doesn’t seem to advanced; may as well use a command line.
I don’t write Haskell with Windows because of Lambdabot. Lambdabot and GOA (GHCi On Acid) are important tools for me. They are far more important than the platform I develop on. But it looks rather non-trivial to build them for Windows. I tried to do so for several hours, before I gave up.
I don’t write Haskell with Windows because I build development versions of GHC a lot. (I do this to at least help with testing, and partly because I have aspirations to contribute to GHC myself in the future.) While most of the breaking of GHC seems to happen on MacOS, I notice a lot of patches floating around to fix broken builds of GHC for Windows as well. One rarely sees a developer commit something that breaks the build for Linux, because that’s what the developers use.
There are probably more people out there in the same situation. There seems to be a catch-22 here. It’s not that Haskell doesn’t attract Windows. Haskell attracts people, who subsequently realize that developing on Windows is harder than developing on Linux. Since Haskell does not attract the sort of people who are slaves to their platform, that means there are fewer people developing Haskell with Windows.
This is the fifth installment of my series “Learning Haskell and Number Theory”, in which I work through Gareth and Mary Jones’ text, Elementary Number Theory, and translate its ideas and algorithms into Haskell. In the last installment, I used the Haskell foldl function to convert a binary greatest common denominator into one that operates on a list. This time, I fix a glitch in that implementation, the dreaded “space leak,” and look at strict and lazy evaluation in Haskell.
The implementation of gcdmany from last time looked like this:
gcdmany :: forall a. Integral a => [a] -> a gcdmany = foldl gcd1 0
This is a correct implementation, but not a good one. Before we can understand why, though, we need a diversion through tail recursion. Everyone who’s done much functional programming knows the importance of tail recursion. It turns O(n) space into O(1) space. For example, this implementation of factorial:
fact 0 = 1 fact n = n * fact (n - 1)
uses O(n) space because it’s not tail recursive. The evaluation of fact (in a strict language; I’ll talk about Haskell’s quirks later) would look like this:
fact 6 6 * (fact 5) 6 * (5 * (fact 4)) 6 * (5 * (4 * (fact 3))) 6 * (5 * (4 * (3 * (fact 2)))) 6 * (5 * (4 * (3 * (2 * (fact 1))))) 6 * (5 * (4 * (3 * (2 * (1 * (fact 0)))))) 6 * (5 * (4 * (3 * (2 * (1 * 1))))) 6 * (5 * (4 * (3 * (2 * 1)))) 6 * (5 * (4 * (3 * 2))) 6 * (5 * (4 * 6)) 6 * (5 * 24) 6 * 120 720
Each line represents the set of computations that needs to be stored in computer memory at any stage of execution. You can see that the memory needed grows with the value of n. (It doesn’t matter much for factorial, because merely storing the answer also uses space that’s O(n) in the size of the input. That’s beside the point.)
An alternative implementation is:
fact n = fact' 1 n where fact' a 0 = a fact' a n = fact' (a * n) (n - 1)
This one is made tail recursive, by following a very common technique of keeping an accumulator as an explicit parameter (in this case, a) with an outer setup function. If you look back at the fourth installment, you’ll see that this is exactly what I did to turn the first implementation of gcdmany into the second one. I introduced an explicit accumulator parameter, and then used it to convert the routine into a tail recursive one with setup. In a strict language (like, say, the ML family of languages) that would be sufficient to turn the fact function into O(1) space.
The execution of this one (again, in a strict language) looks like this:
fact 6 fact' 1 6 fact' 6 5 fact' 30 4 fact' 120 3 fact' 360 2 fact' 720 1 fact' 720 0 720
This is much better! In our hypothetical strict language, we’ve solved the problem. In Haskell, though, it’s not quite good enough. Haskell, being a lazy language, won’t even do simple multiplication until we ask for the result. The execution, therefore, looks like this:
fact 6 fact' 1 6 fact' (1 * 6) 5 fact' (1 * 6 * 5) 4 fact' (1 * 6 * 5 * 4) 3 fact' (1 * 6 * 5 * 4 * 3) 2 fact' (1 * 6 * 5 * 4 * 3 * 2) 1 fact' (1 * 6 * 5 * 4 * 3 * 2 * 1) 0 1 * 6 * 5 * 4 * 3 * 2 * 1 6 * 5 * 4 * 3 * 2 * 1 30 * 4 * 3 * 2 * 1 120 * 3 * 2 * 1 360 * 2 * 1 720 * 1 720
And, oops! We’re back to using O(n) space again. This is the dreaded “space leak”.
Back to our gcdmany function. It, too, contains such a space leak. Instead of computing intermediate greatest common divisors as it applies arguments one by one, it actually accumulates the GCD operations for the whole list, before it starts to evaluate any of them.
We’d like to fix this. Haskell provides a way to do so: selective strict evaluation. In other words, we can tell Haskell that certain things must be evaluated strictly rather than put off for later. Doing this never changes the answer to a successful computation, but it can do two things:
The first effect is what we want here. The second, though, is interesting as well. Consider the following code:
mylist = [1, 2, 3, undefined] main = print (length mylist == 4)
Running this program will print True. In a strict language, this program would fail because the fourth element of the list is undefined. (To elaborate more, you might design a language to avoid failing if the term is known to be undefined. But what if teh function that computes it goes into an infinite loop? In Haskell, it wouldn’t matter because we never use that value. In a strict language, it would matter quite a lot!)
In the case of gcdmany, though, I can verify that I am not doing anything that might cause the system to go into an infinite loop or fail. It is, therefore, safe, to replace it with a strict version.
The first way of doing this is standard Haskell 98. It looks like this:
gcdmany :: forall a. Integral a => [a] -> a gcdmany xs = gcdmany' 0 xs where gcdmany' p [] = p gcdmany' p (x:xs) = (gcd1 p x) `seq` gcdmany' (gcd1 p x) xs
The magic function seq makes this all work. If you read the description of seq in the standard library, it’s quite confusing for beginners to Haskell. What it really means is that is evaluates the first bit of code (so, as a side effect, if the first bit goes into an infinite loop or throws an exception, it’ll do that too), and then goes on with the second one as if the first one didn’t exist. However, evaluating the first one means that instead of it being work put off until later, the result is already known. That gets us back to the nice O(1) memory usage.
The second way of doing this does the same thing, but is shorter and only works in GHC with the -fallow-bang-patterns option.
gcdmany :: forall a. Integral a => [a] -> a gcdmany xs = gcdmany' 0 xs where gcdmany' !p [] = p gcdmany' !p (x:xs) = gcdmany' (gcd1 p x) xs
The ! in front of the argument p does the same thing as seq but is much nicer to read.
Both of these require that we write out gcdmany by hand, though. Before, we were able to make it a lot nicer using foldl. The third option fixes that. It’s not part of the standard Haskell 98 prelude, but it does exist in GHC.
gcdmany :: forall a. Integral a => [a] -> a gcdmany = foldl' gcd1 0
The foldl’ function is just a strict version of foldl. It does exactly what we want. So with all those words, the solution was achieved by adding one small character. (Why, though? Because of the ability to use higher order functions to do functional abstraction!)
In general, one wants to use foldl’ when folding over an operation that produces a distinct result, and foldr when folding with a data constructor or an operation that builds more complex data structures. In the first case, you avoid space leaks. In the second case, you can use lazy evaluation to only build a piece at a time of the data structure you’re operating on, and thus save even more memory through lazy evaluation.
That’s all for now. Next time, I’ll look at some cool connections between Euclid’s GCD algorithm and the Fibonacci numbers, and then move on to chapter 2.
Here’s a pretty tricky bug I hunted down yesterday, with the help of oerjan on #haskell. If you weren’t there to see, it’s a lot of fun. Here’s the code:
foo n = product [24 - n .. 23] `div` product [1 .. n] bar n = length (takeWhile ((< 1000000) . foo) [1 .. n]) baz n = if bar n == n then n else bar n
And here’s the result:
cdsmith@devtools:~/haskell/projects$ ghci Strange.hs GHCi, version 6.7.20070703: http://www.haskell.org/ghc/ :? for help [1 of 1] Compiling Main ( Strange.hs, interpreted ) Ok, modules loaded: Main. *Main> bar 10 9 *Main> baz 10 10
The question: Why does bar 10 evaluate to something different from baz 10? By looking at the definition of baz, you might convince yourself that baz and bar do the same thing. Apparently not.
Scroll down for the answer.
The problem is numeric overflow. The standard library length function has a result type of Int. The intermediate result in foo is too large to fit in the data type Int. The bar function has type forall a. Integral a => a -> Int. When calling bar directly, this isn’t a problem because the argument defaults to type Integer and foo is evaluated entirely in arbitrary precision numbers. However, when baz is evaluated, there is a comparison bar n == n, so the result of bar must have the same type as its argument. The result is known to be Int because the return type of bar is not polymorphic. So the argument n must also be of type Int. Then foo‘s inferred type is Int -> Int, which causes the overflow.
All clear now?
If one wanted to take a lesson from this, it would be one of the following:
length shouldn’t use dangerous types like Int.Int should be made less dangerous by throwing an exception on overflow.Let’s take a poll: which lesson do you draw from it? Post in the comments.