This is the fifth installment of my series “Learning Haskell and Number Theory”, in which I work through Gareth and Mary Jones’ text, Elementary Number Theory, and translate its ideas and algorithms into Haskell.  In the last installment, I used the Haskell `foldl `function to convert a binary greatest common denominator into one that operates on a list.  This time, I fix a glitch in that implementation, the dreaded “space leak,” and look at strict and lazy evaluation in Haskell.

The implementation of `gcdmany `from last time looked like this:

```gcdmany :: forall a. Integral a => [a] -> a
gcdmany = foldl gcd1 0
```

This is a correct implementation, but not a good one.  Before we can understand why, though, we need a diversion through tail recursion.  Everyone who’s done much functional programming knows the importance of tail recursion.  It turns O(n) space into O(1) space.  For example, this implementation of factorial:

```fact 0 = 1
fact n = n * fact (n - 1)
```

uses O(n) space because it’s not tail recursive.  The evaluation of `fact` (in a strict language; I’ll talk about Haskell’s quirks later) would look like this:

```fact 6
6 * (fact 5)
6 * (5 * (fact 4))
6 * (5 * (4 * (fact 3)))
6 * (5 * (4 * (3 * (fact 2))))
6 * (5 * (4 * (3 * (2 * (fact 1)))))
6 * (5 * (4 * (3 * (2 * (1 * (fact 0))))))
6 * (5 * (4 * (3 * (2 * (1 * 1)))))
6 * (5 * (4 * (3 * (2 * 1))))
6 * (5 * (4 * (3 * 2)))
6 * (5 * (4 * 6))
6 * (5 * 24)
6 * 120
720
```

Each line represents the set of computations that needs to be stored in computer memory at any stage of execution.  You can see that the memory needed grows with the value of n.  (It doesn’t matter much for factorial, because merely storing the answer also uses space that’s O(n) in the size of the input.  That’s beside the point.)

An alternative implementation is:

```fact n = fact' 1 n
where fact' a 0 = a
fact' a n = fact' (a * n) (n - 1)
```

This one is made tail recursive, by following a very common technique of keeping an accumulator as an explicit parameter (in this case, `a`) with an outer setup function.  If you look back at the fourth installment, you’ll see that this is exactly what I did to turn the first implementation of `gcdmany `into the second one.  I introduced an explicit accumulator parameter, and then used it to convert the routine into a tail recursive one with setup.  In a strict language (like, say, the ML family of languages) that would be sufficient to turn the `fact` function into O(1) space.

The execution of this one (again, in a strict language) looks like this:

```fact 6
fact'   1 6
fact'   6 5
fact'  30 4
fact' 120 3
fact' 360 2
fact' 720 1
fact' 720 0
720
```

This is much better!  In our hypothetical strict language, we’ve solved the problem.  In Haskell, though, it’s not quite good enough.  Haskell, being a lazy language, won’t even do simple multiplication until we ask for the result.  The execution, therefore, looks like this:

```fact 6
fact' 1 6
fact' (1 * 6) 5
fact' (1 * 6 * 5) 4
fact' (1 * 6 * 5 * 4) 3
fact' (1 * 6 * 5 * 4 * 3) 2
fact' (1 * 6 * 5 * 4 * 3 * 2) 1
fact' (1 * 6 * 5 * 4 * 3 * 2 * 1) 0
1 * 6 * 5 * 4 * 3 * 2 * 1
6 * 5 * 4 * 3 * 2 * 1
30 * 4 * 3 * 2 * 1
120 * 3 * 2 * 1
360 * 2 * 1
720 * 1
720
```

And, oops!  We’re back to using O(n) space again.  This is the dreaded “space leak”.

Back to our `gcdmany `function.  It, too, contains such a space leak.  Instead of computing intermediate greatest common divisors as it applies arguments one by one, it actually accumulates the GCD operations for the whole list, before it starts to evaluate any of them.

We’d like to fix this.  Haskell provides a way to do so: selective strict evaluation.  In other words, we can tell Haskell that certain things must be evaluated strictly rather than put off for later.  Doing this never changes the answer to a successful computation, but it can do two things:

1. Make a computation run faster, and with less memory.
2. Make a computation fail or run forever, when it might otherwise have succeeded.

The first effect is what we want here.  The second, though, is interesting as well.  Consider the following code:

```mylist = [1, 2, 3, undefined]
main   = print (length mylist == 4)
```

Running this program will print `True`.  In a strict language, this program would fail because the fourth element of the list is undefined.  (To elaborate more, you might design a language to avoid failing if the term is known to be undefined.  But what if teh function that computes it goes into an infinite loop?  In Haskell, it wouldn’t matter because we never use that value.  In a strict language, it would matter quite a lot!)

In the case of `gcdmany`, though, I can verify that I am not doing anything that might cause the system to go into an infinite loop or fail.  It is, therefore, safe, to replace it with a strict version.

The first way of doing this is standard Haskell 98.  It looks like this:

```gcdmany :: forall a. Integral a => [a] -> a
gcdmany xs = gcdmany' 0 xs
where gcdmany' p []     = p
gcdmany' p (x:xs) = (gcd1 p x) `seq` gcdmany' (gcd1 p x) xs
```

The magic function `seq `makes this all work.  If you read the description of `seq` in the standard library, it’s quite confusing for beginners to Haskell.  What it really means is that is evaluates the first bit of code (so, as a side effect, if the first bit goes into an infinite loop or throws an exception, it’ll do that too), and then goes on with the second one as if the first one didn’t exist.  However, evaluating the first one means that instead of it being work put off until later, the result is already known.  That gets us back to the nice O(1) memory usage.

The second way of doing this does the same thing, but is shorter and only works in GHC with the `-fallow-bang-patterns `option.

```gcdmany :: forall a. Integral a => [a] -> a
gcdmany xs = gcdmany' 0 xs
where gcdmany' !p []     = p
gcdmany' !p (x:xs) = gcdmany' (gcd1 p x) xs
```

The ! in front of the argument `p` does the same thing as `seq `but is much nicer to read.

Both of these require that we write out `gcdmany `by hand, though.  Before, we were able to make it a lot nicer using `foldl`.  The third option fixes that.  It’s not part of the standard Haskell 98 prelude, but it does exist in GHC.

```gcdmany :: forall a. Integral a => [a] -> a
gcdmany = foldl' gcd1 0
```

The `foldl' `function is just a strict version of `foldl`.  It does exactly what we want.  So with all those words, the solution was achieved by adding one small character.  (Why, though?  Because of the ability to use higher order functions to do functional abstraction!)

In general, one wants to use `foldl' `when folding over an operation that produces a distinct result, and `foldr `when folding with a data constructor or an operation that builds more complex data structures.  In the first case, you avoid space leaks.  In the second case, you can use lazy evaluation to only build a piece at a time of the data structure you’re operating on, and thus save even more memory through lazy evaluation.

That’s all for now.  Next time, I’ll look at some cool connections between Euclid’s GCD algorithm and the Fibonacci numbers, and then move on to chapter 2.

1. Paul Johnson / Jul 6 2007 2:19 am

(gcd1 p x) `seq` gcdmany’ (gcd1 p x) xs

As I understand it GHC doesn’t do common subexpression elimination. So surely this will compute (gcd1 p x) eagerly, then throw away the result and do it again lazily.

What you would need to do is more like:

let g = gcd1 p x in g `seq` gcdmany’ g xs

But I may have misunderstood.

Paul

2. sorear / Jul 6 2007 10:07 am

There is a third possibility: Adding seqs can make your program vastly slower and use vastly more memory. After all – laziness is usually a good thing!

rot13 ch
| ch >= ‘a’ && ch = ‘n’ && ch = ‘A’ && ch = ‘N’ && ch

3. cdsmith / Jul 6 2007 12:59 pm

Paul, at least with GHC 6.6 and GHC 6.7 latest development version, the seq works as shown in the article. At least on [1 .. 1000000], gcdmany fails with a stack overflow without the seq, and succeeds with it. I’d be interested if you have a source for the need to use a let binding, but it doesn’t match what I see.

4. Cale Gibbard / Jul 6 2007 10:30 pm

While your solution with “(gcd1 p x) `seq` gcdmany’ (gcd1 p x) xs” will work to stop the stack overflow, it will also cause GHC to evaluate gcd1 twice as many times. GHC is *not* going to do CSE here — it only does a particularly trivial form of CSE that this isn’t covered by. The reason that it causes the stack not to overflow is that evaluating that gcd1 will inadvertently force the evaluation of p (supposing that your number type is strict), like the bang patterns do, but at a far greater expense.

You’ll effectively cut the number of times gcd1 runs in half by forcing it to share, using the “let g = gcd1 p x in g `seq` gcdmany’ g xs” variant.

To see this, try profiling the code and looking at what happens to the “entries” column when you make the change.

5. Don Stewart / Jul 6 2007 10:56 pm