Sententia cdsmithus

March 7, 2008

Functions and Partial Orders

Filed under: Haskell, math — cdsmith @ 4:03 pm

In a few other blogs (especially here), there has been some discussion of the mathematical foundations of functions in programming languages, starting from topology. Just for fun, I will approach from the other direction, describing functions that can be defined in various languages in terms of partially ordered sets. Of course, for those of you who know this already, then you can go on and define topologies on these posets a la Dana Scott, so it amounts to the same thing. But this way has more pretty pictures.

Functions and Partial Information

Suppose we want to look at functions from the natural numbers to the natural numbers. (By the natural numbers, I mean non-negative integers.) There are, of course, infinitely many such functions. Let’s pick a few that are interesting:

  1. The constant function f(x) = 5.
  2. The identity function g(x) = x.
  3. The successor function h(x) = x + 1.
  4. The doubling function p(x) = 2x.
  5. The square function q(x) = x2.

That should give us a good start. These functions are all quite well-defined over the integers. Indeed, I can implement them all in my favorite programming language, and I’ll get exactly that (note: only because my favorite programming language does arbitrary precision integers, of course.)

The next question is this: what happens if I don’t know everything about the input to the function? What can I infer about the output given partial information about the input? Here’s some of the things I might ask of that sort:

x x = 2 x is even x is at least 5 x could be anything
f(x) f(x) = 5 f(x) = 5 f(x) = 5 f(x) = 5
g(x) g(x) = 2 g(x) is even g(x) is at least 5 g(x) could be anything
h(x) h(x) = 3 h(x) is odd h(x) is at least 6 h(x) is at least 1
p(x) p(x) = 4 p(x) is even p(x) is at least 10
p(x) is even
p(x) is even
q(x) q(x) = 4 q(x) is even
q(x) is a perfect square
q(x) is at least 25
q(x) is a perfect square
q(x) is a perfect square

Getting back this sort of partial information is trickier than implementing the function over totally known arguments. That’s what this blog post is about.

Partial Orders and Up-Sets

The trick is to build a partial order on the domain of the function. A partial order is some relation, ≤, that has three properties:

  • Reflexive: For all x, xx.
  • Anti-symmetric: If xy and yx, then x = y.
  • Transitive: If xy and yz, then xz.

Everyone is familiar with the ≤ that means “is no larger than.” But there are other perfectly good definitions as well. For example “is a factor of” is a perfectly good meaning for ≤. Now here’s where the pretty pictures come in. We can draw a diagram for a partial order, where there’s an upward path from x to y precisely when xy.

For example, here is part of the diagram for the familiar partial order “is no larger than” on the natural numbers:

Familiar Partial Order on the Naturals

And here is part of the diagram for “is a factor of”:

Divisibility Order on the Naturals

Each of these partial orders contains some information about the relationships between natural numbers. In particular, one can identify certain properties of the natural numbers as belonging to up-sets: sets that have the property that if they contain something, they also contain everything that it greater than it. So “at least 5″ is the up-set of naturals in the familiar order that contains 5 and everything above it. We can call this the up-set generated by 5. In the second (divisibility) order, the up-set generated by 5 can be described as “divisible by 5″.

To use these orders to get information about partial inputs to functions, we need two things:

  1. Up-sets that capture the partial information we have.
  2. Functions that preserve that order.

Let’s shoot for the first one. Looking at the kinds of partial information we had:

  • x is even. This is the up-set generated by 2 in the divisibility order.
  • x is at least 5. This is the up-set generated by 5 in the familiar order.
  • x could be anything. This is the up-set generated by the bottom element of any partial order we like, so long as it has a bottom element.
  • x is a perfect square. We’ll need a quite complicated order for this one. We can order the natural numbers by the divisibility order on the least common divisor of the exponents of their prime factorizations. It turns out this isn’t even a valid partial order (it violates anti-symmetry), but I can treat it as a partial order on equivalence classes, and things work out. Then perfect squares are the up-set generated by [2] (the equivalence class of numbers whose gcd of the exponents in their prime factorization is precisely 2) in this order.

Once we’ve chosen a partial order whose up-sets capture the information we want, we have to verify that the function actually preserves the order. That is, if we take an up-set generated by x, and pass everything in it through the function, will the result live in the up-set generated by f(x). The successor function h(x) = x + 1 that we defined earlier has this property in the familiar order. If I have a set of all numbers bigger than 4, and I add one to them all, I get all the numbers bigger than 5. However, h(x) does not preserve up-sets in the divisibility order. If I have all numbers divisible by 3, and I add one to them all, there’s no guarantee that the resulting numbers will be divisible by 4! So I have to be careful about using an order-preserving function, if I want any kind of partial information from up-sets.

What a mess! Sure, if I can define the right partial order, and then prove that the function preserves up-sets on the partial order, and sometimes pull this trick with equivalence classes, and then I get to say something about the behavior of my function on partial information. But that seems less than fulfilling somehow. It feels like I’m working too hard, and now I can’t even define some functions, depending on which order I choose.

Sorting Out the Mess

So what do we do with all of this?

The first thing to do is to pick an order and stick with it. We’ll lose some partial information, but we’ll get to keep other partial information. The logical choice for natural numbers is the familiar order. It’s fairly common for me to want to know that a number is at least 3. It’s somewhat less common for me to want to know that a number is a multiple of 3. But I won’t assume any choice, and in a future blog post, I’ll demonstrate how to build numeric data types in Haskell that parallel several possible choices, and exhibit different laziness characteristics as a result.

The next thing we want to do is recover the ability to define all functions again. The problem with defining some functions is that there’s something “greater than” the number we’re looking at that won’t stay that way once we’re done. The obvious solution is to just make sure there’s nothing “greater than” a number. We can invent a whole set of pseudo-numbers: ~0, ~1, ~2, and so on. Then we’ll put the numbers themselves above that. Here’s the new set of natural numbers with the familiar order:

The Lazy Naturals by the Familiar Order

And voila! I can define all my favorite functions with the actual numbers. What remains is to extend my function onto the pseudo-numbers as well. In this order, the pseudo-number ~n means “at least n“. There’s always one safe way to do this: just map everything to ~0, the minimal element. However, if I still want to get partial information about the function, I need to do better.

Here’s the result for the divisibility order:

Lazy Naturals with Divisibility Order

Here we have the same thing, except that “~n” now means “divisible by n“. Again, I need to extend my functions to the pseudo-numbers. I can still map everything to the bottom element (in this case, “~1″), but this throws away any partial information I may have been keeping.

Finally, one more option. This corresponds to a partial order on an equivalence class, sort of like what came out of the perfect squares example. This one, though, is the trivial equivalence class that lumps everything together. The result is the strict natural numbers, in which I must know the entire number to get any information out of any (non-constant) function.

Strict Natural Numbers

Here only a constant function could do anything except map ~0 back to ~0. Therefore, unless your function is constant, the only way to get any information back from your input is to know the precise value of the input. This is an important arrangement of the natural numbers because for efficiency reasons, in the particular case of integers, this is precisely what all real programming languages do. We’re just using natural numbers as an example, though; the ideas here apply equally well to more complex structures where laziness exists.

It turns out, now, that once we’ve chosen a partial order for the pseudo-numbers, there is sometimes a best possible way to extend any function over the naturals to include pseudo-numbers. To determine the value of any function f at ~n, we’d like to find the greatest lower bound of f(x) for all x~n. Now if the partial order we chose happens to be something called a complete meet-semilattice (all of the orders we’ve talked about are), then that greatest lower bound exists.

Connections to Programming Languages

As you may have guessed by now by my occasional references, this is all connected to laziness of functions in programming languages. I had intended to write about that now, but this has gotten too long already, so I am instead going to stop here and finish the other half at a later time. Hope you enjoyed it!

January 7, 2008

What do politics and programming languages have in common?

Filed under: Uncategorized — cdsmith @ 2:37 pm

This is a blog entry about programming languages, but I’m going to start out talking about politics.

Trigger Words in Politics

I’m currently volunteering for the Barack Obama campaign for president in the United States.  If there’s a single thing one learns quickly in politics, it’s this: ideas themselves matter a lot less than getting people to remember them.  It’s sad, because people who care about the important problems facing the government veritably thrive on those ideas.  But the other 90% of people who ultimately make the decision in the end will have trouble just keeping straight who says what.

The solution: trigger words.  Candidates express their ideas, but they always attach a word or short phrase.  The word or phrase is far too short to ever really capture any idea worth its salt; but it’s what people will remember.  So Barack Obama talks about “the politics of hope.”  He says “change” a lot.  And a good number of people will know that Barack advocates “a politics of hope” and “change” without being able to tell you what that actually means.  The words, in fact, act as a kind of subconscious repository of those ideas that people don’t mention anyway.  When Barack talks about the need to respect the dignity of people around the world instead of barricading against them as potential terrorists, he throws in the phrase “politics of hope,” and people add that to the ideas they associate with the phrase.  When he talks about banishing corporate lobbyists from the White House, or refers to his legislative work to stop unseemly relationships between members of Congress and lobbyists, he says “change” and this gets added to the meaning of change.  People remember the good feelings they had when they heard these terms before.

Of course, two can play at that game.  Since “the politics of hope” is in the end a vague term with very little meaning, it’s open for twisting and manipulation.  So it happens that, every time Barack Obama says he doesn’t think something will work, or that something is a bad idea, the response from other campaigns is “what happened to the politics of hope, eh Barack?”  And now that Barack did very well in the Iowa caucus, other candidates are scrambling to stand for “change”… but one notable exception, most of them are not talking about lobbyists and centers of political power, but rather about their own pet issues.  So it happens that one can have a spirited debate over trigger words without ever touching on actual policy differences.  Yet, one must defend one’s trigger words.  It would be ridiculous for Barack to respond that change and the politics of hope are just words and let’s forget them all; because those words are the repository of people’s good will toward the whole campaign.

Okay, yeah.  I promised this was about programming languages.  Here we go.

Trigger Words in Programming

So it was in this context that I came across a discussion across several blogs, including here and here.  I tend to agree with these authors and most issues, but I have to object that they are throwing a trigger word under the bus in the process, and that’s a serious mistake.  That word is “intuitive”.

Yes, intuitive is basically a trigger word.  People have seen languages where you have to keep looking everything up in a reference manual, or else memorize large amounts of information (e.g., Visual Basic).  And people have seen languages where things just seem to make sense (insert favorite language here).  And they prefer the latter kind of language, which is widely identified as being intuitive.  It’s a mistake to say that “intuitive” means familiar.  If anything, it means something a little closer to elegant; but I’d go for a new phrase I’m about to invent.  Intuitive means idea-efficient.  It means that there exists a small set of ideas whose straight-forward application can lead to good solutions to a large set of problems.  The principle of least surprise is, therefore, certainly a part of it.  So is the central design principle of Scheme, that “Programming languages should be designed not by piling
feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.”  In the end, though, the exact meaning of the word is something that’s experienced, not explained.  And that’s why the word itself is of such value; because I don’t want to rewrite this paragraph every time I want to mention the concept of an intuitive language.

Calling something unintuitive simply because it’s unfamiliar is a lot like saying, “What ever happened to the politics of hope, eh Barack?”  And responding that intuitiveness is overrated is a lot like responding with, “The politics of hope doesn’t matter, after all.”

So that’s my plea.  Respect the idea of an intuitive language.  Defend it, and point out that it doesn’t mean “familiar.”  Otherwise, we’re throwing away experience that could lead people to make wiser decisions about their tools.

December 9, 2007

Some Basic Stuff: The Writer Monad

Filed under: Uncategorized — cdsmith @ 11:47 am

Edit: Clarified some of the code.

Edit II: Comment about laziness.

In this post, Magnus Therning gives a number of solutions to the n-queens problem in Haskell.  The problem is just to arrange n queens on an n x n chess board so that neither is threatening any of the others.  It’s a classic problem that’s solved with backtracking.  That is, you try something, and if it doesn’t work you back up and try something else.  Simple enough.

It reminded me that when I was first learning Haskell, I wrote an n-queens solution using the Writer monad.  This monad just augments the pure functional environment with a new instruction called “tell”.  For those new to Haskell but with some knowledge of Python, consider it equivalent to Python’s “yield”.  Here’s the code:

import Control.Monad
import Control.Monad.Writer
import Data.List

diagonal (x1,y1) (x2,y2) = x1 + y1 == x2 + y2
                        || x1 - y1 == x2 - y2

nqueens n = execWriter $ f [1..n] 1 []
    where f [] _ ps = tell [ps]
          f cs r ps = forM_ cs $ \c ->
                          unless (any (diagonal (r,c)) ps) $
                              f (delete c cs) (r + 1) ((r,c):ps)

And that’s it.  The diagonal function determines if two positions (as order pairs of row and column) are diagonal from each other.  The nqueens function then uses execWriter to run something in the Writer monad and extract the list of answers that were “told” during its execution.  Then I call the workhorse function, which for lack of a better name is just called f.  This function assigns one queen to each row, from the top down.  It is recursive, so it tracks state in its parameters: the first is a list of free columns, the second is the row number, and the third is the list of positions where queens have been placed so far.  If there are no more free columns, then we must have placed all the queens, so we tell the solution.  If there are free columns, we loop through them and try putting a queen in each free column at the current row.

Formally, this returns a list of all solutions to the problem. However, it’s a lazy list; so if you only use the first one, the remainder of the list will never be calculated. So if you only want one solution, just do head (nqueens 8), and you’ll get the first solution.

Okay, nothing phenomenal here.  Just a quick example of some Haskell code.

November 29, 2007

Some Playing with Derivatives

Filed under: Uncategorized — cdsmith @ 2:31 pm

This is a summary of what I’ve been playing with in case people find it interesting.

In general, there are three ways to find the derivative of a function:

  1. Do the symbolic manipulation of formulae that we all learned in school when we were 16 years old.  This assumes that one has the function as an algebraic expression of some kind in terms of known quantities.
  2. Do it numerically, by computing (f(x + h) - f(x)) / h for some small value of h.  This suffers from numerical inaccuracies, and will generally cause rises in the blood pressure of numerical analysts.
  3. Calculate the value of the function and its derivative simultaneously at a given point of evaluation.

The last option seems to have a lot of names: automatic differentiation, algorithmic differentiation, and a few more.  Just for fun, I wrote an implementation of a very simplified version of it in Haskell.

The Implementation

module AD where

data AD a = AD a a

instance Eq a => Eq (AD a) where
    (AD x dx) == (AD y dy)    = x == y

instance Show a => Show (AD a) where
    show (AD x dx) = show x ++ " + " ++ show dx ++ " eps"

instance Num a => Num (AD a) where
    (AD x dx) + (AD y dy)       = AD (x + y)          (dx + dy)
    (AD x dx) - (AD y dy)       = AD (x - y)          (dx - dy)
    (AD x dx) * (AD y dy)       = AD (x * y)          (dx * y + x * dy)
    negate (AD x dx)            = AD (negate x)       (negate dx)
    abs (AD 0 _)                = error "not differentiable: |0|"
    abs (AD x dx)               = AD (abs x)          (dx * signum x)
    signum (AD 0 _)             = error "not differentiable: signum(0)"
    signum (AD x dx)            = AD (signum x)       0
    fromInteger i               = AD (fromInteger i)  0

instance Fractional a => Fractional (AD a) where
    (AD x dx) / (AD y dy)       = AD (x / y)          ((dx * y - x * dy) / y^2)
    recip (AD x dx)             = AD (1 / x)          ((-dx) / x^2)
    fromRational x              = AD (fromRational x) 0

instance Floating a => Floating (AD a) where
    pi                          = AD pi        0
    exp (AD x dx)               = AD (exp x)   (dx * exp x)
    sqrt (AD x dx)              = AD (sqrt x)  (dx / (2 * sqrt x))
    log (AD x dx)               = AD (log x)   (dx / x)
    (AD x dx) ** (AD y dy)      = AD (x ** y)  (dx * y * (x ** (y-1)) +
                                                dy * (x ** y) * log x)
    sin (AD x dx)               = AD (sin x)   ( dx * cos x)
    cos (AD x dx)               = AD (cos x)   (-dx * sin x)
    asin (AD x dx)              = AD (asin x)  ( dx / sqrt (1 - x^2))
    acos (AD x dx)              = AD (acos x)  (-dx / sqrt (1 - x^2))
    atan (AD x dx)              = AD (atan x)  (dx / (1 + x^2))
    sinh (AD x dx)              = AD (sinh x)  (dx * cosh x)
    cosh (AD x dx)              = AD (cosh x)  (dx * sinh x)
    asinh (AD x dx)             = AD (asinh x) (dx / sqrt (x^2 + 1))
    acosh (AD x dx)             = AD (acosh x) (dx / sqrt (x^2 - 1))
    atanh (AD x dx)             = AD (atanh x) (dx / (1 - x^2))

diff           :: Num a        => (AD a -> AD t)                     -> a -> t
diffNum        :: Num b        => (forall a. Num a        => a -> a) -> b -> b
diffFractional :: Fractional b => (forall a. Fractional a => a -> a) -> b -> b
diffFloating   :: Floating b   => (forall a. Floating a   => a -> a) -> b -> b

diff f x           = let AD y dy = f (AD x 1) in dy
diffNum f x        = let AD y dy = f (AD x 1) in dy
diffFractional f x = let AD y dy = f (AD x 1) in dy
diffFloating f x   = let AD y dy = f (AD x 1) in dy

Essentially, this declares a data type to represent the pair of a number and its first derivative.  It then defines how to perform primitive math operations on that data type, for three of Haskell’s numerical type classes.  Note that basically the entire implementation consists of talking about how derivatives fare in the face of certain operations.  In fact, you may recognize a lot of it from the algebra of differentials in calculus.  For example, if z = xy, then dz = x dy + y dx, and so forth.  (One quick note, because someone else asked me about this.  The exponentiation operator ** is complicated because both the base and the exponent could be non-constant.  So it must use both the power rule and the exponentiation rule.)

My Argument With the Type System

The last section is certainly weird.  It defines four identical functions, because the type system gets in the way of doing what I really want.  What I would like is for the diff function to map functions over the numeric typeclasses to other functions over the same type classes.  For example, the derivative of a function with type (Num a => a -> a) will itself have type (Num a => a -> a).  The derivative of a function with type (Floating a => a -> a) will itself have type (Floating a => a -> a).  But the only way that works is for diff to get a type that depends on AD.  That’s confusing, because the type AD is an implementation detail; and I’d rather not even export it.  The user of this module won’t think they’ve written a function for the type AD; they’ve just written a function over any type that is an instance of, say, Floating.

To illustrate something closer to what would make me happy, I’ve also given the same function three better types, in diffNum, diffFractional, and diffFloating.  These rank 2 types provide the right level of abstraction; but unfortunately, it’s impossible to write only one function that works for all three numeric type classes.  Here’s what I’d really like to be able to say:

diff :: forall A a. (A :> Floating, A a, Num a) => (forall b. (A b) => b -> b) -> a -> a

Here, I’m using :> to mean “is a superclass of”, and A is meant to quantify over type classes. In other words, the type says that given any type class that’s a superclass of Floating, I can pass in a function that’s polymorphic over that type class, and get back its derivative (interpreting its domain and range as being, at the very least, numbers). But that’s not possible, so the functions above all have some of the advantages, and I’ve written all of them.

Does It Work?

Glad you asked.  Yes and no.

Here’s the yes part.

$ ghci -XRankNTypes autodiff.hs
> let f x = 3 * x^2 + 2 * x - 3
> let f' = diff f
> f' 3
20
> let g x = 1 / sqrt (exp x - asinh x)
> let g' = diff g
> g' 3
-0.12660691544665373

So far, so good.

The “no” part of the answer is because there are a few specific situations in which the code will give the wrong answer.  First, the code I’ve written often gives a derivative when the derivative is really undefined.  Sometimes this is me being sloppy; for example, the log of -1 is undefined, but this code provides a derivative anyway.  I could fix this if I were willing to make my code a little uglier.  Second, uses of fromRational and fromInteger must be constants, because the code assumes they are.  It was pointed out by Luke Palmer that I can define a function of the correct type (Num a => a -> a) by using the Show superclass to convert a value to a string, then read it as an Integer, do all sorts of things to it, and then convert the result back to the type a by using fromInteger.  Doing so will give a derivative of zero, even if your function really has a different derivative.

A more nefarious example is this:

f x = if x == 1 then 1 else x^2

This is the same function as f x = x^2.  However, asking for the derivative at x = 1 will return 0, because the function returns the constant 1, whose first derivative is 0.  In particular, the technique runs into problems right on the boundaries of intervals where the function is defined piecewise.  It doesn’t appear to me that there’s a particularly good way of dealing with this, except to process the source code and modify if statements to refuse to calculate derivatives at those locations.  That’s far beyond the scope of a little playing with overloading, so I have no intention of doing it.

Extending to Other Cool Stuff

If we just got functions computing values, this wouldn’t be very visual.  In the effort to avoid introducing GUI libraries, I’ll play the old trick of graphing functions sideways in ASCII art.  Let’s look at the second function in my example above… the weird one with square roots, exponentials, and inverse hyperbolic sines.

Here’s a test program.

import AD

graph f ymin ystep xs = mapM_ (putStrLn . flip replicate '*'
                                        . round
                                        . (/ ystep)
                                        . (subtract ymin))
                              (map f xs)

main = do let g x = 1 / sqrt (exp x - asinh x)
          graph g               0       (1/75)  [0.0, 0.1 .. 4.0]
          putStrLn "------------------------------------"
          graph (diff g)        (-0.5)  (1/175) [0.0, 0.1 .. 4.0]
          putStrLn "------------------------------------"
          graph (diff (diff g)) (-0.75) (1/75)  [0.0, 0.1 .. 4.0]

That just defines our function g again, and graphs it and its first two derivatives using ASCII art.  Here’s the result.

***************************************************************************
***************************************************************************
**************************************************************************
*************************************************************************
***********************************************************************
*********************************************************************
*******************************************************************
****************************************************************
*************************************************************
**********************************************************
*******************************************************
****************************************************
*************************************************
***********************************************
********************************************
*****************************************
***************************************
*************************************
***********************************
*********************************
*******************************
*****************************
***************************
**************************
************************
***********************
**********************
*********************
********************
*******************
******************
*****************
****************
***************
**************
*************
*************
************
***********
***********
**********
------------------------------------
****************************************************************************************
******************************************************************************
*******************************************************************
********************************************************
*********************************************
***********************************
***************************
**********************
******************
*****************
*****************
******************
********************
***********************
**************************
******************************
*********************************
*************************************
****************************************
*******************************************
**********************************************
************************************************
***************************************************
*****************************************************
*******************************************************
*********************************************************
***********************************************************
*************************************************************
**************************************************************
****************************************************************
*****************************************************************
*******************************************************************
********************************************************************
*********************************************************************
**********************************************************************
***********************************************************************
************************************************************************
*************************************************************************
**************************************************************************
**************************************************************************
***************************************************************************
------------------------------------
*******************
************
********
********
************
******************
***************************
*************************************
**********************************************
******************************************************
************************************************************
****************************************************************
*******************************************************************
*********************************************************************
**********************************************************************
***********************************************************************
***********************************************************************
**********************************************************************
**********************************************************************
*********************************************************************
********************************************************************
*******************************************************************
*******************************************************************
******************************************************************
*****************************************************************
*****************************************************************
****************************************************************
***************************************************************
***************************************************************
**************************************************************
**************************************************************
**************************************************************
*************************************************************
*************************************************************
*************************************************************
************************************************************
************************************************************
************************************************************
************************************************************
***********************************************************
***********************************************************

October 9, 2007

The Lack of Difference Between "What" and "How"

Filed under: Uncategorized — cdsmith @ 11:29 am

I’m just jotting down my thoughts in response to a conversation at work yesterday.  I’ve realized now that there is a whole category of programmers out there that see a very stark black-and-white distinction between two things:

  • What my code does
  • How my code works

I don’t see that distinction at all.  It’s fascinating to me to suddenly realize that many people filter their entire view of programming through lenses that separate those two nearly identical concepts into opposite sides of an impermeable divide.  It leads me to suddenly understand some of the nagging issues that live in the back of my head, such as:

  1. How can anyone work toward patents on software without it keeping them up at night riddled with guilt?
  2. How can anyone disagree that software running on vote-counting machines for government elections should be released into the public domain?
  3. Why are so many people so worried about whether their code passes tests, but not with whether their tests are testing the right things?

Reading articles from Reddit today, I came across James Golick’s statement:

When you explore in your implementation code, you’re trying to answer two questions at once: “What should my code do?” and “What’s the best way to implement my code’s functionality?”

And there’s the same thing again.

I’m not saying one way or the other of thinking about programming is “right”.  I’ll be the first to admit that there exists infinitely many ways of writing programs to compute the same function.  I just doubt whether it’s ever really possible to have a right answer to “what should my code do?” independent of the related question “how should it do that?”  Except for small finite sets of possible input, I just don’t think we generally have the language for it.

October 4, 2007

Revisiting "X Improves My Design"

Filed under: Uncategorized — cdsmith @ 10:25 am

I seem to have upset a few people.  Let me clarify a few points about the last article I wrote:

  • I don’t think unit testing or test-driven development is a bad idea.  I’ve done it for some time.  I’ve argued for it.  I’ve invited Mark Clark to speak at user groups to promote TDD.  If this post is read as a hit piece on the idea of test-driven development, that’s not how it was intended.  I’m not the right person to write such a hit piece, because I think TDD is a good idea.
  • Nothing in the article was intended to attack or insult Tim Ottinger.  It was meant to point out something we all do, and that I think we need to be made aware of.  I do it with functional programming and static typing.  Others do it with OO design, or certain kinds of version control, or whatever.  Tim wrote a good article, and I used it as a springboard to make a slightly related point.  I do apologize to Tim if my comments were upsetting at all.
  • I don’t actually believe that doing test-driven development requires completely giving up private variables.  I don’t think it necessarily means adding Java EE style abstract factories for all classes in the program.  I was pointing out, more than anything else, that we should just remain aware that these techniques have disadvantages.

The main thing to bring away is there there are very few techniques in programming that don’t make some things hard or awkward.  Denying that fact is dishonest, and does nothing to help the advocacy of a technique.  Acknowledging it brings us back to real people balancing real and often conflicting concerns in the context of their various projects.  This is the essence of software development, and it’s directly opposed to the human tendency to pick a side and defend it against all comers.

I remain convinced that test-driven development will lead to some design compromises in other areas.  I’ve seen it in my projects, and I think Tim’s experience basically says the same thing.  I still think it’s best that we describe these compromises as what they are.  It’s important to resist silver-bullet-ism in a field that’s already gone too far down that path.

September 29, 2007

The "X Improves My Design Anyway" Myth

Filed under: Uncategorized — cdsmith @ 10:50 am

It’s amazing how often the bandwagon impulse overrides clear thinking.  Perfectly intelligent people will idiotically claim that global warming isn’t happening, because the political party that represented their views on abortion happens to say so.  Compassionate and considerate children often turn into bullies in junior high school because they need to belong to a group.  And in software development, advocates of programming techniques will adopt a masochistic kind of “I deserved it anyway” attitude toward the things that it makes hard.

An example of the last variety popped up recently in Tim Ottinger’s blog on TDD.  The way it manifests itself here is in the “X improves my design anyway” myth.  This myth says that even though I have to do things a harder way now, it’s really good for me anyway and I should have been doing it all along.  Much like castor oil is good for children, I suppose.  Tim Ottinger now advocates changing style and convention guidelines in companies to say that you should be doing things the hard way anyhow, so that the problems caused by TDD just sort of magically go away.

To be clear, I’m not saying TDD is a bad idea.  I’m saying that competent software developers ought to make rational choices, by considering the advantages and disadvantages of things for a particular purpose.  This is directly opposed to the idea that we ought to stack the deck by defining away the disadvantages and pretending they are good design that we just didn’t recognize until now.

Among the things Tim thinks are now good design:

  • Making member variables non-private, so that tests can get to them.  Also, weakening the distinction between public and private interface (apparently, in order to help people understand the code… huh?)
  • Wrapping basically every new object allocation in a factory method with some kind of abstraction for changing the implementation on the fly (in other words, the thing that makes J2EE hell).  Tim says “Calling a concrete class constructor inside the body of a method now makes you cringe.”  That doesn’t look like a good way to do things to me.

And after all this, the conclusion is: “For the most part, TDD forces you to start doing the things you always should have done…”  Bingo!  The myth in action.

Of course, Tim isn’t just stupid.  There is no one who doesn’t do this.  Other examples include:

  • Declaring all your variables in a block at the top of a Pascal procedure is a little inconvenient, but choosing your local variables carefully in advance of writing the code improves your design.
  • It’s hard to map arbitrary data structures in your application language to the relational database, but designing your data in a relational way first improves your design.
  • Monads can be verbose and difficult to use in Haskell, but that makes people wary about using monads, which improves design.
  • Multiple inheritance is impossible in Java, but untangling your multiple inheritance hierarchies improves your design.

This myth happens when we adopt a philosophy or approach to building software.  Invariably there are certain tasks that are made more difficult by the new approach, and there are somewhat painful workarounds to these scenarios.  However, because the word “workaround” is a bit of an ugly one, we want to sweep this under a rug.  The easiest way to do this is to take the painful workaround, and claim that it was the right way to go about things anyway.  Then we get “X improves my design anyway” - where X is some procedure used to work around a limitation of the approach.

Of course, X very well may improve your design.  If that’s the case, it’s worth the time to step back from the context (programming language, TDD, relational databases, etc.) that motivated you to do X in the first place, and see whether X makes sense without that context.  If so, great!  Tell people about it, and maybe it’ll catch on as a design technique in its own right.  But if it only makes sense in the context of your technique, then it’s time to start calling it what it is: a workaround for a shortcoming of the technique.  (Hint: if you just can’t imagine writing good code without your technique and so think this exercise is futile, then you are too far on the bandwagon already; let somebody else decide.)

In Tim’s case, a lot of his recommendations simply don’t stand the test of scrutiny.  The debate over public member variables happened, and was lost.  If TDD requires more public member variables, that’s a workaround for a limitation of TDD; it’s not good design.  J2EE (sorry, JavaEE!) is needlessly complex and difficult to use, and people are moving to other languages and frameworks because of it.  If TDD requires recycling the same endless factories and “okay, how the hell do I get an instance of this interface now?” that plagued JavaEE, then that’s also a limitation of TDD as well.  Some other recommendations do stand the test of time.  Ultimately, though, the problem is the attempt to take these decisions out of the hands of competent developers, and put them in the hands of dead convention and style guidelines.  This is denial - pretending that there is no decision to be made, when there clearly is a decision.

August 7, 2007

When Deception and Lies Became Acceptable

Filed under: Uncategorized — cdsmith @ 11:04 am

This is not a technical post; I realize it will probably appear on several syndication sites where people expect to read technical content.  Sorry about that;you can skip it.

I’m really amazed at what people will put up with in old-fashioned post office style mail.  With email, everyone and their mother knows that spammers are scum and that it’s immoral to do business with them.  By regular mail, perfectly respectable companies don’t just send unsolicited mail (snail-spam); they send deceptive and dishonest snail-spam, and people take it in stride and don’t think anything is out of the ordinary.

I just checked my mail today.  In addition to regular junk mail, here’s what I found:

  • A letter from Chase, a bank and credit card provider.  On the front, it says “Important Information Enclosed” and “This may be your only notice.”  It’s an advertisement, but they didn’t want me to think that, at least until I got a little ways in.
  • A letter from Solstice Capital, a home loan company.  The envelope has official-looking notices warning the postmaster (yes, the postmaster!) not to bend or tear the letter.  On the inside, the letter is designed to look like a court filing.  The second sheet has something that looks so much like a check that they had to write “Not a Check” twice in inconspicuous places around the edge.  (The “looks like a check” deception is among the favorites of the less ethical snail-spammers.)

I sat back and wondered how we got to this point; where a respectable gentleman will swear about spam, but will ignore the lies and scams in their regular mailbox.  It’s probably because no one pays attention, and no one says anything.

Here’s my commitment: from now on, when I receive deceptive snail-spam, I will call, speak to the highest-level supervisor I can, and let them know exactly why I will never do business with them again.  I’m also keeping a list of companies that do this, and I encourage you to help me maintain the list.

July 29, 2007

37 Reasons to Love Haskell (playing off the Ruby article)

Filed under: Haskell — cdsmith @ 2:33 pm

Here’s an article on reasons to like Ruby.  It’s actually very well written and persuasive.  So I took it as a challenge.  How well would Haskell fare when compared against this specific set of criteria, changing the points as little as possible?  The meanings will inevitably shift a bit, and readers will know that Haskell has its own set of advantages, of course.

The point of this exercise, though, is to see how Haskell does in the benefits mentioned for Ruby in that particular article.  It’s not a Haskell vs. Ruby; just a shameless theft of a set of criteria.  Should be fun.

  1. It’s object oriented.  Although it wasn’t designed for it, this paper points out that Haskell scores pretty well in support for object oriented programming features.  It’s not entirely clear that this is a good way to write code; but if it isn’t, it’s not because Haskell doesn’t provide the features or support you need to make it work.
  2. It’s pure. Haskell chooses to be pure in the functional sense, rather than the object-oriented sense.  The same ideas remain, though; there are no messy bits that don’t fit into the prevailing mode of thought.
  3. It’s a dynamic language. Yep, you heard right.  Projects like Yi and hs-plugins and lambdabot prove that it’s quite possible to write programs with runtime code loading and manipulation and other dynamic features in Haskell.  Indeed, the type system gives you a level of assurance that you haven’t made big mistakes along the way; something that’s quite likely when assembling code at runtime from many small pieces.
  4. It’s an interpreted language. Honestly, I’m having trouble understanding this point in the original; the Ruby article doesn’t ever say why this is a good thing.  It helps, of course, in that it lets you work with the program interactively, easily trying out bits at a time rather than having to write a new main method to do any testing.  Unlike Ruby, Haskell can also be compiled to handle any performance concerns; combining the advantages of both worlds.
  5. It understands regular expressions. The Text.Regex module in the Haskell standard library contains functions to do regular expression stuff.  It even uses Haskell’s operator mechanism to define operators like =~ to look more like other languages sometimes.
  6. It’s multi-platform. GHC, the major Haskell compiler, exists for Linux (many processors and distributions), Other UNIX-like systems (FreeBSD, NetBSD, OpenBSD, Solaris), many Windows variants (95 through Vista), and MacOS (Intel and PowerPC).  Nobody cares about MS-DOS :).
  7. It’s derivative. Haskell as a language borrows many of the best features from other languages, especially ML (its type system), and Miranda (its evaluation order).  It borrows libraries from lots of places.  There are Haskell bindings for wxWindows and GTK+, for example.  And yes, printf, too.
  8. It’s innovative.  If there’s any single widely used language today that has a claim to being truly innovative, it’s Haskell.  Haskell is practically a research playground, while still managing to be a practical language.  New language features have been the bread and butter of its progress: type classes, monadic I/O and computational environments; more recently: GADTs, associated types.  All of these are rare in other languages.
  9. It’s a Very High-Level Language (VHLL).  I actually doubt this term has any kind of meaning at all; but if it does, then Haskell has quite a good claim to fitting its meaning.  To the extent that the high/low level language distinction is meaningful, it’s about the ability of software to transform the program from something that’s useful to programmers into something that’s efficient and executable on machines.  More work is going on here in Haskell (e.g., see the Data Parallel Haskell project) than in any other language I’m aware of.
  10. It has a smart garbage collector.  I’m not sure this should even be worth a mention on the list.  Languages without automatic memory management are simply not candidates for writing serious application level code in the modern world.  For what it’s worth, though, yes Haskell does it.
  11. It’s a scripting language. Despite having virtually none of the properties that conventional wisdom associates with scripting languages, Haskell is quite usable for scripting.  Function composition and the monadic >>= operator provide the ability to combine pieces every bit as tersely as pipes; type inference eliminates the cost of static types.  This page talks about using Haskell for scripting, and this one describes the -e option to ghc, which lets you run Haskell code directly from the command line, and gives an example of using it.
  12. It’s versatile. As a general purpose programming language, you can do as much with Haskell as practically anything.  Scripting is simple, as described above.  Haskell also has a very advanced application server letting you write complex web applications; has been used to write a very nice window manager, is widely used in financial and other industries, and has been used to implement other languages — itself and the very first implementation of Perl 6.
  13. It’s thread-capable. And more!  There are basically two languages worth looking at for modern concurrent programming: Erlang and Haskell.  Haskell implements not just multithreading, but three different higher-level abstractions on top of multithreading: Software Transactional Memory, Data Parallel Haskell, and basic Parallel Haskell.  All three provide tools to make it easier to build correct threaded programs.  (Of course, Haskell offers less interesting traditional concurrency abstractions as well.)
  14. It’s open-source. All Haskell implementations have the source code fully available.  The major people involved hang out and regularly respond by both mailing lists (newsgroup interface available, too) and even IRC channels.  Haskell has one of the most famously open and friendly communities around, so you’ll fit right in!
  15. It’s intuitive.  Haskell doesn’t have a shallow learning curve, but that’s because you’re really learning things; not just learning a new syntax for the same programming you’ve done for years.  Haskell stays out of the way and lets your mind be expanded by the concepts you’re seeing instead of by the arbitrary choices of the language implementors.  That’s about as intuitive as you can really ask for.
  16. It handles exceptions well.  Unlike practically any other language, Haskell does the right thing for computations that have exceptional cases.  When working in a purely functional way, it gives you simple types like Either and Maybe that help to express those exceptional conditions in a functional manner.  When you’re working in an imperative way (e.g., in the IO monad, or any other MonadPlus environment) it provides exception handling, since that’s the right choice for the imperative style.
  17. It has an advanced Array type.  You don’t have to declare types at compile-time, or allocate memory in advance, or keep up with their length, or worry about indexing out of bounds (unless you explicitly choose to use unsafe operations).  Unlike many other languages, though, arrays aren’t syntactically preferred.  It’s just as easy to use a map, list, sequence, or many other things depending on what best suits your application.
  18. It’s extensible. You can write external libraries in Haskell or in C.  You can declare new instances (in other words add new behaviors to existing type signatures; gives you the core benefit of modifying existing classes) on the fly.  You can also add new operators (and I do mean new operators, not rehashing a very limited set of operators in error-prone ways like you do in C++) and use monads to define whole new computational environments if you like.
  19. It encourages literate programming. Even if you aren’t really doing literate programming, you can embed comments in your code which the Haddock tool will extract and manipulate.  You can look at type information even if it was never documented at all, simply by asking GHCi or Hugs for the type.  If you’re really into literate programming, though, major Haskell compilers understand literate source files that default to comments, and only include code delimited in specific ways (birdtracks, or latex begin/end commands).  This lets you, quite literally, compile the same document into either executable code or a details PDF user manual, without even having to do anything unusual to your latex source!
  20. It uses punctuation and capitalization creatively. Haskell enforces a punctuation scheme that makes the meaning of code clearer.  Types, classes, modules, and data constructors begin with upper-case letters.  Variables (including type variables) start with lower-case letters.  Fortunately, though, a lot of very important information is stored in the type — including, for example, whether a function result is Boolean or not, and even whether it destructively updates something!  This information is available at your fingertips either by reading documentation or just by typing :t at the GHCi command prompt, so you don’t have to repeat it over and over again in your code.
  21. (Some) reserved words aren’t. Haskell has very few reserved words compared to most other languages, because the language itself is conceptually simpler (not to be confused with easier to learn, as mentioned earlier).  A few keywords aren’t reserved, but I can’t pretend that’s a good thing.
  22. It allows iterators.  More generally, higher-order functions are useful in a large number of situations.  When combined with lazy evaluation, there are a plethora of powerful techniques for handling collections of data in Haskell.  Iterators loosely correspond to maps and folds, which are well supported and widely used.
  23. It has safety and security features. Haskell provides as advanced a set of fool-proof safety and security features as is found in basically any language.  Its type system allows you to express security constraints (even rather generalized ones) in embedded domain-specific constructs that can be enforced by the compiler, so you never even attempt to run unsafe code!  If a new kind of security issue arises, it’s generally possible to use the powerful type system to solve the problem without waiting for language support for something like tainted data.
  24. It (really) has no pointers. Unlike the trivial sense in which languages like Java are claimed to have no pointers by restricting the most dangerous operations on them, Haskell really has no pointers.  (It’s worth noting that the Java language specification wasn’t fooled into the “no pointers” myth, as a peek as the second sentence of 4.3.1 in the 3rd edition spec will make clear.)  In a pure functional language, there’s no difference between using pointers and actual values, so the compiler can make the decision based on performance concerns, rather than exposing the distinction between pointers and data to the programmer.  This is part of Haskell’s being a higher level language.
  25. It pays attention to detail. This is one of those Orwellian newspeak things where the Ruby article I’m working from claims one thing (attention to detail) and then describes the opposite (extreme sloppiness).  Haskell follows the real attention to detail picture here, though you can define your own synonyms (for values or types) if you like.
  26. It has a flexible syntax. Haskell has a truly flexible syntax in ways that Ruby can’t dream of.  It allows programmers to embed domain-specific languages that are significantly and fundamentally different from the imperative model of Ruby.  Parsers can be written by embedding context-free grammars right into the source code.  Logic processing can be added by embedding Prolog-style inference rules into the language.  An infinite supply of operators are available to facilitate these languages.  Higher-order functions and monadic environments are available to make them work well.
  27. It has a rich set of libraries. Available libraries is one place where Haskell is way above practically all languages with similar community sizes, and in the ballpark of a lot of mainstream languages.  There are libraries for practically anything, including several for GUI programming, web programming, transactional persistence, and plenty else besides.  I wouldn’t want to try to list them all, so here.
  28. It has a debugger. Okay, so this is the biggest stretch yet.  The development version of GHC (to become GHC 6.8) actually does have a debugger; but the debugging tools for released versions of Haskell are sketchy at best.  This is improving.
  29. It can be used interactively. This is true in the sense that GHCi exists.  It’s less true in the sense that someone can use it as their login shell.  Yes, it’s possible; no, it’s probably not a good idea.  This is interesting, though.
  30. It is concise. Haskell code is about comparable to many scripting languages.  Sometimes it’s a little longer.  Sometimes (generally when one can make good use of powerful abstraction techniques like monads and higher-order functions ina  program of large enough size that it makes a difference) it’s a lot shorter.  The xmonad window manager is about 500 lines of code.
  31. It is expression-oriented.  As a purely functional language, of course it’s expression-oriented.
  32. It is laced with syntactic sugar.  Haskell’s got all sorts of nice syntax in ways that really matter quite a lot: custom operators and fixity declarations, do blocks for monadic computation, etc.
  33. It has operator overloading.  In fact, operator overloading in Haskell is far nicer than in most other languages, because you can make up your own operators.  No more confusing bit shifting with I/O.  Within a given context, an operator means a specific thing; but at the same time, it can apply to your custom types (ah, the magic of type classes) and you can make up your own different operators to do different things concisely.  Reading a journal paper that uses dotted relational operators to mean something?  Great!  Use them in Haskell, too.
  34. It has infinite-precision integer arithmetic.  It also has infinite-precision rational arithmetic, and fixed-precision types in case you want those, too.  And you can build your own types somewhat concisely.
  35. It has an exponentiation operator.  Actually, it has three!  This is because there’s a difference in what the three of them mean in some cases, so you get your choice.
  36. It has powerful string handling.  It does, but more importantly it also has powerful list handling, and these list handling routines are all usable on strings.
  37. It has few exceptions to its rules.  Haskell’s semantics are basically the lambda calculus.  The whole language is remarkably consistent and behaves in consistent ways.  The semantics are even simpler and easier to understand than any imperative language, which has to do with the distinction between values and variables; or eager languages, which have to deal with the question of evaluation order.  This makes Haskell programs very easy to understand and manipulate safely.

So there you have it.

July 19, 2007

Parsing, CFGs, and Type Hacking

Filed under: Haskell — cdsmith @ 3:21 pm

This is what I have been playing with for the last day or so.

Haskell has a very nice monadic parser library for predictive parsing (parsec), and a decent lex/yacc-style parser and lexer generator suite (happy and alex).  That said, though, it’s more fun and educational to write code than to worry about what’s already been written, I set out to do something similar.  In particular, my goals are:

  • Write an LR parser, like that generated by parser generators.
  • Write readable grammars directly in Haskell source files, with no external parser generator utilities.
  • Let the type checker do its job by checking and inferring types for semantic rules.

I haven’t written the parsing code yet, but I did squeeze out something that I’m nearly happy with for the first two goals.

{-# LANGUAGE MultiParamTypeClasses     #-}
{-# LANGUAGE UndecidableInstances      #-}
{-# LANGUAGE NoMonomorphismRestriction #-}

All the language extensions I’ll be using.  This is the bare-bones list; the original list was eight or nine lines..  So, if you were wonder whether this is a good post for new Haskellers just learning the language, there’s your answer!

I need some way to represent variables (in the CFG sense).  In order to ensure that everything is well-typed, I need some way to keep track of the type of the semantic value associated with each variable.  Here’s what I did.

data Var a  = Var String

And a right-hand side of each rule will have a sequence of variables and terminals.  Again, to keep the type information around, I’ll need a sort of cons operator at the type level.  Here is that.  I defined a type, and also an operator that makes the type easier to use.

data RHS a b = RHS a b
(&) = RHS -- a convenient operator
infixr 5 &

And next, things get hard.  I’m using a multiparameter type class, in the fine tradition of Haskell type hacking, to represent a relation between types.  My relation is defined in the following comment:

{-
    (Action a b c) means the following:

    A production with a right hand side of type:    a
    may be associated with a semantic rule of type: b
    to produce a rule with semantic result type:    c
-}
class Action a b c | a b -> c, a c -> b

In other words, the Action class will be used to ensure that the result type of a grammar production, the right-hand side of the production, and the type of the associated semantic rule are all consistent with each other.  The functional dependencies simply assert that if you know the types of the right-hand side and the semantic rule, this is enough to determine what the result will be after applying the semantic rule; and that if you know the right-hand side and the result type, this is enough to determine what the type of the semantic rule needs to be.

There are a three base cases for this relation:

instance                   Action  (Var x)          (x -> y)  y

This says that if a production has the form A -> B, where A has semantic values of type y, and B has semantic values of type x, then the semantic rule must have type (x -> y).  If you think about it, this should make sense.

instance                   Action  Char             y         y

This rule says that if the right-hand side of a production is a signle character (a terminal, not a variable), the semantic rules should be a constant that matches the semantic type for the left-hand variable.

instance                   Action  ()               y         y

This describes the situation for empty productions (sometimes called epsilon or lambda productions).  Since leaving out any terms on the right-hand side of a rule isn’t an option, I use (), called “unit” to represent an empty right-hand side.

Those are the base cases.  (As a side comment, only the last one is strictly necessary; the first two are basically just syntactic convenience.  See below.)  Here’s how rules with more than one symbol on the right-hand side are handled.

instance (Action a b c) => Action  (RHS (Var x) a)  (x -> b)  c

The RHS operator defined earlier is used to build a list of sorts.  This rule says that adding a variable to the beginning of the right-hand side of a rule requires adding a parameter to the beginning of the semantic action, and that the result type stays the same.  This case handles right-hand sides that begin with a variable.

instance (Action a b c) => Action  (RHS Char    a)  b         c

Finally, this case handles right-hand sides that begin with a terminal (a character).  The types of the semantic rule and result don’t change, since a terminal is known in advance, so there’s no need for it to carry semantic information.

Some more syntactic convenience makes it easier to write grammars.  Here I abuse monads to take advantage of the special syntax.

{-
    The RuleSet monad is used to define rules of a grammar in a
    convenient 'do' notation.
-}

data Rule   = forall a b c. (Action a b c) => Rule (Var c, a, b)
data RuleSet x = RuleSet ([Rule], x)

instance Monad RuleSet where
    a >>= k  = let RuleSet (r1, v1) = a
                   RuleSet (r2, v2) = k v1
               in  RuleSet (r1 ++ r2, v2)
    return x = RuleSet ([], x)

So a rule consists of a left-hand side, a right-hand side, and a semantic rule.  They are constrained to match each other by the Action class defined above.  A RuleSet is basically just a writer monad for lists of rules, but I defined it by hand just for the fun of it.

Now time to define an operator for building rules inside the monad:

(==>) :: (Action a b c) => Var c -> a -> b -> RuleSet ()
(==>) lhs rhs sem = RuleSet ([Rule (lhs, rhs, sem)], ())

infixr 4 ==>

It took a while to pick this.  All the good arrow-like operators seems to be taken!  Nevertheless, it does the job we want fairly well.  Notice that even though I’m using an infix operator, there are three operands.  The normal usage looks like this:

lefthand ==> righthand $ semanticrule

You’ll see examples coming up.

The formal definition of a context-free grammar includes four things: a set of variables, a set of terminals, a set of productions, and a special start variable.  We’ve got three: variables are values of type Var a.  Terminals are values of type Char.  Productions are values of type Rule.  Next, I need a start symbol.  This is defined once, outside of the monadic environment in which rules are defined.  At the same time, I through away the result value of the monad, which is useless since I was just exploiting the syntax rather than building a real monad.

{-
    Grammar.  A grammar is a set of rules together with a
    start symbol.
-}

data Grammar a = Grammar (Var a) [Rule]
grammar s (RuleSet (rs, x)) = Grammar s rs

And that’s it!  Here’s a sample grammar so we can see it work.

{-
    Sample grammar.

    The parentheses in the let bindings are required: they force the rules
    to be monomorphic, which is needed for type checking to work properly.
-}
g = let ( expr     ) = Var "expression"
        ( term     ) = Var "term"
        ( termmore ) = Var "term operator"
        ( fact     ) = Var "factor"
        ( factmore ) = Var "factor operator"
        ( digit    ) = Var "digit"
        ( digits   ) = Var "digits"

    in grammar expr $ do
       expr     ==> term & termmore        $ \\x y   -> y x
       termmore ==> ()                     $ \\x     -> x
       termmore ==> '+' & term & termmore  $ \\x y z -> y (x + z)
       termmore ==> '-' & term & termmore  $ \\x y z -> y (x - z)
       term     ==> fact & factmore        $ \\x y   -> y x
       factmore ==> ()                     $ \\x     -> x
       factmore ==> '*' & fact & factmore  $ \\x y z -> y (x * z)
       factmore ==> '/' & fact & factmore  $ \\x y z -> y (x / z)
       fact     ==> '(' & expr & ')'       $ \\x     -> x
       fact     ==> digit & digits         $ \\x y   -> y x
       digits   ==> ()                     $ \\x     -> x
       digits   ==> digit & digits         $ \\x y z -> y (10*x + z)
       digit    ==> '0'                    $ 0
       digit    ==> '1'                    $ 1
       digit    ==> '2'                    $ 2
       digit    ==> '3'                    $ 3
       digit    ==> '4'                    $ 4
       digit    ==> '5'                    $ 5
       digit    ==> '6'                    $ 6
       digit    ==> '7'                    $ 7
       digit    ==> '8'                    $ 8
       digit    ==> '9'                    $ 9

(This is a modified grammar I had laying around from a set of compiler course notes.  It happens to have left recursion removed, but that’s immaterial really.)  There are no type declarations in the entire grammar.  Ask GHCi for the type of g, though, and it answers.

g :: (Fractional a) => Grammar a

It correctly inferred that the result type of expr must be Fractional.  How?  Because the third production for factmore uses the / operator.  This means that factmore must be Fractional, and the type ripples upward all the way to the start variable of expr.

The only thing I don’t like at the moment is the need to use an explicit monomorphic binding (the parentheses) to declare non-terminals.  If that’s not done, then the compiler thinks non-terminals can have different result types when used in different places, and the types it infers tend to be several pages long!  A nice solution to that would be good, but I’m happy with everything else!

Please comment if there’s something that can be improved.

« Previous PageNext Page »

Blog at WordPress.com.