July 30, 2011 / cdsmith

Arrow = Category + Applicative? (Part I)

It’s been said a few times that Haskell’s Arrow type class is really just the combination of Category and Applicative. Frankly, I’ve never really “got” the arrow type class, and I’ve always thought it looked rather convoluted and ugly — with it’s explicit use of tuple types and related weird contortions that remind me of the monstrosities that sometimes come out of lambdabot’s point-free converter. Reading papers about whether the originally stated axioms are even all necessary reinforces that idea. But Category and Applicative are very simple, easy, and intuitive. So this is very good news… supposing it’s true.

Before just crossing arrows off of the list of things to learn, though, we should understand the statement and verify that it’s true. Note that being true means not only that you can construct terms with the appropriate types, but that terms can be constructed in a way that gives inverse maps between the two concepts, and that the axioms on both sides follow from the other. That’s what we set out to do here.

In this post, part I, we talk about what this means, and establish the easy direction: that there is an obvious Applicative corresponding to any Arrow.

What Does It Even Mean?

First, we run into the difficulty that the original statement isn’t very precise. After all, Category and Arrow types have kind (* -> * -> *), while Applicative types have kind (* -> *). So clearly the same type can’t be an instance of all three type classes. So what we really mean is that Arrow is equivalent to a Category instance, plus a universally quantified applicative instance for any fixed domain (where I mean domain in the Category sense; the first type parameter of the Category type). That will be clearer if we write some declarations.

So, for the Arrow side, we have:

instance Category (:~>) where
    id    = ...
    (.)   = ...

instance Arrow (:~>) where
    arr   = ...
    first = ...

Since Arrow is already a subclass of Category, the Arrow axioms relate all of these terms together, and here I’ll slightly reorganize the Arrow axioms from their usual presentation to emphasize the role of the Category instance.

[C1] f                            = id . f
                                  = f  . id
[C2] f . (g . h)                  = (f . g) . h)

[A1] arr id                       = id
[A2] arr   (f . g)                = arr f   . arr g
[A3] first (f . g)                = first f . first g
[A4] first (arr f)                = arr (f `cross` id)
[A5] first f . arr (id `cross` g) = arr (id `cross` g) . first f
[A6] arr fst . first f            = f . arr fst
[A7] first f . arr assoc          = arr assoc . first (first f)

We used the following functions in the plumbing above:

(f `cross` g) (x,y) = (f x, g y)
assoc ((x,y),z)     = (x,(y,z))

On the Applicative side, we want the following

instance Category (:~>) where
    id    = ...
    (.)   = ...

instance Applicative ((:~>) a) where
    pure  = ...
    (<*>) = ...

Notice that the Applicative instance fixes the domain. It’s universally quantified, sure, but within any use of the Applicative terms, we’ll need to keep the domain the same. This is how we relate the instances for different kinds.

These are also subject to axioms:

[C1] f               = id . f
                     = f  . id
[C2] f . (g . h)     = (f . g) . h

[F1] pure id <*> v   = v
[F2] u <*> (v <*> w) = pure (.) <*> u <*> v <*> w
[F3] pure (f x)      = pure f <*> pure x
[F4] u <*> pure y    = pure ($ y) <*> u

The Category instance, and axioms [C1] and [C2], are given. The question is:

Given an Arrow instance and axioms [A1] through [A7], can we write an Applicative instance which satisfies axioms [F1] through [F4]?
Given an Applicative instance and axioms [F1] through [F4], can we write an Arrow instance which satisfies axioms [A1] through [A7]?
Are the maps between the two inverse to each other? That is, can we deduce from the relevant axioms that the result of substituting back and forth in either order gives back the original constants?

If we can do this, then Arrow becomes not particularly interesting, since it’s just an overly complicated way of expressing computations with Category and Applicative. This is what we’re trying to show.

Arrow to Applicative

In this part, we get from the Arrow axioms to the Applicative axioms. Assuming Arrow (and, of course, its superclass Category), we’ll define the following Applicative instance.

instance Applicative ((:~>) a) where
    pure x  = arr (const x)
    f <*> x = arr (uncurry (flip ($))) . first x . arr swap . first f . arr dup

We made use of two auxiliary functions

dup   x    = (x,x)
swap (x,y) = (y,x)

What we’re saying here is pretty straightforward.

First, Arrow’s arr and Applicative’s pure are both notions of purity, but Applicative’s notion is just a bit stronger. A map constructed with arr is effectively just a function, but at least it can depend on its input, if nothing else. The Applicative pure, on the other hand, considers dependence on the input, too, as if it were a sort of effect (keep in mind that having the input around at all is part of what we’re wrapping up in the applicative functor), so purity is a combination for Arrow purity, and also starting from a constant functor.

Second, the application part tells us that our notion of application is to split the “input” (in the categorical sense) and let both f and x process it, and then apply the resulting function to the resulting value. The order of processing matters here, since arrows can have side effects! The convention with most Applicative instances is that effects are performed left to right, so we arrange for f to be processed before x. It’s interesting, though, and will come up again later, that we could put the effects in the opposite order and still have a perfectly valid, though counter-intuitive, Applicative instance.

To verify that the Applicative instance defined here is a valid one, we’ll need to verify the axioms. We have four of them to check. In the following, I’ll apply the Category axioms freely without mention… in particular, I never indicate the grouping of composition, since it doesn’t matter, and I’ll collapse occurrences of id freely as well.

[F1] pure id <*> v = v

pure id <*> v
= arr (uncurry (flip ($))) . first v . arr swap . first (pure id) . arr dup            -- def. of <*>
= arr (uncurry (flip ($))) . first v . arr swap . first (arr (const id)) . arr dup     -- def. of pure
= arr (uncurry (flip ($))) . first v . arr swap . arr (const id `cross` id) . arr dup  -- [A4]
= arr (uncurry (flip ($))) . first v . arr (swap . (const id `cross` id)) . arr dup    -- [A2]
= arr (uncurry (flip ($))) . first v . arr (id `cross` const id) . arr dup             -- simplification
= arr (uncurry (flip ($))) . arr (id `cross` const id) . first v . arr dup             -- [A5]
= arr (uncurry (flip ($)) . (id `cross` const id)) . first v . arr dup                 -- [A2]
= arr fst . first v . arr dup                                                          -- simplification
= v . arr fst . arr dup                                                                -- [A6]
= v . arr (fst . dup)                                                                  -- [A2]
= v . arr id                                                                           -- simplification
= v . id = v                                                                           -- [A1]

I’ll skip the longer one for now and move to [F3].

[F3] pure (f x) = pure f <*> pure x

pure f <*> pure x
= arr (uncurry (flip ($))) . first (pure x) . arr swap . first (pure f) . arr dup                     -- def. of <*>
= arr (uncurry (flip ($))) . first (arr (const x)) . arr swap . first (arr (const f)) . arr dup       -- def. of pure
= arr (uncurry (flip ($))) . arr (const x `cross` id) . arr swap . arr (const f `cross` id) . arr dup -- [A4]
= arr (uncurry (flip ($)) . (const x `cross` id) . swap . (const f `cross` id) . dup)                 -- [A2]
= arr (const (f x))                                                                                   -- simplification
= pure (f x)                                                                                          -- def. of pure

That one was easy. So emboldened, let’s move on to [F4].

[F4] u <*> pure y = pure ($ y) <*> u

u <*> pure y
= arr (uncurry (flip ($))) . first (pure y) . arr swap . first u . arr dup               -- def. of <*>
= arr (uncurry (flip ($))) . first (arr (const y)) . arr swap . first u . arr dup        -- def. of pure
= arr (uncurry (flip ($))) . arr (const y `cross` id) . arr swap . first u . arr dup     -- [A4]
= arr (uncurry (flip ($)) . (const y `cross` id) . swap) . first u . arr dup             -- [A2]
= arr (uncurry (flip ($)) . (id `cross` const ($ y))) . first u . arr dup                -- function identities
= arr (uncurry (flip ($))) . arr (id `cross` const ($ y)) . first u . arr dup            -- [A2]
= arr (uncurry (flip ($))) . first u . arr (id `cross` const ($ y)) . arr dup            -- [A5]
= arr (uncurry (flip ($))) . first u . arr ((id `cross` const ($ y)) . dup)              -- [A2]
= arr (uncurry (flip ($))) . first u . arr (swap . (const ($ y) `cross` id) . dup)       -- function identities
= arr (uncurry (flip ($))) . first u . arr swap . arr (const ($ y) `cross` id) . arr dup -- [A2]
= arr (uncurry (flip ($))) . first u . arr swap . first (arr (const ($ y))) . arr dup    -- [A4]
= arr (uncurry (flip ($))) . first u . arr swap . first (pure ($ y)) . arr dup           -- def. of pure
= pure ($ y) <*> u                                                                       -- def. of <*>

Not quite so easy, but we made it through. Now face the real monster, axiom [F2]. Hopefully we’re familiar enough by now with applying [A2] and rewriting the functions that result that we can combine those into a single step now. Before we dive in, we’ll need one auxiliary function: the inverse of the assoc function from Arrow axiom [A7]:

unassoc (x,(y,z)) = ((x,y),z)

Now, let’ start. Because the proof is so long, I’ve taken the time to bold the parts that change in each line, so you can follow more easily.

[F2] u <*> (v <*> w) = pure (.) <*> u <*> v <*> w

pure (.) <*> u <*> v <*> w
= arr (const (.)) <*> u <*> v <*> w                                                      -- def. of pure
= arr (uncurry (flip ($))) . first w . arr swap . first (arr (uncurry (flip ($)))
 . first v . arr swap . first (arr (uncurry (flip ($))) . first u . arr swap
 . first (arr (const (.))) . arr dup) . arr dup) . arr dup                              -- def. of <*>
= arr (uncurry (flip ($))) . first w . arr swap . first (arr (uncurry (flip ($))))
  . first (first v) . first (arr swap) . first (first (arr (uncurry (flip ($)))))
  . first (first (first u)) . first (first (arr swap)) . first (first (first (arr (const (.)))))
  . first (first (arr dup)) . first (arr dup) . arr dup                                  -- [A3]
= arr (uncurry (flip ($))) . first w . arr swap . arr (uncurry (flip ($)) `cross` id)
  . first (first v) . arr (swap `cross` id) . arr ((uncurry (flip ($)) `cross` id) `cross` id)
  . first (first (first u)) . arr ((swap `cross` id) `cross` id)
  . arr (((const (.) `cross` id) `cross` id) `cross` id) . arr ((dup `cross` id) `cross` id)
  . arr (dup `cross` id) . arr dup                                                       -- [A4]
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ ((a,b),(c,d)) -> ((c, b a), d)) . arr assoc . first (first (first u))
  . arr unassoc . arr (id `cross` dup) . arr (\ x -> ((x, (.)), x))                      -- [A2] and function identities
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ ((a,b),(c,d)) -> ((c, b a), d)) . first (first u)
  . arr assoc . arr unassoc . arr (id `cross` dup) . arr (\ x -> ((x, (.)), x))          -- [A7]
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ ((a,b),(c,d)) -> ((c, b a), d)) . first (first u)
  . arr (id `cross` dup) . arr (\ x -> ((x, (.)), x))                                    -- [A2] and [A1]
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ ((a,b),(c,d)) -> ((c, b a), d)) . arr (id `cross` dup) . first (first u)
  . arr (\ x -> ((x, (.)), x))                                                           -- [A5]
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ (a,(b,c)) -> ((c, b a), c)) . arr assoc . first (first u)
  . arr unassoc . arr (id `cross` ((.),)) . arr dup                                      -- [A2] and function identities
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ (a,(b,c)) -> ((c, b a), c)) . first u
  . arr assoc . arr unassoc . arr (id `cross` ((.),)) . arr dup                          -- [A7]
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ (a,(b,c)) -> ((c, b a), c)) . first u
  . arr (id `cross` ((.),)) . arr dup                                                    -- [A2] and [A1]
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ (a,(b,c)) -> ((c, b a), c)) . arr (id `cross` ((.),)) . first u
  . arr dup                                                                              -- [A5]
= arr (uncurry (flip ($))) . first w
  . arr (\ (a,(b,c)) -> (c, b a)) . arr assoc . first (first v)
  . arr unassoc . arr (id `cross` \ (b,c) -> ((.) c, b)) . arr assoc
 . arr (\ (a,b) -> ((b,b),a)) . first u . arr dup                                        -- [A2] and function identities
= arr (uncurry (flip ($))) . first w
  . arr (\ (a,(b,c)) -> (c, b a)) . first v
  . arr assoc . arr unassoc . arr (id `cross` \ (b,c) -> ((.) c, b)) . arr assoc
  . arr (\ (a,b) -> ((b,b),a)) . first u . arr dup                                        -- [A7]
= arr (uncurry (flip ($))) . first w
  . arr (\ (a,(b,c)) -> (c, b a)) . first v
  . arr (id `cross` \ (b,c) -> ((.) c, b)) . arr assoc
  . arr (\ (a,b) -> ((b,b),a)) . first u . arr dup                                        -- [A2] and [A1]
= arr (uncurry (flip ($))) . first w
  . arr (\ (a,(b,c)) -> (c, b a)) . arr (id `cross` \ (b,c) -> ((.) c, b)) . first v
  . arr assoc . arr (\(a,b) -> ((b,b),a)) . first u . arr dup                            -- [A5]
= arr (uncurry (flip ($))) . first w
  . arr (\ (a,(b,c)) -> (b, c . a)) . first v . arr assoc
  . arr (\ (a,b) -> ((b,b),a)) . first u . arr dup                                        -- [A2] and function identities
= arr (uncurry (flip ($))) . first w . arr (id `cross` uncurry (flip (.))) . arr assoc
  . arr (\ ((a,b),c) -> ((b,a),c)) . first (first v)
  . arr (\ (a,b) -> ((b,b),a)) . first u . arr dup                                       -- [A7] and [A2] and identities
= arr (uncurry (flip ($))) . arr (id `cross` uncurry (flip (.))) . arr assoc . first (first w)
  . arr (\ ((a,b),c) -> ((b,a),c)) . first (first v)
  . arr (\ (a,b) -> ((b,b),a)) . first u . arr dup                                       -- [A5] and [A7]
= arr (uncurry (flip ($))) . arr (uncurry (flip ($)) `cross` id) . first (first w)
  . arr (swap `cross` id) . first (first v)
  . arr (dup `cross` id) . arr swap . first u . arr dup                                  -- [A2] and function identities
= arr (uncurry (flip ($))) . first (arr (uncurry (flip ($)))) . first (first w)
  . first (arr swap) . first (first v)
  . first (arr dup) . arr swap . first u . arr dup                                       -- [A4]
= arr (uncurry (flip ($))) . first (arr (uncurry (flip ($))) . first w
  . arr swap . first v . arr dup) . arr swap . first u . arr dup                         -- [A3]
= u <*> (v <*> w)                                                                        -- def. of <*>

That was… interesting. Perhaps there’s a short, concise, elegant way to do this, but I don’t see it, so I just clobbered at it until it worked out in the end. At least we’re done.

I will point out that I initially wrote a much longer proof, but had an error in the definition of the Applicative instance, and when I fixed that, I rewrote this proof from scratch and it came out a good bit shorter. The technique is basically the same throughout: it’s all about applying the [A7] axiom to pull out assoc and unassoc pairs around the major parts and move them back and forth. It’s also interesting that the [A6] axiom from Arrow did appear in the earlier proof, but didn’t turn out to be necessary this time. In fact, the only place that axiom appears at all is in the proof of [F1], which is essentially that the functor preserves identity functions. Very curious…

Summary

Okay, so we’ve now shown that arrows naturally give rise to Applicative instances: if you have an Arrow, then for any fixed input type, the arrows from that input type constitute an applicative functor. That’s half (the easier half) of what we set out to show, but it’s a decent stopping point. Coming next: what about getting Arrow from Applicative? Are the Applicative axioms powerful enough? (Hint: no) What other axioms do we need to add, and are they natural things to expect of any type with both Applicative and Category instances? Find out the answers to these exciting questions in the next installment!