Skip to content
July 30, 2011 / cdsmith

Arrow = Category + Applicative? (Part I)

It’s been said a few times that Haskell’s Arrow type class is really just the combination of Category and Applicative.  Frankly, I’ve never really “got” the arrow type class, and I’ve always thought it looked rather  convoluted and ugly — with it’s explicit use of tuple types and related weird contortions that remind me of the monstrosities that sometimes come out of lambdabot’s point-free converter.  Reading papers about whether the originally stated axioms are even all necessary reinforces that idea.  But Category and Applicative are very simple, easy, and intuitive.  So this is very good news… supposing it’s true.

Before just crossing arrows off of the list of things to learn, though, we should understand the statement and verify that it’s true.  Note that being true means not only that you can construct terms with the appropriate types, but that terms can be constructed in a way that gives inverse maps between the two concepts, and that the axioms on both sides follow from the other.  That’s what we set out to do here.

In this post, part I, we talk about what this means, and establish the easy direction: that there is an obvious Applicative corresponding to any Arrow.

What Does It Even Mean?

First, we run into the difficulty that the original statement isn’t very precise.  After all, Category and Arrow types have kind  (* -> * -> *), while Applicative types have kind (* -> *).  So clearly the same type can’t be an instance of all three type classes.  So what we really mean is that Arrow is equivalent to a Category instance, plus a universally quantified applicative instance for any fixed domain (where I mean domain in the Category sense; the first type parameter of the Category type).  That will be clearer if we write some declarations.

So, for the Arrow side, we have:

instance Category (:~>) where
    id    = ...
    (.)   = ...

instance Arrow (:~>) where
    arr   = ...
    first = ...

Since Arrow is already a subclass of Category, the Arrow axioms relate all of these terms together, and here I’ll slightly reorganize the Arrow axioms from their usual presentation to emphasize the role of the Category instance.

[C1] f                            = id . f
                                  = f  . id
[C2] f . (g . h)                  = (f . g) . h)
[A1] arr id                       = id
[A2] arr   (f . g)                = arr f   . arr g
[A3] first (f . g)                = first f . first g
[A4] first (arr f)                = arr (f `cross` id)
[A5] first f . arr (id `cross` g) = arr (id `cross` g) . first f
[A6] arr fst . first f            = f . arr fst
[A7] first f . arr assoc          = arr assoc . first (first f)

We used the following functions in the plumbing above:

(f `cross` g) (x,y) = (f x, g y)
assoc ((x,y),z)     = (x,(y,z))

On the Applicative side, we want the following

instance Category (:~>) where
    id    = ...
    (.)   = ...

instance Applicative ((:~>) a) where
    pure  = ...
    (<*>) = ...

Notice that the Applicative instance fixes the domain.  It’s universally quantified, sure, but within any use of the Applicative terms, we’ll need to keep the domain the same.  This is how we relate the instances for different kinds.

These are also subject to axioms:

[C1] f               = id . f
                     = f  . id
[C2] f . (g . h)     = (f . g) . h
[F1] pure id <*> v   = v
[F2] u <*> (v <*> w) = pure (.) <*> u <*> v <*> w
[F3] pure (f x)      = pure f <*> pure x
[F4] u <*> pure y    = pure ($ y) <*> u

The Category instance, and axioms [C1] and [C2], are given.  The question is:

  1. Given an Arrow instance and axioms [A1] through [A7], can we write an Applicative instance which satisfies axioms [F1] through [F4]?
  2. Given an Applicative instance and axioms [F1] through [F4], can we write an Arrow instance which satisfies axioms [A1] through [A7]?
  3. Are the maps between the two inverse to each other?  That is, can we deduce from the relevant axioms that the result of substituting back and forth in either order gives back the original constants?

If we can do this, then Arrow becomes not particularly interesting, since it’s just an overly complicated way of expressing computations with Category and Applicative.  This is what we’re trying to show.

Arrow to Applicative

In this part, we get from the Arrow axioms to the Applicative axioms.  Assuming Arrow (and, of course, its superclass Category), we’ll define the following Applicative instance.

instance Applicative ((:~>) a) where
    pure x  = arr (const x)
    f <*> x = arr (uncurry (flip ($))) . first x . arr swap . first f . arr dup

We made use of two auxiliary functions

dup   x    = (x,x)
swap (x,y) = (y,x)

What we’re saying here is pretty straightforward.

First, Arrow’s arr and Applicative’s pure are both notions of purity, but Applicative’s notion is just a bit stronger.  A map constructed with arr is effectively just a function, but at least it can depend on its input, if nothing else.  The Applicative pure, on the other hand, considers dependence on the input, too, as if it were a sort of effect (keep in mind that having the input around at all is part of what we’re wrapping up in the applicative functor), so purity is a combination for Arrow purity, and also starting from a constant functor.

Second, the application part tells us that our notion of application is to split the “input” (in the categorical sense) and let both f and x process it, and then apply the resulting function to the resulting value.  The order of processing matters here, since arrows can have side effects!  The convention with most Applicative instances is that effects are performed left to right, so we arrange for f to be processed before x.  It’s interesting, though, and will come up again later, that we could put the effects in the opposite order and still have a perfectly valid, though counter-intuitive, Applicative instance.

To verify that the Applicative instance defined here is a valid one, we’ll need to verify the axioms.  We have four of them to check.  In the following, I’ll apply the Category axioms freely without mention… in particular, I never indicate the grouping of composition, since it doesn’t matter, and I’ll collapse occurrences of id freely as well.

[F1] pure id <*> v = v
pure id <*> v
= arr (uncurry (flip ($))) . first v . arr swap . first (pure id) . arr dup            -- def. of <*>
= arr (uncurry (flip ($))) . first v . arr swap . first (arr (const id)) . arr dup     -- def. of pure
= arr (uncurry (flip ($))) . first v . arr swap . arr (const id `cross` id) . arr dup  -- [A4]
= arr (uncurry (flip ($))) . first v . arr (swap . (const id `cross` id)) . arr dup    -- [A2]
= arr (uncurry (flip ($))) . first v . arr (id `cross` const id) . arr dup             -- simplification
= arr (uncurry (flip ($))) . arr (id `cross` const id) . first v . arr dup             -- [A5]
= arr (uncurry (flip ($)) . (id `cross` const id)) . first v . arr dup                 -- [A2]
= arr fst . first v . arr dup                                                          -- simplification
= v . arr fst . arr dup                                                                -- [A6]
= v . arr (fst . dup)                                                                  -- [A2]
= v . arr id                                                                           -- simplification
= v . id = v                                                                           -- [A1]

I’ll skip the longer one for now and move to [F3].

[F3] pure (f x) = pure f <*> pure x
pure f <*> pure x
= arr (uncurry (flip ($))) . first (pure x) . arr swap . first (pure f) . arr dup                     -- def. of <*>
= arr (uncurry (flip ($))) . first (arr (const x)) . arr swap . first (arr (const f)) . arr dup       -- def. of pure
= arr (uncurry (flip ($))) . arr (const x `cross` id) . arr swap . arr (const f `cross` id) . arr dup -- [A4]
= arr (uncurry (flip ($)) . (const x `cross` id) . swap . (const f `cross` id) . dup)                 -- [A2]
= arr (const (f x))                                                                                   -- simplification
= pure (f x)                                                                                          -- def. of pure

That one was easy.  So emboldened, let’s move on to [F4].

[F4] u <*> pure y = pure ($ y) <*> u
u <*> pure y
= arr (uncurry (flip ($))) . first (pure y) . arr swap . first u . arr dup               -- def. of <*>
= arr (uncurry (flip ($))) . first (arr (const y)) . arr swap . first u . arr dup        -- def. of pure
= arr (uncurry (flip ($))) . arr (const y `cross` id) . arr swap . first u . arr dup     -- [A4]
= arr (uncurry (flip ($)) . (const y `cross` id) . swap) . first u . arr dup             -- [A2]
= arr (uncurry (flip ($)) . (id `cross` const ($ y))) . first u . arr dup                -- function identities
= arr (uncurry (flip ($))) . arr (id `cross` const ($ y)) . first u . arr dup            -- [A2]
= arr (uncurry (flip ($))) . first u . arr (id `cross` const ($ y)) . arr dup            -- [A5]
= arr (uncurry (flip ($))) . first u . arr ((id `cross` const ($ y)) . dup)              -- [A2]
= arr (uncurry (flip ($))) . first u . arr (swap . (const ($ y) `cross` id) . dup)       -- function identities
= arr (uncurry (flip ($))) . first u . arr swap . arr (const ($ y) `cross` id) . arr dup -- [A2]
= arr (uncurry (flip ($))) . first u . arr swap . first (arr (const ($ y))) . arr dup    -- [A4]
= arr (uncurry (flip ($))) . first u . arr swap . first (pure ($ y)) . arr dup           -- def. of pure
= pure ($ y) <*> u                                                                       -- def. of <*>

Not quite so easy, but we made it through.  Now face the real monster, axiom [F2].  Hopefully we’re familiar enough by now with applying [A2] and rewriting the functions that result that we can combine those into a single step now.  Before we dive in, we’ll need one auxiliary function: the inverse of the assoc function from Arrow axiom [A7]:

unassoc (x,(y,z)) = ((x,y),z)

Now, let’ start.  Because the proof is so long, I’ve taken the time to bold the parts that change in each line, so you can follow more easily.

[F2] u <*> (v <*> w) = pure (.) <*> u <*> v <*> w
pure (.) <*> u <*> v <*> w
= arr (const (.)) <*> u <*> v <*> w                                                      -- def. of pure
= arr (uncurry (flip ($))) . first w . arr swap . first (arr (uncurry (flip ($)))
 . first v . arr swap . first (arr (uncurry (flip ($))) . first u . arr swap
 . first (arr (const (.))) . arr dup) . arr dup) . arr dup                              -- def. of <*>
= arr (uncurry (flip ($))) . first w . arr swap . first (arr (uncurry (flip ($))))
  . first (first v) . first (arr swap) . first (first (arr (uncurry (flip ($)))))
  . first (first (first u)) . first (first (arr swap)) . first (first (first (arr (const (.)))))
  . first (first (arr dup)) . first (arr dup) . arr dup                                  -- [A3]
= arr (uncurry (flip ($))) . first w . arr swap . arr (uncurry (flip ($)) `cross` id)
  . first (first v) . arr (swap `cross` id) . arr ((uncurry (flip ($)) `cross` id) `cross` id)
  . first (first (first u)) . arr ((swap `cross` id) `cross` id)
  . arr (((const (.) `cross` id) `cross` id) `cross` id) . arr ((dup `cross` id) `cross` id)
  . arr (dup `cross` id) . arr dup                                                       -- [A4]
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ ((a,b),(c,d)) -> ((c, b a), d)) . arr assoc . first (first (first u))
  . arr unassoc . arr (id `cross` dup) . arr (\ x -> ((x, (.)), x))                      -- [A2] and function identities
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ ((a,b),(c,d)) -> ((c, b a), d)) . first (first u)
  . arr assoc . arr unassoc . arr (id `cross` dup) . arr (\ x -> ((x, (.)), x))          -- [A7]
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ ((a,b),(c,d)) -> ((c, b a), d)) . first (first u)
  . arr (id `cross` dup) . arr (\ x -> ((x, (.)), x))                                    -- [A2] and [A1]
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ ((a,b),(c,d)) -> ((c, b a), d)) . arr (id `cross` dup) . first (first u)
  . arr (\ x -> ((x, (.)), x))                                                           -- [A5]
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ (a,(b,c)) -> ((c, b a), c)) . arr assoc . first (first u)
  . arr unassoc . arr (id `cross` ((.),)) . arr dup                                      -- [A2] and function identities
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ (a,(b,c)) -> ((c, b a), c)) . first u
  . arr assoc . arr unassoc . arr (id `cross` ((.),)) . arr dup                          -- [A7]
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ (a,(b,c)) -> ((c, b a), c)) . first u
  . arr (id `cross` ((.),)) . arr dup                                                    -- [A2] and [A1]
= arr (uncurry (flip ($))) . first w
  . arr (\ ((a,b),c) -> (c, b a)) . first (first v)
  . arr (\ (a,(b,c)) -> ((c, b a), c)) . arr (id `cross` ((.),)) . first u
  . arr dup                                                                              -- [A5]
= arr (uncurry (flip ($))) . first w
  . arr (\ (a,(b,c)) -> (c, b a)) . arr assoc . first (first v)
  . arr unassoc . arr (id `cross` \ (b,c) -> ((.) c, b)) . arr assoc
 . arr (\ (a,b) -> ((b,b),a)) . first u . arr dup                                        -- [A2] and function identities
= arr (uncurry (flip ($))) . first w
  . arr (\ (a,(b,c)) -> (c, b a)) . first v
  . arr assoc . arr unassoc . arr (id `cross` \ (b,c) -> ((.) c, b)) . arr assoc
  . arr (\ (a,b) -> ((b,b),a)) . first u . arr dup                                        -- [A7]
= arr (uncurry (flip ($))) . first w
  . arr (\ (a,(b,c)) -> (c, b a)) . first v
  . arr (id `cross` \ (b,c) -> ((.) c, b)) . arr assoc
  . arr (\ (a,b) -> ((b,b),a)) . first u . arr dup                                        -- [A2] and [A1]
= arr (uncurry (flip ($))) . first w
  . arr (\ (a,(b,c)) -> (c, b a)) . arr (id `cross` \ (b,c) -> ((.) c, b)) . first v
  . arr assoc . arr (\(a,b) -> ((b,b),a)) . first u . arr dup                            -- [A5]
= arr (uncurry (flip ($))) . first w
  . arr (\ (a,(b,c)) -> (b, c . a)) . first v . arr assoc
  . arr (\ (a,b) -> ((b,b),a)) . first u . arr dup                                        -- [A2] and function identities
= arr (uncurry (flip ($))) . first w . arr (id `cross` uncurry (flip (.))) . arr assoc
  . arr (\ ((a,b),c) -> ((b,a),c)) . first (first v)
  . arr (\ (a,b) -> ((b,b),a)) . first u . arr dup                                       -- [A7] and [A2] and identities
= arr (uncurry (flip ($))) . arr (id `cross` uncurry (flip (.))) . arr assoc . first (first w)
  . arr (\ ((a,b),c) -> ((b,a),c)) . first (first v)
  . arr (\ (a,b) -> ((b,b),a)) . first u . arr dup                                       -- [A5] and [A7]
= arr (uncurry (flip ($))) . arr (uncurry (flip ($)) `cross` id) . first (first w)
  . arr (swap `cross` id) . first (first v)
  . arr (dup `cross` id) . arr swap . first u . arr dup                                  -- [A2] and function identities
= arr (uncurry (flip ($))) . first (arr (uncurry (flip ($)))) . first (first w)
  . first (arr swap) . first (first v)
  . first (arr dup) . arr swap . first u . arr dup                                       -- [A4]
= arr (uncurry (flip ($))) . first (arr (uncurry (flip ($))) . first w
  . arr swap . first v . arr dup) . arr swap . first u . arr dup                         -- [A3]
= u <*> (v <*> w)                                                                        -- def. of <*>

That was… interesting.  Perhaps there’s a short, concise, elegant way to do this, but I don’t see it, so I just clobbered at it until it worked out in the end.  At least we’re done.

I will point out that I initially wrote a much longer proof, but had an error in the definition of the Applicative instance, and when I fixed that, I rewrote this proof from scratch and it came out a good bit shorter.  The technique is basically the same throughout: it’s all about applying the [A7] axiom to pull out assoc and unassoc pairs around the major parts and move them back and forth.  It’s also interesting that the [A6] axiom from Arrow did appear in the earlier proof, but didn’t turn out to be necessary this time.  In fact, the only place that axiom appears at all is in the proof of [F1], which is essentially that the functor preserves identity functions.  Very curious…

Summary

Okay, so we’ve now shown that arrows naturally give rise to Applicative instances: if you have an Arrow, then for any fixed input type, the arrows from that input type constitute an applicative functor.  That’s half (the easier half) of what we set out to show, but it’s a decent stopping point.  Coming next: what about getting Arrow from Applicative?  Are the Applicative axioms powerful enough?  (Hint: no)  What other axioms do we need to add, and are they natural things to expect of any type with both Applicative and Category instances?  Find out the answers to these exciting questions in the next installment!

About these ads

7 Comments

Leave a Comment
  1. Heinrich Apfelmus / Jul 31 2011 1:35 am

    Nice!

    It turns out that the proofs become a lot more understandable if you use diagrammatic reasoning, i.e. the “rectangle representation” with “signal lines” and all.

  2. Yitz / Jul 31 2011 4:39 am

    Thanks, getting the formal relationship between Arrow and Applicative nailed down is great service to the Haskell community!

    However, your stated goal of killing off Arrows is not going to happen so quickly. Even when two formalisms cover exactly the same space,
    the two different ways of expressing things can each be powerful in their own ways. That is certainly the case for Arrows and Applicative.

    At the very least, there are some simple combinators from Control.Arrow that every Haskeller should be familiar with. Among
    those are first, second, ***, and &&&. Lately >>> has become popular, too. I find your use of your own names for some of these, like `cross` instead of ***, to be confusing.

    Tuples are Haskell’s general product type. Their explicit use in the definition of Arrows is natural and elegant. I don’t understand what problem you have with that.

    I have no plans to dive deeply into the theory of Arrows or write any Arrows-based libraries myself in the near future. But as a user, I find well-designed and well-documented Arrow-based libraries to be as easy and intuitive to use as any other. Like you, I currently understand Applicative much better than Arrow, so I really appreciate your work in clarifying the relationship between them. I am looking forward to your next post!

    • cdsmith / Jul 31 2011 7:29 am

      I actually deliberately used `cross` instead of ***, because the former is specialized to function types, while the latter would be (confusingly, in this case) generally applicable to all Arrow instances, while we only intend it for functions. I wasn’t terribly consistent in this: e.g., I freely used id and (.) for function-only stuff, but in the case of cross, I felt like *** was too suggestive as an Arrow operation, when I had no intention of using it that way. (In practice, I find myself and others using the Arrow stuff like &&& and *** specialized to functions all the time… in fact, far more often than I see real uses of arrows. But it’s still suggesting something to pull them out when general arrows are the topic.)

      As for tuples being natural and elegant… I have a hard time agreeing. Sure, they make sense in a way, but the general feel is like you’re programming with some kind of control-flow assembly language when you use the API directly. You have to explicitly make copies of values in order to use them several places. You have to keep track of structures of nested tuples (“Was this a pair where the first element is a pair? Or was the second element a pair?”) In fact, arrows are generally used with proc notation, the entire purpose of which is basically freeing programmers from explicit management of those tuples… and it’s not just a thin syntactic layer akin to monadic do notation; it has to insert a bunch of plumbing steps intelligently!

  3. Pletora / Jul 31 2011 7:10 am

    I totally agree with you on the superfluousness of the Arrow class. I think the best way for beginners to get comfortable with the Arrow class is to take a look at the following alternate definition of Arrow that really makes explicit the dependency on both Applicative and Category. I find use of each of the individual combinators to be much easier to understand than in the usual definition which defines everything in terms of arr and first. Unfortunately this definition doesn’t guarantee any laws concerning the order of execution of effects in &&& and ***

    {-# LANGUAGE FlexibleInstances, FlexibleContexts, UndecidableInstances #-}

    import Prelude hiding ( id )

    import Control.Category ( Category (..), (>>>) )
    import Control.Applicative ( Applicative (..), liftA2 )

    (&&&) :: Applicative m => m a -> m b -> m (a, b)
    (&&&) = liftA2 (,)

    class Category arr => Arrow arr where
    arr :: Functor (arr a) => (a -> b) -> arr a b
    arr f = fmap f id

    (***) :: Applicative ( arr (a, b) ) => arr a x -> arr b y -> arr (a, b) (x, y)
    f *** g = (arr fst >>> f) &&& (arr snd >>> g)

    first :: Applicative ( arr (a, b) ) => arr a x -> arr (a, b) (x, b)
    first f = f *** id

    second :: Applicative ( arr (a, b) ) => arr b y -> arr (a, b) (a, y)
    second g = id *** g

    instance ( Category arr, Applicative (arr a) ) => Arrow arr where

  4. jeremygibbons / Jul 31 2011 3:49 pm

    Do the proofs get any easier if you use the “strong lax-monoidal” presentation of Applicative? Instead of pure and , you have pure, a “pairing” operator of type (f a,f b) -> f (a,b), and “strength” of type (a,f b) -> f (a,b) with suitable axioms (as in McBride and Paterson).

Trackbacks

  1. Arrow = Category + Applicative? (Part IIa) « Sententia cdsmithus
  2. On CodeWorld and Haskell | Sententia cdsmithus

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 80 other followers

%d bloggers like this: