The Comonad.Reader » Mathematics
http://comonad.com/reader
types, (co)monads, substructural logic

Japanese “ekmett” Workshop (Part 1 of 2)
by Edward Kmett, Mon, 01 Apr 2013
http://comonad.com/reader/2013/japanese-workshop-1/

A couple of weeks back one of my coworkers brought to my attention a several-hour-long workshop in Japan to go over and describe a number of my libraries, hosted by TANAKA Hideyuki — not the voice actor, I checked!

I was incredibly honored and I figured that if that many people (they had 30 or so registered attendees and 10 presentations) were going to spend that much time going over software that I had written, I should at least offer to show up!

I'd like to apologize for any errors in the romanization of people's names or misunderstandings I may have in the following text. My grasp of Japanese is very poor! Please feel free to send me corrections or additions!

Surprise!

Sadly, my boss's immediate reaction to hearing that there was a workshop in Japan about my work was to quip, "You're saying you're huge in Japan?" With him conspicuously not offering to fly me out there, I had to settle for surprising the organizers and attending via Google Hangout.

Commentary and Logs

@nushio was very helpful in getting me connected, and while the speakers gave their talks I sat on the irc.freenode.net #haskell-lens channel and Google Hangout and answered questions and provided a running commentary with more details and references. Per freenode policy the fact that we were logging the channel was announced -- well, at least before things got too far underway.

Here is the IRC session log as a gist. IKEGAMI Daisuke @ikegami__ (ikeg in the IRC log) tried to keep up a high-level running commentary in the log about what was happening in the video, which may be helpful if you are trying to follow along retroactively.

Other background chatter and material is strewn across Twitter under the #ekmett_conf hashtag and on a Japanese Twitter aggregator named Togetter.

Getting Started

The 1PM start time in Shibuya, Tokyo, Japan translates to midnight at the start of Easter here in Boston, which meant that ~6 hours later, when we reached the Q&A session, I was a bit loopy from lack of sleep, but they were incredibly polite and didn't seem to mind my long rambling responses.

Thanks to the organizers, we have video of the vast majority of the event! There was no audio for the first couple of minutes, and the recording machine lost power for the last talk and the Q&A session at the end as we ran somewhat longer than they had originally scheduled! -- And since I was attending remotely and a number of others flitted in and out over the course of the night, they were nice enough to put most of the slides and background material online.

profunctors by Liyang HU and HIBINO Kei

Liyang Hu (@liyanghu) started the session off with a nicely self-contained crash course on my profunctors package, since profunctors are used fairly heavily inside the implementation of lens and machines, with a couple of detours into contravariant and bifunctors.

His presentation materials are available interactively from the new FP Complete School of Haskell. You can also watch the video recording of his talk on ustream.

This talk was followed by a much more condensed version of very similar content in Japanese by Hibino Kei (@khibino). His talk was more focused on the relationship between arrows and profunctors, and the slides are available through slideshare.

lens by @its_out_of_tune

Once the necessary background material was out of the way, the talk on lens -- arguably the presentation that most of the people were there for -- came early.


@its_out_of_tune gave an incredibly dense overview of how to use the main parts of the lens package in Japanese. His slides are available online and here is a recording of his talk.

Over the course of a half hour, he was able to cram in a great cross-section of the library, including material I hadn't been able to get to in my New York talk even with 4x the time: how to use the lens template-haskell code to automatically generate lenses for user data types, and how to use the lens Action machinery.

free and free-game by KINOSHITA Fumiaki

Next up was my free package and the neat free-game engine that Kinoshita Fumiaki (@fumieval) built on top of it.

The slides were in English, though the talk and humor were very Japanese. ^_^


That said, he had some amazingly nice demos, including a live demo of his tetris clone, Monaris, which is visible about 10 minutes into the video!

ad by @nebutalab

@nebutalab, like me, joined the session remotely through Google Hangout, and proceeded to give a tutorial on how forward mode automatic differentiation works through my ad package.

His slides were made available before the talk and the video is available in two parts due to a technical hiccup in the middle of the recording.


I'm currently working to drastically simplify the API for ad with Alex Lang. Fortunately almost all of the material in this presentation will still be relevant to the new design.

tables by MURAYAMA Shohei

Next up, Murayama Shohei (@yuga) gave an introduction to tables, which is a small in-memory data store that I wrote a few months back to sit on top of lens.

Video of @yuga's talk and his slides are available, which I think makes this the first public talk about this project. -_^

machines by YOSHIDA Sanshiro

Yoshida Sanshiro (@halcat0x15a) gave a nice overview of the currently released version of machines including a lot of examples! I think he may have actually written more code using machines just for demonstrations than I have written using it myself.

Video of his talk is available along with his slide deck -- just tap left or right to move through the slides. He has also written a blog post documenting his early explorations of the library, and some thoughts about using it with attoparsec.

I've recently been trying to redesign machines with coworker Paul CHIUSANO @pchiusano and we've begun greatly simplifying the design of machines based on some work he has been doing in Scala, so unfortunately many of the particulars of this talk will be soon outdated, but the overall 'feel' of working with machines should be preserved across the change-over. Some of these changes can be seen in the master branch on github now.

More to come

There were 4 more sessions, but alas, I'm out of time for the moment! I'll continue this write-up with more links to the source material and my thoughts as soon as I can tomorrow!

Unnatural Transformations and Quantifiers
by Dan Doel, Sun, 23 Sep 2012
http://comonad.com/reader/2012/unnatural-transformations-and-quantifiers/

Recently, a fellow in category land discovered a fact that we in Haskell land have actually known for a while (in addition to things most of us probably don't). Specifically, given two categories $\mathcal{C}$ and $\mathcal{D}$, a functor $G : \mathcal{C} \rightarrow \mathcal{D}$, and provided some conditions in $\mathcal{D}$ hold, there exists a monad $T^G$, the codensity monad of $G$.

In category theory, the codensity monad is given by the rather frightening expression:

$ T^G(a) = \int_r \left[\mathcal{D}(a, Gr), Gr\right] $

Where the integral notation denotes an end, and the square brackets denote a power, which allows us to take what is essentially an exponential of the objects of $\mathcal{D}$ by objects of $\mathcal{V}$, where $\mathcal{D}$ is enriched in $\mathcal{V}$. Provided the above end exists, $T^G$ is a monad regardless of whether $G$ has an adjoint, which is the usual way one thinks of functors (in general) giving rise to monads.

It also turns out that this construction is a sort of generalization of the adjunction case. If we do have $F \dashv G$, this gives rise to a monad $GF$. But, in such a case, $T^G \cong GF$, so the codensity monad is the same as the monad given by the adjunction when it exists, but codensity may exist when there is no adjunction.

In Haskell, this all becomes a bit simpler (to my eyes, at least). Our category $\mathcal{D}$ is always $\mathbf{Hask}$, which is enriched in itself, so powers are just function spaces. And all the functors we write will be rather like $\mathbf{Hask}$ (objects will come from kinds we can quantify over), so ends of functors will look like forall r. F r r where $F : \mathcal{C}^{op} \times \mathcal{C} \rightarrow \mathbf{Hask}$. Then:

newtype Codensity f a = Codensity (forall r. (a -> f r) -> f r)

As mentioned, we've known for a while that we can write a Monad instance for Codensity f without caring at all about f.
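
For reference, here is one way to write that instance, as a minimal sketch against the newtype above (pattern matching in place of a record accessor):

instance Monad (Codensity f) where
  return a = Codensity (\k -> k a)
  Codensity m >>= f =
    Codensity (\k -> m (\a -> case f a of Codensity n -> n k))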

As for the adjunction correspondence, consider the adjunction between products and exponentials: $ - \times S \dashv S \rightarrow - $

This gives rise to the monad $S \rightarrow (- \times S)$, the state monad. According to the facts above, we should have that Codensity (s ->) (excuse the sectioning) is the same as state, and if we look, we see:

forall r. (a -> s -> r) -> s -> r

which is the continuation-passing style, or Church (or Boehm-Berarducci), encoding of the state monad.
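
To make that correspondence concrete, here is a sketch of the two directions (toState and fromState are my names, not anything from a library):

toState :: (forall r. (a -> s -> r) -> s -> r) -> s -> (a, s)
toState t = t (,)

fromState :: (s -> (a, s)) -> (a -> s -> r) -> s -> r
fromState m k s = uncurry k (m s)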

Now, it's also well known that for any monad, we can construct an adjunction that gives rise to it. There are multiple ways to do this, but the most accessible in Haskell is probably via the Kleisli category. So, given a monad $M$ on $\mathbf{Hask}$, there is a category $\mathbf{Hask}_M$ with the same objects, but where $\mathbf{Hask}_M(a, b) = \mathbf{Hask}(a, Mb)$. The identity for each object is return and composition of arrows is:

(f >=> g) x = f x >>= g

Our two functors are:

F a = a
F f = return . f

U a = M a
U f = (>>= f)

Verifying that $F \dashv U$ requires only that $\mathbf{Hask}_M(F-, =) \cong \mathbf{Hask}(-, U\!\!=)$, but this is just $\mathbf{Hask}(-, M\!\!=) \cong \mathbf{Hask}(-, M\!\!=)$, which is a triviality. Now we should have that $T^U = M$.

So, one of the simplest monads is reader, $(e \rightarrow)$. Now, $U$ just takes objects in the Kleisli category (which are objects in $\mathbf{Hask}$) and applies $M$ to them, so we should have Codensity (e ->) is reader. But earlier we had Codensity (e ->) was state. So reader is state, right?

We can actually arrive at this result another way. One of the most famous pieces of category theory is the Yoneda lemma, which states that the following correspondence holds for any functor $F : \mathcal{C} \rightarrow \mathbf{Set}$:

$ Fa \cong \mathbf{Set}^{\mathcal{C}}\left(\mathcal{C}(a,-), F\right) $

This also works for any functor into $\mathbf{Hask}$ and looks like:

F a ~= forall r. (a -> r) -> F r

for $F : \mathbf{Hask} \rightarrow \mathbf{Hask}$. But we also have our functor $U : \mathbf{Hask}_M \rightarrow \mathbf{Hask}$, which should look more like:

U a ~= forall r. (a -> M r) -> U r
M a ~= forall r. (a -> M r) -> M r

So, we fill in M = (e ->) and get that reader is isomorphic to state, right? What's going on?

To see, we have to take a closer look at natural transformations. Given two categories $\mathcal{C}$ and $\mathcal{D}$, and functors $F, G : \mathcal{C} \rightarrow \mathcal{D}$, a natural transformation $\phi : F \Rightarrow G$ is a family of maps $\phi_a : Fa \rightarrow Ga$ such that for every $f : a \rightarrow b$ the square from $Fa$ to $Gb$ commutes:

$ Gf \circ \phi_a = \phi_b \circ Ff $

The key piece is what the morphisms look like. It's well known that parametricity ensures the naturality of t :: forall a. F a -> G a for $F, G : \mathbf{Hask} \rightarrow \mathbf{Hask}$, and it also works when the source is $\mathbf{Hask}^{op}$. It should also work for a category, call it $\mathbf{Hask}^{\sim}$, which has Haskell types as objects, but where $\mathbf{Hask}^{\sim}(a, b) = \mathbf{Hask}(a, b) \times \mathbf{Hask}(b, a)$, which is the sort of category that newtype Endo a = Endo (a -> a) is a functor from. So we should be at liberty to say:

Codensity Endo a ~= forall r. (a -> r -> r) -> r -> r ~= [a]

However, hom types for $\mathbf{Hask}_M$ are not merely made up of $\mathbf{Hask}$ hom types on the same arguments, so naturality turns out not to be guaranteed. A functor $F : \mathbf{Hask}_M \rightarrow \mathbf{Hask}$ must take a Kleisli arrow $f : b \rightarrow Mc$ to an arrow $Ff : Fb \rightarrow Fc$, and transformations must commute with that mapping. So, if we look at our use of Yoneda, we are considering transformations $\phi : \mathbf{Hask}_M(a, -) \Rightarrow U$:

$ Uf \circ \phi_b = \phi_c \circ \mathbf{Hask}_M(a,f) $

Now, $\mathbf{Hask}_M(a,b) = \mathbf{Hask}(a, Mb)$ and $Ub = Mb$. So

t :: forall r. (a -> M r) -> M r

will get us the right type of maps. But, the above commutative square corresponds to the condition that for all f :: b -> M c:

t . (>=> f) = (>>= f) . t

So, if we have h :: a -> M b, Kleisli composing it with f and then feeding to t is the same as feeding h to t and then binding the result with f.

Now, if we go back to reader, we can consider the reader morphism:

f = const id :: a -> e -> e

For all relevant m and g, m >>= f = id and g >=> f = f. So the naturality condition here states that t f = id.

Now, t :: forall r. (a -> e -> r) -> e -> r. The general form of these is state actions (I've split e -> (a, e) into two pieces):

t f e = f (rd e) (st e)
  where
  rd :: e -> a
  st :: e -> e

If f = const id, then:

t (const id) e = st e
 where
 st :: e -> e

But our naturality condition states that this must be the identity, so we must have st = id. That is, the naturality condition selects t for which the corresponding state action does not change the state, meaning it is equivalent to a reader action! Presumably the definition of an end (which involves dinaturality) enforces a similar condition, although I won't work through it, as it'd be rather more complicated.
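
Spelling the conclusion out, such natural t's correspond exactly to reader actions; a sketch of the passage back and forth (the names are mine):

toT :: (e -> a) -> (a -> e -> r) -> e -> r
toT rd f e = f (rd e) e -- st = id, as naturality demands

fromT :: (forall r. (a -> e -> r) -> e -> r) -> e -> a
fromT t = t const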

However, we have learned a lesson. Quantifiers do not necessarily enforce (di)naturality for every category with objects of the relevant kind. It is important to look at the hom types, not just the objects. In this case, the point of failure seems to be the common, extra s. Even though the type contains natural transformations for the similar functors over $\mathbf{Hask}$, they can (in general) still manipulate the shared parameter in ways that are not natural for the domain in question.

I am unsure of how exactly one could enforce the above condition in (Haskell's) types. For instance, if we consider:

forall r m. Monad m => (a -> m r) -> m r

This still contains transformations of the form:

t k = k a >> k a

And for this to be natural would require:

(k >=> f) a >> (k >=> f) a = (k a >> k a) >>= f

Which is not true for all possible instantiations of f. It seems as though leaving m unconstrained would be sufficient, as all that could happen is t feeding a value to k and yielding the result, but it seems likely to be over-restrictive.

Searching Infinity Parametrically
by Edward Kmett, Sun, 25 Dec 2011
http://comonad.com/reader/2011/searching-infinity/

Andrej Bauer recently gave a really nice talk on how you can exploit side-effects to make a faster version of Martin Escardo's pseudo-paradoxical combinators.

A video of his talk is available over on his blog, and his presentation is remarkably clear, and would serve as a good preamble to the code I'm going to present below.

Andrej gave a related invited talk back at MSFP 2008 in Iceland, and afterwards over lunch I cornered him (with Dan Piponi) and explained how you could use parametricity to close over the side-effects of monads (or arrows, etc) but I think that trick was lost in the chaos of the weekend, so I've chosen to resurrect it here, and improve it to handle some of his more recent performance enhancements, and show that you don't need side-effects to speed up the search after all!

First, we'll need to import a few things:

 
{-# LANGUAGE RankNTypes #-}
 
import Data.Maybe (fromMaybe)
import Control.Applicative
import Data.IntMap (IntMap)
import qualified Data.IntMap as IntMap
import Control.Monad
import Control.Monad.Trans.Class
import Data.Functor.Identity
 

What we are looking for is an implementation of Hilbert's epsilon.

This is a formal mechanism for eliminating existentials over some non-empty set X by defining a function

 
ε: (X -> Prop) -> X
 

such that if there exists an x in X such that p(x) holds, then p(ε(p)) holds.

As noted by Andrej, we could reify this constructively as a function "epsilon :: (X -> Bool) -> X" for some X.

Now, for some sets, Hilbert's epsilon is really easy to define. If X is a finite set, you can just exhaustively enumerate all of the options returning a member of X such that the property holds if any of them do, otherwise since X is non-empty, just return one of the elements that you tested.
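
As a minimal sketch, representing a non-empty finite X by a list of its elements (this helper is mine, purely for illustration):

epsilonFinite :: [x] -> (x -> Bool) -> x
epsilonFinite xs p = case filter p xs of
  (x : _) -> x
  []      -> head xs -- X is non-empty, so any element will do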

This would be a pretty boring article and I'd be back to eating Christmas dinner with my family if that was all there was to it. However, certain infinite spaces can also be searched.

Last year, Luke Palmer wrote a post on "Searchable Data Types" that might also serve as a good introduction. In that article he led off with the easiest infinite space to search, the lazy naturals, or the 'one point compactification of the naturals'. That is to say the natural numbers extended with infinity.

 
data LazyNat = Zero | Succ LazyNat
infinity :: LazyNat
infinity = Succ infinity
 

Now we can implement Palmer's epsilon (called lyingSearch in his article).

 
palmer :: (LazyNat -> Bool) -> LazyNat
palmer p
  | p Zero = Zero
  | otherwise = Succ $ palmer $ p . Succ
 

The trick to making this work is the requirement that the predicate you pass must terminate in a bounded amount of time no matter what input you give it. Since we're working with the naturals extended with infinity, if no natural satisfies the predicate, we just keep returning a longer and longer chain of Succ's, effectively yielding infinity.

To check to see if the returned number satisfies the predicate you can always use p (palmer p). The predicate is required to terminate in finite time, even when given infinity, so this will yield a Bool and not bottom out unless the user supplied predicate takes an unbounded amount of time.
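
For example, here is a bounded-time predicate and its search (atLeast is a hypothetical helper; it inspects at most k constructors, so it terminates even on infinity):

atLeast :: Int -> LazyNat -> Bool
atLeast k _ | k <= 0 = True
atLeast _ Zero       = False
atLeast k (Succ n)   = atLeast (k - 1) n

-- palmer (atLeast 3) yields Succ (Succ (Succ Zero)),
-- and p (palmer p) is True here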

I posted a reply to Luke's article when it came up on reddit which included a Hinze-style generic implementation of his lyingSearch predicate, which you can see now is just Hilbert's epsilon for arbitrary recursive polynomial data types.

http://www.reddit.com/r/haskell/comments/e7nij/searchable_data_types/c15zs6l

Another space we can search is the Cantor space 2^N.

 
type Cantor = Integer -> Bool
 

With that we jump clear from countable infinity to uncountable infinity, but it can still be searched in finite time!

This is the space we'll be paying attention to for the rest of this article.

First we'll define how to "book a room in Hilbert's Hotel."

 
infixr 0 #
(#) :: Bool -> Cantor -> Cantor
(x # a) 0 = x
(x # a) i = a (i - 1)
 

Then this can be used to obtain the following implementation of Hilbert's epsilon for the Cantor space, attributed by Andrej to Ulrich Berger.

 
berger :: (Cantor -> Bool) -> Cantor
berger p =
  if ex $ \a -> p $ False # a
  then False # berger $ \a -> p $ False # a
  else True  # berger $ \a -> p $ True # a
  where ex q = q (berger q)
 

This version is particularly close in structure to the one for searching the LazyNats, but it is dreadfully slow!

It would be nice to be able to search the space faster and that is just what Martin Escardo's improved version does, through a more sophisticated divide and conquer technique.

 
escardo :: (Cantor -> Bool) -> Cantor
escardo p = go x l r where
  go x l r n =  case divMod n 2 of
    (0, 0) -> x
    (q, 1) -> l q
    (q, 0) -> r $ q-1
  x = ex $ \l -> ex $ \r -> p $ go True l r
  l = escardo $ \l -> ex $ \r -> p $ go x l r
  r = escardo $ \r -> p $ go x l r
  ex q = q (escardo q)
 

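As a quick usage sketch (the predicate is mine, for illustration), escardo hands back a sequence satisfying the predicate whenever one exists, and the predicate only ever inspects finitely many bits:

-- find a Cantor sequence whose bits 1 and 2 agree
example :: Cantor
example = escardo (\a -> a 1 == a 2)
-- example 1 == example 2 evaluates to True
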
To proceed from here I'll need a State monad:

 
newtype S s a = S { runS :: s -> (a, s) }
 
instance Functor (S s) where
  fmap f (S m) = S $ \s -> case m s of
    (a, s') -> (f a, s')
 
instance Applicative (S s) where
  pure = return
  (<*>) = ap

instance Monad (S s) where
  return a = S $ \s -> (a, s)
  S m >>= k = S $ \s -> case m s of
    (a, s') -> runS (k a) s'
 

And now we've reached the point. From here, Andrej's pure code ends, and his side-effecting ocaml and custom programming language start. The first thing he does is compute the modulus of continuity by using a side-effect that writes to a reference cell which he very carefully ensures doesn't leak out of scope, so he doesn't have to concern himself with the proposition code editing the value of the reference.

 
let mu f a =
  let r = ref 0 in
  let b n = (r := max n !r; a n) in
    ignore (f b);
    !r
 

To obtain the same effect we'll instead make a predicate using the state monad to model the single reference cell.

 
-- bad
modulus :: (Num b, Ord s) =>
  ((s -> S s a) -> S b c) -> (s -> a) -> b
 

We can mash b and s together, and try to make the ordering and number agree by claiming that it is instead Real and we'd get the slightly more reasonable looking type:

 
-- still bad
modulus :: Real a =>
  ((a -> S n b) -> S n c) -> (a -> b) -> a
 

In the imperative code, lexical scoping had ensured that no other code could edit the reference cell, but with this type we don't have that. The predicate is allowed to use arbitrary state actions to muck with the modulus of continuity even though the only thing that should be editing it is the wrapper beta that we placed around alpha.

But how can we ensure that the end user couldn't gain access to any of the additional functionality from the monad? Parametricity!

 
-- getting better
modulus :: Real a =>
  (forall f. Monad f => (a -> f b) -> f c) ->
  (a -> b) ->
  a
 

Here the only thing you are allowed to assume about f is that it forms a monad. This gives you access to return and >>=, but the predicate can't do anything interesting with them. All it can do is work with what is effectively the identity monad, since it knows no additional properties!

We can have mercy on the end user and give them a little bit more syntactic sugar, since it doesn't cost us anything to let them also have access to the Applicative instance.

 
-- good
modulus :: Real a =>
  (forall f. (Monad f, Applicative f) => (a -> f b) -> f c) ->
  (a -> b) ->
  a
 

With that we can show Andrej's version of the modulus of continuity calculation does not need side-effects!
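
Here is one possible pure implementation, a minimal sketch in terms of the S monad above: beta wraps alpha and records the largest index queried in the state, just as Andrej's reference cell did.

modulus :: Real a =>
  (forall f. (Monad f, Applicative f) => (a -> f b) -> f c) ->
  (a -> b) ->
  a
modulus phi alpha = snd $ runS (phi beta) 0 where
  beta n = S $ \s -> (alpha n, max n s)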

> modulus (\a -> a 10 >>= a) (\n -> n * n)
100
 

Admittedly plumbing around the monadic values in our proposition is a bit inconvenient.

His next example was written in a custom ocaml-like programming language. For translating his effect type into Haskell using parametricity, we'll need a CPS'd state monad, so we can retry from the current continuation while we track a map of assigned values.

 
newtype K r s a = K { runK :: (a -> s -> r) -> s -> r }
 
instance Functor (K r s) where
  fmap = liftM
 
instance Applicative (K r s) where
  pure = return
  (<*>) = ap
 
instance Monad (K r s) where
  return a = K $ \k -> k a
  K m >>= f = K $ \k -> m $ \a -> runK (f a) k
 

For those of you who have been paying attention to my previous posts, K r s is just a Codensity monad!

 
neighborhood ::
  (forall f. (Applicative f, Monad f) => (Int -> f Bool) -> f Bool) ->
  IntMap Bool
neighborhood phi = snd $ runK (phi beta) (,) IntMap.empty where
  beta n = K $ \k s -> case IntMap.lookup n s of
    Just b -> k b s
    Nothing -> case k True (IntMap.insert n True s) of
      (False, _) -> k False (IntMap.insert n False s)
      r -> r
 

With that we can adapt the final version of Hilbert's epsilon for the Cantor space that Andrej provided to run in pure Haskell.

 
bauer ::
  (forall f. (Applicative f, Monad f) => (Int -> f Bool) -> f Bool) ->
  Cantor
bauer p = \n -> fromMaybe True $ IntMap.lookup (fromIntegral n) m where
  m = neighborhood p
 

With a little work you can implement a version of an exists and forAll predicate on top of that by running them through the identity monad.

 
exists ::
  (forall f. (Applicative f, Monad f) => (Int -> f Bool) -> f Bool) ->
  Bool
forAll ::
  (forall f. (Applicative f, Monad f) => (Int -> f Bool) -> f Bool) ->
  Bool
 

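One way to fill those signatures in, as a sketch: search for a candidate with bauer, then evaluate the proposition on it over the Identity monad (the fromIntegral mediates between the Int-keyed propositions and the Integer-indexed Cantor space):

exists p = runIdentity (p (Identity . bauer p . fromIntegral))
forAll p = not (exists (\k -> fmap not (p k)))
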
I've gone further in playing with this idea, using monad homomorphisms rather than simply relying on the canonical homomorphism from the identity monad. You can get the gist of it here:

https://gist.github.com/1518767

This permits the predicates themselves to embed some limited monadic side-effects, but then you get more extensional vs. intensional issues.

An obvious direction from here is to fiddle with a version of Martin Escardo's search monad that takes advantage of these techniques, but I'll leave the exploration of these ideas to the reader for now and go enjoy Christmas dinner.

Happy Holidays,
Edward Kmett

Free Modules and Functional Linear Functionals
by Edward Kmett, Mon, 11 Jul 2011
http://comonad.com/reader/2011/free-modules-and-functional-linear-functionals/

Today I hope to start a new series of posts exploring constructive abstract algebra in Haskell.

In particular, I want to talk about a novel encoding of linear functionals, polynomials and linear maps in Haskell, but first we're going to have to build up some common terminology.

Having obtained the blessing of Wolfgang Jeltsch, I replaced the algebra package on hackage with something... bigger, although still very much a work in progress.

(Infinite) Modules over Semirings

Recall that a vector space V over a field F is given by an additive Abelian group on V, and a scalar multiplication operator (.*) :: F -> V -> V subject to distributivity laws

 
s .* (u + v) = s .* u + s .* v
(s + t) .* v = s .* v + t .* v
 

and associativity laws

 
   (s * t) .* v = s .* (t .* v)
 

and respect for the unit of the field.

 
   1 .* v = v
 

Since multiplication on a field is commutative, we can also add

 
  (*.) :: V -> F -> V
  v *. f = f .* v
 

with analogous rules.

But when F is only a Ring, we call the analogous structure a module, and in a ring we can't rely on the commutativity of multiplication, so we may have to deal with left-modules and right-modules, where only one of those products is available.

We can weaken the structure still further. If we lose the negation in our Ring and go to a Rig (often called a Semiring), our module is now an additive monoid.

If we get rid of the additive and multiplicative units on our Rig we get down to what some authors call a Ringoid, but which we'll call a Semiring here, because it makes the connection between semiring and semigroup clearer, and the -oid suffix is dangerously overloaded due to category theory.

First we'll define additive semigroups, because I'm going to need both additive and multiplicative monoids over the same types, and Data.Monoid has simultaneously too much and too little structure.

 
-- (a + b) + c = a + (b + c)
class Additive m where
  (+) :: m -> m -> m
  replicate1p :: Whole n => n -> m -> m -- (ignore this for now)
  -- ...
 

their Abelian cousins

 
-- a + b = b + a
class Additive m => Abelian m
 

and Multiplicative semigroups

 
-- (a * b) * c = a * (b * c)
class Multiplicative m where
  (*) :: m -> m -> m
  pow1p :: Whole n => m -> n -> m
  -- ...
 

Then we can define semirings

 
-- a*(b + c) = a*b + a*c
-- (a + b)*c = a*c + b*c
class (Additive m, Abelian m, Multiplicative m) => Semiring m
 

With that we can define modules over a semiring:

 
-- r .* (x + y) = r .* x + r .* y
-- (r + s) .* x = r .* x + s .* x
-- (r * s) .* x = r .* (s .* x)
class (Semiring r, Additive m) => LeftModule r m where
   (.*) :: r -> m -> m
 

and analogously:

 
class (Semiring r, Additive m) => RightModule r m where
   (*.) :: m -> r -> m
 

For instance, every additive semigroup forms a module over the semiring of positive natural numbers (1, 2, ...) using replicate1p.

If we know that our addition forms a monoid, then we can form a module over the naturals as well

 
-- | zero + a = a = a + zero
class (LeftModule Natural m, RightModule Natural m) => AdditiveMonoid m where
   zero :: m
   replicate :: Whole n => n -> m -> m
 

and if our addition forms a group, then we can form a module over the integers

 
-- | a + negate a = zero = negate a + a
class (LeftModule Integer m, RightModule Integer m) => AdditiveGroup m where
  negate :: m -> m
  times :: Integral n => n -> m -> m
  -- ...
 

Free Modules over Semirings

A free module on a set E is a module where the basis vectors are elements of E. Basically it is |E| copies of some (semi)ring.

In Haskell we can represent the free module of a ring directly by defining the action of the (semi)group pointwise.

 
instance Additive m => Additive (e -> m) where
   f + g = \x -> f x + g x
 
instance Abelian m => Abelian (e -> m)
 
instance AdditiveMonoid m => AdditiveMonoid (e -> m) where
   zero = const zero
 
instance AdditiveGroup m => AdditiveGroup (e -> m) where
   f - g = \x -> f x - g x
 

We could define the following

 
instance Semiring r => LeftModule r (e -> r) where
   r .* f = \x -> r * f x
 

but then we'd have trouble dealing with the Natural and Integer constraints above, so instead we lift modules

 
instance LeftModule r m => LeftModule r (e -> m) where
   (.*) m f e = m .* f e
 
instance RightModule r m => RightModule r (e -> m) where
   (*.) f m e = f e *. m
 

We could go one step further and define multiplication pointwise, but while the direct product of |e| copies of a ring _does_ define a ring, and this ring is the one provided by Conal Elliott's vector-space package, it isn't the most general ring we could construct. But we'll need to take a detour first.

Linear Functionals

A linear functional f on a module M is a linear function from M to its scalars R.

That is to say, f : M -> R such that

 
f (a .* x + y) = a * f x + f y
 

Consequently linear functionals also form a module over R. We call this module the dual module M*.

Dan Piponi has blogged about these dual vectors (or covectors) in the context of trace diagrams.

If we limit our discussion to free modules, then M = E -> R, so a linear functional on M looks like (E -> R) -> R, subject to additional linearity constraints on the result arrow.

The main thing we're not allowed to do in our function is apply our function from E -> R to two different E's and then multiply the results together. Our pointwise definitions above satisfy those linearity constraints, but for example:

 
bad f = f 0 * f 0
 

does not.

We could capture this invariant in the type by saying that instead we want

 
newtype LinearM r e =
  LinearM {
    runLinearM :: forall m. LeftModule r m => (e -> m) -> m
  }
 

but we'd have to make a new such type every time we subclassed Semiring. I'll leave further exploration of this more exotic type to another time. (Using some technically illegal module instances we can recover more structure than you'd expect.)

Now we can package up the type of covectors/linear functionals:

 
infixr 0 $*
newtype Linear r a = Linear { ($*) :: (a -> r) -> r }
 

The sufficiently observant may have already noticed that this type is the same as the Cont monad (subject to the linearity restriction on the result arrow).

In fact the Functor, Monad, Applicative instances for Cont all carry over, and preserve linearity.

(We lose callCC, but that is at least partially due to the fact that callCC has a less than ideal type signature.)

In addition we get a number of additional instances for Alternative, MonadPlus, by exploiting the knowledge that r is ring-like:

 
instance AdditiveMonoid r => Alternative (Linear r) where
  Linear f <|> Linear g = Linear (f + g)
  empty = Linear zero
 

Note that the (+) and zero there are the ones defined on functions from our earlier free module construction!

Linear Maps

Since Linear r is a monad, Kleisli (Linear r) forms an Arrow:

 
b -> ((a -> r) ~> r)
 

where the ~> denotes the arrow that is constrained to be linear.

If we swap the order of the arguments so that

 
(a -> r) ~> (b -> r)
 

this arrow has a very nice meaning! (See Numeric.Map.Linear)

 
infixr 0 $#
newtype Map r b a = Map { ($#) :: (a -> r) -> (b -> r) }
 

Map r b a represents the type of linear maps from a -> b. Unfortunately due to contravariance the arguments wind up in the "wrong" order.

 
instance Category (Map r) where
  Map f . Map g = Map (g . f)
  id = Map id
 

So we can see that a linear map from a module A with basis a to a vector space with basis b effectively consists of |b| linear functionals on A.

Map r b a provides a lot of structure. It is a valid instance of an insanely large number of classes.

Vectors and Covectors

In physics, we sometimes call linear functionals covectors or covariant vectors, and if we're feeling particularly loquacious, we'll refer to vectors as contravariant vectors.

This has to do with the fact that when you change basis, you map the change over covariant vectors covariantly, and over contravariant vectors contravariantly. (This distinction is beautifully captured by Einstein's summation notation.)

We also have a notion of covariance and contravariance in computer science!

Functions vary covariantly in their result, and contravariantly in their argument. E -> R is contravariant in E. But we chose this representation for our free modules, so the vectors in our free vector space (or module) are contravariant in E.

 
class Contravariant f where
  contramap :: (a -> b) -> f b -> f a
 
-- | Dual function arrows.
newtype Op a b = Op { getOp :: b -> a } 
 
instance Contravariant (Op a) where
  contramap f g = Op (getOp g . f)
 

On the other hand, (E -> R) ~> R varies covariantly with the change of E, as witnessed by the fact that it is a Functor.

 
instance Functor (Linear r) where
  fmap f m = Linear $ \k -> m $* k . f
 

We have lots of classes for manipulating covariant structures, and most of them apply to both (Linear r) and (Map r b).

Other Representations and Design Trade-offs

One common representation of vectors in a free vector space is as some kind of normalized list of scalars and basis vectors. In particular, David Amos's wonderful HaskellForMaths uses

 
newtype Vect r a = Vect { runVect :: [(r,a)] }
 

for free vector spaces, only considering them up to linearity, paying for normalization as it goes.

Given the insight above we can see that Vect isn't a representation of vectors in the free vector space, but instead represents the covectors of that space, quite simply because Vect r a varies covariantly with change of basis!

Now the price of using the Monad on Vect r is that the monad denormalizes the representation. In particular, you can have multiple copies of the same basis vector, so any function that uses Vect r a has to merge them together.

On the other hand with the directly encoded linear functionals we've described here, we've placed no obligations on the consumer of a linear functional. They can feed the directly encoded linear functional any vector they want!

In fact, it'll even be quite a bit more efficient to compute. To see this, just consider:

 
instance MultiplicativeMonoid r => Monad (Vect r) where
   return a = Vect [(one, a)]
   Vect as >>= f = Vect
       [ (p*q, b) | (p,a) <- as, (q,b) <- runVect (f a) ]
 

Every >>= must pay for multiplication. Every return will multiply the element by one. On the other hand, the price of return and bind in Linear r is function application.

 
instance Monad (Linear r) where
  return a = Linear $ \k -> k a
  m >>= f = Linear $ \k -> m $* \a -> f a $* k
 

A Digression on Free Linear Functionals

To wax categorical for a moment, we can construct a forgetful functor U : Vect_F -> Set that takes a vector space over F to just its set of covectors.

 
F E = (E -> F, \f g x -> f x + g x, \r f x -> r * f x)
 

using the pointwise constructions we built earlier.

Then in a classical setting, you can show that F is left adjoint to U.

In particular the witnesses of this adjunction provide the linear map from (E -> F) to V and the function E -> (V ~> F) giving a linear functional on V for each element of E.

In a classical setting you can go a lot farther, and show that all vector spaces (but not all modules) are free.

But in a constructive setting, such as Haskell, we need a fair bit to go back and forth; in particular we wind up needing E to be finitely enumerable to go one way, and for it to have decidable equality to go the other. The latter is fairly easy to see, because even going from E -> (E -> F) requires that we can define and partially apply something like Kronecker's delta:

 
delta :: (Rig r, Eq e) => e -> e -> r
delta i j
  | i == j    = one
  | otherwise = zero
 

The Price of Power

The price we pay is that, given a Rig, we can go from Vect r a to Linear r a, but going back requires a to be finitely enumerable (or for our functional to satisfy other exotic side-conditions).

 
vectMap :: Rig r => Vect r a -> Linear r a
vectMap (Vect as) = Linear $ \k -> sum [ r * k a | (r, a) <- as ]
 

You can still probe Linear r a for individual coefficients, or pass it a vector for polynomial evaluation very easily, but, for instance, determining the degree of a polynomial efficiently requires attaching more structure to your semiring, because the only value you can get out of Linear r a is an r.
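
For example, probing a coefficient is just a matter of feeding it delta from above, and with a finitely enumerable basis we can even get back to Vect (the names and the Enum/Bounded constraints here are my own illustration):

coefficient :: (Rig r, Eq a) => a -> Linear r a -> r
coefficient a m = m $* delta a

linearVect :: (Rig r, Eq a, Enum a, Bounded a) => Linear r a -> Vect r a
linearVect m = Vect [ (coefficient a m, a) | a <- [minBound .. maxBound] ]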

Optimizing Linear Functionals

In both the Vect r and Linear r cases, excessive use of (>>=) without somehow normalizing or tabulating your data will cause a lot of repeated work.

This is perhaps easiest to see from the fact that Vect r never used the addition of r, so it distributed everything into a kind of disjunctive normal form. Linear r does the same thing.

If you look at the Kleisli arrows of Vect r or Linear r as linear mappings, then you can see that Kleisli composition is going to explode the number of terms.

So how can we collapse back down?

In the Kleisli (Vect r) case we usually build up a map as we walk through the list then spit the list back out in order having added up like terms.

In the Map r case, we can do better. My representable-tries package provides a readily instantiable HasTrie class, and the method:

 
memo :: HasTrie a => (a -> r) -> a -> r
 

which is responsible for providing a memoized version of the function from a -> r in a purely functional way. This is obviously a linear map!

 
memoMap :: HasTrie a => Map r a a
memoMap = Map memo
 

We can also flip memo around and memoize linear functionals.

 
memoLinear :: HasTrie a => a -> Linear r a
memoLinear = Linear . flip memo
 

Next time, (co)associative (co)algebras and the myriad means of multiplying (co)vectors!

Monads from Comonads
by Edward Kmett, Mon, 27 Jun 2011
http://comonad.com/reader/2011/monads-from-comonads/

Today I'll show that you can derive a Monad from any old Comonad you have lying around.

But first, we'll need to take a bit of a detour.

A Monad Sandwich

We'll need the definition of an adjunction on the category of Haskell types, which we can strip down and borrow from my adjunctions package.

 
class (Functor f, Representable u) =>
         Adjunction f u | f -> u, u -> f where
    leftAdjunct :: (f a -> b) -> a -> u b
    rightAdjunct :: (a -> u b) -> f a -> b
 

Here we can define our Adjunction by defining leftAdjunct and rightAdjunct, such that they witness an isomorphism from (f a -> b) to (a -> u b)
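
For instance, the unit and counit of the adjunction fall out directly from these (both are provided by the adjunctions package):

unit :: Adjunction f u => a -> u (f a)
unit = leftAdjunct id

counit :: Adjunction f u => f (u a) -> a
counit = rightAdjunct id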

Every Adjunction F -| G : C -> D gives rise to a monad GF on D and a Comonad FG on C.

In addition to this, you can sandwich an additional monad M on C in between G and F to give a monad GMF on D:

Control.Monad.Trans.Adjoint

and you can sandwich a comonad W on D in between F and G to yield the comonad FWG on C:

Control.Comonad.Trans.Adjoint

A Contravariant Comonad Sandwich

As was first shown to me by Derek Elkins, this construction works even when C is not the category of Haskell types!

Consider the Contravariant functor Op r:

 
newtype Op a b = Op { getOp :: b -> a } 
 
instance Contravariant (Op a) where
  contramap f g = Op (getOp g . f)
 

We can view Op r as a functor from Hask^op -> Hask or as one from Hask -> Hask^op.

We can define a notion of a contravariant adjunction F -| G : Hask^op -> Hask.

Data.Functor.Contravariant.Adjunction

 
class (Contravariant f, Corepresentable g) =>
       Adjunction f g | f -> g, g -> f where
    leftAdjunct :: (b -> f a) -> a -> g b
    rightAdjunct :: (a -> g b) -> b -> f a
 

Where, now, leftAdjunct and rightAdjunct witness the isomorphism from (f a <- b) to (a -> g b), which means once you flip the arrow around both seem to be going the same way. Ultimately any contravariant adjunction on Hask is comprised of two isomorphic functors, each self-adjoint.

This gives rise to one notion of a comonad-to-monad transformer!

Control.Monad.Trans.Contravariant.Adjoint

But can we do better?

An End as the Means

First, some boilerplate.

 
{-# LANGUAGE Rank2Types, FlexibleInstances, FlexibleContexts, MultiParamTypeClasses, UndecidableInstances #-}
 
import Data.Monoid
import Control.Comonad
import Control.Applicative
import Control.Comonad.Store.Class
import Control.Comonad.Env.Class as Env
import Control.Comonad.Traced.Class as Traced
import Control.Monad.Reader.Class
import Control.Monad.Writer.Class
import Control.Monad.State.Class
import Data.Functor.Bind
 

Our new comonad to monad transformer is given by

 
newtype Co w a = Co { runCo :: forall r. w (a -> r) -> r }
 

What we've done is added a quantifier to prevent the use of the type r, as we did when describing Codensity and Ran; categorically we've taken some kind of end. This idea came to me after an observation was made by Russell O'Connor that Cont (Store s) a was pretty close to a continuation passing style version of State s.

Now, we can start spitting out instances for this type.

 
instance Functor w => Functor (Co w) where
   fmap f (Co w) = Co (w . fmap (. f))

instance Comonad w => Monad (Co w) where
   return a = Co (`extract` a)
   Co k >>= f = Co (k . extend (\wa a -> runCo (f a) wa))

instance Comonad w => Applicative (Co w) where
   mf <*> ma = mf >>= \f -> fmap f ma
   pure a = Co (`extract` a)
 

In my break-out of category-extras, I've split off the semigroupoid structure of Kleisli-, co-Kleisli-, and static- arrow composition as Bind, Extend and Apply respectively, so we can make use of slightly less structure and get slightly less structure in turn:

 
instance Extend w => Bind (Co w) where
   Co k >>- f = Co (k . extend (\wa a -> runCo (f a) wa))

instance Extend w => Apply (Co w) where
   mf <.> ma = mf >>- \f -> fmap f ma
 

From comonad-transformers to the mtl

We can look at how this transforms some particular comonads.

The comonadic version of State is Store. Looking at Co (Store s) a

 
Co (Store s) a ~ forall r. ((s -> a -> r, s) -> r)
               ~ forall r. (s -> a -> r) -> s -> r
               ~ forall r. (a -> s -> r) -> s -> r
               ~ Codensity ((->)s) a
               ~ State s a
 

This gives rise to a leap of intuition that we'll motivate further below:

 
instance ComonadStore s m => MonadState s (Co m) where
   get = Co (\w -> extract w (pos w))
   put s = Co (\w -> peek s w ())
 

Sadly this breaks down a little for Writer and Reader as the mtl unfortunately has historically included a bunch of extra baggage in these classes. In particular, in reader, the notion of local isn't always available, blocking some otherwise perfectly good MonadReader instances, and I've chosen not to repeat this mistake in comonad-transformers.

 
instance ComonadEnv e m => MonadReader e (Co m) where
   ask = Co (\w -> extract w (Env.ask w))
   local = error "local"
 

Ideally, local belongs in a subclass of MonadReader.

 
class Monad m => MonadReader e m | m -> e where
   ask :: m e

class MonadReader e m => MonadLocal e m | m -> e where
   local :: (e -> e) -> m a -> m a
 

Similarly there is a lot of baggage in MonadWriter. The Monoid constraint isn't necessary for the class itself, just for most instances, and the listen and pass members should be members of a more restricted subclass as well, to admit some missing MonadWriter instances, but we can at least provide the notion of tell that is critical to Writer.

 
instance (Monoid e, ComonadTraced e m) => MonadWriter e (Co m) where
   tell m = Co (\w -> Traced.trace m w ())
   listen = error "listen"
   pass = error "pass"
 

But given the split-out

 
class Monad m => MonadWriter e m | m -> e where
    tell :: e -> m ()

class MonadWriter e m => MonadListen e m | m -> e where
    listen :: m a -> m (a, e)
    pass :: m (a, e -> e) -> m a
 

We could provide this functionality more robustly. (There is a similar subset of Comonads that can provide listen and pass analogues.)

While I am now the maintainer of the mtl, I can't really justify making the above corrections to the class hierarchy at this time. They would theoretically break a lot of code. I would be curious to see how much code would break in practice though.

Combinators Please!

There is a recurring pattern in the above code, so we can also improve this construction by providing some automatic lifting combinators that take certain co-Kleisli arrows and give us monadic values

 
lift0 :: Comonad w => (forall a. w a -> s) -> Co w s
lift0 f = Co (extract <*> f)
 
lift1 :: (forall a. w a -> a) -> Co w ()
lift1 f = Co (`f` ())
 

along with their inverses

 
lower0 :: Functor w => Co w s -> w a -> s
lower0 (Co f) w = f (id <$ w)
 
lower1 :: Functor w => Co w () -> w a -> a
lower1 (Co f) w = f (fmap const w)
 

(The proofs that these are inverses are quite hairy, and lean heavily on parametricity.)

Then in the above, the code simplifies to:

 
get = lift0 pos
put s = lift1 (peek s)
ask = lift0 Env.ask
tell s = lift1 (Traced.trace s)
 

Co-Density?

Co and Codensity are closely related.

Given any Comonad W, it arises as the composition FG for some adjunction F -| G : Hask -> C.

Considering only the case where C = Hask for now, we can find that

 
Co w a ~ forall r. f (g (a -> r)) -> r
 

Since f -| g, we know that g is Representable by f (), as witnessed by:

 
tabulateAdjunction :: Adjunction f u => (f () -> b) -> u b
tabulateAdjunction f = leftAdjunct f ()
 
indexAdjunction :: Adjunction f u => u b -> f a -> b
indexAdjunction = rightAdjunct . const
 

therefore

 
Co w a ~ f (g (a -> r)) -> r ~ f (f () -> a -> r) -> r
 

Since f is a left adjoint functor, f a ~ (a, f ()) by Sjoerd Visscher's elegant little split combinator:

 
split :: Adjunction f u => f a -> (a, f ())
split = rightAdjunct (flip leftAdjunct () . (,))
 

which has the simple inverse

 
unsplit :: Adjunction f g => a -> f () -> f a
unsplit a = fmap (const a)
 

so we can apply that to our argument:

 
Co w a ~ forall r. f (f () -> a -> r) -> r ~
         forall r. (f () -> a -> r, f ()) -> r
 

and curry to obtain

 
Co w a ~ forall r. (f () -> a -> r) -> f () -> r
 

and swap the arguments

 
Co w a ~ forall r. (a -> f () -> r) -> f () -> r
 

then we can tabulate the two subtypes of the form (f () -> r)

 
Co w a ~ forall r. (a -> g r) -> g r
 

and so we find that

 
Co w a ~ Codensity g a
 

Finally,

 
Codensity g a ~ Ran g g a
 

but we showed back in my second article on Kan extensions that, given f -| g,

 
Ran g g a ~ g (f a)
 

So Co w ~ Co (f . g) ~ (g . f), the monad that arises from composing our adjunction the other way!

Comonads from Monads?

Now, given all this you might ask

Is there is a similar construction that lets you build a comonad out of a monad?

Sadly, it seems the answer in Haskell is no.

Any adjunction from Hask -> Hask^op would require two functions

 
class (Contravariant f, Contravariant g) => DualContravariantAdjunction f g where
    leftAdjunct :: (f a -> b) -> g b -> a
    rightAdjunct :: (g b -> a) -> f a -> b
 

where both functors are contravariant.

Setting aside the intuitionistic impossibility of this: given any such adjunction, there would be a nice coend we could take, letting us sandwich any Monad in the middle as we did above.

There does exist one such very boring Contravariant Functor.

 
newtype Absurd a = Absurd (Absurd a)
 
absurdity :: Absurd a -> b
absurdity (Absurd a) = absurdity a
 
instance Contravariant Absurd where
   contramap f (Absurd as) = Absurd (contramap f as)
 
instance DualContravariantAdjunction Absurd Absurd where
    leftAdjunct _ = absurdity
    rightAdjunct _ = absurdity
 

We can safely sandwich IO within this adjunction from Hask -> Hask^op to obtain a comonad.

 
newtype Silly m a = Silly { runSilly :: Absurd (m (Absurd a)) }
 
instance Monad m => Extend (Silly m) where
    extend f (Silly m) = absurdity m
 
instance Monad m => Comonad (Silly m) where
    extract (Silly m) = absurdity m
 

But for any more interesting such type that actually lets us get at its contents, we would be able to derive a circuitous path to unsafePerformIO!

Since unsafePerformIO should not be constructible without knowing IO specifics, no useful DualContravariantAdjunctions should exist.

Free Monads for Less (Part 2 of 3): Yoneda
by Edward Kmett, Fri, 24 Jun 2011
http://comonad.com/reader/2011/free-monads-for-less-2/

Last time, I started exploring whether or not Codensity was necessary to improve the asymptotic performance of free monads.

This time I'll show that the answer is no; we can get by with something smaller.

The Yoneda Lemma

Another form of right Kan extension arises from the Yoneda lemma.

I covered it briefly in my initial article on Kan extensions, but the inestimable Dan Piponi wrote a much nicer article on how it implies in Haskell that given a Functor instance on f, this type

 
newtype Yoneda f a = Yoneda { runYoneda :: forall r. (a -> r) -> f r }
 

is isomorphic to f a, witnessed by these natural transformations:

 
liftYoneda :: Functor f => f a -> Yoneda f a
liftYoneda a = Yoneda (\f -> fmap f a)
 
lowerYoneda :: Yoneda f a -> f a
lowerYoneda (Yoneda f) = f id
 

That said, you are not limited to applying Yoneda to types that have Functor instances.

This type and these functions are provided by Data.Functor.Yoneda from the kan-extensions package.

Codensity vs. Yoneda

Note that Yoneda f is in some sense smaller than Codensity f: Codensity f a is somewhat 'bigger' than f a (though it provides an embedding), while Yoneda f a is isomorphic to f a.

For example, Codensity ((->) s) a is isomorphic to State s a, not to s -> a as shown by:

 
instance MonadState s (Codensity ((->) s)) where
   get = Codensity (\k s -> k s s)
   put s = Codensity (\k _ -> k () s)
 

Now, Codensity is a particular form of right Kan extension, which always yields a Monad, without needing anything from f.

Here we aren't so fortunate, but we do have the fact that Yoneda f is always a Functor, regardless of what f is, as shown by:

 
instance Functor (Yoneda f) where
  fmap f (Yoneda m) = Yoneda (\k -> m (k . f))
 

which was obtained just by cutting and pasting the appropriate definition from Codensity or ContT, and comes about because Yoneda is a right Kan extension, like all of those.

To get a Monad instance for Yoneda f we need to lean on f somehow.

One way is to just borrow a Monad instance from f, since f a is isomorphic to Yoneda f a, if we have a Functor for f, and if we have a Monad, we can definitely have a Functor.

 
instance Monad m => Monad (Yoneda m) where
  return a = Yoneda (\f -> return (f a))
  Yoneda m >>= k = Yoneda (\f -> m id >>= \a -> runYoneda (k a) f)
 

Map Fusion and Reassociating Binds

Unlike Codensity, the monad instance above isn't very satisfying, because it uses the >>= of the underlying monad, and as a result the >>=s will wind up in the same order they started.

On the other hand, the Functor instance for Yoneda f is still pretty nice because the (a -> r) part of the type acts as an accumulating parameter fusing together uses of fmap.

This is apparent if you expand lowerYoneda . fmap f . fmap g . liftYoneda, whereupon you can see we only call fmap on the underlying Functor once.
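
Writing that expansion out as a quick equational sketch:

lowerYoneda (fmap f (fmap g (liftYoneda as)))
  = lowerYoneda (fmap f (Yoneda (\k -> fmap (k . g) as)))
  = lowerYoneda (Yoneda (\k -> fmap (k . f . g) as))
  = fmap (f . g) as -- a single fmap over the underlying functor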

Intuitively, you can view Yoneda as a type level construction that ensures that you get fmap fusion, while Codensity is a type level construction that ensures that you right associate binds. It is important to note that Codensity also effectively accumulates fmaps, as it uses the same definition for fmap as Yoneda!

With this in mind, it doesn't usually make much sense to use Codensity (Codensity m) or Yoneda (Yoneda m) because the purpose being served is redundant.

Less obviously, Codensity (Yoneda m) is also redundant, because as noted above, Codensity also does fmap accumulation.

Other Yoneda-transformed Monads

Now, I said one way to define a Monad for Yoneda f was to borrow an underlying Monad instance for f, but this isn't the only way.

Consider Yoneda Endo. Recall that Endo from Data.Monoid is given by

 
newtype Endo a = Endo { appEndo :: a -> a }
 

Clearly Endo is not a Monad; it can't even be a Functor, because a occurs in both positive and negative position.

Nevertheless Yoneda Endo can be made into a monad -- the continuation passing style version of the Maybe monad!

 
newtype YMaybe a = YMaybe (forall r. (a -> r) -> r -> r)
 

I leave the rather straightforward derivation of this Monad for the reader. A version of it is present in monad-ran.
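
For the curious, one such derivation looks like this (a sketch, pattern matching in place of a record accessor):

instance Monad YMaybe where
  return a = YMaybe (\just _ -> just a)
  YMaybe m >>= f = YMaybe (\just nothing ->
    m (\a -> case f a of YMaybe n -> n just nothing) nothing)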

This lack of care for capital-F Functoriality also holds for Codensity: Codensity Endo can be used as a two-continuation list monad. It is isomorphic to the non-transformer version of Oleg et al.'s LogicT, which is available on hackage as logict from my coworker, Dan Doel.

The Functor, Applicative, Monad, MonadPlus and many other instances for LogicT can be rederived in their full glory from Codensity (GEndo m) automatically, where

 
newtype GEndo m r = GEndo (m r -> m r)
 

without any need for conscious thought about how the continuations are plumbed through in the Monad.

Bananas in Space

One last digression,

 
newtype Rec f r = Rec ((f r -> r) -> r)
 

came up once previously on this blog in Rotten Bananas. In that post, I talked about how Fegaras and Sheard used a free monad (somewhat obliquely) in "Revisiting catamorphisms over datatypes with embedded functions" to extend catamorphisms to deal with strong HOAS, and then talked further about how Stephanie Weirich and Geoffrey Washburn used Rec to replace the free monad used by Fegaras and Sheard. That said, they did so in a more restricted context, where any mapping was done by giving us both an embedding and a projection pair.

Going to Church

We can't just use Rec f a instead of Free f a here, because Free f a is a functor, while Rec f a is emphatically not.

However, if we apply Yoneda to Rec f, we obtain a Church-encoded continuation-passing-style version of Free!

 
newtype F f a = F { runF :: forall r. (a -> r) -> (f r -> r) -> r }
 

Since this is of the form of Yoneda (Rec f), it is clearly a Functor:

 
instance Functor (F f) where
   fmap f (F g) = F (\kp -> g (kp . f))
 

And nicely, without knowing anything about f, we also get a Monad!

 
instance Monad (F f) where
   return a = F (\kp _ -> kp a)
   F m >>= f = F (\kp kf -> m (\a -> runF (f a) kp kf) kf)
 

But when we >>= all we do is change the continuation for (a -> r), leaving the f-algebra, (f r -> r), untouched.

Now, F is a monad transformer:

 
instance MonadTrans F where
   lift f = F (\kp kf -> kf (liftM kp f))
 

which, unsurprisingly, effectively performs the same operation as lifting did in Free.

Heretofore, we've ignored everything about f entirely.

This has pushed the need for the Functor on f into the wrapping operation:

 
instance Functor f => MonadFree f (F f) where
   wrap f = F (\kp kf -> kf (fmap (\ (F m) -> m kp kf) f))
 

Now, we can clearly transform from our representation to any other free monad representation:

 
fromF :: MonadFree f m => F f a -> m a
fromF (F m) = m return wrap
 

or to it from our original canonical ADT-based free monad representation:

 
toF :: Functor f => Free f a -> F f a
toF xs = F (\kp kf -> go kp kf xs) where
  go kp _  (Pure a) = kp a
  go kp kf (Free fma) = kf (fmap (go kp kf) fma)
 

So, F f a is isomorphic to Free f a.

So, viewed as Codensity (Yoneda (Rec f)) a, the type Codensity (F f) a just seems silly: we are layering one fmap-fusing construction inside another.

As we mentioned before, we should be able to go from Codensity (Yoneda (Rec f)) a to Codensity (Rec f) a, since Yoneda was just fusing uses of fmap, while Codensity was fusing fmap while right-associating (>>=)'s.

Swallowing the Bigger Fish

So, the obvious choice is to try to optimize to Codensity (Rec f) a. If you go through the motions of encoding that, you get:

 
newtype CF f a = CF (forall r. (a -> (f r -> r) -> r) -> (f r -> r) -> r)
 

which is in some sense larger than F f a, because the first continuation gets both an a and an f-algebra (f r -> r).

But tellingly, once you write the code, the first continuation never uses the extra f-algebra you supplied it!
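
If you do go through those motions (a sketch; I've restated the newtype with a runCF selector of my own), you can watch the f-algebra being threaded through untouched -- written eta-reduced, it never even has to be mentioned:

 
{-# LANGUAGE RankNTypes #-}

newtype CF f a = CF
   { runCF :: forall r. (a -> (f r -> r) -> r) -> (f r -> r) -> r }

instance Functor (CF f) where
   fmap f (CF m) = CF (\kp -> m (kp . f))

instance Applicative (CF f) where
   pure a = CF (\kp -> kp a)
   CF mf <*> CF ma = CF (\kp -> mf (\f -> ma (kp . f)))

instance Monad (CF f) where
   CF m >>= f = CF (\kp -> m (\a -> runCF (f a) kp))
 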

So Codensity (Yoneda (Rec f)) a gives us nothing of interest that we don't already have in Yoneda (Rec f) a.

Consequently, in this special case, rather than letting Codensity (Yoneda x) a swallow the Yoneda to get Codensity x a, we can let the Yoneda swallow the surrounding Codensity, obtaining Yoneda (Rec f) a, the representation we started with.

Scott Free

Finally, you might ask if a Church encoding is as simple as we could go. After all, a Scott encoding

 
newtype ScottFree f a = ScottFree
    { runScottFree :: forall r.
       (a -> r) -> (f (ScottFree f a) -> r) -> r
    }
 

would admit easier pattern matching, and a nice pun, and seems somewhat conceptually simpler, while remaining isomorphic.

But the Monad instance:

 
instance Functor f => Monad (ScottFree f) where
   return a = ScottFree (\kp _ -> kp a)
   ScottFree m >>= f = ScottFree
       (\kb kf -> m (\a -> runScottFree (f a) kb kf) (kf . fmap (>>= f)))
 

needs to rely on the underlying bind: note the recursive (>>= f) tucked under fmap. Every bind therefore walks the existing structure, so left-nested binds retraverse it again and again, and we lose exactly the reassociation benefit we were after.

So, alas, we cannot get away with ScottFree.

Nobody Sells for Less

So, now we can rebuild Voigtländer's improve using our Church-encoded / Yoneda-based free monad F, which is precisely isomorphic to Free, by using

 
lowerF :: F f a -> Free f a
lowerF (F f) = f Pure Free
 

to obtain

 
improve :: Functor f => (forall m. MonadFree f m => m a) -> Free f a
improve m = lowerF m
 

And since our Church-encoded free monad is isomorphic to the simple ADT encoding, our new solution is as small as it can get.
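
To see the payoff, here is a toy example of my own (assuming the definitions above): a functor with a single recursive position, so that n left-nested binds cost O(n^2) against Free directly, but only O(n) once routed through improve.

 
{-# LANGUAGE DeriveFunctor #-}

data Step a = Step a deriving Functor

-- a chain of n Steps, built with left-nested binds
chain :: MonadFree Step m => Int -> m ()
chain 0 = return ()
chain n = chain (n - 1) >>= \_ -> wrap (Step (return ()))

slowChain :: Int -> Free Step ()   -- quadratic: every bind retraverses the chain
slowChain = chain

fastChain :: Int -> Free Step ()   -- linear: the binds vanish into the continuation
fastChain n = improve (chain n)
 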

Next time, we'll see this construction in action!

Reverse-Mode Automatic Differentiation in Haskell http://comonad.com/reader/2010/reverse-mode-automatic-differentiation-in-haskell/ http://comonad.com/reader/2010/reverse-mode-automatic-differentiation-in-haskell/#comments Sun, 16 May 2010 04:27:12 +0000 Edward Kmett http://comonad.com/reader/?p=183 I've uploaded a package named rad to Hackage for handling reverse-mode automatic differentiation in Haskell.

Internally, it leverages a trick from Andy Gill's Kansas Lava to observe sharing in the tape it records for back propagation purposes, and uses type level branding to avoid confusing sensitivities.
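
To give a flavor of what reverse mode means, here is a minimal toy of my own -- emphatically not the rad API. Each node carries its value and a pullback that routes the result's sensitivity back to the lone input variable. Unlike rad, this toy records no tape and observes no sharing, so shared subexpressions get their pullbacks recomputed; that duplication is precisely what the Kansas Lava trick avoids.

 
-- a value together with its pullback: given the sensitivity of the
-- final result with respect to this node, return the sensitivity
-- with respect to the single input variable
data R = R { primal :: Double, pullback :: Double -> Double }

constR :: Double -> R
constR c = R c (const 0)

instance Num R where
   fromInteger = constR . fromInteger
   R x dx + R y dy = R (x + y) (\s -> dx s + dy s)
   R x dx - R y dy = R (x - y) (\s -> dx s - dy s)
   R x dx * R y dy = R (x * y) (\s -> dx (s * y) + dy (s * x))
   negate (R x dx) = R (negate x) (dx . negate)
   abs (R x dx)    = R (abs x) (\s -> dx (s * signum x))
   signum (R x _)  = constR (signum x)

-- differentiate f at x by seeding the output sensitivity with 1
diff :: (R -> R) -> Double -> Double
diff f x = pullback (f (R x id)) 1

-- diff (\x -> x * x + 3 * x) 2 == 7
 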

I've tried to keep the API relatively close to that of Barak Pearlmutter and Jeffrey Mark Siskind's fad package, but I couldn't resist making a couple of minor tweaks here and there for generality.

I still need to go through and finish up the remaining unimplemented fad combinators, figure out a nice way to build a reverse-mode AD tower, validate that I didn't screw up my recollection of basic calculus, and provide a nice API for using this approach to get local reverse mode checkpoints in an otherwise forward mode AD program, but I am quite happy with how things have progressed thus far.

[Edit: I've uploaded minor bug fixes for exp and (**)]

Remodeling Precision http://comonad.com/reader/2009/remodeling-precision/ http://comonad.com/reader/2009/remodeling-precision/#comments Tue, 15 Sep 2009 18:56:05 +0000 Edward Kmett http://comonad.com/reader/?p=151 Two concepts come up in most standard documentation on information retrieval: Precision and Recall. Precision is a measure that tells you whether your result set contains only results relevant to the query; recall tells you whether your result set contains everything that is relevant to the query.

The formula for classical precision is:

$$\mathrm{Precision} = \frac{|\{\text{relevant}\} \cap \{\text{retrieved}\}|}{|\{\text{retrieved}\}|}$$

However, I would argue that the classical notion of Precision is flawed, in that it doesn't model anything we tend to care about. Rarely are we interested in binary classification; instead we want a ranked classification of relevance.

When Google tells you that you have a million results, do you care? No, you skim the first few entries for what it is that you are looking for, unless you are particularly desperate for an answer. So really, you want a metric that models the actual behavior of a search engine user and that level of desperation.

There are two issues with classical precision:

  1. the denominator of precision goes to infinity as the result set increases in size
  2. each result is worth the same amount no matter where it appears in the list

The former ensures that a million answers drown out any value from the first screen; the latter ensures that it doesn't matter which results are on the first screen. A more accurate notion of precision suitable for modern search interfaces should model the prioritization of the results, and should tolerate a long tail of crap as long as the results people will actually look at are accurate overall.

So how do we model user behavior? We can replace the denominator with a partial sum of a geometric series for a probability p < 1, where p models the chance that a user will continue to browse to the next item in the list. Then you can scale the value of the nth summand in the numerator as being worth up to p^n. If you have a ranked training set, it is pretty easy to score precision in this fashion.
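
In symbols (my notation, not the original image), with r_n in [0,1] grading the relevance of the nth result, counting from 0:

$$\mathrm{Precision}_p = \frac{\sum_{n=0}^{N-1} r_n\, p^n}{\sum_{n=0}^{N-1} p^n}, \qquad \sum_{n=0}^{N-1} p^n = \frac{1-p^N}{1-p} \longrightarrow \frac{1}{1-p} \text{ as } N \to \infty.$$

For instance, with p = 1/2 and an unboundedly long result list, the denominator tends to 2, so a list whose first two results are relevant and whose tail is junk scores (1 + 1/2)/2 = 75%, while the same two relevant results demoted to positions 3 and 4 score only (1/4 + 1/8)/2 = 18.75%.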

You retain all of the desirable properties of precision: it maxes out at 100% and decreases when you give irrelevant results, but now it also penalizes returning irrelevant results early in your result list.

The result more accurately models user behavior when faced with a search engine than the classical binary precision metric does. The parameter p models the desperation of the user and can vary to fit your problem domain. I personally like p = 50%, because it makes for nice numbers, but it should probably be chosen by sampling, informed by knowledge of the search domain.

You can of course embellish this model with a stair-step in the cost function on each page boundary, etc. — any monotone decreasing infinite series that sums to a finite number in the limit should do.

A similar modification can of course be applied to recall.

I used this approach a couple of years ago to help tune a search engine to good effect. I went to refer someone to this post today and I realized I hadn't posted it in the almost two years since it was written, so here it is, warts and all.

If anyone is familiar with similar approaches in the literature, I'd be grateful for references!

Iteratees, Parsec and Monoids (Slides) http://comonad.com/reader/2009/iteratees-parsec-and-monoid/ http://comonad.com/reader/2009/iteratees-parsec-and-monoid/#comments Thu, 20 Aug 2009 16:55:03 +0000 Edward Kmett http://comonad.com/reader/?p=122
I was asked to give two talks at the Boston Area Haskell User Group this past Tuesday. The first was pitched at a more introductory level and the second was to go deeper into what I have been using monoids for lately.

The first talk covers an introduction to the mathematical notion of a monoid, introduces some of the features of my Haskell monoids library on hackage, and starts to motivate the use of monoidal parallel/incremental parsing and a modified use of compression algorithms to recycle monoidal results.

The second talk covers a way to generate a locally context-sensitive parallel/incremental parser by modifying Iteratees to enable them to drive a Parsec 3 lexer, and then wrapping that in a monoid based on error productions in the grammar, before recycling these techniques at a higher level to deal with parsing seemingly stateful structures, such as Haskell layout.

  1. Introduction To Monoids (PDF)
  2. Iteratees, Parsec and Monoids: A Parsing Trifecta (PDF)

Due to a late start, I was unable to give the second talk. However, I did give a quick run-through to a few die-hards who stayed late and came to the Cambridge Brewing Company afterwards. As I promised some people that I would post the slides after the talk, here they are.

The current plan is to possibly give the second talk in full at either the September or October Boston Haskell User Group session, depending on scheduling and availability.

[ Iteratee.hs ]

Slides from Hac Phi: “All About Monoids” http://comonad.com/reader/2009/hac-phi-slides/ http://comonad.com/reader/2009/hac-phi-slides/#comments Fri, 31 Jul 2009 15:41:16 +0000 Edward Kmett http://comonad.com/reader/?p=85 Some people have requested my slides from the short talk I gave about monoids and monoidal parsing at Hac Phi. So, here they are.

There will be more to come at the next Boston Haskell User Group in August, where it looks like I'll be giving two short talks covering monoids. I may use the monoidal parsing engine from Kata as an example for the advanced talk if I have time, and will start to cover parsing larger classes of grammars in general (regular languages, CFGs/TIGs, TAGs, PEGs, LALR, attribute grammars, etc.)
