Tue 13 May 2008
Towards Formal Power Series for Functors
Posted by Edward Kmett under Category Theory , Haskell , Mathematics , Type Theory[12] Comments
The post below will only compile on a version of GHC >= 6.9, since it uses type families.
There has been a lot of posting recently about automatic differentiation in Haskell, and I wanted to try the same thing with functors in the spirit of Conor McBride's Clowns to the Left of me, Jokers to the Right and The derivative of a regular type is its type of one hole contexts, figuring that a Power Series could fully generalize Christophe Poucet's Higher Order Zippers, and might provide me with a neat extension to the zipper comonadic automata I've been aluding to recently.
{-# OPTIONS -fglasgow-exts -fallow-undecidable-instances -fallow-overlapping-instances #-} module Derivatives where import Control.Monad.Identity import Control.Arrow ((+++),(***),(&&&)) import Data.Monoid infixl 9 :.: infixl 7 :*: infixl 6 :+:
To avoid importing category-extras and keep this post self-contained (modulo GHC 6.9!), we'll define some preliminaries such as Bifunctors:
class Bifunctor f where bimap :: (a -> c) -> (b -> d) -> f a b -> f c d instance Bifunctor (,) where bimap f g ~(a,b) = (f a, g b) instance Bifunctor Either where bimap f _ (Left a) = Left (f a) bimap _ g (Right b) = Right (g b)
Constant functors:
data Void instance Show Void where show _ = "Void" newtype Const k a = Const { runConst :: k } deriving (Show) type Zero = Const Void type One = Const () instance Functor (Const k) where fmap f = Const . runConst
and functor products and coproducts:
newtype Lift p f g a = Lift { runLift :: p (f a) (g a) } type (:+:) = Lift Either type (:*:) = Lift (,) instance Show (p (f a) (g a)) => Show (Lift p f g a) where show (Lift x) = "(Lift (" ++ show x ++ "))" instance (Bifunctor p, Functor f, Functor g) => Functor (Lift p f g) where fmap f = Lift . bimap (fmap f) (fmap f) . runLift
and finally functor composition
newtype (f :.: g) a = Comp { runComp :: f (g a) } deriving (Show) instance (Functor f, Functor g) => Functor (f :.: g) where fmap f = Comp . fmap (fmap f) . runComp
So then, an ideal type for repeated differentiation would look something like the following, for some definition of D.
[Edit: sigfpe pointed out, quite rightly, that this is just repeated differentiation, and apfelmus pointed out that it not a power series, because I have no division!]
newtype AD f a = AD { runAD :: (f a, AD (D f) a) }
As a first crack at D, you might be tempted to just go with a type family:
{-
type family D (f :: * -> *) :: * -> *
type instance D Identity = One
type instance D (Const k) = Zero
type instance D (f :+: g) = D f :+: D g
type instance D (f :*: g) = f :*: D g :+: D f :*: g
type instance D (f :.: g) = (D f :.: g) :*: D g
-}
This could take you pretty far, but unfortunately doesn't adequately provide you with any constraints on the type so that we can treat AD f as a functor.
So, we'll go with:
class (Functor (D f), Functor f) => Derivable (f :: * -> *) where type D f :: * -> *
and cherry pick the instances necessary to handle the above cases:
instance Derivable Identity where type D Identity = One instance Derivable (Const k) where type D (Const k) = Zero instance (Derivable f, Derivable g) => Derivable (f :+: g) where type D (f :+: g) = D f :+: D g instance (Derivable f, Derivable g) => Derivable (f :*: g) where type D (f :*: g) = f :*: D g :+: D f :*: g instance (Derivable f, Derivable g) => Derivable (f :.: g) where type D (f :.: g) = (D f :.: g) :*: D g
With those instances in hand, we can define the definition of a Functor for the automatic differentiation of a Functor built out of these primitives:
instance (Derivable f, Functor (AD (D f))) => Functor (AD f) where fmap f = Power . bimap (fmap f) (fmap f) . runPower
Unfortunately, here is where I run out of steam, because any attempt to actually use the construct in question blows the context stack because the recursion for Functor (AD f) isn't well founded and my attempts to force it to be so through overlapping-instances have thus-far failed.
Thoughts?
May 13th, 2008 at 7:38 am
I don’t see why you’d need overlapping or undecidable instances, all your class instances look like H98 (modulo the type family).
There seems to be a problem with the power series constructor, I fail to see how it’s different from
newtype Power _ a = P (a, Power _ a)
i.e. an infinite stream of a . The instance then simply becomes
instance Functor (Power f) where
fmap f = Power . bimap f (fmap f) . runPower
This is probably not what you want. In the case you want, the real obstruction is that for power series, the functor in question must be differentiable an infinite number of times. In other words, something like
instance Differentiable f => Differentiable (D f) where …
Then, a context
instance Differentiable f => Functor (Power f)
should be enough.
May 13th, 2008 at 10:04 am
Good point. OTOH:
instance Differentiable f => Differentiable (D f)
doesn’t work because D isn’t a type constructor for the compiler to grab onto, which is why I started beating my head against the other part.
What i was looking for was actually to say:
newtype Power f a = Power (f a, Power (D f) a)
instead of
newtype Power f a = Power (a, Power (D f) a)
[Edit: fixed]
May 13th, 2008 at 11:08 am
Yeah, just
instance Differentiable f => Differentiable (D f)
doesn’t work.
And concerning the power series, I think what you really want is
newtype Power f a = Power (f () :+: (a :*: Power (D f) a))
but that’s wrong too, because you would have to divide by $latex \frac1{n!}$ somewhere. Power series are different from (but related to) automatic differentiation! Most likely, the impossible division can be included by using bags of n elements instead of powers
a :*: a :*: ...
(= ordered pairs) of n elements. In any case, the functor instance for power series is no problem, since a is used explicitly.Not sure in the case of automatic differentiation; the main problem here is indeed to somehow tell the compiler that
instance Functor f => Functor (D f)
But since type families are open, I guess that’s not possible. I mean, a “mischievous” person could add a new case for this type family which violates this condition. So, the problem is not that the compiler doesn’t grok it, the problem is that there is a good reason that the compiler doesn’t accept it.
May 13th, 2008 at 11:32 am
True enough. I just started down this path because I was thinking in terms of inductive families of types, and realized with open type functions that you could put a more interesting operation in than just a pair, etc.
And yeah, without a notion of type division (maybe using species?), I don’t see how to get from this form of AD to a power series.
May 13th, 2008 at 11:43 am
Oh, maybe a more interesting instance for this would probably be the notion of a derivative for a list since that doesn’t zero out:
instance Derivable [] where
type D [] = [] :*: []
I also would like to find out a way to include a simplification step in the type function to go through and do the Zero :*: x = Zero step, etc to make the encoding a little less naive.
May 13th, 2008 at 3:21 pm
Although the context on Functor (AD f) is sufficient, it seems (to me) like the wrong place to specify it. Since
AD f a ~~ (f a, AD (D f) a)
With (Derivable f) => Functor (AD f) you only need (Derivable (D f)) to justify using bimap (fmap f) (fmap f). So, it seems like what you’d really want is:
class (Functor f, Derivable (D f)) => Derivable f where
type D f :: * -> *
instance (Derivable f) => Functor (AD f) where …
Where the context on Derivable enforces that all such functors are arbitrarily differentiable. But, of course, GHC complains about a cycle there (in 6.8, at least. I don’t have 6.9 handy).
Simplification should, I think, be possible with total type families once they go into 6.9. For instance:
type family (f :: * -> *) :+: (g :: * -> *) :: * -> * where
Zero :+: g = g
f :+: Zero = f
f :+: g = Lift Either f g
You’re allowed to overlap instances for total type families, because they’re closed.
May 13th, 2008 at 3:30 pm
Concerning the division, we can appeal to the special case
a^n / n! = Bag n a
i.e. the multiset (bag) with n elements of type a. So, the power series expansion for a data type just counts how many different sets of elements it may contain. So,
a :*: a
contains two bags of two a (one for each permutation in the tuple).Btw, as the differential equation for the list type suggests, we have the power series expansion
[a] = 1 + a + a^2 + a^3 + ...
which “is” 1/(1-a).
May 13th, 2008 at 7:10 pm
I’m confused. This doesn’t look like AD to me. The rules you give are the standard Leibniz rules for symbolic differentiation and then I think you’re trying to use those to get a (compile time) list of all derivatives. This is a really neat thing to attempt (though I have a hunch it’s not possible in Haskell) but it seems quite different to AD to me. Anyway, I suspect a language like Omega may be better suited for what you are doing.
May 13th, 2008 at 7:14 pm
Oh yeah, I almost forgot. Check out Conor McBride et al’s handling of derivatives here. It includes the handling of fixed points.
May 13th, 2008 at 8:04 pm
Yeah, mostly i was just trying to see what an inductive family of types built over an open type function would look like.
I left the fixed point cases off because I was still trying to get the finite cases to terminate and once I ran up against the wall of type-ability I didn’t check that what I had so far was what I thought I had so far.
It has been a while since I did anything with classical analysis. Hrmm. Back to the drawing board I guess until I figure out how to do what I really want.
May 13th, 2008 at 8:41 pm
Hrmm,
So a simple encoding of the basic AD rules leaves your Dual class, just one level up.
data D f g a = D (f a) (g a)
type family (:+>) (f :: * -> *) (g :: * -> *) :: * -> *
type family (:*>) (f :: * -> *) (g :: * -> *) :: * -> *
type family DLift (f :: (* -> *) -> * -> *) (f’ :: (* -> *) -> * -> *) (g :: * -> *) :: * -> *
type instance (D x a :+> D y b) = D (x :+: y) (a :+: b)
type instance (D x a :*> D y b) = D (x :*: y) (x :*: b :+: a :*: y)
type instance DLift f f’ (D x a) = D (f x) (a :+: f’ x)
dConst :: a -> D Identity Zero a
dConst z = D (Identity z) zero
dVar :: a -> D Identity One a
dVar z = D (Identity z) (suck zero)
I think the general AD case requires polymorphic kinds, so yeah something like Omega is probably better suited. Alas.
May 14th, 2008 at 2:45 am
[...] Ok, I decided to take a step back from my flawed approach in the last post and play with the idea of power series of functors from a different perspective. [...]