Sun 25 Dec 2011
Searching Infinity Parametrically
Posted by Edward Kmett under Algorithms , Haskell , Kan Extensions , Logic , Mathematics , Monads , Type Hackery[11] Comments
Andrej Bauer recently gave a really nice talk on how you can exploit side-effects to make a faster version of Martin Escardo's pseudo-paradoxical combinators.
A video of his talk is available over on his blog, and his presentation is remarkably clear, and would serve as a good preamble to the code I'm going to present below.
Andrej gave a related invited talk back at MSFP 2008 in Iceland, and afterwards over lunch I cornered him (with Dan Piponi) and explained how you could use parametricity to close over the side-effects of monads (or arrows, etc) but I think that trick was lost in the chaos of the weekend, so I've chosen to resurrect it here, and improve it to handle some of his more recent performance enhancements, and show that you don't need side-effects to speed up the search after all!
First, we'll need to import a few things:
{-# LANGUAGE RankNTypes #-} import Data.Maybe (fromMaybe) import Control.Applicative import Data.IntMap (IntMap) import qualified Data.IntMap as IntMap import Control.Monad import Control.Monad.Trans.Class import Data.Functor.Identity
What are looking for is an implementation of Hilbert's epsilon.
This is a formal mechanism for eliminating existentials over some non-empty set X by defining a function
ε: (X -> Prop) -> X
such that if there exists an x in X such that p(X) holds then p(ε(p)) holds.
As noted by Andrej, we could reify this constructively as a function "epsilon :: (X -> Bool) -> X" for some X.
Now, for some sets, Hilbert's epsilon is really easy to define. If X is a finite set, you can just exhaustively enumerate all of the options returning a member of X such that the property holds if any of them do, otherwise since X is non-empty, just return one of the elements that you tested.
This would be a pretty boring article and I'd be back to eating Christmas dinner with my family if that was all there was to it. However, certain infinite spaces can also be searched.
Last year, Luke Palmer wrote a post on "Searchable Data Types" that might also serve as a good introduction. In that article he led off with the easiest infinite space to search, the lazy naturals, or the 'one point compactification of the naturals'. That is to say the natural numbers extended with infinity.
data LazyNat = Zero | Succ LazyNat
infinity :: LazyNat
infinity = Succ infinity
Now we can implement Palmer's epsilon (called lyingSearch
in his article).
palmer :: (LazyNat -> Bool) -> LazyNat palmer p | p Zero = Zero | otherwise = Succ $ palmer $ p . Succ
The trick to making this work is that we place a requirement that the predicate that you pass has to terminate in a bounded amount of time no matter what input you give it, and since we're working with the naturals extended with infinity
, if no natural satisfies the predicate, we'll just keep returning a longer and longer chain of Succ
's, effectively yielding infinity
.
To check to see if the returned number satisfies the predicate you can always use p (palmer p)
. The predicate is required to terminate in finite time, even when given infinity, so this will yield a Bool and not bottom out unless the user supplied predicate takes an unbounded amount of time.
I posted a reply to Luke's article when it came up on reddit which included a Hinze-style generic implementation of his lyingSearch
predicate, which you can see now is just Hilbert's epsilon for arbitrary recursive polynomial data types.
http://www.reddit.com/r/haskell/comments/e7nij/searchable_data_types/c15zs6l
Another space we can search is the Cantor space 2^N.
type Cantor = Integer -> Bool
With that we jump clear from countable infinity to uncountable infinity, but it can still be searched in finite time!
This is the space we'll be paying attention to for the rest of this article.
First we'll define how to "book a room in Hilbert's Hotel."
infixr 0 # (#) :: Bool -> Cantor -> Cantor (x # a) 0 = x (x # a) i = a (i - 1)
Then this can be used to obtain the following implementation of Hilbert's epsilon for the Cantor space, attributed by Andrej to Ulrich Berger.
berger :: (Cantor -> Bool) -> Cantor berger p = if ex $ \a -> p $ False # a then False # berger $ \a -> p $ False # a else True # berger $ \a -> p $ True # a where ex q = q (berger q)
This version is particularly close in structure to the one for searching the LazyNats, but it is dreadfully slow!
It would be nice to be able to search the space faster and that is just what Martin Escardo's improved version does, through a more sophisticated divide and conquer technique.
escardo :: (Cantor -> Bool) -> Cantor escardo p = go x l r where go x l r n = case divMod n 2 of (0, 0) -> x (q, 1) -> l q (q, 0) -> r $ q-1 x = ex $ \l -> ex $ \r -> p $ go True l r l = escardo $ \l -> ex $ \r -> p $ go x l r r = escardo $ \r -> p $ go x l r ex q = q (escardo q)
To proceed from here I'll need a State monad:
newtype S s a = S { runS :: s -> (a, s) } instance Functor (S s) where fmap f (S m) = S $ \s -> case m s of (a, s') -> (f a, s') instance Applicative (S s) where pure = return (< *>) = ap instance Monad (S m) where return a = S $ \s -> (a, s) S m >>= k = S $ \s -> case m s of (a, s') -> runS (k a) s'
And now we've reached the point. From here, Andrej's pure code ends, and his side-effecting ocaml and custom programming language start. The first thing he does is compute the modulus of continuity by using a side-effect that writes to a reference cell which he very carefully ensures doesn't leak out of scope, so he doesn't have to concern himself with the proposition code editing the value of the reference.
let mu f a = let r = ref 0 in let b n = (r := max n ! r; a n) in ignore (f b); !r
To obtain the same effect we'll instead make a predicate using the state monad to model the single reference cell.
-- bad
modulus :: (Num b, Ord s) =>
((s -> S s a) -> S b c) -> (s -> a) -> b
We can mash b and s together, and try to make the ordering and number agree by claiming that it is instead Real and we'd get the slightly more reasonable looking type:
-- still bad modulus :: Real a => ((a -> S n b) -> S n c) -> (a -> b) -> a
In the imperative code, lexical scoping had ensured that no other code could edit the reference cell, but with this type we don't have that. The predicate is allowed to use arbitrary state actions to muck with the modulus of convergence even though the only thing that should be editing it is the wrapper beta that we placed around alpha.
But how can we ensure that the end user couldn't gain access to any of the additional functionality from the monad? Parametricity!
-- getting better modulus :: Real a => (forall f. Monad f => (a -> f b) -> f c) -> (a -> b) -> a
Here the only thing you are allowed to assume about f is that it forms a monad. This gives you access to return and >>=, but the predicate can't do anything interesting with them. All it can do is work with what is effectively the identity monad, since it knows no additional properties!
We can have mercy on the end user and give them a little bit more syntactic sugar, since it doesn't cost us anything to let them also have access to the Applicative instance.
-- good modulus :: Real a => (forall f. (Monad f, Applicative f) => (a -> f b) -> f c) -> (a -> b) -> a
With that we can show Andrej's version of the modulus of convergence calculation does not need side-effects!
> > modulus (\a -> a 10 >>= a) (\n -> n * n) 100
Admittedly plumbing around the monadic values in our proposition is a bit inconvenient.
His next example was written in a custom ocaml-like programming language. For translating his effect type into Haskell using parametricity, we'll need a CPS'd state monad, so we can retry from the current continuation while we track a map of assigned values.
newtype K r s a = K { runK :: (a -> s -> r) -> s -> r } instance Functor (K r s) where fmap = liftM instance Applicative (K r s) where pure = return (< *>) = ap instance Monad (K r s) where return a = K $ \k -> k a K m >>= f = K $ \k -> m $ \a -> runK (f a) k
For those of you who have been paying attention to my previous posts, K r s
is just a Codensity
monad!
neighborhood :: (forall f. (Applicative f, Monad f) => (Int -> f Bool) -> f Bool) -> IntMap Bool neighborhood phi = snd $ runK (phi beta) (,) IntMap.empty where beta n = K $ \k s -> case IntMap.lookup n s of Just b -> k b s Nothing -> case k True (IntMap.insert n True s) of (False, _) -> k False (IntMap.insert n False s) r -> r
With that we can adapt the final version of Hilbert's epsilon for the Cantor space that Andrej provided to run in pure Haskell.
bauer :: (forall f. (Applicative f, Monad f) => (Int -> f Bool) -> f Bool) -> Cantor bauer p = \n -> fromMaybe True $ IntMap.lookup n m where m = neighborhood p
With a little work you can implement a version of an exists and forAll predicate on top of that by running them through the identity monad.
exists :: (forall f. (Applicative f, Monad f) => (Int -> f Bool) -> f Bool) -> Bool forAll :: (forall f. (Applicative f, Monad f) => (Int -> f Bool) -> f Bool) -> Bool
I've gone further in playing with this idea, using monad homomorphisms rather than simply relying on the canonical homomorphism from the identity monad. You can get the gist of it here:
https://gist.github.com/1518767
This permits the predicates themselves to embed some limited monadic side-effects, but then you get more extensional vs. intensional issues.
An obvious direction from here is to fiddle with a version of Martin Escardo's search monad that takes advantage of these techniques, but I'll leave the exploration of these ideas to the reader for now and go enjoy Christmas dinner.
Happy Holidays,
Edward Kmett
December 25th, 2011 at 2:14 am
I would hesitate calling the Cantor space
type Cantor = Integer -> Bool
“uncountable”. After all, this is the type of Haskell functions, of which there is only countably many.
December 25th, 2011 at 2:52 am
@Sean: It is not fair to judge a Haskell type from outside Haskell. As far as Haskell is concerned, Integer -> Bool is uncountable. Given any sequence alleged enumeration e : Integer -> (Integer -> Bool) of cantor space, the sequence \n -> not (e n n) is missed by e. So you cannot enumerate the type in Haskell, which is the relevant thing, who cares if god can enumerate Haskell types? (Yes, it takes the birth of Jesus to enumerate total functions, so Merry Christmas!) And another observation: Haskell has access to the real world and can use real-world data to construct elements of type Integer -> Bool. Are you saying there are only countably many streams of bits in the real world?
@Edward: Very nice post. I suspected something of the kind can be done, but you’re very good at minimizing the “damage” done by explicit monads. One questions remains, I think. What specification is the input predicate supposed to satisfy? The LICS 2010 paper by Martin Hofmann et al. might be relevant, see http://www2.in.tum.de/bib/files/Hofmann10Pure.pdf. They also use the trick with “parametric in all monads”.
December 25th, 2011 at 3:25 am
@Andrej: Fair enough, but if you want to stay within Haskell, your definition of “uncountable” may not be what you are expecting anymore. It’s true that there is no total surjection e : Integer -> (Integer -> Bool), but there is a partial surjection. Decode the Integer into Haskell code via your favorite encoding, and interpret the Haskell code. Some will fail to decode as well-typed Haskell programs of type Integer -> Bool, and some will fail to terminate. But you’ll cover all of them. So in some sense there are actually “fewer” functions Integer -> Bool than there are Integers.
As for the existence of uncountably many streams of bits in the real world, I’m not willing to make any sort of commitment on that statement. I will say that no one has yet satisfactorily shown me uncountably many streams in the real world :)
December 25th, 2011 at 5:22 am
@Sean: one of the nice things about Haskell is that you can pretty much run it on paper or on an abstract machine.
We use it to model mathematical ideas all the time, ignoring the fact that (for example) Integer, being implemented on top of GMP, is finite (2^53 bits or something is the maximum allowed, even if you did have infinite memory). I think that pointing out one of the implementation details of one of Haskell’s “interpreters” (a CPU) is kind of missing the point here.
December 25th, 2011 at 9:08 am
@Sean: the definition of countable simply is “there is a surjection from the natural numbers”. Subsets of countable sets are called “subcountable” and can be very, very complicated in intuitionistic logic, for example they need not be countable. In addition, your proposed plan to take codes of functions does not quite work inside Haskell, as there is no way to get the code of a function in Haskell, i.e., all total maps (int -> bool) -> int are constant. You would have to extend Haskell by some sort of a quoting mechanism that would allow you to always access source code (or the machine instructions of the compiled code). With such modifications your idea demonstrates the fact that it is consistent to assume intuitionistically that the Cantor space is a quotient of a subset of a countable set. Nevertheless, it is still uncountable.
December 25th, 2011 at 1:18 pm
@Dan P: I don’t see what any of this has to do with implementation details of any form. I’m just requiring some bijection between integers and strings which we interpret as Haskell code (Godel numbering will do).
@Andrej: Agreed (although I don’t believe I’m requiring any sort of quoting mechanism…). I’m simply arguing that while “uncountable” has the same formal definition, intuitively it behaves much differently than it does in ZFC. In ZFC it means “very large”, while in this kind of intuitionistic setting it can mean “smaller than countable” (subcountable). The meanings are so different that I’m leery to call them the same thing, for fear that we mislead the classical mathematicians.
December 25th, 2011 at 1:40 pm
> With that we jump clear from countable infinity to uncountable infinity, but it can still be searched in finite time!
Indeed, when we take uncountable to mean subcountable, I find it far less shocking that we can search it in finite time. To propose that we can search an uncountable ZFC set in finite time is a much stranger proposition.
December 25th, 2011 at 3:14 pm
“Are you saying there are only countably many streams of bits in the real world?”
What would make you think there are even infinitely many streams of bits in the real world?
December 25th, 2011 at 3:22 pm
Sean: the fact that we can search it in finite time derives from the fact that the predicate is required to evaluate in a bounded amount of time regardless of the ’size’ of the input. This keeps me from trying to compare with a lazy nat with infinity, and ensures I only look at so many of the results from the Cantor space.
The ability to do this in finite time would hold even if you had a language that could represent uncountably many programs! That I believe was Andrej’s point.
Consider data vs. codata. Think of the Cantor’s space as ‘codata’. It is infinitely large, but we can productively consume any finite portion of it. Determining the neighborhood used by a predicate only requires a finite amount of time, and then we wrap back into codata by filling in the infinite set of other cases with a valid inhabitant.
Ultimately all we have to do is find the neighborhood that was inspected (which is finite and can be done in finite time) and find a member of that neighborhood where the property does or does not hold.
This ability to search doesn’t make any use of the finiteness of the language. The finiteness of the language is interesting but irrelevant for this purpose.
February 22nd, 2012 at 6:02 pm
One thing that’s interesting to me is the transformation from
type Pure a b c = (a -> b) -> c
to
type Effect a b c = forall f. Monad f => (a -> f b) -> f c
Clearly f_effect is pure; you can get f_pure from f_effect by specializing f to the identity monad:
purify :: Effect a b c -> Pure a b c
purify f ab = runIdentity $ f $ \a -> Identity $ ab a
But it seems to me that no unsoundness is introduced by allowing this isomorphism in the other direction as well, although I don’t think that’s implementable in Haskell.
effectify :: Pure a b c -> Effect a b c
effectify f = ???
February 22nd, 2012 at 6:43 pm
Hm, I changed my mind. effectify requires you to order all the calls inside the otherwise pure f, which violates referential transparency; in (\k :: (String->Int) -> k “a” + k “b”) (with strict (+)), the compiler is allowed to choose whether it wants to evaluate k “a” or k “b” first, while in the effectful version the caller needs to explicitly make that decision.