Saturday, March 08, 2008

Comonadic Arrays

On haskell-cafe, ajb, aka Pseudonym, laments that many people don't have enough experience with comonads to recognise them. So I thought I'd mention a really simple example of a comonad that nevertheless captures the essence of a large class of comonads. It's conceptually not much different to my cellular automaton example (making this a bit of a rerun), but this should be easier to understand. And if it's too trivial, I hint at a particle physics connection towards the end.

Firstly, you can skip this paragraph if you don't want a quick bit of theoretical discussion. Consider arrays of fixed dimension. As types, they look something like X^N for some fixed integer N. From a container we construct its zipper by applying X d/dX. In this case we get X·N·X^(N-1) = N·X^N. In other words, the corresponding zipper is an array paired with an index into the array. We can stretch the meaning of comonad slightly to allow this to relate to arrays whose size isn't fixed.

So here's some code:


> import Data.Array


The usual definition of Comonad:


> class Functor w => Comonad w where
>     (=>>) :: w a -> (w a -> b) -> w b
>     coreturn :: w a -> a
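
For reference (my own summary, not from the original post), the comonad laws read as follows in this notation:


> -- w =>> coreturn       ==  w
> -- coreturn (w =>> f)   ==  f w
> -- (w =>> f) =>> g      ==  w =>> (\v -> g (v =>> f))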


And now a type that is a pair of an array and an index into that array:


> data Pointer i e = P i (Array i e) deriving Show


Think of it as an array with one of its elements singled out for special attention. It trivially inherits the functoriality of ordinary arrays:


> instance Ix i => Functor (Pointer i) where
>     fmap f (P i a) = P i (fmap f a)


And now comes the Comonad implementation. coreturn serves to pop out the special element from its context - in other words it gives you the special element, while throwing away the array it lived in. (=>>), on the other hand, applies a function f of type Pointer i a -> b to the entire array. The function is applied to each element in turn, making each element the special element for long enough to apply f.


> instance Ix i => Comonad (Pointer i) where
>     coreturn (P i a) = a!i
>     P i a =>> f = P i $ listArray bds (fmap (f . flip P a) (range bds))
>         where bds = bounds a


Compare with fmap for arrays. This walks through each element in turn, applying a function to each element, and returns an array of results. The computation for each element is separate from all the others. With =>>, however, the entire array may be used for the computation of each element of the result, with the index into the array serving to indicate which element it is we should be focussing on.

For example, here's an array of values:


> x = listArray (0,9) [0..9]


We want to consider this to be a circular array so that going off one end wraps around to the beginning:


> wrap i = if i<0 then i+10 else if i>9 then i-10 else i
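
An equivalent definition (the name wrap' is mine, not from the original post) uses mod, which in Haskell already maps -1 to 9:


> wrap' i = i `mod` 10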


Now here's a simple operation that 'blurs' the single ith pixel in the 1-D image represented by x:


> blur (P i a) = let k = wrap (i-1)
>                    j = wrap (i+1)
>                in 0.25*a!k + 0.5*a!i + 0.25*a!j


We can apply this to the entire image thusly


> test1 = P 0 x =>> blur


Note the curious way I have to use P 0 x as an input to blur. There seems to be a redundancy here: we want the resulting array and don't care what the focussed element is. But =>> wants us to give it a focal point. Curiously, it's making explicit something that's familiar to C programmers, but is slightly hidden in C. In C, you refer to an array of floats using a float *. But the same type points to elements of the array as well. So when you point to an array you are, in effect, blurring the distinction between a pointer to an array and a pointer to its first element. Comonads make that distinction explicit.
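
To make the focus explicit, here's a small usage sketch. The helper focusAt and the value sample are my names, not from the original post; focusAt refocuses the pointer before we extract with coreturn:


> focusAt j (P _ a) = P j a
>
> -- The blurred value at index 3 of test1: 0.25*2 + 0.5*3 + 0.25*4 = 3.0.
> sample = coreturn (focusAt 3 test1)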

Anyway, suppose you wanted to apply a sequence of operations. Adding, blurring, scaling, nonlinearly transforming and so on. You could write a pipeline like


> x ==> f = f x
> test2 = P 0 x ==> fmap (+1) =>> blur ==> fmap (*2) ==> fmap (\x -> x*x)


Note how ==> fmap f ==> fmap g = ==> fmap (g . f). If you think of fmap as farming out workloads to a SIMD processor with one thread applied to each array element, sequences of fmaps correspond to threads that can continue to work independently. The comonadic operations, however, correspond to steps where the threads must synchronise and talk to each other. I believe this last statement explains a cryptic comment in the comments to this blog entry.
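
To spell out that fusion claim, here's a version of test2 with the final two fmap stages fused into a single pass (test2' is my name, not from the original); it should compute exactly the same array:


> test2' = P 0 x ==> fmap (+1) =>> blur ==> fmap ((\v -> v*v) . (*2))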

One final thing going back to my optional paragraph above. If D means d/dX then we can see the operator XD as a kind of number operator. When you apply it to an array like X^N the array becomes multiplied by a type corresponding to the array index type. For ordinary arrays, these are just integers. So you can see XD as a way of counting how many elements there are in a container. Also, for any container F, we have the equation D(XF) = X·DF + F, which we can write as DX = XD + 1. At some point, when I have time, I'll point out how this is closely related to the Heisenberg uncertainty principle and how, when we say that differentiation makes holes in a data type, it's related to the notion of a hole in solid state physics.

Oh, and I sketched a diagram to go with this, but don't have time to write an explanation.


Tuesday, February 05, 2008

A Third Order Quine in Three Languages

Suppose F_n^xy is a program, written in language x, that takes as input n string arguments G_1,...,G_n and outputs a program in language y that is the application of the function whose source code is G_2 to the strings G_2,G_3,...,G_n,G_1. Then F_3^xy(F_3^xy,F_3^yz,F_3^zx) will be a program in language x that outputs a program in y that computes F_3^yz(F_3^yz,F_3^zx,F_3^xy).

We can put this into practice:


q a b c=putStrLn $ b ++ [toEnum 10,'q','('] ++ show b ++ [','] ++ show c ++ [','] ++ show a ++ [')']
main=q "q a b c=putStrLn $ b ++ [toEnum 10,'q','('] ++ show b ++ [','] ++ show c ++ [','] ++ show a ++ [')']" "def q(a,b,c):print b+chr(10)+'q('+repr(b)+','+repr(c)+','+repr(a)+')'" "def e(x) return 34.chr+x+34.chr end;def q(a,b,c) print b+10.chr+'main=q '+e(b)+' '+e(c)+' '+e(a)+' '+10.chr end"


This is a Haskell program that outputs a Python program that outputs a Ruby program that outputs the original Haskell program. (In main, the three string arguments are the sources of q in Haskell, Python and Ruby respectively; each stage prints the next language's definition of q followed by a call to it with the arguments rotated.)

Apologies for the lack of line breaks. If it's not readable, it should be possible to copy and paste that source. Also, you may need to tweak it if your OS doesn't treat character 10 as a new line.


Saturday, February 02, 2008

Purely functional recursive types in Haskell and Python

"""
This post is simultaneously Python and literate Haskell.

There is a certain truth to Greenspun's tenth rule of programming. A Python project I was developing at work has slowly mutated into a compiler for a programming language without me planning it that way. Usually (I assume) compilers parse their input and construct an AST which is passed to the compiler proper. My code didn't have an AST, just a bunch of lambdas. I realised that I'd actually come across a real world example of what Wadler was talking about in Recursive Types for Free!.

In Haskell, the foldr function reduces a list using a binary function and some initial value. Suppose the function is called a and the initial value is b. Take a list, for example [1,2,3]. Now write it without using list notation, directly in terms of its constructors, i.e. 1:(2:(3:[])). foldr replaces (:) by a and [] by b, so this becomes a(1,a(2,a(3,b))). The best known example is a=(+) and b=0, in which case we get 1+2+3+0 and hence the sum of the values in the list. Here is how we'd use foldr in Haskell:


> x = foldr (+) 0 [1,2,3]


The interesting thing is that anything you might want to know about a (finite) list can be extracted using foldr. There is a sense in which it is the universal function on lists: all other functions can be factored through it. For example, we can implement head and tail as follows:


> head = foldr const undefined
> tail x = let Just (_,t) = foldr tailHelper Nothing x in t where
>     tailHelper x Nothing = Just (x,[])
>     tailHelper x (Just (y,z)) = Just (x,y:z)


So if x is a list, \a b -> foldr a b x tells you everything you could want to know about the list. In other words, you can completely replace the list itself with functions like this. In fact, we can replace the list constructors with functions that build such functions:


> nil a b = b
> cons h t a b = a h (t a b)


We can use nil and cons just like [] and (:). In fact, given an element defined by


> y = cons 1 (cons 2 (cons 3 nil))


We can convert it to a conventional list via


> z = y (:) []


So foldr embeds a list as a function.
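
For instance (the name w is mine, not from the original post), we can consume y directly, summing it by handing it (+) and 0 in place of (:) and []:


> w = y (+) 0    -- 6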

We can write the same thing in Python. (Note that Python already has a built-in fold, reduce, though it's a left fold rather than a right fold like foldr.)

"""

def foldr(a,b,l):
    if l==[]:
        return b
    else:
        return a(l[0],foldr(a,b,l[1:]))

print foldr(lambda x,y:x+y,0,[1,2,3])

"""

It's surprisingly easy to implement cons and nil in Python too:

"""

def nil(a,b):
    return b

def cons(h,t):
    def _cons(a,b):
        return a(h,t(a,b))
    return _cons

l = cons(1,cons(2,cons(3,nil)))

print l(lambda x,y:x+y,0)

print l(lambda x,y:[x]+y,[])

"""

Folds can be generalised to any recursive type, not just lists. (Strictly speaking I mean recursive rather than corecursive types. Folds aren't appropriate for infinite structures.) Note how for lists, foldr takes two arguments besides the list: a two argument function and a zero argument function. Applying a fold simply replaces the list constructors (:) and [] with these functions. Generalised folds do something similar: each constructor gives rise to an argument to the fold, and when the fold is evaluated, each constructor is replaced with the appropriate function.

Now consider a simple expression type in Haskell:


> data Expr = X | Const Int | Binop (Int -> Int -> Int) Expr Expr


This is a recursive type so it has a generalised fold associated with it. This fold will take three arguments, one for each of X, Const and Binop, and each one will take the same number of arguments as the constructor. Here it is:


> efold :: a -> (Int -> a) -> ((Int -> Int -> Int) -> a -> a -> a) -> Expr -> a
> efold x _ _ X = x
> efold _ c _ (Const a) = c a
> efold x c b (Binop f lt rt) = b f (efold x c b lt) (efold x c b rt)


efold simply replaces each constructor with an application of the matching function recursively through the entire Expr.

Anything you might want to do to an Expr can be done using efold, and many things you might naturally want to do with an Expr are particularly easy to write using it. Here are functions to (1) evaluate the expression with X set to some Int, (2) determine whether or not an expression is free of references to X, and (3) rebuild an expression unchanged, as a sanity check:


> eval x e = efold x id id e
> freeX e = efold False (const True) (const (&&)) e
> identity e = efold X Const Binop e
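
For example, a quick sanity check (t1 and t2 are my names, not from the original post):


> t1 = eval 3 (Binop (+) X (Const 2))    -- 5
> t2 = freeX (Binop (+) X (Const 2))     -- False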


Now we can do the same thing we did above: replace the Expr structure with its corresponding fold. And again, I'm implementing it in Python rather than Haskell:

"""

def X():
    def _X(x,c,b):
        return x
    return _X

def Const(a):
    def _Const(x,c,b):
        return c(a)
    return _Const

def Binop(f,l,r):
    def _Binop(x,c,b):
        return b(f,l(x,c,b),r(x,c,b))
    return _Binop

def eval(x,e):
    return e(x,lambda x:x,lambda f,l,r:f(l,r))

def freeX(e):
    return e(False,lambda x:True,lambda f,l,r:l and r)

"""

So we have translated the Haskell algebraic type Expr into functional expressions in Python. Here are some examples of their use:

Evaluating X, 2 and X+2 at X=3:

"""

print eval(3,X())
print eval(3,Const(2))
print eval(3,Binop(lambda x,y:x+y,X(),Const(2)))

"""

Testing whether 10-2 and X()+2 are free of references to X():


"""
print freeX(Binop(lambda x,y:x-y,Const(10),Const(2)))
print freeX(Binop(lambda x,y:x+y,X(),Const(2)))
"""


You can even implement a version in a blend of functional and OO style:


"""
class X:
    def __call__(self,x,c,b):
        return x

class Const:
    def __init__(self,a):
        self.a = a
    def __call__(self,x,c,b):
        return c(self.a)

class Binop:
    def __init__(self,f,l,r):
        self.f = f
        self.l = l
        self.r = r
    def __call__(self,x,c,b):
        return b(self.f,self.l(x,c,b),self.r(x,c,b))

"""

Some final comments:

This can sometimes be an inefficient style of programming, especially so in a strict language. Look again at tail for the cons/nil lists: it rebuilds the entire list pair by pair just to discard one element. But many uses are quite efficient, and folds capture a very common design pattern.

When I wrote an earlier post touching on this paper a while back I left out mention of what its main point was. This post fixes that.

Wadler's paper also describes a dual version of this for codata such as streams. But as far as I understand it's not very interesting.

It's interesting that theory about static types has something to say about programming in a dynamically typed programming language.

Just so you know, my work project doesn't look anything like the code above.

Oh...and I guess you could say this was a form of the visitor pattern. Ugh. It's hideously complicated in C++.
"""


Saturday, January 26, 2008

The Type that Should Not Be



> import Prelude hiding ((^))


I've been pondering a type that's so mind-bogglingly weird that I can no longer think about it without frying my neurons. So instead I'll just dump the contents of my brain here so you can fry your neurons too.

Here's an interesting type to warm up with. In Haskell:


> data X = Tree [X]


At first look it seems not to be well-founded in the sense that a value of type X is a list of X's and we haven't yet defined what X's are. But we don't need an X to make an empty list of X's. So we can build elements inductively starting with Tree []. If you think of Tree [] as being a leaf, you can see why I called this type Tree (though these are trees whose limbs can have infinite sequences of children).
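
For instance (the names leaf and fork are mine, not from the original post):


> leaf = Tree []
> fork = Tree [leaf, Tree [leaf]]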

Even without the empty set forming a 'foundation' we can still build circular elements like


> x = Tree [x,x]


Now, lists are ordered and can have repeated elements. What if we eliminate these two properties? The way to do that would be with a set type. So how do we represent sets in Haskell? Certainly not with Haskell's Data.Set, because that only represents finite sets. One approach to supporting infinite sets is to identify a set with a predicate telling whether an element is a member of it. In other words, the type X → Bool makes a tolerably good set of X's.

So now we're ready for an interesting type:


> data U = U { f :: U -> Bool }


(BTW U stands for universe).

The first thing to think about is what this might mean mathematically. Well it looks like a solution to the equation X = 2^X, in other words, a set that equals its own power set. We know that in set theory this gives rise to Russell's paradox. One way to embed recursive types into set theory is to tweak set theory, for example by using non-well-founded set theory. But Russell's paradox is so bad that it breaks just about any attempt at devising a set theory where a set can equal its power set.

And yet we can build this type in Haskell. How come? The explanation is simple: Russell's paradox tells us that there must be some elements whose evaluation doesn't terminate. In fact, we can replay Russell's paradox directly in Haskell. But before that we need some definitions.

Firstly, are we sure we can even construct an element of U? In order to define an element of U we need to define a boolean predicate on U but we can't even compare two elements of U for equality. But we can define the 'empty' element phi, along with a membership test:


> phi = U $ const False
> x `e` y = f y x


Unlike conventional set theory, we can define negation of elements. We can also define union and intersection:


> neg a = U $ \x -> not (x `e` a)
> a \/ b = U $ \x -> x `e` a || x `e` b
> a /\ b = U $ \x -> x `e` a && x `e` b


We're ready for the Russell paradox. Define the 'set' (I give in, let's informally call these things sets) of all sets that don't contain themselves.


> nonSelfContaining = U $ \a -> not (a `e` a)


Now try to compute nonSelfContaining `e` nonSelfContaining. It doesn't terminate. It's not just a matter of having implemented it badly; it can't possibly terminate, as there is no answer that could make sense. So if you're designing a language that's intended to be total, you'd better rule out types like U.
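
Spelled out in code (the name russell is mine, not from the original post):


> -- Warning: forcing russell loops forever.
> russell = nonSelfContaining `e` nonSelfContaining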

So we now have two elements of U, phi and neg phi. Can we make any more? Well here's one way:


> c a = U $ \x -> a `e` x


Intuitively, c a is the set of all sets that contain a. But there are two potential problems that come to mind with this. One is that we've only made two sets so far, so "the set of all sets that contain a" might not be interesting. And secondly, can we be sure c a is distinct from a?

The first issue is easily dismissed. neg phi is obviously in c phi, but phi isn't. So we can distinguish phi and c phi. c phi is also different from neg phi (check to see if phi is an element of each). In fact, define functional power by


> f ^ 0 = \x -> x
> f ^ n = \x -> (f^(n-1)) (f x)


and it can be shown that c^m phi is distinct from c^n phi for m≠n. (Hint: Try checking to see if c^m (neg phi) is an element of c^n phi for various m and n.) So we have constructed an infinite number of elements of U.
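
As a quick sanity check of the claims about phi, neg phi and c phi above (the names check1, check2 and check3 are mine, not from the original post):


> check1 = neg phi `e` c phi   -- True: neg phi is a set containing phi
> check2 = phi `e` c phi       -- False: phi doesn't contain phi
> check3 = phi `e` neg phi     -- True: so neg phi differs from c phi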

But we know that we can write code for elements of U that doesn't terminate. Why not restrict ourselves to just the computable elements of U? There's a catch, though: how do we define a computable element of U when elements of U are functions? One approach is this: define x to be computable if y `e` x is computable for all y. But that won't do, because it won't terminate if y isn't computable. So how about defining x to be computable if y `e` x is computable for all computable y? But that's circular, and not in a nice inductive way. U is so perverse that just attempting to define computability for it gives a circular definition.

But that's not necessarily a problem. Define a 'universe' to be a set, C, of computations of type U such that C = the set of x such that y `e` x terminates for all y in C. The question now is this: is there a unique set C satisfying this 'equation'? We can rule out the empty set, as phi must be in any such C. There's also something else we can prove: if we have two such sets, C and C', then the intersection of C and C' is a universe. So we can take the intersection of all universes to form the smallest universe, and we can define an element of U to be computable if it is in this smallest set.

So here's my first open question: Are there larger 'universes' than the smallest one, or is it unique? It's not hard to see that a universe is closed under (non-recursive applications of) negation, finite union, finite intersection and the function c above. My second open question: What other computable elements are there?

Time to stop this and start cooking dinner.
