Sunday, August 31, 2014

lazy closure creation

Some issues with closure creation become more important in a lazy environment. Consider this simple Wombat Void=>Int closure:
{ x+y }
Since the add doesn't involve the (empty) parameter, it is a subexpression of closure creation, not of closure execution. In an eager environment that means the add would happen when the closure is created, but we might prefer to delay it until the closure is actually called, to avoid doing it unnecessarily (this would be clearer if it were a more expensive operation). But (yet another) nice thing about being lazy is that we get this right at no extra cost, apart from the cost of laziness itself.
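
For comparison, here is a minimal Haskell sketch of the same situation (Haskell rather than Wombat, and makeClosure is just an illustrative name): the captured sum is a lazy thunk, so an unused closure never pays for the addition.

-- The sum s is captured lazily: building the closure only allocates a thunk,
-- and x + y is evaluated if and when the closure's result is demanded.
makeClosure :: Int -> Int -> (() -> Int)
makeClosure x y = let s = x + y in \() -> s

main :: IO ()
main = do
  let f = makeClosure 2 3   -- no addition happens here
  print (f ())              -- forcing the result evaluates x + y (once)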

It is not so easy to switch Wombat to being lazy. The intention was that the language would have 3 levels:

  • A very low level that maps directly to the particular hardware format. Let me call this a grade, in Mercury style. These are the primitive procedures and types that vary from grade to grade.
  • The next level up is C-like and abstracts out the grade-specific crud, but still doesn't provide a safe programming environment for most uses.
  • At the top level we have types and operations with mathematically nice semantics, designed for most uses. This is where laziness might be appropriate.

parallel laziness

I've just started reading Paul Bone's thesis: https://mercurylang.org/documentation/papers/pbone_phd_thesis.pdf. I got to the bit that said "laziness makes parallelization harder". That set me thinking. Here is some idle speculation.

When we are executing our lazy program we can look at every possible expression in our program, and classify them:

  • There are expressions that we need to evaluate to advance the next IO. This includes all subexpressions that we will definitely need to evaluate.
    • These have subexpressions that we might or might not need to evaluate. In Wombat these will always be in closures. Note that creating those closures (filling in the values of identifiers) is also an expression.
  • On the other side of the coin, and working from the bottom up instead of top down, there are expressions that we could evaluate because we have all the parameters.
    • And conceivably we could order these by how likely the value is to be useful, either for the current IO or for possible future IOs. And conceivably we could improve this ordering as time goes on, based on the actual behaviour of the program.
So one can imagine a program that has as many cores as the operating system wants to give it at any particular time. And those cores can beaver away on all the available work that can be done...
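
For what it's worth, GHC's spark mechanism already lets you express a crude, hand-written version of this speculation; here is a minimal sketch using the parallel package (expensive is just a stand-in for real work, and actual parallelism needs the -threaded runtime):

import Control.Parallel (par, pseq)

-- par sparks maybeNeeded so an idle core may evaluate it speculatively,
-- while pseq makes sure the definitely-needed value is computed first.
speculate :: Int -> Int -> Int
speculate a b =
  let maybeNeeded = expensive a   -- might or might not be demanded
      needed      = expensive b   -- definitely demanded below
  in maybeNeeded `par`
     (needed `pseq` (if even needed then needed + maybeNeeded else needed))

expensive :: Int -> Int
expensive n = sum [1 .. n]

main :: IO ()
main = print (speculate 1000000 2000000)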

And doesn't this remind me of work that was going on in the next lab when I was younger: http://www.eganfamily.id.au/archive30nov2007/monash/research/projects/CCC/index.html. (Actually their lab was up the road at RMIT since it was a joint CSIRO-RMIT project.)
[Update!! I got this wrong: Prof Greg Egan is not the Greg Egan (http://gregegan.customer.netspace.net.au/) who is an author and amateur (I presume) mathematician and who collaborates with John Baez.] I also see Mark Rawling in the photo. Later we (mostly Mark) wrote a C++ library for reactive programming that is somewhat related to this post. And I now remember that I applied for a job with that CSIRAC2 project, because of my interest in functional programming, which they were using. I didn't get it :-(.

Saturday, August 30, 2014

laziness, backtracking and explicit io threading

[Image: "lazy" animation from http://gifsoup.com/view/37759/lazy.html]
When I first heard about laziness in programming languages I thought "That's just an optimization technique". Then I saw the example of an infinite lazy list and I thought "That's just a stream, umm I mean a memoized stream [1]". Well yes, but streams are uglier and harder to think about than lists, and this quickly gets worse for more complex structures. So I've come around to being rather sympathetic to laziness by default.

And laziness seems closely related to an important emerging trend in programming, which is to allow the programmer, as much as possible, to write the program as specifications and tests rather than in the traditional style of "step by step instructions to a complete idiot". This is also related to the trend that the programmer gets to deal with entities with mathematically nice semantics rather than entities with hardware-level semantics (and this is an important aim of Wombat).

If we're lazy then we need something to inspire us to do anything at all, and in programming it is the need to do IO that provides the inspiration. And, of course, we need to do the IO in the correct order, so you come around to the need to explicitly tell the compiler when IO is needed and in what order. In Haskell that is done with the IO monad. I won't explain the details since Mercury has a similar but conceptually simpler system.

I vaguely knew that the Mercury programming language (mercurylang.org) also had explicit threading of IO. So when I learnt about laziness, I naturally assumed that Mercury was also lazy and that was the reason for the IO threading. Now that I have (finally) started to learn Mercury [2] I see that this is wrong. Mercury is a logic programming language, in the tradition of Prolog, but much more hard core about being declarative. The reason it is explicit about IO is because it supports backtracking when searching for a logical result. So it needs to thread IO to prevent backtracking into an IO operation that has already happened.

The way Mercury does this is that procedures which want to do IO (starting necessarily with main) have:
  • an extra input parameter representing the world before the procedure is called. This parameter is marked "will be destroyed" so that it can't be reused after the procedure is called.
  • an extra output parameter representing the world after the procedure returns. This value is marked "unique", meaning that there are no other copies of it. And naturally this has to be used as the input parameter of the next IO operation.
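
To see what that threading looks like in code, here is a minimal sketch of the same idea written in Haskell rather than Mercury (World, putLine and program are hypothetical stand-ins, and nothing here enforces uniqueness the way Mercury's modes do):

newtype World = World Int   -- a pretend token for the state of the world

putLine :: String -> World -> ((), World)
putLine _ (World n) = ((), World (n + 1))   -- pretend the output happened

program :: World -> ((), World)
program w0 =
  let ((), w1) = putLine "hello" w0
      ((), w2) = putLine "world" w1   -- must use w1, not w0, to keep the order
  in ((), w2)

main :: IO ()
main = let ((), World n) = program (World 0) in print n

Mercury's "will be destroyed" and "unique" markings are what make reusing w0 a compile-time error; nothing in this Haskell sketch prevents it.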
This could get tedious, but there is syntactic sugar to handle the bookkeeping. An obvious question is: Why does Mercury have eager evaluation instead of lazy, since it is declarative? Is there some interaction with logic programming and backtracking that makes laziness difficult? I don't know the answer to that, but here's a different question. 

How do you handle the important need for IO concurrency? In Haskell, the guys from Facebook have recently released a library for concurrently reading from multiple sources. However they are careful to only claim to handle reading, so I'm guessing that handling concurrent changes to the world is not so easy.

In Mercury one can write code (I hesitate to say "function") where spawned procedures can fan out and rejoin (actually I don't yet know how to rejoin before end of program, but I'm sure it is possible). Anyway this is all nicely explained by Mark Brown in this email: http://www.mercurylang.org/list-archives/users/2014-August/007757.html. You need to learn some Mercury to understand it, but I recommend that.
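
For comparison only (this is Haskell, not the Mercury spawning described in that email), fanning out and rejoining looks like this with forkIO and an MVar:

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

main :: IO ()
main = do
  done <- newEmptyMVar
  _ <- forkIO (putMVar done (sum [1 .. 1000000 :: Int]))   -- fan out
  putStrLn "main thread carries on"
  result <- takeMVar done   -- rejoin: block until the worker finishes
  print result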

----------------------------------------------------------------
[1] A stream is just a procedure (taking no input) which either returns an end-of-stream marker or returns a pair consisting of the current value (like the head of a list) and the next procedure to call (like the tail of the list). By a memoized stream I mean that all of those procedures are memoized: each one computes its result the first time it is called and simply returns that result on later calls, rather than recalculating it.
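
A sketch of that definition in Haskell (Step, countFrom and takeS are just illustrative names). As written this stream is not memoized: calling s () twice recomputes the step. Dropping the () -> and relying on Haskell's ordinary laziness is exactly what turns it into the memoized stream of this note, i.e. a lazy list.

-- The footnote's stream, directly: a no-argument procedure that yields
-- either end-of-stream or the current value plus the next procedure.
data Step a = Done | Yield a (Stream a)
type Stream a = () -> Step a

countFrom :: Int -> Stream Int
countFrom n = \() -> Yield n (countFrom (n + 1))

takeS :: Int -> Stream a -> [a]
takeS 0 _ = []
takeS k s = case s () of
  Done       -> []
  Yield x s' -> x : takeS (k - 1) s'

main :: IO ()
main = print (takeS 5 (countFrom 1))   -- [1,2,3,4,5]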

[2] Learning Mercury is easier than it was on my previous attempt (15 years ago) because of the rather nice, though unfinished, tutorial by Ralph Becket which is at the top of their documentation page.

-----------------------------------------------------------------
Update 2014-09-20: I just saw Bob Harper's post "The Point of Laziness". He basically endorses memoized Streams as doing the job of lazy lists. I'm inclined to be convinced, and Wombat's neat terse procedure format should suit. It is a challenge to handle more complex lazy structures nicely with procedures.

Update 2016-09-28: The current spec calls for values to start as holes and be passed around like that till set. This does some of the job of laziness. Streams can be created in iterator/yield style, making streams much easier to use.

Saturday, August 16, 2014

IO in a declarative or functional language

Haskell is declarative and functional. Its supporters seem to hang out more with the functional crowd. Maybe they think it is easier to sell "declarative" to functional folk, than to sell "functional" to other declarative folk. And there are good reasons why that might be so.

Suppose we declare that
y = x + 1
[This is not an assignment. Imagine a val in front if that fits your language preferences]. This declares a relationship between x and y, but it does it in a functional style. It would look more functional still if we wrote "y = add 1 x", but that is just sugar.

But what if we have y and we want x? Since we are just declaring a relationship it isn't obvious why we have to change this. But in functional style this has to be changed to:
x = y - 1
Functional programs can only run forward; they can't run backwards. Haskell has, of course, a big exception to that: constructors can be run forward (to construct) or backward (to deconstruct). But perhaps everything that can run backwards should be allowed to (and hence be usable in case statements). In addition to convenience there is a dodgy philosophical point.
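
Before the philosophical point, a tiny Haskell illustration of the constructor exception (plain Haskell; Wrap and invSucc are just illustrative names):

data Wrapped = Wrap Int

forward :: Int -> Wrapped
forward n = Wrap n        -- running the constructor forward (construct)

backward :: Wrapped -> Int
backward (Wrap n) = n     -- running it backward in a pattern (deconstruct)

-- By contrast, y = x + 1 cannot be run backwards by pattern matching;
-- its inverse has to be written out as a separate function.
invSucc :: Int -> Int
invSucc y = y - 1

main :: IO ()
main = print (backward (forward 41), invSucc 42)   -- (41,41)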

When we look at the way the world works it is rather like a giant computer program. And it is particularly like a program in that, on a large scale it pushes forward in an irreversible way. Entropy increases and broken plates never reassemble themselves. And yet at the core is quantum mechanics which is completely reversible.

Haskell and Mercury are two languages in which the interaction with the external world (including mutable data in memory) is explicit. This is a consequence of the fact that both languages are declarative. In non-declarative languages the textual order of the program specifies an order of execution (optimisers alter that, but they are constrained to preserve the semantics, and are forced to make conservative decisions as a result). In declarative languages the compiler only schedules evaluation when something requires it, and the thing that creates that requirement is interaction with the outside world. Since that interaction is explicit, one would hope that the optimiser is never constrained by worrying about what separately compiled code might do.

So it seems that programming should have three levels.
  • A core language that is reversible. This specifies what can be handled by cases, though if you allow multiple returns (as Mercury does) then you can expand that.
  • Reducing operations (like summing a list) which are intrinsically irreversible.
  • Interaction with the unforgiving world which forces an order of execution.
But maybe it's not quite that simple. It seems that if the part of the real world you are dealing with is quantum then it might be reversible, and you might conceivably be able to interact with it in the reversible part of the language. Perhaps more practically: when two programs interact with each other, they might sometimes be able to do that while each stays within the reversible core.

I should have another go at learning Mercury. Last time was 15 years ago.

Tuesday, August 5, 2014

infinitesimals for programming

Per Martin-Löf has played a major role in uniting our ideas about mathematics and programming. So I was impressed to see this quote from 32 years ago:
"I do not think that the search for logically ever more satisfactory high level programming languages can stop short of anything but a language in which (constructive) mathematics can be adequately expressed."
    Per Martin-Löf, “Constructive mathematics and computer programming” (Logic, Methodology and Philosophy of Science, 1982)
But things are getting HoTTer, and now it is possible to glimpse a world where all mathematics can be regarded as constructive. Or, at least, amenable to integration with programming.

Some real numbers, such as 3.7 and π, are computable in the sense that you can write a program to get as many decimal places as you want. Computable numbers are part of constructive mathematics in a clear way. But most of the reals aren't computable. In some sense those non-computable reals don't exist. But clearly our understanding of the reals helps us think about the real line, and many other important concepts. And the way the reals are constructed and reasoned about creates a consistent mathematical context. And that combination of usefulness for human thought and logical consistency defines the boundaries of mathematics.

Infinitesimals seem even further from constructive mathematics than non-computable reals, but actually that is not true. They are certainly useful for thought and they can be used consistently (as I remember from Uni in the early 70s). So imagine adding infinitesimals to the numeric types. I'll leave the programming language details rather vague. It isn't Wombat, though it could easily be changed to be:

def differentiate T::RealLike==> (f:T=>T) T=>Option(T):
    def dfdx(x:T) Option(T):
        val dx = Infinitesimal.new
        return Infinitesimal.to[T]((f(x+dx)-f(x))/dx)
    return dfdx

We presume that arithmetic operations on our float+infinitesimal are just the symbolic calculation we would do. Sometimes (depending somewhat on the cleverness of our libraries) we will end up with a numerator that can be divided by dx, leaving something which the to[T] function will handle by eliminating the Infinitesimals in a safe way. This works easily for polynomial functions as we remember from school. When it doesn't work we return Option.none.
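
One well-known concrete way to realise this, which I'll sketch in Haskell rather than the deliberately vague language above, is dual numbers: pairs a + b*dx where dx*dx is taken to be 0, so the arithmetic carries the derivative along symbolically.

data Dual = Dual Double Double   -- value and coefficient of the infinitesimal dx

instance Num Dual where
  Dual a b + Dual c d = Dual (a + c) (b + d)
  Dual a b * Dual c d = Dual (a * c) (a * d + b * c)   -- the dx*dx term vanishes
  negate (Dual a b)   = Dual (negate a) (negate b)
  fromInteger n       = Dual (fromInteger n) 0
  abs    _ = error "not needed for this sketch"
  signum _ = error "not needed for this sketch"

differentiate :: (Dual -> Dual) -> Double -> Double
differentiate f x = let Dual _ d = f (Dual x 1) in d

main :: IO ()
main = print (differentiate (\t -> t*t + 3*t + 1) 2)   -- 2x + 3 at x = 2, prints 7.0

Unlike the Option-returning version above, dual numbers never fail, but they only cover functions built from operations that know about the infinitesimal part.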

We can call f with parameter (x+dx) because a RealLike type with added Infinitesimals is also a (different) RealLike type.