Monday, September 28, 2015

Multitype has gone!

Version 0.0.3 of the outline doc (https://docs.google.com/document/d/1sNnCvYFjjBmtrl7XF7JX_pdMZ_0ydrctjTR0pO0p5Y4/edit?usp=sharing) has been updated to remove Multitype and replace it with Empty weirdness, based on the fact that Empty is at the bottom of the isA hierarchy, so Empty isA X for any X.

  • [] is a List Empty, so can be used where any List X is required;
  • Option.none is an Option Empty, so can be used where any Option X is required.
  • An explicit closure, { ... }, has type Any=>Empty, and so isA X=>Y for any X, Y.
We've drifted a bit towards dynamic typing, but all is now very straightforward.

Friday, September 25, 2015

Mathematics influences Wombat again

In Wombat, types form a lattice (an order lattice) generated by isA declarations. The sup/lub/join is Union, the inf/glb/meet is Intersection. Unions between otherwise unrelated types are new types created as required. The intersection is just the union of types below all the types being intersected. So, for some examples: Integer isA Rational, Integer isA GaussianInteger. Union[Integer Rational] is Rational, but Union[Integer String] is a new type (with few properties, just the ability to extract one or other). And, of course, Intersection[Rational GaussianInteger] is Integer. The intersection of unrelated types is the type Empty, which has no values.

[notational note: list entries are separated by spaces. The parameter of Union and of Intersection is a set of types, but can be given as a list as I have done here (because, in a forgetful way, List X convertsTo Set X). The / operator invoked below takes 2 Integers and returns a Rational.]

I've been following along, in a desultory way, with a Mathematics discussion (https://golem.ph.utexas.edu/category/2015/09/the_free_modular_lattice_on_3.html) where I learnt of the difference between a lattice (which has sups and infs for finite sets of members of the lattice) and a bounded lattice which also has sups and infs for an empty set of members. Things usually work better when you include these degenerate cases, so I thought: I wonder if I can make Wombat's type hierarchy into a bounded lattice.

It doesn't take much imagination to see that Union[] is going to be Empty. And then it is natural, looking at everything upside down, to see that Intersection[] is the type Any which is the Union of all types.

Now consider the list [7 3/2 4/5]. What is this a list of? It is naturally a list of the Union of the types of its elements. Since those types are Integer and Rational, the result is a List Rational (as any ordinary mortal would expect).

So what is the type of the empty list []? The set of types of the members is the empty set. The Union is thus type Empty. So [] is a List Empty. But we've seen that Empty is below any other type X, so Empty isA X. So a List Empty isA List X for any X. Which is exactly what you need to allow the empty list to be used in the situations where one might want to use it.

Monday, September 21, 2015

Syntax is a minor matter

I just learnt of the existence of Marpa: https://jeffreykegler.github.io/Marpa-web-site/. Basically this generates a BNF parser that is simple and efficient. I'm not sure how easily Wombat's user-defined syntax system can be implemented with Marpa, but the key point is this:

Having syntax built in to a language definition is going to alienate people. It isn't necessary. A core aim of Wombat is that different folk can use different syntax systems, including rolling their own, and still use each others libraries without difficulty. Of course Wombat will have default syntax for the standard libraries, based on the prejudices of the core developers, but even that can be easily changed. You prefer to separate your list elements with commas instead of spaces, or you like your "if" expressions to end with "fi": easy to do. Wombat may provide standard WCTL (Wombat Compile Time Language) functions to allow programmers to specify common styles.

What is important is the structure of the Type system. This is where I think Wombat is different (though the relevant literature is vast, so I could well be wrong). If a Rational has a denominator of 1 (or a Gaussian Integer has an imaginary part of 0), then it is an Integer, and not distinguishable from an Integer that arose in a more obviously Integer way. If two types have a non-Empty intersection (as Gaussian Integers and Rationals do) then you are not allowed to define a lossy conversion from one to the other: Conversion can only happen losslessly by going down to the intersection and then up (and it may then, naturally, fail).

I trained as a Mathematician, and I think this is the way most Mathematicians operate most of the time. Sometimes they will show a bit of guilt by writing "by abuse of notation", but I think they really feel it is ok. By contrast Mathematicians involved in the foundations of Mathematics don't allow this sort of thing at all. I suspect this is a big part of the disconnect between the two camps. Programming is in a funny state, being a mixture of very hacky stuff on one side, plus stuff that has come over from work on the Foundations of Mathematics. I think that Wombat, by contrast, has the right balance, corresponding to the way most people think.

Wednesday, September 2, 2015

On Infinite Things

Infinite things don't really exist. Consider the simplest example: the natural numbers 0,1,2,3,... We have strong intuitions about this set and we use that intuition in applying mathematics to the real world. Yet everything about Nat, the natural numbers, follows logically from the following simple, and finite, set of rules:
  • Zero and Succ are primitive entities, completely defined by the following:
  • Zero belongs to Nat;
  • Succ maps from Nat to Nat. If N belongs to Nat, then Succ(N) belongs to Nat.
The recurrence relation is real but not easy to think about. The infinite set of natural numbers is something we are more comfortable with.
Consider the test program in the recently released first draft of Wombat compiler code:
   `fact1 _n = if (_n:Int>=?0)==0 then {1} else {_n*fact1(_n-1)};
   `fact2 = { `n = $:Int>=?0; if n==0 then {1} else {n*fact2(n-1)}};
   `in = getInt();
   putInt( fact1 in = fact2 in)
To understand the first line, consider this very finite situation. We have 2 expressions within a closure:
  1. `not true = false
  2. `not false = true
The first thing to note is that it is OK to declare the same name more than once in a closure as long as the names are compatible. Being equal is the simplest form of compatibility. The other acceptable form of compatibility is by combining procedures, and that is what applies here. For the gory details see http://wombatlang.blogspot.com.au/2015/05/combining-procedures-in-wombat.html. But basically they have to agree (or be combined!) where they overlap, and where one fails (harmlessly = Fail.next) and the other doesn't, the combination takes that value. The declared name can't be used until all declarations are complete.
Getting back to that first line above:
   `fact1 _n = if (_n:Int>=?0)==0 then {1} else {_n*fact1(_n-1)}
The first thing to notice is that this expression is the left parameter of a semicolon(;) operator, so the value is discarded, and only the side affect (defining fact1) is relevant. For the record the value would be a Type.
The expression involves a free variable (_n). That means this is an application of a Wombat macro. Since no macro is specified, the default applies, which is "forall". So this is really:
   forall _n (`fact1 _n = if (_n:Int>=?0)==0 then {1} else {_n*fact1(_n-1)})
This is the same as our definition of "not" above, except that there are an infinite number of procedures to be combined (all possibilities for which the expression doesn't fail):
   `fact1 0 = if (0:Int>=?0)==0 then {1} else {0*fact1(0-1)}
   `fact1 1 = if (1:Int>=?0)==0 then {1} else {1*fact1(1-1)}
   `fact1 2 = if (2:Int>=?0)==0 then {1} else {2*fact1(2-1)}
   ...
Except that it is not to be regarded as being in any order.
Note that we can refer to fact1 inside the definition because that is inside a closure. It is technically a forward reference which is filled in, completing the construction of the closure, when the definition completes. The closure can not be called until all forward references are filled in.

[update 2016-09-28: This is no longer quite correct in the new wombat.]