Wednesday, July 16, 2014

mathematics and programming

I wrote in my main blog about the collision between Mathematics and Programming: http://grampsgrumps.blogspot.com.au/2014/05/mathematics-and-programming-collide.html. More recently my corner of the twitterverse has gone mad on the subject of whether programming is mathematics.

Also recently, John Baez made a relevant comment on Google+ (in a thread about learning R: https://plus.google.com/117663015413546257905/posts/T7foMTXinGG). Since we can't link to Google+ comments, here it is:
I find that writing math papers, teaching, and explaining things online makes me as clear as I want to be.  Programming goes further, into making me clearer than I want to be.  I don't really want to explain something to a complete idiot who goes berserk and throws a temper tantrum whenever I make a simple typo.  :-)
But Voevodsky would say that Mathematics has become too hard, and that we need to explain it to a computerized idiot to avoid significant mistakes: http://www.math.ias.edu/~vladimir/Site3/Univalent_Foundations_files/2014_IAS.pdf.

The problem with serious software development isn't typos; it is the huge amount of stuff you have to know to get it done. Jonathan Edwards wrote recently (http://alarmingdevelopment.org/?p=865):
The way things are today if you want to be a programmer you had best be someone like me on the autism spectrum who has spent their entire life mastering vast realms of arcane knowledge — and enjoys it. Normal humans are effectively excluded from developing software.
Only a small part of that arcane knowledge is really the programming language. Most of it is in the libraries the programmer needs to use. In most cases the libraries give access to some aspect of the runtime environment. For example you can't use Google's App Engine libraries without a good understanding of the environment they operate in. And sadly it is often hard to use the bits that you want to use without understanding aspects of the environment that are not directly relevant but influence the structure of the library and the meaning of parameters.

It is mostly the availability of libraries that determines what language we choose to work in. For App Engine one is almost constrained to work in Python or Java, maybe Scala or Go. Clojure runs in the Java environment (JVM) and has a Clojure-style library for App Engine, but that library isn't kept very current. One could conceivably use Haskell by using the Haskell-to-JavaScript compiler and running the JavaScript on the JVM using Rhino. Even if this worked (unlikely) one would find that there is no idiomatic Haskell way to access App Engine, and one would have to write lots of horrible syntax to access the Java libraries from JavaScript, and use the mechanism for calling JavaScript routines from Haskell. Similarly we see that people doing statistics use R, because it has the libraries and builtin functionality required.

Increasingly everyone in the world will need to do bits of programming at times. That's why we have spreadsheets and statistical packages. R is more than that, but many would say it is too flexible and idiosyncratic for developing large programs safely. In recent memory we have seen a very widely reported Economics paper that was subsequently shot down because of a bug in an Excel spreadsheet.

What we need is for computers to understand a lot more about what we are trying to do, so they can help us more. On the one hand this means that programs should look more like specifications and tests, and researchers are working on this. On the other hand, programming environments need to have good general knowledge.
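To make "programs that look more like specifications" a little more concrete, here is a minimal sketch in Python. The names meets_sort_spec and insertion_sort are made up for illustration. The first function states what a sort must achieve; the second is just one way of achieving it.

```python
from collections import Counter


def meets_sort_spec(inp, out):
    """Specification: out is in order and is a rearrangement of inp."""
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    same_items = Counter(inp) == Counter(out)
    return ordered and same_items


def insertion_sort(items):
    """One possible implementation; the specification does not care how it works."""
    result = []
    for x in items:
        # Walk left from the end until we find where x belongs.
        i = len(result)
        while i > 0 and result[i - 1] > x:
            i -= 1
        result.insert(i, x)
    return result
```

A test is then little more than the specification applied to some input, e.g. meets_sort_spec(data, insertion_sort(data)). The hope is that an environment could eventually do more of the work of getting from the first function to the second.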

And, as it happens, Stephen Wolfram is working on this. For example here is a list of file formats that the new (Mathematica-based) language supports in various ways: http://reference.wolfram.com/language/guide/ListingOfAllFormats.html. [It doesn't seem too much to ask that an interactive environment be able to guess the format of a file and offer you code to access it.]

Mathematics is about thinking clearly, so we expect that good support for it will be the most important thing in facilitating correct programming. The Wolfram Language knows a lot about Mathematics, as we expect from its Mathematica background, but I don't think it is in the best form to be useful. Maybe I just don't understand it yet.

Sunday, July 13, 2014

Yin Wang on mutable and immutable arrays

Wombat can easily accommodate Yin Wang's ideas on arrays: Design mistakes in Swift language’s array.

In Wombat a 1-dimensional Array is immutable and the same as a List (see footnote). To get a mutable Array of X you instead create an Array(Assignable(X)). Conceptually it is a fixed array of addresses in memory. And we can easily create an operator similar to the proposed [[1,2]]. It would create an Array(Assignable(Int)) with 2 entries initialized to 1 and 2 respectively. However, that specific syntax is already taken (it denotes a one-element Array whose element is [1,2]), but something similar, like ≪1,2≫, would be fine.
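Wombat code isn't shown here, but the idea can be roughly imitated in Python as a sketch. The assumptions: a tuple stands in for the immutable Array, and a small hypothetical Assignable class stands in for a memory location. None of this is Wombat syntax.

```python
from dataclasses import dataclass
from typing import Generic, Tuple, TypeVar

T = TypeVar("T")


@dataclass
class Assignable(Generic[T]):
    """A single storage location whose contents can be replaced."""
    value: T


def assignable_array(*items: T) -> Tuple[Assignable[T], ...]:
    """Build an immutable array (a tuple) of assignable cells."""
    return tuple(Assignable(item) for item in items)


# Plays the role of the proposed ≪1,2≫ operator.
cells = assignable_array(1, 2)

# The array itself is fixed: the same two cells, in the same order, forever.
# Only the contents of each cell can change.
cells[0].value = 10
```

Trying to replace a whole cell, as in cells[0] = Assignable(5), raises a TypeError, which matches the picture of a fixed array of addresses.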

footnote: Traditionally a List was a singly linked list where you could efficiently prepend values, and an array was a contiguous block of memory. However the semantics can be identical (with differing efficiencies). Languages should be about semantics and not about implementation and efficiency. The programmer should mostly leave efficiency to the compiler, with the occasional pragma.

[update: just changed Mutable to Assignable]

Friday, July 11, 2014

overlays, shared libraries, and all that

Once upon a time you would link your program with everything it needed, and after that it stayed the same (unless the operating system calls changed behaviour).

But memory was insanely expensive so programs soon didn't fit. I well remember trying to get overlays to work.

So we invented virtual memory and shared libraries. Shared libraries had other advantages. Without changing your program you could get performance and security upgrades to the libraries which improved all the programs using that library. Sometimes there were also functionality improvements and we see that a lot in the smartphone/tablet world with Google moving a lot of functionality out of the operating system and into a library.

So we've got stuck on shared libraries even though they bring up a big problem for the developer: version hell. This is where one library requires a specific version of another library, and then a different library that you also want to use requires a different version of that same dependency.
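A small sketch of how that conflict shows up, in Python. The library names and version numbers are invented, and real dependency resolvers deal with version ranges rather than exact pins. The function simply reports every dependency that is pinned to more than one version.

```python
def find_conflicts(requirements):
    """Return dependencies that different libraries pin to different versions.

    requirements maps each library name to a dict of
    {dependency name: required version}.
    """
    needed = {}
    for lib, deps in requirements.items():
        for dep, version in deps.items():
            # Record which libraries ask for which version of each dependency.
            needed.setdefault(dep, {}).setdefault(version, []).append(lib)
    return {dep: versions for dep, versions in needed.items() if len(versions) > 1}


# Hypothetical example: two libraries want different versions of libC.
requirements = {
    "libA": {"libC": "1.0"},
    "libB": {"libC": "2.0", "libD": "3.1"},
}
conflicts = find_conflicts(requirements)
```

With a single shared copy of libC, one of libA and libB has to lose. If each could carry the version it was built against, the conflict would not matter.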

Memory is now cheap, so really it shouldn't be beyond the wit of man to design a system that incorporates the advantages of shared libraries without the problems. Or maybe allow old fashioned linking, but still allow updating without relinking main. At any rate it is obvious that different libraries should be able to grab stuff from other libraries without getting in each other's way. It doesn't matter if there are multiple versions of the same subroutine in memory. It should be said that libraries should never use global memory: if they want global memory they should map a file with the particular level of granularity they want.

This post was inspired by this observation from the new Swift blog:
"Xcode embeds a small Swift runtime library within your app's bundle"
But this seems to be a temporary thing, not an attack on the problem.

Monday, July 7, 2014

Invariants

I well remember watching this talk from Google I/O 2009: A Design for a Distributed Transaction Layer for Google App Engine. The general message:
Programmers should concentrate on invariants in the software.
And it was interesting to see them apply this in a very specific case (where I happened to understand the problem and the context).
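As a generic illustration of concentrating on an invariant (this is not the example from the talk, and the names are made up), consider moving money between accounts. The invariants are that the total never changes and that no balance goes negative. Here is a minimal Python sketch that checks both on every update:

```python
def transfer(balances, source, target, amount):
    """Return new balances after moving amount from source to target.

    Invariants: the total across all accounts is unchanged,
    and no balance is negative.
    """
    if amount < 0 or balances[source] < amount:
        raise ValueError("transfer would break the non-negative balance rule")
    updated = dict(balances)  # leave the caller's balances untouched
    updated[source] -= amount
    updated[target] += amount
    # Check the invariants explicitly rather than trusting the arithmetic above.
    assert sum(updated.values()) == sum(balances.values())
    assert all(v >= 0 for v in updated.values())
    return updated
```

In a distributed setting the hard part is making sure nobody can observe a state in between the two updates, which is where a transaction layer comes in.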

And I wondered whether programming languages and libraries could support this approach better.

This paper and talk by Bob Atkey look very relevant: From Parametricity to Conservation Laws, via Noether's Theorem (slides). Unfortunately, understanding them is a bit of a way down my stack.