Daily Archives: 9 May, 2013

A language with the structure of water….

 

So here’s an interesting question; what makes one computer programming language more expressive than another?  Everybody understands that My Favorite Toy Language is considered by its partisans to be “more expressive” than other languages, in the sense that it is easier to do more with it.

 

Douglas Hofstadter made an exposition of the problem with programming languages he called Goop and Floop and Bloop when he was writing in his book Metamagical Themas: Questing For The Essence Of Mind And Pattern.

 

I highly recommend that book by the way; it covers an amazing breadth of topics and covers them very well.

 

Wikipedia has an article on expressiveness of programming languages too.

 

But most measures of programming language expressiveness are subjective.  In some sense all real programming languages are equivalent in that they are supposedly Turing Complete.  (I say supposedly because Turing Machines have infinite memory; a feature not yet found in any real programming environment). But a quick comparison of these supposedly equivalent languages rapidly reveals that although it may be possible to write any program in any language, it can be very, very, hard in some of them.  This is called the Turing Tarpit.

 

So, there is something we call expressiveness.  Intuitively, it’s a measure of how easy it is to write programs in a given language and how easy it is to structure them how you want in order to keep them easy to understand and work with.  But it’s hard to define precisely.

 

Matthias Felleisen has an interesting notion of what Expressiveness means in a programming language.  And for what it’s worth, his is the best statement of the notion that I have come across.

 

He says that if you have two languages A and B, and it is the case that any program written in A can be rewritten in B without changing the structure of the code, But there are programs in B that cannot be rewritten in A without changing the structure of the code, then B is more expressive than A.  This notion still has some difficult subjective bits attached to it, but they’re manageable.  Firstly, we have to ask what it means to “change the structure of the code.”  Obviously, the two languages have different syntax and semantics; otherwise they’d be one language.  So there will necessarily be a certain amount of change involved as we swap out the syntax of one language for another.

 

The crucial point here is the notion of syntactic changes as ‘local’ transformations and more drastic changes that involve rearranging the entire structure of the program as ‘structural’ transformations.  So, if you have to change one symbol for a different one in an assignment statement, or mark comments in a different way, that’s a local syntactic change.  If you have to write a memory manager to replace the garbage collection provided by the other language, or transform a program into continuation-passing style to achieve the same control flow, that’s a structural change.

 

There is a language implementation I’ve been working on for some time, which offers a sort of absolute flexibility in that approximately anything I can think of any programming language syntax doing, (with the exception of a few esoteric languages that challenge our notion of what functional or syntactic abstraction  is in the first place) can be done by a routine or function defined in this language.  I had it in mind to make a dialect that can be used as a simple, easy translation target for automatic translation of code written in nearly any programming language, and as I read Felleisen’s article I realized that what I’ve been doing is maximizing expressiveness, as Felleisen defines expressiveness.

 

To be precise, my design goal is that, for as many programming languages as possible, it should be possible to take a program written in that language, parse it so you have its abstract syntax tree, express an isomorphic abstract syntax tree in Lisp syntax, add a language-specific library that defines the keywords used to name the nodes of the syntax tree as functions, and have a program with the same semantics expressed in the new language.

 

However, this leads to a paradox.  We like to think of more expressive languages as being easier to read, write, understand, and use.  But in a quest to allow translation libraries to express the semantics of any language without altering its abstract syntax tree, it is necessary to allow semantics that were mistakes in the original languages, or which conflict in subtle ways with the semantics of exactly the same abstract syntax tree in other languages, or which are just plain confusing and bad abstractions in the first place.  What it leads to, in fact, is the situation where code that uses any definition you don’t already know is de facto incomprehensible, and where any abstraction you may think you have built can be subverted by some other part of the code.  And that means that, although it is more expressive in Felleisen’s sense than any other language I can imagine, it is also potentially confusing, hard to understand, and subject to all the same difficulties in reading and writing as code in any of its translation-source languages.

 

Language designers these days boast about having impenetrable abstractions, about the things you don’t have to worry about because with their new, stricter type systems they simply can’t happen, and how knowing that no other routine is capable of breaking your abstraction you are free to change its implementation at any time.   I contemplate a language with no such guarantees.  And I think I’m not crazy — but then crazy people always think that, don’t they?

 

What you have here is a language with no structure — or, if you prefer, with the structure of water. It can be deformed arbitrarily, and without any stress to itself, to fit the available space.  All the usual tools for handling rigidly structured things will utterly fail when you use them to try to grasp water, or dig a hole in it, or whatever.  And while this maximizes Felleisen’s ideal of expressiveness, it utterly fails to have the intellectual tractability that allows human beings to understand and reason about a program — and that makes it unexpressive by some standards, in the same way that arbitrarily complicated languages like C++ are unexpressive.

 

I have not really decided how I feel about this yet.  I can’t keep the problematic cases and semantic abuses out of this language and still accomplish my goal of creating an absolute translation target.  Nevertheless, those things will deprive the users of some useful guarantees that would help them to understand and reason about code.

 

So I think that the best thing I can hope to do is to provide good tractable ways of doing all the things that people might otherwise use  the dangerous features to do and isolate those dangerous features behind a comprehensive set of danger signs.  A translation isn’t useful unless it provides not only the same results as the original, but is also subject to the same reasoning techniques and provides the same structural guarantees as the original.   What it takes is a change in perspective; it is individual programs, and not a language taken as a whole, that are subject to semantic certainties and guarantees.  I need to provide a comprehensive way of excluding things that break particular guarantees, then allow the user to declare those guarantees and give a build error if the code then attempts to include things from the parts of the language that will break them.

 

The usual way in computer programming is to find ways to include things, relying entirely on the programmer’s knowledge or the structure of the language itself to exclude things that don’t belong together.  But by providing an language more comprehensive in potential causes of misunderstanding than any which has gone before,  it becomes even more vitally important to provide ways to manage and guarantee certain kinds of exclusions than it is to find ways to allow inclusions.

 

The Pilgrims who founded the USA came here in search of what they called Freedom of Religious Expression — by which they meant they wanted to impose on themselves and each other restrictions greater than those then allowed by English law.  Quite a paradox that, and sometimes I think that that twisted logic has placed an inescapable and not altogether wholesome founder effect on this nation.  Still, then as now, sometimes freedom of expression requires the freedom to express restrictions and constraints.

 

This article about design goals and problems is part one of a series. In the next post in this series, I talk about fexprs, which are the approach that’s necessary to achieve this goal. And, inevitably, about some of the problems they pose in semantics. So if you’d like to keep reading, I invite you to follow the link.