Tuesday, January 20, 2009

Ripping the backend from Moscow ML

Suppose you have two functions:
(* Type: int * int -> int *)
fun f (x,y) = x + y

(* Type: int -> int -> int *)
fun g x y = x + y
These functions do the same thing, namely add two numbers 'x' and 'y'. They have different types however, and they are called in different ways. 'f' is the usual way in most languages. We take a pair, and return the result. 'g' is curried: it is a function that takes an 'x' and then produces a new function taking a 'y'. The produced function returns the result when given an 'y'. Interestingly, 'g' is more powerful than 'f'. If we write
(* Type: int -> int *)
g 3
we get a so called partial application of 'g'. This means we only supply the first parameter to the function. It allows us to optmize with respect to the first parameter and it is a good way to define very general functions and then specialize them into what we need later on. Here is an example:
(* Type: int -> int -> int *)
fun h x =
 let val z = expensive_operation x
 in (fn y => y + z)
 end
We have defined a function which carries out some expensive operation between getting its first and second parameter. An application, 'let val foo = h 37' say, will only compute the expensive operation once and each time we use 'foo' we reap the benefits. Moscow ML is a quasi-implementation of the Standard ML programming language. It does not follow the standard completely, but it follows it enough to be usable and to let simple code flow between SML implementations without too much problems. What does not flow that easily is the module system which is treated somewhat differently between Moscow ML and e.g. SML/NJ or MLton. Moscow ML works by taking SML source code through a lexer and a parser into an abstract syntax tree. Then the types of all expressions are inferred in an 'elaboration' phase. Next, modules are stripped and pattern matches are simplified via a pattern match compiler. Now, the code is translated into an internal languauge, Lambda. Lambda is a lambda calculus + stuff. Finally, Lambda is translated into bytecode of a virtual machine. The Moscow ML runtime is written in C and can interpret the bytecode output from the compiler. The engine is neat, but it suffers somewhat from age. Several speedups are readily possible on this machine with relative little effort. The naive implementation of the 'g' function from before follows the semantics directly. Take the 'x', produce the intermediate function as a closure and then invoke the closure on 'y'. It is inefficient to produce the closure however, and we don't always have to! in 'g x y' we know all the arguments, so we can avoid building the intermediate function. Moscow ML relies on a derivative of the ZINC abstract machine as its VM. ZINC was an experiment by Xavier Leroy for writing a virtual machine for the Caml language. It is interesting because curried arguments do not build intermediate closures unless they have to. In other words, using the curried, more powerful, variant of a function has no extra cost in this machine. I want to compile Moscow ML into LLVM bytecode. The idea is to compile from Lambda down to LLVM, which is not that different to what the Ocaml compiler is currently doing. In the process, I would like carry over the ZINC machines good handling of curried arguments. I am taking some preliminary steps to get this done, but don't expect anything wild the next couple of months. This takes a lot of time getting right since I am writing it as a hobby project for the time being. I might be posting some updates later when the project has progressed furhter. References -- if you know any reading which might be interesting, do not hesitate to write me: The ZINC Experiment How to make a fast curry (I love Simon Peyton Jones paper names) The Mosml-llvm code lives here
Post a Comment

About Me

My Photo
Lambda-loving CS Geek. Likes metal music. Likes dogs, cats. Does not like pictures of dogs and cats (unless they are lambdacats!)

Has an unhealthy coffee addiction. Calls himself the coffee zombie in the morning (BEEEEANS!)

Has a neverending curiosity gene, loves intelligence and passion.