Wednesday, March 04, 2009

LLVM Update

People tracking along will know that I am in the process of writing an LLVM backend for Moscow ML. There are several ways in which one can do this, so one has to be decided upon and tried out. It is however clear we need a way to manipulate and output LLVMs IR, so I set upon writing an set of bindings.

These bindings are not using an FFI to call into the C++ LLVM code. Rather they produce the IR assembly from an internal SML-like data structure. This approach has the disadvantage that it is more work to maintain. On the other hand, the OCaml bindings passes around llvalues all over the place so very few bugs will be captured at compile time (we could do oh so much better had we had dependent types).

I have written LLVM as a datatype in SML and then written about 40 percent of a type checker for it. The process is very simple as soon as one has converted LLVM rules into SML datatypes. Here is a part of the Type module

datatype t =
    T_Integer of int
  | T_Real of float_ty
  | T_Fun of {return: t, params: t list}
  | T_Struct of {elements: t list, packed: bool}
  | T_Array of {ty: t, length: int}
  | T_Pointer of t
  | T_Vector of {ty: t, length: int}
  | T_Opaque
  | T_Void
  | T_Label
  | T_Top

which is a straightforward adaptation of the Language Reference of LLVM. I could simplify it more, but at the moment there is little need for that.

Type checking is carried out by a set of combinators. They probably form a monad (Haskell guys can relate this to a variant of the Either monad with a Writer on it as well. It should be straightforward to write it down as one or use a transformer stack to achieve the same. I have not given it much thought though).

The checking basis is the datatype

datatype t_check = Ok | Wrong of string list

Immediately, a helper is introduced

fun type_fail str = Wrong [str]

And a function for running a checker

fun run checker_fun ty =
 case checker_fun ty of
   Ok => ()
 | Wrong errors =>
     <<Print out type checking errors>>

Note that run will have type (ty -> t_check) -> ty -> unit. The question is then how to produce type checkers. This is done with combinators but of course:

(* Logical or for types, consider monadizing *)
fun or checker1 checker2 ty =
case checker1 ty of
 Ok => Ok
| Wrong err1 => case checker2 ty of
         Ok => Ok
       | Wrong err2 => Wrong (List.concat [err1, err2])

The 'or' checker tries the first checker and if it fail it tries the second, picking up any errors on the way. It is also easy to assert that a given type must be the base type of a vector type:

fun vectorized checker1 vty =
case vty of
 T_Vector {ty, ...} => checker1 ty
| _ => type_fail "Checked type is not a vector type"

Perhaps one could be giving a better error message here by capturing the eventual error output from checker1, but this has not been done yet.

Primitive checks are just doint what they are supposed to do. The assertion for an integer looks like the following:

fun assert_int ty =
case ty of
 T_Integer _ => Ok
| _ => type_fail "Type is not of integer type."

Usage

An addition in LLVM is either an integer, a float or a vector of these. Aha:

val assert_int_float = or assert_int assert_float
val assert_add = or assert_int_float (vectorized assert_int_float)

Hopefully this shows the succinctness of the approach.

EDIT: Layout was bad. Fixed

Post a Comment

About Me

My Photo
Lambda-loving CS Geek. Likes metal music. Likes dogs. Likes cats. Does not like pictures of dogs and cats (unless they are lambdacats!)

Has an unhealthy coffee addiction. Calls himself the coffee zombie in the morning (BEEEEANS!)

Has a neverending curiosity gene. Likes intelligence.