This page moved
This page is now deprecated in favor of Google code/discuss. See:
Forklift
Forklift is a compiler for fluidly interfacing high level languages with C. It reads a set of unannotated C header files and a set of pattern matching rules about their use, and outputs a ready-to-use library for your favorite high level language.
Skip to a demo of the typestub algebra. A demo of the pattern matching frontend is forthcoming.
What is the forklift view of library software development?
- The most effective software libraries should be freely interchangeable with the most effective programming language runtimes for any particular application.
What are economic substitutes to the forklift view of library software development?
- Interface definition languages (IDLs). By interface definition language we mean a code generator which takes a memory-safe description of an interface and injects it into C or any other language. WSDL/SOAP is a trendy example of an IDL.
- The Simplified Wrapper and Interface Generator (SWIG).
- Virtual machine library aggregation, such as the Java Runtime Environment and the .NET Common Language Runtime.
What makes forklift different?
- Forklift generates foreign function interface stubs based on
reading C header files intact, without annotation. IDL and SWIG
also make stubs from header files, but they do so by requiring
inline modifications to the source code, making any stub
preparation obsolete when version 2.0 of the header files
is released. Forklift's annotations are kept as separate input
files.
- Forklift, like SWIG, works with C code snippets rather than an
exhaustive grammar for describing every possible type. This
makes it extensible and allows it to describe multiple
representations for the same data.
- Forklift requires enough structure of the C code to provide
higher order type representations. For example, given a way
to convert "int" variables, one can provide an abstract
way to convert arrays such that an int array conversion
can be generated.
- Forklift's stub algebra admits full n variables -> k variables
transformation generality. For example we can map an (int, int*)
pair in consecutive parameters to a HLL int array, where the
first is taken to mean the length and the second a pointer.
- Forklift provides for a rich pattern matching vocabulary, not
simply equality matching on variable names.
- Forklift generates constructors, destructors and accessors
for data structures automatically. A flexible set of
options allows choices between single accessors vs. bulk
accessors, read-only vs. read-write accessors, etc. For
languages that support finalization, finalization methods
can be generated.
- Forklift can be run as an interactive session, typically as an
Emacs inferior buffer. Its utility functions allow the user to query the
header files instead of reading them. So one can ask, "what
functions pass a pointer to this structure as a first
parameter?" and get an exhaustive list of function
identifiers. This saves laborious reading, copy-pasting,
and editing.
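The higher-order and n -> k points in the list above can be sketched in a few lines. This is not Forklift's rule syntax or its generated output (the only backend targets Objective Caml); it is an illustrative Python sketch built on the standard ctypes module, and the names convert_int, array_of and convert_int_array are hypothetical. It shows an element converter being lifted to an array converter, and two C parameters (a length and a pointer) collapsing into one high level value.

```python
import ctypes
from typing import Any, Callable, List


def convert_int(c_value: int) -> int:
    """Element converter (hypothetical name): C int -> Python int."""
    return int(c_value)


def array_of(convert_elem: Callable[[Any], Any]) -> Callable[[int, Any], List[Any]]:
    """Higher-order converter: lift an element converter to arrays.

    The result consumes TWO C-side values (length, pointer) and produces
    ONE high level value (a list), i.e. a 2 -> 1 transformation.
    """
    def convert(length: int, pointer: Any) -> List[Any]:
        return [convert_elem(pointer[i]) for i in range(length)]
    return convert


# The int array converter is derived, not written by hand.
convert_int_array = array_of(convert_int)

# Simulate what a C function taking (int n, int *values) would receive.
c_buffer = (ctypes.c_int * 4)(10, 20, 30, 40)
c_len = len(c_buffer)
c_ptr = ctypes.cast(c_buffer, ctypes.POINTER(ctypes.c_int))

result = convert_int_array(c_len, c_ptr)
print(result)  # [10, 20, 30, 40]
```

The same array_of combinator applies unchanged to any other element converter, which is the sense in which the array conversion is generated rather than written once per type.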
Why are virtual machines listed as an economic substitute?
Virtual machines attempt to solve Forklift's problem domain
from a different angle. Instead of solving the language
integration problem, they rewrite the basic functionality of an
operating system and make it available to languages conforming
to a particular notion of memory management and function calling
convention. Aggregating libraries into high level languages
has always been possible for single languages (e.g., CPAN). But
VMs have been effective in making the economics of this wrapping
labor scale better. That is, the same number of labor hours goes
toward connecting a larger number of (language, library) pairs. The
forklift way divorces the library annotation part from the
method of memory management, allowing forklift-annotated libraries
to be called from unorthodox kinds of runtime environments.
Limitations
- Forklift is currently in a pre-alpha stage. It is not yet recommended for production because the design is being implemented in a bottom-up fashion, which makes the user interface the rawest part of the system. If you do use it, keep a copy of the entire program along with each set of annotations you make. Future versions of the compiler are not guaranteed to read the same annotations!
- Forklift currently implements only an Objective Caml backend. The OCaml-specific code is two files of 90 lines each. If you are familiar with your favorite HLL's foreign function interface, you should be able to implement a backend for it with a similar amount of code.
Further Reading
Here are some readings for getting a better sense of what forklift is about.
Design Discussions
Downloads
Browse the source code via cvsweb.
Download release 0.5.1. (92k .tar.gz)
Download release 0.5.2. (198k .tar.gz) (June 2007)
Credits
Special thanks for the systems which came before, which gave good examples of what works and what doesn't. Systems observed include:
Reini Urban has an interesting survey "Design Issues for Foreign Function Interfaces: A survey of existing native interfaces for several languages and some suggestions." http://xarch.tu-graz.ac.at/autocad/lisp/ffis.html
Years back, Olin Shivers had a project bullet that got me thinking, on the back burner, about automatic FFI generation.
Ken Russell gave me the first experience of how fun it can be to have language bindings. His system Ivy bound OpenInventor in Scheme and was 90% automatically generated. The compiler munched on C++ header files. I believe it only handles the call-ins and he wrote the Ivy call-outs by hand.
LablGTK is a really nice binding, IMHO. It binds GTK in Objective Caml with labels which were optional in O'Caml until version 3.05.
Forklift is a subcomponent of the adlibitum project.
Send mail to Jeff: jehenrik (put an at here) yahoo (put a dot here) com.