on the lisp family of languages

Date: 2008-11-29 13:23:00
Tags: psil, lisp

I've learned more about Lisp dialects in the last couple of weeks than I really expected to. As I mentioned in a previous entry, I'm looking to build an embeddable Lisp-like language for Python.

While it's true that people have built this sort of thing before (see list in previous post), there's nothing quite like building it yourself. I know that if I simply started using PyScheme, for example, I wouldn't gain the depth of understanding that comes from having actually built one.

The family of Lisp languages is broadly divided into two types: Lisp-1 and Lisp-2. Lisp-1 languages (e.g. Scheme) have a single namespace for symbols. That means that if a symbol like foo refers to an object, it refers to that same object whether it appears in function position or as an argument. Lisp-2 languages (e.g. Common Lisp) have two or more namespaces, which means that a symbol bar could mean the number 5 in one context, and a completely unrelated function in another context. For example:

> bar              ; refers to the value namespace
5
> (bar 4)          ; refers to the function namespace
16
> (display bar)    ; refers to the value namespace again
5
> (display #'bar)  ; special syntax needed to refer to the function namespace here
#<function>

This reminds me of Perl's namespace handling, where $baz, @baz, %baz, and so on all refer to completely different things. I'd like a Lisp-1, thanks.
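
In evaluator terms, the difference comes down to where the head of a form gets looked up. Here is a rough sketch in Python, assuming expressions are nested Python lists and symbols are plain strings. The names eval_lisp1 and eval_lisp2 are made up for illustration and don't come from any existing implementation.

```python
# Hypothetical sketch: symbol lookup in a Lisp-1 versus a Lisp-2.
# Expressions are Python lists; symbols are Python strings.

def eval_lisp1(expr, env):
    """One namespace: the head of a form is looked up like any other symbol."""
    if isinstance(expr, str):
        return env[expr]
    if isinstance(expr, list):
        fn = eval_lisp1(expr[0], env)
        args = [eval_lisp1(arg, env) for arg in expr[1:]]
        return fn(*args)
    return expr  # numbers evaluate to themselves


def eval_lisp2(expr, values, functions):
    """Two namespaces: the head comes from `functions`, everything else from `values`."""
    if isinstance(expr, str):
        return values[expr]
    if isinstance(expr, list):
        fn = functions[expr[0]]  # head symbol is never looked up in `values`
        args = [eval_lisp2(arg, values, functions) for arg in expr[1:]]
        return fn(*args)
    return expr


# Lisp-2: the same symbol names a number and a function at once.
values = {"bar": 5}
functions = {"bar": lambda n: n * n}

lisp2_value = eval_lisp2("bar", values, functions)          # value namespace
lisp2_call = eval_lisp2(["bar", 4], values, functions)      # function namespace
lisp2_both = eval_lisp2(["bar", "bar"], values, functions)  # both in one form

# Lisp-1: one dictionary, so the function needs its own name.
env = {"square": lambda n: n * n, "bar": 5}
lisp1_call = eval_lisp1(["square", "bar"], env)
```

In the Lisp-1 version a function is just another value in the environment, which maps directly onto how Python itself treats names.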

Common Lisp is pretty heavyweight and contains a lot of stuff that isn't necessary for an embeddable language. The basic operators have odd names (setq, incf, progn, etc.), and it contains its own whole world of data structures. And it's a Lisp-2, which I'd prefer not to have to deal with.

Scheme is another dialect of Lisp (and in fact predates Common Lisp), and it was designed from the start to be suitable as an embedded language. It is a Lisp-1. The creators of Scheme objected to the implementation of macros in Lisp, due to the potential problems of variable capture, where a name introduced by a macro expansion accidentally collides with a name in the code that uses the macro. They created a whole new kind of macro system called "hygienic macros", which was designed to avoid common pitfalls in macro programming. However, with a bit of care one can avoid these problems when using conventional Lisp macros.
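
To make the capture problem concrete, here is a minimal sketch in Python. It assumes a toy evaluator that understands only if and a single-binding let, with expressions written as nested lists. The macro being expanded is a hypothetical two-argument "or", and all of the function names are invented for this example. The "bit of care" in the conventional approach is the gensym call, which hands out a name nobody else could be using.

```python
import itertools

def evaluate(expr, env):
    """Toy evaluator: symbols are strings, forms are lists."""
    if isinstance(expr, str):
        return env[expr]
    if not isinstance(expr, list):
        return expr  # literals evaluate to themselves
    head = expr[0]
    if head == "if":
        _, test, then, alt = expr
        return evaluate(then if evaluate(test, env) else alt, env)
    if head == "let":
        _, name, value, body = expr
        inner = dict(env)
        inner[name] = evaluate(value, env)
        return evaluate(body, inner)
    raise ValueError(f"unknown form: {head!r}")


def expand_or_naive(a, b):
    """(or a b) => (let tmp a (if tmp tmp b)), using a fixed temporary name."""
    return ["let", "tmp", a, ["if", "tmp", "tmp", b]]


_counter = itertools.count()

def gensym(prefix="g"):
    """Return a symbol name that has not been handed out before."""
    return f"{prefix}__{next(_counter)}"


def expand_or_careful(a, b):
    """Same expansion, but the temporary gets a fresh name each time."""
    tmp = gensym("tmp")
    return ["let", tmp, a, ["if", tmp, tmp, b]]


# The caller happens to have a variable that is also called tmp.
env = {"tmp": 42}

# Naive expansion: the caller's tmp is shadowed by the macro's tmp.
naive_result = evaluate(expand_or_naive(False, "tmp"), env)

# Careful expansion: the caller's tmp is left alone.
careful_result = evaluate(expand_or_careful(False, "tmp"), env)
```

The naive version evaluates to False because the macro's own tmp shadows the caller's, while the careful version gives the expected 42.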

Arc is Paul Graham's new Lisp dialect and initially looks interesting. Arc is small like Scheme, but Paul Graham's obsession with making things shorter almost makes it Perl-ish in terseness. Without already knowing what each of the following symbols does, it would be hard to guess their purpose even for somebody familiar with Lisp or Scheme: scar, o, w/uniq, iso, ccc. The assignment operator is =, while the comparison operator is is. Writing code like (if (= x 5) ...) would almost certainly run contrary to the author's intentions (and is very much like the common C error if (x = 5) ...).

Arc is based closely on MzScheme, to the point of currently requiring a very specific (old) version of MzScheme to run. This tight implementation coupling shows through in various ways, including in the Arc library functions available (e.g. the threading support). Arc itself is enjoying slow progress, and there almost immediately appeared a fork called Anarki, seemingly leaving Paul Graham's implementation behind.

Clojure is another new Lisp dialect, this one targeting the Java VM. Its goal is good integration with the whole Java world, which actually aligns well with my goal of integrating with the Python world. It largely solves the problems of variable capture in macros by using an "auto-gensyms" feature for the quasiquote operator. Their Differences with Lisps page contains a nice summary table of important differences between Clojure, Common Lisp, Scheme, and Java. It is unsettling that the treatment of nil/true/false varies so much from one language to the next, but I think those differences are largely historical, and reflect changing notions in computer science.
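
The auto-gensym idea is simple enough to sketch in Python. In Clojure, a symbol written with a trailing # inside a quasiquoted template is replaced by a freshly generated name, and every occurrence of that symbol within the same template gets the same replacement. The version below is only an approximation under my own assumptions: templates are nested Python lists, the function name syntax_quote is made up, and the format of the generated names is loosely modelled on Clojure's rather than copied from it.

```python
import itertools

_ids = itertools.count(1)

def syntax_quote(template):
    """Copy `template`, replacing each symbol that ends in '#' with a fresh name.

    Within one call, repeated uses of the same '#' symbol map to the same
    fresh name; separate calls never share names.
    """
    fresh = {}

    def walk(node):
        if isinstance(node, list):
            return [walk(child) for child in node]
        if isinstance(node, str) and node.endswith("#"):
            if node not in fresh:
                fresh[node] = f"{node[:-1]}__{next(_ids)}__auto"
            return fresh[node]
        return node

    return walk(template)


# Two expansions of the same template for a hypothetical two-argument "or".
template = ["let", "tmp#", "a", ["if", "tmp#", "tmp#", "b"]]
first = syntax_quote(template)
second = syntax_quote(template)
```

Within first, all three occurrences of tmp# become the same generated name, and second gets a different one, so neither expansion can collide with the other or with a tmp in the caller's code.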

What comes of all this? I had initially thought that I wanted to implement an existing language (probably Scheme or Arc) in Python, but I'm becoming increasingly of the opinion that neither of those is quite suitable for what I want to do. I think it looks like I'm going to take a Clojure-like approach and build a new language based around the Python runtime and ecosystem, borrowing ideas freely from all of the above.

Greg Hewgill <greg@hewgill.com>