lisp-wins

Lisp Wins (I think)

Stevey's Drunken Blog Rants™

(Note: although I came to the conclusion in this article that Lisp beats Java hands-down as a language, more research has got me thinking that Java beats Lisp hands-down as a platform, and until a better Lisp comes along, Java still may be the premier general-purpose platform for building large server-side systems. At the moment I'm uncommitted.)

Folks who know me well enough to have seen me sloshed and mumbling incoherently about languages just before passing out, know that I've been on approximately a year-long quest to find a replacement for Java.

Here's the abbreviated history of my programming career, from a languages perspective, to put the following discussion in context:

    • HP 28s and 48c stack language (1986-1987)

    • Pascal, and eventually Turbo Pascal (1988-1989)

    • Fortran for introduction to programming (1989)

    • Ada for intro to programming part II (1990)

    • C, C++ at UW for most CS class projects (1990-1992)

    • Lisp, Smalltalk, Prolog in prog lang class (1991)

    • A bit of Lisp for AI course (1991)

    • 8086 assembly language (1992-1995)

    • very minor doses of Perl and Tcl (1992-1995)

    • minor but increasing doses of Emacs Lisp (1992-1995)

    • C++ and Win32 low-level (driver) API (1995)

    • back to 8086 assembler (1995-1996)

    • Java: Nov 1996 - 2004/present

    • Perl, Python: 1998-present

    • Emacs-Lisp (larger doses): ongoing

You'll notice that the very long stints were primarily assembly language for about 5 years, and Java for a bit over 7 years, with Perl, Python and Emacs-Lisp playing backup now and then.

So mostly I'm a Java programmer.

Initially it was pretty much a six-year-long love affair. Of course, after 5 years of assembly language, anything with if-statements and loops would have seemed like heaven. But half a million lines of code, most of it in one application, is enough to ruin the best of honeymoons. I realized, about 18 months ago, that I was getting sick of Java.

I was sick of it primarily because of its lack of compressibility. There's no preprocessor and no templates, so you can't use the standard (if gross) code-generation hacks to achieve something like compression. You can only go so far with Java's OO model before you've stripped it down to what I'd call "essential boilerplate code". Every language has some amount of it: code that doesn't affect the computation, and only exists (presumably) to help the compilers who were at the bottom of their class in Compiler School. Java, in my opinion, has waaay too much boilerplate that you can't eliminate, and it obscures your actual application logic and slows down development, refactoring, debugging, and just about everything else.

Part of the problem is just clutter: the Java programming language is pretty darn verbose. To illustrate, let's consider this snippet of what I hope will be reasonably intuitive Perl code:

my @squares = map { $_ * $_ } (1..5);

print "@squares\n";

This prints the sequence:

1 4 9 16 25

Even if you don't know Perl, you have to be able to deduce more or less what this code must be doing in order to print out the first five squares.

The equivalent code in Java, as reasonably compressed as I could make it, is:

import java.util.*;

public class Fugger {

public static void main ( String[] args ) {

List numbers = new ArrayList();

for ( int i = 0; i < 5; i++ ) {

numbers.add ( new Integer(i) );

}

List squares = new ArrayList();

Iterator it = numbers.iterator();

while ( it.hasNext() ) {

int i = ((Integer)it.next()).intValue();

squares.add ( new Integer(i * i) );

}

Iterator i2 = squares.iterator();

while ( i2.hasNext() ) {

int i = ((Integer)i2.next()).intValue();

System.out.print ( i );

if ( i2.hasNext() ) {

System.out.print ( " " );

}

}

System.out.println();

}

}

Golly. My half a million lines of Java code isn't looking quite so glamorous anymore, is it?

Before you go critiquing my Java code, remember that I'm trying to match the semantics of the Perl version as closely as possible, starting with the fact that it has to be a fully-functional, standalone program when fed to the interpreter. Here's what the Perl version does:

    • first it creates a list of the integers 1 through 5

    • then it creates a function that receives an int and returns its square

    • it passes that function to each element, creating a list of the results

    • it assigns the result list to a named variable

    • it prints the result list, separated by spaces, with a space at the end

Technically, I cheated a little in the Java version; it doesn't use the Visitor pattern, since that's not natively supported in any way by the Java language (or libraries). It uses Iterator, which is a poor-man's Visitor. In a larger program, it might actually matter - I might really need to pass a function in to visit every element of the first list. To do that, I'd have to declare a Visitor interface, an anonymous inner class, and create subclasses of ArrayList that can accept my Visitors. The program size would at least double.

I actually chose this example pretty much at random; I was as surprised as you probably are to see just how verbose the Java version is. Yes, I'm well aware that Java 5 came in and saved the day with its "smart" for-loops, autoboxing, static imports, and other amazing time- and space-savers. So here's the incredibly succinct Java 5 version:

import java.util.*;

import static java.lang.System.out;

public class Fugger {

public static void main ( String[] args ) {

List<Integer> numbers = new ArrayList<Integer>();

for ( int i = 0; i < 5; i++ ) {

numbers.add( i );

}

List<Integer> squares = new ArrayList<Integer>();

for ( int i : numbers ) {

squares.add( i * i );

}

Iterator<Integer> it = squares.iterator();

while ( it.hasNext() ) {

int i = it.next();

out.print( i );

if ( it.hasNext() ) {

out.print( " " );

}

}

out.println();

}

}

Well, poop! I guess that didn't help much, did it? It's a little cleaner, but it's still about 10 times longer than the Perl version.

All that Java 5 "cleanup" didn't address the root cause of this order-magnitude size difference -- namely that in Java, if you find yourself saying the same kinds of things over and over again, there's no way to create shortcuts. All Java 5 does (for compressibility, anyway) is add a few shortcuts for a few idioms that were especially aggravating. They didn't help you at all with the more general problem of creating your own shortcuts.

(The Java 5 release does fix some other problems that aren't central to this discussion; even if it's a long way from Guy Steele's wonderful long-term vision, it's still an important step for Java. It's gratifying to see that the language can still evolve, even with a huge bureacracy behind it.)

Occasional Elegance

Before you Perl-lovers get too smug, I'll assert that writing Perl code isn't exactly a walk in the park, either. I managed to find a particular Perl syntax that was reasonably minimal, almost bordering on elegant, but that's pretty unusual for Perl. Sometimes it fits just right, but when it misses, the best code a human could write looks like it was produced by an agitated baboon banging its feet on the keyboard for 20 minutes.

In sharp contrast with Perl's occasional elegance, many languages manage to seem elegant all the time. Consider:

puts (1..5).collect{|x| x * x }.join(' ')

This is Ruby code. Ruby's modeled loosely on Perl, somewhat, but Ruby's very, very clean. Every Perl programmer ought to take a look at it - you'll be amazed at how much Ruby you already know.

Ruby's not the only consistently-elegant language out there. Consider the equivalent Python, for instance:

' '.join([`x * x` for x in range(1, 6)])

Also very nice. But Ruby turns out to be easier for Perl programmers to learn.

In fact, Ruby's one of the easiest languages for just about anyone to learn -- it's maybe a 3-day exercise, and after a week or so of using Ruby, you just may never look back. It's like getting a new car: it still gets you from A to B, but without your old car's flaky transmission, cracked windshield, and 5-year-old french-fry remnants under the driver's seat. But I should save that discussion for another blog.

I could try being completely fair to the Java version, and use other syntactic tricks to try to tighten it up. But to make it apples-to-apples, I'd have to make the semantics closer as well, which would require the introduction of several new classes and functions to emulate the real behavior of map().

In the end, 10x bigger is as good as it gets. You can try any amount of squeezing, but unless you're willing to break the semantics of the problem then this is about as short as you can make this program.

But we all know what the program is supposed to do; it's clear enough in Ruby/Python/Perl versions: initialize a list of the numbers 1 to 5, map a square function onto it to produce a list of the squares, and print them out with spaces. It's shorter to say in Perl than in English. So the Java version clearly has a bunch of clutter that has nothing to do with the actual computation going on.

What you really want to be able to do in Java, particularly after writing half a million @#$! lines of the stuff, is factor it. Yes, I read Fowler's seminal Refactoring book, and it was truly wonderful and inspiring and all that happy hooey. Yes, I refactor. But Refactoring doesn't help you compress your code. Not much. In fact, typically it expands it, by adding in more clutter in the form of classes that are just wrappers for functions, or functions that are just wrappers for small computations, or interfaces that are just verbiage for what are incredibly fundamental operations that Java pretends not to know about, like map() and filter().

Java's shortcuts fail to satisfy

You may have noticed one minor detail of the problem that really bent us over in the Java version: we don't print a trailing space character at the end. After the last square, there's just a newline.

What are our options in Java? Well, we could create a StringBuffer and append everything to it, and then print out its toString() after trimming its length, but you really don't want to go there, trust me. It'll add five more lines of code.

So we're pretty much doomed to a complete overhaul of the final loop, because the "smart" for-loop provides you with no way to access the underlying iterator, so you can't check it to see if you're on the last element. You have to rewrite it to expose the iterator, which you were so happy to have hidden before.

It appears that Java 5's clever new shortcut isn't all that clever, or at least not very flexible: any variations on the loop pattern, and it's of no use to you. Back to the old way.

Well, let's say you do this all the time. You have 100 of these loops in a file, and 1000 of them scattered through your application, and none of them can use the new "smart" foreach-loop, because they want to do something different on the last iteration.

What you want is to be able to define a code template, and I mean this without any particular reference to C++ templates, although they would probably solve this particular problem pretty well. You just want a way to say: "here's a pattern that shows up all the time in my code, and I want to make a shortcut for it". You can't make it a method call. You could try, by creating a method in some utility class somewhere called printIntListElementsWithSpacesExceptAtTheEnd, and move the bloat into that method. Then everywhere you did this particular kind of loop, you'd be left with:

printIntListElementsWithSpacesExceptAtTheEnd(mylist);

Yeah, it's shorter, sort of. But all you've done is extract out that exact code. You only benefit from it (if you can even call this monstrously specialized function a benefit) if you use that exact pattern an awful lot. But what if you wanted your shortcut to work with output streams other than stdout? Or pass a char list instead of an int list? Or have a different separator, or terminator? You'd have to add a bunch of parameters to the function, or create a bunch of overloaded versions, and your code would be just as miserable as it was before (or nearly so). Classic Java-style Object-Oriented Refactoring has definite limits, and I don't like those limits anymore.

Unfortunately, Perl doesn't help you here, nor does Python. They're both far less verbose than Java, but they're just its slimmed-down cousins, really: they provide lots more shortcuts, but little or no ability to create your own.

Macros and Metaprogramming

[Note 8/25/05 -- Since writing this essay, I've found that Perl, Python and (especially) Ruby all provide more metaprogramming facilities than I thought. Ruby's even powerful enough to create some kinds of syntactic macros. Even so, none of them really compares to Lisp in this department, so I'll leave the section as-is. The point is that Java has absolutely nothing to help you here.]

When I say create your own shortcuts, I'm referring to the fancy-sounding (but not actually that tricky) practice of metaprogramming. Metaprogramming just means that your programming language has provided some features that let you change the behavior of the language to better suit your needs. Not all languages have it, and those that do have different approaches to it.

The easiest way to think of metaprogramming is as a horizontal line with your language sitting right in the middle of that line, as a single point. (Source: Kiczales, The Art of the MetaObject Protocol.) If you were to change the language very slightly, it would move left or right. For instance, a long time ago, someone made a Java preprocessor that let you make minor syntactic changes, such as this:

public void myMethod() {

// do something

} catch ( Exception x ) {

// clean up

}

This isn't Java, but you can see what it's trying to do. It's making a minor cosmetic change, to eliminate the extra noise (and indenting) introduced by a try/catch block. It has exactly the same semantics, and the preprocessor turns it into legal Java code.

That's a simple example of metaprogramming. Java doesn't have it, or anything even remotely resembling it. It's as if your code is carved into concrete.

C++ has templates and template metaprogramming. Templates are, in fact, pretty much exactly what I'm talking about here, and they would be a great solution, except for the fact that they have less power and more verbosity than the original language, whether it's C++ or Java that you're talking about. So yes, they work, sort of, but the acrobatics you need to go through, for very little actual extra expressive power, just doesn't seem worth it to me.

The kinds of metaprogramming I've been discussing are mostly syntactic changes: you're converting a verbose, cluttered code pattern into something that expresses the computation more cleanly. You might wonder why the language designer didn't make it clean to begin with, but tastes differ, and language design is really hard. That's why metaprogramming is useful: it gives you a vote in how the language should work. Syntactic changes require something like a preprocessor, since it has to interact with the compiler in some form. But make no mistake: there are better ways than C++ templates. I'll talk about them in a minute.

The other main kind of metaprogramming lets you muck with the semantics of the language: for instance, changing the rules for how to search the inheritance chain, or how to access instance variables. This kind of metaprogramming requires a MetaObject Protocol, which is a really fancy/scary-sounding term, but all it means (in OO terms) is that the classes representing Class, Method, Object, and so on can be subclassed. Java has a class called Class, and a class called Method, but they're final - you can't change their behavior.

Ruby has a MetaObject Protocol (MOP) that gives you some control over the semantics, but not the syntax. So does Python. They're both cool, and easy to use - e.g. once I was writing some Ruby code, and wished that the String class had an endsWith method like the Java String class. So I added one:

class String

def endsWith(arg)

arg =~ /.*arg$/

end

end

(Ruby fans will spot a minor mistake here, but I'll leave it as is for clarity). In any case, that's about as simple as it gets - you can declare the same class in multiple places, and Ruby assembles all the declarations, so the built-in String class wound up with my endsWith() method, which I could then use everywhere, even with literals:

> "foobar".endsWith("bar")

=> true

So that's cool and all, and it helps, but it also has its share of limitations, because you can't change the syntax. And with C++, you can change the semantics and the syntax, but only so much: with a lot of work, you can make the dot on our horizontal line move left or right, but it's a LOT of work, like pushing a dead horse down the street. It's about that pleasant, too.

Sophisticated metaprogramming systems offer you the ability to make both syntactic and semantic refinements to your language: in a word, shortcuts. Smalltalk has powerful metaprogramming support; Ocaml has a macro system (which someone used to create syntax for pattern-matching alternatives using Perl5-compatable regexes, which is really cool); some other languages support these things as well. Even C++ has some other metaprogramming features, like operator overloading - minor, but it does the trick: you use it to extend the syntax, hopefully to make your program shorter and more readable to others.

That, of course, is the problem with metaprogramming: if you change the language, then your readers will be looking at a language that's not quite the one they learned to love (or hate). You need to exercise some amount of taste, and restraint, or your language will become utterly unrecognizable, even to its own mother. (Bjarne). But taste is needed for many areas of programming; metaprogramming isn't exactly alone here.

The great big granddaddy of metaprogrammable languages is Common Lisp; it's safe to say that most (good) programming in Common Lisp involves refactoring via macros and metaprogramming. The object system itself (CLOS) is extensible by metaprogramming, although it was probably overkill. But Lisp's metaprogramming is the gold standard, and Lisp programmers rarely have to wait for their vendor to supply features they need.

Lisp Wins

To tie things up, I've decided that:

    • I hate Java's verbosity. The language is just inflexible. I love the runtime (JVM), the libraries, the documentation, the interoperability, the security - pretty much everything about Java is pretty decent, except for Java. And they didn't have the common courtesy to offer me a way out.

    • I need a way to turn my 450k lines of code into 100k lines, while doing the same computation with the same performance. According to Fred Brooks (and others), this will reduce my bug count by about 80%. And it'll be more readable, more maintainable, and generally offer much faster development.

After my year-long search, which included brief flings with cool (but impractical) research languages like Haskell, and the ever-promising-but-not-there-yet OCaml, I've settled on Lisp. Jury's still out on Common Lisp vs. Scheme vs. doing either one in a holding pattern until Arc comes out; we'll see. (Common Lisp is practical and industrial-strength, just like owning your own elephant. No worse than Java or C++, I think, but I'd sort of forgotten how much work it is to ramp up on a totally new non-scripting language platform.) One way or another, though, the winner seems to be Lisp.

And now I wonder how it took me 8 years to figure this out.

(first published in August 2004)

Comments

Now that you've found Lisp, what can you do with it? I've not spent much time on lisp, but without libraries for networking (or really anything else) I immediately can't figure out what to do with lisp. Isn't that why Paul Graham is making Arc?

Posted by: Justin P. at September 5, 2004 08:17 PM