Didier Verna's scientific blog: Lisp, Emacs, LaTeX and random stuff.

Tag - Extensibility


Monday, May 14 2012

Monday Troll: the syntax extension myth

Here's a little Monday Troll.

To my greatest disappointment, I discovered today that it is not possible to replace Lisp parentheses with, say, ... curly braces. What a shame. Hell, it's not even possible to freely mix the two. Very naively, I had expected that:

(set-macro-character #\{ (get-macro-character #\())
(set-macro-character #\} (get-macro-character #\)))

would suffice, but no. All the implementations I have tested seem to agree on this, although the error messages differ. For instance, trying to evaluate {and} gives you an "unmatched close parenthesis" error, except in CMU-CL, which chooses to ignore it but then reports an end-of-file error. The unmatched close parenthesis, of course, is the closing curly brace! So what is going on here?
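For anyone who wants to reproduce this without clobbering their REPL, here is a minimal sketch that installs the two naive definitions in a private copy of the standard readtable. The names *naive-readtable* and try-read are made up for this illustration; everything else is standard Common Lisp.

```lisp
;; A private copy of the standard readtable, so the global one stays intact.
;; (*NAIVE-READTABLE* and TRY-READ are hypothetical names for this sketch.)
(defvar *naive-readtable* (copy-readtable nil))

;; The naive attempt from above: braces borrow the parens' macro functions.
(set-macro-character #\{ (get-macro-character #\( nil) nil *naive-readtable*)
(set-macro-character #\} (get-macro-character #\) nil) nil *naive-readtable*)

(defun try-read (string &optional (readtable *naive-readtable*))
  "Read one form from STRING using READTABLE.
Return the form, or the condition object if reading signals an error."
  (handler-case (let ((*readtable* readtable))
                  (values (read-from-string string)))
    (error (condition) condition)))
```

With this in place, (try-read "(and)") still returns the list (AND), whereas (try-read "{and}") should hand back a condition object on the implementations discussed here. The exact condition type and message depend on the implementation.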

When an opening curly brace is read, the original left paren macro function is called. In SBCL for instance, this is SB-IMPL::READ-LIST, which looks for a hardwired right paren on the stream. Yuck. It doesn't find one, but it finds my closing brace which triggers the "standalone" right paren behavior (spurious paren alert). In passing, it also surprised me that SB-IMPL::READ-LIST is not implemented in terms of READ-DELIMITED-LIST.

EDIT: as mentioned in several comments, we could use read-delimited-list to look for a closing curly brace, but even this won't work completely. The problem is with dotted lists (see Pascal's comment). SBCL hard-wires #\) in its dotted-list parsing procedures.
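The read-delimited-list approach mentioned in this edit can be sketched as follows. This is only an illustration: read-braced-list and *brace-readtable* are names invented here, and the definitions go into a copy of the standard readtable rather than the global one.

```lisp
;; Hypothetical names for this sketch: READ-BRACED-LIST, *BRACE-READTABLE*.
(defvar *brace-readtable* (copy-readtable nil))

(defun read-braced-list (stream char)
  "Reader macro function for {: collect objects up to the matching }."
  (declare (ignore char))
  (read-delimited-list #\} stream t))

(set-macro-character #\{ #'read-braced-list nil *brace-readtable*)

;; } must be a terminating macro character, otherwise the } in {and}
;; would be read as part of the token AND}. Borrowing the right paren's
;; function also makes a stray } signal an error, just like a stray ).
(set-macro-character #\} (get-macro-character #\) nil) nil *brace-readtable*)
```

Binding *readtable* to *brace-readtable* and reading "{+ 1 (* 2 3)}" then gives (+ 1 (* 2 3)), and braces nest inside parens as well. A dotted pair written with braces, such as {a . b}, is still rejected, since read-delimited-list has no consing-dot handling of its own.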

So it appears that macro characters stand on rather shaky ground. What we miss is a true concept of syntactic categories (Common Lisp character syntax types are close, but not quite there yet). In fact, TeX, with its notion of catcodes (category codes), seems to be the only language that gets this right. Ideally, any character with associated status LIST TERMINATOR should do just as well as a right paren (the problem is only with closing, not opening).

Instead of hard-wiring the right paren in the Lisp parser, a quick workaround would be to check whether the next character on the stream is a macro character, and in such a case, whether its macro function is the one originally associated with the right paren. If so, it should then simply stand as a list terminator. This is actually an interesting idea I think: could the built-in macro functions become equivalent to actual category codes, and could we completely remove hard-wired characters in Lisp parsers?
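To make the idea a bit more concrete, here is a rough user-level sketch of such a list reader. It is not how any implementation does it, and all the names (*close-function*, list-terminator-p, read-list-until-terminator, *catcode-readtable*) are invented for the occasion. It also assumes that get-macro-character hands back the very same function object that was installed, so that eq is a meaningful test. Dotted lists are deliberately left out, and a comment sitting right before the terminator is not handled.

```lisp
;; The macro function originally associated with the right paren plays
;; the role of a "list terminator" category code in this sketch.
(defvar *close-function* (get-macro-character #\) nil))

(defun list-terminator-p (char &optional (readtable *readtable*))
  "True if CHAR's macro function in READTABLE is the right paren's one."
  (eq (get-macro-character char readtable) *close-function*))

(defun read-list-until-terminator (stream char)
  "Reader macro function: collect objects until any list terminator."
  (declare (ignore char))
  (loop for next = (peek-char t stream t nil t) ; skips whitespace
        until (list-terminator-p next)
        collect (read stream t nil t)
        finally (read-char stream t nil t)))    ; consume the terminator

(defvar *catcode-readtable* (copy-readtable nil))
(set-macro-character #\( #'read-list-until-terminator nil *catcode-readtable*)
(set-macro-character #\{ #'read-list-until-terminator nil *catcode-readtable*)
(set-macro-character #\} *close-function* nil *catcode-readtable*)
```

With *readtable* bound to *catcode-readtable*, both "{and)" and "(and}" read as (AND), because no closing character is hard-wired anymore: whatever carries the right paren's macro function terminates the list.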

Anyway, this whole story is a true scandal because it ruined an otherwise cool live demo of mine. So much for syntax extensibility. I will immediately complain to the concerned authorities.

Looking for the concerned authorities to complain to... please wait.

Tuesday, January 3 2012

ACCU 2012 session on language extensibility

I'm pleased to announce that I will hold a 90-minute session on language extensibility at the next ACCU conference. A shortened abstract is given below (a longer one is available at the conference website).

Impact of Extensibility on Domain Specific Language Design and Implementation

Domain-specific languages (DSLs) are usually very different from the general purpose language (GPL) in which the embedding application is written. The need for designing a DSL as a completely new language often comes from the lack of extensibility of the chosen GPL. By imposing a rigid syntax, a set of predefined operators and data structures, the traditional GPL approach leaves no choice but to implement a DSL as a different language, with its own lexical and syntactic parser, semantic analyzer and possibly its own brand new interpreter or even compiler.

Some GPLs, however, are extensible or customizable enough to let one implement a DSL merely as either a subset or an extension of the original language. While the end-user does not see a difference with the traditional approach, the gain for the developer is substantial. Since the DSL is now just another entry point for the same original GPL, there is essentially only one application written in only one language to maintain. Moreover, no specific language infrastructure (parser, interpreter, compiler etc.) is required for the DSL anymore, since it is simply expressed in terms of the original GPL.

The purpose of this presentation is to illustrate the most important factors that make a language truly extensible, and to show how extensibility impacts the process of DSL design and implementation.

Copyright (C) 2008 -- 2018 Didier Verna didier@lrde.epita.fr