Dec 6th, 2012

S-expressions

(sexpr lexer reader eval forms special-forms macros walker meta-eval)

A symbolic expression is a notation of a nested list structure. It's origin lies within the Lisp family of programming languages whose entire syntax consists of these expressions.

I want to introduce you to the syntax and in subsequent posts go into how it can be parsed and interpreted.

Syntax

An s-expression is either an atom or an ordered pair of s-expressions. In practice that just means that you have lists of s-expressions that can be nested.

An atom is a symbol which is essentially a value that is not a list. Here are a few examples of atoms:

foo
an-atom
42

Lists are represented by parentheses. They can be empty, or hold atoms, delimited by spaces.

()
(foo)
(foo bar)
(foo bar baz)

Lists can also contain other lists, allowing them to be nested:

((foo) (bar))
((a 1) (b 2))
(alpha (beta (gamma (delta))))

This should give you a basic understanding of how the sexpr syntax works.

Representing data

You can use this format to represent data, just as you would use XML or JSON. In fact, let's take the composer.json from YOLO (the microframework with swag):

{
    "name": "igorw/yolo",
    "description": "The microframework with swag.",
    "keywords": ["useless", "microframework", "academic", "swag"],
    "license": "MIT",
    "authors": [
        {
            "name": "Igor Wiedler",
            "email": "igor@wiedler.ch"
        }
    ]
    ...
}

And transpose it to sexpr:

(dict (name "igorw/yolo")
      (description "The microframework with swag.")
      (keywords '(useless microframework academic swag))
      (license "MIT")
      (authors (list (dict (name "Igor Wiedler")
                           (email "igor@wiedler.ch")))))

A few things worth noting:

There is a dict keyword, which produces a dictionary from the pairs it receives as arguments.
Strings are written between quotes, which groups them from the whitespace.
The keywords are represented by a quoted list of atoms.
The data in authors is explicitly constructed as a list using a list construct.

Try not to worry about those details too much at this point. This is just one possible way of representing that data.

Side note: Wouldn't it be awesome if composer supported composer.sexpr files natively, so that we would no longer have to write JSON? No, not really. I would argue that the benefits of a unified standard format outweigh pluggability in this case.

Representing code

What's really fascinating about s-expressions is that they can be used to represent not only data, but also code.

Here is one of the most basic Lisp code snippets:

(+ 1 2)

It looks cryptic, but it's actually quite simple. Instead of infix notation, this is using Polish notation, also known as prefix notation. What the expression represents is this:

1 + 2

The parentheses tell you that it's a function application or function invocation. The first element of the list is a function, the remaining elements are the arguments passed to that function. + is simply a function which sums up any arguments it receives.

So what about other functions, that are not operators? They work the same way. For example, this application:

(foo a b)

Would be written like this in PHP:

foo($a, $b);

That's about as far as I will go in this post. This should give you an idea about the basics of the syntax and how to read it. Stay tuned for follow-up posts.

Conclusion

S-expressions are a strange looking format that is simple, yet powerful.
They can represent both data and code.
I want to know how to parse them in PHP.

(sexpr lexer reader eval forms special-forms macros walker meta-eval)