June 2014

Monoids, scope, and extensibility

My project at Hacker School is to build a syntactically extensible language. I've had the idea behind the project kicking around in my head for a while, and I thought it was about time I wrote it up.

The core idea is that monoids can be used to represent scoped extensibility. A monoid is an associative binary operator with an identity element. Examples include set-union with {} as identity, list concatenation with [] as identity, and integer addition with 0 as identity.

In this post, I explain how monoids relate to both scope and extensibility. In a later post, I'll explain how I intend to use that to implement a language with extensible syntax.

Scope as a monoid

Scopes in a programming language form a monoid. The monoid operator is merging two scopes, combining their definitions. The identity element is the empty scope, that binds no variables.

This "scope merge" operator is implicit in several common language constructs. Consider the following Haskell code:

import A
import B

This merges the scopes defined by the modules A and B and makes the result available in the current module. Similarly, consider nested let-bindings:

let x = 1     -- first binding
    y = 2
in let z = 3  -- second binding
   in (x,y,z) -- inner code

The scope available to the expression (x,y,z) is the merger of the scopes defined by the first and second let-bindings.

Finally, for the type theorists in the audience, consider a typical well-typedness judgment (Γ ⊢ e : τ). Contexts Γ form a monoid under concatenation (Γ₁,Γ₂), with the empty context (·) as identity.

Monoids characterize extensibility

Monoids also, I believe, characterize a certain notion of extensibility. Let us view values of a monoid as representing extensions to some core functionality. The monoid operator composes two extensions, giving the "union" of their features. This requires that extensions "play nice" with one another: the result of combining extensions is always well-defined (although it may not always be useful). Associativity is another "niceness" condition: clearly it shouldn't matter in what associative order I combine extensions; what matters is (at most) the list of extensions I choose to use. The identity element represents the absence of any new functionality, the "null extension" or "default behavior".

Other properties we might find useful are idempotence and commutativity. Idempotence means that using the same extension twice is the same as using it once. Commutativity means that the left-to-right order in which we use extensions does not matter. These are nice properties to have, but I think they are not essential to the meaning of "extensibility". This is very vague, admittedly. However, there are things I would call "extensible" that fail these properties.

For example, middleware in a web stack (which examines and modifies an incoming request) can be said to "extend" the stack's features. A middleware stack is essentially a list of functions through which the request is piped. These lists form a monoid under concatenation, with the empty list as identity. But it is not commutative (one function may depend on another having processed the request before it), and perhaps not even idempotent.

Scope as extension

So scopes form a monoid, and monoids characterize a certain notion of extensibility. This suggests that scopes are a form of extensibility. And so they are: binding a variable to a value in a scope extends the set of values easily accessible to someone using that scope. If I say:

fac n = product [1..n]    -- factorial function

Then in the future if I need a factorial function I write fac and it is available to me. So scope presents a (perhaps trivial) form of extensibility: the ability to extend our vocabulary with new definitions.

The motivating question for my project is: what if scope carried other monoids with it besides the typical one mapping variables to values, monoids which represented more substantial notions of extensibility? In particular, notions of syntactic extensibility?

About which, more later.