Photo by Todd Quackenbush on Unsplash

Geast: a generic Abstract Syntax Tree

In many of my personal projects (in JavaScript, Java or C#) I use a lexer and parser, to convert a textual representation in something executable. One representation I could use is the Abstract Syntax Tree (AST). After many projects that use such data structure, I decided to write a generic Abstract Syntax Tree library, in JavaScript.

An AST could represent a binary arithmetic operation 40 + 2 as:

Representing 40 + 2

Another example, for 40 + 4 / 2 could show the precedence of division over addition, something that is not evident from the textual representation:

Operator precedence

Not only expressions could be represented, also commands, like an assignment:

An assignmet to simple variable

And more elaborated commands, like an if command in C-like languages:

The condition, then and else nodes are not terminal

A generic AST could represent any of these nodes, and more. So, I wrote a library in javascript, geast. It have some predefined nodes, but you can define new ones or redefine the existing node definitions. Example code:

const geast = require('geast');geast.node('contract', [ 'name', 'body' ]);

In the above code, a new node type is defined, named contract with two properties name and body . After this definition, you can create a new node:

const node = geast.contract('Counter', geast.composite(....));

Currently, I’m using the generic AST library in many projects, like:

  • selang: Simple smart contract programming language
  • solcom: Solidity compiler
  • rlie: R-like interpreter
  • erlie: Erlang-like interpreter

I found that now, using generic lexer gelex, generic parser gepars, and this generic AST, I could write down a compiler or an interpreter that takes a text code as input, and produce the AST.

Then, having the AST, the nodes have provision to support the visitor pattern. Usually, I wrote a visitor to produce code, bytecodes, or a visitor to evaluate the tree. See examples at evmcompiler bytecode compiler and rlie interpreter.

As an example, to evaluate a binary operation in Rlie, this is the visitor code that process a binary node:

const binaryFns = {
'+': operations.add,
'-': operations.subtract,
'*': operations.multiply,
'/': operations.divide,
'^': operations.power,
// ...
}
this.processBinary = function (node) {
const oper = node.operator();
const lvalue = this.process(node.left());
const rvalue = this.process(node.right());

return binaryFns[oper](lvalue, rvalue);
};

Notably, having an AST, I could use the same visitors to produce Ethereum bytecodes, for DIFFERENT languages, like selang, and solcom, both of them use the same evmcompiler visitor to generate the compiled EVM program.

Related post:

Gelex: a generic lexer in JavaScript