18 Macros
Macros are a way to extend HULK with “functions” that are transpiled at compile time to standard HULK instead of being executed at runtime. But macros are considerably more powerful than functions, both syntactically and semantically. They work at the syntactic level, which means they perform transformations directly on the abstract syntax tree. In addition, their syntax lets you define keyword-like language constructs.
Since macros are a complex topic, let’s start with a simple scenario.
Suppose you want to have something like the following in HULK:
repeat(10) {
// expressions
}
You can quickly see that this code is equivalent to the following, arguably much more verbose, code:
let total = 10 in
    while (total > 0) {
        total := total - 1;
        // expressions
    };
You can easily encapsulate this pattern in a repeat function that takes a number and a general expression (as a functor):
function repeat(times: Number, expr: () -> Object): Object {
    let total = times in
        while (total > 0) {
            total := total - 1;
            expr();
        };
}
And while this may work for your case, it has a couple of downsides. First, you don’t get exactly the desired syntax. Instead of:
repeat(10) {
// expressions
}
you have to write something like the following, which is close but still slightly more cumbersome:
repeat(10, () => {
    // expressions
});
The second and most important downside is that expr here encapsulates a computation that, from the point of view of the repeat function, is a black box. We will focus on why this matters later on.
18.1 Defining macros
Instead of a function, you can use a macro, which has a very similar syntax in HULK:
def repeat(n: Number, *expr: Object): Object =>
    let total = n in
        while (total > 0) {
            total := total - 1;
            expr;
        };
This small change makes macros far more powerful than functions in many cases, for a few reasons. First, notice the use of the *expr: Object syntax instead of expr: () -> Object. Here the * denotes that expr is not a regular argument; it is a special argument that refers to the code inside the braces after the macro invocation. Thus, you can use the following syntax:
repeat(10) {
print("Hello World");
}
The { print("Hello World"); } expression block is precisely what is passed in the special argument *expr.
However, there is much more going on under that macro invocation. Instead of calling a functor at runtime, macros are expanded at compile time and transpiled into their bodies, which means there is no real repeat function anywhere in the compiled code. Instead, the actual code that is executed is something like:
let _total = 10 in
    while (_total > 0) {
        _total := _total - 1;
        {
            print("Hello World");
        };
    };
This is the reason why you don’t see expr(); in the macro body, but expr; instead. That is, the block is not called but interpolated into the macro body. This transpilation step often makes macros faster than functions because there is no extra overhead for passing arguments. However, you must be careful when thinking about the operational semantics of a macro, especially where they differ from those of a regular function call.
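The expansion step described above can be sketched outside HULK. The following Python snippet is only an illustration under stated assumptions: the nested-tuple tree format, the "param" marker, and the names substitute and expand_repeat are all hypothetical and say nothing about how a real HULK compiler represents code. It shows that the caller's block is spliced into the macro body as a tree, rather than being wrapped in a callable.

```python
# Hypothetical mini-AST: nested tuples such as ("call", "print", ("str", "Hello World")).
# ("param", name) marks a hole in the macro body where an argument tree is spliced in.

def substitute(node, bindings):
    """Return a copy of `node` with every ("param", name) replaced by bindings[name]."""
    if isinstance(node, tuple):
        if len(node) == 2 and node[0] == "param":
            return bindings[node[1]]
        return tuple(substitute(child, bindings) for child in node)
    return node  # leaves (strings, numbers) are copied unchanged

# Body of the repeat macro, written once as a tree (assumed shape, for illustration).
REPEAT_BODY = (
    "let", "_total", ("param", "n"),
    ("while", (">", ("var", "_total"), ("num", 0)),
        ("block",
            ("assign", "_total", ("-", ("var", "_total"), ("num", 1))),
            ("param", "expr"))),
)

def expand_repeat(n, expr):
    """Expand one invocation: no function call survives, only the filled-in body."""
    return substitute(REPEAT_BODY, {"n": n, "expr": expr})

call_block = ("block", ("call", "print", ("str", "Hello World")))
expanded = expand_repeat(("num", 10), call_block)
```

In this sketch the tree bound to expr appears verbatim inside the loop body of the result, which mirrors why the macro body contains expr; and not expr();.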
18.2 Variable sanitization
Upon macro expansion, each variable declared inside the body of a macro is renamed to a unique name generated by the compiler. This ensures that no variable in the context of the macro invocation can be accidentally hidden or used in unpredictable ways.
Take for example the following code:
let total = 10 in repeat(total) {
    print(total);
};
If variables inside the body of the repeat macro weren’t sanitized, then the print statement would print 9, 8, etc., which is unexpected unless you happen to know how the repeat macro is implemented, and that violates the principle of encapsulation. Even worse, this would happen only if your variable is named total, but not if it’s named something else, which again is surprising and inconsistent. However, since the variable total inside the body of repeat is renamed to something completely different upon macro expansion, you can be certain that the print statement will work as expected, regardless of the name you happen to choose for your variable.
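A minimal sketch of this renaming, again in Python and again under assumptions: the tuple tree format, the fresh naming scheme, and the helper names are hypothetical, chosen only to make the idea concrete. The key point it tries to show is the order of operations: the macro's own variables are renamed before the caller's code is spliced in, so the caller's total is never touched.

```python
import itertools

_ids = itertools.count()

def fresh(name):
    # Hypothetical naming scheme; "$" is assumed not to be legal in user identifiers.
    return f"_{name}${next(_ids)}"

def rename_locals(node, mapping):
    """Rename macro-local variables in declarations, assignments and uses."""
    if not isinstance(node, tuple):
        return node
    head = node[0]
    if head == "var":
        return ("var", mapping.get(node[1], node[1]))
    if head in ("let", "assign"):
        rest = tuple(rename_locals(child, mapping) for child in node[2:])
        return (head, mapping.get(node[1], node[1])) + rest
    return tuple(rename_locals(child, mapping) for child in node)

def splice(node, args):
    """Replace ("param", name) holes with the caller's argument trees."""
    if not isinstance(node, tuple):
        return node
    if node[0] == "param":
        return args[node[1]]
    return tuple(splice(child, args) for child in node)

REPEAT_BODY = (
    "let", "total", ("param", "n"),
    ("while", (">", ("var", "total"), ("num", 0)),
        ("block",
            ("assign", "total", ("-", ("var", "total"), ("num", 1))),
            ("param", "expr"))),
)

def expand_repeat(n, expr):
    # Step 1: sanitize the macro's own variables. Step 2: splice in caller code.
    hygienic = rename_locals(REPEAT_BODY, {"total": fresh("total")})
    return splice(hygienic, {"n": n, "expr": expr})

# The caller also happens to use a variable named `total`.
user_block = ("block", ("call", "print", ("var", "total")))
expanded = expand_repeat(("var", "total"), user_block)
```

Here the loop counter in expanded carries a generated name, while the caller's print(total) still refers to the caller's own total.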
18.3 Symbolic arguments
There are times, though, when you want the macro to reuse a symbol that comes from its external context (a variable or attribute). In these cases, you can use the special syntax @symbol to define a symbolic argument in the macro, and then bind a specific symbol upon macro expansion.
This is best explained with an example. Let’s suppose we want to implement a swap macro that swaps the content of two variables. This cannot be done unless the macro can actually assign to the variables we want to swap. We would define the macro as:
def swap(@a: Object, @b: Object) {
    let temp: Object = a in {
        a := b;
        b := temp;
    };
}
And we invoke the macro as:
let x: Object = 5, y: Object = "Hello World" in {
    swap(@x, @y);
    print(x);
    print(y);
};
Which will be expanded to something like the following (except that _temp will be a generated name):
let x: Object = 5, y: Object = "Hello World" in {
    let _temp = x in {
        x := y;
        y := _temp;
    };
    print(x);
    print(y);
};
Notice how the actual names of the x and y variables are interpolated in the macro expansion. Of course, the type checker will guarantee that on invocation the x and y symbols are variables of the corresponding type.
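The binding of symbolic arguments can be sketched as plain text generation. This Python snippet is an illustration only: expand_swap, the scope dictionary, and the _temp0-style generated name are assumptions made for the example, not part of HULK. It shows that the caller's symbols are inserted as they are, that only the macro-local temp is renamed, and it imitates the check that each symbolic argument names a variable in the calling scope.

```python
import itertools

_ids = itertools.count()

def expand_swap(a, b, scope):
    """Expand swap(@a, @b) into HULK-like text.

    `a` and `b` are the caller's variable names; `scope` maps the variable
    names visible at the call site to their declared types (hypothetical).
    """
    for symbol in (a, b):
        if symbol not in scope:
            raise NameError(f"{symbol!r} is not a variable in the calling scope")
    temp = f"_temp{next(_ids)}"  # generated name for the macro-local variable
    lines = [
        f"let {temp} = {a} in {{",
        f"    {a} := {b};",
        f"    {b} := {temp};",
        "}",
    ]
    return "\n".join(lines)

expansion = expand_swap("x", "y", {"x": "Object", "y": "Object"})
print(expansion)
```

Passing a name that is not in scope raises NameError in this sketch, which stands in for the compile-time error a type checker would report.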
18.4 Variable placeholders
Macros can also introduce a new symbol into the scope in which they are expanded, which can then be used in the body argument (or the other arguments). The syntax for this is $symbol. We call this a “variable placeholder”, because it holds the name for a variable that will be introduced upon macro expansion.
Again, this is best explained with an example. Let’s add a variable to the repeat macro that indicates the current iteration. We would define the macro as:
def repeat($iter: Number, n: Number, *expr: Object) {
    let iter: Number = 0, total: Number = n in {
        while (total > 0) {
            total := total - 1;
            expr;
            iter := iter + 1;
        };
    };
}
Now when calling the macro, you can specify a name for the $iter variable placeholder:
repeat(current, 10) {
    print(current);
};
The effect is that upon macro expansion, the variable placeholder $iter will be renamed to current and thus the body of the macro will correctly reference it. The actual expansion looks similar to the following code:
let current: Number = 0, _total: Number = 10 in {
    while (_total > 0) {
        _total := _total - 1;
        {
            print(current);
        };
        current := current + 1;
    };
};
The compiler ensures that the use of the new variable in the body of the macro is consistent with the type declared for the variable placeholder in the macro. However, it is entirely possible for the macro not to define the variable, or to define it conditioned on some structure of the body (we will see how that’s achieved in the pattern matching section). In any case, since macro expansion is performed at compile time, any inconsistency that may arise will be caught by the compiler.
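The difference between a placeholder and a sanitized variable can be sketched as two sources of names for the same renaming step: one name is chosen by the caller, the other is generated. The Python snippet below works on text with a regular expression purely for brevity; a real implementation would rename nodes in the syntax tree. The names rename_identifiers and expand_repeat and the fixed _total name are assumptions made for the example.

```python
import re

def rename_identifiers(source, mapping):
    """Rename whole-word identifiers in HULK-like text in a single pass."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], source)

# Macro body as text (assumed shape, following the definition in this section).
MACRO_BODY = """let iter: Number = 0, total: Number = n in {
    while (total > 0) {
        total := total - 1;
        expr;
        iter := iter + 1;
    };
}"""

def expand_repeat(iter_name, n, block):
    mapping = {
        "iter": iter_name,   # placeholder: the caller picks the name
        "total": "_total",   # local variable: the compiler picks the name
        "n": n,              # regular argument
        "expr": block,       # body argument
    }
    return rename_identifiers(MACRO_BODY, mapping)

expanded = expand_repeat("current", "10", "{ print(current); }")
print(expanded)
```

Because the substitution is done in a single pass over the original body, a caller who chooses total as the placeholder name still does not collide with the macro's own counter in this sketch.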
18.5 Pattern matching
By far the most powerful feature of macros is structural pattern matching. This feature lets you deconstruct an argument and generate specific code depending on the structure of that argument. This is possible because macros run at compile time, so when you declare an argument of type Number, for example, what you get in the macro body is the actual expression tree of the argument, and not just the final evaluated object.
As with everything else about macros, this feature is much better understood with examples. Let’s suppose you want to define a macro called simplify, for no better use than to illustrate how powerful macros are compared to regular functions. This is how you would do it:
def simplify(expr:Number) {
    match(expr) {
        case (x1:Number + 0) => simplify(x1);
        case (x1:Number + x2:Number) => simplify(x1) + simplify(x2);
        case (x1:Number - 0) => simplify(x1);
        case (x1:Number - x2:Number) => simplify(x1) - simplify(x2);
        case (x1:Number * 1) => simplify(x1);
        case (x1:Number * x2:Number) => simplify(x1) * simplify(x2);
        // ... you get the idea
        default => expr;
    };
}
You would use the macro as follows:
print(simplify((42+0)*1));
And the actual generated code would be:
print(42);
Notice that this transformation happens at compile time, not at runtime. The actual code that gets compiled is the simplified expression.
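The same structural case analysis can be sketched on a toy expression tree. The Python snippet below is an illustration under assumptions: expressions are either integers or (op, left, right) tuples, and the more specific patterns are tested before the general ones, on the assumption that cases are tried in order. It is not a description of how HULK evaluates match.

```python
def simplify(expr):
    """Simplify an arithmetic expression tree by inspecting its structure.

    Trees are ints or tuples (op, left, right); this format is hypothetical.
    """
    if isinstance(expr, tuple):
        op, left, right = expr
        if op in ("+", "-") and right == 0:   # x + 0 and x - 0 become x
            return simplify(left)
        if op == "*" and right == 1:          # x * 1 becomes x
            return simplify(left)
        return (op, simplify(left), simplify(right))
    return expr  # default case: leave the expression as it is

tree = ("*", ("+", 42, 0), 1)                 # stands for (42+0)*1
generated = f"print({simplify(tree)});"
print(generated)  # print(42);
```

Like the macro, this sketch makes a single top-down pass and does not revisit a node after its children have been simplified, so a tree such as ("+", 1, ("*", 0, 1)) is only reduced to ("+", 1, 0).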