UEBB machine14. Jul '14

UEBB machine is a virtual machine developed at Technical University of Berlin, and it is used to teach stack based virtual machines in Compiler Construction course. I have published all of the homework in Python at Github 1.

µ-Opal

µ-Opal is the source code used by the UEBB machine. It's grammar is specified as:

Prog    ::= Def DefOpt
DefOpt  ::= ',' Def DefOpt
Def     ::= 'DEF' Lhs '==' Expr 1
Lhs     ::= 'MAIN' ':' Type
          | id '(' ')' ':' Type
          | id '(' Arg ArgOpt ')' ':' Type
ArgOpt  ::= ',' Arg ArgOpt
          | eps
Arg     ::= id ':' Type 2
Type    ::= nat 3
          | bool 4
Expr    ::= number 5
          | true 6
          | false 7
          | id 8
          | id '(' ')' 9
          | id '(' Expr ExprOpt ')' 9
          | 'IF' Expr 'THEN' Expr 'FI' 10
          | 'IF' Expr 'THEN' Expr 'ELSE' Expr 'FI' 11
ExprOpt ::= ',' Expr ExprOpt
          | eps

This of course has to be transformed to LL(1) grammar:

Prog    ::= Def DefOpt eof
DefOpt  ::= ',' Def DefOpt
          | eps //Follow(DefOpt) = {'eof'}
Def     ::= 'DEF' Lhs '==' Expr 1
Lhs     ::= 'MAIN' ':' Type
          | id '(' Lhs1
Lhs1    ::= ')' ':' Type
          | Arg ArgOpt ')' ':' Type  //First(Arg) = {id}
ArgOpt  ::= ',' Arg ArgOpt
          | eps  //Follow(ArgOpt) = {')'}
Arg     ::= id ':' Type 2
Type    ::= nat 3
          | bool 4
Expr    ::= number 5
          | ture 6
          | false 7
          | id Expr1
          | 'IF' Expr 'THEN' Expr Expr3
Expr1   ::= '(' Expr2
          | eps 8  //Follow(Expr1) = Follow(Expr) = {',', 'eof', 'THEN', 'FI', 'ELSE', ',', '),}
ExprOpt ::= ',' Expr ExprOpt
          | eps  //Follow(ExprOpt) = {')'}
Expr2   ::= ')' 9
          | Expr ExprOpt ')' 9  //First(Expr) = {number, ture, false, id, 'IF'}
Expr3   ::=  'FI' 10
          |  'ELSE' Expr 'FI' 11

Compile!

Consider following µ-Opal snippet:

DEF fac(x:nat):nat ==
    IF eq(x,0) THEN 1 ELSE mul(x, fac(sub(x,1))) FI

DEF MAIN():nat == fac(10)

Once we compile it to uebb instructions we get this:

PushInt(12)
PushAddr(5)
Call
Slide(1)
Stop
PushInt(0)
Push(2)
Eq
Jz(11)
PushInt(1)
Jmp(19)
PushInt(1)
Push(2)
Sub
PushAddr(5)
Call
Slide(1)
Push(2)
Mul
Ret

It's of course obscure and it's hard to understand what's going on so here's annotated .s file:

 0. PushInt(12)                -- fac function call first and only argument
 1. PushAddr(5)                -- push address of fac()
 2. Call --------+             -- invoke fac()
 3. Slide(1)     |             -- pop address of invoked fac()?
 4. Stop         |             -- halt virtual machine
 5. PushInt(0) <-+ <-+         -- push 0 as second argument for eq()
 6. Push(2)          |         -- push x as argument for eq()
 7. Eq               |         -- invoke inlined eq(x, 0)
 8. Jz(11) ------+   |         -- jump if eq(x, 0) resulted in false/zero
 9. PushInt(1)   |   |         -- if eq(x, 0) was not zero, that is x was 0, return 1
10. Jmp(19) -----|---|---+     -- jump to the end of fac() and return with previously pushed 1
11. PushInt(1) <-+   |   |     -- if eq(x, 0) was zero, that is x was bigger than 0, push 1 for sub
12. Push(2)          |   |     -- push x for sub
13. Sub              |   |     -- calculate x-1 and push it to stack
14. PushAddr(5)      |   |     -- push address of fac()
15. Call ------------+   |     -- invoke fac(x-1)
16. Slide(1)             |     -- pop address of invoked fac()?
17. Push(2)              |     -- push x to stack
18. Mul                  |     -- multiply x and return value of fac(x-1)
19. Ret <----------------+     -- fac() returns with multiplication

Instruction set

Jump instructions:

Jz         - Jump to the code instruction on position a,
             if the top of the stack is zero. Removes the top of the stack.
Jlt        - Jump to the code instruction on position a,
             if the top of the stack is less than zero. Removes the top of the stack.
Jgt        - Jump to the code instruction on position a,
             if the top of the stack is greater than zero. Removes the top of the stack.
Jmp        - Jump to the code instruction on position a.

Function call/return instructions:

PushAddr a - Push a onto the stack.
Call       - Jump to the code instruction which is referenced by top of the stack and
             pushes the current addresses on the stack.
Ret        - A Ret jumps back to the code instruction which is referenced by
             the return address (second from the top on the stack) and
             removes the return address from the stack.

Stack intructions:

Push i     - Take the i-th element from the top of the stack.
             without removing it and push it onto the stack.
PushInt i  - Allocates space on the heap, saves the i on the heap
             and pushes the heap reference onto the stack.
Slide i    - Remove i elements from the stack, starting from the second.
Swap       - Swap the first two elements on the stack.

Misc instructions:

Stop       - Signal valid ending of a code sequence. The machine stops with out an error.

Calling functions

Whenever you call a function:

  1. Push all arguments on the stack

  2. Push address of target function

  3. Issue Call which substitutes target address on the stack with return address and jumps to the target address

  4. Inside function calculate stuff, whenever you want to use any of the previously pushed parameters you have to push them on the stack again. After all calculations the stack should contain all the initially pushed parameters, return address and return value of this particular function call.

  5. Finally inside the function issue Return which jumps to the return address (just before return value) and removes the address from the stack

  6. After function call use Slide to pop function call arguments so from the caller's perspective everything gets replaced with return value of the function call.

Caller example f(5,6):

PushInt(6)               -- Push parameter x
PushInt(5)               -- Push parameter y
PushAddr(address-of-f)   -- Push address of f
Call                     -- Substitute target address with return address and jump to target address
Slide(2)                 -- Remove pushed 5 and 6 retaining the return value of the function

Callee example f(x,y) == mul(x,y):

Push(2)  -- Push x
Push(2)  -- Push y
Mul      -- Pop pushed x and y and push the multiplication result
Return   -- Remove return address and jump to return address, the one just before return value
1

https://github.com/laurivosandi/compiler-construction

UEBB µ-Opal Compiler construction TU Berlin