RVM

The RmbRT language's virtual machine, RVM, is designed to provide an execution environment tailored to RL programs.

Design goals

The RVM is designed to meet these requirements:

RASM

RASM is the RVM's assembly language. RASM is naturally function-oriented: functions form distinct blocks of RASM, because instructions' opcodes and their immediate arguments are stored separately. A block of code begins with a prelude containing all the immediate (constant) values for its instructions, followed by a much more compact stream of instructions that refer to those constants. Since multiple instructions can share the same constant as a parameter, related instructions naturally group into blocks. For simpler decoding, all values are aligned words, while instructions are composed of independent single-byte values. The VM therefore tracks two pointers during execution: the address of the current operation and the address of the current constants block.

Code cannot refer to any value that is not set in its associated constants block. In memory, a sign function could look roughly as follows:

sign($r0: signed): ($r0: signed)
VALUES
    #neg1 = -1
    #0    =  0
    #1    = +1

    #switch_neg = rel_addr .switch_neg
    #switch_pos = rel_addr .switch_pos
CODE
    CMP $r0, #0 // compare, set condition flags
    SWITCH #switch_neg, #switch_pos // switch based on condition flags
    .switch_0: // implicit
        RET
    .switch_neg:
        SET $r0 = #neg1
        RET
    .switch_pos:
        SET $r0 = #1
        RET

This changes the nature of assembly from the conventional approach of independent instruction streams to function-like blocks of interconnected instructions.
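To make the split between constants and instructions concrete, here is a minimal Python sketch of an interpreter for the sign function. It is an illustration under stated assumptions, not the RVM's actual encoding: the opcode numbers, the operand layout (one byte per operand, indexing into the values block), the flag representation, and the choice of measuring relative addresses from the end of the SWITCH instruction are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical opcode numbering; the real RVM encoding is not specified here.
CMP, SWITCH, SET, RET = range(4)


@dataclass
class Block:
    values: list[int]  # prelude: word-sized constants
    code: bytes        # compact stream of single-byte opcodes and operands


def run(block: Block, r0: int) -> int:
    regs = [r0]
    values = block.values  # pointer to the current constants block
    ip = 0                 # pointer to the current operation
    flag = 0               # condition flag: -1, 0, or +1 (assumed representation)
    while True:
        op = block.code[ip]
        if op == CMP:  # CMP reg, #value
            reg, idx = block.code[ip + 1], block.code[ip + 2]
            a, b = regs[reg], values[idx]
            flag = (a > b) - (a < b)
            ip += 3
        elif op == SWITCH:  # SWITCH #neg, #pos; falls through when equal
            neg, pos = block.code[ip + 1], block.code[ip + 2]
            ip += 3
            if flag < 0:
                ip += values[neg]  # relative to the end of SWITCH (assumption)
            elif flag > 0:
                ip += values[pos]
        elif op == SET:  # SET reg = #value
            reg, idx = block.code[ip + 1], block.code[ip + 2]
            regs[reg] = values[idx]
            ip += 3
        elif op == RET:
            return regs[0]
        else:
            raise ValueError(f"unknown opcode {op} at offset {ip}")


# Value indices: 0 = #neg1, 1 = #0, 2 = #1, 3 = #switch_neg, 4 = #switch_pos
SIGN = Block(
    values=[-1, 0, 1, 1, 5],
    code=bytes([
        CMP, 0, 1,     # offset 0:  CMP $r0, #0
        SWITCH, 3, 4,  # offset 3:  SWITCH #switch_neg, #switch_pos
        RET,           # offset 6:  .switch_0
        SET, 0, 0,     # offset 7:  .switch_neg: SET $r0 = #neg1
        RET,           # offset 10
        SET, 0, 2,     # offset 11: .switch_pos: SET $r0 = #1
        RET,           # offset 14
    ]),
)

print(run(SIGN, -7), run(SIGN, 0), run(SIGN, 5))  # prints "-1 0 1"
```

Note that the instruction stream never contains a constant itself, only small indices into the values block, which is what keeps it compact and lets several instructions share one constant.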

All of this may seem unnecessarily complicated, but it makes it very simple to generate templated assembly code for the RVM: TRASM. If a value is unknown because it depends on a template argument, the code can still be generated in full, and only a placeholder for the value is emitted in the values block:

[T: TYPE] new($rArgs: stack addr 0x...): ($r0: addr)
VALUES
    TEMPLATE #T_SIZE
    TEMPLATE #T_ALIGN
    TEMPLATE #T_CONSTRUCTOR
    #alloc = code 0x...
    #0 = 0
    #alloc_failed = rel_addr .alloc_failed
CODE
    PUSH #T_SIZE
    PUSH #T_ALIGN
    CALL #alloc
    EQ? $r0, #0
    THEN #alloc_failed
        PUSH $r0 // save THIS
        CALL_WITH_ARGS #T_CONSTRUCTOR, $rArgs // forward arguments, r0 = THIS
        POP $r0 // restore THIS
    .alloc_failed:
        RET // return 0 | THIS

This code can be instantiated in two ways. It can be instantiated fully at compile time, so that all values marked TEMPLATE are filled with their real values. Alternatively, it can be instantiated at runtime by injecting an additional VALUES block that overrides all TEMPLATE values; a special call operation exists for this.

Runtime instantiation works best for templated functions sharing the same template arguments (such as member functions sharing class-wide templates), as they can simply reuse the template block they were called with. It also works for unrelated template arguments: the template block needed to call the inner function is linked as a template value in the outer function's template block.

Thus, while this separation of code and values may be slightly inefficient for untemplated code, it allows for massive code deduplication in templated code: the code of types like hash tables or vectors only needs to exist in memory once, at the slight runtime overhead of passing an additional data pointer to each such call. To deduplicate a templated function, only the values depending on the template arguments have to be instantiated into injectable value blocks, and the call instruction takes one more argument.
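The two instantiation modes can be sketched in Python as follows. This is only a model under assumptions: the `Function` class, the `TEMPLATE` marker object, the `inject` parameter, and the example function `array_bytes` are hypothetical names introduced for illustration, and Python dictionaries stand in for values blocks.

```python
TEMPLATE = object()  # marks a value slot that is still a placeholder


class Function:
    def __init__(self, values, body):
        self.values = values  # values block: name -> constant or TEMPLATE
        self.body = body      # shared code; reads constants only through `v`

    def instantiate(self, template_values):
        """Compile-time instantiation: bake template values into a new values block."""
        return Function({**self.values, **template_values}, self.body)

    def call(self, *args, inject=None):
        """Runtime instantiation: overlay an injected values block for this call only."""
        v = {**self.values, **(inject or {})}
        missing = [name for name, value in v.items() if value is TEMPLATE]
        if missing:
            raise ValueError(f"uninstantiated template values: {missing}")
        return self.body(v, *args)


def _array_bytes(v, count):
    # Round the element size up to its alignment, then multiply by the count.
    stride = -(-v["T_SIZE"] // v["T_ALIGN"]) * v["T_ALIGN"]
    return count * stride


array_bytes = Function({"T_SIZE": TEMPLATE, "T_ALIGN": TEMPLATE}, _array_bytes)

# Injectable value blocks: only the template-dependent values are duplicated.
INT32 = {"T_SIZE": 4, "T_ALIGN": 4}
PADDED = {"T_SIZE": 12, "T_ALIGN": 8}

print(array_bytes.call(10, inject=INT32))   # prints 40
print(array_bytes.call(3, inject=PADDED))   # prints 48


def _pair_bytes(v):
    # Unrelated template arguments: the blocks needed for the inner calls are
    # themselves values in the outer function's template block.
    return (array_bytes.call(1, inject=v["FIRST"])
            + array_bytes.call(1, inject=v["SECOND"]))


pair_bytes = Function({"FIRST": TEMPLATE, "SECOND": TEMPLATE}, _pair_bytes)
print(pair_bytes.call(inject={"FIRST": INT32, "SECOND": PADDED}))  # prints 20
```

In both modes the body exists once; `instantiate` produces a new values block that points at the same code, while `call(..., inject=...)` corresponds to the call instruction that takes one extra argument.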

Dynamic contextual state and function binding

Because the RVM allows injecting constants and template arguments into existing code, global variables referenced by a function can be bound or overridden at run time, per call, and so can nested function calls. This makes thread-local and coroutine-local state easy to implement, as the addresses of state variables can be injected by the caller at runtime. Overriding nested function calls allows for powerful dynamic programming features.

TRASM object files

Due to RASM's support for very simple templating, the RmbRT compiler can precompile templated libraries into TRASM, which makes TRASM the natural format for object files. Compiling against such a library requires only the declarations of its variables, types, and functions, even templated ones, because the TRASM code of their implementations is sufficient. TRASM object files contain additional information that identifies the requirements placed on template arguments.