Comparing WebAssembly/Wasm interpreters

This document provides a more detailed explanation behind the comparison chart.

Note: N/A means no measurements have been made yet.

Asynchronous interface

At the time of writing, nanowasm is the only known asynchronous WebAssembly interpreter. All other interpreters expose a function that runs the WebAssembly application until it either finishes or traps (e.g., wasm_application_execute_main in wasm-micro-runtime).

nanowasm was designed as an asynchronous library for several reasons:

Running several module instances concurrently with a synchronous interpreter requires OS-level threading, which might not be desirable or even feasible in some resource-constrained environments.
An asynchronous interface allows hosts to stop the execution of a module instance simply by no longer calling nw_run. Otherwise, interpreters must provide a terminate-like interface that must be called from a separate context (e.g., wasm_runtime_terminate in wasm-micro-runtime).
Both points above require a synchronous interpreter to be aware of the underlying platform, since it needs primitives such as locks in order to remain thread-safe. This restricts portability to new, unknown platforms.
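
The following is a minimal sketch of the idea, not nanowasm's actual API: the names toy_task, toy_run and toy_run_all are hypothetical, and the signature of nw_run is not assumed here. It shows how a host could interleave several instances on a single thread, and how stopping an instance amounts to no longer calling its run function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; not taken from nanowasm. */
enum toy_state { TOY_AGAIN, TOY_DONE };

/* Hypothetical instance: a fixed amount of "work" stands in for a
 * WebAssembly program. */
struct toy_task {
    unsigned long remaining; /* work units left */
    unsigned long steps;     /* units of work performed so far */
};

/* Performs at most one unit of work and returns immediately. */
enum toy_state toy_run(struct toy_task *task)
{
    assert(task != NULL);

    if (task->remaining == 0)
        return TOY_DONE;

    task->remaining--;
    task->steps++;
    return task->remaining ? TOY_AGAIN : TOY_DONE;
}

/* Round-robins over n tasks on a single thread until all of them
 * finish or max_rounds is reached. Returns how many tasks were
 * finished during the last round. */
size_t toy_run_all(struct toy_task *tasks, size_t n,
    unsigned long max_rounds)
{
    unsigned long round;
    size_t i, done = 0;

    for (round = 0; round < max_rounds && done < n; round++) {
        done = 0;

        for (i = 0; i < n; i++)
            if (toy_run(&tasks[i]) == TOY_DONE)
                done++;
    }

    return done;
}
```

Note that no locks or threads are involved: the host owns the loop, so it can stop, resume or prioritise instances by deciding which ones to call.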

Despite its advantages, an asynchronous interface requires a more careful design and, more importantly, incurs a larger memory footprint. Nevertheless, nanowasm strives to remain smaller than its synchronous counterparts.

I/O-agnostic

Module bytecode

At the time of writing, all interpreters other than nanowasm require the module bytecode to be memory-mapped. Typically, this means either:

Dumping the bytecode into memory.
Allocating a memory-mapped file.

The former is memory-inefficient and probably unacceptable in resource-constrained environments, whereas the latter is not possible unless a hardware MMU is present.

nanowasm makes no assumptions about where the module bytecode comes from, and instead accesses it via file-like semantics. For example, this allows MMU-less systems to store module bytecode in non-volatile memory, which is often larger and less expensive, albeit slower.
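
As an illustration of file-like access, the sketch below decodes an unsigned LEB128 integer (the variable-length encoding used by the WebAssembly binary format) one byte at a time through a host-provided read callback. The names toy_io, toy_cursor and toy_read_leb128_u32 are hypothetical and do not reflect nanowasm's actual interface. For brevity, the decoder does not reject encodings whose fifth byte overflows 32 bits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical file-like source: the host decides where bytes live. */
struct toy_io {
    /* Reads up to n bytes into buf; returns the number of bytes read. */
    int (*read)(void *buf, size_t n, void *user);
    void *user;
};

/* Decodes an unsigned LEB128 value of up to 32 bits, one byte at a
 * time. Returns 0 on success, -1 on short read or overlong input. */
int toy_read_leb128_u32(const struct toy_io *io, unsigned long *out)
{
    unsigned long result = 0;
    unsigned shift;

    assert(io != NULL && out != NULL);

    for (shift = 0; shift < 35; shift += 7) {
        unsigned char byte;

        if (io->read(&byte, 1, io->user) != 1)
            return -1;

        /* The low 7 bits carry data; the high bit means "more". */
        result |= (unsigned long)(byte & 0x7f) << shift;

        if (!(byte & 0x80)) {
            *out = result & 0xfffffffful;
            return 0;
        }
    }

    return -1; /* more than 5 bytes cannot encode a 32-bit value */
}

/* Example host backend: a cursor over a byte array, standing in for
 * flash or any other storage that is not memory-mapped. */
struct toy_cursor {
    const unsigned char *data;
    size_t len, pos;
};

int toy_cursor_read(void *buf, size_t n, void *user)
{
    struct toy_cursor *c = user;
    unsigned char *dst = buf;
    size_t i;

    for (i = 0; i < n && c->pos < c->len; i++)
        dst[i] = c->data[c->pos++];

    return (int)i;
}
```

Since the decoder only ever asks for the next byte, the same code would work whether the callback reads from RAM, a file or an external flash chip.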

Memories

WebAssembly defines four different memory areas:

Table memory.
Linear memory.
Global memory.
Stack.

All interpreters other than nanowasm allocate these memory areas internally via the system heap, or a custom heap defined by the user. This raises the following concerns:

Some resource-constrained environments might prefer to avoid the use of a heap, or might not have a heap implementation at all.
It forces each of these memory areas to remain contiguous. In environments with segmented memory, the largest contiguous block available might limit the size of each area.

MMU-less systems were considered a priority in the design of nanowasm, so these memory areas are never allocated by nanowasm itself. Instead, nanowasm gives the host a set of interfaces (i.e., callbacks) to implement, which define how these areas are accessed. Therefore, whether accessing those areas requires a heap, and how it is used, is entirely up to the host implementation.

While possibly a bit cumbersome at first glance, this flexible design opens up many new possibilities. For example, it allows MMU-less systems to store memory pages in non-contiguous memory areas, or even in larger, non-volatile memory, similarly to how fully-fledged operating systems implement virtual memory.
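
A minimal sketch of such a host implementation follows. The interface toy_mem_cfg and every other name are hypothetical, not nanowasm's real callbacks, and the page size is shrunk to 16 bytes for illustration (WebAssembly pages are 64 KiB). It shows linear memory backed by pages that live in separate, non-contiguous buffers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host interface for linear memory; not nanowasm's API. */
struct toy_mem_cfg {
    int (*load)(unsigned long addr, void *dst, size_t n, void *user);
    int (*store)(unsigned long addr, const void *src, size_t n, void *user);
    void *user;
};

/* Tiny page size for illustration only. */
enum { TOY_PAGE_SIZE = 16, TOY_N_PAGES = 2 };

/* Example host backend: each page lives in its own buffer. */
struct toy_pages {
    unsigned char *page[TOY_N_PAGES];
};

/* Returns non-zero if [addr, addr + n) lies inside the address space. */
int toy_in_bounds(unsigned long addr, size_t n)
{
    const unsigned long size = (unsigned long)TOY_PAGE_SIZE * TOY_N_PAGES;

    return addr <= size && n <= size - addr;
}

/* Maps an in-bounds linear address to a byte inside one of the pages. */
unsigned char *toy_translate(struct toy_pages *p, unsigned long addr)
{
    assert(p != NULL);
    return &p->page[addr / TOY_PAGE_SIZE][addr % TOY_PAGE_SIZE];
}

int toy_load(unsigned long addr, void *dst, size_t n, void *user)
{
    unsigned char *out = dst;
    size_t i;

    if (!toy_in_bounds(addr, n))
        return -1; /* an interpreter would trap here */

    for (i = 0; i < n; i++)
        out[i] = *toy_translate(user, addr + i);

    return 0;
}

int toy_store(unsigned long addr, const void *src, size_t n, void *user)
{
    const unsigned char *in = src;
    size_t i;

    if (!toy_in_bounds(addr, n))
        return -1;

    for (i = 0; i < n; i++)
        *toy_translate(user, addr + i) = in[i];

    return 0;
}
```

An access that straddles a page boundary is split transparently by the host, so the interpreter never needs to know how the pages are laid out or where they are stored.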

No heap required

All interpreters other than nanowasm allocate many internal data structures, as well as arbitrarily large chunks of data in order to accommodate the different memory areas defined by the WebAssembly standard. Aside from the limitations explained above, this means the memory required by a module or a module instance cannot be known at compile-time, since it depends on how the heap is implemented, and even on the module bytecode itself.

On the other hand, nanowasm was designed with resource-constrained environments in mind, where a heap implementation might be either undesired or just unavailable. Therefore, it had to be implemented so that the memory required by modules and module instances remains static. This is achieved efficiently by storing the data structures for all possible states in a union.

This design allows hosts to allocate modules and module instances in any way, be it:

Automatically, i.e., from the stack.
Statically, i.e., via the static qualifier.
Dynamically, i.e., from the heap.
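
The union-based approach can be sketched as follows. All type and function names are hypothetical and the states are invented for illustration, so this does not reflect nanowasm's real data structures. Since only one state is active at a time, the instance is only as large as its largest state, and its size is a compile-time constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-state scratch data; only one is live at a time. */
struct toy_parse_header { unsigned char magic[4]; unsigned long version; };
struct toy_parse_section { unsigned char id; unsigned long len, offset; };
struct toy_exec { unsigned long pc; long operands[2]; };

struct toy_inst {
    enum { TOY_HEADER, TOY_SECTION, TOY_EXEC } state;

    /* All states share the same storage. */
    union {
        struct toy_parse_header header;
        struct toy_parse_section section;
        struct toy_exec exec;
    } u;
};

void toy_init(struct toy_inst *inst)
{
    assert(inst != NULL);
    inst->state = TOY_HEADER;
    inst->u.header.version = 0;
}

/* The same type can be placed in any storage class, because
 * sizeof(struct toy_inst) is known at compile time. */
int toy_demo(void)
{
    struct toy_inst automatic;                          /* stack */
    static struct toy_inst statically;                  /* static */
    struct toy_inst *dynamic = malloc(sizeof *dynamic); /* heap */

    if (!dynamic)
        return -1;

    toy_init(&automatic);
    toy_init(&statically);
    toy_init(dynamic);
    free(dynamic);
    return 0;
}
```

The host could, for example, declare a static array of instances and know its exact memory cost from the linker map, without running the program.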

Big-endian support

Even though little-endian architectures, such as amd64, are arguably more popular at the time of writing, big-endian counterparts are still being produced and are therefore considered equally relevant by nanowasm.

[`wasm-micro-runtime`]

Although wasm-micro-runtime seems to byte-swap integers according to the platform endianness, no big-endian platforms are listed so far in its README.md. Also, due to its large code base, it is difficult to verify whether all integer reads and writes are done in an endianness-agnostic way.

[`wac`]

wac naively compares the \0asm magic string as a little-endian integer.
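
For reference, an endianness-agnostic way to handle the module header can be sketched as below. The function names are hypothetical and this is not code from nanowasm or wac. The magic is compared byte by byte, and the little-endian 32-bit version field is assembled with shifts, so the result does not depend on the byte order of the host.

```c
#include <assert.h>
#include <string.h>

/* Checks the "\0asm" magic byte by byte. Hex values are used so that
 * the comparison does not depend on the host character set. */
int toy_check_magic(const unsigned char *buf)
{
    static const unsigned char magic[] = {0x00, 0x61, 0x73, 0x6d};

    assert(buf != NULL);
    return memcmp(buf, magic, sizeof magic) == 0;
}

/* Assembles a little-endian 32-bit value from individual bytes. The
 * shifts operate on values, not on memory layout, so this yields the
 * same result on little-endian and big-endian hosts. */
unsigned long toy_read_le32(const unsigned char *buf)
{
    assert(buf != NULL);
    return (unsigned long)buf[0]
        | ((unsigned long)buf[1] << 8)
        | ((unsigned long)buf[2] << 16)
        | ((unsigned long)buf[3] << 24);
}
```

By contrast, copying the four bytes into an integer and comparing it against a constant gives different results depending on the host byte order.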

No compiler-specific extensions

All interpreters other than nanowasm rely extensively on system-specific macros and/or extensions to the C language. These might restrict their use on less popular compilers and/or new environments.

On the other hand, nanowasm is written in standard ANSI C (C89/C90), i.e., without any language extensions or system-specific macros. More broadly, the use of macros and other preprocessor directives is kept to a minimum in nanowasm, as opposed to other interpreters such as wasm3.

Such reduced use of the preprocessor is considered to enhance readability, even if it might incur some extra boilerplate code.

Public functions

`nanowasm`

The numbers were extracted from nw.h.

[`wasm-micro-runtime`]

Commit: 4e50d2191ca8f177ad03a9d80eebc44b59a932db

The numbers were extracted from wasm_export.h.

[`wasm3`]

Commit: 35b5e2fb53c5cbc1ff3d7e42c381cd7cfa14f308

The numbers were extracted from wasm3.h.

Minimal memory footprint

`nanowasm`

The numbers were extracted from the test application built by the project by default, which links the nanowasm library.

The project was built with:

cmake -B build
cmake --build build

Then, the size for the test executable was obtained via:

$ size build/test/test
   text    data     bss     dec     hex filename
  50590    3576      16   54182    d3a6 build/test/test

Of course, these numbers are subject to change since many opcodes are still not implemented in nanowasm.

[`wasm-micro-runtime`]

Commit: 4e50d2191ca8f177ad03a9d80eebc44b59a932db

A minimal application, namely wamr-ex, was written with wasm-micro-runtime's iwasm library. This application:

Dumps a .wasm file into memory, since wasm-micro-runtime requires module code to either reside in memory or belong to a memory-mapped file.
Calls the following functions:
- wasm_runtime_init
- wasm_runtime_load
- wasm_runtime_instantiate
- wasm_application_execute_main
- wasm_runtime_unload
- wasm_runtime_deinstantiate

The example was built with the default CMake flags i.e.:

cmake -B build
cmake --build build

The executable size was then obtained via:

$ size build/wamr-ex
   text    data     bss     dec     hex filename
 463165    9224     932  473321   738e9 build/wamr-ex

[`wasm3`]

Commit: 35b5e2fb53c5cbc1ff3d7e42c381cd7cfa14f308

wasm3 provides a sample application, also called wasm3, in its source tree. This application can run any .wasm file and accepts some extra command line options.

The project was built with the default CMake flags i.e.:

cmake -B build
cmake --build build

The executable size was then obtained via:

$ size build/wasm3
   text    data     bss     dec     hex filename
 531667   20332    6720  558719   8867f build/wasm3

Per-module memory usage

`nanowasm`

Per-instance memory usage

`nanowasm`