author Xavier Del Campo Romero <xavi.dcr@tutanota.com> 2024-09-07 00:04:38 +0200
committer Xavier Del Campo Romero <xavi92@disroot.org> 2025-11-06 14:38:40 +0100
commit 6d9d80362f9932bbc87e162b8ef7df06c73e27e1 (patch)
tree e3e228c63fe26f07503f226de7fb5086b3dc2286 /COMPARE.md
download nanowasm-6d9d80362f9932bbc87e162b8ef7df06c73e27e1.tar.gz
First commit
Diffstat (limited to 'COMPARE.md')
-rw-r--r-- COMPARE.md 261
1 files changed, 261 insertions, 0 deletions
diff --git a/COMPARE.md b/COMPARE.md
new file mode 100644
index 0000000..635f6ff
--- /dev/null
+++ b/COMPARE.md
@@ -0,0 +1,261 @@
+# Comparing WebAssembly/Wasm interpreters
+
+This document provides a more detailed explanation behind the
+[comparison chart](README.md#comparison-chart).
+
+- **Note:** `N/A` means no measurements have been made yet.
+
+## Asynchronous interface
+
+As of the time of this writing, there is no known asynchronous WebAssembly
+interpreter other than `nanowasm`. All other interpreters implement a
+function that will run the WebAssembly application until it either
+finishes or traps (e.g.: `wasm_application_execute_main` in
+`wasm-micro-runtime`).
+
+In the scope of `nanowasm`, it was deemed interesting to design it as an
+asynchronous library for several reasons:
+
+- Running several module instances synchronously requires OS-level threading,
+which might not be desirable or even feasible to implement under some
+resource-constrained environments.
+- An asynchronous interface allows hosts to stop the execution of a module
+instance simply by no longer calling `nw_run`. Otherwise, interpreters must
+provide a `terminate`-like interface that must be called from a separate
+context (e.g.: `wasm_runtime_terminate` in `wasm-micro-runtime`).
+- The reasons above require a synchronous interpreter to be aware of the
+underlying platform, as it must be aware of primitives such as locks in order
+to remain thread-safe. However, this restricts portability towards new, unknown
+platforms.
+
+Despite its advantages, an asynchronous interface requires a more careful
+design and, more importantly, it incurs a larger memory footprint. Even so,
+`nanowasm` strives to remain smaller than its synchronous counterparts.
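The single-threaded scheduling described above can be sketched as follows. This is a minimal illustration, not `nanowasm`'s actual API: `struct instance`, `instance_step` and `run_all` are hypothetical names, and the "work" is just a counter standing in for interpreted opcodes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; not nanowasm's actual API. */
enum step_state { STEP_AGAIN, STEP_DONE };

/* Hypothetical instance: counts down a fixed amount of work. */
struct instance {
    unsigned remaining; /* units of work left */
    unsigned steps;     /* times instance_step actually did work */
};

/* Performs one bounded unit of work and returns immediately. */
static enum step_state instance_step(struct instance *i)
{
    assert(i != NULL);

    if (!i->remaining)
        return STEP_DONE;

    i->remaining--;
    i->steps++;
    return i->remaining ? STEP_AGAIN : STEP_DONE;
}

/* Round-robin over all instances on a single thread. An instance is
 * stopped simply by no longer stepping it; budget caps the number of
 * passes. Returns how many instances were done on the last pass. */
static size_t run_all(struct instance *v, size_t n, unsigned budget)
{
    size_t done = 0, k;

    while (budget-- && done < n) {
        done = 0;

        for (k = 0; k < n; k++)
            if (instance_step(&v[k]) == STEP_DONE)
                done++;
    }

    return done;
}
```

A host could call `run_all` from its main loop with a small budget, interleaving it with other work. No locks or threads are involved, at the cost of keeping each instance's state in memory between calls.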
+
+## I/O-agnostic
+
+### Module bytecode
+
+As of the time of this writing, all interpreters other than `nanowasm` require
+the module bytecode to be memory-mapped. Typically, this requires either:
+
+- Dumping the bytecode into memory.
+- Allocating a memory-mapped file.
+
+Whereas the former is inefficient memory-wise and probably unacceptable on
+resource-constrained environments, the latter is just not possible unless
+a hardware [MMU](https://en.wikipedia.org/wiki/Memory_management_unit) is
+present.
+
+In the context of `nanowasm`, it was considered interesting not to assume
+_where_ the module bytecode comes from, and instead access it via file-like
+semantics. For example, this would allow MMU-less systems to store module
+bytecode on non-volatile memory, which is often larger and less expensive,
+albeit slower.
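To illustrate what file-like access could look like, here is a minimal sketch that validates the 8-byte module header (the `\0asm` magic followed by version 1) through a read callback, using only a small fixed-size buffer. `struct source`, `struct flash`, `flash_read` and `check_header` are hypothetical names chosen for this example and do not reflect `nanowasm`'s real interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical file-like source; names are illustrative only. */
struct source {
    /* Reads up to n bytes into buf; returns the number of bytes read. */
    size_t (*read)(void *buf, size_t n, void *user);
    void *user;
};

/* Example backend: a flash-like byte array read through a cursor. */
struct flash {
    const unsigned char *base;
    size_t len, pos;
};

static size_t flash_read(void *buf, size_t n, void *user)
{
    struct flash *const f = user;
    const size_t left = f->len - f->pos;

    if (n > left)
        n = left;

    memcpy(buf, f->base + f->pos, n);
    f->pos += n;
    return n;
}

/* Checks the module header without holding the whole module in memory.
 * Returns 0 on success, -1 otherwise. */
static int check_header(const struct source *s)
{
    static const unsigned char expected[] =
        {0x00, 'a', 's', 'm', 0x01, 0x00, 0x00, 0x00};
    unsigned char hdr[sizeof expected];

    assert(s != NULL && s->read != NULL);

    if (s->read(hdr, sizeof hdr, s->user) != sizeof hdr)
        return -1;

    return memcmp(hdr, expected, sizeof hdr) ? -1 : 0;
}
```

Swapping `flash_read` for a callback that reads from a file, a serial link or an external memory chip would not require any change to `check_header`.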
+
+### Memories
+
+WebAssembly defines four different memory areas:
+
+- Table memory.
+- Linear memory.
+- Global memory.
+- Stack.
+
+All interpreters other than `nanowasm` allocate these memory areas internally
+via the system heap, or a custom heap defined by the user. This raises the
+following concerns:
+
+- Some resource-constrained environments might prefer to avoid the use of
+a heap, or maybe no heap implementation is even available.
+- It forces each of these memory areas to remain contiguous. On environments
+with segmented memory, this might limit the amount of contiguous memory that
+can be allocated.
+
+In the context of `nanowasm`, MMU-less systems were considered a priority for
+its design, and therefore it was conceived so that these memory areas are
+never allocated by `nanowasm` itself. Instead, `nanowasm` provides the
+host with a series of interfaces (i.e., callbacks) to implement in order to define
+how these areas are accessed. Therefore, whether accessing those areas
+requires the use of a heap, and how it is used, is entirely up to the host
+implementation.
+
+While possibly a bit cumbersome at first glance, this flexible design
+opens up many new possibilities. For example, it allows MMU-less systems to
+store memory pages into non-contiguous memory areas, or even store them into
+larger, non-volatile memory, similarly to how fully-fledged operating systems
+implement virtual memory.
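As a rough sketch of the callback idea, the example below backs a linear address space with two chunks that need not be adjacent in host memory. It covers linear memory only. `struct mem_cfg`, `struct chunks`, `chunks_load` and `chunks_store` are hypothetical names, not `nanowasm`'s actual interface, and the chunk size is deliberately tiny for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sizes only; real Wasm pages are 64 KiB. */
enum { CHUNK = 16, N_CHUNKS = 2 };

/* Hypothetical host-side callbacks; not nanowasm's actual interface. */
struct mem_cfg {
    int (*load)(unsigned long addr, void *dst, size_t n, void *user);
    int (*store)(unsigned long addr, const void *src, size_t n, void *user);
    void *user;
};

/* Backing store: chunks that need not be adjacent in host memory. */
struct chunks {
    unsigned char *chunk[N_CHUNKS];
};

/* Returns nonzero if [addr, addr + n) lies inside the linear space. */
static int in_range(unsigned long addr, size_t n)
{
    const unsigned long total = (unsigned long)CHUNK * N_CHUNKS;

    return addr <= total && n <= total - addr;
}

static int chunks_load(unsigned long addr, void *dst, size_t n, void *user)
{
    const struct chunks *const c = user;
    unsigned char *const out = dst;
    size_t i;

    assert(c != NULL);

    if (!in_range(addr, n))
        return -1;

    /* Translate each linear address to a (chunk, offset) pair. */
    for (i = 0; i < n; i++)
        out[i] = c->chunk[(addr + i) / CHUNK][(addr + i) % CHUNK];

    return 0;
}

static int chunks_store(unsigned long addr, const void *src, size_t n,
    void *user)
{
    const struct chunks *const c = user;
    const unsigned char *const in = src;
    size_t i;

    assert(c != NULL);

    /* Reject out-of-range accesses before writing anything. */
    if (!in_range(addr, n))
        return -1;

    for (i = 0; i < n; i++)
        c->chunk[(addr + i) / CHUNK][(addr + i) % CHUNK] = in[i];

    return 0;
}
```

An access that straddles a chunk boundary is transparently split by the address translation, so the interpreter side never needs to know how the host laid the chunks out.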
+
+## No heap required
+
+All interpreters other than `nanowasm` allocate many internal data
+structures, as well as arbitrarily large chunks of data in order to accommodate
+the different memory areas defined by the WebAssembly standard. Aside from the
+limitations [explained above](#memories), this means the memory required by a
+module or a module instance cannot be known at compile-time, since it depends
+on how the heap is implemented, and even the module bytecode itself.
+
+On the other hand, `nanowasm` was designed with resource-constrained
+environments in mind, where a heap implementation might be either undesired
+or just unavailable. Therefore, it had to be implemented so that the memory
+required by modules and module instances remained static. This is achieved
+efficiently by storing all data structures for all possible states into a
+`union`.
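The `union` technique can be sketched as below. The states and their fields are hypothetical and are not taken from `nanowasm`'s sources. The point is only that the object is as large as its largest state, so its size is a compile-time constant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-state data; illustrative only. */
struct st_header { unsigned char buf[8]; size_t read; };
struct st_section { unsigned long id, len, offset; };
struct st_exec { unsigned long pc; long operand[2]; };

enum state { ST_HEADER, ST_SECTION, ST_EXEC };

/* Only one state is live at a time, so all of them share storage. */
struct machine {
    enum state state;
    union {
        struct st_header header;
        struct st_section section;
        struct st_exec exec;
    } u;
};

/* Switches to the section state, reusing the same storage. */
static void enter_section(struct machine *m, unsigned long id,
    unsigned long len)
{
    assert(m != NULL);
    m->state = ST_SECTION;
    m->u.section.id = id;
    m->u.section.len = len;
    m->u.section.offset = 0;
}
```

Because `sizeof (struct machine)` is known at compile time, a host can declare it as a local variable, as a `static` object, or obtain it from a heap, without the library ever allocating memory itself.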
+
+This design allows hosts to allocate modules and module instances in any way,
+be it:
+
+- Automatically, i.e., from the stack.
+- Statically, i.e., via the `static` qualifier.
+- Dynamically, i.e., from the heap.
+
+## Big-endian support
+
+Even if little-endian architectures, such as `amd64`, are arguably more popular
+as of the time of this writing, big-endian counterparts are still being
+produced and are therefore considered equally relevant by `nanowasm`.
+
+### [`wasm-micro-runtime`]
+
+Despite the fact that `wasm-micro-runtime` seems to byte-swap integers
+according to the platform endianness, no big-endian platforms are listed so far
+on its `README.md`. Also, due to its big code base, it is difficult to ensure
+whether all integer reads and writes are done in an endianness-agnostic way.
+
+### [`wac`]
+
+`wac` naively compares the `\0asm` magic string as a little-endian integer.
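For contrast, an endianness-agnostic check can be sketched as follows. This is a generic illustration rather than code from any of the interpreters discussed here, and `is_wasm_magic` and `read_le32` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compares the magic byte by byte, so host endianness is irrelevant. */
static int is_wasm_magic(const unsigned char *p, size_t n)
{
    static const unsigned char magic[] = {0x00, 'a', 's', 'm'};

    assert(p != NULL);
    return n >= sizeof magic && !memcmp(p, magic, sizeof magic);
}

/* Decodes a little-endian 32-bit value with shifts only, which yields
 * the same result on little-endian and big-endian hosts. */
static unsigned long read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (unsigned long)p[0]
        | (unsigned long)p[1] << 8
        | (unsigned long)p[2] << 16
        | (unsigned long)p[3] << 24;
}
```

Reading the same four bytes through a pointer cast to a 32-bit integer would instead produce a host-dependent value, which is why a comparison against a fixed integer constant only works on little-endian machines.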
+
+## No compiler-specific extensions
+
+All interpreters other than `nanowasm` rely extensively on system-specific
+macros and/or extensions to the C language. These might restrict their use on
+less popular compilers and/or new environments.
+
+On the other hand, `nanowasm` is written in standard ANSI C (C89/C90),
+i.e., without any language extensions or system-specific macros.
+In a broader sense, the use of macros and/or other preprocessor directives is
+kept to a minimum in `nanowasm`, as opposed to other interpreters such
+as `wasm3`.
+
+Such reduced use of the preprocessor is considered to enhance readability,
+even if it might incur some extra boilerplate code.
+
+## Public functions
+
+### `nanowasm`
+
+The numbers were extracted from [`nw.h`](include/nanowasm/nw.h).
+
+### [`wasm-micro-runtime`]
+
+- Commit: `4e50d2191ca8f177ad03a9d80eebc44b59a932db`
+
+The numbers were extracted from `wasm_export.h`.
+
+### [`wasm3`]
+
+- Commit: `35b5e2fb53c5cbc1ff3d7e42c381cd7cfa14f308`
+
+The numbers were extracted from `wasm3.h`.
+
+## Minimal memory footprint
+
+### `nanowasm`
+
+The numbers were extracted from the `test` application built by the project
+by default, which links the `nanowasm` library.
+
+The project was built with:
+
+```
+cmake -B build
+cmake --build build
+```
+
+Then, the size for the `test` executable was obtained via:
+
+```
+$ size build/test/test
+ text data bss dec hex filename
+ 50590 3576 16 54182 d3a6 build/test/test
+```
+
+Of course, these numbers are subject to change since many opcodes are still
+not implemented in `nanowasm`.
+
+### [`wasm-micro-runtime`]
+
+- Commit: `4e50d2191ca8f177ad03a9d80eebc44b59a932db`
+
+A minimal application, namely `wamr-ex`, was written with
+`wasm-micro-runtime`'s `iwasm` library. This application:
+
+1. Dumps a `.wasm` file into memory, since `wasm-micro-runtime` requires
+module code to either reside in memory or belong to a memory-mapped file.
+2. Calls the following functions:
+ - `wasm_runtime_init`
+ - `wasm_runtime_load`
+ - `wasm_runtime_instantiate`
+ - `wasm_application_execute_main`
+ - `wasm_runtime_unload`
+ - `wasm_runtime_deinstantiate`
+
+The example was built with the default CMake flags i.e.:
+
+```
+cmake -B build
+cmake --build build
+```
+
+The executable size was then obtained via:
+
+```
+$ size build/wamr-ex
+ text data bss dec hex filename
+ 463165 9224 932 473321 738e9 build/wamr-ex
+```
+
+### [`wasm3`]
+
+- Commit: `35b5e2fb53c5cbc1ff3d7e42c381cd7cfa14f308`
+
+`wasm3` provides a sample application, also called `wasm3`, in its source tree.
+This application can run any `.wasm` file and accepts some extra command
+line options.
+
+The project was built with the default CMake flags i.e.:
+
+```
+cmake -B build
+cmake --build build
+```
+
+The executable size was then obtained via:
+
+```
+$ size build/wasm3
+ text data bss dec hex filename
+ 531667 20332 6720 558719 8867f build/wasm3
+```
+
+## Per-module memory usage
+
+### `nanowasm`
+
+The numbers were extracted by looking up `sizeof (struct nw_mod)` via `gdb(1)`,
+on an `x86_64-linux-gnu` machine. Results might vary depending on the
+target platform.
+
+## Per-instance memory usage
+
+### `nanowasm`
+
+The numbers were extracted by looking up `sizeof (struct nw_inst)` via `gdb(1)`,
+on an `x86_64-linux-gnu` machine. Results might vary depending on the
+target platform.
+
+[`wasm-micro-runtime`]: https://github.com/bytecodealliance/wasm-micro-runtime
+[`wasm3`]: https://github.com/wasm3/wasm3
+[`wac`]: https://github.com/kanaka/wac
+[`toywasm`]: https://github.com/yamt/toywasm