diff options
| author | Alon Zakai <alonzakai@gmail.com> | 2015-05-14 13:20:20 -0700 |
|---|---|---|
| committer | Alon Zakai <alonzakai@gmail.com> | 2015-05-15 11:38:22 -0700 |
| commit | 1b971360f1429fe66bb13d03c5d90102991d830c (patch) | |
| tree | 94549cfa002e46c3941ce59d448d7a489d185fb2 /BinaryEncoding.md | |
| parent | 7d7313da95f5bd3ba45095bf23208bf544c209bb (diff) | |
| download | nanowasm-design-1b971360f1429fe66bb13d03c5d90102991d830c.tar.gz | |
update binary encoding following recent talks
Diffstat (limited to 'BinaryEncoding.md')
| -rw-r--r-- | BinaryEncoding.md | 74 |
1 files changed, 31 insertions, 43 deletions
diff --git a/BinaryEncoding.md b/BinaryEncoding.md index 8a9d9d1..c801ae5 100644 --- a/BinaryEncoding.md +++ b/BinaryEncoding.md @@ -3,49 +3,42 @@ This document describes the binary encoding of the AST nodes defined in the [AST semantics](AstSemantics.md). For the context of this document, see the [Binary format](V1.md#binary-format) section of the [v.1 doc](V1.md). -*The current [polyfill](https://github.com/WebAssembly/polyfill) is a prototype and thus incomplete and not -representative of an actual WebAssembly binary format. However, there are a few key design choices in the -prototype format that significantly and demonstrably impact size and decode speed that are described below -as a starting point for defining a real standard binary format.* +The binary encoding is designed to allow fast startup, which includes reducing download size +and allow for quick decoding. Considering the matter of reducing download size, we can see +it as having three layers: + + * The **raw** binary encoding itself, natively decoded by the browser, and to be standardized in + v.1 of the spec. + * **Specific** compression to the binary encoding, that is unreasonable to expect a generic + compression algorithm like gzip to achieve. + * This is not meant to be standardized, at least not initially, as it can be done with a + downloaded decompressor that runs as web content on the client, and in particular + can be implemented in the polyfill. Not standardizing it leaves the binary + encoding as the only thing a WebAssembly implementation is required to implement, + which is nice. However, if the benefits are shown to be substantial, this will be + reconsidered in the future. + * We can do better than generic compression because we are aware of the AST structure + and other details: + * For example, macro compression that [deduplicates AST trees](https://github.com/WebAssembly/spec/issues/58#issuecomment-101863032) + can focus on AST nodes + their children, thus having `O(nodes)` entities to worry about, compared + to generic compression which in principle would need to look at `O(bytes*bytes)` entities. + Such macros would allow the logical equivalent of `#define ADD1(x) (x+1)`, i.e., + to be parametrized. Simpler macros (`#define ADDX1 (x+1)`) can implement useful + features like constant pools. + * Another example is reordering of functions and some internal nodes, which we know + does not change semantics, but + [can improve general compression](http://www.rfk.id.au/blog/entry/cromulate-improve-compressibility/). + * **Generic** compression, such as gzip, already supported in browsers, or LZMA and other + compression algorithms, which might be standardized as well. + +Each of the three layers work to find compression opportunities to the best of its abilities, without encroaching upon the subsequent layer's compression opportunities. ## Building blocks ### Variable-length integers - * 31% size reduction before compression, 7% size reduction after compression. + * 31% size reduction before compression, 7% size reduction after compression. * [LEB128](http://en.wikipedia.org/wiki/LEB128) except limited to uint32_t payloads. -### Constant pool - * 7% size reduction before compression, 2% size reduction after compression. - * Use a module-level constant pool for i32, f32, f64 literals. - * Still want to allow literals as immediates if: - * there is only one use of the literal - * the variable-length-encoded literal pool index is bigger than the variable-length-encoded literal. - * otherwise, the constant pool is only 5.6% win uncompressed / 0% win compressed. - -### Context-dependent index spaces - * With a pre-order encoding, a node is always encountered in a *context*: statement, i32-producing - expression, f32-producing expression, etc. - * Different contexts can reuse the same indices since there is never an ambiguity. - * For example, I32::Add can have the same index as F32::Add. - * This avoids opcodes spilling over into requiring >1 bytes. - * When SIMD is added, new opcodes would mostly be in new type contexts, so still fit in a single byte. - * This also enables the next technique: - -### Fold immediates into the opcodes - * 26% size reduction before compression, 5% size reduction after compression. - * In the opcode byte: - * High bit determines whether it is a normal op or an op-with-immediate. - * If normal op: the remaining 7 bits are the opcode - * If op-with-immediate: the next 2 bits are the opcode, the remaining 5 are the immediate - * Current set of ops-with-immediate: Stmt::{SetLoc,SetGlo}, {I32,F32,F64}::{GetLoc,LitPool,LitImm} - * How more than 4? Context-dependent index spaces lets, e.g., I32WithImm overlap F32WithImm. - * Just GetLoc+SetLoc accounts for 30-40% of total bytes in a module, and each has an immediate. - * Isn't this high? No, it's actually lower than a normal register-based bytecode which has to - spend bytes on input/output operands for *every* instruction. - * 98% of functions have < 32 variables so 5 bits of immediate is a good fit. - * Literals (pool and immediate) account for ~10% of bytes. - * Sorting constant pool by use count allows 50% of LitPool ops to use a folded immediate. - ## Global structure * A module contains: @@ -64,6 +57,7 @@ as a starting point for defining a real standard binary format.* * the serialized AST ## Serialized AST + * Use a preorder encoding of the AST * Efficient single-pass validation+compilation and polyfill * Allows context-dependent index spaces (described above) @@ -77,9 +71,6 @@ as a starting point for defining a real standard binary format.* * First write the opcode of `Call` (1 byte) * Then write the (variable-length) integer `Call::callee` (1-5 bytes) * Then recursively write each arg node (arity is determined by looking up `callee` in table of signatures) - * Given an AST node with foldable immediate: `struct GetLoc { uint32_t index; }` - * If `GetLoc::index` < 32, write a single byte as described [above](BinaryEncoding.md#fold-immediates-into-opcodes). - * Otherwise, write the opcode of `GetLoc` and the variable length integer `GetLoc::index`. ## Backwards Compatibility @@ -115,6 +106,3 @@ conflict-avoidance practices surrounding string names: or affect anything outside of the initialization of the index spaces; decoders would remain versionless and simply add cases for new *names* (as with current JS parsers). -## Further ideas -* Macro layer on top of serialized AST (allow the logical equivalent of `#define ADD1(x) (x+1)`) - * A simple variant would be just having nullary macros (`#define ADDX1 (x+1)`) |
