From 0ef4e93fe3bea4356fdee2c647b2dbc54cc2d3d0 Mon Sep 17 00:00:00 2001
From: Luke Wagner
Date: Tue, 4 Aug 2015 09:02:31 -1000
Subject: Clarify, rename, and FAQ memory allocation

---
 AstSemantics.md   | 28 ++++++++++++++++++++--------
 FAQ.md            | 33 +++++++++++++++++++++++++++++++++
 FutureFeatures.md | 25 +++++++++++++++++++++----
 Modules.md        |  5 +++--
 Nondeterminism.md |  1 +
 5 files changed, 78 insertions(+), 14 deletions(-)

diff --git a/AstSemantics.md b/AstSemantics.md
index 35f71b2..9bba34e 100644
--- a/AstSemantics.md
+++ b/AstSemantics.md
@@ -71,17 +71,18 @@ Global variables and linear memory accesses use memory types.
 
 The main storage of a WebAssembly module, called the *linear memory*, is a
 contiguous, byte-addressable range of memory spanning from offset `0` and
-extending for `memory_size` bytes. The linear memory can be considered to be a
-untyped array of bytes. The linear memory is sandboxed; it does not alias the
-execution engine's internal data structures, the execution stack, local
+extending for `memory_size` bytes, which can be dynamically adjusted by
+[`resize_memory`](AstSemantics.md#resizing). The linear memory can be considered to
+be an untyped array of bytes. The linear memory is sandboxed; it does not alias
+the execution engine's internal data structures, the execution stack, local
 variables, global variables, or other process memory. The initial state of linear
 memory is specified by the [module](Modules.md#initial-state-of-linear-memory).
 
-In the MVP, linear memory is not shared between threads of execution or
-modules: every module has its own separate linear memory. It will,
-however, be possible to share linear memory between separate modules and
-threads once [threads](PostMVP.md#threads) and
-[dynamic linking](FutureFeatures.md#dynamic-inking) are added as features.
+In the MVP, linear memory is not shared between threads of execution. Separate
+modules can execute in separate threads but have their own linear memory and can
+only communicate through messaging, e.g. in browsers using `postMessage`. It
+will be possible to share linear memory between threads of execution when
+[threads](PostMVP.md#threads) are added.
 
 ### Linear Memory Operations
 
@@ -210,6 +211,17 @@ tradeoffs.
 execution of a module in a mode that threw exceptions on out-of-bounds
 access.
 
+### Resizing
+
+As stated [above](AstSemantics.md#linear-memory), linear memory can be resized
+by a `resize_memory` builtin operation. The resize delta is required to be a
+multiple of a global `page_size` constant. Also as stated
+[above](AstSemantics.md#linear-memory), linear memory is contiguous, meaning
+there are no "holes" in the linear address space. After the MVP, there are
+[future features](FutureFeatures.md#finer-grained-control-over-memory) proposed
+to allow setting protection and creating mappings within the contiguous
+linear memory.
+
 ## Local variables
 
 Each function has a fixed, pre-declared number of local variables which occupy a single
diff --git a/FAQ.md b/FAQ.md
index 42fefcf..1e0fb5e 100644
--- a/FAQ.md
+++ b/FAQ.md
@@ -229,3 +229,36 @@ WebAssembly implementations run on the user side, so there is no opportunity for
 * Most of the individual floating point operations that WebAssembly does have already map to individual fast instructions in hardware. Telling `add`, `sub`, or `mul` they don't have to worry about NaN for example doesn't make them any faster, because NaN is handled quickly and transparently in hardware on all modern platforms.
 
 * WebAssembly has no floating point traps, status register, dynamic rounding modes, or signalling NaNs, so optimizations that depend on the absence of these features are all safe.
+
+## What about `mmap`?
+
+The [`mmap`](http://pubs.opengroup.org/onlinepubs/009695399/functions/mmap.html)
+syscall has many useful features. While these are all packed into one
+overloaded syscall in POSIX, WebAssembly unpacks this functionality into
+multiple builtins:
+* the MVP starts with the ability to resize linear memory via a
+  [`resize_memory`](AstSemantics.md#resizing) builtin operation;
+* proposed [future features](FutureFeatures.md#finer-grained-control-over-memory)
+  would allow the application to change the protection and mappings for pages
+  in the contiguous range set by `resize_memory`.
+
+A significant feature of `mmap` that is missing from the above list is the
+ability to allocate disjoint virtual address ranges. The reasoning for this
+omission is:
+* The above functionality is sufficient to allow a user-level libc to
+  implement full, compatible `mmap` with what appears to be noncontiguous
+  memory allocation (but, under the hood, is just coordinated use of
+  `resize_memory` and `mprotect`/`map_file`/`map_shmem`/`madvise`).
+* The benefit of allowing noncontiguous virtual address allocation would be if
+  it allowed the engine to interleave a WebAssembly module's linear memory with
+  other memory allocations in the same process (in order to mitigate virtual
+  address space fragmentation). There are two problems with this:
+  * This interleaving with unrelated allocations does not currently admit
+    efficient security checks to prevent one module from corrupting data outside
+    its heap (see discussion in #285).
+  * This interleaving would require making allocation nondeterministic.
+    Nondeterminism is something that WebAssembly generally
+    [tries to avoid](Nondeterminism.md) and in this particular case, history
+    has clear examples of memory allocator nondeterminism leading to real-world
+    bustage ([[1](https://technet.microsoft.com/en-us/magazine/ff625273.aspx)],
+    [[2](http://lxr.free-electrons.com/source/include/linux/personality.h?v=3.2#L31)]).
diff --git a/FutureFeatures.md b/FutureFeatures.md
index f2320b1..f8a8397 100644
--- a/FutureFeatures.md
+++ b/FutureFeatures.md
@@ -40,10 +40,27 @@ possible to use a non-standard ABI for specialized purposes.
 
 ## Finer-grained control over memory
 
-* `mmap` of files.
-* `madvise(MADV_DONTNEED)`.
-* Shared memory, where a physical address range is mapped to multiple physical
-  pages in a single WebAssembly module as well as across modules.
+Provide access to safe OS-provided functionality including:
+* `map_file(addr, length, Blob, file-offset)`: semantically, this operation
+  copies the specified range from `Blob` into the range `[addr, addr+length)`
+  (where `addr+length <= memory_size`) but implementations are encouraged
+  to `mmap(addr, length, MAP_FIXED | MAP_PRIVATE, fd)`
+* `dont_need(addr, length)`: semantically, this operation zeroes the given range
+  but the implementation is encouraged to `madvise(addr, length, MADV_DONTNEED)`
+* `shmem_create(length)`: create a memory object that can be simultaneously
+  shared between multiple linear memories
+* `map_shmem(addr, length, shmem, shmem-offset)`: like `map_file` except
+  `MAP_SHARED`, which isn't otherwise valid on read-only Blobs
+* `mprotect(addr, length, prot-flags)`: change protection on the range
+  `[addr, addr+length)` (where `addr+length <= memory_size`)
+
+The `addr` and `length` parameters above would be required to be multiples of
+the [`page_size`](AstSemantics.md#resizing) global constant.
+
+The above list of functionality mostly covers the set of functionality
+provided by the `mmap` OS primitive. One significant exception is that `mmap`
+can allocate noncontiguous virtual address ranges. See the
+[FAQ](FAQ.md#what-about-mmap) for rationale.
 
 ## More expressive control flow
 
diff --git a/Modules.md b/Modules.md
index 435788f..b91aba2 100644
--- a/Modules.md
+++ b/Modules.md
@@ -108,8 +108,9 @@ to allow *explicitly* sharing linear memory between multiple modules.
 ## Initial state of linear memory
 
 A module will contain a section declaring the linear memory size (initial and
-maximum size allowed by `sbrk`) and the initial contents of memory (analogous
-to `.data`, `.rodata`, `.bss` sections in native executables).
+maximum size allowed by [`resize_memory`](AstSemantics.md#resizing)) and the
+initial contents of memory (analogous to `.data`, `.rodata`, `.bss` sections in
+native executables).
 
 ## Code section
 
diff --git a/Nondeterminism.md b/Nondeterminism.md
index e3296cb..7685216 100644
--- a/Nondeterminism.md
+++ b/Nondeterminism.md
@@ -31,6 +31,7 @@ currently admits nondeterminism:
    nondeterministic.
  * Out of bounds heap accesses *may* want
    [some flexibility](AstSemantics.md#out-of-bounds)
+ * The [`page_size` global constant](AstSemantics.md#resizing)
  * NaN bit patterns in floating point
    [operations](AstSemantics.md#floating-point-operations) and
    [conversions](AstSemantics.md#datatype-conversions-truncations-reinterpretations-promotions-and-demotions)
-- 
cgit v1.2.3
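
Reviewer note: the resizing rules that this patch adds to AstSemantics.md and FutureFeatures.md can be sketched as a small toy model. This is only an illustration, not engine code. The `LinearMemory` class, the `PAGE_SIZE` value of 4096, the `max_size` parameter name, the zero-filling of newly added pages, and the handling of negative deltas as shrinking are all assumptions made for the sketch; the patch itself only states that the resize delta must be a multiple of `page_size`, that memory is contiguous, and that `dont_need` semantically zeroes its range.

```python
PAGE_SIZE = 4096  # assumed value; the patch only says page_size is a global constant


class LinearMemory:
    """Toy model of a contiguous, zero-based linear memory (illustrative only)."""

    def __init__(self, initial_size, max_size, page_size=PAGE_SIZE):
        if not 0 <= initial_size <= max_size:
            raise ValueError("initial size must be within [0, max_size]")
        self.page_size = page_size
        self.max_size = max_size  # hypothetical name for the declared maximum
        self.data = bytearray(initial_size)  # contiguous: offsets 0..memory_size-1

    @property
    def memory_size(self):
        return len(self.data)

    def resize_memory(self, delta):
        """Grow or shrink by delta bytes; delta must be a multiple of page_size."""
        if delta % self.page_size != 0:
            raise ValueError("resize delta must be a multiple of page_size")
        new_size = self.memory_size + delta
        if not 0 <= new_size <= self.max_size:
            raise ValueError("new size outside [0, max_size]")
        if delta >= 0:
            self.data.extend(bytes(delta))  # assumption: new pages read as zero
        else:
            del self.data[new_size:]  # assumption: negative delta shrinks from the end
        return self.memory_size

    def dont_need(self, addr, length):
        """Zero [addr, addr+length), the stated semantics of the proposed dont_need."""
        if addr % self.page_size or length % self.page_size:
            raise ValueError("addr and length must be multiples of page_size")
        if addr < 0 or length < 0 or addr + length > self.memory_size:
            raise ValueError("range must lie within [0, memory_size)")
        self.data[addr:addr + length] = bytes(length)
```

Because memory only ever grows or shrinks at its end, there are never holes in the address range, which is the property the FAQ entry relies on when it argues that a user-level libc could layer an `mmap`-like allocator on top of these operations.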