This PR follows @hafeidejiangyou advice to not only enable end users to
avoid reflection when calling host functions, but also use that approach
ourselves internally. The performance results are staggering and will be
noticable in high performance applications.
Before
```
BenchmarkHostCall/Call
BenchmarkHostCall/Call-16 1000000 1050 ns/op
Benchmark_EnvironGet/environGet
Benchmark_EnvironGet/environGet-16 525492 2224 ns/op
```
Now
```
BenchmarkHostCall/Call
BenchmarkHostCall/Call-16 14807203 83.22 ns/op
Benchmark_EnvironGet/environGet
Benchmark_EnvironGet/environGet-16 951690 1054 ns/op
```
To accomplish this, this PR consolidates code around host function
definition and enables a fast path for functions where the user takes
responsibility for defining its WebAssembly mappings. Existing users
will need to change their code a bit, as signatures have changed.
For example, we are now more strict that all host functions require a
context parameter zero. Also, we've replaced
`HostModuleBuilder.ExportFunction` and `ExportFunctions` with a new type
`HostFunctionBuilder` that consolidates the responsibility and the
documentation.
```diff
ctx := context.Background()
-hello := func() {
+hello := func(context.Context) {
fmt.Fprintln(stdout, "hello!")
}
-_, err := r.NewHostModuleBuilder("env").ExportFunction("hello", hello).Instantiate(ctx, r)
+_, err := r.NewHostModuleBuilder("env").
+ NewFunctionBuilder().WithFunc(hello).Export("hello").
+ Instantiate(ctx, r)
```
Power users can now use `HostFunctionBuilder` to define functions that
won't use reflection. There are two choices of interfaces to use
depending on if that function needs access to the calling module or not:
`api.GoFunction` and `api.GoModuleFunction`. Here's an example defining
one.
```go
builder.WithGoFunction(api.GoFunc(func(ctx context.Context, params []uint64) []uint64 {
x, y := uint32(params[0]), uint32(params[1])
sum := x + y
return []uint64{sum}
}, []api.ValueType{api.ValueTypeI32, api.ValueTypeI32}, []api.ValueType{api.ValueTypeI32})
```
As you'll notice and as documented, this approach is more verbose and
not for everyone. If you aren't making a low-level library, you are
likely able to afford the 1us penalty for the convenience of reflection.
However, we are happy to enable this option for foundational libraries
and those with high performance requirements (like ourselves)!
Fixes#825
Signed-off-by: Adrian Cole <adrian@tetrate.io>
It is more often the case that projects are enabling a freestanding
target, and that may or may not have an exporting memory depending on
how that's interpreted. This adds the ability to inspect memories
similar to how you can already inspect compiled code prior to
instantiation. For example, you can enforce an ABI constraint that
"memory" must be exported even if WASI is not in use.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
We formerly introduced `MemorySizer` as a way to control capacity independently of size. This was the first and only feature in `CompileConfig`. While possibly used privately, `MemorySizer` has never been used in public GitHub code.
These APIs interfere with how we do caching of compiled modules. Notably, they can change the min or max defined in wasm, which invalidates some constants. This has also had a bad experience, forcing everyone to boilerplate`wazero.NewCompileConfig()` despite that API never being used in open source.
This addresses the use cases in a different way, by moving configuration to `RuntimeConfig` instead. This allows us to remove `MemorySizer` and `CompileConfig`, and the problems with them, yet still retaining functionality in case someone uses it.
* `RuntimeConfig.WithMemoryLimitPages(uint32)`: Prevents memory from growing to 4GB (spec limit) per instance.
* This works regardless of whether the wasm encodes max or not. If there is no max, it becomes effectively this value.
* `RuntimeConfig.WithMemoryCapacityFromMax(bool)`: Prevents reallocations (when growing).
* Wasm that never sets max will grow from min to the limit above.
Note: Those who want to change their wasm (ex insert a max where there was none), have to do that externally, ex via compiler settings or post-build transformations such as [wabin](https://github.com/tetratelabs/wabin)
Signed-off-by: Adrian Cole <adrian@tetrate.io>
We at one point considered making `ModuleBuilder` create complete
WebAssembly binaries. However, we recently spun out
[wabin](https://github.com/tetratelabs/wabin), which allows this.
Meanwhile, the features in `ModuleBuilder` were confusing and misused.
For example, the only two cases memory was exported on GitHub were done
by accident. This is because host functions act on the guest's memory,
not their own.
Hence, this removes memory and globals from host side definitions, and
renames the type to HostModuleBuilder to clarify this is not ever going
to be used to construct normal Wasm binaries.
Most importantly, this simplifies the API and reduces a lot of code. It
is important to make changes like this, particularly deleting any
experimental things that didn't end up useful.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
Co-authored-by: Anuraag Agrawal <anuraaga@gmail.com>
While compilers should be conservative when targeting WebAssembly Core
features, runtimes should be lenient as otherwise people need to
constantly turn on all features. Currently, most examples have to turn
on 2.0 features because compilers such as AssemblyScript and TinyGo use
them by default. This matches the policy with the reality, and should
make first time use easier.
This top-levels an internal type as `api.CoreFeatures` and defaults to
2.0 as opposed to 1.0, our previous default. This is less cluttered than
the excess of `WithXXX` methods we had prior to implementing all
planned WebAssembly Core Specification 1.0 features.
Finally, this backfills rationale as flat config types were a distinct
decision even if feature set selection muddied the topic.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
This introduces wasm.CallEngine internal type, and assign it to the api.Function
implementations. api.Function.Call now uses that CallEngine assigned to it
to make function calls.
Internally, when creating CallEngine implementation, the compiler engine allocates
call frames and values stack. Previously, we allocate these stacks for each function calls,
which was a severe overhead as we can recognize in the benchmarks. As a result,
this reduces the memory usage (== reduces the GC jobs) as long as we reuse
the same api.Function multiple times.
As a side effect, now api.Function.Call is not goroutine-safe. So this adds the comment
about it on that method.
Signed-off-by: Takeshi Yoneda <takeshi@tetrate.io>
This simplifies FunctionListener definition by making it possible to
implement both interfaces without intermediate state. Passing the
function definition to the before/after callbacks is the key.
This also continues efforts towards Go 1.19 doc formatting.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
Before, we allowed stubbed host functions to be defined in wasm instead
of Go. This improves performance and reduces a chance of side-effects vs
Go. In fact, any pure function was supported in wasm, provided it only
called pure functions.
This changes internals so that a wasm-defined host function can use
memory. Notably, host functions use the caller's memory, so this is
simpler to initially support in the interpreter.
This is needed to simplify and reduce performance hit of GOARCH=wasm,
GOOS=js code, which perform a lot of memory reads and do not have
idiomatic signatures.
Note: wasm-defined host functions remain internal until we gain
experience, at least conclusion of the wasm_exec host module.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
* Makes CacheNumInUint64 lazy and stops crashing in assemblyscript
This makes CacheNumInUint64 lazy so that all tests for function types
don't need to handle it. This also changes the assemblyscript special
functions so they don't crash when attempting to log. Finally, this
refactors `wasm.Func` so that it can enclose the parameter names as it
is more sensible than defining them elsewhere.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
This removes the constraint of a module being exclusively wasm or host
functions. Later pull requests can optimize special imports to be
implemented in wasm, particularly useful for disabled logging callbacks.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
This improves the experimental logging listener to show parameter name
and values like so:
```
--> ._start.command_export()
--> .__wasm_call_ctors()
--> .__wasilibc_initialize_environ()
==> wasi_snapshot_preview1.environ_sizes_get(result.environc=1048572,result.environBufSize=1048568)
<== ESUCCESS
<-- ()
==> wasi_snapshot_preview1.fd_prestat_get(fd=3,result.prestat=1048568)
<== ESUCCESS
--> .dlmalloc(2)
--> .sbrk(0)
<-- (1114112)
<-- (1060080)
--snip--
```
The convention `==>` implies it was a host function call
(def.IsHostFunction). This also improves the lifecycle by creating
listeners during compile. Finally, this backfills param names for
assemblyscript and wasi.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
This top-levels `api.FunctionDefinition` which was formerly
experimental, and also adds import metadata to it. Now, it holds all
metadata known at compile time.
Here are the public API visible changes:
* api.ExportedFunction - replaced with api.FunctionDefinition as it is
usable for all types of functions.
* api.Function - `.ParamTypes/ResultTypes()` are replaced with
`.Definition().
* api.FunctionDefinition - extracted from experimental and adds
`.Import()` to get the any imported module and function name.
* experimental.FunctionDefinition - replaced with
api.FunctionDefinition.
* experimental.FunctionListenerFactory - adds first arg of the
instantiated module name, as it can be different than compiled.
* wazero.CompiledModule - Adds `.ImportedFunctions()` and changes result
type of `.ExportedFunctions()` to api.FunctionDefinition.
Internally, logic to create function definition are consolidated between
host and wasm-defined functions, notably wasm.Module now includes
`.BuildFunctionDefinitions()` which reduces duplication in
wasm.ModuleInstance `.BuildFunctions()`,
This obviates #681 by deleting the `ExportedFunction` type which
overlaps with this information.
This fixes#637 as it includes more metadata including imports.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
Co-authored-by: Takeshi Yoneda <takeshi@tetrate.io>
This completes the implementation of SIMD proposal for both
the interpreter and compiler(amd64).
This also fixes#210 by adding the complete documentation
over all the wazeroir operations.
Signed-off-by: Takeshi Yoneda <takeshi@tetrate.io>
Co-authored-by: Crypt Keeper <64215+codefromthecrypt@users.noreply.github.com>
This drops the text format (%.wat) and renames
InstantiateModuleFromCode to InstantiateModuleFromBinary as it is no
longer ambiguous.
We decided to stop supporting the text format as it isn't typically used
in production, yet costs a lot of work to develop. Given the resources
available and the increased work added with WebAssembly 2.0 and soon
WASI 2, we can't afford to spend the time on it.
The old parser is used only internally and will eventually be moved to
its own repository named watzero, possibly towards archival.
See #59
Signed-off-by: Adrian Cole <adrian@tetrate.io>
This commit implements the v128.const, i32x4.add and i64x2.add in
interpreter mode and this adds support for the vector value types in the
locals and globals.
Notably, the vector type values can be passed and returned by exported functions
as well as host functions via two-uint64 encodings as described in #484 (comment).
Note: implementation of these instructions on JIT will be done in subsequent PR.
part of #484
Signed-off-by: Takeshi Yoneda <takeshi@tetrate.io>
This commit enables WebAssembly 2.0 Core Specification tests.
In order to pass the tests, this fixes several places mostly on the
validation logic.
Note that SIMD instructions are not implemented yet.
part of #484
Signed-off-by: Takeshi Yoneda <takeshi@tetrate.io>
Co-authored-by: Crypt Keeper <64215+codefromthecrypt@users.noreply.github.com>
This commit completes the reference-types proposal implementation.
Notably, this adds support for
* `ref.is_null`, `ref.func`, `ref.is_null` instructions
* `table.get`, `table.set`, `table.grow`, `table.size` and `table.fill` instructions
* `Externref` and `Funcref` types (including invocation via uint64 encoding).
part of #484
Signed-off-by: Takeshi Yoneda <takeshi@tetrate.io>
This performs several changes to allow compilation config to be
centralized and scoped properly. The immediate effects are that we can
now process external types during `Runtime.CompileModule` instead of
doing so later during `Runtime.InstantiateModule`. Another nice side
effect is memory size problems can err at a source line instead of
having to be handled in several places.
There are some API effects to this, and to pay for them, some less used
APIs were removed. The "easy APIs" are left alone. For example, the APIs
to compile and instantiate a module from Go or Wasm in one step are left
alone.
Here are the changes, some of which are only for consistency. Rationale
is summarized in each point.
* ModuleBuilder.Build -> ModuleBuilder.Compile
* The result of this is similar to `CompileModule`, and pairs better
with `ModuleBuilder.Instantiate` which is like `InstantiateModule`.
* CompiledCode -> CompiledModule
* We punted on this name, the result is more than just code. This is
better I think and more consistent as it introduces less terms.
* Adds CompileConfig param to Runtime.CompileModule.
* This holds existing features and will have future ones, such as
mapping externtypes to uint64 for wasm that doesn't yet support it.
* Merges Runtime.InstantiateModuleWithConfig with Runtime.InstantiateModule
* This allows us to explain APIs in terms of implicit or explicit
compilation and config, vs implicit, kindof implicit, and explicit.
* Removes Runtime.InstantiateModuleFromCodeWithConfig
* Similar to above, this API only saves the compilation step and also
difficult to reason with from a name POV.
* RuntimeConfig.WithMemory(CapacityPages|LimitPages) -> CompileConfig.WithMemorySizer
* This allows all error handling to be attached to the source line
* This also allows someone to reduce unbounded memory while knowing
what its minimum is.
* ModuleConfig.With(Import|ImportModule) -> CompileConfig.WithImportRenamer
* This allows more types of import manipulation, also without
conflating functions with globals.
* Adds api.ExternType
* Needed for ImportRenamer and will be needed later for ExportRenamer.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
By testing the side-effects of Memory.Read, we ensure users who control
the underlying memory capacity can use the returned slice for
write-through access to Wasm addressible memory. Notably, this allows a
shared fixed length data structure to exist with a pointer on the Go
side and a memory offset on the Wasm side.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
Co-authored-by: Takeshi Yoneda <takeshi@tetrate.io>
This commit adds support for multiple tables per module.
Notably, if the WithFeatureReferenceTypes is enabled,
call_indirect, table.init and table.copy instructions
can reference non-zero indexed tables.
part of #484
Signed-off-by: Takeshi Yoneda <takeshi@tetrate.io>
`Runtime.WithMemoryCapacityPages` is a function that determines memory
capacity in pages (65536 bytes per page). The inputs are the min and
possibly nil max defined by the module, and the default is to return
the min.
Ex. To set capacity to max when exists:
```golang
c.WithMemoryCapacityPages(func(minPages uint32, maxPages *uint32) uint32 {
if maxPages != nil {
return *maxPages
}
return minPages
})
```
Note: This applies at compile time, ModuleBuilder.Build or Runtime.CompileModule.
Fixes#500
Signed-off-by: Adrian Cole <adrian@tetrate.io>
This adds an experimental package to expose two work-in-progress
features:
* FunctionListener - for tracing etc.
* Sys - to control random number generators
Both the functionality and the names of the features above are
not stable. However, this should help those who can tolerate drift a
means to test things out.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
Co-authored-by: Anuraag Agrawal <anuraaga@gmail.com>
This commit implements the rest of the unimplemented instructions in the
bulk-memory-operations proposal.
Notably, this adds support for table.init, table.copy and elem.drop
instructions toggled by FeatureBulkMemoryOperations.
Given that, now wazero has the complete support for the bulk-memory-operations
proposal as described in https://github.com/WebAssembly/spec/blob/main/proposals/bulk-memory-operations/Overview.mdfixes#321
Signed-off-by: Takeshi Yoneda <takeshi@tetrate.io>
This commit adds the random module generator for compiler testing.
Notably, it only generates valid and (supposedly) compilable Wasm modules.
Generating such random sources/programs and using them for testing or
fuzzing compilers is the widely known way of securing and hardening them [1,2].
This commit only creates random modules where functions do not have
any meaningful body. That will be resolved in a subsequent commit
as the implementation for generating random function body could be
huge.
Note that the implementation has already found a bug #485, and can be
easily integrated with the fuzzing feature of Go 1.18:
func FuzzCompiler(f *testing.F) {
r := wazero.NewRuntimeWithConfig(wazero.NewRuntimeConfig().
WithFeatureMultiValue(true))
f.Fuzz(func(t *testing.T, seed []byte, numTypes, numFunctions, numImports, numExports, numGlobals, numElements, numData uint32, needStartSection bool) {
// Generate a random WebAssembly module.
m := Gen(seed, wasm.FeaturesFinished, numTypes, numFunctions, numImports, numExports, numGlobals, numElements, numData, needStartSection)
// Encode the generated module (*wasm.Module) as binary.
bin := binary.EncodeModule(m)
// Pass the generated binary into our compilers.
_, err := r.CompileModule(context.Background(), bin)
require.NoError(t, err)
})
}
[1] https://dl.acm.org/doi/fullHtml/10.1145/3363562
[2] https://en.wikipedia.org/wiki/Compiler_correctness
Signed-off-by: Takeshi Yoneda <takeshi@tetrate.io>
This commit allows CompiledCode to be re-used regardless of
the existence of import replacement configs for instantiation.
In order to achieve this, we introduce ModuleID, which is sha256
checksum calculated on source bytes, as a key for module compilation
cache. Previously, we used*wasm.Module as keys for caches which
differ before/after import replacement.
Signed-off-by: Takeshi Yoneda takeshi@tetrate.io
This commit makes it possible for functions to be compiled before instantiation.
Notably, this adds CompileModule method on Engine interface where we pass
wasm.Module (which is the decoded module) to engines, and engines compile
all the module functions and caches them keyed on *wasm.Module.
In order to achieve that, this stops the compiled native code from embedding typeID
which is assigned for all the function types in a store.
Signed-off-by: Takeshi Yoneda <takeshi@tetrate.io>
Global constants can be defined in wasm or in ModuleBuilder. In either
case, they end up being decoded and interpreted during instantiation.
This chooses signed encoding to avoid surprises. A more comprehensive
explanation was added to RATIONALE.md, but the motivation was a global
100 coming out negative.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
This adjusts towards the exiting code which used int32/64 instead of
uint32/64. The reason is that the spec indicates intepretation as signed
numbers, which affects the maximum value.
See https://www.w3.org/TR/wasm-core-1/#value-types%E2%91%A2
Signed-off-by: Adrian Cole <adrian@tetrate.io>
This adds functions to configure memory with ModuleBuilder. This uses
two functions, ExportMemory and ExportMemoryWithMax, as working with
uint32 pointers is awkward.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
This adds tests that pass without changing deferred error handling.
There are some tests that don't pass even without deferred error
handling. I'll add those in a separate PR.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
During #425, @neilalexander gave constructive feedback that the API is
both moving fast, and not good enough yet. This attempts to reduce the
incidental complexity at the cost of a little conflation.
### odd presence of `wasm` and `wasi` packages -> `api` package
We had public API packages in wasm and wasi, which helped us avoid
leaking too many internals as public. That these had names that look
like there should be implementations in them cause unnecessary
confusion. This squashes both into one package "api" which has no
package collission with anything.
We've long struggled with the poorly specified and non-uniformly
implemented WASI specification. Trying to bring visibility to its
constraints knowing they are routinely invalid taints our API for no
good reason. This removes all `WASI` commands for a default to invoke
the function `_start` if it exists. In doing so, there's only one path
to start a module.
Moreover, this puts all wasi code in a top-level package "wasi" as it
isn't re-imported by any internal types.
### Reuse of Module for pre and post instantiation to `Binary` -> `Module`
Module is defined by WebAssembly in many phases, from decoded to
instantiated. However, using the same noun in multiple packages is very
confusing. We at one point tried a name "DecodedModule" or
"InstantiatedModule", but this is a fools errand. By deviating slightly
from the spec we can make it unambiguous what a module is.
This make a result of compilation a `Binary`, retaining `Module` for an
instantiated one. In doing so, there's no longer any name conflicts
whatsoever.
### Confusion about config -> `ModuleConfig`
Also caused by splitting wasm into wasm+wasi is configuration. This
conflates both into the same type `ModuleConfig` as it is simpler than
trying to explain a "will never be finished" api of wasi snapshot-01 in
routine use of WebAssembly. In other words, this further moves WASI out
of the foreground as it has been nothing but burden.
```diff
--- a/README.md
+++ b/README.md
@@ -49,8 +49,8 @@ For example, here's how you can allow WebAssembly modules to read
-wm, err := r.InstantiateModule(wazero.WASISnapshotPreview1())
-defer wm.Close()
+wm, err := wasi.InstantiateSnapshotPreview1(r)
+defer wm.Close()
-sysConfig := wazero.NewSysConfig().WithFS(os.DirFS("/work/home"))
-module, err := wazero.StartWASICommandWithConfig(r, compiled, sysConfig)
+config := wazero.ModuleConfig().WithFS(os.DirFS("/work/home"))
+module, err := r.InstantiateModule(binary, config)
defer module.Close()
...
```
This allows users to reduce the memory limit per module below 4 Gi. This
is often needed because Wasm routinely leaves off the max, which implies
spec max (4 Gi). This uses Ki Gi etc in error messages because the spec
chooses to, though we can change to make it less awkward.
This also fixes an issue where we instantiated an engine inside config.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
This deduplicates Engine tests so that there's less chance of copy/paste
errors and less requirement to use ad-hoc tests which are outside the
relevant source tree.
Signed-off-by: Adrian Cole <adrian@tetrate.io>