Add an extra argument `os_file_handle file` for `os_mmap` to support
mapping file from a file fd, and remove `os_get_invalid_handle` from
`posix_file.c` and `win_file.c`, instead, add it in the `platform_internal.h`
files to remove the dependency on libc-wasi.
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
- Fix potential invalid push param phis and add incoming phis to a un-existed basic block
- Fix potential invalid shift count int rotl/rotr opcodes
- Resize memory_data_size to UINT32_MAX if it is 4G when hw bound check is enabled
- Fix negative linear memory offset is used for 64-bit target it is const and larger than INT32_MAX
Split memory instance's field `uint32 ref_count` into `bool is_shared_memory`
and `uint16 ref_count`, and lock the memory only when `is_shared_memory`
flag is true, no need to acquire a lock for non-shared memory when shared
memory feature is enabled.
Avoid repeatedly initializing the shared memory data when creating the child
thread in lib-pthread or lib-wasi-threads.
Add shared memory lock when accessing some fields of the memory instance
if the memory instance is shared.
Init shared memory's memory_data_size/memory_data_end fields according to
the current page count but not max page count.
Add wasm_runtime_set_mem_bound_check_bytes, and refine the error message
when shared memory flag is found but the feature isn't enabled.
Avoid the stack traces getting mixed up together when multi-threading is enabled
by using exception_lock/unlock in dumping the call stacks.
And remove duplicated call stack dump in wasm_application.c.
Also update coding guideline CI to fix the clang-format-12 not found issue.
Support muti-module for AOT mode, currently only implement the
multi-module's function import feature for AOT, the memory/table/
global import are not implemented yet.
And update wamr-test-suites scripts, multi-module sample and some
CIs accordingly.
Introduce module instance context APIs which can set one or more contexts created
by the embedder for a wasm module instance:
```C
wasm_runtime_create_context_key
wasm_runtime_destroy_context_key
wasm_runtime_set_context
wasm_runtime_set_context_spread
wasm_runtime_get_context
```
And make libc-wasi use it and set wasi context as the first context bound to the wasm
module instance.
Also add samples.
Refer to https://github.com/bytecodealliance/wasm-micro-runtime/issues/2460.
When embedding WAMR, this PR allows to register a callback that is
invoked when memory.grow fails.
In case of memory allocation failures, some languages allow to handle
the error (e.g. by checking the return code of malloc/calloc in C), some
others (e.g. Rust) just panic.
- Inherit shared memory from the parent instance, instead of
trying to look it up by the underlying module. The old method
works correctly only when every cluster uses different module.
- Use reference count in WASMMemoryInstance/AOTMemoryInstance
to mark whether the memory is shared or not
- Retire WASMSharedMemNode
- For atomic opcode implementations in the interpreters, use
a global lock for now
- Update the internal API users
(wasi-threads, lib-pthread, wasm_runtime_spawn_thread)
Fixes https://github.com/bytecodealliance/wasm-micro-runtime/issues/1962
## Context
Currently, WAMR supports compiling iwasm with flag `WAMR_BUILD_WASI_NN`.
However, there are scenarios where the user might prefer having it as a shared library.
## Proposed Changes
Decouple wasi-nn context management by internally managing the context given
a module instance reference.
LLVM PGO (Profile-Guided Optimization) allows the compiler to better optimize code
for how it actually runs. This PR implements the AOT static PGO, and is tested on
Linux x86-64 and x86-32. The basic steps are:
1. Use `wamrc --enable-llvm-pgo -o <aot_file_of_pgo> <wasm_file>`
to generate an instrumented aot file.
2. Compile iwasm with `cmake -DWAMR_BUILD_STATIC_PGO=1` and run
`iwasm --gen-prof-file=<raw_profile_file> <aot_file_of_pgo>`
to generate the raw profile file.
3. Run `llvm-profdata merge -output=<profile_file> <raw_profile_file>`
to merge the raw profile file into the profile file.
4. Run `wamrc --use-prof-file=<profile_file> -o <aot_file> <wasm_file>`
to generate the optimized aot file.
5. Run the optimized aot_file: `iwasm <aot_file>`.
The test scripts are also added for each benchmark, run `test_pgo.sh` under
each benchmark's folder to test the AOT static pgo.
Segue is an optimization technology which uses x86 segment register to store
the WebAssembly linear memory base address, so as to remove most of the cost
of SFI (Software-based Fault Isolation) base addition and free up a general
purpose register, by this way it may:
- Improve the performance of JIT/AOT
- Reduce the footprint of JIT/AOT, the JIT/AOT code generated is smaller
- Reduce the compilation time of JIT/AOT
This PR uses the x86-64 GS segment register to apply the optimization, currently
it supports linux and linux-sgx platforms on x86-64 target. By default it is disabled,
developer can use the option below to enable it for wamrc and iwasm(with LLVM
JIT enabled):
```bash
wamrc --enable-segue=[<flags>] -o output_file wasm_file
iwasm --enable-segue=[<flags>] wasm_file [args...]
```
`flags` can be:
i32.load, i64.load, f32.load, f64.load, v128.load,
i32.store, i64.store, f32.store, f64.store, v128.store
Use comma to separate them, e.g. `--enable-segue=i32.load,i64.store`,
and `--enable-segue` means all flags are added.
Acknowledgement:
Many thanks to Intel Labs, UC San Diego and UT Austin teams for introducing this
technology and the great support and guidance!
Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
Co-authored-by: Vahldiek-oberwagner, Anjo Lucas <anjo.lucas.vahldiek-oberwagner@intel.com>
Try using existing exec_env to execute wasm app's malloc/free func and
execute post instantiation functions. Create a new exec_env only when
no existing exec_env was found.
Use pre-created exec_env for instantiation and module_malloc/free,
use the same exec_env of the current thread to avoid potential
unexpected behavior.
And remove unnecessary shared_mem_lock in wasm_module_free,
which may cause dead lock.
- Remove notify_stale_threads_on_exception and change atomic.wait
to be interruptible by keep waiting and checking every one second,
like the implementation of poll_oneoff in libc-wasi
- Wait all other threads exit and then get wasi exit_code to avoid
getting invalid value
- Inherit suspend_flags of parent thread while creating new thread to
avoid terminated flag isn't set for new thread
- Fix wasi-threads test case update_shared_data_and_alloc_heap
- Add "Lib wasi-threads enabled" prompt for cmake
- Fix aot get exception, use aot_copy_exception instead
- Implement atomic.fence to ensure a proper memory synchronization order
- Destroy exec_env_singleton first in wasm/aot deinstantiation
- Change terminate other threads to wait for other threads in
wasm_exec_env_destroy
- Fix detach thread in thread_manager_start_routine
- Fix duplicated lock cluster->lock in wasm_cluster_cancel_thread
- Add lib-pthread and lib-wasi-threads compilation to Windows CI
- Use execute_post_instantiate_functions to call start, _initialize,
__post_instantiate, __wasm_call_ctors functions after instantiation
- Always call start function for both main instance and sub instance
- Only call _initialize and __post_instantiate for main instance
- Only call ___wasm_call_ctors for main instance and when bulk memory
is enabled and wasi import functions are not found
- When hw bound check is enabled, use the existing exec_env_tls
to call func for sub instance, and switch exec_env_tls's module inst
to current module inst to avoid checking failure and using the wrong
module inst
The start/initialize functions of wasi module are to do some initialization work
during instantiation, which should be only called one time in the instantiation
of main instance. For example, they may initialize the data in linear memory,
if the data is changed later by the main instance, and re-initialized again by
the child instance, unexpected behaviors may occur.
And clear a shadow warning in classic interpreter.
- Reorganize the library structure
- Use the latest version of `wasi-nn` wit (Oct 25, 2022):
0f77c48ec1/wasi-nn.wit.md
- Split logic that converts WASM structs to native structs in a separate file
- Simplify addition of new frameworks
The original CI didn't actually run wasi test suite for x86-32 since the `TEST_ON_X86_32=true`
isn't written into $GITHUB_ENV.
And refine the error output when failed to link import global.
When a wasm module is duplicated instantiated with wasm_instance_new,
the function import info of the previous instantiation may be overwritten by
the later instantiation, which may cause unexpected behavior.
Store the function import info into the module instance to fix the issue.
Refine AOT exception check in the caller when returning from callee function,
remove the exception check instructions when hw bound check is enabled to
improve the performance: create guard page to trigger signal handler when
exception occurs.
Refactor LLVM JIT for some purposes:
- To simplify the source code of JIT compilation
- To simplify the JIT modes
- To align with LLVM latest changes
- To prepare for the Multi-tier JIT compilation, refer to #1302
The changes mainly include:
- Remove the MCJIT mode, replace it with ORC JIT eager mode
- Remove the LLVM legacy pass manager (only keep the LLVM new pass manager)
- Change the lazy mode's LLVM module/function binding:
change each function in an individual LLVM module into all functions in a single LLVM module
- Upgraded ORC JIT to ORCv2 JIT to enable lazy compilation
Refer to #1468
Refactor the layout of interpreter and AOT module instance:
- Unify the interp/AOT module instance, use the same WASMModuleInstance/
WASMMemoryInstance/WASMTableInstance data structures for both interpreter
and AOT
- Make the offset of most fields the same in module instance for both interpreter
and AOT, append memory instance structure, global data and table instances to
the end of module instance for interpreter mode (like AOT mode)
- For extra fields in WASM module instance, use WASMModuleInstanceExtra to
create a field `e` for interpreter
- Change the LLVM JIT module instance creating process, LLVM JIT uses the WASM
module and module instance same as interpreter/Fast-JIT mode. So that Fast JIT
and LLVM JIT can access the same data structures, and make it possible to
implement the Multi-tier JIT (tier-up from Fast JIT to LLVM JIT) in the future
- Unify some APIs: merge some APIs for module instance and memory instance's
related operations (only implement one copy)
Note that the AOT ABI is same, the AOT file format, AOT relocation types, how AOT
code accesses the AOT module instance and so on are kept unchanged.
Refer to:
https://github.com/bytecodealliance/wasm-micro-runtime/issues/1384
Memory num_bytes_per_page was incorrectly set in memory enlarging for
shared memory, we fix it. And don't set memory_data_size again for shared
memory.
Implement more socket APIs, refer to #1336 and below PRs:
- Implement wasi_addr_resolve function (#1319)
- Fix socket-api byte order issue when host/network order are the same (#1327)
- Enhance sock_addr_local syscall (#1320)
- Implement sock_addr_remote syscall (#1360)
- Add support for IPv6 in WAMR (#1411)
- Implement ns lookup allowlist (#1420)
- Implement sock_send_to and sock_recv_from system calls (#1457)
- Added http downloader and multicast socket options (#1467)
- Fix `bind()` calls to receive the correct size of `sockaddr` structure (#1490)
- Assert on correct parameters (#1505)
- Copy only received bytes from socket recv buffer into the app buffer (#1497)
Co-authored-by: Marcin Kolny <mkolny@amazon.com>
Co-authored-by: Marcin Kolny <marcin.kolny@gmail.com>
Co-authored-by: Callum Macmillan <callumimacmillan@gmail.com>
Fix two issues of building WAMR on Windows:
- The build_llvm.py script calls itself, spawning instances faster than they expire,
which makes Python3 eat up the entire RAM in a pretty short time.
- The MSVC compiler doesn't support preprocessor statements inside macro expressions.
Two places inside bh_assert() were found.
Fix multi-module issue:
don't call the sub module's function with "$sub_module_name$func_name"
Fix the aot_call_function free argv1 issue
Modify some API comments in wasm_export.h
Fix the wamrc help info
Normalize wasm types, for the two wasm types, if their parameter types
and result types are the same, we only save one copy, so as to reduce
the footprint and simplify the type comparison in opcode CALL_INDIRECT.
And fix issue in interpreter globals_instantiate, and remove used codes.
Implement boundary check with hardware trap for interpreter on
64-bit platforms:
- To improve the performance of interpreter and Fast JIT
- To prepare for multi-tier compilation for the feature
Linux/MacOS/Windows 64-bit are enabled.
Enable dump call stack to a buffer, use API
`wasm_runtime_get_call_stack_buf_size` to get the required buffer size
and use API
`wasm_runtime_dump_call_stack_to_buf` to dump call stack to a buffer