Now that the filesystem implementation is now complete, the previous
test filters on Windows can be removed. Some of the tests only pass when
certain environment variables have been set on Windows so an extra step
has been added in the wasi test runner script to modify the test config
files before the tests begin.
The stack profiler `aot_func#xxx` calls the wrapped function of `aot_func_internal#xxx`
by using symbol reference, but in some platform like xtensa, it’s translated into a native
long call, which needs to resolve the indirect address by relocation and breaks the XIP
feature which requires the eliminating of relocation.
The solution is to change the symbol reference into an indirect call through the lookup
table, the code will be like this:
```llvm
call_wrapped_func: ; preds = %stack_bound_check_block
%func_addr1 = getelementptr inbounds ptr, ptr %func_ptrs_ptr, i32 75
%func_tmp2 = load ptr, ptr %func_addr1, align 4
tail call void %func_tmp2(ptr %exec_env)
ret void
```
Fix the errors reported in the sanitizer test of nightly run CI.
When the stack is in polymorphic state, the stack operands may be changed
after pop and push operations (e.g. stack is empty but pop op can succeed
in polymorphic, and the push op can push a new operand to stack), this may
impact the following checks to other target blocks of the br_table opcode.
Use math functions only with `CONFIG_MINIMAL_LIBC=y`.
`CONFIG_PICOLIBC=y` or `CONFIG_NEWLIB_LIBC=y` provides math functions
that are used by wasm, and compilation fails when they are selected.
Signed-off-by: Maxim Kolchurin <maxim.kolchurin@gmail.com>
When running AOT code in Zephyr on STM32H743VIT6 without
CONFIG_CACHE_MANAGEMENT=y, a hard fault occurs, which leads to
SCB_CleanDCache().
It’s better to use the functions built into Zephyr.
This PR encompasses two complementing purposes:
A documentation on verifying an Intel SGX evidence as produced by WAMR,
including a guide for verification without an Intel SGX-enabled platform.
This also contains a small addition to the RA sample to extract specific
information, such as whether the enclave is running in debug mode.
A C# sample to verify evidence on trusted premises (and without Intel SGX).
Evidence is generated on untrusted environments, using Intel SGX.
As an original design rule, the code in `core/shared/platform` should not
rely on the code in `core/share/utils`. In the current implementation,
platform layer calls function `bh_memory_remap_slow` in utils layer.
This PR adds inline function `os_mremap_slow` in platform_api_vmcore.h,
and lets os_remap call it if mremap fails. And remove bh_memutils.h/c as
as they are unused.
And resolve the compilation warning in wamrc:
```bash
core/shared/platform/common/posix/posix_memmap.c:255:16:
warning: implicit declaration of function ‘bh_memory_remap_slow’
255 | return bh_memory_remap_slow(old_addr, old_size, new_size);
```
The wasm_interp_call_func_bytecode is called for the first time with the empty
module/exec_env to generate a global_handle_table. Before that happens though,
the function checks if the module instance has bounds check enabled. Because
the module instance is null, the program crashes. This PR added an extra check to
prevent the crashes.
Implement the GC (Garbage Collection) feature for interpreter mode,
AOT mode and LLVM-JIT mode, and support most features of the latest
spec proposal, and also enable the stringref feature.
Use `cmake -DWAMR_BUILD_GC=1/0` to enable/disable the feature,
and `wamrc --enable-gc` to generate the AOT file with GC supported.
And update the AOT file version from 2 to 3 since there are many AOT
ABI breaks, including the changes of AOT file format, the changes of
AOT module/memory instance layouts, the AOT runtime APIs for the
AOT code to invoke and so on.
This increases the chance to use "short" calls.
Assumptions:
- LLVM preserves the order of functions in a module
- The wrapper function are smaller than the wrapped functions
- The target CPU has "short" PC-relative variation of call/jmp instructions
and they are preferrable over the "long" ones.
A motivation:
- To avoid some relocations for XIP, I want to use xtensa PC-relative
call instructions, which can only reach ~512KB.
Using `CHECK_BULK_MEMORY_OVERFLOW(addr + offset, n, maddr)` to do the
boundary check may encounter integer overflow in `addr + offset`, change to
use `CHECK_MEMORY_OVERFLOW(n)` instead, which converts `addr` and `offset`
to uint64 first and then add them to avoid integer overflow.
With this approach we can omit using memset() for the newly allocated memory
therefore the physical pages are not being used unless touched by the program.
This also simplifies the implementation.
After #2995, AOT may stop working properly on arm MacOS:
```bash
wasm-micro-runtime/core/iwasm/common/wasm_runtime_common.c,
line 1270, WASM module load failed
AOT module load failed: mmap memory failed
```
That's because, without `#include <TargetConditionals.h>`, `TARGET_OS_OSX` is undefined,
since it's definition is in that header file.
This PR adds the initial support for WASM exception handling:
* Inside the classic interpreter only:
* Initial handling of Tags
* Initial handling of Exceptions based on W3C Exception Proposal
* Import and Export of Exceptions and Tags
* Add `cmake -DWAMR_BUILD_EXCE_HANDLING=1/0` option to enable/disable
the feature, and by default it is disabled
* Update the wamr-test-suites scripts to test the feature
* Additional CI/CD changes to validate the exception spec proposal cases
Refer to:
https://github.com/bytecodealliance/wasm-micro-runtime/issues/1884587513f3c68bebfe9ad759bccdfed8
Signed-off-by: Ricardo Aguilar <ricardoaguilar@siemens.com>
Co-authored-by: Chris Woods <chris.woods@siemens.com>
Co-authored-by: Rene Ermler <rene.ermler@siemens.com>
Co-authored-by: Trenner Thomas <trenner.thomas@siemens.com>
While we used a different approach for poll_oneoff [1],
the implementation works only when the poll list includes
an absolute clock event. That is, if we have a thread which is
polling on descriptors without a timeout, we fail to terminate
the thread.
This commit fixes it by applying wasm_runtime_begin_blocking_op
to poll as well.
[1] https://github.com/bytecodealliance/wasm-micro-runtime/pull/1951
This fixes the cosmopolitan platform.
- Switch `build_cosmocc.sh` and platform documentation to
explicitly use the x86_64 cosmocc compiler as multi-arch
cosmocc won't work here. Older version `cosmocc` just did
a x86_64 build.
- Add missing items from `platform_internal.h` to fix build.
Follow-up on #2907. The log level is needed in the host embedder to
better integrate with the embedder's logger.
Allow the developer to customize his bh_log callback with
`cmake -DWAMR_BH_LOG=<log_callback>`,
and update sample/basic to show the usage.
Possible alternatives:
* Make wasm_cluster_destroy_spawned_exec_env take two exec_env.
One for wasm execution and another to specify the target to destroy.
* Make execute functions to switch exec_env as briefly discussed in
https://github.com/bytecodealliance/wasm-micro-runtime/pull/2047
When WAMR is embedded to other application, the lifecycle of the socket
might conflict with other usecases. E.g. if WAMR is deinitialized before any
other use of sockets, the application goes into the invalid state. The new
flag allows host application to take control over the socket initialization.
Check whether the arguments are NULL before calling bh_hash_map_find,
or lots of "HashMap find elem failed: map or key is NULL" warnings may
be dumped. Reported in #3053.
Though SIMD isn't supported by interpreter, when JIT is enabled,
developer may run `iwasm --interp <wasm_file>` to trigger the SIMD
opcode in interpreter, which isn't handled before this PR.
It seems that some users want to wrap rather large chunk of code
with wasm_runtime_begin_blocking_op/wasm_runtime_end_blocking_op.
If the wrapped code happens to have a call to
e.g. wasm_runtime_spawn_exec_env, WASM_SUSPEND_FLAG_BLOCKING is
inherited to the child exec_env and it may cause unexpected behaviors.
- Enable quick aot entry when hw bound check is disabled
- Remove unnecessary ret_type argument in the quick aot entries
- Declare detailed prototype of aot function to call in each quick aot entry
Since there is no so rich api in freertos like embedded system, simply set
CONFIG_HAS_CAP_ENTER to 1 to support posix file api for freertos.
Test file api in wasm app pass.
Enhance the statistic of wasm function execution time, or the performance
profiling feature:
- Add os_time_thread_cputime_us() to get the cputime of a thread,
and use it to calculate the execution time of a wasm function
- Support the statistic of the children execution time of a function,
and dump it in wasm_runtime_dump_perf_profiling
- Expose two APIs:
wasm_runtime_sum_wasm_exec_time
wasm_runtime_get_wasm_func_exec_time
And rename os_time_get_boot_microsecond to os_time_get_boot_us.
For shared memory, the max memory size must be defined in advanced. Re-allocation
for growing memory can't be used as it might change the base address, therefore when
OS_ENABLE_HW_BOUND_CHECK is enabled the memory is mmaped, and if the flag is
disabled, the memory is allocated. This change introduces a flag that allows users to use
mmap for reserving memory address space even if the OS_ENABLE_HW_BOUND_CHECK
is disabled.
When the original wasm contains multiple compilation units, the current
logic uses the first one for everything. This commit tries to use a bit more
appropriate ones.
Errors were reported when initializing wasm_val_t values with WASM_I32_VAL like macros.
```
error: missing initializer for member ‘wasm_val_t::__paddings’ [-Werror=missing-field-initializers]
64 | wasm_val_t res = {WASM_INIT_VAL};
```
And rename DEPRECATED to WASM_API_DEPRECATED to avoid using defines with generic names.
Compilation error was reported when `cmake -DWAMR_BUILD_LIBC_WASI=0`
on linux-sgx platform:
```
core/shared/platform/linux-sgx/sgx_socket.c:8:10:
fatal error: libc_errno.h: No such file or directory
8 | #include "libc_errno.h"
| ^~~~~~~~~~~~~~
```
After fixing, both `cmake -DWAMR_BUILD_LIBC_WASI=1` and
`WAMR_BUILD_LIBC_WASI=0` work good.
The content in custom name section is changed after loaded since the strings
are adjusted with '\0' appended, the emitted AOT file then cannot be loaded.
The PR disables changing the content for AOT compiler to resolve it.
And disable emitting custom name section for `wamrc --enable-dump-call-stack`,
instead, use `wamrc --emit-custom-sections=name` to emit it.
`pthread_jit_write_protect_np` is only available on macOS, and
`sys_icache_invalidate` is available on both iOS and macOS and
has no restrictions on ARM architecture.
When using the wasm-c-api and there's a trap, `wasm_func_call()` returns
a `wasm_trap_t *` object. No matter which thread crashes, the trap contains
the stack frames of the main thread.
With this PR, when there's an exception, the stack frames of the thread
where the exception occurs are stored into the thread cluster.
`wasm_func_call()` can then return those stack frames.
Allow to invoke the quick call entry wasm_runtime_quick_invoke_c_api_import to
call the wasm-c-api import functions to speedup the calling process, which reduces
the data copying.
Use `wamrc --invoke-c-api-import` to generate the optimized AOT code, and set
`jit_options->quick_invoke_c_api_import` true in wasm_engine_new when LLVM JIT
is enabled.
In some scenarios there may be lots of callings to AOT/JIT functions from the
host embedder, which expects good performance for the calling process, while
in the current implementation, runtime calls the wasm_runtime_invoke_native
to prepare the array of registers and stacks for the invokeNative assemble code,
and the latter then puts the elements in the array to physical registers and
native stacks and calls the AOT/JIT function, there may be many data copying
and handlings which impact the performance.
This PR registers some quick AOT/JIT entries for some simple wasm signatures,
and let runtime call the entry to directly invoke the AOT/JIT function instead of
calling wasm_runtime_invoke_native, which speedups the calling process.
We may extend the mechanism next to allow the developer to register his quick
AOT/JIT entries to speedup the calling process of invoking the AOT/JIT functions
for some specific signatures.
On macOS, by default, the first 4GB is occupied by the pagezero.
While it can be controlled with link time options, as we are
an library, we usually don't have a control on how to link an
executable.
Add an API to set segue flags for wasm-c-api LLVM JIT mode:
```C
wasm_config_t *
wasm_config_set_segue_flags(wasm_config_t *config, uint32 segue_flags);
```
- Don't allocate the implicit/unused frame when calling the LLVM JIT function
- Don't set exec_env's thread handle and stack boundary in the recursive
calling from host, since they have been set in the first time calling
- Fix frame not freed in llvm_jit_call_func_bytecode
before the change, only support wasm app exit like:
```c
void *thread_routine(void *arg)
{
printf("Enter thread\n");
return NULL;
}
```
if call pthread_exit, it will crash:
```c
void *thread_routine(void *arg)
{
printf("Enter thread\n");
pthread_exit(NULL);
return NULL;
}
```
This commit lets both upstairs work correctly, test pass on stm32f103 mcu.
And refactor the original perf support
- use WAMR_BUILD_LINUX_PERF as the cmake compilation control
- use WASM_ENABLE_LINUX_PERF as the compiler macro
- use `wamrc --enable-linux-perf` to generate aot file which contains fp operations
- use `iwasm --enable-linux-perf` to create perf map for `perf record`
The host embedder may also want to terminate the wasm instance
for single-threading mode, and it should work by setting exception
to the wasm instance.
Loggers (e.g. glog) usually come with instrumentation to add timestamp
and other information when reporting. That results in the timestamp
being reported twice, making the output confusing.
According to the specification:
```
When instantiating a module which is expected to run
with `wasi-threads`, the WASI host must first allocate shared memories to
satisfy the module's imports.
```
Currently, if a test from the spec is executed while having the `multi-module`
feature enabled, WAMR fails with `WASM module load failed: unknown import`.
That happens because spec tests use memory like this:
`(memory (export "memory") (import "foo" "bar") 1 1 shared)`
and WAMR tries to find a registered module named `foo`.
At the moment, there is no specific module name that can be used to identify
that the memory is imported because using WASI threads:
https://github.com/WebAssembly/wasi-threads/issues/33,
so this PR only avoids treating the submodule dependency not being found
as a failure.
It's possible to set both `atim` and `atim_now` in the `fstflags`
parameter. Same goes for `mtin` and `mtim_now`. However, it's
ambiguous which time should be set in these two cases. This commit
checks this and returns `EINVAL`.
A wasm module can be either a command or a reactor, so it can export
either `_start` or `_initialize`. Currently, if a command module is run,
`iwasm` still looks for `_initialize`, resulting in the warning:
`can not find an export 0 named _initialize in the module`.
Change to look for `_initialize` only if `_start` not found to resolve the issue.
This fixes bug #2880. Zephyr 3.2 made changes to how headers are reference (see [release notes](https://docs.zephyrproject.org/latest/releases/release-notes-3.2.html)). Work item [49578](https://github.com/zephyrproject-rtos/zephyr/issues/49578) deprecated the old headers names.
The current WAMR codebase references these old headers, thus causing compile errors with
current versions of Zephyr.
This update adds #ifdefs around the header names. With this change, compiling with Zephyr 3.2.0
and above will use the new header files. Prior versions will use the existing code.
This commit adds a check to `fd_advise`. If the fd is a directory,
return `ebadf`. This brings iwasm in line with Wasmtime's behavior.
WASI folks have stated that fd_advise should not work on directories
as this is a Linux-specific behavior:
https://github.com/bytecodealliance/wasmtime/issues/6505#issuecomment-1574122949
- Fix op_br_table arity type check when the dest block is loop block
- Fix op_drop issue when the stack is polymorphic and it is to drop
an ANY type value in the stack
* Empty names are spec-wise valid.
* As we ignore unknown custom sections anyway, it's safe to
accept empty names here.
* Currently, the problem is not exposed on our CI because
the wabt version used there is a bit old.
For shared memory, runtime should get the memories pointer from
module_inst first, then get memory instance from memories array,
and then get the fields of the memory instance.
Support new a wasm_config_t, set allocation and linux_perf_support
options to it, and then pass it to wasm_engine_new_with_config to
new an engine with private configuration.
The JSON evidence is allocated on the module instance heap, but no API
was given to dispose of this memory buffer. The sample mentions using
the function free, which behaves differently depending on the
execution context.
This fix provides a new function called librats_dispose_evidence_json,
enabling freeing the JSON evidence directly from the Wasm app.
Change WASMMemoryInstance's field is_shared_memory's type from bool
to uint8 whose size is fixed, so as to make WASMMemoryInstance's size
and layout fixed and not break AOT ABI.
See discussion in https://github.com/bytecodealliance/wasm-micro-runtime/pull/2682.
The popped reachable block may be if block whose else branch hasn't been
translated, and should push the params for the else block if there are.
And use LLVMDisposeMessage to free memory allocated in is_win_platform.
Currently, `data.drop` instruction is implemented by directly modifying the
underlying module. It breaks use cases where you have multiple instances
sharing a single loaded module. `elem.drop` has the same problem too.
This PR fixes the issue by keeping track of which data/elem segments have
been dropped by using bitmaps for each module instances separately, and
add a sample to demonstrate the issue and make the CI run it.
Also add a missing check of dropped elements to the fast-jit `table.init`.
Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/2735
Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/2772
CryptGenRandom is deprecated by Microsoft and may be removed in future
releases. They recommend to use the next generation API instead. See
https://learn.microsoft.com/en-us/windows/win32/seccng/cng-portal for
more details. Also, refactor the random functions to return error codes
rather than aborting the program if they fail.
Error is reported when executing `wamrc --target=thumb -o <aot_file> <wasm_file>`:
```
LLVM ERROR: failed to perform tail call elimination on a call site marked musttail
Aborted (core dumped)
```
Set `abi` to "gnu" for the bare-metal target when `abi` is NULL,
or the below `bh_assert` and `bh_memcpy` may deference a NULL
pointer. Error is reported when running wamrc compiled with
`cmake .. -DCMAKE_BUILD_TYPE=Debug`:
```
core/iwasm/compilation/aot_llvm.c:2584:13: runtime error:
null pointer passed as argument 1, which is declared to never be null
```
Add an extra argument `os_file_handle file` for `os_mmap` to support
mapping file from a file fd, and remove `os_get_invalid_handle` from
`posix_file.c` and `win_file.c`, instead, add it in the `platform_internal.h`
files to remove the dependency on libc-wasi.
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
Heap corruption check in ems memory allocator is enabled by default
to improve the security, but it may impact the performance a lot, this
PR adds cmake variable and compiler flag to enable/disable it.
Returning uint16 from WASI functions is technically correct. However,
the smallest integer type in WASM is int32 and since we don't guarantee
that the upper 16 bits of the result are zero'ed, it can result in
tricky bugs if the language SDK being used in the WASM app does not cast
back immediately to uint16. To prevent this, we directly return uint32
instead, so that the result is well-defined as a 32-bit number.
Set the vendor-sys of bare-metal targets to "-unknown-none-",
and currently only add "thumbxxx" to the bare-metal target list.
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
The commit fa5e9d72b0 ("Abstract POSIX filesystem functions") introduces
the build warning:
./core/iwasm/libraries/libc-wasi/sandboxed-system-primitives/src/posix.c: In function ‘fd_object_release’:
./core/iwasm/libraries/libc-wasi/sandboxed-system-primitives/src/posix.c:545:20: warning: this statement may fall through [-Wimplicit-fallthrough=]
545 | if (os_is_dir_stream_valid(&fo->directory.handle)) {
| ^
./core/iwasm/libraries/libc-wasi/sandboxed-system-primitives/src/posix.c:549:13: note: here
549 | default:
| ^~~~~~~
Refer to the commit fb4afc7ca4 ("Apply clang-format for core/iwasm compilation and libraries"),
add one line "// Fallthrough." to make compiler happy.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
`platform_common.h` already has a declaration for BH_VPRINTF so we can
get rid of the one in `platform_internal.h`. Also add some explicit
casts to avoid MSVC compiler warnings.
UWP apps do not have a console attached so any output to stdout/stderr
is lost. Therefore, provide a default BH_VPRINTF in that case for debug
builds which redirects output to the debugger.
To allow anything to depend on WASI types, including platform-specific
data structures, move the WASI libc filesystem/clock interface into
`platform_api_extension.h`, which leaves just WASI types in
`platform_wasi.h`. And `platform_wasi.h` has been renamed to
`platform_wasi_types.h` to reflect that it only defines types now and no
function declarations. Finally, these changes allow us to remove the
`windows_fdflags` type which was essentially a duplicate of
`__wasi_fdflags_t`.
Most of the WASI filesystem tests require at least creating/deleting a
file to test filesystem functionality so some additional filesystem APIs
have been implemented on Windows so we can test what has been
implemented so far. For those WASI functions which haven't been
implemented, we skip the tests. These will be implemented in a future PR
after which we can remove the relevant filters.
Additionally, in order to run the WASI socket and thread tests, we need
to install the wasi-sdk in CI and build the test source code prior to
running the tests.
`jit_reg_is_const_val` only checks whether the register is a const register and
the const value is stored in the register.
Should use `jit_reg_is_const` instead in the front end.
Reported in #2710.
- Fix potential invalid push param phis and add incoming phis to a un-existed basic block
- Fix potential invalid shift count int rotl/rotr opcodes
- Resize memory_data_size to UINT32_MAX if it is 4G when hw bound check is enabled
- Fix negative linear memory offset is used for 64-bit target it is const and larger than INT32_MAX
To run it locally:
```bash
export TSAN_OPTIONS=suppressions=<path_to_tsan_suppressions.txt>
./test_wamr.sh <your flags> -T tsan
```
An example for wasi-threads would look like:
```bash
export TSAN_OPTIONS=suppressions=<path_to_tsan_suppressions.txt>
./test_wamr.sh -w -s wasi_certification -t fast-interp -T tsan
```
Split memory instance's field `uint32 ref_count` into `bool is_shared_memory`
and `uint16 ref_count`, and lock the memory only when `is_shared_memory`
flag is true, no need to acquire a lock for non-shared memory when shared
memory feature is enabled.
Avoid repeatedly initializing the shared memory data when creating the child
thread in lib-pthread or lib-wasi-threads.
Add shared memory lock when accessing some fields of the memory instance
if the memory instance is shared.
Init shared memory's memory_data_size/memory_data_end fields according to
the current page count but not max page count.
Add wasm_runtime_set_mem_bound_check_bytes, and refine the error message
when shared memory flag is found but the feature isn't enabled.
Fixes the Cosmopolitan Libc platform attempting to use `/dev/urandom`
on operating systems that do not have it.
Signed-off-by: G4Vi <gavin@dylibso.com>
This patch enables mapping host directories to guest directories by parsing
the `map_dir_list` argument in API `wasm_runtime_init_wasi` for libc-wasi. It
follows the format `<guest-path>::<host-path>`.
It also adds argument `--map-dir=<guest::host>` argument for `iwasm`
common line tool, and allows to add multiple mappings:
```bash
iwasm --map-dir=<guest-path1::host-path1> --map-dir=<guest-path2::host-path2> ...
```
Implement the necessary os_ filesystem functions to enable successful
WASI initialization on Windows. Some small changes were also required to
the sockets implementation to use the new windows_handle type. The
remaining functions will be implemented in a future PR.
When labels-as-values is enabled in a target which doesn't support
unaligned address access, 16-bit offset is used to store the relative
offset between two opcode labels. But it is a little small and the loader
may report "pre-compiled label offset out of range" error.
Emitting 32-bit data instead to resolve the issue: emit label address in
32-bit target and emit 32-bit relative offset in 64-bit target.
See also:
https://github.com/bytecodealliance/wasm-micro-runtime/issues/2635
To allow non-POSIX platforms such as Windows to support WASI libc
filesystem functionality, create a set of wrapper functions which provide a
platform-agnostic interface to interact with the host filesystem. For now,
the Windows implementation is stubbed but this will be implemented
properly in a future PR. There are no functional changes in this change,
just a reorganization of code to move any direct POSIX references out of
posix.c in the libc implementation into posix_file.c under the shared
POSIX sources.
See https://github.com/bytecodealliance/wasm-micro-runtime/issues/2495 for a
more detailed overview of the plan to port the WASI libc filesystem to Windows.
When doing more investigations related to this PR:
https://github.com/bytecodealliance/wasm-micro-runtime/pull/2619
We found that in some scenarios the constant might not be directly
available to the LLVM IR builder, e.g.:
```
(func $const_ret (result i32)
i32.const -5
)
(func $foo
(i32.shr_u (i32.const -1) (call $const_ret))
(i32.const 31)
)
```
In that case, the right parameter to `i32.shr_u` is not constant, therefore
the `SHIFT_COUNT_MASK` isn't applied. However, when the optimization
is enabled (`--opt-level` is 2 or 3), the optimization passes resolve the
call into constant, and that constant is poisoned, causing the compiler to
resolve the whole function to an exception.
According to the description of `buildPerModuleDefaultPipeline()` and
`buildLTOPreLinkDefaultPipeline()`, it is not allowed to call them with `O0` level.
Use `buildO0DefaultPipeline` instead when the opt-level is 0.
The LLVM zext IR may be inserted after the terminator of a basic block
when popping the arguments of a wasm block. Change to insert the
zext IR before the terminator of the basic block to resolve the issue.
Reported in #2620.
This PR adds the Cosmopolitan Libc platform enabling compatibility with multiple
x86_64 operating systems with the same binary. The platform is similar to the
Linux platform, but for now only x86_64 with interpreter modes are supported.
The only major change to the core is `posix.c/convert_errno()` was rewritten to use
a switch statement. With Cosmopolitan errno values depend on the currently
running operating system, and so they are non-constant and cannot be used in array
designators. However, the `cosmocc` compiler allows non-constant case labels in
switch statements, enabling the new version.
And updated wamr-test-suites script to add `-j <platform>` option. The spec tests
can be ran via `CC=cosmocc ./test_wamr.sh -j cosmopolitan -t classic-interp`
or `CC=cosmocc ./test_wamr.sh -j cosmopolitan -t fast-interp`.
To make it clearer to users when synchronization behaviour is not
supported, return ENOTSUP when O_RSYNC, O_DSYNC or O_SYNC are
respectively not defined. Linux also doesn't support O_RSYNC despite the
O_RSYNC flag being defined.
Fixed a bug in the processing of the br_table_cache opcode that caused out-of-range
references when the label index was greater than the length of the label.
Avoid the stack traces getting mixed up together when multi-threading is enabled
by using exception_lock/unlock in dumping the call stacks.
And remove duplicated call stack dump in wasm_application.c.
Also update coding guideline CI to fix the clang-format-12 not found issue.
According to the specification,
- fNxM_pmin/max returns v1 or v2 based on flt(v1,v2) result
- fNxM_min/max returns +/-NaN, +/-Inf, v1 or v2 based on more than
flt(v1,v2) result
Fixes issue #2561.
Only when the value kind is LLVMConstantIntValueKind and the value
is not undef and not poison can we extract the value of a constant int.
Fixes#2557 and #2559.
Support muti-module for AOT mode, currently only implement the
multi-module's function import feature for AOT, the memory/table/
global import are not implemented yet.
And update wamr-test-suites scripts, multi-module sample and some
CIs accordingly.
The CI might use clang-17 to build iwasm for Android platform and it may
report compilation error:
https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/6308980430/job/17128073777
/home/runner/work/wasm-micro-runtime/wasm-micro-runtime/core/iwasm/libraries/libc-wasi/sandboxed-system-primitives/src/blocking_op.c:45:19: error: call to undeclared function 'preadv'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
ssize_t ret = preadv(fd, iov, iovcnt, offset);
^
Explicitly declare preadv and pwritev in android platform header file to resolve it.
`wasm_loader_push_pop_frame_offset` may pop n operands by using
`loader_ctx->stack_cell_num` to check whether the operand can be
popped or not. While `loader_ctx->stack_cell_num` is updated in the
later `wasm_loader_push_pop_frame_ref`, the check may fail if the stack
is in polymorphic state and lead to `ctx->frame_offset` underflow.
Fix issue #2577 and #2586.
There doesn't appear to be a clear reason not to support this behavior.
It seems it was disallowed previously as a precaution. See
67e2e57b02
for more context.
Adapt API usage to new interfaces where applicable, including LLVM function
usage, obsoleted llvm::Optional type and removal of unavailable headers.
Know issues:
- AOT static PGO isn't enabled
- LLVM JIT may run failed due to llvm_orc_registerEHFrameSectionWrapper
isn't linked into iwasm
Return a WASI error code (rather than a host POSIX one). In addition,
there is no need to return an error in the case that the provided buffer
is too large.
Unaligned store v128 value to the AOT function argument of the pointer for
the extra return value may cause segmentation fault.
Fix the issue reported in #2556.
The WASI docs allow for fewer rights to be applied to an fd than requested but
not more. This behavior is also asserted in the rust WASI tests, so it's necessary
for those to pass as well.
Send a signal whose handler is no-op to a blocking thread to wake up
the blocking syscall with either EINTR equivalent or partial success.
Unlike the approach taken in the `dev/interrupt_block_insn` branch (that is,
signal + longjmp similarly to `OS_ENABLE_HW_BOUND_CHECK`), this PR
does not use longjmp because:
* longjmp from signal handler doesn't work on nuttx
refer to https://github.com/apache/nuttx/issues/10326
* the singal+longjmp approach may be too difficult for average programmers
who might implement host functions to deal with
See also https://github.com/bytecodealliance/wasm-micro-runtime/issues/1910
Remove thread local attribute of prev_sig_act_SIGSEGV/SIGBUS to allow using
custom signal handler from non-main thread since in a thread spawned by
embedder, embedder may be unable to call wasm_runtime_init_thread_env to
initialize them.
And fix the handling of prev_sig_act when its sa_handler is SIG_DFL, SIG_IGN,
or a user customized handler.
Add API wasm_runtime_terminate to terminate a module instance
by setting "terminated by user" exception to the module instance.
And update the product-mini of posix platforms.
Note: this doesn't work for some situations like blocking system calls.
Preserve errno because this function is often used like
the following. The caller wants to report the error from the main
operation (`lseek` in this example), not from fd_object_release.
```
off_t ret = lseek(fd_number(fo), offset, nwhence);
fd_object_release(fo);
if (ret < 0)
return convert_errno(errno);
```
This fixes a few test cases in wasi-threads testsuite like wasi_threads_return_main_block.
And also move the special handling for "wasi proc exit" to a more appropriate place.
Introduce module instance context APIs which can set one or more contexts created
by the embedder for a wasm module instance:
```C
wasm_runtime_create_context_key
wasm_runtime_destroy_context_key
wasm_runtime_set_context
wasm_runtime_set_context_spread
wasm_runtime_get_context
```
And make libc-wasi use it and set wasi context as the first context bound to the wasm
module instance.
Also add samples.
Refer to https://github.com/bytecodealliance/wasm-micro-runtime/issues/2460.
When embedding WAMR, this PR allows to register a callback that is
invoked when memory.grow fails.
In case of memory allocation failures, some languages allow to handle
the error (e.g. by checking the return code of malloc/calloc in C), some
others (e.g. Rust) just panic.
While wasi proc exit is not a real trap, what the runtime does on it is mostly same as
real traps. That is, kill the siblings threads and represent the exit/trap as the result of
the "process" to the user api. There seems no reason to distinguish it from real traps
here.
Note that:
- The target thread either doesn't care the specific exception type or ignore wasi
proc exit by themselves. (clear_wasi_proc_exit_exception)
- clear_wasi_proc_exit_exception only clears local exception.
Add simple infrastructure to add more unit tests in the future. At the moment tests
are only executed on Linux, but can be extended to other platforms if needed.
Use https://github.com/google/googletest/ as a framework.
As a part of stress-testing we want to ensure that mutex implementation is working
correctly and protecting shared resource to be allocated from other threads when
mutex is locked.
This test covers the most common situations that happen when some program uses
mutexes like locks from various threads, locks from the same thread etc.
When AOT out of bound linear memory access or stack overflow occurs, the call stack of
AOT functions cannot be unwound currently, so from the exception handler, runtime
cannot jump back into the place that calls the AOT function.
We temporarily skip the current instruction and let AOT code continue to run and return
to caller as soon as possible. And use the zydis library the decode the current instruction
to get its size.
And remove using RtlAddFunctionTable to register the AOT functions since it doesn't work
currently.
AOT relocation to aot_func_internal#n is generated by wamrc --bounds-checks=1.
Resolve the issue by applying the relocation in the compilation stage by wamrc and
don't generate these relocations in the AOT file.
Fixes#2471.
We need to apply some bug fixes that were merged to wasi-libc because wasi-sdk-20
is about half a year old.
It is a temporary solution and the code will be removed when wasi-sdk 21 is released.
- Fix windows wamrc link error: aot_generate_tempfile_name undefined.
- Clear windows compile warnings.
- And rename folder `samples/bh_atomic` and `samples/mem_allocator` to
`samples/bh-atomic` and `samples/mem-allocator`.
## Context
Some native libraries may want to explicitly delete an externref object without
waiting for the module instance to be deleted.
In addition, it may want to add a cleanup function.
## Proposed Changes
Implement:
* `wasm_externref_objdel` to explicitly delete an externeref'd object.
* `wasm_externref_set_cleanup` to set a cleanup function that is called when
the externref'd object is deleted.
In macro bh_memcpy_s, bh_memcy_wa and bh_memmove_s, no need to do extra check
for length is zero or not because it was already done inside of the functions called.
- Inherit shared memory from the parent instance, instead of
trying to look it up by the underlying module. The old method
works correctly only when every cluster uses different module.
- Use reference count in WASMMemoryInstance/AOTMemoryInstance
to mark whether the memory is shared or not
- Retire WASMSharedMemNode
- For atomic opcode implementations in the interpreters, use
a global lock for now
- Update the internal API users
(wasi-threads, lib-pthread, wasm_runtime_spawn_thread)
Fixes https://github.com/bytecodealliance/wasm-micro-runtime/issues/1962
- Avoid destroying module instance repeatedly in pthread_exit_wrapper and
wasm_thread_cluster_exit.
- Wait enough time in pthread_join_wrapper for target thread to exit and
destroy its resources.
We need to make a test that runs longer than the tests we had before to check
some problems that might happen after running for some time (e.g. memory
corruption or something else).
Tests were failing because the right permissions were not provided to iwasm.
Also, test failures didn't trigger build failure due to typo - also fixed in this change.
In addition to that, this PR fixes a few issues with the test itself:
* the `server_init_complete` was not reset early enough causing the client to occasionally
assume the server started even though it didn't yet
* set `SO_REUSEADDR` on the server socket so the port can be reused shortly after
closing the previous socket
* defined receive-send-receive sequence from server to make sure server is alive at the
time of sending message
The old method may not work for some cases. This PR iterates over all instructions
in the function, looking for memcpy, memmove and memset instructions, putting
them into a set, and finally expands them into a loop one by one.
And move this LLVM Pass after building the pipe line of pass builder to ensure that
the memcpy/memmove/memset instrinsics are generated before applying the pass.
And return ENOSYS. We do that so we can at least compile the code on CI.
We'll be gradually enabling more and more functions.
Also, enabled `proc_raise()` for windows.
* disable translations of errno codes that aren't defined on Windows
* undef `min()` macro if it is defined to not conflict with the `min()` function we define
* implement `shed_yield` wasi call
* disable some of the features in the config for windows by default
There is no standard `realpath` function in the C/C++ standard libraries for Windows,
use `_fullpath` function instead to get absolute path of a directory.
We have observed a significant performance degradation after merging
https://github.com/bytecodealliance/wasm-micro-runtime/pull/1991
Instead of protecting suspend flags with a mutex, we implement the flags
as atomic variable and only use mutex when atomics are not available
on a given platform.
esp32-s3's instruction memory and data memory can be accessed through mutual mirroring way,
so we define a new feature named as WASM_MEM_DUAL_BUS_MIRROR.
Build wasi-libc library on Windows since libuv may be not supported. This PR is a first step
to make it working, but there's still a number of changes to get it fully working.
Allow to use `cmake -DWAMR_CONFIGURABLE_BOUNDS_CHECKS=1` to
build iwasm, and then run `iwasm --disable-bounds-checks` to disable the
memory access boundary checks.
And add two APIs:
`wasm_runtime_set_bounds_checks` and `wasm_runtime_is_bounds_checks_enabled`
Calling `__wasi_sock_addr_resolve` syscall causes native stack overflow.
Given this is a standard function available in WAMR, we should have at least
the default stack size large enough to handle this case.
The socket tests were updated so they also run in separate thread, but
the simple retro program is:
```C
void *th(void *p)
{
struct addrinfo *res;
getaddrinfo("amazon.com", NULL, NULL, &res);
return NULL;
}
int main(int argc, char **argv)
{
pthread_t pt;
pthread_create(&pt, NULL, th, NULL);
pthread_join(pt, NULL);
return 0;
}
```
## Context
Currently, WAMR supports compiling iwasm with flag `WAMR_BUILD_WASI_NN`.
However, there are scenarios where the user might prefer having it as a shared library.
## Proposed Changes
Decouple wasi-nn context management by internally managing the context given
a module instance reference.
Fix some build errors when building wamrc with LLVM-13, reported in #2311
Fix some build warnings when building wamrc with LLVM-16:
```
core/iwasm/compilation/aot_llvm_extra2.cpp:26:26: warning:
‘llvm::None’ is deprecated: Use std::nullopt instead. [-Wdeprecated-declarations]
26 | return llvm::None;
```
Fix a maybe-uninitialized compile warning:
```
core/iwasm/compilation/aot_llvm.c:413:9: warning:
‘update_top_block’ may be used uninitialized in this function [-Wmaybe-uninitialized]
413 | LLVMPositionBuilderAtEnd(b, update_top_block);
```
## Context
Path to models use `/assets` for testing inside docker. While testing directly from
the repo we are forced to use soft-links or modify the paths.
## Proposed Changes
Use relative path and adjust docker volumes in docs.
Major changes:
- Public headers inside `wasi-nn/include`
- Put cmake files in `cmake` folder
- Make linux iwasm link with `${WASI_NN_LIBS}` so iwasm can enable wasi-nn
This PR attempts to search for the system libuv and use it if found instead of
downloading it. As reported in #1831, this is needed because some tools
build in a sandbox and clear the extra sources.
Move the native stack overflow check from the caller to the callee because the
former doesn't work for call_indirect and imported functions.
Make the stack usage estimation more accurate. Instead of making a guess from
the number of wasm locals in the function, use the LLVM's idea of the stack size
of each MachineFunction. The former is inaccurate because a) it doesn't reflect
optimization passes, and b) wasm locals are not the only reason to use stack.
To use the post-compilation stack usage information without requiring 2-pass
compilation or machine-code imm rewriting, introduce a global array to store
stack consumption of each functions:
For JIT, use a custom IRCompiler with an extra pass to fill the array.
For AOT, use `clang -fstack-usage` equivalent because we support external llc.
Re-implement function call stack usage estimation to reflect the real calling
conventions better. (aot_estimate_stack_usage_for_function_call)
Re-implement stack estimation logic (--enable-memory-profiling) based on the new
machinery.
Discussions: #2105.
Compilation in strict mode fails with
```
wasm_micro_runtime/core/shared/platform/android/platform_init.c:122:30:
error: declaration of 'struct epoll_event` will not be visible outside of this
function [-Werror,-Wvisibility]
epoll_pwait(int epfd, struct epoll_event *events, int maxevents, int timeout,
^
1 error generated.
```
Co-authored-by: Misha Gridnev <gridman@google.com>
Writing GS segment register is not allowed on linux-sgx since it is used as
the base address of thread data in 64-bit hw mode. Reported in #2252.
Disable writing it and disable segue optimization for linux-sgx platform.
LLVM PGO (Profile-Guided Optimization) allows the compiler to better optimize code
for how it actually runs. This PR implements the AOT static PGO, and is tested on
Linux x86-64 and x86-32. The basic steps are:
1. Use `wamrc --enable-llvm-pgo -o <aot_file_of_pgo> <wasm_file>`
to generate an instrumented aot file.
2. Compile iwasm with `cmake -DWAMR_BUILD_STATIC_PGO=1` and run
`iwasm --gen-prof-file=<raw_profile_file> <aot_file_of_pgo>`
to generate the raw profile file.
3. Run `llvm-profdata merge -output=<profile_file> <raw_profile_file>`
to merge the raw profile file into the profile file.
4. Run `wamrc --use-prof-file=<profile_file> -o <aot_file> <wasm_file>`
to generate the optimized aot file.
5. Run the optimized aot_file: `iwasm <aot_file>`.
The test scripts are also added for each benchmark, run `test_pgo.sh` under
each benchmark's folder to test the AOT static pgo.
Segue is an optimization technology which uses x86 segment register to store
the WebAssembly linear memory base address, so as to remove most of the cost
of SFI (Software-based Fault Isolation) base addition and free up a general
purpose register, by this way it may:
- Improve the performance of JIT/AOT
- Reduce the footprint of JIT/AOT, the JIT/AOT code generated is smaller
- Reduce the compilation time of JIT/AOT
This PR uses the x86-64 GS segment register to apply the optimization, currently
it supports linux and linux-sgx platforms on x86-64 target. By default it is disabled,
developer can use the option below to enable it for wamrc and iwasm(with LLVM
JIT enabled):
```bash
wamrc --enable-segue=[<flags>] -o output_file wasm_file
iwasm --enable-segue=[<flags>] wasm_file [args...]
```
`flags` can be:
i32.load, i64.load, f32.load, f64.load, v128.load,
i32.store, i64.store, f32.store, f64.store, v128.store
Use comma to separate them, e.g. `--enable-segue=i32.load,i64.store`,
and `--enable-segue` means all flags are added.
Acknowledgement:
Many thanks to Intel Labs, UC San Diego and UT Austin teams for introducing this
technology and the great support and guidance!
Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
Co-authored-by: Vahldiek-oberwagner, Anjo Lucas <anjo.lucas.vahldiek-oberwagner@intel.com>
Add nightly (UTC time) checks with asan and ubsan, and also put gcc-4.8 build
to nightly run since we don't need to run it with every PR.
Co-authored-by: Maksim Litskevich <makslit@amazon.co.uk>
For some platforms WAMR gets compiled with `CONFIG_HAS_CLOCK_NANOSLEEP=1`,
while `clock_nanosleep` is not present at the platform, which causes compilation error.
Add check for macro `DISABLE_CLOCK_NANOSLEEP` to resolve the issue, only when
the macro isn't defined can the macro `CONFIG_HAS_CLOCK_NANOSLEEP` take effect.
Add VX delegation as an external delegation of TFLite, so that several NPU/GPU
(from VeriSilicon, NXP, Amlogic) can be controlled via WASI-NN.
Test Code can work with the X86 simulator.
Fix issue reported in #2172: wasm-c-api `wasm_func_call` may use a wrong exec_env
when multi-threading is enabled, with error "invalid exec env" reported
Fix issue reported in #2149: main instance's `c_api_func_imports` are not passed to
the counterpart of new thread's instance in wasi-threads mode
Fix issue of invalid size calculated to copy `c_api_func_imports` in pthread mode
And refactor the code to use `wasm_cluster_dup_c_api_imports` to copy the
`c_api_func_imports` to new thread for wasi-threads mode and pthread mode.
Currently, if a thread is spawned and raises an exception after the main thread
has finished, iwasm returns with success instead of returning 1 (i.e. error).
Since wasm_runtime_get_wasi_exit_code waits for all threads to finish and only
returns the wasi exit code, this PR performs the exception check again and
returns error if an exception was raised.
Since the Tensorflow library is already installed in many cases(especially in the
case of the embedded system), move the installation code to find_package.
According to the 1999 ISO C standard (C99), size_t is an unsigned integer type of
at least 16 bit (see sections 7.17 and 7.18.3), it may be uint32 in 32-bit platforms:
https://en.cppreference.com/w/cpp/types/size_t
Calling function `size_t min(size_t, size_t)` with two uint64 arguments may get
invalid result.
Co-authored-by: Georgii Rylov <godjan@amazon.co.uk>
Make `hmu_tree_node` struct packed and add 4 padding bytes before `kfc_tree_root_buf`
field in `gc_heap_struct` struct to ensure the `left/right/parent` fields in `hmu_tree_node`
are 8-byte aligned on the 64-bit target which doesn't support unaligned memory access.
Fix the issue reported in #2136.
- Translate all the opcodes of threads spec proposal for Fast JIT
- Add the atomic flag for Fast JIT load/store IRs to support atomic load/store
- Add new atomic related Fast JIT IRs and translate them in the codegen
- Add suspend_flags check in branch opcodes and before/after call function
- Modify CI to enable Fast JIT multi-threading test
Co-authored-by: TianlongLiang <tianlong.liang@intel.com>
In LLVM AOT/JIT compiler, only need to check the suspend_flags when memory is
a shared memory since the shared memory must be enabled for multi-threading,
so as not to impact the performance in non-multi-threading memory mode. Also
refine the LLVM IRs to check the suspend_flags.
And fix an issue of multi-tier jit for multi-threading, the instance of the child thread
should be removed from the instance list before it is de-instantiated.
In #1928 we added support for GCC 4.8 but we don't continuously test if it's
working. This PR added a GitHub actions job to test compilation on GCC 4.8
for interpreters and Fast JIT (LLVM JIT/AOT might be added in the future).
The compilation is done using ubuntu 14.04 image as that's the simplest way
to get GCC 4.8 compiler. The job only compiles the code but does not run any
tests.
Load memory data size in each time memory access boundary check in
multi-threading mode since it may be changed by other threads when
memory growing.
And use `memory->memory_data_size` instead of
`memory->num_bytes_per_page * memory->cur_page_count` to refine
the code.
When ref.func opcode refers to a function whose function index no smaller than
current function, the destination func should be forward-declared: it is declared
in the table element segments, or is declared in the export list.
In multi-threading, this line will eventually call `wasm_cluster_wait_for_all_except_self`:
`DEINIT_VEC(store->instances, wasm_instance_vec_delete)`
As the threads are joining they can call `wasm_interp_dump_call_stack` which tries to
use the module frames but they were already freed by this line:
`DEINIT_VEC(store->modules, wasm_module_vec_delete)`
This PR swaps the order that these are deleted so module is deleted after the instances.
Co-authored-by: Andrew Chambers <ncham@amazon.com>
Try using existing exec_env to execute wasm app's malloc/free func and
execute post instantiation functions. Create a new exec_env only when
no existing exec_env was found.
In some cases, the memory address of some variables may have 4 least significant
bytes set to zero. Because we cast the pointer to int, we look only at 4 least
significant bytes; the assertion may fail because 4 least significant bytes are 0.
Change bh_assert implementation to cast the assert expr to int64_t and it works
well with 64-bit architectures.
POLLRDNORM/POLLWRNORM may be not defined in uClibc, so replace them
with the equivalent POLLIN/POLLOUT.
Refer to https://www.man7.org/linux/man-pages/man2/poll.2.html
POLLRDNORM Equivalent to POLLIN
POLLWRNORM Equivalent to POLLOUT
Signed-off-by: Thomas Devoogdt <thomas.devoogdt@barco.com>
Update wasi-libc version to resolve the hang issue when running wasi-threads cases.
Implement custom sync primitives as a counterpart of `pthread_barrier_wait` to
attempt to replace pthread sync primitives since they seem to cause data races
when running with the thread sanitizer.
Use pre-created exec_env for instantiation and module_malloc/free,
use the same exec_env of the current thread to avoid potential
unexpected behavior.
And remove unnecessary shared_mem_lock in wasm_module_free,
which may cause dead lock.
Use the shared memory's shared_mem_lock to lock the whole atomic.wait and
atomic.notify processes, and use it for os_cond_reltimedwait and os_cond_notify,
so as to make the whole processes actual atomic operations:
the original implementation accesses the wait address with shared_mem_lock
and uses wait_node->wait_lock for os_cond_reltimedwait, which is not an atomic
operation.
And remove the unnecessary wait_map_lock and wait_lock, since the whole
processes are already locked by shared_mem_lock.
`wasi-sdk-20` pre-release can be used to avoid building `wasi-libc` to enable threads.
It's not possible to use `wasi-sdk-20` pre-release on Ubuntu 20.04 because of
incompatibility with the glibc version:
```bash
/opt/wasi-sdk/bin/clang: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found
(required by /opt/wasi-sdk/bin/clang)
```
- Remove notify_stale_threads_on_exception and change atomic.wait
to be interruptible by keep waiting and checking every one second,
like the implementation of poll_oneoff in libc-wasi
- Wait all other threads exit and then get wasi exit_code to avoid
getting invalid value
- Inherit suspend_flags of parent thread while creating new thread to
avoid terminated flag isn't set for new thread
- Fix wasi-threads test case update_shared_data_and_alloc_heap
- Add "Lib wasi-threads enabled" prompt for cmake
- Fix aot get exception, use aot_copy_exception instead
Fix a data race for test main_proc_exit_wait.c from #1963.
And fix atomic_wait logic that was wrong before:
- a thread 1 started executing wasm instruction wasm_atomic_wait
but hasn't reached waiting on condition variable
- a main thread calls proc_exit and notifies all the threads that reached
waiting on condition variable
Which leads to thread 1 hang on waiting on condition variable after that
Now it's atomically checked whether proc_exit was already called.
In the WASI thread test modified in this PR, malloc was used in multiple threads
without a lock. But wasi-libc implementation of malloc is not thread-safe.
Remove restrictions:
- Only 1 WASM app at a time
- Only 1 model at a time
- `graph` and `graph-execution-context` are ignored
Refer to previous document:
e8d718096d/core/iwasm/libraries/wasi-nn/README.md
- Implement atomic.fence to ensure a proper memory synchronization order
- Destroy exec_env_singleton first in wasm/aot deinstantiation
- Change terminate other threads to wait for other threads in
wasm_exec_env_destroy
- Fix detach thread in thread_manager_start_routine
- Fix duplicated lock cluster->lock in wasm_cluster_cancel_thread
- Add lib-pthread and lib-wasi-threads compilation to Windows CI
The function always specified IPv4 socklen to sockaddr_to_bh_sockaddr(),
therefore the assertion was failing; however, sockaddr_to_bh_sockaddr()
never actually used socklen parameter, so we deleted it completely.
In wasm_cluster_create_thread, the new_exec_env is added into the cluster's
exec_env list before the thread is created, so other threads can access the
fields of new_exec_env once the cluster->lock is unlocked, while the
new_exec_env's handle is set later inside the thread routine. This may result
in the new_exec_env's handle be invalidly accessed by other threads.
- CMakeLists.txt: add lib_export.h to install list
- Fast JIT: enlarge spill cache size to enable several standalone cases
when hw bound check is disabled
- Thread manager: wasm_cluster_exit_thread may destroy an invalid
exec_env->module_inst when exec_env was destroyed before
- samples/socket-api: fix failure to run timeout_client.wasm
- enhance CI build wasi-libc and sample/wasm-c-api-imports CMakeLlist.txt
Support collecting code coverage with wamr-test-suites script by using
lcov and genhtml tools, eg.:
cd tests/wamr-test-suites
./test_wamr.sh -s spec -b -P -C
The default code coverage and html files are generated at:
tests/wamr-test-suites/workspace/wamr.lcov
tests/wamr-test-suites/workspace/wamr-lcov.zip
And update wamr-test-suites scripts to support testing GC spec cases to
avoid frequent synchronization conflicts between branch main and dev/gc.
Raising "wasi proc exit" exception, spreading it to other threads and then
clearing it in all threads may result in unexpected behavior: the sub thread
may end first, handle the "wasi proc exit" exception and clear exceptions
of other threads, including the main thread. And when main thread's
exception is cleared, it may continue to run and throw "unreachable"
exception. This also leads to some assertion failed.
Ignore exception spreading for "wasi proc exit" and don't clear exception
of other threads to resolve the issue.
And add suspend flag check after atomic wait since the atomic wait may
be notified by other thread when exception occurs.
Fix issues in the libc-wasi `poll_oneoff` when thread manager is enabled:
- The exception of a thread may be cleared when other thread runs into
`proc_exit` and then calls `clear_wasi_proc_exit_exception`, so should not
use `wasm_runtime_get_exception` to check whether an exception was
thrown, use `wasm_cluster_is_thread_terminated` instead
- We divided one time poll_oneoff into many times poll_oneoff to check
the exception to avoid long time waiting in previous PR, but if all events
returned by one time poll are all waiting events, we need to continue to
wait but not return directly.
Follow-up on #1951. Tested with multiple timeout values, with and without
interruption and measured the time spent sleeping.
In the previous code, the `*port` is assigned before `getsockname`, so the caller
may be not able to get the actual port number assigned by system.
Move the assigning of `*port` to be after `getsockname` to resolve the issue.
- Use execute_post_instantiate_functions to call start, _initialize,
__post_instantiate, __wasm_call_ctors functions after instantiation
- Always call start function for both main instance and sub instance
- Only call _initialize and __post_instantiate for main instance
- Only call ___wasm_call_ctors for main instance and when bulk memory
is enabled and wasi import functions are not found
- When hw bound check is enabled, use the existing exec_env_tls
to call func for sub instance, and switch exec_env_tls's module inst
to current module inst to avoid checking failure and using the wrong
module inst
Add shared memory lock when accessing the address to atomic wait/notify
inside linear memory to resolve its data race issue.
And statically initialize the goto table of interpreter labels to resolve the
data race issue of accessing the table.
The problem was found by a `Golang + WAMR (as CGO)` wrapped by EGO
in SGX Enclave.
`fstat()` in EGO returns dummy values:
- EGO uses a `mount` configuration to define the mount points that apply
the host file system presented to the Encalve.
- EGO has a different programming model: the entire application runs inside
the enclave. Manual ECALLs/OCALLs by application code are neither
required nor possible.
Add platform ego and add macro control for the return value checking of
`fd_determine_type_rights` in libc-wasi to resolve the issue.
The function has been there for long. While what it does look a bit unsafe
as it calls a function which may be not wasm-wise exported explicitly, it's
useful and widely used when implementing callback-taking APIs, including
our pthread_create's implementation.
Destroy child thread's exec_env before destroying its module instance and
add the process into cluster's lock to avoid possible data race: if exec_env
is removed from custer's exec_env_list and destroyed later, the main thread
may not wait it and start to destroy the wasm runtime, and the destroying
of the sub thread's exec_env may free or overread/written an destroyed or
re-initialized resource.
And fix an issue in wasm_cluster_cancel_thread.
The start/initialize functions of wasi module are to do some initialization work
during instantiation, which should be only called one time in the instantiation
of main instance. For example, they may initialize the data in linear memory,
if the data is changed later by the main instance, and re-initialized again by
the child instance, unexpected behaviors may occur.
And clear a shadow warning in classic interpreter.
Multiple threads generated from the same module should use the same
lock to protect the atomic operations.
Before this PR, each thread used a different lock to protect atomic
operations (e.g. atomic add), making the lock ineffective.
Fix#1958.
Add APIs to help prepare the imports for the wasm-c-api `wasm_instance_new`:
- wasm_importtype_is_linked
- wasm_runtime_is_import_func_linked
- wasm_runtime_is_import_global_linked
- wasm_extern_new_empty
For wasm-c-api, developer may use `wasm_module_imports` to get the import
types info, check whether an import func/global is linked with the above API,
and ignore the linking of an import func/global with `wasm_extern_new_empty`.
Sample `wasm-c-api-import` is added and document is updated.
In the esp-idf platform, Xtensa GCC 8.4.0 reported incompatible pointer warnings when
building with the lwip component.
Berkeley (POSIX) sockets uses composition in combination with type punning to handle
many protocol families, including IPv4 & IPv6. The type punning just has to be made
explicit with pointer casts from `sockaddr_in` for IPv4 to the generic `sockaddr`.
When de-instantiating the wasm module instance, remove it from the module's
instance list before freeing func_ptrs and fast_jit_func_ptrs of the instance, to avoid
accessing these freed memory in the JIT backend compilation threads.
Enable setting running mode when executing a wasm bytecode file
- Four running modes are supported: interpreter, fast-jit, llvm-jit and multi-tier-jit
- Add APIs to set/get the default running mode of the runtime
- Add APIs to set/get the running mode of a wasm module instance
- Add running mode options for iwasm command line tool
And add size/opt level options for LLVM JIT
The definitions `enum WASMExceptionID` in the compilation of wamrc and the compilation
of Fast JIT are different, since the latter enables the Fast JIT macro while the former doesn't.
This causes that the exception ID in AOT file generated by wamrc may be different from
iwasm binary compiled with Fast JIT enabled, and may result in unexpected behavior.
Remove the macro control to resolve it.
Change an error to warning when checking wasi abi compatibility in loader, for rust case below:
#[no_mangle]
pub extern "C" fn main() {
println!("foo");
}
compile it with `cargo build --target wasm32-wasi`, a wasm file is generated with wasi apis imported
and a "void main(void)" function exported.
Other runtime e.g. wasmtime allows to load it and execute the main function with `--invoke` option.
- Split logic in several dockers
- runtime: wasi-nn-cpu and wasi-nn- Nvidia-gpu.
- compilation: wasi-nn-compile. Prepare the testing wasm and generates the TFLites.
- Implement GPU support for TFLite with Opencl.
The current implementation throws a segmentation fault when padding
files using a large range, because the writing operation overflows the
source buffer, which was a single char.
IPFS previously assumed that the offset for the seek operation was related
to the start of the file (SEEK_SET). It now correctly checks the parameter
'whence' and computes the offset for SEEK_CUR (middle of the file) and
SEEK_END (end of the file).
- Reorganize the library structure
- Use the latest version of `wasi-nn` wit (Oct 25, 2022):
0f77c48ec1/wasi-nn.wit.md
- Split logic that converts WASM structs to native structs in a separate file
- Simplify addition of new frameworks
Add more types for attr_container, e.g. uint8, uint32, uint64
Add more APIs for attr_container for int8, int16 and int32 types
Rename fields of the union 'jvalue' and refactor some files that use attr_container
This syscall doesn't need allocating stack or TLS and it's expected from the application
to do that instead. E.g. WASI-libc already does this for `pthread_create`.
Also fix some of the examples to allocate memory for stack and not use stack before
the stack pointer is set to a correct value.
Because stack grows from high address towards low address, the value
returned by malloc is the end of the stack, not top of the stack. The top
of the stack is the end of the allocated space (i.e. address returned by
malloc + cluster size).
Refer to #1790.
The original CI didn't actually run wasi test suite for x86-32 since the `TEST_ON_X86_32=true`
isn't written into $GITHUB_ENV.
And refine the error output when failed to link import global.
Add thread_wait_list_end node for thread data and cond for Windows platform
to speedup the thread join and cond wait operations: no need to traverse the
wait list to get the end node to append the wait node.
According to the [WASI thread specification](https://github.com/WebAssembly/wasi-threads/pull/16),
some thread identifiers are reserved and should not be used. In fact, only IDs between `1` and
`0x1FFFFFFF` are valid.
The thread ID allocator has been moved to a separate class to avoid polluting the
`lib_wasi_threads_wrapper` logic.
Should use import_function_count but not import_count to calculate
the func_index in handle_name_section when custom name section
feature is enabled.
And clear the compile warnings of mini loader.
Support modes:
- run a commander module only
- run a reactor module only
- run a commander module and a/multiple reactor modules together
commander propagates WASIArguments to reactors
Implement 2-level Multi-tier JIT engine: tier-up from Fast JIT to LLVM JIT to
get quick cold startup by Fast JIT and better performance by gradually
switching to LLVM JIT when the LLVM JIT functions are compiled by the
backend threads.
Refer to:
https://github.com/bytecodealliance/wasm-micro-runtime/issues/1302
Allow to add watchpoints to variables for source debugging. For instance:
`breakpoint set variable var`
will pause WAMR execution when the address at var is written to.
Can also set read/write watchpoints by passing r/w flags. This will pause
execution when the address at var is read:
`watchpoint set variable -w read var`
Add two linked lists for read/write watchpoints. When the debug message
handler receives a watchpoint request, it adds/removes to one/both of these
lists. In the interpreter, when an address is read or stored to, check whether
the address is in these lists. If so, throw a sigtrap and suspend the process.
When a wasm module is duplicated instantiated with wasm_instance_new,
the function import info of the previous instantiation may be overwritten by
the later instantiation, which may cause unexpected behavior.
Store the function import info into the module instance to fix the issue.
This PR allows reusing thread ids once they are released. That is done by using
a stack data structure to keep track of the used ids.
When a thread is created, it takes an available identifier from the stack. When
the thread exits, it returns the id to the stack of available identifiers.
Implement 2-level Multi-tier JIT engine: tier-up from Fast JIT to LLVM JIT to
get quick cold startup by Fast JIT and better performance by gradually
switching to LLVM JIT when the LLVM JIT functions are compiled by the
backend threads.
Refer to:
https://github.com/bytecodealliance/wasm-micro-runtime/issues/1302
For now this implementation uses thread manager.
Not sure whether thread manager is needed in that case. In the future there'll be likely another syscall added (for pthread_exit) and for that we might need some kind of thread management - with that in mind, we keep thread manager for now and will refactor this later if needed.
Allow to add watchpoints to variables for source debugging. For instance:
`breakpoint set variable var`
will pause WAMR execution when the address at var is written to.
Can also set read/write watchpoints by passing r/w flags. This will pause
execution when the address at var is read:
`watchpoint set variable -w read var`
Add two linked lists for read/write watchpoints. When the debug message
handler receives a watchpoint request, it adds/removes to one/both of these
lists. In the interpreter, when an address is read or stored to, check whether
the address is in these lists. If so, throw a sigtrap and suspend the process.
When a wasm module is duplicated instantiated with wasm_instance_new,
the function import info of the previous instantiation may be overwritten by
the later instantiation, which may cause unexpected behavior.
Store the function import info into the module instance to fix the issue.
Use sha256 to hash binary file content. If the incoming wasm binary is
cached before, wasm_module_new() simply returns the existed one.
Use -DWAMR_BUILD_WASM_CACHE=0/1 to control the feature.
OpenSSL 1.1.1 is required if the feature is enabled.
Record the store number of current thread with struct thread_local_stores
or tls thread_local_stores_num to fix the issue:
- Only call wasm_runtime_init_thread_env() in the first wasm_store_new of
current thread
- Only call wasm_runtime_destroy_thread_env() in the last wasm_store_delete
of current thread
And remove the unused store list in the engine.
Refine AOT exception check in the caller when returning from callee function,
remove the exception check instructions when hw bound check is enabled to
improve the performance: create guard page to trigger signal handler when
exception occurs.
Add an option to pass user data to the allocator functions. It is common to
do this so that the host embedder can pass a struct as user data and access
that struct from the allocator, which gives the host embedder the ability to
do things such as track allocation statistics within the allocator.
Compile with `cmake -DWASM_MEM_ALLOC_WITH_USER_DATA=1` to enable
the option, and the allocator functions provided by the host embedder should
be like below (an extra argument `data` is added):
void *malloc(void *data, uint32 size) { .. }
void *realloc(void *data, uint32 size) { .. }
void free(void *data, void *ptr) { .. }
Signed-off-by: Andrew Chambers <ncham@amazon.com>
Change main thread hangs when encounter debugger encounters error to
main thread exits when debugger encounters error
Change main thread blocks when debugger detaches to
main thread continues executing when debugger detaches, and main thread
exits normally when finishing executing
Create trap for error message when wasm_instance_new fails:
- Similar to [this PR](https://github.com/bytecodealliance/wasm-micro-runtime/pull/1526),
but create a wasm_trap_t to output the error msg instead of adding error_buf to the API.
- Trap will need to be deleted by the caller but is not a breaking change as it is only
created if trap is not NULL.
- Add error messages for all failure cases here, try to make them accurate but welcome
feedback for improvements.
Signed-off-by: Andrew Chambers <ncham@amazon.com>
Current SGX lib-rats wasm module hash is stored in a global buffer,
which may be overwritten if there are multiple wasm module loadings.
We move the module hash into the enclave module to resolve the issue.
And rename the SGX_IPFS macro/variable in Makefile and Enclave.edl to
make the code more consistent.
And refine the sgx-ra sample document.
Limit max_stack_cell_num/max_csp_num to be no larger than UINT16_MAX,
and don't check all_cell_num in interpreter again.
And refine some codes in interpreter.
The current implementation of remote attestation does not take into
account the integrity of the wasm module. The SHA256 of the wasm
module has been put into user_data to generate the quote, and more
parameters are exposed for further verification.
Update build wasm app document, add how to set buildflags for Rust
project to reduce the footprint.
Clear Windows warnings and a shadow warning in aot_emit_numberic.c
Refine the generated LLVM IRs at the beginning of each LLVM AOT/JIT function
to fasten the LLVM IR optimization:
- Only create argv_buf if there are func calls in this function
- Only create native stack bound if stack bound check is enabled
- Only create aux stack info if there is opcode set_global_aux_stack
- Only create native symbol if indirect_mode is enabled
- Only create memory info if there are memory operations
- Only create func_type_indexes if there is opcode call_indirect
A limitation of the current implementation of SGX IPFS in WAMR is that
it prevents to open files which are not in the current directory.
This restriction is lifted and can now open files in paths, similarly to the
WASI openat call, which takes into account the sandbox of the file system.
Add a new options to control the native stack hw bound check feature:
- Besides the original option `cmake -DWAMR_DISABLE_HW_BOUND_CHECK=1/0`,
add a new option `cmake -DWAMR_DISABLE_STACK_HW_BOUND_CHECK=1/0`
- When the linear memory hw bound check is disabled, the stack hw bound check
will be disabled automatically, no matter what the input option is
- When the linear memory hw bound check is enabled, the stack hw bound check
is enabled/disabled according to the value of input option
- Besides the original option `--bounds-checks=1/0`, add a new option
`--stack-bounds-checks=1/0` for wamrc
Refer to: https://github.com/bytecodealliance/wasm-micro-runtime/issues/1677
Support to get/set recv_buf_size/send_buf_size/reuse_port/reuse_addr for wasm app
Add socket APIs for esp-idf platform
Add setsockopt for linux-sgx platform
Allow to wait for a new debugger connection once the previous one
is disconnected:
- when receiving a detach command
- when the client socket is closed (for example, lldb process is killed)
Currently we initialize and destroy LLVM environment in aot_create_comp_context
and aot_destroy_comp_context, which are called in wasm_module_load/unload,
and the latter may be invoked multiple times, which leads to duplicated LLVM
initialization/destroy and may result in unexpected behaviors.
Move the LLVM init/destroy into runtime init/destroy to resolve the issue.
Allow to have multiple stores in an engine and multiple instances
in a store. Letting a wasm_function_t pass its wasm_store_t to make
it more efficient.
Add macro WASM_ENABLE_WORD_ALING_READ to enable reading
1/2/4 and n bytes data from vram buffer, which requires 4-byte addr
alignment reading.
Eliminate XIP AOT relocations related to the below ones:
i32_div_u, f32_min, f32_max, f32_ceil, f32_floor, f32_trunc, f32_rint
Change wasm-c-api default log level to output less logs by default:
- For debug mode, change log level from 5 to 4
- For release mode, change log level from 3 to 2
The host embedder may new/delete wasm-c-api engine simultaneously
in multiple threads, which requires lock for the operations. Since there
isn't one time called global init/destroy APIs provided by wasm-c-api,
we define a global lock and initialize it with thread mutex initializer if
the platform supports that, and use it to lock the operations of engine.
If the platform doesn't support thread mutex initializer, we require
developer to create the lock by himself to ensure the thread-safe of the
engine operations.
Allow to unregister (or unload) the previously registered native libs,
so that no need to restart the whole engine by using
`wasm_runtime_destroy/wasm_runtime_init`.
Use the cmake variable `WAMR_BUILD_GLOBAL_HEAP_POOL` and
`WAMR_BUILD_GLOBAL_HEAP_SIZE` to enable/disable the global heap pool
and set its size. And set the default global heap size in core/config.h and
the cmake files.
As a result, the developers who build iwasm can easily enable/disable the
global heap pool and change its size regardless of the iwasm implementation,
without manually finding and patching the right location for that value.
The general optimizations may create some intrinsic function calls
like llvm.memset, so we move indirect mode optimization after them
to remove these function calls at last.
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
Some offsets can be directly gotten at the compilation stage after the interp/AOT
module instance refactoring PR was merged, so as to reduce some unnecessary
load instructions and improve the Fast JIT performance:
- Access fields of wasm memory instance structure
- Access fields of wasm table instance structure
- Access the global data
Translate call_indirect opcode by calling wasm functions with Fast JIT IRs instead of
calling jit_call_indirect runtime API, so as to improve the performance.
Translate call native function process with Fast JIT IRs to validate each pointer argument
and convert it into native address, and then call the native function directly instead
of calling jit_invoke_native runtime API, so as to improve the performance.
Refactor LLVM JIT for some purposes:
- To simplify the source code of JIT compilation
- To simplify the JIT modes
- To align with LLVM latest changes
- To prepare for the Multi-tier JIT compilation, refer to #1302
The changes mainly include:
- Remove the MCJIT mode, replace it with ORC JIT eager mode
- Remove the LLVM legacy pass manager (only keep the LLVM new pass manager)
- Change the lazy mode's LLVM module/function binding:
change each function in an individual LLVM module into all functions in a single LLVM module
- Upgraded ORC JIT to ORCv2 JIT to enable lazy compilation
Refer to #1468
With hardware boundary checking enabled, the app heap memory comes from `os_mmap()`.
Clearing the whole heap in the memory allocator causes process RSS to reach maximum
app heap size immediately and wastes lots of memory, so we had better remove the
unnecessary memory clean operations in the memory allocator.
Refactor the layout of interpreter and AOT module instance:
- Unify the interp/AOT module instance, use the same WASMModuleInstance/
WASMMemoryInstance/WASMTableInstance data structures for both interpreter
and AOT
- Make the offset of most fields the same in module instance for both interpreter
and AOT, append memory instance structure, global data and table instances to
the end of module instance for interpreter mode (like AOT mode)
- For extra fields in WASM module instance, use WASMModuleInstanceExtra to
create a field `e` for interpreter
- Change the LLVM JIT module instance creating process, LLVM JIT uses the WASM
module and module instance same as interpreter/Fast-JIT mode. So that Fast JIT
and LLVM JIT can access the same data structures, and make it possible to
implement the Multi-tier JIT (tier-up from Fast JIT to LLVM JIT) in the future
- Unify some APIs: merge some APIs for module instance and memory instance's
related operations (only implement one copy)
Note that the AOT ABI is same, the AOT file format, AOT relocation types, how AOT
code accesses the AOT module instance and so on are kept unchanged.
Refer to:
https://github.com/bytecodealliance/wasm-micro-runtime/issues/1384