Support getting global type from `wasm_runtime_get_import_type` and
`wasm_runtime_get_export_type`, and add two APIs:
```C
wasm_valkind_t
wasm_global_type_get_valkind(const wasm_global_type_t global_type);
bool
wasm_global_type_get_mutable(const wasm_global_type_t global_type);
```
If there is no else branch, make a virtual else opcode for easier integrity
check and to copy the correct results to the block return address for
fast-interp mode: change if block from `if ... end` to `if ... else end`.
Reported in issue #3386, #3387, #3388.
In classic interpreter, fast interpreter and fast-jit running modes, set the local
variables' default value to NULL_REF (0xFFFFFFFF) rather than 0 if they are type
of externref or funcref.
The issue was reported in #3390 and #3391.
Fix aot debugger compilation error on windows as reported in #3184.
And update the stack size configuration for product-mini zephyr sample
since the native stack overflow check was enhanced and the zephyr-sdk
was also upgraded.
- Add a few API (https://github.com/bytecodealliance/wasm-micro-runtime/issues/3325)
```c
wasm_runtime_detect_native_stack_overflow_size
wasm_runtime_detect_native_stack_overflow
```
- Adapt the runtime to use them
- Adapt samples/native-stack-overflow to use them
- Add a few missing overflow checks in the interpreters
- Build and run the sample on the CI
Fix the integer overflow issue when checking target branch depth in opcode
br_table, and fix is_32bit_type not check VALUE_TYPE_ANY issue, which may
cause wasm_loader_push_frame_offset push extra unneeded offset.
Add WASI support for esp-idf platform:
1. add Kconfig and cmake scripts
2. add API "openat" when using littlefs
3. add clock/rwlock/file/socket OS adapter
The old value (1KB) doesn't seem sufficient for many cases.
I suspect that the new value is still not sufficient for some cases.
But it's far safer than the old value.
Consider if the classic interpreter loop (2600 bytes) calls
host snprintf. (2000 bytes)
Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/3314
Fix wasm loader integrity checks for opcode ref.func and opcode else:
for opcode ref.func, the function must be an import, exported, or present in a
table elem segment or global initializer to be used as the operand to ref.func,
for opcode else, there must not be an else opcode previously.
Reported in #3336 and #3337.
And fix mini loader PUSH_MEM_OFFSET/POP_MEM_OFFSET macro
definitions due to the introducing of memory64 feature.
This PR fixes a readir for posix. readdir is not working correctly in rust.
The current WAMR's readdir implementation for posix is, if readdir returns 0,
it will exit with an error. But posix readdir returns 0 at the end of the directory.
To handle this correctly, if readdir returns 0, it should only raise an error if
errno has changed. We can reproduce it with the following rust code:
```rust
use std::fs;
fn main() {
let entries = fs::read_dir(".").unwrap();
for entry in entries {
println!("read_dir:{:?}", entry);
}
}
```
Some issues are related with memory fragmentation, which may cause
the linear memory cannot be allocated. In WAMR, the memory managed
by the system is often trivial, but linear memory usually directly allocates
a large block and often remains unchanged for a long time. Their sensitivity
and contribution to fragmentation are different, which is suitable for
different allocation strategies. If we can control the linear memory's allocation,
do not make it from system heap, the overhead of heap management might
be avoided.
Add `mem_alloc_usage_t usage` as the first argument for user defined
malloc/realloc/free functions when `WAMR_BUILD_ALLOC_WITH_USAGE` cmake
variable is set as 1, and make passing `Alloc_For_LinearMemory` to the
argument when allocating the linear memory.
Enhance the GC subtyping checks:
- Fix issues in the type equivalence check
- Enable the recursive type subtyping check
- Add a equivalence type flag in defined types of aot file, if there is an
equivalence type before, just set it true and re-use the previous type
- Normalize the defined types for interpreter and AOT
- Enable spec test case type-equivalence.wast and type-subtyping.wast,
and enable some commented cases
- Enable set WAMR_BUILD_SANITIZER from cmake variable
`posix_fadvise()` returns 0 on success and the errno on error. This
commit fixes the handling of the return value such that it does not
always succeeds.
Fixes#3322.
1. Fix API "futimens" and "utimensat" compiling error in different esp-idf version
2. Update component registry description file
ps. refer to PR #3296 on branch release/1.3x
For use with WAMR_BUILD_LIB_PTHREAD, add os_thread_detach,
os_thread_exit, os_cond_broadcast.
Signed-off-by: Maxim Kolchurin <maxim.kolchurin@gmail.com>
The current frame was freed before tail calling to an import or native function
and the prev_frame was set as exec_env's cur_frame, so after the tail calling,
we should recover context from prev_frame but not current frame.
Found in https://github.com/bytecodealliance/wasm-micro-runtime/issues/3279.
When thread manager is enabled, the aux stack of exec_env may be allocated
by wasm_cluster_allocate_aux_stack or disabled by setting aux_stack_bottom
as UINTPTR_MAX directly. For the latter, no need to free it.
And fix an issue when paring `--gc-heap-size=n` argument for iwasm, and
fix a variable shadowed warning in fast-jit.
- Add new API wasm_runtime_load_ex() in wasm_export.h
and wasm_module_new_ex in wasm_c_api.h
- Put aot_create_perf_map() into a separated file aot_perf_map.c
- In perf.map, function names include user specified module name
- Enhance the script to help flamegraph generations
Fix the warnings and issues reported:
- in Windows platform
- by CodeQL static code analyzing
- by Coverity static code analyzing
And update CodeQL script to build exception handling and memory features.
This reverts commit 0e8d949440.
Because it doesn't make much sense anymore after we disabled debug info
processing on C++ functions in:
"aot debug: process lldb_function_to_function_dbi only for C".
Adding a new cmake flag (cache variable) `WAMR_BUILD_MEMORY64` to enable
the memory64 feature, it can only be enabled on the 64-bit platform/target and
can only use software boundary check. And when it is enabled, it can support both
i32 and i64 linear memory types. The main modifications are:
- wasm loader & mini-loader: loading and bytecode validating process
- wasm runtime: memory instantiating process
- classic-interpreter: wasm code executing process
- Support memory64 memory in related runtime APIs
- Modify main function type check when it's memory64 wasm file
- Modify `wasm_runtime_invoke_native` and `wasm_runtime_invoke_native_raw` to
handle registered native function pointer argument when memory64 is enabled
- memory64 classic-interpreter spec test in `test_wamr.sh` and in CI
Currently, it supports memory64 memory wasm file that uses core spec
(including bulk memory proposal) opcodes and threads opcodes.
ps.
https://github.com/bytecodealliance/wasm-micro-runtime/issues/3091https://github.com/bytecodealliance/wasm-micro-runtime/pull/3240https://github.com/bytecodealliance/wasm-micro-runtime/pull/3260
The PR #3259 reverted PR #3192, it fixes#3210 but makes #3170 failed again.
The workaround is that we should update `ctx->dynamic_offset` only for opcode br
and should not update it for opcode br_if. This PR fixes both issue #3170 and #3210.
Some environment may call wasm_runtime_full_init/wasm_runtime_init multiple
times without knowing that runtime is initialized or not, it is better to add lock
and increase reference count during initialization.
ps. https://github.com/bytecodealliance/wasm-micro-runtime/discussions/3253.
Add cmake variable `-DWAMR_BUILD_AOT_INTRINSICS=1/0` to enable/disable
the aot intrinsic functions, which are normally used by AOT XIP feature, and
can be disabled to reduce the aot runtime binary size.
And refactor the code in aot_intrinsics.h/.c.
Should not update `ctx->dynamic_offset` in emit_br_info, since the `Part e` only
sets the dst offsets, the operand stack should not be changed, e.g., the stack
operands are to be used by the opcodes followed by `br_if` opcode.
Reported in https://github.com/bytecodealliance/wasm-micro-runtime/issues/3210.
This PR fixes the random failing test case `nofollow_errors` mentioned in
https://github.com/bytecodealliance/wasm-micro-runtime/issues/3222
```C
// dirfd: This is the file descriptor of the directory relative to which the pathname is interpreted.
int openat(int dirfd, const char *pathname, int flags, ...);
```
The value should be a directory handle instead of a file handle (which is always -1 in this context)
returned from `openat`.
The symbols in windows 32-bit may start with '_' and can not be found
when resolving the relocations to them. This PR ignores the underscore
when handling the relocation name of AOT_FUNC_INTERNAL_PREFIX, and
redirect the relocation with name "_aot_stack_sizes" to the relocation with
name ".aot_stack_sizes" (the name of the data section created).
ps.
https://github.com/bytecodealliance/wasm-micro-runtime/issues/3216
- Merge unused field `used_to_be_wasi_ctx` in `AOTModuleInstance` into `reserved` area
- Add field `memory_lock` in `WASMMemoryInstance` for future refactor
- Go binding: fix type error
https://github.com/bytecodealliance/wasm-micro-runtime/issues/3220
- Python binding:
type annotation uses the union operator "|", which requires Python version >=3.10
This allows to know the beginning of the wasm address space. At the moment
to achieve that, we need to apply a `hack wasm_runtime_addr_app_to_native(X)-X`
to get the beginning of WASM memory in the nativ code, but I don't see a good
reason why not to allow zero address as a parameter value for this function.
This PR adds a max_memory_pages parameter to module instantiation APIs,
to allow overriding the max memory defined in the WASM module.
Sticking to the max memory defined in the module is quite limiting when
using shared memory in production. If targeted devices have different
memory constraints, many wasm files have to be generated with different
max memory values. And device constraints may not be known in advance.
Being able to set the max memory value during module instantiation allows
to reuse the same wasm module, e.g. by retrying instantiation with different
max memory value.
If the language is not specified, CMake will try to find C++ compiler, even
though it is not really needed in that case (as the project is only written in C).
Now that the filesystem implementation is now complete, the previous
test filters on Windows can be removed. Some of the tests only pass when
certain environment variables have been set on Windows so an extra step
has been added in the wasi test runner script to modify the test config
files before the tests begin.
The stack profiler `aot_func#xxx` calls the wrapped function of `aot_func_internal#xxx`
by using symbol reference, but in some platform like xtensa, it’s translated into a native
long call, which needs to resolve the indirect address by relocation and breaks the XIP
feature which requires the eliminating of relocation.
The solution is to change the symbol reference into an indirect call through the lookup
table, the code will be like this:
```llvm
call_wrapped_func: ; preds = %stack_bound_check_block
%func_addr1 = getelementptr inbounds ptr, ptr %func_ptrs_ptr, i32 75
%func_tmp2 = load ptr, ptr %func_addr1, align 4
tail call void %func_tmp2(ptr %exec_env)
ret void
```
Fix the errors reported in the sanitizer test of nightly run CI.
When the stack is in polymorphic state, the stack operands may be changed
after pop and push operations (e.g. stack is empty but pop op can succeed
in polymorphic, and the push op can push a new operand to stack), this may
impact the following checks to other target blocks of the br_table opcode.
Use math functions only with `CONFIG_MINIMAL_LIBC=y`.
`CONFIG_PICOLIBC=y` or `CONFIG_NEWLIB_LIBC=y` provides math functions
that are used by wasm, and compilation fails when they are selected.
Signed-off-by: Maxim Kolchurin <maxim.kolchurin@gmail.com>
When running AOT code in Zephyr on STM32H743VIT6 without
CONFIG_CACHE_MANAGEMENT=y, a hard fault occurs, which leads to
SCB_CleanDCache().
It’s better to use the functions built into Zephyr.
This PR encompasses two complementing purposes:
A documentation on verifying an Intel SGX evidence as produced by WAMR,
including a guide for verification without an Intel SGX-enabled platform.
This also contains a small addition to the RA sample to extract specific
information, such as whether the enclave is running in debug mode.
A C# sample to verify evidence on trusted premises (and without Intel SGX).
Evidence is generated on untrusted environments, using Intel SGX.
As an original design rule, the code in `core/shared/platform` should not
rely on the code in `core/share/utils`. In the current implementation,
platform layer calls function `bh_memory_remap_slow` in utils layer.
This PR adds inline function `os_mremap_slow` in platform_api_vmcore.h,
and lets os_remap call it if mremap fails. And remove bh_memutils.h/c as
as they are unused.
And resolve the compilation warning in wamrc:
```bash
core/shared/platform/common/posix/posix_memmap.c:255:16:
warning: implicit declaration of function ‘bh_memory_remap_slow’
255 | return bh_memory_remap_slow(old_addr, old_size, new_size);
```
The wasm_interp_call_func_bytecode is called for the first time with the empty
module/exec_env to generate a global_handle_table. Before that happens though,
the function checks if the module instance has bounds check enabled. Because
the module instance is null, the program crashes. This PR added an extra check to
prevent the crashes.
Implement the GC (Garbage Collection) feature for interpreter mode,
AOT mode and LLVM-JIT mode, and support most features of the latest
spec proposal, and also enable the stringref feature.
Use `cmake -DWAMR_BUILD_GC=1/0` to enable/disable the feature,
and `wamrc --enable-gc` to generate the AOT file with GC supported.
And update the AOT file version from 2 to 3 since there are many AOT
ABI breaks, including the changes of AOT file format, the changes of
AOT module/memory instance layouts, the AOT runtime APIs for the
AOT code to invoke and so on.
This increases the chance to use "short" calls.
Assumptions:
- LLVM preserves the order of functions in a module
- The wrapper function are smaller than the wrapped functions
- The target CPU has "short" PC-relative variation of call/jmp instructions
and they are preferrable over the "long" ones.
A motivation:
- To avoid some relocations for XIP, I want to use xtensa PC-relative
call instructions, which can only reach ~512KB.
Using `CHECK_BULK_MEMORY_OVERFLOW(addr + offset, n, maddr)` to do the
boundary check may encounter integer overflow in `addr + offset`, change to
use `CHECK_MEMORY_OVERFLOW(n)` instead, which converts `addr` and `offset`
to uint64 first and then add them to avoid integer overflow.
With this approach we can omit using memset() for the newly allocated memory
therefore the physical pages are not being used unless touched by the program.
This also simplifies the implementation.
After #2995, AOT may stop working properly on arm MacOS:
```bash
wasm-micro-runtime/core/iwasm/common/wasm_runtime_common.c,
line 1270, WASM module load failed
AOT module load failed: mmap memory failed
```
That's because, without `#include <TargetConditionals.h>`, `TARGET_OS_OSX` is undefined,
since it's definition is in that header file.
This PR adds the initial support for WASM exception handling:
* Inside the classic interpreter only:
* Initial handling of Tags
* Initial handling of Exceptions based on W3C Exception Proposal
* Import and Export of Exceptions and Tags
* Add `cmake -DWAMR_BUILD_EXCE_HANDLING=1/0` option to enable/disable
the feature, and by default it is disabled
* Update the wamr-test-suites scripts to test the feature
* Additional CI/CD changes to validate the exception spec proposal cases
Refer to:
https://github.com/bytecodealliance/wasm-micro-runtime/issues/1884587513f3c68bebfe9ad759bccdfed8
Signed-off-by: Ricardo Aguilar <ricardoaguilar@siemens.com>
Co-authored-by: Chris Woods <chris.woods@siemens.com>
Co-authored-by: Rene Ermler <rene.ermler@siemens.com>
Co-authored-by: Trenner Thomas <trenner.thomas@siemens.com>
While we used a different approach for poll_oneoff [1],
the implementation works only when the poll list includes
an absolute clock event. That is, if we have a thread which is
polling on descriptors without a timeout, we fail to terminate
the thread.
This commit fixes it by applying wasm_runtime_begin_blocking_op
to poll as well.
[1] https://github.com/bytecodealliance/wasm-micro-runtime/pull/1951
This fixes the cosmopolitan platform.
- Switch `build_cosmocc.sh` and platform documentation to
explicitly use the x86_64 cosmocc compiler as multi-arch
cosmocc won't work here. Older version `cosmocc` just did
a x86_64 build.
- Add missing items from `platform_internal.h` to fix build.
Follow-up on #2907. The log level is needed in the host embedder to
better integrate with the embedder's logger.
Allow the developer to customize his bh_log callback with
`cmake -DWAMR_BH_LOG=<log_callback>`,
and update sample/basic to show the usage.
Possible alternatives:
* Make wasm_cluster_destroy_spawned_exec_env take two exec_env.
One for wasm execution and another to specify the target to destroy.
* Make execute functions to switch exec_env as briefly discussed in
https://github.com/bytecodealliance/wasm-micro-runtime/pull/2047
When WAMR is embedded to other application, the lifecycle of the socket
might conflict with other usecases. E.g. if WAMR is deinitialized before any
other use of sockets, the application goes into the invalid state. The new
flag allows host application to take control over the socket initialization.
Check whether the arguments are NULL before calling bh_hash_map_find,
or lots of "HashMap find elem failed: map or key is NULL" warnings may
be dumped. Reported in #3053.
Though SIMD isn't supported by interpreter, when JIT is enabled,
developer may run `iwasm --interp <wasm_file>` to trigger the SIMD
opcode in interpreter, which isn't handled before this PR.
It seems that some users want to wrap rather large chunk of code
with wasm_runtime_begin_blocking_op/wasm_runtime_end_blocking_op.
If the wrapped code happens to have a call to
e.g. wasm_runtime_spawn_exec_env, WASM_SUSPEND_FLAG_BLOCKING is
inherited to the child exec_env and it may cause unexpected behaviors.
- Enable quick aot entry when hw bound check is disabled
- Remove unnecessary ret_type argument in the quick aot entries
- Declare detailed prototype of aot function to call in each quick aot entry
Since there is no so rich api in freertos like embedded system, simply set
CONFIG_HAS_CAP_ENTER to 1 to support posix file api for freertos.
Test file api in wasm app pass.
Enhance the statistic of wasm function execution time, or the performance
profiling feature:
- Add os_time_thread_cputime_us() to get the cputime of a thread,
and use it to calculate the execution time of a wasm function
- Support the statistic of the children execution time of a function,
and dump it in wasm_runtime_dump_perf_profiling
- Expose two APIs:
wasm_runtime_sum_wasm_exec_time
wasm_runtime_get_wasm_func_exec_time
And rename os_time_get_boot_microsecond to os_time_get_boot_us.
For shared memory, the max memory size must be defined in advanced. Re-allocation
for growing memory can't be used as it might change the base address, therefore when
OS_ENABLE_HW_BOUND_CHECK is enabled the memory is mmaped, and if the flag is
disabled, the memory is allocated. This change introduces a flag that allows users to use
mmap for reserving memory address space even if the OS_ENABLE_HW_BOUND_CHECK
is disabled.
When the original wasm contains multiple compilation units, the current
logic uses the first one for everything. This commit tries to use a bit more
appropriate ones.
Errors were reported when initializing wasm_val_t values with WASM_I32_VAL like macros.
```
error: missing initializer for member ‘wasm_val_t::__paddings’ [-Werror=missing-field-initializers]
64 | wasm_val_t res = {WASM_INIT_VAL};
```
And rename DEPRECATED to WASM_API_DEPRECATED to avoid using defines with generic names.
Compilation error was reported when `cmake -DWAMR_BUILD_LIBC_WASI=0`
on linux-sgx platform:
```
core/shared/platform/linux-sgx/sgx_socket.c:8:10:
fatal error: libc_errno.h: No such file or directory
8 | #include "libc_errno.h"
| ^~~~~~~~~~~~~~
```
After fixing, both `cmake -DWAMR_BUILD_LIBC_WASI=1` and
`WAMR_BUILD_LIBC_WASI=0` work good.
The content in custom name section is changed after loaded since the strings
are adjusted with '\0' appended, the emitted AOT file then cannot be loaded.
The PR disables changing the content for AOT compiler to resolve it.
And disable emitting custom name section for `wamrc --enable-dump-call-stack`,
instead, use `wamrc --emit-custom-sections=name` to emit it.
`pthread_jit_write_protect_np` is only available on macOS, and
`sys_icache_invalidate` is available on both iOS and macOS and
has no restrictions on ARM architecture.
When using the wasm-c-api and there's a trap, `wasm_func_call()` returns
a `wasm_trap_t *` object. No matter which thread crashes, the trap contains
the stack frames of the main thread.
With this PR, when there's an exception, the stack frames of the thread
where the exception occurs are stored into the thread cluster.
`wasm_func_call()` can then return those stack frames.
Allow to invoke the quick call entry wasm_runtime_quick_invoke_c_api_import to
call the wasm-c-api import functions to speedup the calling process, which reduces
the data copying.
Use `wamrc --invoke-c-api-import` to generate the optimized AOT code, and set
`jit_options->quick_invoke_c_api_import` true in wasm_engine_new when LLVM JIT
is enabled.
In some scenarios there may be lots of callings to AOT/JIT functions from the
host embedder, which expects good performance for the calling process, while
in the current implementation, runtime calls the wasm_runtime_invoke_native
to prepare the array of registers and stacks for the invokeNative assemble code,
and the latter then puts the elements in the array to physical registers and
native stacks and calls the AOT/JIT function, there may be many data copying
and handlings which impact the performance.
This PR registers some quick AOT/JIT entries for some simple wasm signatures,
and let runtime call the entry to directly invoke the AOT/JIT function instead of
calling wasm_runtime_invoke_native, which speedups the calling process.
We may extend the mechanism next to allow the developer to register his quick
AOT/JIT entries to speedup the calling process of invoking the AOT/JIT functions
for some specific signatures.
On macOS, by default, the first 4GB is occupied by the pagezero.
While it can be controlled with link time options, as we are
an library, we usually don't have a control on how to link an
executable.
Add an API to set segue flags for wasm-c-api LLVM JIT mode:
```C
wasm_config_t *
wasm_config_set_segue_flags(wasm_config_t *config, uint32 segue_flags);
```
- Don't allocate the implicit/unused frame when calling the LLVM JIT function
- Don't set exec_env's thread handle and stack boundary in the recursive
calling from host, since they have been set in the first time calling
- Fix frame not freed in llvm_jit_call_func_bytecode
before the change, only support wasm app exit like:
```c
void *thread_routine(void *arg)
{
printf("Enter thread\n");
return NULL;
}
```
if call pthread_exit, it will crash:
```c
void *thread_routine(void *arg)
{
printf("Enter thread\n");
pthread_exit(NULL);
return NULL;
}
```
This commit lets both upstairs work correctly, test pass on stm32f103 mcu.
And refactor the original perf support
- use WAMR_BUILD_LINUX_PERF as the cmake compilation control
- use WASM_ENABLE_LINUX_PERF as the compiler macro
- use `wamrc --enable-linux-perf` to generate aot file which contains fp operations
- use `iwasm --enable-linux-perf` to create perf map for `perf record`
The host embedder may also want to terminate the wasm instance
for single-threading mode, and it should work by setting exception
to the wasm instance.
Loggers (e.g. glog) usually come with instrumentation to add timestamp
and other information when reporting. That results in the timestamp
being reported twice, making the output confusing.
According to the specification:
```
When instantiating a module which is expected to run
with `wasi-threads`, the WASI host must first allocate shared memories to
satisfy the module's imports.
```
Currently, if a test from the spec is executed while having the `multi-module`
feature enabled, WAMR fails with `WASM module load failed: unknown import`.
That happens because spec tests use memory like this:
`(memory (export "memory") (import "foo" "bar") 1 1 shared)`
and WAMR tries to find a registered module named `foo`.
At the moment, there is no specific module name that can be used to identify
that the memory is imported because using WASI threads:
https://github.com/WebAssembly/wasi-threads/issues/33,
so this PR only avoids treating the submodule dependency not being found
as a failure.
It's possible to set both `atim` and `atim_now` in the `fstflags`
parameter. Same goes for `mtin` and `mtim_now`. However, it's
ambiguous which time should be set in these two cases. This commit
checks this and returns `EINVAL`.
A wasm module can be either a command or a reactor, so it can export
either `_start` or `_initialize`. Currently, if a command module is run,
`iwasm` still looks for `_initialize`, resulting in the warning:
`can not find an export 0 named _initialize in the module`.
Change to look for `_initialize` only if `_start` not found to resolve the issue.
This fixes bug #2880. Zephyr 3.2 made changes to how headers are reference (see [release notes](https://docs.zephyrproject.org/latest/releases/release-notes-3.2.html)). Work item [49578](https://github.com/zephyrproject-rtos/zephyr/issues/49578) deprecated the old headers names.
The current WAMR codebase references these old headers, thus causing compile errors with
current versions of Zephyr.
This update adds #ifdefs around the header names. With this change, compiling with Zephyr 3.2.0
and above will use the new header files. Prior versions will use the existing code.
This commit adds a check to `fd_advise`. If the fd is a directory,
return `ebadf`. This brings iwasm in line with Wasmtime's behavior.
WASI folks have stated that fd_advise should not work on directories
as this is a Linux-specific behavior:
https://github.com/bytecodealliance/wasmtime/issues/6505#issuecomment-1574122949
- Fix op_br_table arity type check when the dest block is loop block
- Fix op_drop issue when the stack is polymorphic and it is to drop
an ANY type value in the stack
* Empty names are spec-wise valid.
* As we ignore unknown custom sections anyway, it's safe to
accept empty names here.
* Currently, the problem is not exposed on our CI because
the wabt version used there is a bit old.
For shared memory, runtime should get the memories pointer from
module_inst first, then get memory instance from memories array,
and then get the fields of the memory instance.
Support new a wasm_config_t, set allocation and linux_perf_support
options to it, and then pass it to wasm_engine_new_with_config to
new an engine with private configuration.
The JSON evidence is allocated on the module instance heap, but no API
was given to dispose of this memory buffer. The sample mentions using
the function free, which behaves differently depending on the
execution context.
This fix provides a new function called librats_dispose_evidence_json,
enabling freeing the JSON evidence directly from the Wasm app.
Change WASMMemoryInstance's field is_shared_memory's type from bool
to uint8 whose size is fixed, so as to make WASMMemoryInstance's size
and layout fixed and not break AOT ABI.
See discussion in https://github.com/bytecodealliance/wasm-micro-runtime/pull/2682.
The popped reachable block may be if block whose else branch hasn't been
translated, and should push the params for the else block if there are.
And use LLVMDisposeMessage to free memory allocated in is_win_platform.
Currently, `data.drop` instruction is implemented by directly modifying the
underlying module. It breaks use cases where you have multiple instances
sharing a single loaded module. `elem.drop` has the same problem too.
This PR fixes the issue by keeping track of which data/elem segments have
been dropped by using bitmaps for each module instances separately, and
add a sample to demonstrate the issue and make the CI run it.
Also add a missing check of dropped elements to the fast-jit `table.init`.
Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/2735
Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/2772
CryptGenRandom is deprecated by Microsoft and may be removed in future
releases. They recommend to use the next generation API instead. See
https://learn.microsoft.com/en-us/windows/win32/seccng/cng-portal for
more details. Also, refactor the random functions to return error codes
rather than aborting the program if they fail.
Error is reported when executing `wamrc --target=thumb -o <aot_file> <wasm_file>`:
```
LLVM ERROR: failed to perform tail call elimination on a call site marked musttail
Aborted (core dumped)
```
Set `abi` to "gnu" for the bare-metal target when `abi` is NULL,
or the below `bh_assert` and `bh_memcpy` may deference a NULL
pointer. Error is reported when running wamrc compiled with
`cmake .. -DCMAKE_BUILD_TYPE=Debug`:
```
core/iwasm/compilation/aot_llvm.c:2584:13: runtime error:
null pointer passed as argument 1, which is declared to never be null
```
Add an extra argument `os_file_handle file` for `os_mmap` to support
mapping file from a file fd, and remove `os_get_invalid_handle` from
`posix_file.c` and `win_file.c`, instead, add it in the `platform_internal.h`
files to remove the dependency on libc-wasi.
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
Heap corruption check in ems memory allocator is enabled by default
to improve the security, but it may impact the performance a lot, this
PR adds cmake variable and compiler flag to enable/disable it.
Returning uint16 from WASI functions is technically correct. However,
the smallest integer type in WASM is int32 and since we don't guarantee
that the upper 16 bits of the result are zero'ed, it can result in
tricky bugs if the language SDK being used in the WASM app does not cast
back immediately to uint16. To prevent this, we directly return uint32
instead, so that the result is well-defined as a 32-bit number.
Set the vendor-sys of bare-metal targets to "-unknown-none-",
and currently only add "thumbxxx" to the bare-metal target list.
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
The commit fa5e9d72b0 ("Abstract POSIX filesystem functions") introduces
the build warning:
./core/iwasm/libraries/libc-wasi/sandboxed-system-primitives/src/posix.c: In function ‘fd_object_release’:
./core/iwasm/libraries/libc-wasi/sandboxed-system-primitives/src/posix.c:545:20: warning: this statement may fall through [-Wimplicit-fallthrough=]
545 | if (os_is_dir_stream_valid(&fo->directory.handle)) {
| ^
./core/iwasm/libraries/libc-wasi/sandboxed-system-primitives/src/posix.c:549:13: note: here
549 | default:
| ^~~~~~~
Refer to the commit fb4afc7ca4 ("Apply clang-format for core/iwasm compilation and libraries"),
add one line "// Fallthrough." to make compiler happy.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
`platform_common.h` already has a declaration for BH_VPRINTF so we can
get rid of the one in `platform_internal.h`. Also add some explicit
casts to avoid MSVC compiler warnings.
UWP apps do not have a console attached so any output to stdout/stderr
is lost. Therefore, provide a default BH_VPRINTF in that case for debug
builds which redirects output to the debugger.
To allow anything to depend on WASI types, including platform-specific
data structures, move the WASI libc filesystem/clock interface into
`platform_api_extension.h`, which leaves just WASI types in
`platform_wasi.h`. And `platform_wasi.h` has been renamed to
`platform_wasi_types.h` to reflect that it only defines types now and no
function declarations. Finally, these changes allow us to remove the
`windows_fdflags` type which was essentially a duplicate of
`__wasi_fdflags_t`.
Most of the WASI filesystem tests require at least creating/deleting a
file to test filesystem functionality so some additional filesystem APIs
have been implemented on Windows so we can test what has been
implemented so far. For those WASI functions which haven't been
implemented, we skip the tests. These will be implemented in a future PR
after which we can remove the relevant filters.
Additionally, in order to run the WASI socket and thread tests, we need
to install the wasi-sdk in CI and build the test source code prior to
running the tests.
`jit_reg_is_const_val` only checks whether the register is a const register and
the const value is stored in the register.
Should use `jit_reg_is_const` instead in the front end.
Reported in #2710.
- Fix potential invalid push param phis and add incoming phis to a un-existed basic block
- Fix potential invalid shift count int rotl/rotr opcodes
- Resize memory_data_size to UINT32_MAX if it is 4G when hw bound check is enabled
- Fix negative linear memory offset is used for 64-bit target it is const and larger than INT32_MAX
To run it locally:
```bash
export TSAN_OPTIONS=suppressions=<path_to_tsan_suppressions.txt>
./test_wamr.sh <your flags> -T tsan
```
An example for wasi-threads would look like:
```bash
export TSAN_OPTIONS=suppressions=<path_to_tsan_suppressions.txt>
./test_wamr.sh -w -s wasi_certification -t fast-interp -T tsan
```
Split memory instance's field `uint32 ref_count` into `bool is_shared_memory`
and `uint16 ref_count`, and lock the memory only when `is_shared_memory`
flag is true, no need to acquire a lock for non-shared memory when shared
memory feature is enabled.
Avoid repeatedly initializing the shared memory data when creating the child
thread in lib-pthread or lib-wasi-threads.
Add shared memory lock when accessing some fields of the memory instance
if the memory instance is shared.
Init shared memory's memory_data_size/memory_data_end fields according to
the current page count but not max page count.
Add wasm_runtime_set_mem_bound_check_bytes, and refine the error message
when shared memory flag is found but the feature isn't enabled.
Fixes the Cosmopolitan Libc platform attempting to use `/dev/urandom`
on operating systems that do not have it.
Signed-off-by: G4Vi <gavin@dylibso.com>
This patch enables mapping host directories to guest directories by parsing
the `map_dir_list` argument in API `wasm_runtime_init_wasi` for libc-wasi. It
follows the format `<guest-path>::<host-path>`.
It also adds argument `--map-dir=<guest::host>` argument for `iwasm`
common line tool, and allows to add multiple mappings:
```bash
iwasm --map-dir=<guest-path1::host-path1> --map-dir=<guest-path2::host-path2> ...
```
Implement the necessary os_ filesystem functions to enable successful
WASI initialization on Windows. Some small changes were also required to
the sockets implementation to use the new windows_handle type. The
remaining functions will be implemented in a future PR.
When labels-as-values is enabled in a target which doesn't support
unaligned address access, 16-bit offset is used to store the relative
offset between two opcode labels. But it is a little small and the loader
may report "pre-compiled label offset out of range" error.
Emitting 32-bit data instead to resolve the issue: emit label address in
32-bit target and emit 32-bit relative offset in 64-bit target.
See also:
https://github.com/bytecodealliance/wasm-micro-runtime/issues/2635
To allow non-POSIX platforms such as Windows to support WASI libc
filesystem functionality, create a set of wrapper functions which provide a
platform-agnostic interface to interact with the host filesystem. For now,
the Windows implementation is stubbed but this will be implemented
properly in a future PR. There are no functional changes in this change,
just a reorganization of code to move any direct POSIX references out of
posix.c in the libc implementation into posix_file.c under the shared
POSIX sources.
See https://github.com/bytecodealliance/wasm-micro-runtime/issues/2495 for a
more detailed overview of the plan to port the WASI libc filesystem to Windows.
When doing more investigations related to this PR:
https://github.com/bytecodealliance/wasm-micro-runtime/pull/2619
We found that in some scenarios the constant might not be directly
available to the LLVM IR builder, e.g.:
```
(func $const_ret (result i32)
i32.const -5
)
(func $foo
(i32.shr_u (i32.const -1) (call $const_ret))
(i32.const 31)
)
```
In that case, the right parameter to `i32.shr_u` is not constant, therefore
the `SHIFT_COUNT_MASK` isn't applied. However, when the optimization
is enabled (`--opt-level` is 2 or 3), the optimization passes resolve the
call into constant, and that constant is poisoned, causing the compiler to
resolve the whole function to an exception.
According to the description of `buildPerModuleDefaultPipeline()` and
`buildLTOPreLinkDefaultPipeline()`, it is not allowed to call them with `O0` level.
Use `buildO0DefaultPipeline` instead when the opt-level is 0.
The LLVM zext IR may be inserted after the terminator of a basic block
when popping the arguments of a wasm block. Change to insert the
zext IR before the terminator of the basic block to resolve the issue.
Reported in #2620.
This PR adds the Cosmopolitan Libc platform enabling compatibility with multiple
x86_64 operating systems with the same binary. The platform is similar to the
Linux platform, but for now only x86_64 with interpreter modes are supported.
The only major change to the core is `posix.c/convert_errno()` was rewritten to use
a switch statement. With Cosmopolitan errno values depend on the currently
running operating system, and so they are non-constant and cannot be used in array
designators. However, the `cosmocc` compiler allows non-constant case labels in
switch statements, enabling the new version.
And updated wamr-test-suites script to add `-j <platform>` option. The spec tests
can be ran via `CC=cosmocc ./test_wamr.sh -j cosmopolitan -t classic-interp`
or `CC=cosmocc ./test_wamr.sh -j cosmopolitan -t fast-interp`.
To make it clearer to users when synchronization behaviour is not
supported, return ENOTSUP when O_RSYNC, O_DSYNC or O_SYNC are
respectively not defined. Linux also doesn't support O_RSYNC despite the
O_RSYNC flag being defined.
Fixed a bug in the processing of the br_table_cache opcode that caused out-of-range
references when the label index was greater than the length of the label.
Avoid the stack traces getting mixed up together when multi-threading is enabled
by using exception_lock/unlock in dumping the call stacks.
And remove duplicated call stack dump in wasm_application.c.
Also update coding guideline CI to fix the clang-format-12 not found issue.
According to the specification,
- fNxM_pmin/max returns v1 or v2 based on flt(v1,v2) result
- fNxM_min/max returns +/-NaN, +/-Inf, v1 or v2 based on more than
flt(v1,v2) result
Fixes issue #2561.
Only when the value kind is LLVMConstantIntValueKind and the value
is not undef and not poison can we extract the value of a constant int.
Fixes#2557 and #2559.
Support muti-module for AOT mode, currently only implement the
multi-module's function import feature for AOT, the memory/table/
global import are not implemented yet.
And update wamr-test-suites scripts, multi-module sample and some
CIs accordingly.
The CI might use clang-17 to build iwasm for Android platform and it may
report compilation error:
https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/6308980430/job/17128073777
/home/runner/work/wasm-micro-runtime/wasm-micro-runtime/core/iwasm/libraries/libc-wasi/sandboxed-system-primitives/src/blocking_op.c:45:19: error: call to undeclared function 'preadv'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
ssize_t ret = preadv(fd, iov, iovcnt, offset);
^
Explicitly declare preadv and pwritev in android platform header file to resolve it.
`wasm_loader_push_pop_frame_offset` may pop n operands by using
`loader_ctx->stack_cell_num` to check whether the operand can be
popped or not. While `loader_ctx->stack_cell_num` is updated in the
later `wasm_loader_push_pop_frame_ref`, the check may fail if the stack
is in polymorphic state and lead to `ctx->frame_offset` underflow.
Fix issue #2577 and #2586.
There doesn't appear to be a clear reason not to support this behavior.
It seems it was disallowed previously as a precaution. See
67e2e57b02
for more context.
Adapt API usage to new interfaces where applicable, including LLVM function
usage, obsoleted llvm::Optional type and removal of unavailable headers.
Know issues:
- AOT static PGO isn't enabled
- LLVM JIT may run failed due to llvm_orc_registerEHFrameSectionWrapper
isn't linked into iwasm
Return a WASI error code (rather than a host POSIX one). In addition,
there is no need to return an error in the case that the provided buffer
is too large.
Unaligned store v128 value to the AOT function argument of the pointer for
the extra return value may cause segmentation fault.
Fix the issue reported in #2556.
The WASI docs allow for fewer rights to be applied to an fd than requested but
not more. This behavior is also asserted in the rust WASI tests, so it's necessary
for those to pass as well.
Send a signal whose handler is no-op to a blocking thread to wake up
the blocking syscall with either EINTR equivalent or partial success.
Unlike the approach taken in the `dev/interrupt_block_insn` branch (that is,
signal + longjmp similarly to `OS_ENABLE_HW_BOUND_CHECK`), this PR
does not use longjmp because:
* longjmp from signal handler doesn't work on nuttx
refer to https://github.com/apache/nuttx/issues/10326
* the singal+longjmp approach may be too difficult for average programmers
who might implement host functions to deal with
See also https://github.com/bytecodealliance/wasm-micro-runtime/issues/1910
Remove thread local attribute of prev_sig_act_SIGSEGV/SIGBUS to allow using
custom signal handler from non-main thread since in a thread spawned by
embedder, embedder may be unable to call wasm_runtime_init_thread_env to
initialize them.
And fix the handling of prev_sig_act when its sa_handler is SIG_DFL, SIG_IGN,
or a user customized handler.
Add API wasm_runtime_terminate to terminate a module instance
by setting "terminated by user" exception to the module instance.
And update the product-mini of posix platforms.
Note: this doesn't work for some situations like blocking system calls.
Preserve errno because this function is often used like
the following. The caller wants to report the error from the main
operation (`lseek` in this example), not from fd_object_release.
```
off_t ret = lseek(fd_number(fo), offset, nwhence);
fd_object_release(fo);
if (ret < 0)
return convert_errno(errno);
```
This fixes a few test cases in wasi-threads testsuite like wasi_threads_return_main_block.
And also move the special handling for "wasi proc exit" to a more appropriate place.
Introduce module instance context APIs which can set one or more contexts created
by the embedder for a wasm module instance:
```C
wasm_runtime_create_context_key
wasm_runtime_destroy_context_key
wasm_runtime_set_context
wasm_runtime_set_context_spread
wasm_runtime_get_context
```
And make libc-wasi use it and set wasi context as the first context bound to the wasm
module instance.
Also add samples.
Refer to https://github.com/bytecodealliance/wasm-micro-runtime/issues/2460.
When embedding WAMR, this PR allows to register a callback that is
invoked when memory.grow fails.
In case of memory allocation failures, some languages allow to handle
the error (e.g. by checking the return code of malloc/calloc in C), some
others (e.g. Rust) just panic.
While wasi proc exit is not a real trap, what the runtime does on it is mostly same as
real traps. That is, kill the siblings threads and represent the exit/trap as the result of
the "process" to the user api. There seems no reason to distinguish it from real traps
here.
Note that:
- The target thread either doesn't care the specific exception type or ignore wasi
proc exit by themselves. (clear_wasi_proc_exit_exception)
- clear_wasi_proc_exit_exception only clears local exception.
Add simple infrastructure to add more unit tests in the future. At the moment tests
are only executed on Linux, but can be extended to other platforms if needed.
Use https://github.com/google/googletest/ as a framework.
As a part of stress-testing we want to ensure that mutex implementation is working
correctly and protecting shared resource to be allocated from other threads when
mutex is locked.
This test covers the most common situations that happen when some program uses
mutexes like locks from various threads, locks from the same thread etc.
When AOT out of bound linear memory access or stack overflow occurs, the call stack of
AOT functions cannot be unwound currently, so from the exception handler, runtime
cannot jump back into the place that calls the AOT function.
We temporarily skip the current instruction and let AOT code continue to run and return
to caller as soon as possible. And use the zydis library the decode the current instruction
to get its size.
And remove using RtlAddFunctionTable to register the AOT functions since it doesn't work
currently.
AOT relocation to aot_func_internal#n is generated by wamrc --bounds-checks=1.
Resolve the issue by applying the relocation in the compilation stage by wamrc and
don't generate these relocations in the AOT file.
Fixes#2471.
We need to apply some bug fixes that were merged to wasi-libc because wasi-sdk-20
is about half a year old.
It is a temporary solution and the code will be removed when wasi-sdk 21 is released.
- Fix windows wamrc link error: aot_generate_tempfile_name undefined.
- Clear windows compile warnings.
- And rename folder `samples/bh_atomic` and `samples/mem_allocator` to
`samples/bh-atomic` and `samples/mem-allocator`.
## Context
Some native libraries may want to explicitly delete an externref object without
waiting for the module instance to be deleted.
In addition, it may want to add a cleanup function.
## Proposed Changes
Implement:
* `wasm_externref_objdel` to explicitly delete an externeref'd object.
* `wasm_externref_set_cleanup` to set a cleanup function that is called when
the externref'd object is deleted.
In macro bh_memcpy_s, bh_memcy_wa and bh_memmove_s, no need to do extra check
for length is zero or not because it was already done inside of the functions called.
- Inherit shared memory from the parent instance, instead of
trying to look it up by the underlying module. The old method
works correctly only when every cluster uses different module.
- Use reference count in WASMMemoryInstance/AOTMemoryInstance
to mark whether the memory is shared or not
- Retire WASMSharedMemNode
- For atomic opcode implementations in the interpreters, use
a global lock for now
- Update the internal API users
(wasi-threads, lib-pthread, wasm_runtime_spawn_thread)
Fixes https://github.com/bytecodealliance/wasm-micro-runtime/issues/1962
- Avoid destroying module instance repeatedly in pthread_exit_wrapper and
wasm_thread_cluster_exit.
- Wait enough time in pthread_join_wrapper for target thread to exit and
destroy its resources.
We need to make a test that runs longer than the tests we had before to check
some problems that might happen after running for some time (e.g. memory
corruption or something else).
Tests were failing because the right permissions were not provided to iwasm.
Also, test failures didn't trigger build failure due to typo - also fixed in this change.
In addition to that, this PR fixes a few issues with the test itself:
* the `server_init_complete` was not reset early enough causing the client to occasionally
assume the server started even though it didn't yet
* set `SO_REUSEADDR` on the server socket so the port can be reused shortly after
closing the previous socket
* defined receive-send-receive sequence from server to make sure server is alive at the
time of sending message
The old method may not work for some cases. This PR iterates over all instructions
in the function, looking for memcpy, memmove and memset instructions, putting
them into a set, and finally expands them into a loop one by one.
And move this LLVM Pass after building the pipe line of pass builder to ensure that
the memcpy/memmove/memset instrinsics are generated before applying the pass.
And return ENOSYS. We do that so we can at least compile the code on CI.
We'll be gradually enabling more and more functions.
Also, enabled `proc_raise()` for windows.
* disable translations of errno codes that aren't defined on Windows
* undef `min()` macro if it is defined to not conflict with the `min()` function we define
* implement `shed_yield` wasi call
* disable some of the features in the config for windows by default
There is no standard `realpath` function in the C/C++ standard libraries for Windows,
use `_fullpath` function instead to get absolute path of a directory.
We have observed a significant performance degradation after merging
https://github.com/bytecodealliance/wasm-micro-runtime/pull/1991
Instead of protecting suspend flags with a mutex, we implement the flags
as atomic variable and only use mutex when atomics are not available
on a given platform.
esp32-s3's instruction memory and data memory can be accessed through mutual mirroring way,
so we define a new feature named as WASM_MEM_DUAL_BUS_MIRROR.
Build wasi-libc library on Windows since libuv may be not supported. This PR is a first step
to make it working, but there's still a number of changes to get it fully working.
Allow to use `cmake -DWAMR_CONFIGURABLE_BOUNDS_CHECKS=1` to
build iwasm, and then run `iwasm --disable-bounds-checks` to disable the
memory access boundary checks.
And add two APIs:
`wasm_runtime_set_bounds_checks` and `wasm_runtime_is_bounds_checks_enabled`
Calling `__wasi_sock_addr_resolve` syscall causes native stack overflow.
Given this is a standard function available in WAMR, we should have at least
the default stack size large enough to handle this case.
The socket tests were updated so they also run in separate thread, but
the simple retro program is:
```C
void *th(void *p)
{
struct addrinfo *res;
getaddrinfo("amazon.com", NULL, NULL, &res);
return NULL;
}
int main(int argc, char **argv)
{
pthread_t pt;
pthread_create(&pt, NULL, th, NULL);
pthread_join(pt, NULL);
return 0;
}
```
## Context
Currently, WAMR supports compiling iwasm with flag `WAMR_BUILD_WASI_NN`.
However, there are scenarios where the user might prefer having it as a shared library.
## Proposed Changes
Decouple wasi-nn context management by internally managing the context given
a module instance reference.
Fix some build errors when building wamrc with LLVM-13, reported in #2311
Fix some build warnings when building wamrc with LLVM-16:
```
core/iwasm/compilation/aot_llvm_extra2.cpp:26:26: warning:
‘llvm::None’ is deprecated: Use std::nullopt instead. [-Wdeprecated-declarations]
26 | return llvm::None;
```
Fix a maybe-uninitialized compile warning:
```
core/iwasm/compilation/aot_llvm.c:413:9: warning:
‘update_top_block’ may be used uninitialized in this function [-Wmaybe-uninitialized]
413 | LLVMPositionBuilderAtEnd(b, update_top_block);
```
## Context
Path to models use `/assets` for testing inside docker. While testing directly from
the repo we are forced to use soft-links or modify the paths.
## Proposed Changes
Use relative path and adjust docker volumes in docs.
Major changes:
- Public headers inside `wasi-nn/include`
- Put cmake files in `cmake` folder
- Make linux iwasm link with `${WASI_NN_LIBS}` so iwasm can enable wasi-nn
This PR attempts to search for the system libuv and use it if found instead of
downloading it. As reported in #1831, this is needed because some tools
build in a sandbox and clear the extra sources.
Move the native stack overflow check from the caller to the callee because the
former doesn't work for call_indirect and imported functions.
Make the stack usage estimation more accurate. Instead of making a guess from
the number of wasm locals in the function, use the LLVM's idea of the stack size
of each MachineFunction. The former is inaccurate because a) it doesn't reflect
optimization passes, and b) wasm locals are not the only reason to use stack.
To use the post-compilation stack usage information without requiring 2-pass
compilation or machine-code imm rewriting, introduce a global array to store
stack consumption of each functions:
For JIT, use a custom IRCompiler with an extra pass to fill the array.
For AOT, use `clang -fstack-usage` equivalent because we support external llc.
Re-implement function call stack usage estimation to reflect the real calling
conventions better. (aot_estimate_stack_usage_for_function_call)
Re-implement stack estimation logic (--enable-memory-profiling) based on the new
machinery.
Discussions: #2105.
Compilation in strict mode fails with
```
wasm_micro_runtime/core/shared/platform/android/platform_init.c:122:30:
error: declaration of 'struct epoll_event` will not be visible outside of this
function [-Werror,-Wvisibility]
epoll_pwait(int epfd, struct epoll_event *events, int maxevents, int timeout,
^
1 error generated.
```
Co-authored-by: Misha Gridnev <gridman@google.com>
Writing GS segment register is not allowed on linux-sgx since it is used as
the base address of thread data in 64-bit hw mode. Reported in #2252.
Disable writing it and disable segue optimization for linux-sgx platform.
LLVM PGO (Profile-Guided Optimization) allows the compiler to better optimize code
for how it actually runs. This PR implements the AOT static PGO, and is tested on
Linux x86-64 and x86-32. The basic steps are:
1. Use `wamrc --enable-llvm-pgo -o <aot_file_of_pgo> <wasm_file>`
to generate an instrumented aot file.
2. Compile iwasm with `cmake -DWAMR_BUILD_STATIC_PGO=1` and run
`iwasm --gen-prof-file=<raw_profile_file> <aot_file_of_pgo>`
to generate the raw profile file.
3. Run `llvm-profdata merge -output=<profile_file> <raw_profile_file>`
to merge the raw profile file into the profile file.
4. Run `wamrc --use-prof-file=<profile_file> -o <aot_file> <wasm_file>`
to generate the optimized aot file.
5. Run the optimized aot_file: `iwasm <aot_file>`.
The test scripts are also added for each benchmark, run `test_pgo.sh` under
each benchmark's folder to test the AOT static pgo.
Segue is an optimization technology which uses x86 segment register to store
the WebAssembly linear memory base address, so as to remove most of the cost
of SFI (Software-based Fault Isolation) base addition and free up a general
purpose register, by this way it may:
- Improve the performance of JIT/AOT
- Reduce the footprint of JIT/AOT, the JIT/AOT code generated is smaller
- Reduce the compilation time of JIT/AOT
This PR uses the x86-64 GS segment register to apply the optimization, currently
it supports linux and linux-sgx platforms on x86-64 target. By default it is disabled,
developer can use the option below to enable it for wamrc and iwasm(with LLVM
JIT enabled):
```bash
wamrc --enable-segue=[<flags>] -o output_file wasm_file
iwasm --enable-segue=[<flags>] wasm_file [args...]
```
`flags` can be:
i32.load, i64.load, f32.load, f64.load, v128.load,
i32.store, i64.store, f32.store, f64.store, v128.store
Use comma to separate them, e.g. `--enable-segue=i32.load,i64.store`,
and `--enable-segue` means all flags are added.
Acknowledgement:
Many thanks to Intel Labs, UC San Diego and UT Austin teams for introducing this
technology and the great support and guidance!
Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
Co-authored-by: Vahldiek-oberwagner, Anjo Lucas <anjo.lucas.vahldiek-oberwagner@intel.com>
Add nightly (UTC time) checks with asan and ubsan, and also put gcc-4.8 build
to nightly run since we don't need to run it with every PR.
Co-authored-by: Maksim Litskevich <makslit@amazon.co.uk>
For some platforms WAMR gets compiled with `CONFIG_HAS_CLOCK_NANOSLEEP=1`,
while `clock_nanosleep` is not present at the platform, which causes compilation error.
Add check for macro `DISABLE_CLOCK_NANOSLEEP` to resolve the issue, only when
the macro isn't defined can the macro `CONFIG_HAS_CLOCK_NANOSLEEP` take effect.
Add VX delegation as an external delegation of TFLite, so that several NPU/GPU
(from VeriSilicon, NXP, Amlogic) can be controlled via WASI-NN.
Test Code can work with the X86 simulator.
Fix issue reported in #2172: wasm-c-api `wasm_func_call` may use a wrong exec_env
when multi-threading is enabled, with error "invalid exec env" reported
Fix issue reported in #2149: main instance's `c_api_func_imports` are not passed to
the counterpart of new thread's instance in wasi-threads mode
Fix issue of invalid size calculated to copy `c_api_func_imports` in pthread mode
And refactor the code to use `wasm_cluster_dup_c_api_imports` to copy the
`c_api_func_imports` to new thread for wasi-threads mode and pthread mode.
Currently, if a thread is spawned and raises an exception after the main thread
has finished, iwasm returns with success instead of returning 1 (i.e. error).
Since wasm_runtime_get_wasi_exit_code waits for all threads to finish and only
returns the wasi exit code, this PR performs the exception check again and
returns error if an exception was raised.
Since the Tensorflow library is already installed in many cases(especially in the
case of the embedded system), move the installation code to find_package.
According to the 1999 ISO C standard (C99), size_t is an unsigned integer type of
at least 16 bit (see sections 7.17 and 7.18.3), it may be uint32 in 32-bit platforms:
https://en.cppreference.com/w/cpp/types/size_t
Calling function `size_t min(size_t, size_t)` with two uint64 arguments may get
invalid result.
Co-authored-by: Georgii Rylov <godjan@amazon.co.uk>
Make `hmu_tree_node` struct packed and add 4 padding bytes before `kfc_tree_root_buf`
field in `gc_heap_struct` struct to ensure the `left/right/parent` fields in `hmu_tree_node`
are 8-byte aligned on the 64-bit target which doesn't support unaligned memory access.
Fix the issue reported in #2136.
- Translate all the opcodes of threads spec proposal for Fast JIT
- Add the atomic flag for Fast JIT load/store IRs to support atomic load/store
- Add new atomic related Fast JIT IRs and translate them in the codegen
- Add suspend_flags check in branch opcodes and before/after call function
- Modify CI to enable Fast JIT multi-threading test
Co-authored-by: TianlongLiang <tianlong.liang@intel.com>
In LLVM AOT/JIT compiler, only need to check the suspend_flags when memory is
a shared memory since the shared memory must be enabled for multi-threading,
so as not to impact the performance in non-multi-threading memory mode. Also
refine the LLVM IRs to check the suspend_flags.
And fix an issue of multi-tier jit for multi-threading, the instance of the child thread
should be removed from the instance list before it is de-instantiated.
In #1928 we added support for GCC 4.8 but we don't continuously test if it's
working. This PR added a GitHub actions job to test compilation on GCC 4.8
for interpreters and Fast JIT (LLVM JIT/AOT might be added in the future).
The compilation is done using ubuntu 14.04 image as that's the simplest way
to get GCC 4.8 compiler. The job only compiles the code but does not run any
tests.
Load memory data size in each time memory access boundary check in
multi-threading mode since it may be changed by other threads when
memory growing.
And use `memory->memory_data_size` instead of
`memory->num_bytes_per_page * memory->cur_page_count` to refine
the code.
When ref.func opcode refers to a function whose function index no smaller than
current function, the destination func should be forward-declared: it is declared
in the table element segments, or is declared in the export list.
In multi-threading, this line will eventually call `wasm_cluster_wait_for_all_except_self`:
`DEINIT_VEC(store->instances, wasm_instance_vec_delete)`
As the threads are joining they can call `wasm_interp_dump_call_stack` which tries to
use the module frames but they were already freed by this line:
`DEINIT_VEC(store->modules, wasm_module_vec_delete)`
This PR swaps the order that these are deleted so module is deleted after the instances.
Co-authored-by: Andrew Chambers <ncham@amazon.com>
Try using existing exec_env to execute wasm app's malloc/free func and
execute post instantiation functions. Create a new exec_env only when
no existing exec_env was found.
In some cases, the memory address of some variables may have 4 least significant
bytes set to zero. Because we cast the pointer to int, we look only at 4 least
significant bytes; the assertion may fail because 4 least significant bytes are 0.
Change bh_assert implementation to cast the assert expr to int64_t and it works
well with 64-bit architectures.
POLLRDNORM/POLLWRNORM may be not defined in uClibc, so replace them
with the equivalent POLLIN/POLLOUT.
Refer to https://www.man7.org/linux/man-pages/man2/poll.2.html
POLLRDNORM Equivalent to POLLIN
POLLWRNORM Equivalent to POLLOUT
Signed-off-by: Thomas Devoogdt <thomas.devoogdt@barco.com>
Update wasi-libc version to resolve the hang issue when running wasi-threads cases.
Implement custom sync primitives as a counterpart of `pthread_barrier_wait` to
attempt to replace pthread sync primitives since they seem to cause data races
when running with the thread sanitizer.
Use pre-created exec_env for instantiation and module_malloc/free,
use the same exec_env of the current thread to avoid potential
unexpected behavior.
And remove unnecessary shared_mem_lock in wasm_module_free,
which may cause dead lock.
Use the shared memory's shared_mem_lock to lock the whole atomic.wait and
atomic.notify processes, and use it for os_cond_reltimedwait and os_cond_notify,
so as to make the whole processes actual atomic operations:
the original implementation accesses the wait address with shared_mem_lock
and uses wait_node->wait_lock for os_cond_reltimedwait, which is not an atomic
operation.
And remove the unnecessary wait_map_lock and wait_lock, since the whole
processes are already locked by shared_mem_lock.