In macro bh_memcpy_s, bh_memcy_wa and bh_memmove_s, no need to do extra check
for length is zero or not because it was already done inside of the functions called.
We have observed a significant performance degradation after merging
https://github.com/bytecodealliance/wasm-micro-runtime/pull/1991
Instead of protecting suspend flags with a mutex, we implement the flags
as atomic variable and only use mutex when atomics are not available
on a given platform.
esp32-s3's instruction memory and data memory can be accessed through mutual mirroring way,
so we define a new feature named as WASM_MEM_DUAL_BUS_MIRROR.
Compilation in strict mode fails with
```
wasm_micro_runtime/core/shared/platform/android/platform_init.c:122:30:
error: declaration of 'struct epoll_event` will not be visible outside of this
function [-Werror,-Wvisibility]
epoll_pwait(int epfd, struct epoll_event *events, int maxevents, int timeout,
^
1 error generated.
```
Co-authored-by: Misha Gridnev <gridman@google.com>
Writing GS segment register is not allowed on linux-sgx since it is used as
the base address of thread data in 64-bit hw mode. Reported in #2252.
Disable writing it and disable segue optimization for linux-sgx platform.
LLVM PGO (Profile-Guided Optimization) allows the compiler to better optimize code
for how it actually runs. This PR implements the AOT static PGO, and is tested on
Linux x86-64 and x86-32. The basic steps are:
1. Use `wamrc --enable-llvm-pgo -o <aot_file_of_pgo> <wasm_file>`
to generate an instrumented aot file.
2. Compile iwasm with `cmake -DWAMR_BUILD_STATIC_PGO=1` and run
`iwasm --gen-prof-file=<raw_profile_file> <aot_file_of_pgo>`
to generate the raw profile file.
3. Run `llvm-profdata merge -output=<profile_file> <raw_profile_file>`
to merge the raw profile file into the profile file.
4. Run `wamrc --use-prof-file=<profile_file> -o <aot_file> <wasm_file>`
to generate the optimized aot file.
5. Run the optimized aot_file: `iwasm <aot_file>`.
The test scripts are also added for each benchmark, run `test_pgo.sh` under
each benchmark's folder to test the AOT static pgo.
Segue is an optimization technology which uses x86 segment register to store
the WebAssembly linear memory base address, so as to remove most of the cost
of SFI (Software-based Fault Isolation) base addition and free up a general
purpose register, by this way it may:
- Improve the performance of JIT/AOT
- Reduce the footprint of JIT/AOT, the JIT/AOT code generated is smaller
- Reduce the compilation time of JIT/AOT
This PR uses the x86-64 GS segment register to apply the optimization, currently
it supports linux and linux-sgx platforms on x86-64 target. By default it is disabled,
developer can use the option below to enable it for wamrc and iwasm(with LLVM
JIT enabled):
```bash
wamrc --enable-segue=[<flags>] -o output_file wasm_file
iwasm --enable-segue=[<flags>] wasm_file [args...]
```
`flags` can be:
i32.load, i64.load, f32.load, f64.load, v128.load,
i32.store, i64.store, f32.store, f64.store, v128.store
Use comma to separate them, e.g. `--enable-segue=i32.load,i64.store`,
and `--enable-segue` means all flags are added.
Acknowledgement:
Many thanks to Intel Labs, UC San Diego and UT Austin teams for introducing this
technology and the great support and guidance!
Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
Co-authored-by: Vahldiek-oberwagner, Anjo Lucas <anjo.lucas.vahldiek-oberwagner@intel.com>
Add nightly (UTC time) checks with asan and ubsan, and also put gcc-4.8 build
to nightly run since we don't need to run it with every PR.
Co-authored-by: Maksim Litskevich <makslit@amazon.co.uk>
Make `hmu_tree_node` struct packed and add 4 padding bytes before `kfc_tree_root_buf`
field in `gc_heap_struct` struct to ensure the `left/right/parent` fields in `hmu_tree_node`
are 8-byte aligned on the 64-bit target which doesn't support unaligned memory access.
Fix the issue reported in #2136.
In #1928 we added support for GCC 4.8 but we don't continuously test if it's
working. This PR added a GitHub actions job to test compilation on GCC 4.8
for interpreters and Fast JIT (LLVM JIT/AOT might be added in the future).
The compilation is done using ubuntu 14.04 image as that's the simplest way
to get GCC 4.8 compiler. The job only compiles the code but does not run any
tests.
In some cases, the memory address of some variables may have 4 least significant
bytes set to zero. Because we cast the pointer to int, we look only at 4 least
significant bytes; the assertion may fail because 4 least significant bytes are 0.
Change bh_assert implementation to cast the assert expr to int64_t and it works
well with 64-bit architectures.
Use the shared memory's shared_mem_lock to lock the whole atomic.wait and
atomic.notify processes, and use it for os_cond_reltimedwait and os_cond_notify,
so as to make the whole processes actual atomic operations:
the original implementation accesses the wait address with shared_mem_lock
and uses wait_node->wait_lock for os_cond_reltimedwait, which is not an atomic
operation.
And remove the unnecessary wait_map_lock and wait_lock, since the whole
processes are already locked by shared_mem_lock.
- Implement atomic.fence to ensure a proper memory synchronization order
- Destroy exec_env_singleton first in wasm/aot deinstantiation
- Change terminate other threads to wait for other threads in
wasm_exec_env_destroy
- Fix detach thread in thread_manager_start_routine
- Fix duplicated lock cluster->lock in wasm_cluster_cancel_thread
- Add lib-pthread and lib-wasi-threads compilation to Windows CI
The function always specified IPv4 socklen to sockaddr_to_bh_sockaddr(),
therefore the assertion was failing; however, sockaddr_to_bh_sockaddr()
never actually used socklen parameter, so we deleted it completely.
In the previous code, the `*port` is assigned before `getsockname`, so the caller
may be not able to get the actual port number assigned by system.
Move the assigning of `*port` to be after `getsockname` to resolve the issue.
The problem was found by a `Golang + WAMR (as CGO)` wrapped by EGO
in SGX Enclave.
`fstat()` in EGO returns dummy values:
- EGO uses a `mount` configuration to define the mount points that apply
the host file system presented to the Encalve.
- EGO has a different programming model: the entire application runs inside
the enclave. Manual ECALLs/OCALLs by application code are neither
required nor possible.
Add platform ego and add macro control for the return value checking of
`fd_determine_type_rights` in libc-wasi to resolve the issue.
In the esp-idf platform, Xtensa GCC 8.4.0 reported incompatible pointer warnings when
building with the lwip component.
Berkeley (POSIX) sockets uses composition in combination with type punning to handle
many protocol families, including IPv4 & IPv6. The type punning just has to be made
explicit with pointer casts from `sockaddr_in` for IPv4 to the generic `sockaddr`.
The current implementation throws a segmentation fault when padding
files using a large range, because the writing operation overflows the
source buffer, which was a single char.
IPFS previously assumed that the offset for the seek operation was related
to the start of the file (SEEK_SET). It now correctly checks the parameter
'whence' and computes the offset for SEEK_CUR (middle of the file) and
SEEK_END (end of the file).
Add thread_wait_list_end node for thread data and cond for Windows platform
to speedup the thread join and cond wait operations: no need to traverse the
wait list to get the end node to append the wait node.
A limitation of the current implementation of SGX IPFS in WAMR is that
it prevents to open files which are not in the current directory.
This restriction is lifted and can now open files in paths, similarly to the
WASI openat call, which takes into account the sandbox of the file system.
Add a new options to control the native stack hw bound check feature:
- Besides the original option `cmake -DWAMR_DISABLE_HW_BOUND_CHECK=1/0`,
add a new option `cmake -DWAMR_DISABLE_STACK_HW_BOUND_CHECK=1/0`
- When the linear memory hw bound check is disabled, the stack hw bound check
will be disabled automatically, no matter what the input option is
- When the linear memory hw bound check is enabled, the stack hw bound check
is enabled/disabled according to the value of input option
- Besides the original option `--bounds-checks=1/0`, add a new option
`--stack-bounds-checks=1/0` for wamrc
Refer to: https://github.com/bytecodealliance/wasm-micro-runtime/issues/1677
Support to get/set recv_buf_size/send_buf_size/reuse_port/reuse_addr for wasm app
Add socket APIs for esp-idf platform
Add setsockopt for linux-sgx platform
Add macro WASM_ENABLE_WORD_ALING_READ to enable reading
1/2/4 and n bytes data from vram buffer, which requires 4-byte addr
alignment reading.
Eliminate XIP AOT relocations related to the below ones:
i32_div_u, f32_min, f32_max, f32_ceil, f32_floor, f32_trunc, f32_rint