LLVM PGO (Profile-Guided Optimization) allows the compiler to better optimize code
for how it actually runs. This PR implements the AOT static PGO, and is tested on
Linux x86-64 and x86-32. The basic steps are:
1. Use `wamrc --enable-llvm-pgo -o <aot_file_of_pgo> <wasm_file>`
to generate an instrumented aot file.
2. Compile iwasm with `cmake -DWAMR_BUILD_STATIC_PGO=1` and run
`iwasm --gen-prof-file=<raw_profile_file> <aot_file_of_pgo>`
to generate the raw profile file.
3. Run `llvm-profdata merge -output=<profile_file> <raw_profile_file>`
to merge the raw profile file into the profile file.
4. Run `wamrc --use-prof-file=<profile_file> -o <aot_file> <wasm_file>`
to generate the optimized aot file.
5. Run the optimized aot_file: `iwasm <aot_file>`.
The test scripts are also added for each benchmark, run `test_pgo.sh` under
each benchmark's folder to test the AOT static pgo.
Segue is an optimization technology which uses x86 segment register to store
the WebAssembly linear memory base address, so as to remove most of the cost
of SFI (Software-based Fault Isolation) base addition and free up a general
purpose register, by this way it may:
- Improve the performance of JIT/AOT
- Reduce the footprint of JIT/AOT, the JIT/AOT code generated is smaller
- Reduce the compilation time of JIT/AOT
This PR uses the x86-64 GS segment register to apply the optimization, currently
it supports linux and linux-sgx platforms on x86-64 target. By default it is disabled,
developer can use the option below to enable it for wamrc and iwasm(with LLVM
JIT enabled):
```bash
wamrc --enable-segue=[<flags>] -o output_file wasm_file
iwasm --enable-segue=[<flags>] wasm_file [args...]
```
`flags` can be:
i32.load, i64.load, f32.load, f64.load, v128.load,
i32.store, i64.store, f32.store, f64.store, v128.store
Use comma to separate them, e.g. `--enable-segue=i32.load,i64.store`,
and `--enable-segue` means all flags are added.
Acknowledgement:
Many thanks to Intel Labs, UC San Diego and UT Austin teams for introducing this
technology and the great support and guidance!
Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
Co-authored-by: Vahldiek-oberwagner, Anjo Lucas <anjo.lucas.vahldiek-oberwagner@intel.com>
Add nightly (UTC time) checks with asan and ubsan, and also put gcc-4.8 build
to nightly run since we don't need to run it with every PR.
Co-authored-by: Maksim Litskevich <makslit@amazon.co.uk>
In #1928 we added support for GCC 4.8 but we don't continuously test if it's
working. This PR added a GitHub actions job to test compilation on GCC 4.8
for interpreters and Fast JIT (LLVM JIT/AOT might be added in the future).
The compilation is done using ubuntu 14.04 image as that's the simplest way
to get GCC 4.8 compiler. The job only compiles the code but does not run any
tests.
Use the shared memory's shared_mem_lock to lock the whole atomic.wait and
atomic.notify processes, and use it for os_cond_reltimedwait and os_cond_notify,
so as to make the whole processes actual atomic operations:
the original implementation accesses the wait address with shared_mem_lock
and uses wait_node->wait_lock for os_cond_reltimedwait, which is not an atomic
operation.
And remove the unnecessary wait_map_lock and wait_lock, since the whole
processes are already locked by shared_mem_lock.
- Implement atomic.fence to ensure a proper memory synchronization order
- Destroy exec_env_singleton first in wasm/aot deinstantiation
- Change terminate other threads to wait for other threads in
wasm_exec_env_destroy
- Fix detach thread in thread_manager_start_routine
- Fix duplicated lock cluster->lock in wasm_cluster_cancel_thread
- Add lib-pthread and lib-wasi-threads compilation to Windows CI
The function always specified IPv4 socklen to sockaddr_to_bh_sockaddr(),
therefore the assertion was failing; however, sockaddr_to_bh_sockaddr()
never actually used socklen parameter, so we deleted it completely.
In the previous code, the `*port` is assigned before `getsockname`, so the caller
may be not able to get the actual port number assigned by system.
Move the assigning of `*port` to be after `getsockname` to resolve the issue.
The problem was found by a `Golang + WAMR (as CGO)` wrapped by EGO
in SGX Enclave.
`fstat()` in EGO returns dummy values:
- EGO uses a `mount` configuration to define the mount points that apply
the host file system presented to the Encalve.
- EGO has a different programming model: the entire application runs inside
the enclave. Manual ECALLs/OCALLs by application code are neither
required nor possible.
Add platform ego and add macro control for the return value checking of
`fd_determine_type_rights` in libc-wasi to resolve the issue.
In the esp-idf platform, Xtensa GCC 8.4.0 reported incompatible pointer warnings when
building with the lwip component.
Berkeley (POSIX) sockets uses composition in combination with type punning to handle
many protocol families, including IPv4 & IPv6. The type punning just has to be made
explicit with pointer casts from `sockaddr_in` for IPv4 to the generic `sockaddr`.
The current implementation throws a segmentation fault when padding
files using a large range, because the writing operation overflows the
source buffer, which was a single char.
IPFS previously assumed that the offset for the seek operation was related
to the start of the file (SEEK_SET). It now correctly checks the parameter
'whence' and computes the offset for SEEK_CUR (middle of the file) and
SEEK_END (end of the file).
Add thread_wait_list_end node for thread data and cond for Windows platform
to speedup the thread join and cond wait operations: no need to traverse the
wait list to get the end node to append the wait node.
A limitation of the current implementation of SGX IPFS in WAMR is that
it prevents to open files which are not in the current directory.
This restriction is lifted and can now open files in paths, similarly to the
WASI openat call, which takes into account the sandbox of the file system.
Add a new options to control the native stack hw bound check feature:
- Besides the original option `cmake -DWAMR_DISABLE_HW_BOUND_CHECK=1/0`,
add a new option `cmake -DWAMR_DISABLE_STACK_HW_BOUND_CHECK=1/0`
- When the linear memory hw bound check is disabled, the stack hw bound check
will be disabled automatically, no matter what the input option is
- When the linear memory hw bound check is enabled, the stack hw bound check
is enabled/disabled according to the value of input option
- Besides the original option `--bounds-checks=1/0`, add a new option
`--stack-bounds-checks=1/0` for wamrc
Refer to: https://github.com/bytecodealliance/wasm-micro-runtime/issues/1677
Support to get/set recv_buf_size/send_buf_size/reuse_port/reuse_addr for wasm app
Add socket APIs for esp-idf platform
Add setsockopt for linux-sgx platform
The host embedder may new/delete wasm-c-api engine simultaneously
in multiple threads, which requires lock for the operations. Since there
isn't one time called global init/destroy APIs provided by wasm-c-api,
we define a global lock and initialize it with thread mutex initializer if
the platform supports that, and use it to lock the operations of engine.
If the platform doesn't support thread mutex initializer, we require
developer to create the lock by himself to ensure the thread-safe of the
engine operations.
1. Support cross building wamrc and installing it
2. Remove PIE flag for Windows to fix compilation error when compiled by clang
3. Support linking LLVM shared libs to help build with system default or custom
LLVM installation and reduce binary size.
This PR integrates an Intel SGX feature called Intel Protection File System Library (IPFS)
into the runtime to create, operate and delete files inside the enclave, while guaranteeing
the confidentiality and integrity of the data persisted. IPFS can be referred to here:
https://www.intel.com/content/www/us/en/developer/articles/technical/overview-of-intel-protected-file-system-library-using-software-guard-extensions.html
Introduce a cmake variable `WAMR_BUILD_SGX_IPFS`, when enabled, the files interaction
API of WASI will leverage IPFS, instead of the regular POSIX OCALLs. The implementation
has been written with light changes to sgx platform layer, so all the security aspects
WAMR relies on are conserved.
In addition to this integration, the following changes have been made:
- The CI workflow has been adapted to test the compilation of the runtime and sample
with the flag `WAMR_BUILD_SGX_IPFS` set to true
- Introduction of a new sample that demonstrates the interaction of the files (called `file`),
- Documentation of this new feature
Fix the issue reported by #1484:
Platform ESP-IDF broken for WAMR 1.0.0 with ESP-IDF 4.4.2
Let the dummy ftruncate only work with ESP-IDF earlier than 4.4.2
Implement more socket APIs, refer to #1336 and below PRs:
- Implement wasi_addr_resolve function (#1319)
- Fix socket-api byte order issue when host/network order are the same (#1327)
- Enhance sock_addr_local syscall (#1320)
- Implement sock_addr_remote syscall (#1360)
- Add support for IPv6 in WAMR (#1411)
- Implement ns lookup allowlist (#1420)
- Implement sock_send_to and sock_recv_from system calls (#1457)
- Added http downloader and multicast socket options (#1467)
- Fix `bind()` calls to receive the correct size of `sockaddr` structure (#1490)
- Assert on correct parameters (#1505)
- Copy only received bytes from socket recv buffer into the app buffer (#1497)
Co-authored-by: Marcin Kolny <mkolny@amazon.com>
Co-authored-by: Marcin Kolny <marcin.kolny@gmail.com>
Co-authored-by: Callum Macmillan <callumimacmillan@gmail.com>
And enable classic interpreter instead fast interpreter when llvm jit is enabled,
so as to fix the issue that llvm jit cannot handle opcode drop_64/select_64.
esp-idf: Make esp-idf support Libc WASI
1. Support to get WASM APP libs' DIR from upper layer
2. Add SSP support for esp-idf platform
3. Change the errno of readlinkat
Thread data should not be destroyed when thread exits, or other thread
may not be able to join it. This PR saves the thread data into thread data
list when thread exits, sets thread status and stores the return value, so
that other thread can join it.
Also set MEM_TOP_DOWN flag for Windows VirtualAlloc to yield LLVM
JIT relocation error.
And set opt/size level to 3 for LLVM JIT for future use, currently the flags
are not used by LLVM JIT.
Enhance the hw bound check reported in #1262:
When registering signal handlers for SIGSEGV & SIGBUS in boundary
check with hardware trap, preserve the previous handlers for signal
SIGSEGV and SIGBUS, and forward the signal to the preserved signal
handlers if it isn't handled by hw bound check.
Implement Go binding APIs of runtime, module and instance
Add sample, build scripts and update the document
Co-authored-by: venus-taibai <97893654+venus-taibai@users.noreply.github.com>
Add aot relocation for ".rodata.str" symbol to support more cases
Fix some coding style issues
Fix aot block/value stack destroy issue
Refine classic/fast interpreter codes
Clear compile warning of libc_builtin_wrapper.c in 32-bit platform
Implement Berkeley Socket API for Intel SGX
- bring Berkeley socket API in Intel SGX enclaves,
- adapt the documentation of the socket API to mention Intel SGX enclaves,
- adapt _iwasm_ in the mini-product _linux-sgx_ to support the same option as the one for _linux_,
- tested on the socket sample as provided by WAMR (the TCP client/server).
Refer to [Networking API design](https://github.com/WebAssembly/WASI/issues/370)
and [feat(socket): berkeley socket API v2](https://github.com/WebAssembly/WASI/pull/459):
- Support the socket API of synchronous mode, including `socket/bind/listen/accept/send/recv/close/shutdown`,
the asynchronous mode isn't supported yet.
- Support adding `--addr-pool=<pool1,pool2,..>` argument for command line to identify the valid ip address range
- Add socket-api sample and update the document
In some Linux systems whose kernel version is smaller than 2.6.38, the macro
MADV_HUGEPAGE isn't introduced yet which causes compilation error.
Add macro control to fix the compilation error.
Allow compilation on Windows MinGW, see build_wamr.md for more details.
Note that WASI and some other smallish details are still not supported, but
we have a starting point. See more discussion at #993
Implement pthread_cond_broadcast wrapper for lib-pthread
- support pthread_cond_broadcast wrapper for posix/linux-sgx/windows
- update document for building multi-thread wasm app with emcc
Refactor LLVM Orc JIT to actually enable the lazy compilation and speedup
the launching process:
https://llvm.org/docs/ORCv2.html#laziness
Main modifications:
- Create LLVM module for each wasm function, wrap it with thread safe module
so that the modules can be compiled parallelly
- Lookup function from aot module instance's func_ptrs but not directly call the
function to decouple the module relationship
- Compile the function when it is first called and hasn't been compiled
- Create threads to pre-compile the WASM functions parallelly when loading
- Set Lazy JIT as default, update document and build/test scripts
Currently when calling wasm_runtime_call_wasm() to invoke wasm function
with externref type argument from runtime embedder, developer needs to
use wasm_externref_obj2ref() to convert externref obj into an internal ref
index firstly, which is not convenient to developer.
To align with GC feature in which all the references passed to
wasm_runtime_call_wasm() can be object pointers directly, we change the
interface of wasm_runtime_call_wasm() to allow to pass object pointer
directly for the externref argument, and refactor the related codes, update
the related samples and the document.
The return address of pthread_get_stackaddr_np() in MacOS and NuttX
may be the base address or the end (boundary) address of the native stack,
if it is the end address, we get the base address according to it and the
stack size, so as to get the actual stack boundary address correctly.
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
Fix some issues on MacOS platform
- Enable libc-wasi by default
- Set target abi to "gnu" if it is not set for wamrc to avoid generating
object file of unsupported Mach-O format
- Set `<vendor>-<sys>` info according to target abi for wamrc to support
generating AOT file for other OSs but not current host
- Set cpu name if arch/abi/cpu are not set to avoid checking SIMD
capability failed
- Set size level to 1 for MacOS/Windows platform to avoid relocation type
unsupported warning
- Clear posix_memmap.c compiling warning
- Fix spec case test script issues, enable test spec cases on MacOS
Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
Various fixes and beautifications coordinated with @1c3t3a,
fixes 2 of the 3 all remaining issues from #892:
- enable to os_mmap executable memory
- fix os_malloc/os_realloc/os_free issues
- implement os_thread_get_stack_boundary
- add build scripts to include with esp-idf to use wamr as
an ESP-IDF component
- update sample and document
- use platform independent data types in debug-engine library
- add os_socket APIs and provide windows and posix implementation
- avoid using platform related header files in non-platform layer
- use format specifiers macros for sprintf and sscanf
- change thread handle type from uint64 to korp_tid
- add lock when sending socket packet to avoid thread racing
Use `PRIxxx` related macros to format the output strings so as to clear
compile warnings, e.g. PRIu32, PRId32, PRIX32, PRIX64 and so on.
And add the related macro definitions in platform_common.h if they
are not defined, as some compilers might not support them.
This PR introduces an implementation of the WAMR platform APIs for ESP-IDF and enables support for Espressif microcontrollers, and adds the documentation around how to build WAMR for ESP-IDF.
This PR is related to the following issues at WAMR: closes#883, #628, #449 and #668 as well as [#4735](https://github.com/espressif/esp-idf/issues/4735) at the esp-idf repo. It implements most functions required by [platform_api_vmcore.h](https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/core/shared/platform/include/platform_api_vmcore.h).
The PR works in interpreter mode on Esp32c3 and Esp32. For the AOT mode, currently errors occur on both platforms with `Guru Meditation Error`. It seems that the AOT code isn't run with shared memory as os_mmap() allocates memory with malloc() API, it is to be fixed in the future.
This patch enables huge page support for posix platforms for performance
reason, if the request size to mmap is larger than 2MB, then we use madvise
to set some pages to huge page.
And add macro control to enable tracing the mmap/munmap.
Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
Implement XIP (Execution In Place) feature for AOT mode to enable running the AOT code inside AOT file directly, without memory mapping the executable memory for AOT code and applying relocations for text section. Developer can use wamrc with "--enable-indirect-mode --disable-llvm-intrinsics" flags to generate the AOT file and run iwasm with "--xip" flag. Known issues: there might still be some relocations in the text section which access the ".rodata" like sections.
And also enable ARC target support for both interpreter mode and AOT mode.
Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
Implement wasm_runtime_init_thread_env() for Windows platform by calling os_thread_env_init(): if current thread is created by developer himself but not runtime, developer should call wasm_runtime_init_thread_env() to init the thread environment before calling wasm function, and call wasm_runtime_destroy_thread_env() before thread exits.
And clear compile warnings for Windows platform, fix compile error for AliOS-Things platform
Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
And output detail info when install wasm app failed, update document and fix some compile warnings and errors.
Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
Enable to build wamrc with custom llvm, enable to auto detect processor on apple silicon, and fix some compile warnings.
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
Enable RISCV AOT support, the supported ABIs are LP64 and LP64D for riscv64, ILP32 and ILP32D for riscv32.
For wamrc:
use --target=riscv64/riscv32 to specify the target arch of output AOT file,
use --target-abi=lp64d/lp64/ilp32d/ilp32 to specify the target ABI,
if --target-abi isn't specified, by default lp64d is used for riscv64, and ilp32d is used for riscv32.
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
Co-authored-by: wenyongh <wenyong.huang@intel.com>
Implement Windows thread/mutex/cond related APIs to support Windows multi-thread feature
Change Windows HW boundary check implementation for multi-thread: change SEH to VEH
Fix wasm-c-api issue of getting AOTFunctionInstance by index, fix wasm-c-api compile warnings
Enable to build invokeNative_general.c with cmake variable
Fix several issues in lib-pthread
Disable two LLVM passes in multi-thread mode to reserve volatile semantic
Update docker script and document to build iwasm with Docker image
Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
Some libc APIs required by wasi native lib are missing in some Android API versions, only when the version >= 24, all APIs are supported. Add the missing APIs in android platform layer, leave them empty, report error and return failed if they are called. Also update CMakeLists.txt to enable libc wasi by default.
Co-authored-by: Wenyong Huang <wenyong.huang@intel.com>
Fix sample littlevgl/guil build issues on zephyr platform, modify code of zephyr platform by adding some macros to control some apis to fit different zephyr version, and align to zephyr-v2.3.0 for sample littlevgl/gui as the latest zephyr version (>=v2.4.0) requires that the thread stack for the thread to create must be defined by macro but not allocating stack memory dynamically.
Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
Enable to use BH_VPRINTF macro for platform Linux/Windows/Darwin/VxWorks to redirect the stdout output from platform os_printf/os_vprintf, or the wasi output from wasm app to the vprintf like callback function specified by BH_VPRINTF macro of cmake WAMR_BH_VPRINTF variable.
Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
add more checks to enhance security
clear "wasi proc exit" exception before return to caller in wasm/aot call functions
fix memory profiling issue
change movdqa to movdqu in simd invokeNative asm codes to fix issue of unaligned address access
move setjmp/longjmp from libc-builtin to libc-emcc
fix zephyr platform compilation issue in latest zephyr version
Add the Windows COFF format support to wamr-compiler and iwasm can
load and excute it on Windows(X64) platform.
Signed-off-by: Wu Zhongmin <vwzm@live.com>
Signed-off-by: Xiaokang Qin <xiaokang.qxk@antgroup.com>
Co-authored-by: Wu Zhongmin <vwzm@live.com>
* Add Windows support for C-API and Runtime API libraries and examples.
Signed-off-by: Wu Zhongmin <vwzm@live.com>
Signed-off-by: Xiaokang Qin <xiaokang.qxk@antgroup.com>
* Address the review comments
Signed-off-by: Xiaokang Qin <xiaokang.qxk@antgroup.com>
* Rewrite the the bh_getopt to make it avaliable for more kinds of options
Signed-off-by: Wu Zhongmin <vwzm@live.com>
Signed-off-by: Xiaokang Qin <xiaokang.qxk@antgroup.com>
* Add the license header
Signed-off-by: Xiaokang Qin <xiaokang.qxk@antgroup.com>
Co-authored-by: Zhongmin Wu <vwzm@live.com>
Also implement native stack overflow check with hardware trap for 64-bit platforms
Refine classic interpreter and fast interpreter to improve performance
Update document
Update all links accordingly. Also update links to other repositories
whose branches have renamed.
The references to repositories whose branches have not renamed should be
referencing specific commits anyway, so reference those specific commits
by hash.
* Missing SGX SDK Include fixed
* Update shared_platform.cmake
* CMakeFile remove stdlib from untrusted part
* Added two times in function description zero as possible return value
* Update shared_platform.cmake
Co-authored-by: Joshua Heinemann <heineman@ibr.cs.tu-bs.de>
Co-authored-by: wenyongh <wenyong.huang@intel.com>
Add registration of libc-wasi to 'wasi_snapshot_preview1' to support cargo-wasi
change zephyr build method from cmake to west
fix problem when preserve space for local vars
fix wasi authority problem