In the AOT compiler, allow the user to control the stack boundary check when the
boundary check is enabled (e.g. `wamrc --bounds-checks=1`). The logic is now:
1. When `--stack-bounds-checks` is not set, it takes the same value as `--bounds-checks`.
2. When `--stack-bounds-checks` is set, its value is used regardless of the
status of `--bounds-checks`.
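The two rules above can be sketched as a small resolution function. This is only an illustration: `opt_state_t` and `resolve_stack_bounds_checks` are hypothetical names, not identifiers taken from wamrc.

```c
#include <stdbool.h>

/* Hypothetical tri-state for an option that may be left unset. */
typedef enum { OPT_UNSET = -1, OPT_OFF = 0, OPT_ON = 1 } opt_state_t;

/* Resolve the effective stack boundary check setting: an explicit
   --stack-bounds-checks value wins, otherwise the value of
   --bounds-checks is inherited. */
bool
resolve_stack_bounds_checks(bool bounds_checks,
                            opt_state_t stack_bounds_checks)
{
    if (stack_bounds_checks == OPT_UNSET)
        return bounds_checks;
    return stack_bounds_checks == OPT_ON;
}
```

Keeping an explicit "unset" state is what lets the second option default to the first without losing the ability to override it in either direction.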
Make wamrc normalize "arm64" to "aarch64v8". Previously the only way to
produce the "arm64" target was to not specify a target on 64-bit Arm-based
Mac builds. Now arm64 and aarch64v8 are treated as the same.
Make aot_loader accept "aarch64v8" on Arm-based Apple platforms (as well as
accepting legacy "arm64" based AOT targets).
This also removes __APPLE__ and __MACH__ from the block that defaults
size_level to 1, since that doesn't seem to be supported for aarch64:
`LLVM ERROR: Only small, tiny and large code models are allowed on AArch64`
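The normalization described above amounts to treating "arm64" as an alias. A minimal sketch, assuming a hypothetical helper name `normalize_target_arch` (the actual wamrc code may be organized differently):

```c
#include <string.h>

/* Hypothetical helper: map the "arm64" alias to "aarch64v8" and
   leave every other target name (including NULL) untouched. */
const char *
normalize_target_arch(const char *arch)
{
    if (arch != NULL && strcmp(arch, "arm64") == 0)
        return "aarch64v8";
    return arch;
}
```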
For JIT, we naturally use Mach-O on macOS, where the section name
we currently use is not valid and results in errors like:
```
LLVM ERROR: Global variable '__orc_lcl.aot_stack_sizes.0' has an invalid section specifier '.aot_stack_sizes': mach-o section specifier requires a segment and section separated by a comma.
```
Because the dedicated section is not necessary for JIT,
this commit simply stops using it.
Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/3730
The table index in the call_indirect/return_call_indirect opcode should be
one byte 0x00 when ref-types/GC isn't enabled, and should be treated as
leb u32 when ref-types/GC is enabled.
Also make the AOT compiler bail out if ref-types/GC is disabled by a command line
argument while ref-types instructions are used.
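The decoding rule for the table index can be sketched as follows. The function names and the `ref_types_or_gc` parameter are hypothetical and only illustrate the rule; they are not the loader's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decode an unsigned LEB128 u32 starting at buf[*pos]. Returns false
   on truncated input or on a value that does not fit in 32 bits. */
bool
read_leb_u32(const uint8_t *buf, size_t len, size_t *pos, uint32_t *out)
{
    uint32_t result = 0;
    unsigned shift = 0;

    while (*pos < len && shift < 35) {
        uint8_t byte = buf[(*pos)++];
        /* The 5th byte may only carry the top 4 bits of a u32 and
           must not have the continuation bit set. */
        if (shift == 28 && (byte & 0xf0) != 0)
            return false;
        result |= (uint32_t)(byte & 0x7f) << shift;
        if ((byte & 0x80) == 0) {
            *out = result;
            return true;
        }
        shift += 7;
    }
    return false;
}

/* Read the table index operand of call_indirect/return_call_indirect. */
bool
read_table_index(const uint8_t *buf, size_t len, size_t *pos,
                 bool ref_types_or_gc, uint32_t *table_idx)
{
    if (ref_types_or_gc)
        return read_leb_u32(buf, len, pos, table_idx);

    /* Without ref-types/GC the operand is the single byte 0x00. */
    if (*pos >= len || buf[*pos] != 0x00)
        return false;
    (*pos)++;
    *table_idx = 0;
    return true;
}
```

With the bytes `0x80 0x01`, the sketch yields table index 128 when ref-types/GC is enabled and rejects the input otherwise.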
Support getting `wasm_memory_type_t memory_type` from the APIs
`wasm_runtime_get_import_type` and `wasm_runtime_get_export_type`,
and then getting the shared flag, initial page count and maximum page count
from the memory_type:
```C
bool
wasm_memory_type_get_shared(const wasm_memory_type_t memory_type);
uint32_t
wasm_memory_type_get_init_page_count(const wasm_memory_type_t memory_type);
uint32_t
wasm_memory_type_get_max_page_count(const wasm_memory_type_t memory_type);
```
- Add new API wasm_runtime_load_ex() in wasm_export.h
and wasm_module_new_ex in wasm_c_api.h
- Put aot_create_perf_map() into a separate file aot_perf_map.c
- In perf.map, function names include the user-specified module name
- Enhance the script to help with flamegraph generation
Fix the warnings and issues reported:
- on the Windows platform
- by CodeQL static code analysis
- by Coverity static code analysis
And update the CodeQL script to build the exception handling and memory features.
The stack profiler `aot_func#xxx` calls the wrapped function `aot_func_internal#xxx`
by symbol reference, but on some platforms such as xtensa this is translated into a native
long call, which needs a relocation to resolve the indirect address and so breaks the XIP
feature, which requires relocations to be eliminated.
The solution is to change the symbol reference into an indirect call through the lookup
table. The generated code looks like this:
```llvm
call_wrapped_func: ; preds = %stack_bound_check_block
%func_addr1 = getelementptr inbounds ptr, ptr %func_ptrs_ptr, i32 75
%func_tmp2 = load ptr, ptr %func_addr1, align 4
tail call void %func_tmp2(ptr %exec_env)
ret void
```
Implement the GC (Garbage Collection) feature for interpreter mode,
AOT mode and LLVM-JIT mode, and support most features of the latest
spec proposal, and also enable the stringref feature.
Use `cmake -DWAMR_BUILD_GC=1/0` to enable/disable the feature,
and `wamrc --enable-gc` to generate the AOT file with GC supported.
And update the AOT file version from 2 to 3 since there are many AOT
ABI breaks, including the changes of AOT file format, the changes of
AOT module/memory instance layouts, the AOT runtime APIs for the
AOT code to invoke and so on.
This increases the chance to use "short" calls.
Assumptions:
- LLVM preserves the order of functions in a module
- The wrapper functions are smaller than the wrapped functions
- The target CPU has "short" PC-relative variations of call/jmp instructions
and they are preferable over the "long" ones.
A motivation:
- To avoid some relocations for XIP, I want to use xtensa PC-relative
call instructions, which can only reach ~512KB.
Allow invoking the quick call entry wasm_runtime_quick_invoke_c_api_import to
call the wasm-c-api import functions, which speeds up the calling process by
reducing data copying.
Use `wamrc --invoke-c-api-import` to generate the optimized AOT code, and set
`jit_options->quick_invoke_c_api_import` to true in wasm_engine_new when LLVM JIT
is enabled.
And refactor the original perf support
- use WAMR_BUILD_LINUX_PERF as the cmake compilation control
- use WASM_ENABLE_LINUX_PERF as the compiler macro
- use `wamrc --enable-linux-perf` to generate aot file which contains fp operations
- use `iwasm --enable-linux-perf` to create perf map for `perf record`
Error is reported when executing `wamrc --target=thumb -o <aot_file> <wasm_file>`:
```
LLVM ERROR: failed to perform tail call elimination on a call site marked musttail
Aborted (core dumped)
```
Set `abi` to "gnu" for the bare-metal target when `abi` is NULL,
otherwise the `bh_assert` and `bh_memcpy` below may dereference a NULL
pointer. An error is reported when running wamrc compiled with
`cmake .. -DCMAKE_BUILD_TYPE=Debug`:
```
core/iwasm/compilation/aot_llvm.c:2584:13: runtime error:
null pointer passed as argument 1, which is declared to never be null
```
Set the vendor-sys of bare-metal targets to "-unknown-none-",
and currently only add "thumbxxx" to the bare-metal target list.
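A minimal sketch of how such a triple could be assembled with the NULL `abi` fallback. The helper `build_bare_metal_triple` and the arch string used in the usage line are hypothetical; this is not the code in aot_llvm.c.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<arch>-unknown-none-<abi>", defaulting
   abi to "gnu" when it is NULL. Returns 0 on success, or -1 if the
   result does not fit into out. */
int
build_bare_metal_triple(const char *arch, const char *abi, char *out,
                        size_t out_size)
{
    int n;

    if (abi == NULL)
        abi = "gnu"; /* avoid passing NULL to later string operations */

    n = snprintf(out, out_size, "%s-unknown-none-%s", arch, abi);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

For example, `build_bare_metal_triple("thumbv7em", NULL, buf, sizeof(buf))` stores `thumbv7em-unknown-none-gnu` in `buf`.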
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
Adapt API usage to new interfaces where applicable, including LLVM function
usage, obsoleted llvm::Optional type and removal of unavailable headers.
Known issues:
- AOT static PGO isn't enabled
- LLVM JIT may fail to run because llvm_orc_registerEHFrameSectionWrapper
isn't linked into iwasm
- Fix the Windows wamrc link error: aot_generate_tempfile_name undefined.
- Clear Windows compile warnings.
- Rename the folders `samples/bh_atomic` and `samples/mem_allocator` to
`samples/bh-atomic` and `samples/mem-allocator`.
Fix some build errors when building wamrc with LLVM-13, reported in #2311
Fix some build warnings when building wamrc with LLVM-16:
```
core/iwasm/compilation/aot_llvm_extra2.cpp:26:26: warning:
‘llvm::None’ is deprecated: Use std::nullopt instead. [-Wdeprecated-declarations]
26 | return llvm::None;
```
Fix a maybe-uninitialized compile warning:
```
core/iwasm/compilation/aot_llvm.c:413:9: warning:
‘update_top_block’ may be used uninitialized in this function [-Wmaybe-uninitialized]
413 | LLVMPositionBuilderAtEnd(b, update_top_block);
```
Move the native stack overflow check from the caller to the callee because the
former doesn't work for call_indirect and imported functions.
Make the stack usage estimation more accurate. Instead of making a guess from
the number of wasm locals in the function, use LLVM's idea of the stack size
of each MachineFunction. The former is inaccurate because a) it doesn't reflect
optimization passes, and b) wasm locals are not the only reason to use stack.
To use the post-compilation stack usage information without requiring 2-pass
compilation or machine-code imm rewriting, introduce a global array to store
the stack consumption of each function:
For JIT, use a custom IRCompiler with an extra pass to fill the array.
For AOT, use `clang -fstack-usage` equivalent because we support external llc.
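The callee-side check against a per-function stack size table can be pictured roughly as below. This is a sketch only, assuming a downward-growing native stack; `callee_stack_check` and its parameters are hypothetical and do not reproduce the generated code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callee-side check: stack_sizes[] holds the stack
   consumption of each function, filled in after code generation.
   Returns true when the function at func_idx can run without its
   frame crossing stack_boundary. */
bool
callee_stack_check(uintptr_t current_sp, uintptr_t stack_boundary,
                   const uint32_t *stack_sizes, uint32_t func_idx)
{
    uint32_t need = stack_sizes[func_idx];

    if (current_sp < stack_boundary)
        return false; /* already past the boundary */
    return current_sp - stack_boundary >= need;
}
```

Because the check runs in the callee and reads the table by function index, it applies equally to direct calls, call_indirect and calls coming from outside the module.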
Re-implement function call stack usage estimation to reflect the real calling
conventions better. (aot_estimate_stack_usage_for_function_call)
Re-implement stack estimation logic (--enable-memory-profiling) based on the new
machinery.
Discussions: #2105.
LLVM PGO (Profile-Guided Optimization) allows the compiler to better optimize code
for how it actually runs. This PR implements the AOT static PGO, and is tested on
Linux x86-64 and x86-32. The basic steps are:
1. Use `wamrc --enable-llvm-pgo -o <aot_file_of_pgo> <wasm_file>`
to generate an instrumented aot file.
2. Compile iwasm with `cmake -DWAMR_BUILD_STATIC_PGO=1` and run
`iwasm --gen-prof-file=<raw_profile_file> <aot_file_of_pgo>`
to generate the raw profile file.
3. Run `llvm-profdata merge -output=<profile_file> <raw_profile_file>`
to merge the raw profile file into the profile file.
4. Run `wamrc --use-prof-file=<profile_file> -o <aot_file> <wasm_file>`
to generate the optimized aot file.
5. Run the optimized aot_file: `iwasm <aot_file>`.
Test scripts are also added for each benchmark; run `test_pgo.sh` under
each benchmark's folder to test the AOT static PGO.
Segue is an optimization technique that uses an x86 segment register to store
the WebAssembly linear memory base address, so as to remove most of the cost
of the SFI (Software-based Fault Isolation) base addition and free up a
general-purpose register. In this way it may:
- Improve the performance of JIT/AOT
- Reduce the footprint of JIT/AOT, the JIT/AOT code generated is smaller
- Reduce the compilation time of JIT/AOT
This PR uses the x86-64 GS segment register to apply the optimization. Currently
it supports the linux and linux-sgx platforms on the x86-64 target. It is disabled
by default; developers can use the options below to enable it for wamrc and iwasm
(with LLVM JIT enabled):
```bash
wamrc --enable-segue=[<flags>] -o output_file wasm_file
iwasm --enable-segue=[<flags>] wasm_file [args...]
```
`flags` can be:
i32.load, i64.load, f32.load, f64.load, v128.load,
i32.store, i64.store, f32.store, f64.store, v128.store
Use commas to separate them, e.g. `--enable-segue=i32.load,i64.store`;
a bare `--enable-segue` means all flags are added.
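As an illustration of the flag syntax, the sketch below parses the option value into a bitmask. The bit assignment and the names `segue_flag_names` and `parse_segue_flags` are assumptions made for this example, not the actual wamrc/iwasm implementation.

```c
#include <stdint.h>
#include <string.h>

/* Flag names as listed above; the bit positions are an assumption. */
static const char *const segue_flag_names[] = {
    "i32.load",  "i64.load",  "f32.load",  "f64.load",  "v128.load",
    "i32.store", "i64.store", "f32.store", "f64.store", "v128.store",
};
#define SEGUE_FLAG_COUNT 10u
#define SEGUE_FLAGS_ALL ((1u << SEGUE_FLAG_COUNT) - 1u)

/* Parse the value of --enable-segue=[<flags>] into *mask.
   A NULL or empty value selects all flags. Returns 0 on success,
   -1 when an unknown flag name is found. */
int
parse_segue_flags(const char *arg, uint32_t *mask)
{
    uint32_t result = 0;

    if (arg == NULL || *arg == '\0') {
        *mask = SEGUE_FLAGS_ALL;
        return 0;
    }

    while (*arg != '\0') {
        size_t len = strcspn(arg, ","); /* length of the current token */
        unsigned i;

        for (i = 0; i < SEGUE_FLAG_COUNT; i++) {
            if (strlen(segue_flag_names[i]) == len
                && strncmp(arg, segue_flag_names[i], len) == 0) {
                result |= 1u << i;
                break;
            }
        }
        if (i == SEGUE_FLAG_COUNT)
            return -1; /* unknown or empty flag name */

        arg += len;
        if (*arg == ',')
            arg++;
    }

    *mask = result;
    return 0;
}
```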
Acknowledgement:
Many thanks to Intel Labs, UC San Diego and UT Austin teams for introducing this
technology and the great support and guidance!
Signed-off-by: Wenyong Huang <wenyong.huang@intel.com>
Co-authored-by: Vahldiek-oberwagner, Anjo Lucas <anjo.lucas.vahldiek-oberwagner@intel.com>