compilation on macos / build_samples_wasm_c_api ($CLASSIC_INTERP_BUILD_OPTIONS, macos-13, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-20/wasi-sdk-20.0-macos.tar.gz) (push) Has been cancelled
compilation on macos / build_samples_wasm_c_api ($FAST_INTERP_BUILD_OPTIONS, macos-13, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-20/wasi-sdk-20.0-macos.tar.gz) (push) Has been cancelled
compilation on macos / build_samples_others (${{ needs.build_llvm_libraries_on_arm_macos.outputs.cache_key }}, macos-14, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-s… (push) Has been cancelled
compilation on macos / build_samples_others (${{ needs.build_llvm_libraries_on_intel_macos.outputs.cache_key }}, macos-13, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi… (push) Has been cancelled
- Implement TINY / STANDARD frame modes - tiny mode is only able to keep track on the IP
and func idx, STANDARD mode provides more capabilities (parameters, stack pointer etc.).
- Implement FRAME_PER_FUNCTION / FRAME_PER_CALL modes - frame per function adds
code at the beginning and at the end of each function for allocating / deallocating stack frame,
whereas in per-call mode the frame is allocated before each call. The exception is call to
the imported function, where frame-per-function mode also allocates the stack before the
`call` instruction (as it can't instrument the imported function).
At the moment TINY + FRAME_PER_FUNCTION is automatically enabled in case GC and perf
profiling are disabled and `values` call stack feature is not requested. In all the other cases
STANDARD + FRAME_PER_CALL is used.
STANDARD + FRAME_PER_FUNCTION and TINY + FRAME_PER_CALL are currently not
implemented but possible, and might be enabled in the future.
ps. https://github.com/bytecodealliance/wasm-micro-runtime/issues/3758
compilation on macos / build_samples_wasm_c_api ($CLASSIC_INTERP_BUILD_OPTIONS, macos-13, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-20/wasi-sdk-20.0-macos.tar.gz) (push) Has been cancelled
compilation on macos / build_samples_wasm_c_api ($FAST_INTERP_BUILD_OPTIONS, macos-13, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-20/wasi-sdk-20.0-macos.tar.gz) (push) Has been cancelled
compilation on macos / build_samples_others (${{ needs.build_llvm_libraries_on_arm_macos.outputs.cache_key }}, macos-14, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-s… (push) Has been cancelled
compilation on macos / build_samples_others (${{ needs.build_llvm_libraries_on_intel_macos.outputs.cache_key }}, macos-13, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi… (push) Has been cancelled
compilation on macos / build_samples_wasm_c_api ($CLASSIC_INTERP_BUILD_OPTIONS, macos-13, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-20/wasi-sdk-20.0-macos.tar.gz) (push) Has been cancelled
compilation on macos / build_samples_wasm_c_api ($FAST_INTERP_BUILD_OPTIONS, macos-13, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-20/wasi-sdk-20.0-macos.tar.gz) (push) Has been cancelled
compilation on macos / build_samples_others (${{ needs.build_llvm_libraries_on_arm_macos.outputs.cache_key }}, macos-14, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-s… (push) Has been cancelled
compilation on macos / build_samples_others (${{ needs.build_llvm_libraries_on_intel_macos.outputs.cache_key }}, macos-13, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi… (push) Has been cancelled
In the AOT compiler, allow the user to control stack boundary check when the boundary
check is enabled (e.g. `wamrc --bounds-checks=1`). Now the code logic is:
1. When `--stack-bounds-checks` is not set, it will be the same value as `--bounds-checks`.
2. When `--stack-bounds-checks` is set, it will be the option value no matter what the
status of `--bounds-checks` is.
compilation on macos / build_samples_wasm_c_api ($CLASSIC_INTERP_BUILD_OPTIONS, macos-13, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-20/wasi-sdk-20.0-macos.tar.gz) (push) Has been cancelled
compilation on macos / build_samples_wasm_c_api ($FAST_INTERP_BUILD_OPTIONS, macos-13, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-20/wasi-sdk-20.0-macos.tar.gz) (push) Has been cancelled
compilation on macos / build_samples_others (${{ needs.build_llvm_libraries_on_arm_macos.outputs.cache_key }}, macos-14, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-s… (push) Has been cancelled
compilation on macos / build_samples_others (${{ needs.build_llvm_libraries_on_intel_macos.outputs.cache_key }}, macos-13, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi… (push) Has been cancelled
Make wamrc normalize "arm64" to "aarch64v8". Previously the only way to
make the "arm64" target was to not specify a target on 64 bit arm-based
mac builds. Now arm64 and aarch64v8 are treated as the same.
Make aot_loader accept "aarch64v8" on arm-based apple (as well as
accepting legacy "arm64" based aot targets).
This also removes __APPLE__ and __MACH__ from the block that defaults
size_level to 1 since it doesn't seem to be supported for aarch64:
`LLVM ERROR: Only small, tiny and large code models are allowed on AArch64`
For JIT, we naturally use mach-o on macOS, where the section name
we currently use is not valid and ends up with the errors like:
```
LLVM ERROR: Global variable '__orc_lcl.aot_stack_sizes.0' has an invalid section specifier '.aot_stack_sizes': mach-o section specifier requires a segment and section separated by a comma.
```
Because the dedicated section is not necessary for JIT,
this commit simply stops using it.
Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/3730
The table index in the call_indirect/return_call_indirect opcode should be
one byte 0x00 when ref-types/GC isn't enabled, and should be treated as
leb u32 when ref-types/GC is enabled.
And make aot compiler bail out if ref-types/GC is disabled by command line
argument while ref-types instructions are used.
compilation on macos / build_samples_wasm_c_api ($CLASSIC_INTERP_BUILD_OPTIONS, macos-13, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-20/wasi-sdk-20.0-macos.tar.gz) (push) Has been cancelled
compilation on macos / build_samples_wasm_c_api ($FAST_INTERP_BUILD_OPTIONS, macos-13, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-20/wasi-sdk-20.0-macos.tar.gz) (push) Has been cancelled
compilation on macos / build_samples_others (${{ needs.build_llvm_libraries_on_arm_macos.outputs.cache_key }}, macos-14, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-s… (push) Has been cancelled
compilation on macos / build_samples_others (${{ needs.build_llvm_libraries_on_intel_macos.outputs.cache_key }}, macos-13, https://github.com/WebAssembly/wabt/releases/download/1.0.31/wabt-1.0.31-macos-12.tar.gz, https://github.com/WebAssembly/wasi-sdk/releases/download/wasi… (push) Has been cancelled
Support to get `wasm_memory_type_t memory_type` from API
`wasm_runtime_get_import_type` and `wasm_runtime_get_export_type`,
and then get shared flag, initial page cout, maximum page count
from the memory_type:
```C
bool
wasm_memory_type_get_shared(const wasm_memory_type_t memory_type);
uint32_t
wasm_memory_type_get_init_page_count(const wasm_memory_type_t memory_type);
uint32_t
wasm_memory_type_get_max_page_count(const wasm_memory_type_t memory_type);
```
- Add new API wasm_runtime_load_ex() in wasm_export.h
and wasm_module_new_ex in wasm_c_api.h
- Put aot_create_perf_map() into a separated file aot_perf_map.c
- In perf.map, function names include user specified module name
- Enhance the script to help flamegraph generations
Fix the warnings and issues reported:
- in Windows platform
- by CodeQL static code analyzing
- by Coverity static code analyzing
And update CodeQL script to build exception handling and memory features.
The stack profiler `aot_func#xxx` calls the wrapped function of `aot_func_internal#xxx`
by using symbol reference, but in some platform like xtensa, it’s translated into a native
long call, which needs to resolve the indirect address by relocation and breaks the XIP
feature which requires the eliminating of relocation.
The solution is to change the symbol reference into an indirect call through the lookup
table, the code will be like this:
```llvm
call_wrapped_func: ; preds = %stack_bound_check_block
%func_addr1 = getelementptr inbounds ptr, ptr %func_ptrs_ptr, i32 75
%func_tmp2 = load ptr, ptr %func_addr1, align 4
tail call void %func_tmp2(ptr %exec_env)
ret void
```
Implement the GC (Garbage Collection) feature for interpreter mode,
AOT mode and LLVM-JIT mode, and support most features of the latest
spec proposal, and also enable the stringref feature.
Use `cmake -DWAMR_BUILD_GC=1/0` to enable/disable the feature,
and `wamrc --enable-gc` to generate the AOT file with GC supported.
And update the AOT file version from 2 to 3 since there are many AOT
ABI breaks, including the changes of AOT file format, the changes of
AOT module/memory instance layouts, the AOT runtime APIs for the
AOT code to invoke and so on.
This increases the chance to use "short" calls.
Assumptions:
- LLVM preserves the order of functions in a module
- The wrapper function are smaller than the wrapped functions
- The target CPU has "short" PC-relative variation of call/jmp instructions
and they are preferrable over the "long" ones.
A motivation:
- To avoid some relocations for XIP, I want to use xtensa PC-relative
call instructions, which can only reach ~512KB.
Allow to invoke the quick call entry wasm_runtime_quick_invoke_c_api_import to
call the wasm-c-api import functions to speedup the calling process, which reduces
the data copying.
Use `wamrc --invoke-c-api-import` to generate the optimized AOT code, and set
`jit_options->quick_invoke_c_api_import` true in wasm_engine_new when LLVM JIT
is enabled.
And refactor the original perf support
- use WAMR_BUILD_LINUX_PERF as the cmake compilation control
- use WASM_ENABLE_LINUX_PERF as the compiler macro
- use `wamrc --enable-linux-perf` to generate aot file which contains fp operations
- use `iwasm --enable-linux-perf` to create perf map for `perf record`
Error is reported when executing `wamrc --target=thumb -o <aot_file> <wasm_file>`:
```
LLVM ERROR: failed to perform tail call elimination on a call site marked musttail
Aborted (core dumped)
```
Set `abi` to "gnu" for the bare-metal target when `abi` is NULL,
or the below `bh_assert` and `bh_memcpy` may deference a NULL
pointer. Error is reported when running wamrc compiled with
`cmake .. -DCMAKE_BUILD_TYPE=Debug`:
```
core/iwasm/compilation/aot_llvm.c:2584:13: runtime error:
null pointer passed as argument 1, which is declared to never be null
```
Set the vendor-sys of bare-metal targets to "-unknown-none-",
and currently only add "thumbxxx" to the bare-metal target list.
Signed-off-by: Huang Qi <huangqi3@xiaomi.com>
Adapt API usage to new interfaces where applicable, including LLVM function
usage, obsoleted llvm::Optional type and removal of unavailable headers.
Know issues:
- AOT static PGO isn't enabled
- LLVM JIT may run failed due to llvm_orc_registerEHFrameSectionWrapper
isn't linked into iwasm
- Fix windows wamrc link error: aot_generate_tempfile_name undefined.
- Clear windows compile warnings.
- And rename folder `samples/bh_atomic` and `samples/mem_allocator` to
`samples/bh-atomic` and `samples/mem-allocator`.
Fix some build errors when building wamrc with LLVM-13, reported in #2311
Fix some build warnings when building wamrc with LLVM-16:
```
core/iwasm/compilation/aot_llvm_extra2.cpp:26:26: warning:
‘llvm::None’ is deprecated: Use std::nullopt instead. [-Wdeprecated-declarations]
26 | return llvm::None;
```
Fix a maybe-uninitialized compile warning:
```
core/iwasm/compilation/aot_llvm.c:413:9: warning:
‘update_top_block’ may be used uninitialized in this function [-Wmaybe-uninitialized]
413 | LLVMPositionBuilderAtEnd(b, update_top_block);
```
Move the native stack overflow check from the caller to the callee because the
former doesn't work for call_indirect and imported functions.
Make the stack usage estimation more accurate. Instead of making a guess from
the number of wasm locals in the function, use the LLVM's idea of the stack size
of each MachineFunction. The former is inaccurate because a) it doesn't reflect
optimization passes, and b) wasm locals are not the only reason to use stack.
To use the post-compilation stack usage information without requiring 2-pass
compilation or machine-code imm rewriting, introduce a global array to store
stack consumption of each functions:
For JIT, use a custom IRCompiler with an extra pass to fill the array.
For AOT, use `clang -fstack-usage` equivalent because we support external llc.
Re-implement function call stack usage estimation to reflect the real calling
conventions better. (aot_estimate_stack_usage_for_function_call)
Re-implement stack estimation logic (--enable-memory-profiling) based on the new
machinery.
Discussions: #2105.
LLVM PGO (Profile-Guided Optimization) allows the compiler to better optimize code
for how it actually runs. This PR implements the AOT static PGO, and is tested on
Linux x86-64 and x86-32. The basic steps are:
1. Use `wamrc --enable-llvm-pgo -o <aot_file_of_pgo> <wasm_file>`
to generate an instrumented aot file.
2. Compile iwasm with `cmake -DWAMR_BUILD_STATIC_PGO=1` and run
`iwasm --gen-prof-file=<raw_profile_file> <aot_file_of_pgo>`
to generate the raw profile file.
3. Run `llvm-profdata merge -output=<profile_file> <raw_profile_file>`
to merge the raw profile file into the profile file.
4. Run `wamrc --use-prof-file=<profile_file> -o <aot_file> <wasm_file>`
to generate the optimized aot file.
5. Run the optimized aot_file: `iwasm <aot_file>`.
The test scripts are also added for each benchmark, run `test_pgo.sh` under
each benchmark's folder to test the AOT static pgo.