wasm-micro-runtime

mirror of https://github.com/bytecodealliance/wasm-micro-runtime.git synced 2025-10-20 16:01:24 +00:00

Author	SHA1	Message	Date
Huang Qi	10b18d85cd	Fix ExpandMemoryOpPass doesn't work properly (#2399 ) The old method may not work for some cases. This PR iterates over all instructions in the function, looking for memcpy, memmove and memset instructions, putting them into a set, and finally expands them into a loop one by one. It also moves this LLVM pass after building the pipeline of the pass builder to ensure that the memcpy/memmove/memset intrinsics are generated before applying the pass.	2023-07-29 10:28:09 +08:00
Wenyong Huang	24c6c6977b	Fix llvm jit failed to lookup aot_stack_sizes symbol issue (#2384 ) LLVM JIT failed to lookup symbol "aot_stack_sizes" as it is an internal symbol, change to lookup "aot_stack_sizes_alias" instead. Reported in #2372.	2023-07-24 15:15:48 +08:00
Cengizhan Pasaoglu	57abdfdb5c	Fix typo (dwarf) in the codebase (#2367 ) In the codebase, the struct and functions were written without "f" for dwarf.	2023-07-19 17:58:52 +08:00
Huang Qi	aafea39b8c	Add "--enable-builtin-intrinsics=<flags>" option to wamrc (#2341 ) Refer to doc/xip.md for details.	2023-07-06 18:20:35 +08:00
YAMAMOTO Takashi	3bbf59ad45	wamrc: Warn on text relocations for XIP (#2340 )	2023-07-05 10:49:45 +08:00
Huang Qi	ae4069df41	Migrate ExpandMemoryOpPass to llvm new pass manager (#2334 ) Fix #2328	2023-07-04 17:17:15 +08:00
YAMAMOTO Takashi	1f89e446d9	Avoid switch lowering to lookup tables for XIP (#2339 ) Because it involves relocations for the table. (.Lswitch.table.XXX) Discussions: https://github.com/bytecodealliance/wasm-micro-runtime/issues/2316	2023-07-04 16:48:32 +08:00
Huang Qi	44f4b4f062	Add "--enable-llvm-passes=<passes>" option to wamrc (#2335 ) Add "--enable-llvm-passes=<passes>" option to wamrc for customizing LLVM passes	2023-07-04 12:20:52 +08:00
YAMAMOTO Takashi	03418ef5ac	aot: Avoid possible relocations around "stack_sizes" for XIP mode (#2322 ) Fixes https://github.com/bytecodealliance/wasm-micro-runtime/issues/2316 Lightly tested on riscv64 qemu.	2023-06-29 18:45:33 +08:00
YAMAMOTO Takashi	5831531449	aot: Move stack_sizes table to a dedicated section (#2317 ) To solve the "AOT module load failed: resolve symbol stack_sizes failed" issue. This PR partly fixes #2312 and was lightly tested on qemu armhf.	2023-06-27 16:18:14 +08:00
Wenyong Huang	ea78b89965	Fix wamrc build issues with LLVM 13 and LLVM 16 (#2313 ) Fix some build errors when building wamrc with LLVM-13, reported in #2311. Fix some build warnings when building wamrc with LLVM-16: ``` core/iwasm/compilation/aot_llvm_extra2.cpp:26:26: warning: ‘llvm::None’ is deprecated: Use std::nullopt instead. [-Wdeprecated-declarations] 26 | return llvm::None; ``` Fix a maybe-uninitialized compile warning: ``` core/iwasm/compilation/aot_llvm.c:413:9: warning: ‘update_top_block’ may be used uninitialized in this function [-Wmaybe-uninitialized] 413 | LLVMPositionBuilderAtEnd(b, update_top_block); ```	2023-06-27 08:59:49 +08:00
YAMAMOTO Takashi	cd7941cc39	AOT/JIT native stack bound check improvement (#2244 ) Move the native stack overflow check from the caller to the callee because the former doesn't work for call_indirect and imported functions. Make the stack usage estimation more accurate. Instead of making a guess from the number of wasm locals in the function, use the LLVM's idea of the stack size of each MachineFunction. The former is inaccurate because a) it doesn't reflect optimization passes, and b) wasm locals are not the only reason to use stack. To use the post-compilation stack usage information without requiring 2-pass compilation or machine-code imm rewriting, introduce a global array to store stack consumption of each function: For JIT, use a custom IRCompiler with an extra pass to fill the array. For AOT, use `clang -fstack-usage` equivalent because we support external llc. Re-implement function call stack usage estimation to reflect the real calling conventions better. (aot_estimate_stack_usage_for_function_call) Re-implement stack estimation logic (--enable-memory-profiling) based on the new machinery. Discussions: #2105.	2023-06-22 07:27:07 +08:00
YAMAMOTO Takashi	92e073b8ce	AOTFuncContext: Remove a stale comment (#2283 )	2023-06-09 22:31:08 +08:00
YAMAMOTO Takashi	cabcb177c8	dwarf_extractor: Constify a bit (#2278 )	2023-06-09 09:52:03 +08:00
YAMAMOTO Takashi	6e3c3fe9ec	Fix build error with LLVM 16 (#2259 )	2023-06-06 13:45:18 +08:00
YAMAMOTO Takashi	5d69f364db	aot/jit: Set module layout (#2260 ) LLVM 15 and later sometimes perform wrong optimizations without this.	2023-06-06 10:18:16 +08:00
Wenyong Huang	8ef09be604	Fix compile error of wamrc with llvm-13/llvm-14 (#2261 )	2023-06-06 08:33:15 +08:00
Wenyong Huang	8d88471c46	Implement AOT static PGO (#2243 ) LLVM PGO (Profile-Guided Optimization) allows the compiler to better optimize code for how it actually runs. This PR implements the AOT static PGO, and is tested on Linux x86-64 and x86-32. The basic steps are: 1. Use `wamrc --enable-llvm-pgo -o <aot_file_of_pgo> <wasm_file>` to generate an instrumented aot file. 2. Compile iwasm with `cmake -DWAMR_BUILD_STATIC_PGO=1` and run `iwasm --gen-prof-file=<raw_profile_file> <aot_file_of_pgo>` to generate the raw profile file. 3. Run `llvm-profdata merge -output=<profile_file> <raw_profile_file>` to merge the raw profile file into the profile file. 4. Run `wamrc --use-prof-file=<profile_file> -o <aot_file> <wasm_file>` to generate the optimized aot file. 5. Run the optimized aot_file: `iwasm <aot_file>`. The test scripts are also added for each benchmark, run `test_pgo.sh` under each benchmark's folder to test the AOT static pgo.	2023-06-05 09:17:39 +08:00
Wenyong Huang	76be848ec3	Implement the segue optimization for LLVM AOT/JIT (#2230 ) Segue is an optimization technology which uses an x86 segment register to store the WebAssembly linear memory base address, so as to remove most of the cost of SFI (Software-based Fault Isolation) base addition and free up a general purpose register. In this way it may: - Improve the performance of JIT/AOT - Reduce the footprint of JIT/AOT, the JIT/AOT code generated is smaller - Reduce the compilation time of JIT/AOT This PR uses the x86-64 GS segment register to apply the optimization, currently it supports linux and linux-sgx platforms on x86-64 target. By default it is disabled, developers can use the options below to enable it for wamrc and iwasm (with LLVM JIT enabled): ```bash wamrc --enable-segue=[<flags>] -o output_file wasm_file iwasm --enable-segue=[<flags>] wasm_file [args...] ``` `flags` can be: i32.load, i64.load, f32.load, f64.load, v128.load, i32.store, i64.store, f32.store, f64.store, v128.store Use comma to separate them, e.g. `--enable-segue=i32.load,i64.store`, and `--enable-segue` means all flags are added. Acknowledgement: Many thanks to Intel Labs, UC San Diego and UT Austin teams for introducing this technology and the great support and guidance! Signed-off-by: Wenyong Huang <wenyong.huang@intel.com> Co-authored-by: Vahldiek-oberwagner, Anjo Lucas <anjo.lucas.vahldiek-oberwagner@intel.com>	2023-05-26 10:13:33 +08:00
YAMAMOTO Takashi	94204b90ad	aot_compile_op_call: Remove a wrong optimization (#2233 ) Unlike a tail-call, the caller of an ordinary recursive call doesn't necessarily return immediately.	2023-05-25 07:44:54 +08:00
YAMAMOTO Takashi	670567f8b3	core/iwasm/compilation: constify a bit (#2223 ) Just to make the code a bit easier to read.	2023-05-20 11:55:02 +08:00
YAMAMOTO Takashi	f759a1f960	A few changes related to WAMRC_LLC_COMPILER (#2218 ) Print `target triple` for wamrc and set target triple for the LLVM module. And update document.	2023-05-17 09:56:35 +08:00
YAMAMOTO Takashi	2b896c80ef	wamrc: Add --stack-usage option (#2158 )	2023-04-28 13:56:44 +08:00
Wenyong Huang	7e9bf9cdf5	Implement Fast JIT multi-threading feature (#2134 ) - Translate all the opcodes of threads spec proposal for Fast JIT - Add the atomic flag for Fast JIT load/store IRs to support atomic load/store - Add new atomic related Fast JIT IRs and translate them in the codegen - Add suspend_flags check in branch opcodes and before/after call function - Modify CI to enable Fast JIT multi-threading test Co-authored-by: TianlongLiang <tianlong.liang@intel.com>	2023-04-20 10:09:34 +08:00
Wenyong Huang	62fc486c20	Refine aot compiler check suspend_flags and fix issue of multi-tier jit (#2111 ) In LLVM AOT/JIT compiler, only need to check the suspend_flags when memory is a shared memory since the shared memory must be enabled for multi-threading, so as not to impact the performance in non-multi-threading memory mode. Also refine the LLVM IRs to check the suspend_flags. And fix an issue of multi-tier jit for multi-threading, the instance of the child thread should be removed from the instance list before it is de-instantiated.	2023-04-07 06:47:24 +08:00
Wenyong Huang	f279ba84ee	Fix multi-threading issues (#2013 ) - Implement atomic.fence to ensure a proper memory synchronization order - Destroy exec_env_singleton first in wasm/aot deinstantiation - Change terminate other threads to wait for other threads in wasm_exec_env_destroy - Fix detach thread in thread_manager_start_routine - Fix duplicated lock cluster->lock in wasm_cluster_cancel_thread - Add lib-pthread and lib-wasi-threads compilation to Windows CI	2023-03-08 10:57:22 +08:00
Wenyong Huang	38c67b3f48	thread-mgr: Fix spread "wasi proc exit" exception and atomic.wait issues (#1988 ) Raising "wasi proc exit" exception, spreading it to other threads and then clearing it in all threads may result in unexpected behavior: the sub thread may end first, handle the "wasi proc exit" exception and clear exceptions of other threads, including the main thread. And when main thread's exception is cleared, it may continue to run and throw "unreachable" exception. This also leads to some assertion failures. Ignore exception spreading for "wasi proc exit" and don't clear exception of other threads to resolve the issue. And add suspend flag check after atomic wait since the atomic wait may be notified by another thread when an exception occurs.	2023-02-24 20:05:39 +08:00
YAMAMOTO Takashi	7d3b2a8773	Make memory profiling show native stack usage (#1917 )	2023-02-01 11:52:15 +08:00
Huang Qi	f818f4c43f	Simplify fcmp intrinsic logic for AOT/XIP (#1881 )	2023-01-12 12:05:53 +08:00
liang.he	7401718311	Report error in instantiation when meeting unlinked import globals (#1859 )	2023-01-06 15:24:11 +08:00
Huang Qi	d5aa354d41	Return result directly if float cmp is called in AOT XIP (#1851 )	2022-12-30 16:45:39 +08:00
tonibofarull	ba5cdbee3a	Fix typo verify_module in aot_compiler.c (#1836 )	2022-12-26 12:24:23 +08:00
Wenyong Huang	14288f59b0	Implement Multi-tier JIT (#1774 ) Implement 2-level Multi-tier JIT engine: tier-up from Fast JIT to LLVM JIT to get quick cold startup by Fast JIT and better performance by gradually switching to LLVM JIT when the LLVM JIT functions are compiled by the backend threads. Refer to: https://github.com/bytecodealliance/wasm-micro-runtime/issues/1302	2022-12-19 11:24:46 +08:00
dongsheng28849455	9083334f69	Fix XIP issue of handling 64-bit const in 32-bit target (#1803 ) - Handle i64 const like f64 const - Ensure i64/f64 const is stored on 8-byte aligned address	2022-12-13 12:45:26 +08:00
Huang Qi	f6bef1e604	Implement i32.rem_s and i32.rem_u intrinsic (#1789 )	2022-12-08 09:38:20 +08:00
Wenyong Huang	1652f22a77	Fix issues reported by Coverity (#1775 ) Fix some issues reported by Coverity and fix windows exception check with guard page issue	2022-12-01 19:24:13 +08:00
Wenyong Huang	ce3458da99	Refine AOT exception check when function return (#1752 ) Refine AOT exception check in the caller when returning from callee function, remove the exception check instructions when hw bound check is enabled to improve the performance: create guard page to trigger signal handler when exception occurs.	2022-11-30 20:18:28 +08:00
Wenyong Huang	96570cca22	Remove unused LLVM JIT wrapper functions (#1747 ) Only create the necessary wrapper functions for LLVM JIT	2022-11-25 11:26:08 +08:00
Wenyong Huang	87c3195d47	Revert "Implement call Fast JIT function from LLVM JIT jitted code" (#1737 ) Reverts bytecodealliance/wasm-micro-runtime#1714, which was merged mistakenly.	2022-11-22 14:04:48 +08:00
Wenyong Huang	cf7b01ad82	Implement call Fast JIT function from LLVM JIT jitted code (#1714 ) Basically implement the Multi-tier JIT engine. And update document and wamr-test-suites script.	2022-11-21 10:42:18 +08:00
Wenyong Huang	6c16ff7654	Update document and clear compile warnings (#1701 ) Update build wasm app document, add how to set buildflags for Rust project to reduce the footprint. Clear Windows warnings and a shadow warning in aot_emit_numberic.c	2022-11-15 15:02:23 +08:00
Wenyong Huang	c70e1ebc3d	Avoid generating some unused LLVM IRs (#1696 ) Refine the generated LLVM IRs at the beginning of each LLVM AOT/JIT function to speed up the LLVM IR optimization: - Only create argv_buf if there are func calls in this function - Only create native stack bound if stack bound check is enabled - Only create aux stack info if there is opcode set_global_aux_stack - Only create native symbol if indirect_mode is enabled - Only create memory info if there are memory operations - Only create func_type_indexes if there is opcode call_indirect	2022-11-14 14:32:35 +08:00
Huang Qi	4b0660cf24	Fix missing float cmp for XIP (#1699 )	2022-11-14 11:58:38 +08:00
Wenyong Huang	7fd37190e8	Add control for the native stack check with hardware trap (#1682 ) Add new options to control the native stack hw bound check feature: - Besides the original option `cmake -DWAMR_DISABLE_HW_BOUND_CHECK=1/0`, add a new option `cmake -DWAMR_DISABLE_STACK_HW_BOUND_CHECK=1/0` - When the linear memory hw bound check is disabled, the stack hw bound check will be disabled automatically, no matter what the input option is - When the linear memory hw bound check is enabled, the stack hw bound check is enabled/disabled according to the value of input option - Besides the original option `--bounds-checks=1/0`, add a new option `--stack-bounds-checks=1/0` for wamrc Refer to: https://github.com/bytecodealliance/wasm-micro-runtime/issues/1677	2022-11-07 18:26:33 +08:00
Huang Qi	c8cacbd883	Add LLVM_BUILD_OP_OR_INTRINSIC to avoid code dup (#1672 )	2022-11-03 11:48:48 +08:00
Wenyong Huang	5b144c491d	Avoid initialize LLVM repeatedly (#1671 ) Currently we initialize and destroy LLVM environment in aot_create_comp_context and aot_destroy_comp_context, which are called in wasm_module_load/unload, and the latter may be invoked multiple times, which leads to duplicated LLVM initialization/destroy and may result in unexpected behaviors. Move the LLVM init/destroy into runtime init/destroy to resolve the issue.	2022-11-02 16:13:58 +08:00
liang.he	f1f6f4a125	Remove unused codes in AOT compiler (#1668 ) Remove the setup of JIT LLVMOrcIRTransformLayerSetTransform and LLVMOrcObjectTransformLayerSetTransform which is commented.	2022-11-02 08:32:16 +08:00
Huang Qi	94cecbe4cb	Fix XIP issues of fp to int cast and int rem/div (#1654 )	2022-11-01 20:29:07 +08:00
dongsheng28849455	e517dbc7b2	XIP adaptation for xtensa platform (#1636 ) Add macro WASM_ENABLE_WORD_ALIGN_READ to enable reading 1/2/4 and n bytes data from vram buffer, which requires 4-byte addr alignment reading. Eliminate XIP AOT relocations related to the below ones: i32_div_u, f32_min, f32_max, f32_ceil, f32_floor, f32_trunc, f32_rint	2022-10-31 17:25:24 +08:00
Wenyong Huang	ef21f0c951	Implement Fast JIT dump call stack and perf profiling (#1633 ) Implement dump call stack and perf profiling features for Fast JIT, and refine some code.	2022-10-27 09:28:32 +08:00
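
The option interaction described in the #1682 entry above (native stack check with hardware trap) can be sketched as a small decision function. This is an illustrative Python sketch only, not code from the repository: the function name and its parameters are hypothetical, and it models nothing beyond the two rules stated in that commit message.

```python
def stack_hw_bound_check_enabled(disable_hw_bound_check: bool,
                                 disable_stack_hw_bound_check: bool) -> bool:
    """Return True if the native stack hw bound check ends up enabled.

    Hypothetical model of the rules in the #1682 commit message:
    - disable_hw_bound_check mirrors WAMR_DISABLE_HW_BOUND_CHECK=1/0
    - disable_stack_hw_bound_check mirrors WAMR_DISABLE_STACK_HW_BOUND_CHECK=1/0
    """
    # Rule 1: with the linear memory hw bound check disabled, the stack
    # hw bound check is disabled too, whatever the stack option says.
    if disable_hw_bound_check:
        return False
    # Rule 2: otherwise the stack option alone decides.
    return not disable_stack_hw_bound_check


if __name__ == "__main__":
    # Print the outcome for every combination of the two options.
    for mem_off in (False, True):
        for stack_off in (False, True):
            enabled = stack_hw_bound_check_enabled(mem_off, stack_off)
            print(f"DISABLE_HW_BOUND_CHECK={int(mem_off)} "
                  f"DISABLE_STACK_HW_BOUND_CHECK={int(stack_off)} "
                  f"-> stack check enabled: {enabled}")
```

The stack option only matters in the two cases where the linear memory check is left enabled; in the other two the result is always disabled.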

181 Commits