wasm-micro-runtime

mirror of https://github.com/bytecodealliance/wasm-micro-runtime.git synced 2025-10-20 07:51:09 +00:00

Author	SHA1	Message	Date
liang.he	99bbad8cdb	perf profiling: Adjust the calculation of execution time (#3089 )	2024-01-26 18:06:21 +08:00
Wenyong Huang	313ce8cb61	Fix memory/table segment checks in memory.init/table.init (#3081 ) According to the wasm core spec, the checks for the table segments in `table.init` opcode are similar to the checks for `memory.init` opcode: - The size of a passive segment is shrunk to zero after `data.drop` (or `elem.drop`) opcode is executed, and the segment can be used to do `memory.init` (or `table.init`) again - The `memory.init` only traps when `s+n > len(data.data)` or `d+n > len(mem.data)` and `table.init` only traps when `s+n > len(elem.elem)` or `d+n > len(tab.elem)` - The active segment can also be used to do `memory.init` (or `table.init`), while it behaves like a dropped passive segment https://github.com/WebAssembly/bulk-memory-operations/blob/master/proposals/bulk-memory-operations/Overview.md ``` Segments can also be shrunk to size zero by using the following new instructions: - data.drop: discard the data in an data segment - elem.drop: discard the data in an element segment An active segment is equivalent to a passive segment, but with an implicit memory.init followed by a data.drop (or table.init followed by a elem.drop) that is prepended to the module's start function. ``` ps. https://webassembly.github.io/spec/core/bikeshed/#-hrefsyntax-instr-memorymathsfmemoryinitx%E2%91%A0 https://webassembly.github.io/spec/core/bikeshed/#-hrefsyntax-instr-tablemathsftableinitxy%E2%91%A0 https://github.com/bytecodealliance/wasm-micro-runtime/issues/3020	2024-01-26 09:45:59 +08:00
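The trap condition quoted in the commit above can be sketched in C as follows. This is an illustrative helper only, not WAMR's actual code: the function name `init_should_trap` and its parameters are hypothetical. It assumes a dropped passive segment (or an active one) is represented by a length of zero, as the commit message describes.

```c
#include <assert.h> /* assert() is used in the usage note below */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not taken from WAMR: returns true when
 * memory.init (or table.init) must trap.
 *   s       - source offset inside the data/element segment
 *   d       - destination offset inside the memory/table
 *   n       - number of bytes/elements to copy
 *   seg_len - current segment length (assumed to be 0 once the segment
 *             has been dropped, or if it was an active segment)
 *   dst_len - current length of the destination memory/table */
bool
init_should_trap(uint32_t s, uint32_t d, uint32_t n, uint64_t seg_len,
                 uint64_t dst_len)
{
    /* Widen to 64 bits so that s + n and d + n cannot wrap around. */
    return (uint64_t)s + n > seg_len || (uint64_t)d + n > dst_len;
}
```

Under these assumptions, `assert(!init_should_trap(0, 0, 0, 0, 65536))` holds: a zero-length init on a dropped segment does not trap, while any non-zero `n` on that segment does.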
liang.he	5c8b8a17a6	Enhancements on wasm function execution time statistic (#2985 ) Enhance the statistic of wasm function execution time, or the performance profiling feature: - Add os_time_thread_cputime_us() to get the cputime of a thread, and use it to calculate the execution time of a wasm function - Support the statistic of the children execution time of a function, and dump it in wasm_runtime_dump_perf_profiling - Expose two APIs: wasm_runtime_sum_wasm_exec_time wasm_runtime_get_wasm_func_exec_time And rename os_time_get_boot_microsecond to os_time_get_boot_us.	2024-01-17 09:51:54 +08:00
YAMAMOTO Takashi	18529253d8	interpreter: Simplify memory.grow a bit (#2899 )	2023-12-12 20:24:51 +08:00
Yage Hu	ef0cd22119	Fix memory size not updating after growing in interpreter (#2898 ) This commit fixes linear memory size not updating after growing. This causes `memory.fill` to throw an exception after `memory.grow`.	2023-12-12 08:36:59 +08:00
Maks Litskevich	63696ba603	Fix typo in CI config and suppress STORE_U8 in TSAN (#2802 ) This typo prevented sanitizers from working in the CI.	2023-12-11 09:16:30 +08:00
Enrico Loparco	0455071fc1	Access linear memory size atomically (#2834 ) Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/2804	2023-11-29 20:27:17 +08:00
Huang Qi	0b29904f26	Fix configurable bounds checks typo (#2809 )	2023-11-21 17:32:45 +08:00
TianlongLiang	a57e70016a	Fix memory.init opcode issue in fast-interp (#2798 ) Fix the fast interpreter not throwing the OOB exception correctly in some scenarios. Reported in #2797.	2023-11-20 16:25:43 +08:00
YAMAMOTO Takashi	562a5dd1b6	Fix data/elem drop (#2747 ) Currently, the `data.drop` instruction is implemented by directly modifying the underlying module. This breaks use cases where multiple instances share a single loaded module. `elem.drop` has the same problem. This PR fixes the issue by keeping track of which data/elem segments have been dropped, using a separate bitmap for each module instance, and adds a sample to demonstrate the issue and has the CI run it. Also add a missing check of dropped elements to the fast-jit `table.init`. Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/2735 Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/2772	2023-11-18 08:50:16 +08:00
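A minimal sketch of the per-instance tracking idea described in the commit above, assuming one bit per segment. The type `DroppedBitmap` and all function names are hypothetical and do not mirror WAMR's internal structures.

```c
#include <assert.h> /* assert() is used in the usage note below */
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-instance record of dropped segments (one bit each).
 * The shared module itself is never modified. */
typedef struct {
    uint32_t seg_count;
    uint8_t *bits;
} DroppedBitmap;

/* Allocate a zeroed bitmap: no segment is dropped at instantiation. */
bool
dropped_bitmap_init(DroppedBitmap *bm, uint32_t seg_count)
{
    size_t bytes = ((size_t)seg_count + 7) / 8;
    bm->seg_count = seg_count;
    /* Allocate at least one byte so a NULL result always means failure. */
    bm->bits = calloc(bytes ? bytes : 1, 1);
    return bm->bits != NULL;
}

/* Record that data.drop / elem.drop ran on segment idx in this instance. */
void
dropped_bitmap_set(DroppedBitmap *bm, uint32_t idx)
{
    if (idx < bm->seg_count)
        bm->bits[idx / 8] |= (uint8_t)(1u << (idx % 8));
}

/* Query whether segment idx has been dropped in this instance. */
bool
dropped_bitmap_test(const DroppedBitmap *bm, uint32_t idx)
{
    return idx < bm->seg_count && ((bm->bits[idx / 8] >> (idx % 8)) & 1u);
}

void
dropped_bitmap_destroy(DroppedBitmap *bm)
{
    free(bm->bits);
    bm->bits = NULL;
    bm->seg_count = 0;
}
```

With two bitmaps `a` and `b` created for two instances of the same module, calling `dropped_bitmap_set(&a, 9)` makes `assert(dropped_bitmap_test(&a, 9))` hold while `dropped_bitmap_test(&b, 9)` stays false, which is the isolation the fix is after.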
YAMAMOTO Takashi	24c4d256b3	Grab `cluster->lock` when modifying `exec_env->module_inst` (#2685 ) Fixes: https://github.com/bytecodealliance/wasm-micro-runtime/issues/2680 And when switching back to the original module_inst, propagate exception if any. cf. https://github.com/bytecodealliance/wasm-micro-runtime/issues/2512	2023-11-09 18:56:02 +08:00
Wenyong Huang	d6bba13e86	Fix fast-interp "pre-compiled label offset out of range" issue (#2659 ) When labels-as-values is enabled in a target which doesn't support unaligned address access, 16-bit offset is used to store the relative offset between two opcode labels. But it is a little small and the loader may report "pre-compiled label offset out of range" error. Emitting 32-bit data instead to resolve the issue: emit label address in 32-bit target and emit 32-bit relative offset in 64-bit target. See also: https://github.com/bytecodealliance/wasm-micro-runtime/issues/2635	2023-10-24 10:47:17 +08:00
Enrico Loparco	00539620e9	Improve stack trace dump and fix coding guideline CI (#2599 ) Avoid the stack traces getting mixed up together when multi-threading is enabled by using exception_lock/unlock in dumping the call stacks. And remove duplicated call stack dump in wasm_application.c. Also update coding guideline CI to fix the clang-format-12 not found issue.	2023-09-29 10:52:54 +08:00
YAMAMOTO Takashi	51714c41c0	Introduce WASMModuleInstanceExtraCommon (#2429 ) Move the common parts of WASMModuleInstanceExtra and AOTModuleInstanceExtra into the new structure.	2023-08-08 09:35:29 +08:00
YAMAMOTO Takashi	91592429f4	Fix memory sharing (#2415 ) - Inherit shared memory from the parent instance, instead of trying to look it up by the underlying module. The old method works correctly only when every cluster uses different module. - Use reference count in WASMMemoryInstance/AOTMemoryInstance to mark whether the memory is shared or not - Retire WASMSharedMemNode - For atomic opcode implementations in the interpreters, use a global lock for now - Update the internal API users (wasi-threads, lib-pthread, wasm_runtime_spawn_thread) Fixes https://github.com/bytecodealliance/wasm-micro-runtime/issues/1962	2023-08-04 10:18:13 +08:00
Wenyong Huang	59b2099b68	Fix some check issues on table operations (#2392 ) Fix some check issues on table.init, table.fill and table.copy, and unify the check method for all running modes. Fix issue #2390 and #2096.	2023-07-27 21:53:48 +08:00
Marcin Kolny	0f4edf9735	Implement suspend flags as atomic variable (#2361 ) We have observed a significant performance degradation after merging https://github.com/bytecodealliance/wasm-micro-runtime/pull/1991 Instead of protecting suspend flags with a mutex, we implement the flags as atomic variable and only use mutex when atomics are not available on a given platform.	2023-07-21 08:27:09 +08:00
YAMAMOTO Takashi	228a3bed53	Fix unused warnings on disable_bounds_checks (#2347 )	2023-07-06 15:31:22 +08:00
Huang Qi	18092f86cc	Make memory access boundary check behavior configurable (#2289 ) Allow to use `cmake -DWAMR_CONFIGURABLE_BOUNDS_CHECKS=1` to build iwasm, and then run `iwasm --disable-bounds-checks` to disable the memory access boundary checks. And add two APIs: `wasm_runtime_set_bounds_checks` and `wasm_runtime_is_bounds_checks_enabled`	2023-07-04 16:21:30 +08:00
Wenyong Huang	76be848ec3	Implement the segue optimization for LLVM AOT/JIT (#2230 ) Segue is an optimization technology which uses x86 segment register to store the WebAssembly linear memory base address, so as to remove most of the cost of SFI (Software-based Fault Isolation) base addition and free up a general purpose register, by this way it may: - Improve the performance of JIT/AOT - Reduce the footprint of JIT/AOT, the JIT/AOT code generated is smaller - Reduce the compilation time of JIT/AOT This PR uses the x86-64 GS segment register to apply the optimization, currently it supports linux and linux-sgx platforms on x86-64 target. By default it is disabled, developer can use the option below to enable it for wamrc and iwasm(with LLVM JIT enabled): ```bash wamrc --enable-segue=[<flags>] -o output_file wasm_file iwasm --enable-segue=[<flags>] wasm_file [args...] ``` `flags` can be: i32.load, i64.load, f32.load, f64.load, v128.load, i32.store, i64.store, f32.store, f64.store, v128.store Use comma to separate them, e.g. `--enable-segue=i32.load,i64.store`, and `--enable-segue` means all flags are added. Acknowledgement: Many thanks to Intel Labs, UC San Diego and UT Austin teams for introducing this technology and the great support and guidance! Signed-off-by: Wenyong Huang <wenyong.huang@intel.com> Co-authored-by: Vahldiek-oberwagner, Anjo Lucas <anjo.lucas.vahldiek-oberwagner@intel.com>	2023-05-26 10:13:33 +08:00
Wenyong Huang	1e5f206464	Fix compile warnings on windows platform (#2208 )	2023-05-15 13:48:48 +08:00
Wenyong Huang	5fc48e3584	Fix interpreter read linear memory size for multi-threading (#2088 ) Load the memory data size at each memory access boundary check in multi-threading mode, since it may be changed by other threads when the memory grows. And use `memory->memory_data_size` instead of `memory->num_bytes_per_page * memory->cur_page_count` to refine the code.	2023-04-04 09:05:52 +08:00
Wenyong Huang	49d439a3bc	Fix/Simplify the atomic.wait/notify implementations (#2044 ) Use the shared memory's shared_mem_lock to lock the whole atomic.wait and atomic.notify processes, and use it for os_cond_reltimedwait and os_cond_notify, so as to make the whole processes actually atomic: the original implementation accesses the wait address with shared_mem_lock and uses wait_node->wait_lock for os_cond_reltimedwait, which is not an atomic operation. And remove the unnecessary wait_map_lock and wait_lock, since the whole processes are already locked by shared_mem_lock.	2023-03-23 09:21:16 +08:00
Wenyong Huang	f279ba84ee	Fix multi-threading issues (#2013 ) - Implement atomic.fence to ensure a proper memory synchronization order - Destroy exec_env_singleton first in wasm/aot deinstantiation - Change terminate other threads to wait for other threads in wasm_exec_env_destroy - Fix detach thread in thread_manager_start_routine - Fix duplicated lock cluster->lock in wasm_cluster_cancel_thread - Add lib-pthread and lib-wasi-threads compilation to Windows CI	2023-03-08 10:57:22 +08:00
Enrico Loparco	e8d718096d	Add/reorganize locks for thread synchronization (#1995 ) Attempt to fix data races when using threads. - Protect access (from multiple threads) to exception and memory - Fix shared memory lock usage	2023-03-04 08:15:26 +08:00
Enrico Loparco	52e26e59cf	Add lock to protect the operations of accessing exec env (#1991 ) Data race may occur when accessing exec_env's fields, e.g. suspend_flags and handle. Add lock `exec_env->wait_lock` for them to resolve the issue.	2023-02-27 19:53:41 +08:00
Wenyong Huang	38c67b3f48	thread-mgr: Fix spread "wasi proc exit" exception and atomic.wait issues (#1988 ) Raising the "wasi proc exit" exception, spreading it to other threads and then clearing it in all threads may result in unexpected behavior: the sub thread may end first, handle the "wasi proc exit" exception and clear the exceptions of other threads, including the main thread. And when the main thread's exception is cleared, it may continue to run and throw an "unreachable" exception. This also leads to some assertion failures. Ignore exception spreading for "wasi proc exit" and don't clear the exceptions of other threads to resolve the issue. And add a suspend flag check after atomic wait, since the atomic wait may be notified by another thread when an exception occurs.	2023-02-24 20:05:39 +08:00
Enrico Loparco	216dc43ab4	Use shared memory lock for threads generated from same module (#1960 ) Multiple threads generated from the same module should use the same lock to protect the atomic operations. Before this PR, each thread used a different lock to protect atomic operations (e.g. atomic add), making the lock ineffective. Fix #1958.	2023-02-16 11:54:19 +08:00
YAMAMOTO Takashi	7d3b2a8773	Make memory profiling show native stack usage (#1917 )	2023-02-01 11:52:15 +08:00
Martin Klang	622cdbefd6	Prevent undefined behavior from c_api_func_imports == NULL (#1883 ) The module instance's c_api_func_imports may be NULL under some circumstances, add checks before accessing it.	2023-01-14 07:52:39 +08:00
Wenyong Huang	9d52960e4d	Fix wasm-c-api import func link issue in wasm_instance_new (#1787 ) When a wasm module is instantiated more than once with wasm_instance_new, the function import info of the previous instantiation may be overwritten by the later instantiation, which may cause unexpected behavior. Store the function import info into the module instance to fix the issue.	2022-12-07 16:43:04 +08:00
Wenyong Huang	da7117a092	Refine the stack frame size check in interpreter (#1730 ) Limit max_stack_cell_num/max_csp_num to be no larger than UINT16_MAX, and don't check all_cell_num in interpreter again. And refine some codes in interpreter.	2022-11-22 15:32:48 +08:00
Wenyong Huang	440bbea81e	Fix interp/fast-jit float min/max issues (#1733 )	2022-11-22 09:14:20 +08:00
Xu Jun	032b9aa74b	Fix issue of restoring wasm operand stack (#1721 )	2022-11-18 18:51:13 +08:00
Wenyong Huang	7fd37190e8	Add control for the native stack check with hardware trap (#1682 ) Add new options to control the native stack hw bound check feature: - Besides the original option `cmake -DWAMR_DISABLE_HW_BOUND_CHECK=1/0`, add a new option `cmake -DWAMR_DISABLE_STACK_HW_BOUND_CHECK=1/0` - When the linear memory hw bound check is disabled, the stack hw bound check will be disabled automatically, no matter what the input option is - When the linear memory hw bound check is enabled, the stack hw bound check is enabled/disabled according to the value of input option - Besides the original option `--bounds-checks=1/0`, add a new option `--stack-bounds-checks=1/0` for wamrc Refer to: https://github.com/bytecodealliance/wasm-micro-runtime/issues/1677	2022-11-07 18:26:33 +08:00
Wenyong Huang	a182926a73	Refactor interpreter/AOT module instance layout (#1559 ) Refactor the layout of interpreter and AOT module instance: - Unify the interp/AOT module instance, use the same WASMModuleInstance/ WASMMemoryInstance/WASMTableInstance data structures for both interpreter and AOT - Make the offset of most fields the same in module instance for both interpreter and AOT, append memory instance structure, global data and table instances to the end of module instance for interpreter mode (like AOT mode) - For extra fields in WASM module instance, use WASMModuleInstanceExtra to create a field `e` for interpreter - Change the LLVM JIT module instance creating process, LLVM JIT uses the WASM module and module instance same as interpreter/Fast-JIT mode. So that Fast JIT and LLVM JIT can access the same data structures, and make it possible to implement the Multi-tier JIT (tier-up from Fast JIT to LLVM JIT) in the future - Unify some APIs: merge some APIs for module instance and memory instance's related operations (only implement one copy) Note that the AOT ABI is same, the AOT file format, AOT relocation types, how AOT code accesses the AOT module instance and so on are kept unchanged. Refer to: https://github.com/bytecodealliance/wasm-micro-runtime/issues/1384	2022-10-18 10:59:28 +08:00
Xu Jun	826cf4f8e1	Fix threads spec test issues (#1586 )	2022-10-13 13:53:09 +08:00
Wenyong Huang	ab929c20a3	Add check for code section size, fix interp float operations (#1480 ) And enable the classic interpreter instead of the fast interpreter when llvm jit is enabled, so as to fix the issue that llvm jit cannot handle opcode drop_64/select_64.	2022-09-14 19:49:18 +08:00
FromLiQg	88bb4f3c81	Normalize wasm types (#1378 ) Normalize wasm types: for any two wasm types, if their parameter types and result types are the same, we only save one copy, so as to reduce the footprint and simplify the type comparison in opcode CALL_INDIRECT. And fix issue in interpreter globals_instantiate, and remove unused code.	2022-08-18 17:52:02 +08:00
Xu Jun	4b00432c1a	Fix dump call stack issue in interpreter (#1358 ) Fix dump call stack issue in interpreter introduced by hw bound check: the call stack isn't dumped if the exception is thrown and caught by signal handler. And restore the wasm stack frame to the original status after calling a wasm function.	2022-08-08 11:15:30 +08:00
Wenyong Huang	dd62b32b20	Fix interp hw bound check issues (#1322 ) Fix build script to enable hw bound check for interpreter when AOT is disabled, so as to enable spec cases test for interp with hw bound check. And fix the issues found.	2022-07-23 20:39:01 +08:00
Wenyong Huang	fd5030e02e	Implement interpreter hw bound check (#1309 ) Implement boundary check with hardware trap for interpreter on 64-bit platforms: - To improve the performance of interpreter and Fast JIT - To prepare for multi-tier compilation for the feature Linux/MacOS/Windows 64-bit are enabled.	2022-07-22 11:05:40 +08:00
liang.he	0f6e5a55a4	Fix sub module's aux stack info not synchronized to main module issue (#1279 ) Sub module's auxiliary stack boundary and bottom may be different from main module's counterpart, so when calling sub module, its aux stack info should be gotten and set to exec_env firstly, or aux stack overflow and out of bounds memory access exception may be thrown when calling sub module's function. Fix the issue reported in PR #1278.	2022-07-11 19:42:29 +08:00
Xu Jun	471cac4719	Enable dump call stack to a buffer (#1244 ) Enable dump call stack to a buffer, use API `wasm_runtime_get_call_stack_buf_size` to get the required buffer size and use API `wasm_runtime_dump_call_stack_to_buf` to dump call stack to a buffer	2022-06-25 21:38:43 +08:00
Wenyong Huang	d7a2888b18	Fix Windows compilation warnings (#1171 ) And update the Zephyr platform sample help, add arc target into usage list.	2022-05-16 09:12:58 +08:00
Xu Jun	98431225f2	Store import function pointer in module instance (#1130 ) Fix the issue reported by #1118 , use this approach since it avoids copying unnecessary static information into instance and reduces the footprint.	2022-04-27 20:21:51 +08:00
Wenyong Huang	adaaf348ed	Refine opcode br_table for classic interpreter (#1112 ) Refine opcode br_table for classic interpreter as there may be a lot of leb128 decoding when the br count is big: 1. Use the bytecode itself to store the decoded leb br depths if each decoded depth can be stored with one byte 2. Create br_table cache to store the decoded leb br depths if the decoded depth cannot be stored with one byte After the optimization, the classic interpreter can access the br depths array with index, no need to decode the leb128 again. And fix function record_fast_op() return value unchecked issue in source debugging feature.	2022-04-23 19:15:55 +08:00
YAMAMOTO Takashi	91222e1e44	interpreter: Fix an UBSan complaint in word_copy (#1106 ) Fix an UBSan complaint introduced by recent change by adding more checks to word_copy: ``` wasm_interp_fast.c:792:9: runtime error: applying zero offset to null pointer ```	2022-04-21 12:21:37 +08:00
Wenyong Huang	d6e781af28	Add more operand stack overflow checks for fast-interp (#1104 ) And clear some compile warnings on Windows	2022-04-20 16:19:12 +08:00
Wenyong Huang	d4758d7380	Refine codes and fix several issues (#1094 ) Add aot relocation for ".rodata.str" symbol to support more cases Fix some coding style issues Fix aot block/value stack destroy issue Refine classic/fast interpreter codes Clear compile warning of libc_builtin_wrapper.c in 32-bit platform	2022-04-18 17:33:30 +08:00

99 Commits