Reduce WASM_STACK_GUARD_SIZE a bit for posix-like platforms (#3350)

I found a few mistakes in my research on the stack consumption.
Update the comment and tweak WASM_STACK_GUARD_SIZE accordingly.
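
For illustration only, here is a minimal sketch of the arithmetic behind the new defaults. It uses the approximate macOS/amd64 figures quoted in the updated comment. The enum names and the trap_path_bytes helper are hypothetical and are not part of the runtime. The Ubuntu figures are left out because the comment gives no interpreter frame size for that platform.

```c
#include <assert.h> /* static_assert (C11) */

/* Approximate figures quoted in the updated comment (macOS/amd64). */
enum {
    INTERP_FRAME_RELEASE = 200, /* wasm_interp_call_func_bytecode, release build */
    INTERP_FRAME_DEBUG = 2600,  /* same function, debug build */
    SNPRINTF_MACOS = 1600,      /* libc snprintf via wasm_runtime_set_exception */

    GUARD_RELEASE = 1024 * 3,   /* WASM_STACK_GUARD_SIZE when BH_DEBUG == 0 */
    GUARD_DEBUG = 1024 * 5      /* WASM_STACK_GUARD_SIZE when BH_DEBUG != 0 */
};

/*
 * Hypothetical helper: stack used by one interpreter frame that ends up
 * with a trap, i.e. the example path described in the comment.
 */
static inline int
trap_path_bytes(int interp_frame, int snprintf_cost)
{
    return interp_frame + snprintf_cost;
}

/* Compile-time check that each estimate fits under the matching guard. */
static_assert(INTERP_FRAME_RELEASE + SNPRINTF_MACOS <= GUARD_RELEASE,
              "release estimate exceeds the 3KB guard");
static_assert(INTERP_FRAME_DEBUG + SNPRINTF_MACOS <= GUARD_DEBUG,
              "debug estimate exceeds the 5KB guard");
```

With these figures the release path comes to roughly 1800 bytes (the "about 2KB" in the comment) and the debug path to roughly 4200 bytes, which is why only debug builds keep the 5KB guard.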
YAMAMOTO Takashi 2024-04-24 17:18:58 +09:00 committed by GitHub
parent ca5209cd9c
commit 09a5be411f
GPG Key ID: B5690EEEBB952194


@@ -463,9 +463,12 @@
  *
  * - w/o hw bound check, the intepreter loop
  *
- * the classic interpreter wasm_interp_call_func_bytecode alone
- * seems to consume about 2600 bytes stack.
- * (with the default configuration for macOS/amd64)
+ * the stack consumption heavily depends on compiler settings,
+ * especially for huge functions like the classic interpreter's
+ * wasm_interp_call_func_bytecode:
+ *
+ * 200 bytes (release build, macOS/amd64)
+ * 2600 bytes (debug build, macOS/amd64)
  *
  * libc snprintf (used by eg. wasm_runtime_set_exception) consumes about
  * 1600 bytes stack on macOS/amd64, about 2000 bytes on Ubuntu amd64 20.04.
@@ -480,6 +483,10 @@
  * Note: on platforms with lazy function binding, don't forget to consider
  * the symbol resolution overhead on the first call. For example,
  * on Ubuntu amd64 20.04, it seems to consume about 1500 bytes.
+ * For some reasons, macOS amd64 12.7.4 seems to resolve symbols eagerly.
+ * (Observed with a binary with traditional non-chained fixups.)
+ * The latest macOS seems to apply chained fixups in kernel on page-in time.
+ * (thus it wouldn't consume userland stack.)
  */
 #ifndef WASM_STACK_GUARD_SIZE
 #if WASM_ENABLE_UVWASI != 0
@@ -489,15 +496,20 @@
 /*
  * Use a larger default for platforms like macOS/Linux.
  *
- * For example, wasm_interp_call_func_bytecode + wasm_runtime_set_exception
- * would consume >4KB stack on x86-64 macOS.
+ * For example, the classic intepreter loop which ended up with a trap
+ * (wasm_runtime_set_exception) would consume about 2KB stack on x86-64
+ * macOS. On Ubuntu amd64 20.04, it seems to consume a bit more.
  *
  * Although product-mini/platforms/nuttx always overrides
  * WASM_STACK_GUARD_SIZE, exclude NuttX here just in case.
  */
 #if defined(__APPLE__) || (defined(__unix__) && !defined(__NuttX__))
+#if BH_DEBUG != 0 /* assumption: BH_DEBUG matches CMAKE_BUILD_TYPE=Debug */
 #define WASM_STACK_GUARD_SIZE (1024 * 5)
 #else
+#define WASM_STACK_GUARD_SIZE (1024 * 3)
+#endif
+#else
 /*
  * Otherwise, assume very small requirement for now.
  *