wasm-micro-runtime/language-bindings/python/wasm-c-api/docs/design.md
# how to implement a python binding of WAMR
A Python language binding of a Wasm runtime allows its users to call a set of APIs of
the runtime from the Python world. Those APIs may be implemented in C, C++, or Rust.
In the WAMR case, a Python binding allows the APIs in `core/iwasm/include/wasm_c_api.h`
to be used in Python scripts. To achieve that, we will create two kinds
of wrappers with the help of _ctypes_: wrappers of structured data types and
wrappers of functions.
_ctypes_ is a module in the standard library for creating Python bindings. It
provides a low-level toolset for loading shared libraries and marshaling
data between Python and C. Other options include _cffi_, _pybind11_,
_cpython_ and so on. Because we want the binding to depend on as few
items as possible, the built-in module, _ctypes_, is a good choice.
## General rules to marshal
The core idea of a language binding is how to translate between the different
representations of types in different languages.
### load libraries
`ctypes` can locate a dynamic link library in a way similar to what the
compiler does (see `ctypes.util.find_library`).
Currently, `ctypes` provides the following loader classes:
- `CDLL`. Those libraries use the standard C calling convention.
- `OleDLL` and `WinDLL`. Those libraries use the `stdcall` calling convention, on
Windows only.
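
As a minimal sketch of the loading step, the helper below resolves a short library name with `ctypes.util.find_library` and loads it with `CDLL`. The name `load_shared_library` is hypothetical and is not part of the binding; a real binding may instead load _libiwasm.so_ from a known path.

```python
import ctypes
import ctypes.util


def load_shared_library(name):
    """Locate a shared library by its short name (e.g. "c") and load it."""
    path = ctypes.util.find_library(name)
    if path is None:
        # find_library() returns None when nothing matches
        raise OSError(f"cannot locate shared library: {name}")
    # CDLL uses the standard C calling convention
    return ctypes.CDLL(path)
```

Raising `OSError` for a missing library mirrors what `CDLL` itself does when a path cannot be loaded, so callers only have one exception type to handle.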
### fundamental datatypes
_ctypes_ provides [primitive C compatible data types](https://docs.python.org/3/library/ctypes.html#fundamental-data-types),
such as `c_bool`, `c_byte`, `c_int`, `c_long` and so on.
> `c_int` represents the _C_ `signed int` datatype. On platforms where
> `sizeof(int) == sizeof(long)` it is an alias to `c_long`.
| c datatypes | ctypes |
| ------------------- | ----------------------- |
| bool | c_bool |
| byte_t | c_ubyte |
| char | c_char |
| float32_t | c_float |
| float64_t | c_double |
| int32_t | c_int32 |
| int64_t | c_int64 |
| intptr_t | c_void_p |
| size_t | c_size_t |
| uint8_t | c_uint8 |
| uint32_t | c_uint32 |
| void | None |
| wasm_byte_t | c_ubyte |
| wasm_externkind_t | c_uint8 |
| wasm_memory_pages_t | c_uint32 |
| wasm_mutability_t | c_bool |
| wasm_table_size_t | c_uint32 |
| wasm_valkind_t | c_uint8 |
| wasm_data_type\* | POINTER(wasm_data_type) |
- `c_void_p` only represents `void *`
- `None` represents `void` in function parameter lists and return lists
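
The following sketch illustrates a few properties of the fundamental types listed above. The function name `describe_fundamental_types` is made up for this example.

```python
import ctypes


def describe_fundamental_types():
    """Return a few observations about ctypes fundamental data types."""
    return {
        # fixed-width types have the same size on every platform
        "sizeof_int32": ctypes.sizeof(ctypes.c_int32),
        "sizeof_int64": ctypes.sizeof(ctypes.c_int64),
        # constructors do no overflow checking: values wrap as in C
        "uint8_of_256": ctypes.c_uint8(256).value,
        "uint32_of_minus_1": ctypes.c_uint32(-1).value,
        # a NULL `void *` is reported as None
        "null_void_p": ctypes.c_void_p(None).value,
    }
```

The wrap-around behavior is worth keeping in mind when a Python `int` is passed for a `uint8_t` or `uint32_t` parameter: ctypes will not report an out-of-range value.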
### structured datatypes
Create a corresponding concept in the Python world for every native structured
data type, including `enum`, `struct` and `union`.
#### Enum types
For example, if there is an `enum wasm_mutability_enum` in native code:
```c
typedef uint8_t wasm_mutability_t;
enum wasm_mutability_enum {
  WASM_CONST,
  WASM_VAR
};
```
Use `ctypes.c_int` (or any other integer type in ctypes) to represent its value directly.
```python
# represents enum wasm_mutability_enum
wasm_mutability_t = c_uint8
WASM_CONST = 0
WASM_VAR = 1
```
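
If named values are preferred over bare integers, one possible variation is to pair the ctypes integer type with an `enum.IntEnum`. This is only a sketch: `WasmMutability`, `to_c` and `from_c` are hypothetical names and are not part of the generated binding.

```python
import ctypes
import enum

wasm_mutability_t = ctypes.c_uint8


class WasmMutability(enum.IntEnum):
    # same values as `enum wasm_mutability_enum` in C
    WASM_CONST = 0
    WASM_VAR = 1


def to_c(mutability):
    """Convert a WasmMutability member to its C representation."""
    return wasm_mutability_t(int(mutability))


def from_c(c_value):
    """Convert a wasm_mutability_t instance back to a WasmMutability member."""
    return WasmMutability(c_value.value)
```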
> C standard only requires "Each enumerated type shall be compatible with char,
> a signed integer type, or an unsigned integer type. The choice of the integer
> type is implementation-defined, but shall be capable of representing the
> values of all the members of the enumeration.
#### Struct types
If there is a `struct wasm_byte_vec_t` in native code (in C):
```c
typedef struct wasm_byte_vec_t {
  size_t size;
  wasm_byte_t *data;
  size_t num_elems;
  size_t size_of_elem;
} wasm_byte_vec_t;
```
Use `ctypes.Structure` to create its corresponding data type in Python.
```python
import ctypes

class wasm_byte_vec_t(ctypes.Structure):
    _fields_ = [
        ("size", ctypes.c_size_t),
        ("data", ctypes.POINTER(ctypes.c_ubyte)),
        ("num_elems", ctypes.c_size_t),
        ("size_of_elem", ctypes.c_size_t),
    ]
```
A list of `Structure` types:
| name |
| ----------------- |
| wasm_engine_t |
| wasm_store_t |
| wasm_limits_t |
| wasm_valtype_t |
| wasm_functype_t |
| wasm_globaltype_t |
| wasm_tabletype_t |
| wasm_memorytype_t |
| wasm_externtype_t |
| wasm_importtype_t |
| wasm_exporttype_t |
| wasm_ref_t |
| wasm_frame_t |
| wasm_trap_t |
| wasm_foreign_t |
| WASMModuleCommon |
| wasm_func_t |
| wasm_global_t |
| wasm_table_t |
| wasm_memory_t |
| wasm_extern_t |
| wasm_instance_t |
Unsupported `struct`:
- wasm_config_t
If there is an anonymous `union` in native code:
```c
typedef struct wasm_val_t {
  wasm_valkind_t kind;
  union {
    int32_t i32;
    int64_t i64;
    float32_t f32;
    float64_t f64;
  } of;
} wasm_val_t;
```
Use `ctypes.Union` to create its corresponding data type in Python.
```python
class _OF(ctypes.Union):
    _fields_ = [
        ("i32", ctypes.c_int32),
        ("i64", ctypes.c_int64),
        ("f32", ctypes.c_float),
        ("f64", ctypes.c_double),
    ]
class wasm_val_t(ctypes.Structure):
    _anonymous_ = ("of",)
    _fields_ = [
        ("kind", ctypes.c_uint8),
        ("of", _OF),
    ]
```
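
To show how the structure and its anonymous union are used together, here is a self-contained sketch. The helpers `make_i32` and `make_f64` are hypothetical, and the kind constants assume the member order of `enum wasm_valkind_enum` in `wasm.h`.

```python
import ctypes

# value kinds, assuming the order of `enum wasm_valkind_enum` in wasm.h
WASM_I32, WASM_I64, WASM_F32, WASM_F64 = 0, 1, 2, 3


class _OF(ctypes.Union):
    _fields_ = [
        ("i32", ctypes.c_int32),
        ("i64", ctypes.c_int64),
        ("f32", ctypes.c_float),
        ("f64", ctypes.c_double),
    ]


class wasm_val_t(ctypes.Structure):
    # `_anonymous_` must be set before `_fields_`
    _anonymous_ = ("of",)
    _fields_ = [
        ("kind", ctypes.c_uint8),
        ("of", _OF),
    ]


def make_i32(value):
    """Build a wasm_val_t holding a 32-bit integer."""
    val = wasm_val_t()
    val.kind = WASM_I32
    val.i32 = value  # same storage as val.of.i32
    return val


def make_f64(value):
    """Build a wasm_val_t holding a 64-bit float."""
    val = wasm_val_t()
    val.kind = WASM_F64
    val.f64 = value  # same storage as val.of.f64
    return val
```

Because `of` is listed in `_anonymous_`, the union members can be reached either as `val.i32` or as `val.of.i32`.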
### wrappers of functions
Foreign functions (C functions) can be accessed as attributes of loaded shared
libraries or as instances of function prototypes. Callback functions (Python
functions) can only be accessed by instantiating function prototypes.
For example,
```c
void wasm_name_new(wasm_name_t *out, size_t len, wasm_byte_t data[]);
```
Assume there are:
- `class wasm_name_t` of Python, which represents `wasm_name_t` of C
- `libiwasm`, which represents the loaded _libiwasm.so_
To access a C function as an attribute of the library:
```python
def wasm_name_new(out, len, data):
    _wasm_name_new = libiwasm.wasm_name_new
    _wasm_name_new.argtypes = (ctypes.POINTER(wasm_name_t), ctypes.c_size_t, ctypes.POINTER(ctypes.c_ubyte))
    _wasm_name_new.restype = None
    return _wasm_name_new(out, len, data)
```
Or to instantiate a function prototype:
```python
def wasm_name_new(out, len, data):
    # CFUNCTYPE takes the result type followed by the argument types
    prototype = ctypes.CFUNCTYPE(None, ctypes.POINTER(wasm_name_t), ctypes.c_size_t, ctypes.POINTER(ctypes.c_ubyte))
    # bind the prototype to the symbol exported by the shared library
    _wasm_name_new = prototype(("wasm_name_new", libiwasm))
    return _wasm_name_new(out, len, data)
```
Now a `wasm_name_t` can be created with `wasm_name_new()` in Python.
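
Both access styles can be tried without _libiwasm.so_. The sketch below applies them to `strlen` from the C library instead; it assumes a POSIX system, and the wrapper names are invented for this example.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. If find_library() cannot resolve "c",
# CDLL(None) still exposes the C library symbols of the running process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))


def strlen_by_attribute(data):
    """Access the C function as an attribute of the loaded library."""
    _strlen = libc.strlen
    _strlen.argtypes = (ctypes.c_char_p,)
    _strlen.restype = ctypes.c_size_t
    return _strlen(data)


def strlen_by_prototype(data):
    """Access the C function by instantiating a function prototype."""
    prototype = ctypes.CFUNCTYPE(ctypes.c_size_t, ctypes.c_char_p)
    _strlen = prototype(("strlen", libc))
    return _strlen(data)
```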
Sometimes, a Python function needs to be passed to C as a callback.
```c
typedef wasm_trap_t* (*wasm_func_callback_t)(wasm_val_vec_t* args, wasm_val_vec_t *results);
```
Use `ctypes.CFUNCTYPE` to create a _pointer to function_ type.
```python
def hello(args, results):
    print("hello from a callback")
    return None  # NULL, i.e. no trap
# c_void_p stands in for the returned `wasm_trap_t *`
wasm_func_callback_t = ctypes.CFUNCTYPE(ctypes.c_void_p, ctypes.POINTER(wasm_val_vec_t), ctypes.POINTER(wasm_val_vec_t))
hello_callback = wasm_func_callback_t(hello)
```
or with a decorator
```python
def wasm_func_cb_decl(func):
    # c_void_p stands in for the returned `wasm_trap_t *`
    return ctypes.CFUNCTYPE(ctypes.c_void_p, ctypes.POINTER(wasm_val_vec_t), ctypes.POINTER(wasm_val_vec_t))(func)
@wasm_func_cb_decl
def hello(args, results):
    print("hello from a callback")
    return None  # NULL, i.e. no trap
```
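
The callback mechanism can be exercised end to end with the C library's `qsort`, in the style of the example in the ctypes documentation. This sketch assumes a POSIX system; `sort_ints` and `compare_ints` are names chosen for illustration.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system where the C library can be loaded this way.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# int (*compare)(const int *, const int *)
CMPFUNC = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int)
)


@CMPFUNC
def compare_ints(a, b):
    # a and b are pointers to the two elements being compared
    return a[0] - b[0]


def sort_ints(values):
    """Sort a list of small ints with qsort() and a Python comparison callback."""
    arr = (ctypes.c_int * len(values))(*values)
    libc.qsort.argtypes = (ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, CMPFUNC)
    libc.qsort.restype = None
    libc.qsort(arr, len(arr), ctypes.sizeof(ctypes.c_int), compare_ints)
    return list(arr)
```

The callback object (`compare_ints` here) must stay referenced for as long as C code may call it, otherwise it can be garbage collected while still in use.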
### programming tips
#### `struct` and `ctypes.Structure`
There are two kinds of `ctypes.Structure` in `binding.py`.
- has a `_fields_` definition, like `class wasm_byte_vec_t(Structure)`
- doesn't have a `_fields_` definition, like `class wasm_config_t(Structure)`
Since `ctypes` creates the C world _mirror_ variable according to the `_fields_`
information, `wasm_config_t()` will only create a Python instance without binding
to any C variable. `wasm_byte_vec_t()` will return a Python instance with an internal
C variable.
That is why `pointer(wasm_config_t())` does not point to a usable C object and must not be dereferenced.
#### deal with pointers
`byref()` and `pointer()` are two functions that can return a pointer.
```python
x = ctypes.c_int(2)
# use pointer() to create a new pointer instance which will later be used in Python
x_ptr = ctypes.pointer(x)
...
struct_use_pointer = Mystruct()
struct_use_pointer.ptr = x_ptr
# use byref() to pass a pointer to an object to a foreign function call
func(ctypes.byref(x))
```
The main difference is that `pointer()` does a lot more work since it
constructs a real pointer object. It is faster to use `byref()` if you don't need
the pointer object in Python itself (e.g. it is only used as an argument passed
to a function).
A `wasm_xxx_new()` whose return type is a `ctypes.POINTER` type returns a
pointer. In addition, an instance created by `wasm_xxx_t()` can be passed where a
pointer is expected without wrapping it in `byref()` or `pointer()`: when
`argtypes` declares a `POINTER` type, ctypes applies `byref()` automatically.
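
The three ways of passing a pointer can be compared in a small runnable sketch. To avoid depending on a shared library, a Python function wrapped in a `CFUNCTYPE` prototype stands in for a C function taking `int *`; `IncrementFn`, `increment` and `demo` are hypothetical names.

```python
import ctypes

# prototype for `void f(int *)`
IncrementFn = ctypes.CFUNCTYPE(None, ctypes.POINTER(ctypes.c_int))


@IncrementFn
def increment(ptr):
    # write through the pointer, as a C function would
    ptr[0] += 1


def demo():
    x = ctypes.c_int(2)
    # pointer() builds a real pointer object that Python code can keep and use
    x_ptr = ctypes.pointer(x)
    x_ptr.contents.value = 10
    after_pointer_write = x.value
    # byref() is the lightweight choice for a one-off argument
    increment(ctypes.byref(x))
    # an existing pointer object works as well
    increment(x_ptr)
    # a plain instance is accepted too: ctypes applies byref() automatically
    increment(x)
    return after_pointer_write, x.value
```

Calling `demo()` writes 10 through the pointer object and then increments the same integer three times, once per passing style.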
#### array
The [ctypes documentation](https://docs.python.org/3/library/ctypes.html#arrays)
states that "The recommended way to create array types is by multiplying a
data type with a positive integer". So _multiplying a data type_ should be the
preferred way to create arrays.
```python
from ctypes import *
class POINT(Structure):
    _fields_ = ("x", c_int), ("y", c_int)
# multiplying a data type
# type(TenPointsArrayType) is <class '_ctypes.PyCArrayType'>
TenPointsArrayType = POINT * 10
# Instances are created in the usual way, by calling the class:
arr = TenPointsArrayType()
arr[0] = POINT(3,2)
for pt in arr:
    print(pt.x, pt.y)
```
On both sides, it is OK to assign an array to a pointer.
```c
char buf[128] = {0};
char *ptr = buf;
```
```python
binary = wasm_byte_vec_t()
binary.data = (ctypes.c_ubyte * len(wasm)).from_buffer_copy(wasm)
```
#### exceptions and traps
Interfaces of _wasm-c-api_ use their return values to represent failures.
The Python binding should just keep them and pass them on to callers instead of
raising any additional exception.
The Python binding should raise exceptions when the Python part itself fails.
#### readonly buffer
```python
with open("hello.wasm", "rb") as f:
    wasm = f.read()
binary = wasm_byte_vec_t()
wasm_byte_vec_new_uninitialized(byref(binary), len(wasm))
# create a ctypes instance (byte[] in C) and copy the content
# from wasm (bytes in Python)
binary.data = (ctypes.c_ubyte * len(wasm)).from_buffer_copy(wasm)
```
In the above example, `wasm` is a Python-created read-only buffer (`bytes`). It is not
writable, so `from_buffer()` cannot wrap it, and it needs to be copied into a
ctypes array with `from_buffer_copy()`.
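
The difference between the two constructors can be sketched as follows. The helper names `copy_into_ctypes_array` and `share_with_ctypes_array` are made up for this example.

```python
import ctypes


def copy_into_ctypes_array(data):
    """Copy any bytes-like object (read-only or not) into a new ctypes array."""
    array_type = ctypes.c_ubyte * len(data)
    return array_type.from_buffer_copy(data)


def share_with_ctypes_array(buffer):
    """Wrap a writable buffer without copying; raises TypeError for read-only input."""
    array_type = ctypes.c_ubyte * len(buffer)
    return array_type.from_buffer(buffer)
```

With a `bytes` object only the copying variant works. With a `bytearray`, `share_with_ctypes_array` returns an array that aliases the same memory, so a write through one is visible through the other.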
#### variable arguments
A function with _variable arguments_ makes it hard to specify the required
argument types for the function prototype. The only way left is to call it
directly, without any argument type checking.
```python
libc.printf(b"Hello, an int %d, a float %f, a string %s\n", c_int(1), c_double(3.14), b"World!")
```
#### Use `c_bool` to represent `wasm_mutability_t`
- `False` for `WASM_CONST`
- `True` for `WASM_VAR`
#### customize class builtins
- `__eq__` for comparison.
- `__repr__` for printing.
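
As an illustration of customized builtins, the sketch below adds `__eq__` and `__repr__` to a `ctypes.Structure`. It assumes `wasm_limits_t` has two `uint32_t` members, `min` and `max`, as in `wasm.h`; the method bodies are only one possible implementation.

```python
import ctypes


class wasm_limits_t(ctypes.Structure):
    _fields_ = [
        ("min", ctypes.c_uint32),
        ("max", ctypes.c_uint32),
    ]

    def __eq__(self, other):
        # compare by field values instead of by object identity
        if not isinstance(other, wasm_limits_t):
            return NotImplemented
        return self.min == other.min and self.max == other.max

    def __repr__(self):
        return f"wasm_limits_t(min={self.min}, max={self.max})"
```

Returning `NotImplemented` for foreign types lets Python fall back to its default comparison, so comparing against an unrelated object yields `False` rather than raising.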
### bindgen.py
`bindgen.py` is a tool to create the WAMR Python binding automatically. `binding.py`
is generated, so we should avoid modifying it. Additional helpers should go
to `ffi.py`.
`bindgen.py` uses _pycparser_. It visits the AST of `core/iwasm/include/wasm_c_api.h`
created by _gcc_ and generates the necessary wrappers.
```python
from pycparser import c_ast, parse_file
class Visitor(c_ast.NodeVisitor):
    def visit_Struct(self, node):
        pass
    def visit_Union(self, node):
        pass
    def visit_Typedef(self, node):
        pass
    def visit_FuncDecl(self, node):
        pass
ast = parse_file(...)
v = Visitor()
v.visit(ast)
```
Before running _bindgen.py_, the shared library _libiwasm.so_ should be generated.
```bash
$ cd /path/to/wamr/repo
$ # if it is in linux
$ pushd product-mini/platforms/linux/
$ cmake -S . -B build
$ cmake --build build --target iwasm
$ popd
$ cd binding/python
$ python utils/bindgen.py
```
`wasm_frame_xxx` and `wasm_trap_xxx` only work well when enabling `WAMR_BUILD_DUMP_CALL_STACK`.
```bash
$ cmake -S . -B build -DWAMR_BUILD_DUMP_CALL_STACK=1
```
## OOP wrappers
Based on the above general rules, there will be a corresponding Python
API for every C API in `wasm_c_api.h`, with the same name. Users can do procedural
programming with those.
In the next phase, we will create OOP APIs, which mostly follow the
[C++ version of wasm_c_api](https://github.com/WebAssembly/wasm-c-api/blob/master/include/wasm.hh).
## A big list
| WASM Concept | Procedural APIs | OOP APIs | OOP APIs methods |
| ------------ | -------------------------------- | ---------- | ---------------- |
| XXX_vec | wasm_xxx_vec_new | | list |
| | wasm_xxx_vec_new_uninitialized | | |
| | wasm_xxx_vec_new_empty | | |
| | wasm_xxx_vec_copy | | |
| | wasm_xxx_vec_delete | | |
| valtype | wasm_valtype_new | valtype | \_\_init\_\_ |
| | wasm_valtype_delete | | \_\_del\_\_ |
| | wasm_valtype_kind | | \_\_eq\_\_ |
| | wasm_valtype_copy | | |
| | _vector methods_ | | |
| functype | wasm_functype_new | functype | |
| | wasm_functype_delete | | |
| | wasm_functype_params | | |
| | wasm_functype_results | | |
| | wasm_functype_copy | | |
| | _vector methods_ | | |
| globaltype | wasm_globaltype_new | globaltype | \_\_init\_\_ |
| | wasm_globaltype_delete | | \_\_del\_\_ |
| | wasm_globaltype_content | | \_\_eq\_\_ |
| | wasm_globaltype_mutability | | |
| | wasm_globaltype_copy | | |
| | _vector methods_ | | |
| tabletype | wasm_tabletype_new | tabletype | \_\_init\_\_ |
| | wasm_tabletype_delete | | \_\_del\_\_ |
| | wasm_tabletype_element | | \_\_eq\_\_ |
| | wasm_tabletype_limits | | |
| | wasm_tabletype_copy | | |
| | _vector methods_ | | |
| memorytype | wasm_memorytype_new | memorytype | \_\_init\_\_ |
| | wasm_memorytype_delete | | \_\_del\_\_ |
| | wasm_memorytype_limits | | \_\_eq\_\_ |
| | wasm_memorytype_copy | | |
| | _vector methods_ | | |
| externtype | wasm_externtype_as_XXX | externtype | |
| | wasm_XXX_as_externtype | | |
| | wasm_externtype_copy | | |
| | wasm_externtype_delete | | |
| | wasm_externtype_kind | | |
| | _vector methods_ | | |
| importtype | wasm_importtype_new | importtype | |
| | wasm_importtype_delete | | |
| | wasm_importtype_module | | |
| | wasm_importtype_name | | |
| | wasm_importtype_type | | |
| | wasm_importtype_copy | | |
| | _vector methods_ | | |
| exporttype | wasm_exporttype_new | exporttype | |
| | wasm_exporttype_delete | | |
| | wasm_exporttype_name | | |
| | wasm_exporttype_type | | |
| | wasm_exporttype_copy | | |
| | _vector methods_ | | |
| val | wasm_val_delete | val | |
| | wasm_val_copy | | |
| | _vector methods_ | | |
| frame | wasm_frame_delete | frame | |
| | wasm_frame_instance | | |
| | wasm_frame_func_index | | |
| | wasm_frame_func_offset | | |
| | wasm_frame_module_offset | | |
| | wasm_frame_copy | | |
| | _vector methods_ | | |
| trap | wasm_trap_new | trap | |
| | wasm_trap_delete | | |
| | wasm_trap_message | | |
| | wasm_trap_origin | | |
| | wasm_trap_trace | | |
| | _vector methods_ | | |
| foreign | wasm_foreign_new | foreign | |
| | wasm_foreign_delete | | |
| | _vector methods_ | | |
| engine | wasm_engine_new | engine | |
| | wasm_engine_new_with_args\* | | |
| | wasm_engine_new_with_config | | |
| | wasm_engine_delete | | |
| store | wasm_store_new | store | |
| | wasm_store_delete | | |
| | _vector methods_ | | |
| module | wasm_module_new | module | |
| | wasm_module_delete | | |
| | wasm_module_validate | | |
| | wasm_module_imports | | |
| | wasm_module_exports | | |
| instance | wasm_instance_new | instance | |
| | wasm_instance_delete | | |
| | wasm_instance_new_with_args\* | | |
| | wasm_instance_new_with_args_ex\* | | |
| | wasm_instance_exports | | |
| | _vector methods_ | | |
| func | wasm_func_new | func | |
| | wasm_func_new_with_env | | |
| | wasm_func_delete | | |
| | wasm_func_type | | |
| | wasm_func_call | | |
| | wasm_func_param_arity | | |
| | wasm_func_result_arity | | |
| | _vector methods_ | | |
| global | wasm_global_new | global | |
| | wasm_global_delete | | |
| | wasm_global_type | | |
| | wasm_global_get | | |
| | wasm_global_set | | |
| | _vector methods_ | | |
| table | wasm_table_new | table | |
| | wasm_table_delete | | |
| | wasm_table_type | | |
| | wasm_table_get | | |
| | wasm_table_set | | |
| | wasm_table_size | | |
| | _vector methods_ | | |
| memory | wasm_memory_new | memory | |
| | wasm_memory_delete | | |
| | wasm_memory_type | | |
| | wasm_memory_data | | |
| | wasm_memory_data_size | | |
| | wasm_memory_size | | |
| | _vector methods_ | | |
| extern | wasm_extern_delete | extern | |
| | wasm_extern_as_XXX | | |
| | wasm_XXX_as_extern | | |
| | wasm_extern_kind | | |
| | wasm_extern_type | | |
| | _vector methods_ | | |

Not supported _functions_:
- wasm_config_XXX
- wasm_module_deserialize
- wasm_module_serialize
- wasm_ref_XXX
- wasm_XXX_as_ref
- wasm_XXX_as_ref_const
- wasm_XXX_copy
- wasm_XXX_get_host_info
- wasm_XXX_set_host_info
## test
There are two kinds of tests in the project:
- Unit tests, located in `./tests` and driven by _unittest_. Run them with
`$ python -m unittest` or `$ make test`.
- Integration tests, located in `./samples`.
The whole project follows test-driven development. Every wrapper function has
two kinds of test cases. A positive case calls the wrapper function with
expected, safe argument combinations and checks that it works correctly. A
negative case feeds the wrapper function unexpected argument combinations,
including but not limited to `None`, and checks that the function handles
invalid input or unexpected behavior gracefully.
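The two kinds of cases could be sketched as follows. This is only an illustration: `wrapped_add` is a hypothetical stand-in for a wrapper function and is not part of the binding. A real test would import the wrapper under test from the package instead.

```python
import unittest


def wrapped_add(a, b):
    """Hypothetical stand-in for a wrapper function (not part of the binding)."""
    if not isinstance(a, int) or not isinstance(b, int):
        raise TypeError("expected two int arguments")
    return a + b


class WrappedAddTest(unittest.TestCase):
    def test_positive(self):
        # Positive case: expected, safe arguments should simply work.
        self.assertEqual(wrapped_add(1, 2), 3)

    def test_negative_none(self):
        # Negative case: an unexpected argument such as None should be
        # rejected with a Python exception instead of reaching native code.
        with self.assertRaises(TypeError):
            wrapped_add(None, 2)
```

Saved under `./tests`, a file like this would be picked up by `python -m unittest`.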
## distribution
### package
Create a Python package named `wamr`. After installation, users import it
just like any other Python module.
```python
from wamr import *
```
### PyPI
Refer to the [tutorial provided by PyPA](https://packaging.python.org/en/latest/tutorials/packaging-projects/).
Steps to publish the WAMR Python library:
1. Create `pyproject.toml`, which tells build tools (like pip and build) what is
required to build the project. An example `pyproject.toml` that uses _setuptools_:
```toml
[build-system]
requires = [
"setuptools>=42",
"wheel"
]
build-backend = "setuptools.build_meta"
```
2. Configure metadata, which tells build tools about the package (such as its
name and version) and which code files to include.
- Static metadata (`setup.cfg`): guaranteed to be the same every time.
It is simpler, easier to read, and avoids many common errors, like
encoding errors.
- Dynamic metadata (`setup.py`): possibly non-deterministic. Any items that
are dynamic or determined at install time, as well as extension modules
or extensions to setuptools, need to go into `setup.py`.
**_Static metadata should be preferred_**. Dynamic metadata should be used
only as an escape hatch when necessary. `setup.py` used to be
required, but can be omitted with newer versions of setuptools and pip.
3. Include other files in the distribution
- For a [source distribution](https://packaging.python.org/en/latest/glossary/#term-Source-Distribution-or-sdist):
It provides metadata and the essential source files needed for installing
with a tool like pip, or for generating a built distribution. It is usually
generated with `python3 -m build --sdist` (historically,
`python setup.py sdist`).
It includes our Python modules, `pyproject.toml`, metadata, `README.md`, and
`LICENSE`. If you want to control what goes in this explicitly,
see [Including files in source distributions with MANIFEST.in](https://packaging.python.org/en/latest/guides/using-manifest-in/#using-manifest-in).
- For a [final built distribution](https://packaging.python.org/en/latest/glossary/#term-Built-Distribution):
A distribution format containing files and metadata that only need to be
moved to the correct location on the target system to be installed,
e.g. a wheel.
It contains the Python files in the discovered or listed Python packages.
If you want to control what goes here, such as to add data files,
see [Including Data Files](https://setuptools.pypa.io/en/latest/userguide/datafiles.html) from the [setuptools docs](https://setuptools.pypa.io/en/latest/index.html).
4. Generate distribution archives. These are the archives that are uploaded to
the Python Package Index and can be installed by pip.
Example, with _setuptools_ as the build backend:
```shell
python3 -m pip install --upgrade build
python3 -m build
```
Generated files:
```shell
dist/
WAMR-package-0.0.1-py3-none-any.whl
WAMR-package-0.0.1.tar.gz
```
The `.tar.gz` file is a _source archive_ whereas the `.whl` file is a
_built distribution_. Newer pip versions preferentially install built
distributions but will fall back to source archives if needed. You should
always upload a source archive and also provide built archives for the
platforms the project is compatible with.
5. Upload the distribution archives
- Register an account on https://pypi.org.
- To securely upload your project, you'll need a
[PyPI API token](https://pypi.org/help/#apitoken). Create one in the
[API tokens section of your account settings](https://pypi.org/manage/account/#api-tokens)
and set its “Scope” to “Entire account”.
- After registration, use twine to upload the distribution packages.
```shell
# install twine
python3 -m pip install --upgrade twine
# By default, twine uploads to PyPI (https://pypi.org/).
# You will be prompted for a username and password.
# For the username, use __token__.
# For the password, use the token value, including the pypi- prefix.
twine upload dist/*
```
After publishing, the Python binding can be installed with:
```shell
$ pip install wamr
```
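One possible way to confirm that the installation succeeded is to query the installed distribution's version from Python. This is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+). `installed_version` is a hypothetical helper name, and `wamr` is assumed to be the published distribution name, as planned above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if `wamr` is installed, otherwise None.
print(installed_version("wamr"))
```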
PS: An example lifecycle of a Python package:
![python-package-lifecycle](images/python_package_life_cycle.png)
## CI
The CI consists of several parts:
- Code format check.
- Test: run all unit test cases and examples.
- Publish the built distribution.