Learning WebAssembly #2: Wasm Binary Format
Discovering the basic representation of WebAssembly: binary and text formats.
In the first part of this series, we created a simple Wasm program and executed it in a browser and as a server application. In this part, we will take a closer look at the generated Wasm binary code.
WebAssembly (Wasm) is a binary instruction format. Wat is a textual representation of Wasm to be read and edited by humans.
The following text focuses exclusively on the binary format. If you are only interested in the Wat textual format, feel free to skip directly to the third part.
Binary Wasm
In the first part of this series, we saw the following Wat code:
(module (func (export "main") (result i32) i32.const 42 return))
Now, we are about to explore the same WebAssembly program in its binary representation.
After being compiled into the Wasm binary format, the program consists of the following bytes:
00000000: 0061 736d
00000004: 0100 0000
00000008: 0105 0160
0000000c: 0001 7f03
00000010: 0201 0007
00000014: 0801 046d
00000018: 6169 6e00
0000001c: 000a 0701
00000020: 0500 412a
00000024: 0f0b
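To follow along, the dump can be turned back into real bytes. The following is a minimal sketch in Python; the names MODULE, hexdump and DUMP are only illustrative, and the four-bytes-per-line layout is chosen to mirror the listing above.

```python
# The 38 bytes of the module, copied from the hex dump above.
MODULE = bytes.fromhex(
    "0061736d 01000000 01050160 00017f03 02010007"
    "0801046d 61696e00 000a0701 0500412a 0f0b"
)


def hexdump(data: bytes, width: int = 4) -> list[str]:
    """Format data as 'offset: xxxx xxxx' lines, `width` bytes per line."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width].hex()
        # Group the hex digits two bytes (four digits) at a time.
        groups = " ".join(chunk[i:i + 4] for i in range(0, len(chunk), 4))
        lines.append(f"{offset:08x}: {groups}")
    return lines


DUMP = hexdump(MODULE)
print("\n".join(DUMP))
```

Writing MODULE to a file with a .wasm extension should give the same module that was run in the first part.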
This is the binary format of WebAssembly. We will explore it byte by byte.
The first four bytes are the Wasm binary magic number \0asm (a zero byte followed by the ASCII characters "asm"); the next four bytes encode the binary format version, here 1, as a 32-bit little-endian integer:
; Wasm magic number (\0asm)
00000000: 0061 736d
; version
00000004: 0100 0000
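As a small illustration, the eight header bytes can be checked in Python. This is only a sketch; HEADER, magic and version are names made up for the example.

```python
import struct

# The first eight bytes of the module: magic number and version.
HEADER = bytes.fromhex("0061736d 01000000")

magic = HEADER[:4]
# "<I" reads an unsigned 32-bit integer in little-endian byte order.
(version,) = struct.unpack("<I", HEADER[4:8])

print(magic, version)
```

The little-endian order is why version 1 appears as 01 00 00 00 rather than 00 00 00 01.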
In Wasm, modules are organized into sections (type section, function section, export section, etc.). Each section starts with a one-byte section ID (1 for the type section), followed by the size of the section contents, encoded as a variable-length unsigned integer (LEB128). Here, the size is 5 bytes:
; section "type" (ID 1)
00000008: 01
; section size (5 bytes)
00000009: 05
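Because every section announces its own size, a reader can hop from section to section without understanding the contents. The sketch below walks the sections of our module in Python; read_u32 and split_sections are hypothetical helper names, and the size is decoded as an unsigned LEB128 integer (a single byte for all sizes in this example).

```python
MODULE = bytes.fromhex(
    "0061736d 01000000 01050160 00017f03 02010007"
    "0801046d 61696e00 000a0701 0500412a 0f0b"
)


def read_u32(data: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:  # high bit clear: this was the last byte
            return result, pos
        shift += 7


def split_sections(module: bytes) -> list[tuple[int, int, bytes]]:
    """Return (section ID, size, payload) for each section of the module."""
    pos = 8  # skip the magic number and the version
    found = []
    while pos < len(module):
        section_id = module[pos]
        size, pos = read_u32(module, pos + 1)
        found.append((section_id, size, module[pos:pos + size]))
        pos += size  # jump straight to the next section
    return found


SECTIONS = split_sections(MODULE)
for section_id, size, payload in SECTIONS:
    print(section_id, size, payload.hex())
```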
The type section contains function signatures. Our example has one function with zero parameters and one return result of type i32 (32-bit integer):
; number of types (1)
0000000a: 01
; func
0000000b: 60
; number of parameters (0)
0000000c: 00
; number of results (1)
0000000d: 01
; result type i32
0000000e: 7f
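The five bytes of the type section can be decoded with a few lines of Python. This is a simplified sketch, not a full decoder: it assumes every count fits into a single byte, and the names VALUE_TYPES, TYPE_SECTION and decode_types are made up for the example.

```python
# Byte codes of the four basic value types.
VALUE_TYPES = {0x7F: "i32", 0x7E: "i64", 0x7D: "f32", 0x7C: "f64"}

# Contents of the type section (without the section ID and size).
TYPE_SECTION = bytes.fromhex("01 60 00 01 7f")


def decode_types(payload: bytes) -> list[tuple[list[str], list[str]]]:
    """Return one (parameter types, result types) pair per signature."""
    it = iter(payload)
    signatures = []
    type_count = next(it)  # assumption: counts are below 128 (one byte)
    for _ in range(type_count):
        if next(it) != 0x60:  # 0x60 marks a function type
            raise ValueError("expected a function type (0x60)")
        param_count = next(it)
        params = [VALUE_TYPES[next(it)] for _ in range(param_count)]
        result_count = next(it)
        results = [VALUE_TYPES[next(it)] for _ in range(result_count)]
        signatures.append((params, results))
    return signatures


TYPES = decode_types(TYPE_SECTION)
print(TYPES)
```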
After these five bytes (the section size), a new section begins. ID 3 identifies the function section. For each function in the module, this section stores the index of the function's signature in the type section:
; section "function" (ID 3)
0000000f: 03
; section size (2 bytes)
00000010: 02
; number of functions (1)
00000011: 01
; index of the function's signature in the type section (0)
00000012: 00
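The function section is only a list of indexes into the type section, which the following Python sketch makes explicit. TYPES holds the single signature of our example written out by hand, decode_function_section is a hypothetical helper, and counts and indexes are assumed to fit into one byte each.

```python
# The one signature from the type section: no parameters, one i32 result.
TYPES = [([], ["i32"])]

# Contents of the function section (without the section ID and size).
FUNCTION_SECTION = bytes.fromhex("01 00")


def decode_function_section(payload: bytes) -> list[int]:
    """Return the signature index of each function, in declaration order."""
    count = payload[0]  # assumption: fewer than 128 functions
    return list(payload[1:1 + count])


TYPE_INDICES = decode_function_section(FUNCTION_SECTION)
# Function number i has the signature TYPES[TYPE_INDICES[i]].
SIGNATURES = [TYPES[index] for index in TYPE_INDICES]
print(TYPE_INDICES, SIGNATURES)
```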
Next, the export section (ID 7) follows. It maps the export name to the index of our function:
; section "export" (ID 7)
00000013: 07
; section size (8 bytes)
00000014: 08
; number of exports (1)
00000015: 01
; length of the export name (4 bytes)
00000016: 04
; export name ("main")
00000017: 6d61 696e
; export kind (0 for function)
0000001b: 00
; index of the exported function (0)
0000001c: 00
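An export entry is a length-prefixed UTF-8 name followed by a kind byte and an index. The sketch below decodes the eight bytes of our export section in Python; decode_exports is a hypothetical helper, and lengths, counts and indexes are assumed to fit into one byte each.

```python
# Export kinds as defined by the binary format.
EXPORT_KINDS = {0: "func", 1: "table", 2: "memory", 3: "global"}

# Contents of the export section (without the section ID and size).
EXPORT_SECTION = bytes.fromhex("01 04 6d61696e 00 00")


def decode_exports(payload: bytes) -> list[tuple[str, str, int]]:
    """Return (name, kind, index) for each export."""
    exports = []
    pos = 1  # payload[0] is the number of exports
    for _ in range(payload[0]):
        name_length = payload[pos]
        name = payload[pos + 1:pos + 1 + name_length].decode("utf-8")
        pos += 1 + name_length
        kind = EXPORT_KINDS[payload[pos]]
        index = payload[pos + 1]
        pos += 2
        exports.append((name, kind, index))
    return exports


EXPORTS = decode_exports(EXPORT_SECTION)
print(EXPORTS)
```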
The next section is a code section (ID 10) that represents the actual code of the function:
; section "code" (ID 10)
0000001d: 0a
; section size (7 bytes)
0000001e: 07
; number of functions (1)
0000001f: 01
; function body size (5 bytes)
00000020: 05
; number of local declarations (0)
00000021: 00
In our example, a numeric constant 42 (the answer) is pushed onto the stack and returned as the result:
; instruction i32.const
00000022: 41
; i32 literal (42)
00000023: 2a
; return
00000024: 0f
The very last byte is the end opcode, which simply marks the end of the function code:
; end of the function code
00000025: 0b
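To see the stack machine at work, the five-byte function body can be run by a toy interpreter. The Python sketch below is an illustration only: it knows just the three opcodes used in our example, it rejects bodies that declare locals, and the names read_s32 and run are made up. The operand of i32.const is stored as a signed LEB128 integer.

```python
# Function body: local declarations, i32.const 42, return, end.
FUNCTION_BODY = bytes.fromhex("00 41 2a 0f 0b")


def read_s32(data: bytes, pos: int) -> tuple[int, int]:
    """Decode a signed LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if byte & 0x80 == 0:  # last byte of the number
            if byte & 0x40:  # sign bit set: the value is negative
                result -= 1 << shift
            return result, pos


def run(body: bytes) -> int:
    """Execute a body that uses only i32.const, return and end."""
    if body[0] != 0:
        raise NotImplementedError("local declarations are not supported")
    stack = []
    pos = 1
    while pos < len(body):
        opcode = body[pos]
        pos += 1
        if opcode == 0x41:  # i32.const: push the literal that follows
            value, pos = read_s32(body, pos)
            stack.append(value)
        elif opcode in (0x0F, 0x0B):  # return or end: yield the top of the stack
            return stack.pop()
        else:
            raise NotImplementedError(f"unsupported opcode 0x{opcode:02x}")
    raise ValueError("function body has no end opcode")


RESULT = run(FUNCTION_BODY)
print(RESULT)
```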
Summary
We have seen that the Wasm binary code is organized as a sequence of sections. Our simple function is spread across four sections: type, function, export, and code.
The whole program as pseudocode would read as follows:
Wasm magic number
version
section "type"
  func
    result type: i32
section "function"
  index of the function signature: 0
section "export"
  export name: "main"
  export kind: function
  index of the function: 0
section "code"
  i32.const
  i32 literal: 42
  return
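The same outline can be turned around to build the module by hand. The following Python sketch assembles the four sections and compares the result with the bytes from the hex dump; the helpers vec and section are hypothetical, and both assume that every length fits into a single byte, which holds for this tiny module.

```python
def vec(items: list[bytes]) -> bytes:
    """Encode a vector: the number of items, then the items themselves."""
    return bytes([len(items)]) + b"".join(items)


def section(section_id: int, payload: bytes) -> bytes:
    """Encode a section: ID, payload size, payload (size below 128 assumed)."""
    return bytes([section_id, len(payload)]) + payload


# type section: one func type with no parameters and one i32 result
type_section = section(1, vec([b"\x60" + vec([]) + vec([b"\x7f"])]))
# function section: our function uses signature 0
function_section = section(3, vec([b"\x00"]))
# export section: name "main", kind 0 (function), function index 0
name = b"main"
export_section = section(7, vec([bytes([len(name)]) + name + b"\x00\x00"]))
# code section: no locals, then i32.const 42, return, end
body = vec([]) + b"\x41\x2a\x0f\x0b"
code_section = section(10, vec([bytes([len(body)]) + body]))

ASSEMBLED = (
    b"\x00asm"
    + (1).to_bytes(4, "little")  # version 1
    + type_section
    + function_section
    + export_section
    + code_section
)
print(ASSEMBLED.hex())
```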
Further Steps
Our short excursion into the binary world of WebAssembly is over. I hope that it was just long enough to make a good impression on you and interesting enough for you not to get sick of the bytes.
For detailed information you might refer to the WebAssembly Core Specification.
In the next part of this series, we will delve deeper into the basics of Wat programming.
Stay tuned!