Learning WebAssembly #2: Wasm Binary Format

Discovering the basic representation of WebAssembly: binary and text formats.


In the first part of this series, we created a simple Wasm program and executed it in a browser and as a server application. In this part, we will take a closer look at the generated Wasm binary code.

WebAssembly (Wasm) is a binary instruction format. Wat is a textual representation of Wasm to be read and edited by humans.

The following text focuses exclusively on the binary format. If you are only interested in the Wat textual format, feel free to skip directly to the third part.

Binary Wasm

In the first part of this series, we saw the following Wat code:

(module
  (func (export "main") 
        (result i32)
    i32.const 42 
    return))

Now, we are about to explore the same WebAssembly program in its binary representation.

After being compiled into the Wasm format, the program bytes read as follows:

00000000: 0061 736d
00000004: 0100 0000
00000008: 0105 0160
0000000c: 0001 7f03
00000010: 0201 0007
00000014: 0801 046d
00000018: 6169 6e00
0000001c: 000a 0701
00000020: 0500 412a
00000024: 0f0b

This is the binary format of WebAssembly. We will walk through the code byte by byte.

The first four bytes are the Wasm binary magic number \0asm; the next four bytes are the Wasm binary version (1), stored as a little-endian 32-bit integer:

; Wasm magic number (\0asm) 
0000000: 0061 736d

; version
0000004: 0100 0000
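As a quick check, the header can be decoded with a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the eight bytes are copied by hand from the dump above.

```python
import struct

# The first eight bytes of the module, copied from the hex dump above.
header = bytes.fromhex("0061736d 01000000")

magic = header[:4]
# "<I" means: little-endian, unsigned 32-bit integer.
(version,) = struct.unpack("<I", header[4:8])

print(magic)    # b'\x00asm'
print(version)  # 1
```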

In Wasm, modules are organized into sections (type section, function section, export section, etc.). Each section starts with a one-byte section ID (1 for the section “type”), followed by the section size, which is the number of bytes that make up the rest of the section (5 here):

; section "type" (ID 1)
0000008: 01

; section size (5 bytes)
0000009: 05
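Sizes, counts, and indexes in the binary format are encoded as unsigned LEB128 integers: each byte carries seven bits of the value, and a set high bit means that another byte follows. In our example every such value is below 128, so it fits into a single byte. A minimal decoder might look like this (the function name read_u32_leb128 is an assumption made for this sketch, not part of any library):

```python
def read_u32_leb128(data, pos):
    """Decode an unsigned LEB128 integer starting at data[pos].

    Returns (value, position of the first byte after the integer).
    """
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # the low 7 bits carry the payload
        if (byte & 0x80) == 0:            # high bit clear: this was the last byte
            return result, pos
        shift += 7


# Header of the type section: ID 0x01 followed by the size 0x05.
section_header = bytes.fromhex("01 05")
section_id = section_header[0]
section_size, next_pos = read_u32_leb128(section_header, 1)
print(section_id, section_size)  # 1 5

# A larger value needs several bytes: 0xe5 0x8e 0x26 decodes to 624485.
big_value, _ = read_u32_leb128(bytes.fromhex("e5 8e 26"), 0)
print(big_value)  # 624485
```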

The type section contains function signatures. Our example has one function with zero parameters and one return result of type i32 (32-bit integer):

; number of types (1)
0000000a: 01

; func
0000000b: 60

; number of parameters (0)
0000000c: 00

; number of results (1)
0000000d: 01

; result type i32
0000000e: 7f
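The five bytes of the type section can be decoded along the same lines. The sketch below assumes that every count fits into a single byte, which holds for our example but not in general; the names VALUE_TYPES and signatures are illustrative only.

```python
# Value type codes from the binary format.
VALUE_TYPES = {0x7F: "i32", 0x7E: "i64", 0x7D: "f32", 0x7C: "f64"}

# Content of the type section: the five bytes after the section size.
body = bytes.fromhex("01 60 00 01 7f")

pos = 0
type_count = body[pos]
pos += 1

signatures = []
for _ in range(type_count):
    if body[pos] != 0x60:
        raise ValueError("expected the function type marker 0x60")
    pos += 1

    param_count = body[pos]
    pos += 1
    params = [VALUE_TYPES[b] for b in body[pos:pos + param_count]]
    pos += param_count

    result_count = body[pos]
    pos += 1
    results = [VALUE_TYPES[b] for b in body[pos:pos + result_count]]
    pos += result_count

    signatures.append((params, results))

print(signatures)  # [([], ['i32'])]
```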

After five bytes (the section size) a new section begins. In our example, ID 3 stands for a function section. For each function defined in the module, this section stores the index of its signature in the type section:

; section "function" (ID 3)
0000000f: 03

; section size (2 bytes)
00000010: 02

; number of functions (1)
00000011: 01 

; index of the function signature (0)
00000012: 00  
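In other words, the function section is just a list of type indexes, one per function. A tiny sketch, again assuming single-byte values and using made-up variable names:

```python
# Content of the function section: the two bytes after the section size.
body = bytes.fromhex("01 00")

function_count = body[0]
# type_indices[i] is the index of the signature used by function i.
type_indices = list(body[1:1 + function_count])

print(type_indices)  # [0]
```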

Next, the export section (ID 7) follows. The section defines the export name with the index to our function:

; section "export" (ID 7)
0000013: 07

; section size (8 bytes)
0000014: 08

; number of exports (1)
0000015: 01

; length of the export name (4 bytes)
0000016: 04

; export name ("main")
0000017: 6d61 696e

; export kind (0 for function)
000001b: 00

; index of the exported function (0)
000001c: 00
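The export section can be read the same way: a count, then for each export a length-prefixed UTF-8 name, a kind byte, and an index. The following sketch assumes single-byte lengths and indexes; EXPORT_KINDS and exports are names chosen for illustration.

```python
# Export kind codes from the binary format.
EXPORT_KINDS = {0x00: "func", 0x01: "table", 0x02: "memory", 0x03: "global"}

# Content of the export section: the eight bytes after the section size.
body = bytes.fromhex("01 04 6d61696e 00 00")

pos = 0
export_count = body[pos]
pos += 1

exports = []
for _ in range(export_count):
    name_length = body[pos]
    pos += 1
    name = body[pos:pos + name_length].decode("utf-8")
    pos += name_length

    kind = EXPORT_KINDS[body[pos]]
    pos += 1
    index = body[pos]
    pos += 1

    exports.append((name, kind, index))

print(exports)  # [('main', 'func', 0)]
```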

The next section is a code section (ID 10) that represents the actual code of the function:

; section "code" (ID 10)
000001d: 0a

; section size (7 bytes)
000001e: 07

; number of functions (1)
000001f: 01

; function body size (5 bytes)
0000020: 05

; number of local declarations (0) 
0000021: 00

In our example, a numeric constant 42 (the answer) is pushed onto the stack and returned as the result:

; instruction i32.const 
0000022: 41

; i32 literal (42)
0000023: 2a

; return
0000024: 0f

The very last byte is the end opcode, which terminates the function body:

; end of the function code
0000025: 0b
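To see how the function body maps back to instructions, here is a small disassembler sketch. It knows only the three opcodes used in our example and assumes that the i32.const operand, a signed LEB128 integer, fits into a single byte; the names OPCODES and instructions are illustrative.

```python
# Only the opcodes that appear in our example.
OPCODES = {0x41: "i32.const", 0x0F: "return", 0x0B: "end"}

# The function body: the five bytes after the body size.
body = bytes.fromhex("00 41 2a 0f 0b")

local_declaration_count = body[0]
pos = 1

instructions = []
while pos < len(body):
    name = OPCODES[body[pos]]
    pos += 1
    if name == "i32.const":
        raw = body[pos]
        pos += 1
        if raw & 0x80:
            raise ValueError("multi-byte operands are not handled in this sketch")
        # Single-byte signed LEB128: bit 6 is the sign bit.
        value = raw - 0x80 if raw & 0x40 else raw
        instructions.append(f"{name} {value}")
    else:
        instructions.append(name)

print(instructions)  # ['i32.const 42', 'return', 'end']
```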

Summary

We have seen that a Wasm binary consists of an 8-byte header (magic number and version) followed by a sequence of sections. Our simple function is spread across four of them: type, function, export, and code.

The whole program as pseudocode would read as follows:

Wasm magic number
version

section "type"
  func
  result type: i32

section "function"
  index of the function signature: 0

section "export"
  export name: "main"
  export kind: function
  index of the function: 0

section "code"
  i32.const 
  i32 literal: 42
  return
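The section layout can also be verified programmatically. The sketch below walks over the whole 38-byte module and lists each section with its offset and size. It relies on the fact that all section sizes in our example are below 128 and therefore occupy a single byte; SECTION_NAMES covers only the four sections we have met.

```python
SECTION_NAMES = {1: "type", 3: "function", 7: "export", 10: "code"}

# The complete module, copied from the hex dump at the beginning.
module = bytes.fromhex(
    "0061736d 01000000"        # magic number, version
    "01 05 01 60 00 01 7f"     # type section
    "03 02 01 00"              # function section
    "07 08 01 04 6d61696e 00 00"  # export section
    "0a 07 01 05 00 41 2a 0f 0b"  # code section
)

pos = 8  # skip the 8-byte header
sections = []
while pos < len(module):
    section_id = module[pos]
    size = module[pos + 1]  # single-byte size, valid only for sizes below 128
    sections.append((SECTION_NAMES[section_id], pos, size))
    pos += 2 + size         # ID byte + size byte + section content

for name, offset, size in sections:
    print(f"{name:<8} offset=0x{offset:02x} size={size}")
```

The offsets it reports (0x08, 0x0f, 0x13, 0x1d) match the ones in the annotated listings above.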

Further Steps

Our short excursion into the binary world of WebAssembly is over. I hope that it was just long enough to make a good impression on you and interesting enough for you not to get sick of the bytes.

For details, refer to the WebAssembly Core Specification.

In the next part of this series, we will delve deeper into the basics of Wat programming.

Stay tuned!