Learning WebAssembly #7: Introducing WASI

Accessing operating system features from Wasm with examples in Wat.


In the previous part of this series, we executed Wasm modules in Node.js, a popular JavaScript backend platform. In this part, we will see how to make system calls from Wasm.

The WebAssembly System Interface (WASI) is a family of APIs designed as a standard, engine-independent, system-oriented interface for running WebAssembly outside the web. It enables working with files, networking, and other operating system features directly from Wasm.

WASI puts a strong emphasis on security and portability. In fact, these principles are built explicitly into the WASI APIs.

Sandboxing

WebAssembly is sandboxed. This means that Wasm code can’t talk directly to the operating system. For Wasm to do anything with system resources, the host (a browser or another runtime such as Node.js or Wasmtime) must provide functions that the code inside the sandbox can import and call.

In this way, the host can limit what a program can do on a program-by-program basis.
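To make the import mechanism concrete, here is a minimal sketch in JavaScript. The module bytes are hand-assembled for illustration only, and the names env, log, and run are hypothetical (they are not part of WASI or any standard API). The point is that the module can reach the outside world only through the one function the host hands over:

```javascript
// Hand-assembled Wasm binary, equivalent to this Wat (all names are made up):
//   (module
//     (import "env" "log" (func $log (param i32)))
//     (func (export "run") (call $log (i32.const 42))))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00, // types: (i32)->(), ()->()
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76,                   // import module "env"
  0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00,                         //   field "log", func of type 0
  0x03, 0x02, 0x01, 0x01,                                     // one local function of type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,       // export "run" = function 1
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b, // body: i32.const 42; call 0
]);

const wasmModule = new WebAssembly.Module(bytes);

// The host decides what "log" does; here it only records the value.
const received = [];
const instance = new WebAssembly.Instance(wasmModule, {
  env: { log: (value) => received.push(value) },
});
instance.exports.run();

// If the host provides nothing, instantiation fails and no code runs.
let instantiationFailed = false;
try {
  new WebAssembly.Instance(wasmModule, {});
} catch (err) {
  instantiationFailed = true;
}
```

After run() returns, received holds the single value the module passed to the host function.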

Although sandboxing makes systems more secure, the host can still expose capabilities that a module should not have. WASI gives us a way to tighten this security even further.

Security

With WASI, it is possible to attach permissions to different resources on a module-by-module basis. By default, a module has no access to any resource.

When a module needs to access a file in a specific directory, the permission must be explicitly passed to the module:

const wasi = new WASI({
  preopens: {
    '/sandbox': '/real/path/that/wasm/can/access'
  }
});

This brings Wasm closer to the principle of least privilege, where a module can only access the exact resources it needs to do its job.

Portability

Wasm is a portable binary format. This means Wasm code can be compiled once and run on many different machines.

The same Wasm module can be executed in a browser and, for example, in the Node.js runtime.

WASI in Node.js

Node.js has shipped an experimental implementation of WASI since versions 12.16.0 and 13.3.0:

const fs = require('fs');
const { WASI } = require('wasi');

const wasi = new WASI({
  args: process.argv,
  env: process.env,
  preopens: {
    '/sandbox': '/some/real/path'
  }
});

const importObject = { 
  wasi_unstable: wasi.wasiImport
};

(async () => {
  const wasm = await WebAssembly
    .compile(fs.readFileSync('hello.wasm'));

  const instance = await WebAssembly
    .instantiate(wasm, importObject);

  wasi.start(instance);
})();

The following Wat code uses WASI to print a string to standard output:

(module
  (import "wasi_unstable" "fd_write" 
    (func $fd_write (param i32 i32 i32 i32) 
                    (result i32)))

  (memory 1)
  (export "memory" (memory 0))
  (data (i32.const 8) "hello\n")

  (func $main (export "_start")
  
    ;; io vector within memory
    (i32.store (i32.const 0) (i32.const 8))
    (i32.store (i32.const 4) (i32.const 6))

    (call $fd_write
        (i32.const 1)  ;; file_descriptor
        (i32.const 0)  ;; *iovs
        (i32.const 1)  ;; iovs_len
        (i32.const 14) ;; nwritten
    )
    drop ;; drop the result from the stack
  )
)

As WASI is an experimental feature, the --experimental-wasi-unstable-preview1 CLI argument is needed for this example to run:

$ node --version
v14.15.3

$ node --experimental-wasi-unstable-preview1 wasi-node.js
hello

Let's take a closer look at the code above.

The fd_write function is part of the WASI API. The runtime provides it, and our module imports it. It takes four parameters: a file descriptor, a pointer to a list of IO vectors, the length of that list, and a memory offset at which to store the number of bytes written. It returns an error code (0 on success), which we simply drop here. The IO vector is the most interesting part.

An IO vector is a structure that describes a piece of data in memory. It consists of two 32-bit integers: the memory offset where the data starts and the length of the data in bytes:

;; data starts at byte offset 8 in memory
(i32.store (i32.const 0) (i32.const 8))

;; data "hello\n" has length of 6 bytes
(i32.store (i32.const 4) (i32.const 6))

Our vector is stored in 8 bytes of memory starting at offset 0.

Why are two 32-bit integers stored in 8 bytes? Easy numbers: one byte has 8 bits, a 32-bit type needs 32 / 8 = 4 bytes; two 32-bit types need 4 * 2 = 8 bytes.
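The same layout can be reproduced from the host side. The following is a small sketch that uses a plain ArrayBuffer as a stand-in for the module's linear memory (an assumption made for illustration; a real module would expose a WebAssembly.Memory). Wasm stores integers in little-endian byte order, hence the true flag:

```javascript
// Stand-in for linear memory; 64 bytes is an arbitrary size for this sketch.
const memory = new ArrayBuffer(64);
const view = new DataView(memory);

// Equivalent of the two i32.store instructions above.
view.setUint32(0, 8, true); // offset 0..3: where the data starts
view.setUint32(4, 6, true); // offset 4..7: length of the data in bytes

// The IO vector occupies exactly the first 8 bytes.
const iovecBytes = Array.from(new Uint8Array(memory, 0, 8));
console.log(iovecBytes);
```

Each 32-bit field takes four bytes, with the least significant byte first.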

As parameters of the fd_write function, we pass 1 as the stdout file descriptor, 0 as the memory offset of the list of IO vectors, 1 as the length of the list (we have only one string to print, so only one IO vector in the list), and finally, the offset of a free place in memory to store the number of bytes written (the data in linear memory ends just before offset 14, since 8 + 6 = 14).
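To see what the runtime does with these four arguments, here is a simplified host-side sketch in JavaScript. It is not the actual code of Node.js or Wasmtime: makeFdWrite and sink are hypothetical names, the file descriptor is ignored, and the output is collected in an array instead of being written to a real stream. Note that the function returns a WASI error code, while the byte count goes into memory:

```javascript
// Hypothetical, simplified stand-in for a runtime's fd_write.
function makeFdWrite(memory, sink) {
  return function fdWrite(fd, iovsPtr, iovsLen, nwrittenPtr) {
    const view = new DataView(memory.buffer);
    let written = 0;
    for (let i = 0; i < iovsLen; i++) {
      // Each IO vector is 8 bytes: data offset, then data length.
      const ptr = view.getUint32(iovsPtr + i * 8, true);
      const len = view.getUint32(iovsPtr + i * 8 + 4, true);
      const chunk = new Uint8Array(memory.buffer, ptr, len);
      sink.push(new TextDecoder().decode(chunk)); // fd is ignored in this sketch
      written += len;
    }
    view.setUint32(nwrittenPtr, written, true); // byte count goes into memory
    return 0;                                   // error code: 0 means success
  };
}

// Recreate the memory contents of the hello example.
const memory = new WebAssembly.Memory({ initial: 1 });
new Uint8Array(memory.buffer).set(new TextEncoder().encode('hello\n'), 8);
const view = new DataView(memory.buffer);
view.setUint32(0, 8, true);
view.setUint32(4, 6, true);

const output = [];
const errno = makeFdWrite(memory, output)(1, 0, 1, 14);
const nwritten = view.getUint32(14, true);
```

With the same arguments as in the Wat code, the sketch collects the string from offset 8 and stores the byte count at offset 14.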

And of course, we can run the same code with another WASI runtime like Wasmtime as well:

$ wasmtime run hello.wasm
hello

WASI Read and Write

To demonstrate both the read and write capabilities of WASI, we write a slightly more advanced Echo program that reads input from stdin and prints it to stdout afterward:

(module
  (import "wasi_unstable" "fd_read" 
    (func $fd_read 
      (param i32 i32 i32 i32) 
      (result i32)))
  (import "wasi_unstable" "fd_write" 
    (func $fd_write 
      (param i32 i32 i32 i32) 
      (result i32)))

  (memory 1)
  (export "memory" (memory 0))

  (func $main (export "_start")

    ;; IO vector at offset 4: a 100-byte buffer starting at offset 12
    (i32.store (i32.const 4) (i32.const 12))
    (i32.store (i32.const 8) (i32.const 100))

    (call $fd_read
      (i32.const 0) ;; 0 for stdin
      (i32.const 4) ;; *iovs
      (i32.const 1) ;; iovs_len
      (i32.const 8) ;; nread
    )
    drop

    (call $fd_write
      (i32.const 1) ;; 1 for stdout
      (i32.const 4) ;; *iovs 
      (i32.const 1) ;; iovs_len
      (i32.const 0) ;; nwritten
    )
    drop
  )
)

The fd_read function has a familiar signature; it takes a file descriptor, a list of IO vectors to read into, the list length, and a place in memory for the number of bytes read.

This time, the IO vector lives at offset 4 and describes a 100-byte buffer that starts at offset 12:

(i32.store (i32.const 4) (i32.const 12))
(i32.store (i32.const 8) (i32.const 100))

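As the place to store the number of bytes read, we pass offset 8, which is the length field of the IO vector itself. fd_read therefore overwrites the buffer length with the number of bytes actually read, and the same IO vector can then be handed to fd_write to print exactly that many bytes: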

(call $fd_read
  (i32.const 0) ;; 0 for stdin
  (i32.const 4) ;; *iovs
  (i32.const 1) ;; iovs_len
  (i32.const 8) ;; nread
)

Without this trick, the whole buffer of 100 bytes would be written to stdout (with trailing null bytes).
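The effect of pointing nread at the length field can be simulated in JavaScript. This is only a sketch: fakeFdRead is a hypothetical stand-in for the runtime's fd_read that handles a single IO vector and takes its input from a string rather than from stdin.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });
const view = new DataView(memory.buffer);

// IO vector at offset 4: buffer starts at 12, capacity is 100 bytes.
view.setUint32(4, 12, true);
view.setUint32(8, 100, true);

// Hypothetical stand-in for fd_read with a single IO vector.
function fakeFdRead(input, iovsPtr, nreadPtr) {
  const ptr = view.getUint32(iovsPtr, true);
  const capacity = view.getUint32(iovsPtr + 4, true);
  const data = new TextEncoder().encode(input).subarray(0, capacity);
  new Uint8Array(memory.buffer).set(data, ptr);
  view.setUint32(nreadPtr, data.length, true); // overwrites whatever is at nreadPtr
  return 0;                                    // error code: success
}

// nread points at offset 8, the length field of the IO vector.
fakeFdRead('hello, wasi!\n', 4, 8);

const lengthAfterRead = view.getUint32(8, true);
const echoed = new TextDecoder().decode(
  new Uint8Array(memory.buffer, 12, lengthAfterRead)
);
```

After the call, the length field no longer holds 100 but the size of the input, so a subsequent write using the same IO vector covers only the bytes that were read.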

It works as expected:

$ wasmtime run echo.wasm
hello, wasi!
hello, wasi!

Further Steps

This time, we have shown how to work with system resources from Wasm using the WASI API.

In the next part of this series, we will leave Wat programming for a while and take a look at how to compile Wasm modules from different programming languages, such as C, Kotlin, and AssemblyScript.

Stay tuned!