About SharedArrayBuffer & Atomics


Andrea Giammarchi
10 min read · Jun 15, 2023

I’ve managed to polyfill SharedArrayBuffer and Atomics operations in a way that is:

  • unobtrusive, you just import stuff from a module
  • fast, as in: close to native performance for waitAsync operations
  • correct, as in: even Atomics.wait behaves the way you’d expect from a worker

This module is called sabayon and you can find it here 👋

How can I test SharedArrayBuffer?

To start with, and for debugging or development's sake, we need to bootstrap a localhost server that enables all the required headers out of the box. That's also the reason I cannot easily share any live example in this post: I don't know which service enables these headers in the wild.

I also don't know how easy or difficult it is for popular solutions to add such headers, but since I didn't bother investigating that bit, I just use static-handler this way:

npx static-handler --cors --coop --coep --corp -p 8080 ./

# or ...
npx mini-coi ./

If you're curious about which headers are served when those flags are used, feel free to check the source code on GitHub.

If you'd also like to learn more about these headers, or why they're needed at all, please check this article.

You should now be able to reach http://localhost:8080/, but bear in mind static-handler doesn't reveal folder contents or anything else: it literally serves assets whenever they're reached via URL, and returns a 404 otherwise.

Understanding threads data roundtrip

Before we start playing around with postMessage and Atomics, I'd like to explain how data travels back and forth across threads:

  • the main thread at the top level (the visited page) cannot block, so it can postMessage any data compatible with the structured clone algorithm, but it can only await results via Atomics.waitAsync. Because it's not blocking, the main thread can also keep listening to message events while waiting for results.
  • any worker thread, though, can use Atomics.wait instead after a postMessage, but because wait() is blocking, no event loop will ever dispatch or react to any listener while it's waiting for changes.
// main thread - demo / dummy example
const sharedBuffer = new SharedArrayBuffer(4);
const anyData = {/* any structured clone compatible data */};
const worker = new Worker('./worker.js');
worker.addEventListener('message', doSomethingWhileWaiting);
worker.postMessage({sharedBuffer, anyData});
// listeners and event loop running while awaiting notifications
await Atomics.waitAsync(new Int32Array(sharedBuffer), 0).value;

// worker thread - demo / dummy example
// (sharedBuffer as received via the message listener)
postMessage({sharedBuffer, anyData});
// execution blocked until the buffer is notified
// no queueMicrotask, no listeners, no setInterval/Timeout
Atomics.wait(new Int32Array(sharedBuffer), 0);

Bear in mind that while postMessage can send almost any kind of data with ease, the only data we can read back through the buffer is binary, hence we'd better agree on some convention to receive results (i.e. JSON or similar, so that the data coming back is easy to read and consume).
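As a tiny sketch of such a convention (the helper names are illustrative, not part of any library): serialize results as JSON, write the string's char codes into the shared buffer, and parse them back on the other side.

```javascript
// write a JSON-serialized value into a shared buffer as UTF-16
// char codes, returning how many chars were written
const encodeResult = (sab, value) => {
  const json = JSON.stringify(value);
  const view = new Uint16Array(sab);
  for (let i = 0; i < json.length; i++)
    view[i] = json.charCodeAt(i);
  return json.length;
};

// read `length` chars back and parse them as JSON
const decodeResult = (sab, length) => JSON.parse(
  String.fromCharCode(...new Uint16Array(sab, 0, length))
);

// usage: what the notified side would do after its wait is over
const sab = new SharedArrayBuffer(256);
const written = encodeResult(sab, {score: [2, 1]});
const result = decodeResult(sab, written); // {score: [2, 1]}
```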

Communication strategies

There are at least 2 ways to exchange data with shared buffers:

  • one pass: the buffer is big enough to ensure all needed data/operations fit into the memory pre-allocated for it
  • multi pass: the buffer starts minimal and is used first to ask how much memory is actually needed to resolve the task, and then to populate a new buffer of the required length

The first case is more common in WASM or ASM related operations, as a SharedArrayBuffer is immutable in terms of allocated memory, but it might result in greedier memory consumption. Example: we could ask for 10MB of RAM thinking that's enough to run a WASM module or interpreter, letting it ask for more memory once the limit is reached, while on average the module would maybe use only a few KBs. Use many such modules in the same page and we can say "bye bye" to our RAM, especially with many pages open in multiple tabs or browsers.

The latter case might instead be relatively slower (we'll see how much in a bit), but it guarantees that no more RAM than needed will ever be consumed, and used RAM can be quickly garbage collected after.
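As a sketch of the multi-pass arithmetic we'll use later (the helper name is made up for illustration): a string of N chars needs N × 2 bytes as UTF-16, rounded up so the resulting buffer length stays usable with an Int32Array for waiting and notifying.

```javascript
// bytes needed to store `charCount` UTF-16 char codes, padded up
// to Int32Array alignment so the buffer remains waitable
const waitableByteLength = charCount => {
  const bytes = charCount * Uint16Array.BYTES_PER_ELEMENT;
  return bytes + (bytes % Int32Array.BYTES_PER_ELEMENT);
};

waitableByteLength(3); // 8: 6 bytes of chars + 2 bytes of padding
waitableByteLength(4); // 8: already aligned, no padding needed
```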

As previously mentioned, there are 2 scenarios for Atomics to consider:

  • the main thread asking for exchange via a worker (async on the main thread, sync or async within the worker)
  • the worker asking for exchanges via the main thread (it could be async but that’s boring so we’ll see the sync version in here)

Each scenario could have either a single-pass strategy or a multi-pass one.

Main to Worker — Single Pass

Let's start with the use case of needing heavy, non-blocking computation via a Worker: we need an m2w.html page and an m2w.js file.

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Main to Worker</title>
<script type="module">
// this is a module just to have top level await enabled

// define enough length to contain all data we want
// in this case we want to receive two integers for
// a match, to know the final score of each team
const length = 2 * Int32Array.BYTES_PER_ELEMENT;

// bear in mind, to be usable with Atomics the length
// must be one usable by Int32Array (or BigInt64Array)
const sab = new SharedArrayBuffer(length);

// use a compatible view for both atomics and result
const match = new Int32Array(sab);

// spin a worker and send over the buffer
const worker = new Worker('./m2w.js');
worker.postMessage(sab);

// wait to be notified at index zero (enough for this case)
await Atomics.waitAsync(match, 0).value; // note .value

const team1 = Atomics.load(match, 0);
const team2 = Atomics.load(match, 1);

// show the result in the body
document.body.textContent =
`Team ${team1 > team2 ? 1 : 2} WON !!!`;
</script>
</head>
</html>
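Note the .value access in the script above: Atomics.waitAsync doesn't return a Promise directly; it returns an {async, value} object, where value is either a plain string ('not-equal' or 'timed-out') or a Promise that eventually resolves to 'ok'. A minimal sketch, assuming an engine that ships waitAsync (e.g. Chrome, or Node 16+):

```javascript
const view = new Int32Array(new SharedArrayBuffer(4));

// view[0] is 0: waiting for 1 fails the equality check synchronously
const miss = Atomics.waitAsync(view, 0, 1);
// miss.async is false, miss.value is the string 'not-equal'

// waiting for the current value hands back a pending Promise instead
const hit = Atomics.waitAsync(view, 0, 0);
// hit.async is true, hit.value is a Promise

// a notification on index 0 resolves that Promise with 'ok'
Atomics.notify(view, 0);
```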

Why Int32Array?

Besides being the right view choice for this example, there are limitations around which views one can use to wait on or notify a SharedArrayBuffer: Int32Array and BigInt64Array are the only options.
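A quick sketch of that limitation: wait, waitAsync and notify type-check their view and throw a TypeError for anything that isn't an Int32Array or BigInt64Array.

```javascript
const sab = new SharedArrayBuffer(8);

// fine: Int32Array is a waitable view; with nobody waiting,
// notify returns the number of agents woken up, i.e. 0 here
Atomics.notify(new Int32Array(sab), 0);

// TypeError: Uint16Array (or any other view) is rejected
let rejected = false;
try {
  Atomics.notify(new Uint16Array(sab), 0);
} catch (err) {
  rejected = err instanceof TypeError; // true
}
```

With that constraint clear, here's the worker side of the match: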

// m2w.js - worker code
addEventListener('message', ({data: sab}) => {
// wrap the SharedArrayBuffer in a way we can
// both write results and notify back once we're done
const match = new Int32Array(sab);

// play a random game up to 3 points
let team1 = 0;
let team2 = 0;

while ((team1 + team2) < 3) {
if (Math.random() < .5)
Atomics.store(match, 0, ++team1);
else
Atomics.store(match, 1, ++team2);
}

// notify the match is completed
Atomics.notify(match, 0);
});

Please note that while Atomics operations are generally safer than just reading or writing indexes, when we're sure no concurrent access could possibly interfere, we can also just assign results directly, without worrying about the extra verbosity.

That means that both ways are equivalent:

// main page code after awaiting
// instead of Atomics.load
const [team1, team2] = match;

// worker code to set values
while ((team1 + team2) < 3) {
// instead of Atomics.store
if (Math.random() < .5)
match[0] = ++team1;
else
match[1] = ++team2;
}
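A sketch of that equivalence on a shared view: plain indexed access and Atomics.load/store read and write the same underlying memory.

```javascript
const match = new Int32Array(new SharedArrayBuffer(8));

Atomics.store(match, 0, 2);  // write via Atomics ...
match[1] = 3;                // ... or via plain index assignment

match[0];                 // 2: a plain read sees the Atomics.store
Atomics.load(match, 1);   // 3: Atomics.load sees the plain write
```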

Anyway, reaching http://localhost:8080/m2w.html (if you used the same names I did) should show either "Team 1 WON !!!" or "Team 2 WON !!!" … cool?

Worker to Main — Multi Pass

Instead of playing silly games, this time we're going to ask the main thread to give us some localStorage data, as this storage is not available in workers.

The reason this example uses the multi-pass strategy is that we can't possibly know ahead of time how much data was stored in a specific entry … got it? Great! Let's see how different it is from single-pass, using w2m.html as the page and w2m.js as the worker.

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Worker to Main</title>
<script>
// no top level await needed, just a regular script

// add some key to localStorage for demo sake
localStorage.setItem(
'sab-atomics',
`some random ${Math.random()} value 🥳`
);

// spin a worker that will ask for a key in localStorage
const worker = new Worker('./w2m.js');

// listen to requests assuming a shared array buffer and a key
// to look for in localStorage is passed along
worker.addEventListener('message', ({data: {sab, key, length}}) => {
// use an Int32Array view to notify the worker when needed
const notifier = new Int32Array(sab);
const entry = localStorage.getItem(key) || '';

// if the worker is asking for a specific length
if (length) {
notifier[0] = entry.length;
}
// populate the buffer with all chars as UInt16Array
// because JS has UTF-16 strings: each char code is max 0xFFFF
else {
for (let view = new Uint16Array(sab), i = 0; i < entry.length; i++)
view[i] = entry.charCodeAt(i);
}

// operation completed, free the worker
Atomics.notify(notifier, 0);
});
</script>
</head>
</html>

Why not TextEncoder to populate?

Thanks for asking. The TextEncoder.encodeInto helper doesn't like shared array buffers, so we need to re-create all chars manually as UTF-16 codes.

… so, that's why: no shared buffers allowed!

// w2m.js - worker code: shortcuts for demo sake
const {BYTES_PER_ELEMENT: I32_BYTES} = Int32Array;
const {BYTES_PER_ELEMENT: UI16_BYTES} = Uint16Array;

// mimic localStorage (read only for demo sake)
const localStorage = {
getItem(key) {
// pre-flight: just ask for the length
let sab = new SharedArrayBuffer(I32_BYTES);

// ask for the key length
postMessage({sab, key, length: true});

// block until received
const getLength = new Int32Array(sab);
Atomics.wait(getLength, 0);

// create a length usable with Uint16Array
const [length] = getLength;

// if there was an item
if (length) {
// calculate UInt16 bytes needed to store all char codes
const BYTES = length * UI16_BYTES;

// create a new buffer used to wait for data
// round up length to the nearest % 4
sab = new SharedArrayBuffer(BYTES + (BYTES % I32_BYTES));

// ask to populate the entry
postMessage({sab, key, length: false});

// wait for notification over the new sab
Atomics.wait(new Int32Array(sab), 0);

// return the reconstructed string from those chars
// removing trailing \x00 bytes due to Int32 rounding up
return String.fromCharCode(
...new Uint16Array(sab).slice(0, length)
);
}

// otherwise return null
return null;
}
};

// for demo sake, check the console!
console.log([
// see this is synchronous and returned as string
localStorage.getItem('sab-atomics'),

// see this is null as non existent
localStorage.getItem('nope')
]);

Reaching http://localhost:8080/w2m.html this time should log to the console an array with the localStorage entry, and null as the second value since no entry was found, just like it would happen on the main page.

Enter coincident

I hope we agree that what we've seen so far opens tons of previously impossible possibilities, but it should also be fairly clear that the amount of boilerplate needed for each exchange is both repetitive and boring to write. On top of that, as previously mentioned, dealing with just strings kinda limits the potential of the pattern; a worker could also be shared, so our listeners might interfere with other operations; and we'd like to make the whole dance way more natural to both write and consume … but fear not!

This module not only makes all operations fully transparent to the consumer, it also fixes (internally) Firefox's lack of Atomics.waitAsync functionality, and it provides, through a secured CHANNEL (a unique data field identifier that avoids clashing with other listeners), a handy Proxy that orchestrates all the things with ease.

As an example, this is how a fully capable localStorage from a Worker would look; this time we use index.html and local-storage.js as drivers:

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Worker localStorage</title>
<script type="module">
import coincident from 'https://unpkg.com/coincident';

// spin the Worker allowing modules in there
const worker = new Worker('./local-storage.js', {type: 'module'});

// define a callback usable by the worker to operate
coincident(worker).localStorage = (prop, ...args) => {
const value = localStorage[prop];
// don't invoke non-methods such as `length`
return typeof value === 'function' ?
value.apply(localStorage, args) :
value;
};
</script>
</head>
</html>
// local-storage.js - worker code
import coincident from 'https://unpkg.com/coincident';

// create a proxy for exchanges
const main = coincident(self);

// define the localStorage like reference
const localStorage = {
get length() {
return main.localStorage('length');
}
};

// add all generic Store methods to the reference
for (const prop of ['key', 'getItem', 'setItem', 'removeItem', 'clear'])
localStorage[prop] = (...args) => main.localStorage(prop, ...args);

// That's it!

// if we haven't cleared up from previous example this works
console.log(localStorage.getItem('sab-atomics'));

// and this would work too
console.log(localStorage.length);

// let's try this as well
localStorage.setItem('coincident', Math.random());
console.log(localStorage.getItem('coincident'));

Please reach http://localhost:8080/ and read the console … see how simple it is to enable any synchronous or asynchronous exchange now with this module? And it works main-to-worker too with the exact same API, except, as we discussed, the main thread needs to await results by standard design.

Conclusion

The Atomics API might be seen as an extremely advanced topic for most developers, and so might the usage of SharedArrayBuffer in general.

However, this topic was pretty new to me a couple of weeks ago too; or better, I had never needed or tried to use it. At the end of the day though, simple things are simple to implement, and once those simple things are wrapped in a super easy to use and reason about utility, such as coincident, there's really no reason to be scared of, or avoid, these modern primitives. They push all developers in the right direction, where expensive tasks can be delegated and awaited, any sync-to-async task can be performed from a worker, and a plethora of new patterns and libraries can be built around this Web wonder!

I hope you enjoyed this post as much as I did in both writing it and finding meaningful examples, also fixing the undefined return case in my lib 😅

P.S. About that single vs. multi pass benchmark …

single pass: 0.1962890625 ms
multi pass: 0.21484375 ms

I hope we can agree that 0.02 ms to postMessage and Atomics.wait twice is not really an issue, compared to the amount of RAM saved in the process.

Written by Andrea Giammarchi

Web, Mobile, IoT, and all JS things since 00's. Formerly JS engineer at @nokia, @facebook, @twitter.
