A JS Buffer maxByteLength Solution?

Andrea Giammarchi

When low-level Web APIs are based on guesstimates, you already know what the outcome could be: a (very likely slow) mess!

Meet maxByteLength on MDN to learn more about what this is …

Why not realloc?

Here’s the thing: in order to resize an ArrayBuffer or a SharedArrayBuffer, we need to provide a maxByteLength parameter upfront, and that value is based on … assumptions, nothing else. There’s no way to know what a good amount would be. Hints from around the Web suggest it had better not exceed 1GB of data, while NodeJS apparently caps it at 2GB, and yet this amount is strictly platform dependent: on constrained environments, even asking for 1GB could fail, because the available RAM is below that limit, or already consumed by the rest of the system.

Add to that the fact that a library aiming to provide best effort/performance on a Raspberry Pi, as well as on a 128GB RAM based server, has no way to decide a best value on the spot, and that’s it: an API that backfires on users’ intents, as opposed to providing what has been around forever in the programming field, realloc!

Super slow via SharedArrayBuffer

Trust me when I say that a resizable ArrayBuffer is going to be 2X, up to 5X, faster than a resizable SharedArrayBuffer, to the point you wonder whether duplicating memory and then filling up the shared buffer would outperform just having a resizable shared one around … and you’d be surprised: performing a single grow call on the shared buffer and filling its bytes via a view’s set method beats working directly with it. But you have duplicated the needed RAM in the process, so you’ll probably fail at that point anyway?

Benchmark

I know you don’t want to take my rant for granted, and I wouldn’t either, so here are some numbers:

RESIZABLE BUFFER
encode: 3.037ms cold
decode: 2.514ms cold
encode: 1.389ms hot
decode: 0.78ms hot

RESIZABLE SHARED BUFFER
encode: 7.77ms cold
decode: 3.093ms cold
encode: 4.477ms hot
decode: 2.298ms hot

MAGIC VIEW FIXED BUFFER
encode: 4.995ms cold
decode: 3.695ms cold
encode: 1.513ms hot
decode: 1.585ms hot

MAGIC VIEW RUNTIME BUFFER
encode: 5.089ms cold
decode: 3.892ms cold
encode: 2.08ms hot
decode: 2.013ms hot

DATA VIEW DECODE
decode: 1.591ms cold
decode: 0.95ms hot

FIXED BUFFER - REFERENCE
encode: 1.817ms cold
decode: 1.901ms cold
encode: 1.006ms hot
decode: 1.058ms hot

Let me break that down for you:

  • a RESIZABLE BUFFER is one created with an upfront maxByteLength guess of what the buffer is going to be … “magic guessing” that might overflow the guessed max size
  • a RESIZABLE SHARED BUFFER falls into the same category, just with way slower resizes along the way
  • a MAGIC VIEW FIXED BUFFER is there just to compare imaginary worlds, where you would use such utility even though you already know the final size of the buffer, so that no resize is ever needed
  • a MAGIC VIEW RUNTIME BUFFER is what this post is about: a way to transparently re-grow the underlying ArrayBuffer so that you don’t need to think about any of this at all and RAM is preserved by all means
  • a DATA VIEW DECODE is what you would use to decode a pre-allocated ArrayBuffer one way or another; it’s the fastest native decoding we have to date on the Web
  • a FIXED BUFFER - REFERENCE is there to represent how fast all of this could be if the DataView instance had an already perfectly sized ArrayBuffer usable to both encode and decode, with enough RAM available and guaranteed on the running system … it’s the benchmark reference not by accident!

Analysis

In an ideal world, we would never use the resizable ArrayBuffer primitive, because it’s slow on resizing; we would know the final size upfront instead, but that’s like “crystal ball programming”.

On the other hand, the fastest alternative is to use a resizable ArrayBuffer, but that might suddenly go “out of bounds” if we had no idea how big the “thing” we were going to encode would be: it cannot resize beyond its maximum on demand, and the engine checks the system behind the scenes before agreeing that the maxByteLength size is reasonable, without giving us any way to retrieve such heuristics.

Further down we have the deadly slow SharedArrayBuffer primitive, which cannot compete by any means with generic ArrayBuffer performance. The reason we use this primitive is to have a shared reference we can await on once bytes have been filled, but discovering that this primitive is so slow at resizing might already be a bottleneck or a show-stopper.

As the common case is to fill data into your view, I have created a thin abstraction that provides all DataView methods over an instance able to track changes and, only when needed, create a new buffer after transferring the previous one internally, so that there will be enough room for extra data as long as the system has RAM available, bypassing the need to know upfront how much memory is available.

This module is known as MagicView, and it’s my current attempt to forget about all these constraints around RAM based APIs, allowing with nonchalance all DataView related operations, plus the ability to set any typed array value, or even an array of numbers, as long as those numbers are within the uint8 boundaries.

MagicView in a nutshell

  • it’s a DataView abstraction that lets you forget about memory constraints
  • it’s ideal for incrementally filling up an ArrayBuffer with data
  • the resulting buffer can be used to decode anything via native DataView or even Uint8Array capabilities
  • you can use its extra setTyped, getTyped, setArray and getArray methods when convenient for your use case (fields like those in MessagePack or Buffered Clone)

So that is basically it: one day we’ll have a “just grow as needed” primitive or method that does the right thing, so that developers can stop guessing target hardware capabilities and memory availability. Today that simplification is represented by this module and, as the benchmark states, it’s a wonder to deal with while encoding: nearly as fast as any other native alternative, and more than twice as fast as the SharedArrayBuffer equivalent when it comes to encoding data.

Enjoy 👋

Written by Andrea Giammarchi

Web, Mobile, IoT, and all JS things since 00's. Formerly JS engineer at @nokia, @facebook, @twitter.
