The Web smallest DOM diffing library

This post is a recap of a personal journey, that lasted about 3 years, in search of the best DOM diffing options I could find.

Update

Not sure if it was thanks to this post that js-diff-benchmark was created, which is also based on udomdiff testing code, but we’re benchmarking various options, extracted from various libraries and frameworks, to find which is the actual smallest diffing library and which is the most performant.

Let’s provide some context …

Early in March 2017, I’ve published, out of the blue, the hyperHTML library, and since that day, I’ve kinda condemned myself to find “better solutions” for it, at least until these recent days “new” discoveries that put an end to it (aka: me wasting weekends).

The pattern proposed in that library, at that time, was a first of a kind: a “never seen, new and cool” whatsoever “vanilla JS” thing, that also required, still in early 2017, manual tweaks and some ad-hoc “footgun” to handle changes on the Web/DOM UI side.

The “TL;DR” description of that library is that it lets you update well known DOM nodes, somehow related to some data of various kind, and at blazing fast speed … but it also leaves you in charge of possible shenanigans.

Please don’t get hyperHTML wrong: it’s been rock solid since, and deployed in production to million users for more than 1.5 years, so that such library is not really what this post is about, rather the “diffing thing” behind such library is.

What’s DOM diffing anyway

I am fairly sure you’ve been on Facebook or Twitter at least once in your life, so that understanding how UI updates work there is pretty simple: there is a new feed, tweet, info, or even a new advertisement to show? You get it!

Now, imagine you are in the middle of a long session in there: would you expect your browser to suddenly reload the whole thing from the scratch, making you lose the scroll position and information you were reading just few seconds before?

Well, DOM diffing is some sort of unobtrusive and smart “mapping data to the view” technique, mostly in charge of hopefully updating only the tiniest parts of the view that need changes, and only when their related data changed too:

  • some re-tweet count increased? DOM diffing!
  • somebody is typing to reply? DOM diffing!
  • a sudden like or answer to a Facebook post? DOM diffing!

and so on and so forth …

About The Diffing’s Complexity

When it comes to algorithms, everyone talks in Big O notation terms, but if you don’t have such academic knowledge, all it should matter to you is that “it just works” or, in most common cases, “it works best” 🎉

As side note, that’s a bit disrespectful for all the incredibly smart people that published papers around the topic, as we should know better, and most papers are even published online, so that we could know better indeed 😉

The most “look at this paper” publication regarding diffing matters, easily goes to the An O(ND) Difference Algorithm and Its Variations (PDF), a publication from Eugene W. Myers from the 1986, who surely deserved a Wikipedia page.

As many wrote here and there, that publication gives you an understanding of the problem, and why the proposed solution is superior to others.

I’ve even recently published a general purpose speedy-myers JS module, but despite the module name, I’ve contextually been underwhelmed by its general poor performance when it comes to the DOM world:

It’s not even a Myers’ algorithm issue though, ’cause most published diffing attempts, known to date, assume the diffing is mostly about text:

  • multiple online documents collaborations? check
  • any GitHub/GitLab live code diff based review? check
  • books authors previews and reviews? check

… but while it’s awesome that all these topics got figured out, the DOM is a different beast that cannot really be optimized in the same way.

What’s Special About The DOM?

Apparently an underestimated issue most wouldn’t care to solve, let’s see what’s so special about the DOM:

  • a node cannot simultaneously exist in two places
  • operations to mutate the DOM have side effects all over (because of the previous constrain)

To provide an example: the string “cool” and “cold” need theoretically 2 operations to morph: “replace the extra o with l” followed by “replace the final l with d”, right?

But some algorithm might do it differently: ”shrink the oosequence to o first, and then “append d”, which are still two operations, but with a new “look for sequences to shrink” concept.

And while all this is super interesting, in the DOM world there’s not even a case where two o can simultaneously be considered, so that all constraints are different to start with, but extremely relevant at the same time.

Patching On Demand

The “variations” part of the O(ND) Myers’ paper is a golden suggestion that boils down to this concept: “optimize everything you can until the moment you can’t anymore”.

That’s indeed what µdomdiff tries to do, applying most common patterns as it crawls the list, trying to avoid greedy “search best thing to do” operations:

  • same list? check
  • only new nodes to append or prepend? check
  • just two rows to swap? check
  • just one row to replace? check
  • LCS sequences to skip? check

Plus more, applicable at any point during the crawling of the new list of nodes to consider, and with the intent to keep DOM operations at a reasonable minimum, without trying to greedly find the absolute best diffing at all ’cause … you know … YAGNI 😅

Nothing really new, but …

During these years I’ve tried to apply Myers’ algorithm, the Levenshtein distance, and also the Hunt-Szymanski one, which is doing generally a better work in terms of overall performance, and yet it’s “a big one”.

Other libraries used any other kind of strategy too, so that basically everyone on the Web has tried to solve the same issue for years, bringing me to even ask the platform to help out behind the scene and yet, in a time where there’s a module for absolutely anything on npm, nobody has made one to solve the most common thing we need to solve on the Web.

Well, the “new” thing here is that not only I’ve already published domdiff a while ago, used by various projects, but from “today” there’s also µdomdiff, which goal is to keep it as small and simple as possible, but not simpler.

We’re talking about a library/helper that weights less than 0.5K, once compressed, and it applies a minimal amount of changes needed to go from a view to another:

import udomdiff from 'udomdiff';const nodes = udomdiff(
document.querySelector('#container'),
[...currentNodes],
[...futureNodes],
getNodeCallback,
optionalAnchorNode
);

If you look at the source code you’ll also find in comments all the operations performed by this differ, and libraries such as µhtml are using it already, tested against all cases previous domdiff library was tested too, and performing extremely well, considering its little size and logic.

And Goodbye Complexity 👋

I don’t want to sell you that µdomdiff is the most performing library of this kind, even if it’s close to that, but I do want to tell you that it’s the smallest of its kind, and that it performs already exceptionally well in every benchmark I could try so far.

In its worst case scenario, µdomdiff is still very fast, so that I hope you’ll experiment with it through any vanilla JS or any other library of choice.

Me? I think this is one of those challenges I can finally consider done 🎉

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store