With that preamble, here are some of the changes I’m making to D3 with an eye towards usability. But first a crash course on D3’s data-join.
Case 1. Removing the magic of enter.append.
“D3” stands for Data-Driven Documents. The data refers to the thing you want to visualize, and the document refers to its visual representation. It’s called a “document” because D3 is based on the standard model for web pages: the Document Object Model.
A simple page might look like this:
<!DOCTYPE html>
<svg width="960" height="540">
  <g transform="translate(32,270)">
    <text x="0">b</text>
    <text x="32">c</text>
    <text x="64">d</text>
    <text x="96">k</text>
    <text x="128">n</text>
    <text x="160">r</text>
    <text x="192">t</text>
  </g>
</svg>
This happens to be an HTML document containing an SVG element, but you don’t need to know the semantics of every element and attribute to grasp the concept. Just know that each element, such as <text>…</text> for a piece of text, is a discrete graphical mark. Elements are grouped hierarchically (the <svg> contains the <g>, which contains the <text>, and so on) so that you can position and style groups of elements.
A corresponding simple dataset might look like this:
var data = [
  "b",
  "c",
  "d",
  "k",
  "n",
  "r",
  "t"
];
This dataset is an array of strings. (A string is a character sequence, though the strings here are individual letters.) But data can have any structure you want, if you can represent it in JavaScript.
For each entry (each string) in the data array, we need a corresponding <text> element in the document. This is the purpose of the data-join: a concise method for transforming a document — adding, removing, or modifying elements — so that it corresponds to data.
The data-join takes as input an array of data and an array of document elements, and returns three selections:
- The enter selection represents “missing” elements (incoming data) that you may need to create and add to the document.
- The update selection represents existing elements (persisting data) that you may need to modify (for example, repositioning).
- The exit selection represents “leftover” elements (outgoing data) that you may need to remove from the document.
The data-join doesn’t modify the document itself. It computes enter, update and exit, and then you apply the desired operations to each. That affords expressiveness: for example, to animate elements as they enter and exit.
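To make the three selections concrete, here is a toy sketch of a keyed join in plain JavaScript. It is not D3's implementation: the `join` function, the `{ datum }` objects standing in for document elements, and the sample inputs are all hypothetical, chosen only to show how enter, update and exit fall out of comparing keys.

```javascript
// Toy data-join (illustrative only, not D3's actual code).
// "Elements" are plain objects of the form { datum }, standing in for DOM nodes.
function join(elements, data, key) {
  // Index the existing elements by the key of their bound datum.
  const byKey = new Map(elements.map(el => [key(el.datum), el]));
  const enter = [];  // data with no matching element yet
  const update = []; // existing elements whose key is still in the data

  for (const d of data) {
    const k = key(d);
    if (byKey.has(k)) {
      update.push(byKey.get(k));
      byKey.delete(k); // whatever is left at the end is exiting
    } else {
      enter.push({ datum: d }); // placeholder for an element to create
    }
  }

  const exit = [...byKey.values()]; // elements whose key is gone from the data
  return { enter, update, exit };   // nothing has been added or removed yet
}

// Hypothetical example: the document shows b, c, d; the new data is c, d, k.
const elements = ["b", "c", "d"].map(d => ({ datum: d }));
const result = join(elements, ["c", "d", "k"], d => d);

console.log("enter:", result.enter.map(e => e.datum).join(","));
console.log("update:", result.update.map(e => e.datum).join(","));
console.log("exit:", result.exit.map(e => e.datum).join(","));
```

Note that the sketch only classifies: like the real data-join, it leaves the adding, repositioning and removing to the caller.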
As you can imagine, the data-join is something you use often — when first creating a visualization, and again whenever the data changes. The usability of this feature is hugely important to D3’s overall usefulness. It looks like this:
var text = g
  .selectAll("text")
  .data(data, key); // JOIN

text.exit() // EXIT
  .remove();

text // UPDATE
  .attr("x", function(d, i) { return i * 32; });

text.enter() // ENTER
  .append("text")
  .attr("x", function(d, i) { return i * 32; }) // 🌶
  .text(function(d) { return d; });
I’m glossing over a few details (like the key function that assigns data to elements), but I hope the gist is conveyed. After joining to data, the code above removes exiting elements, repositions updating elements, and appends entering elements.
There’s an irksome usability problem in the above code, which I’ve marked with a hot pepper 🌶. It is duplicate code: setting the x attribute on enter and update.
It’s common to apply operations to both entering and updating elements. If an element is updating (i.e., you’re not creating it from scratch), you may need to modify it to reflect the new data. Those modifications often also apply to entering elements, since they must reflect the new data, too.
D3 2.0 introduced a change to address this duplication: appending to the enter selection would now copy entering elements into the update selection. Thus, any operations applied to the update selection after appending to the enter selection would apply to both entering and updating elements, and duplicate code could be eliminated:
var text = g
  .selectAll("text")
  .data(data, key); // JOIN

text.exit() // EXIT
  .remove();

text.enter() // ENTER
  .append("text") // 🌶
  .text(function(d) { return d; });

text // ENTER + UPDATE
  .attr("x", function(d, i) { return i * 32; });
Alas, this made usability worse.
First, there’s no indication of what’s happening under the hood (poor role-expressiveness, or perhaps a hidden dependency). Most of the time, selection.append creates, appends and selects new elements; it does that here, but it also silently modifies the update selection. Surprise!
Second, the code is now dependent on the order of operations: if the operations to the update selection are applied before enter.append, they only affect updating nodes; if they occur after, they affect both entering and updating. The goal of the data-join is to eliminate such intricate logic, and to enable a more declarative specification of document transformations without complicated branching and iteration. The code might look simple, but it’s brushed the complexity under the rug.
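The order dependency is easier to see in a toy model. The sketch below is plain JavaScript, not D3's source: `legacyJoin`, `setX` and the object shapes are hypothetical, and selections are modeled as arrays aligned by data index with null holes. The only thing it borrows from the old behavior is the side effect: appending to enter also writes the new elements into the update selection.

```javascript
// Toy model of the old enter.append side effect (illustrative only).
function legacyJoin(existing, data) {
  // Update selection: the existing element at each data index, or null.
  const update = data.map((d, i) => existing[i] || null);
  const enter = {
    append() {
      // Create an element for every datum that has no existing element...
      const created = data.map((d, i) => (update[i] ? null : { text: d, x: null }));
      // ...and, as a hidden side effect, copy it into the update selection.
      created.forEach((el, i) => { if (el) update[i] = el; });
      return created;
    }
  };
  return { update, enter };
}

// Position every element currently present in a selection.
function setX(selection) {
  selection.forEach((el, i) => { if (el) el.x = i * 32; });
}

// Order 1: position first, append second. The new element is never positioned.
const first = legacyJoin([{ text: "b", x: null }], ["b", "c"]);
setX(first.update);
first.enter.append();
console.log(first.update.map(el => el.x)); // x values: 0, null

// Order 2: append first, position second. Both elements are positioned.
const second = legacyJoin([{ text: "b", x: null }], ["b", "c"]);
second.enter.append();
setX(second.update);
console.log(second.update.map(el => el.x)); // x values: 0, 32
```

The two halves contain the same calls and differ only in order, yet they produce different documents.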
D3 4.0 removes the magic of enter.append. (In fact, D3 4.0 removes the distinction between enter and normal selections entirely: there is now only one class of selection.) In its place, a new selection.merge method can unify the enter and update selections:
var text = g
  .selectAll("text")
  .data(data, key); // JOIN

text.exit() // EXIT
  .remove();

text.enter() // ENTER
  .append("text")
  .text(function(d) { return d; })
  .merge(text) // ENTER + UPDATE
  .attr("x", function(d, i) { return i * 32; });
This eliminates the duplicate code without corrupting the behavior of a common method (selection.append) and without introducing a subtle dependency on ordering. Furthermore, the selection.merge method serves as a signpost to unfamiliar readers, which they can look up in the documentation.
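As a rough mental model of what merging does, here is a sketch in plain JavaScript. It assumes selections can be pictured as arrays aligned by data index with null holes; the `merge` function and the sample objects are hypothetical stand-ins, not D3's code. The point it illustrates is that merging returns a new selection and leaves both inputs untouched.

```javascript
// Toy model of merging two index-aligned selections (illustrative only).
// Holes (null) in `a` are filled with the element at the same index in `b`.
function merge(a, b) {
  return a.map((el, i) => el || b[i] || null); // new array; inputs unchanged
}

// Hypothetical state after a join: "b" already exists, "c" is entering.
const update = [{ text: "b", x: null }, null];
const enter = [null, { text: "c", x: null }];

// Operations on the merged selection reach entering and updating elements.
const both = merge(enter, update);
both.forEach((el, i) => { if (el) el.x = i * 32; });

console.log(both.map(el => el.text + "@" + el.x).join(" ")); // b@0 c@32
console.log(update[1]); // null: the update selection still has its hole
```

Because nothing is mutated behind the scenes, it no longer matters whether other operations on the update selection run before or after the append.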
Maxim 1. Avoid overloading meaning.
What can we learn from this failure? D3 3.x violated a Rams principle: good design makes a product understandable. In cognitive dimension terms, it had poor consistency because selection.append behaved differently on enter selections, so users couldn’t extend their understanding of normal selections to enter selections. It had poor role-expressiveness because that special behavior wasn’t obvious from the code. And it had a hidden dependency: operations on the text selection had to run after appending to enter, though nothing in the code made this requirement apparent.
D3 4.0 avoids overloading meaning. Rather than silently adding functionality to enter.append, however useful that is in a common case, selection.append now only ever appends elements. If you want to merge selections, you need a new method! Hence, selection.merge.