Data Visualization with d3.js

By: Swizec Teller

Overview of this book

d3.js provides a platform that helps you create your own beautiful visualizations and bring data to life using HTML, SVG, and CSS. It emphasizes web standards, so it can take full advantage of your web browser's capabilities.

Data Visualization with d3.js walks you through 20 examples in great detail. You can finally stop struggling to piece together examples you've found online. With this book in hand, you will learn enough of the core concepts to conceive of and build your own visualizations from scratch.

The book begins with the basics of putting lines on the screen, and builds on this foundation all the way to creating interactive animated visualizations using d3.js layouts.

You will learn how to use d3.js to manipulate vector graphics with SVG, lay out pages with HTML, and style them with CSS. You'll take a look at the basics of functional programming and at using data structures effectively, covering everything from handling time to doing geographic projections. The book will also help you make your visualizations interactive and teach you how automated layouts really work.

Data Visualization with d3.js will unveil the mystery behind all those beautiful examples you've been admiring.

A simple histogram


We'll go through the basics of d3.js by creating a histogram that shows when GitHub users commit code. We're going to label the axes, make sure things are scalable, and add animations for that extra bit of flair.

The dataset contains 504,015 repositories and it took me a week to create it out of punchcard data for each repository. A punchcard is just a 7 x 24 grid of buckets, specifying how many commits happened within a specific day and hour. The dataset's histogram digest is hosted at http://nightowls.swizec.com/data/histogram-hours.json and maps hours to the sum of commits occurring within that hour.
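To make the punchcard idea concrete, here is a minimal sketch of how a 7 x 24 grid could be collapsed into an hours histogram. The helper names (emptyPunchcard, hoursHistogram) and the commit counts are made up for illustration; this is not the script that produced the real dataset.

```javascript
// Hypothetical punchcard: 7 rows (days) x 24 columns (hours) of commit counts.
function emptyPunchcard() {
  var card = [];
  for (var day = 0; day < 7; day++) {
    var row = [];
    for (var hour = 0; hour < 24; hour++) row.push(0);
    card.push(row);
  }
  return card;
}

// Collapse any number of punchcards into a dictionary of hour -> total commits.
function hoursHistogram(punchcards) {
  var histogram = {};
  for (var hour = 0; hour < 24; hour++) histogram[hour] = 0;
  punchcards.forEach(function (card) {
    card.forEach(function (row) {
      row.forEach(function (commits, hour) {
        histogram[hour] += commits;
      });
    });
  });
  return histogram;
}

// Made-up sample values, not taken from the real dataset.
var card = emptyPunchcard();
card[0][9] = 3;   // 3 commits on day 0 at 09:00
card[4][9] = 2;   // 2 commits on day 4 at 09:00
card[6][23] = 5;  // 5 commits on day 6 at 23:00

var histogram = hoursHistogram([card]);
console.log(histogram[9], histogram[23]); // 5 5
```

The result has the same shape as the JSON file we are about to load: a dictionary that maps each hour to a sum of commits.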

This is what we're aiming for:

We begin by taking the environment prepared in the previous section and adding a few lines around the central <div> tag:

<div class="container">
  <div class="row">
    <div id="graph" class="span12"></div>
  </div>
</div>

The extra <div> tags center the graph horizontally and ensure that we have 900 px of width to work with. Don't forget to add the class="span12" attribute to the graph div. It tells Bootstrap the div should span the whole width of the grid.

To avoid tripping your browser's security restrictions regarding cross-domain requests, you should now take a moment to download the dataset and save it next to the other files. Remember, it's at http://nightowls.swizec.com/data/histogram-hours.json.

You can play around with the following code in Chrome Developer Tools to see what it does and then save it in code.js. Writing directly to the file also works, but make sure you refresh frequently. You're only learning if you know what each line does.

We begin with some variables as follows:

var width = 900, height = 300, pad = 20, left_pad = 100;

We're going to use these to specify the dimensions of our drawing area. The pad variable will define the padding from the edge, with left_pad giving a bigger margin on the left to allow for labels.

Next, we define a horizontal scale, x:

var x = d3.scale.ordinal().rangeRoundBands([left_pad, width - pad], 0.1);

The x scale is now a function that maps inputs from a yet unknown domain (we don't have the data yet) to a range of values between left_pad and width - pad, that is, between 100 and 880 with some spacing defined by the 0.1 value. Because it's an ordinal scale, the domain will have to be discrete rather than continuous. rangeRoundBands means the range will be split into bands that are guaranteed to be round numbers.
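If you are curious what splitting a range into round bands might look like, here is a simplified stand-in written in plain JavaScript. The roundBands function is hypothetical, and its arithmetic is an assumption modeled on how d3 behaves, so treat the exact pixel values as approximate rather than as what d3 will produce.

```javascript
// Simplified stand-in for an ordinal scale with rounded bands (not d3's source).
function roundBands(domain, range, padding) {
  var start = range[0], stop = range[1], n = domain.length;
  // One step is a band plus the gap after it; outer padding is assumed
  // to equal the inner padding.
  var step = Math.floor((stop - start) / (n - padding + 2 * padding));
  // Center the bands inside whatever space the rounding left over.
  var offset = Math.round((stop - start - (n - padding) * step) / 2);
  var positions = {};
  domain.forEach(function (value, i) {
    positions[value] = start + offset + step * i;
  });
  var scale = function (value) { return positions[value]; };
  scale.rangeBand = function () { return Math.round(step * (1 - padding)); };
  return scale;
}

// 24 hourly buckets squeezed between left_pad (100) and width - pad (880).
var hours = [];
for (var h = 0; h < 24; h++) hours.push(h);
var bands = roundBands(hours, [100, 880], 0.1);

console.log(bands(0), bands(1), bands.rangeBand()); // 108 140 29
```

Every position and the band width come out as whole numbers, which is the point of the "round" in rangeRoundBands: bars land on pixel boundaries and stay crisp.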

Then, we define another scale named y:

var y = d3.scale.linear().range([height-pad, pad]);

Similarly, the y scale is going to map a yet unknown linear domain to a range between height-pad and pad, that is, 280 and 20. Inverting the range is important because SVG coordinates put y=0 at the top of the drawing.
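A linear scale is just linear interpolation. The following sketch uses a hypothetical linearScale helper and a made-up domain maximum of 1000 (we don't have the real data yet) to show how an inverted range flips the axis.

```javascript
// Minimal linear scale: interpolate a value from the domain into the range.
// A stand-in for illustration, not d3's implementation.
function linearScale(domain, range) {
  return function (value) {
    var t = (value - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  };
}

var height = 300, pad = 20;
// The domain maximum of 1000 is an assumption for this example.
var y = linearScale([0, 1000], [height - pad, pad]);

console.log(y(0));    // 280, the bottom of the drawing area
console.log(y(500));  // 150
console.log(y(1000)); // 20, the top of the drawing area
```

Small values map to large y coordinates and large values to small ones, so taller bars reach further up the screen.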

Now, we define our axes as follows:

var xAxis = d3.svg.axis().scale(x).orient("bottom");
var yAxis = d3.svg.axis().scale(y).orient("left");

We've told each axis what scale to use when placing ticks and which side of the axis to put the labels on. D3 will automatically decide how many ticks to display, where they go, and how to label them.

The last step before loading the data is defining an SVG element for the histogram:

var svg = d3.select("#graph").append("svg")
                .attr("width", width).attr("height", height);

Switching quickly to the Elements tab, you'll notice a new svg element with a width of 900 and a height of 300.

Now the fun begins!

We're going to use d3.js itself to load data remotely and then draw the graph in the callback function. Remember to use Shift + Enter to input multiline code in the Chrome console. Now might be a good time to switch to coding in code.js directly and refreshing after every couple of steps:

d3.json('histogram-hours.json', function (data) {
});

d3.json will create an Ajax request to load a JSON file, then parse the received text into a JavaScript object. D3 understands CSV and some other data formats as well, which is kind of awesome if you ask me.

From here on, we put everything in that callback function (before the }); bit). Our data will be in the data variable. D3 is a functional data-munging library, so we need to transform our dictionary data into a list of simple objects. We do this using the following code:

data = d3.keys(data).map(function (key) {
  return {bucket: Number(key),
          N: data[key]};
});

d3.keys returns a list of keys in the data dictionary, which we then map over with an iterator function that returns a simple dictionary for every item. It tells us where an item fits in the histogram (bucket) and what value it holds (N).

We've turned our data into a list of two-value dictionaries.
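You can try the same transformation without d3.js or a network request. In this sketch, JSON.parse stands in for the parsing step of d3.json and Object.keys stands in for d3.keys; the three sample values are made up.

```javascript
// Made-up JSON text in the same shape as the histogram file: hour -> commits.
var text = '{"0": 120, "1": 80, "2": 45}';

// Parse the text into a plain dictionary.
var raw = JSON.parse(text);

// Turn the dictionary into a list of {bucket, N} objects.
var data = Object.keys(raw).map(function (key) {
  return {bucket: Number(key), N: raw[key]};
});

console.log(data.length);                 // 3
console.log(data[0].bucket, data[0].N);   // 0 120
```

Note that the keys arrive as strings, which is why the bucket is passed through Number.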

Remember the x and y scales from before? We can finally give them a domain and make them useful:

    x.domain(data.map(function (d) { return d.bucket; }));
    y.domain([0, d3.max(data, function (d) { return d.N; })]);

Since most d3.js elements are objects and functions at the same time, we can change the internal state of both scales without assigning the result to anything. The domain of x is a list of discrete values. The domain of y is a range from 0 to d3.max of our dataset—the largest value.
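The idea of something being an object and a function at the same time can be sketched in a few lines. The makeScale helper below is hypothetical and far simpler than a real d3.js scale, but it follows the same pattern: the function keeps its state in a closure, and a method on the function reads or changes that state.

```javascript
// A function that also carries a getter/setter method, in the d3 style.
function makeScale() {
  var domain = [0, 1]; // internal state, hidden in the closure

  // Calling the scale maps a value to a fraction of the domain.
  function scale(value) {
    return (value - domain[0]) / (domain[1] - domain[0]);
  }

  // With no arguments, return the current domain; with one, set it.
  scale.domain = function (newDomain) {
    if (!arguments.length) return domain;
    domain = newDomain;
    return scale; // returning the scale allows method chaining
  };

  return scale;
}

var s = makeScale();
s.domain([0, 200]);         // changes internal state, result not assigned
console.log(s(50));         // 0.25
console.log(s.domain()[1]); // 200
```

Because the setter returns the scale itself, calls can be chained, and because the state lives inside the function, there is nothing to reassign.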

Now we're going to draw the axes on our graph:

svg.append("g")
  .attr("class", "axis")
  .attr("transform", "translate(0, "+(height-pad)+")")
  .call(xAxis);

We've appended an element called g to the graph, given it the CSS class "axis", and moved the element to a place at the bottom-left of the graph with the transform attribute.

Finally, we call the xAxis function and let d3.js handle the rest.

Drawing the other axis works exactly the same, but with different arguments:

svg.append("g")
  .attr("class", "axis")
  .attr("transform", "translate("+(left_pad-pad)+", 0)")
  .call(yAxis);
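It can help to see the actual transform strings these two snippets produce with our dimensions. The translate helper here is hypothetical and only does the string concatenation.

```javascript
var height = 300, pad = 20, left_pad = 100;

// Build an SVG translate() string from horizontal and vertical offsets.
function translate(dx, dy) {
  return "translate(" + dx + ", " + dy + ")";
}

var xAxisTransform = translate(0, height - pad);   // moves the x axis down
var yAxisTransform = translate(left_pad - pad, 0); // moves the y axis right

console.log(xAxisTransform); // translate(0, 280)
console.log(yAxisTransform); // translate(80, 0)
```

The x axis ends up 280 px from the top, and the y axis sits 80 px from the left edge, leaving room for its labels.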

Now that our graph is labeled, it's finally time to draw some data:

svg.selectAll('rect')
  .data(data)
  .enter()
  .append('rect')
  .attr('class', 'bar')
  .attr('x', function (d) { return x(d.bucket); })
  .attr('width', x.rangeBand())
  .attr('y', function (d) { return y(d.N); })
  .attr('height', function (d) { return height-pad - y(d.N); });

Okay, there's plenty going on here, but this code is saying something very simple: for all rectangles (rect) in the graph, load our data, go through it, and for each item append a rect and then define some attributes.

The x scale helps us calculate the horizontal positions and rangeBand gives the width of the bar. The y scale calculates vertical positions and we manually get the height of each bar from y to the bottom. Note that whenever we needed a different value for every element, we defined an attribute as a function (x, y, and height); otherwise, we defined it as a value (width).

Keep this in mind when you're tinkering.
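Here is the bar geometry worked out by hand for two made-up data points. The y function is a hand-written stand-in for the real scale, with an assumed domain maximum of 100, and barAttrs is a hypothetical helper.

```javascript
var height = 300, pad = 20;

// Made-up data; maxN plays the role of d3.max over the dataset.
var data = [{bucket: 0, N: 50}, {bucket: 1, N: 100}];
var maxN = 100;

// Stand-in for the y scale: domain [0, maxN] to range [height - pad, pad].
function y(n) {
  return (height - pad) + (n / maxN) * (pad - (height - pad));
}

// The top edge comes from the scale; the height reaches down to the x axis.
function barAttrs(d) {
  return {y: y(d.N), height: height - pad - y(d.N)};
}

var bars = data.map(barAttrs);
console.log(bars[0]); // { y: 150, height: 130 }
console.log(bars[1]); // { y: 20, height: 260 }
```

For every bar, y plus height adds up to 280, so all bars rest on the horizontal axis.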

Let's add some flourish and make each bar grow out of the horizontal axis. Time to dip our toes into animations!

Add four lines to the preceding code, between the width and y attributes:

svg.selectAll('rect')
  .data(data)
  .enter()
  .append('rect')
  .attr('class', 'bar')
  .attr('x', function (d) { return x(d.bucket); })
  .attr('width', x.rangeBand())
  .attr('y', height-pad)
  .transition()
  .delay(function (d) { return d.bucket*20; })
  .duration(800)
  .attr('y', function (d) { return y(d.N); })
  .attr('height', function (d) { return height-pad - y(d.N); });

The difference is that we statically put all bars at the bottom (height-pad) and then entered a transition with .transition(). From here on, we define the transition we want.

First, we wanted each bar's transition to start 20 milliseconds later than the previous bar's, using d.bucket*20. This gives the histogram a neat effect, gradually appearing from left to right instead of jumping up all at once. Next, we said we wanted each animation to last just shy of a second with .duration(800). In the end, we defined the final values for the animated attributes (y and height are the same as in the previous code), and d3.js is going to take care of the rest.
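The staggering is simple arithmetic, and a quick sketch shows when each bar starts and finishes. The timing helper is hypothetical; the numbers come from the delay and duration used above.

```javascript
var delayPerBucket = 20; // milliseconds of extra delay per bucket
var duration = 800;      // milliseconds each bar takes to grow

// Compute when the transition for a given bucket starts and ends.
function timing(bucket) {
  var start = bucket * delayPerBucket;
  return {start: start, end: start + duration};
}

console.log(timing(0));  // { start: 0, end: 800 }
console.log(timing(23)); // { start: 460, end: 1260 }
```

With 24 buckets, the last bar starts growing 460 ms after the first one, and the whole animation settles after about 1.26 seconds.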

Refresh the page and voila! A beautiful histogram appears as shown in the following screenshot:

Hmm, not really. We need some CSS to make everything look perfect.

Remember that you can look at the full code on GitHub at https://github.com/Swizec/d3.js-book-examples/tree/master/ch1 if you didn't get something similar to the preceding screenshot.

Let's go into our HTML file and add some CSS on line 4, right after including Bootstrap:

<style>
  .axis path,
  .axis line {
    fill: none;
    stroke: #eee;
    shape-rendering: crispEdges;
  }

  .axis text {
    font-size: 11px;
  }

  .bar {
    fill: steelblue;
  }
</style>

This is why we added all those classes to shapes. We made the axes thin, gave them a light-gray color, and used a relatively small font for the labels. The bars should be steel blue. Refresh the page now and the histogram is beautiful:

I suggest playing around with the values for width, height, left_pad, and pad to get a feel for the power of d3.js. You'll notice everything scales and adjusts to any size without having to change the rest of the code. Marvelous!