Doing the Line Charts Right

I recently joined Datawrapper, an open source project that aims to provide simple, embeddable charts for journalists. Really, no fancy stuff here; we’re just talking about line charts and bar charts. Limiting ourselves to those types gave us a good opportunity to think about the best way of doing them. So it happened that this week I was thinking a bit about the perfect line chart.

Listen to Tufte and keep it simple

Of course you cannot talk about perfect charts without mentioning the great books of Edward Tufte. In The Visual Display of Quantitative Information in particular, he summed up a lot of good advice for line charts. He argued that it’s a good idea to look at what he called the data-ink ratio, and showed how removing certain chart elements can increase a chart’s readability. For instance, you don’t need to draw a box around the chart area. You can also use the ends of the axis lines to display the minimum and maximum values in the data.

Forget about the separate legend

Separate legends are the worst-case scenario in the line chart world. Often the legend sits below the chart, or lists the lines in an arbitrary order. You want to allow instant identification of the lines, but forcing viewers to look them up in a legend takes way too much time. Instead, you should put the labels somewhere close to the lines.

The great side effect of putting the labels next to the lines is that you no longer depend on fancy colors or disturbing symbols to identify individual lines. Extra points for simplicity.
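One practical wrinkle with direct labels is that lines ending close together produce overlapping labels. Here is a minimal sketch of one way to handle that, assuming the labels go at the right-hand end of each line. The function name `place_end_labels` and the `min_gap` parameter are made up for illustration, and this is not necessarily how Datawrapper does it.

```python
def place_end_labels(last_values, min_gap):
    """Return a label y-position for each series, keyed by series name.

    last_values: dict mapping series name -> y position of the line's last point
    min_gap: smallest allowed vertical distance between two labels (same units)
    Both parameters are illustrative assumptions, not part of any real API.
    """
    placed = {}
    previous = None
    # Walk the series from bottom to top and push a label up only when it
    # would sit closer than min_gap to the label below it.
    for name, y in sorted(last_values.items(), key=lambda item: item[1]):
        if previous is not None and y - previous < min_gap:
            y = previous + min_gap
        placed[name] = y
        previous = y
    return placed


# Two lines end almost at the same height, so the upper label is nudged up.
print(place_end_labels({"a": 10, "b": 11, "c": 30}, min_gap=5))
# {'a': 10, 'b': 15, 'c': 30}
```

The sketch only ever pushes labels upward, which keeps it short; a real implementation would probably also spread labels around their lines and keep them inside the chart area.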

Highlight what’s important

Although it is possible to tell a hundred stories with a single line chart, it makes a lot of sense to keep the focus on just one story. Therefore you should highlight just one or two important lines in the chart, and keep the others as context in the background.

Baseline zero or not?

Sometimes you hear the advice that every (line) chart should have a baseline of zero, because otherwise it would be “lying”. As a counter-example, here’s the (approximate) intraday stock quote data of the Facebook IPO day using baseline zero. The reason why nobody shows stock charts this way is obvious.

It’s almost impossible to see the ups and downs of the first day of the Facebook stock. Without the zero-baseline the chart reveals much more of the data.
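To put a rough number on this, the sketch below computes which fraction of the chart height the data actually occupies, with and without a forced baseline. The helper name `vertical_space_used` and the four prices are purely illustrative assumptions; they are not the quote data shown in the chart.

```python
def vertical_space_used(values, baseline=None):
    """Fraction of the chart height covered by the data's own range.

    baseline=None lets the y-axis start at the smallest value; baseline=0
    forces a zero-based axis. Hypothetical helper, for illustration only.
    """
    lo, hi = min(values), max(values)
    # The axis starts at the baseline, unless the data reaches even lower.
    bottom = lo if baseline is None else min(baseline, lo)
    if hi == bottom:
        raise ValueError("the axis needs a non-zero range")
    return (hi - lo) / (hi - bottom)


prices = [38.0, 42.0, 45.0, 40.0]  # made-up values, not the real quotes
print(round(vertical_space_used(prices, baseline=0), 3))  # 0.156
print(vertical_space_used(prices))                        # 1.0
```

With numbers in that range, a zero baseline leaves roughly 85% of the chart height empty, which is why the ups and downs all but disappear.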

However, to minimize the risk of confusing readers with a non-zero baseline chart, I suggest not drawing the axes as connected lines. This way the y-axis doesn’t visually ‘touch’ the ‘ground’.

Finding a nice aspect ratio

The big advantage of line charts is that they enable the comparison of slopes, which is not easily possible in a bar chart, for instance. The problem, however, is that the perceived slopes depend heavily on the aspect ratio of the chart. The Facebook stock data would have looked much more dramatic in a taller chart. So which aspect ratio should you choose? Some years ago, William Cleveland suggested a technique called banking to solve this problem. The core idea is that the slopes in a line chart are most readable if they average to 45°. In 2006, Jeffrey Heer and Maneesh Agrawala continued the work of Cleveland and described 12 different banking algorithms. I used one of the simplest of them, median-absolute-slope banking.
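For the curious, here is a minimal sketch of how I understand median-absolute-slope banking. It assumes the aspect ratio is defined as width divided by height and that the data spans the full plot area. A segment with data-space slope s is then drawn with slope s · (x range / y range) / aspect, so the median drawn slope becomes 1 (that is, 45°) when aspect = median(|s|) · x range / y range. The function name is made up, and this is not taken from the Datawrapper source.

```python
from statistics import median


def banking_aspect_ratio(xs, ys):
    """Width/height ratio at which the median absolute segment slope is 45 degrees."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two points with matching x and y")
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    if x_range == 0 or y_range == 0:
        raise ValueError("data must vary along both axes")
    points = list(zip(xs, ys))
    # Absolute slope of every line segment in data space; vertical
    # segments (repeated x values) are skipped.
    slopes = [
        abs((y1 - y0) / (x1 - x0))
        for (x0, y0), (x1, y1) in zip(points, points[1:])
        if x1 != x0
    ]
    if not slopes or median(slopes) == 0:
        raise ValueError("median slope is zero or undefined, cannot bank")
    return median(slopes) * x_range / y_range


# A single rise and fall banks to a chart twice as wide as it is tall.
print(banking_aspect_ratio([0, 1, 2], [0, 1, 0]))  # 2.0
```

Given a fixed chart width, the banked height is simply the width divided by this ratio.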

Finally, here’s what the Facebook stock chart looks like after banking. The curve looks less dramatic now, but is still easy to read.

The problem with banking is that sometimes you need the chart in a certain aspect ratio to fit into a page layout, especially if banking produces portrait-sized charts. But why not let the optimal chart ratio define your layout? For instance, you can put the additional information to the side of the chart. Remember that the main goal of banking is to increase the readability of the line slopes. In the following example, the slopes for Nuclear and Renewables would have been much more difficult to see if the chart had been ‘squeezed’ to a landscape aspect.

Turning best practices into actual tools

Finally, I am very happy to say that these best practices won’t remain mere theory in research papers. Everything I mentioned will be integrated in the upcoming release of Datawrapper, which I already used to produce most of the examples in this post. Please follow @datawrapper if you want to keep up to date with the project.

If you have further suggestions or recommendations for line charts, I’m looking forward to reading your comments.