Turning Data Visualizations into Story-Telling Tools
More than ever, our modern economy is governed by data. Many of us spend much of our waking time extracting, analysing, interpreting, and/or presenting data. Usually, once “the hard part” is done, we rush through the visualization in our eagerness to share our findings. At best, visualization is treated as an art project rather than as part of the analytic lifecycle. Faced with infinite possibilities for representing data visually on digital displays, we often leave the decision to the default settings of our slideware or to past practice (templates).
However, it is critically important to consider how the chosen form of data presentation affects interpretation and understanding. Research suggests that information presentation formats have a strong influence on decision strategy (the choice of which decisions to make) and hence on decision quality. As Tufte said, “At their best, graphics are instruments for reasoning about quantitative information”. In short, the way data is presented to us influences what we (and those we’re presenting to) make of it. Good data visualization practices reduce cognitive effort and draw the attention of decision makers towards the most important cues. Most of what we put on slides does not fall under the “good data visualization” category (see below).
The rapid expansion of infographics as part of the content marketing trend of the early 2010s led to a prevalence of form over function. Large and small datasets alike were given an infographic makeover, often to the point where the data itself was lost. The result: countless examples of nonsense visualizations that are either impossible to interpret or that misdirect the reader (see some here: http://bit.ly/2d4COMP).
Today, we are seeing a “dashboarding” trend, with many companies rushing to graph any numerical value they have access to and to make it dynamic.
This increased attention to data visualization calls for a reminder of the work on graphical excellence in data visualization by Edward Tufte (and others). Below are a few suggestions on how to improve clarity, precision and efficiency in information presentation:
Key data points should be easily available to the user:
One of the main goals of a visual presentation is to show digested information at a level of aggregation that allows the audience to draw insights without looking at the raw data. Furthermore, a good visualization should provide enough context to be easily understood without further research. If yours fails to do that, your data presentation may turn into an art project. Whether the objective of a visualization is data-driven decision making, telling a story, or revealing a finding, showing the key data is crucial. As an example, see below. Which of the two companies do you think is the overall market leader? What data is this based on: relative number/volume of sales, or market penetration?
Pie charts deserve a special mention in this section. Don’t use them, because:
- They are hard to read and do not facilitate the understanding of proportion
- They make it hard to notice small differences between categories
- The rotation/position of the chart influences perception
- They are useless in cross-data-set comparisons
- The only thing worse than a pie chart is a 3D pie chart
Focus on telling a story about the findings, not on methodology or design
Analytics is an attempt to tell a story using data, or to help the decision maker interrogate the data for such a story. That story should help us form an opinion or make an active choice. If the visualization you produce fails to do that, it is not a good visualization. Many projects focus so much on providing eye candy, or on visualizing the methodology by which conclusions are reached, that they end up confusing the viewer or failing to communicate the core point. Look below and ask yourself: what is the story being told?
Beware of graphical data distortion
Truncating the axes or misrepresenting the proportions of graphical elements in a visualization leads the audience to draw the wrong conclusions. Sometimes this is done on purpose, but quite often it is just the result of sloppy work. In either case, it defeats the purpose of a visualization. For example, the chart on the bottom left makes it seem that 75.3 mph is less than half of 77.3 mph, while the one on the right seems to suggest that the number of gun deaths in Florida increased, when the figures shown actually fall from 873 to 721.
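To see how much a truncated axis can exaggerate a difference, here is a minimal Python sketch. It compares the ratio of two drawn bar lengths when the value axis starts at zero and when it starts at a higher baseline. The function name `apparent_ratio` and the baseline of 74.0 are assumptions made for illustration; the baseline of the chart discussed above is not known.

```python
def apparent_ratio(smaller: float, larger: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn bar lengths when the value axis starts at `baseline`."""
    if baseline >= min(smaller, larger):
        raise ValueError("baseline must be below both values")
    return (smaller - baseline) / (larger - baseline)


# Values from the speed example in the text.
true_ratio = apparent_ratio(75.3, 77.3)                # axis starts at zero
truncated_ratio = apparent_ratio(75.3, 77.3, 74.0)     # hypothetical baseline

print(f"zero-based axis:       {true_ratio:.2f}")       # 0.97
print(f"axis starting at 74.0: {truncated_ratio:.2f}")  # 0.39
```

With a zero-based axis the shorter bar is about 97% as long as the longer one. With the assumed baseline it shrinks to about 39%, which is the "less than half" impression described above.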
Use the space necessary to make a point, and nothing more
One of Tufte’s key suggestions is to maximize the data-ink ratio, defined as the ink (or space) used to visualize the data as a proportion of the total ink used in the graphic. Simply put, focus the viewer’s attention on the data by avoiding visual pollution with non-essential elements. There are two important implications: first, visual pollution can cause wrong interpretations; second, reduced cognitive effort is directly linked to the usefulness of a data visualization (in terms of the ability to extract insights). Below is a good example of how grids, frames and colouring can make a simple chart difficult to read.
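The definition of the data-ink ratio can be written out as a small calculation. The sketch below is illustrative only: the function name `data_ink_ratio` and the ink amounts are hypothetical, and in practice "ink" is an estimate rather than something measured exactly.

```python
def data_ink_ratio(data_ink: float, non_data_ink: float) -> float:
    """Share of a graphic's total ink that presents data (between 0.0 and 1.0)."""
    total_ink = data_ink + non_data_ink
    if data_ink < 0 or non_data_ink < 0 or total_ink == 0:
        raise ValueError("ink amounts must be non-negative and not both zero")
    return data_ink / total_ink


# Hypothetical ink budgets in arbitrary units (for example, pixels).
cluttered = data_ink_ratio(data_ink=200, non_data_ink=600)  # grid, frame, fills
cleaned = data_ink_ratio(data_ink=200, non_data_ink=50)     # axis labels only

print(f"cluttered chart: {cluttered:.2f}")  # 0.25
print(f"cleaned chart:   {cleaned:.2f}")    # 0.80
```

The data itself is unchanged between the two cases. Only the non-essential elements are removed, which is what raises the ratio.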
Balance the detail and complexity of your visualization:
The best data stories depend on a masterful combination of good analysis and aesthetic visualization design. The visualization provides the right perspective (frame) and focuses attention on the key findings, while the verbal analysis provides context and backs them up. This often relies on coherent integration of large data sets and on consistency between their verbal and visual analysis. However, it is important to note that the visualization and its analysis will often be distributed and interpreted independently of each other.
Hence, if the visualization does not contain the right level of context to be interpreted appropriately on its own, it is not a good one. For its target audience, a good visualization should both set the frame for its own interpretation and provide the needed context. In contrast, a bad data presentation will leave the audience with more questions than answers. For example, not only does the graph below fail to set a reference point for comparison, but it also makes it nearly impossible to extract trends or compare segments. Without the text accompanying the chart, it would be difficult even to draw the conclusions it points out (see here for a full review: http://www.theusrus.de/blog/the-good-the-bad-122011/).
One way to remedy this while keeping the same chart is to add labels and notes that set the right frame of reference, although this would create visual pollution. An alternative, which avoids overloading a single chart with complexity, is to reveal the data at several levels of detail. In the aforementioned case, this is done through interaction with the different segments. After all, data presentation is about telling a story, not showing a picture. Nonetheless, this should not be an excuse for compromising on the quality of the initial visualization. A simpler approach would be to provide context by plotting all the data in a clear reference system and encouraging comparison between the different segments.
In conclusion, data visualization is almost always about encouraging comparison or examining a relationship between variables, where comparisons can be time-based, categorical, or both. Our visualizations should facilitate this through the careful use of contrast, colour and organization. Design should enhance the communication, not get in the way of it. By following the principles outlined above, we can improve our data visualizations and the stories they tell. And finally, as with everything else, the best way to visualize data depends on the problem we are trying to solve or the story we’re trying to tell. If we are not clear on what our end goal is, it will be difficult to justify any of the design and analytical decisions taken to produce a visualization.