Kindle Notes & Highlights
When it comes to explanatory analysis, being able to concisely articulate exactly who you want to communicate to and what you want to convey before you start to build content reduces iterations and helps ensure that the communication you build meets the intended purpose. Understanding and employing concepts like the 3-minute story, the Big Idea, and storyboarding will enable you to clearly and succinctly tell your story and identify the desired flow.
While pausing before actually building the communication might feel like it’s a step that slows you down, in fact it helps ensure that you have a solid understanding of what you want to do before you start creating content, which will save you time down the road.
When I look back over the 150+ visuals that I created for workshops and consulting projects in the past year, there were only a dozen different types of visuals that I used (Figure 2.1).
Simple text
When you have just a number or two to share, simple text can be a great way to communicate. Think about solely using the number, made as prominent as possible, and a few supporting words to clearly make your point.
Figure 2.2 Stay-at-home moms original graph
The fact that you have some numbers does not mean that you need a graph! In Figure 2.2, quite a lot of text and space are used for a grand total of two numbers. The graph doesn’t do much to aid in the interpretation of the numbers (and with the data labels positioned outside of the bars, it can even skew your perception of relative height, so that the fact that 20 is less than half of 41 doesn’t really come across visually).
In this case, a simple sentence would suffice: 20% of children had a traditional stay-at-home mom in 2012, compared to 41% in 1970.
Figure 2.3 Stay-at-home moms simple text makeover
For example, you could reframe in terms of the percent change: “The number of children having a traditional stay-at-home mom decreased more than 50% between 1970 and 2012.” I advise caution, however, any time you reduce from multiple numbers down to a single one—think about what context may be lost in doing so. In this case, I find that the actual magnitude of the numbers (20% and 41%) is helpful in interpreting and understanding the change.
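The percent-change reframing can be checked with a line of arithmetic. The following is a minimal sketch in Python using the 41% and 20% figures quoted above; the function name `percent_change` is only an illustrative choice.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100


# Share of children with a traditional stay-at-home mom, from the text above.
share_1970 = 41.0
share_2012 = 20.0

change = percent_change(share_1970, share_2012)
print(f"{change:.1f}%")  # prints -51.2%
```

The single figure (-51.2%) is consistent with "more than 50%", but it hides the starting and ending levels, which is the loss of context the passage warns about.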
When you have just a number or two that you want to communicate: use the numbers directly.
Tables are great for just that—communicating to a mixed audience whose members will each look for their particular row of interest. If you need to communicate multiple different units of measure, this is also typically easier with a table than a graph.
Tables in live presentations
Using a table in a live presentation is rarely a good idea. As your audience reads it, you lose their ears and attention to make your point verbally.
When you find yourself using a table in a presentation or report, ask yourself: what is the point you are trying to make? Odds are that there will be a better way to pull out and visualize the piece or pieces of interest. In the event that you feel you’re losing too much by doing this, consider whether including the full table in the appendix and a link or reference to it will meet your audience’s needs.
One thing to keep in mind with a table is that you want the design to fade into the background, letting the data take center stage. Don’t let heavy borders or shading compete for attention. Instead, think of using light borders or simply white space to set apart elements of the table.
Figure 2.4 Table borders
Borders should be used to improve the legibility of your table. Think about pushing them to the background by making them grey, or getting rid of them altogether. The data should be what stands out, not the borders.
Figure 2.5 Two views of the same data
We can use color saturation to provide visual cues, helping our eyes and brains more quickly target the potential points of interest. In the second iteration of the table on the right entitled “Heatmap,” the higher the saturation of blue, the higher the number. This makes picking out the tails of the spectrum, the lowest number (11%) and the highest number (58%), easier and faster than it was in the original table, where we didn’t have any visual cues to help direct our attention.
Be sure when you leverage this to always include a legend to help the reader interpret the data (in this case, the LOW-HIGH subtitle on the heatmap with color corresponding to the conditional formatting color serves this purpose).
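One way to think about the heatmap's conditional formatting is as a linear mapping from each value to a color saturation. The sketch below is an assumption about how such a mapping might be written, not the method used to build the figure: the helper names, the target blue, and every table value other than 11 and 58 are hypothetical.

```python
def saturation(value: float, low: float, high: float) -> float:
    """Map value linearly onto [0, 1]; 0 is the palest cell, 1 the most saturated."""
    if high == low:
        return 0.0
    return (value - low) / (high - low)


def blue_shade(value: float, low: float, high: float) -> str:
    """Blend from white toward a full blue (hex string) according to saturation."""
    s = saturation(value, low, high)
    # White is (255, 255, 255); the target blue is an arbitrary illustrative choice.
    target = (31, 119, 180)
    rgb = [round(255 + (channel - 255) * s) for channel in target]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical table values in percent; only 11 and 58 come from the text.
values = [11, 25, 40, 58]
low, high = min(values), max(values)
shades = {v: blue_shade(v, low, high) for v in values}

print(shades[11], shades[58])  # prints #ffffff #1f77b4
```

The `low` and `high` endpoints are also what a LOW-HIGH legend would need to show, so the reader can tie each shade back to a magnitude.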
While tables interact with our verbal system, graphs interact with our visual system, which is faster at processing information.
A well-designed graph will typically get the information across more quickly than a well-designed table.
The types of graphs I frequently use fall into four categories: points, lines, bars, and area.
Scatterplots can be useful for showing the relationship between two things, because they allow you to encode data simultaneously on a horizontal x-axis and vertical y-axis to see whether and what relationship exists.
Line graphs are most commonly used to plot continuous data. Because the points are physically connected via the line, it implies a connection between the points that may not make sense for categorical data (a set of data that is sorted or divided into different categories). Often, our continuous data is in some unit of time: days, months, quarters, or years.
Within the line graph category, there are two types of charts that I frequently find myself using: the standard line graph and the slopegraph.
Line graph
The line graph can show a single series of data, two series of data, or multiple series.
Figure 2.8 Line graphs
Showing average within a range in a line graph
In some cases, the line in your line graph may represent a summary statistic, like the average, or the point estimate of a forecast. If you also want to give a sense of the range (or confidence level, depending on the situation), you can do that directly on the graph by also visualizing this range. For example, the graph in Figure 2.9 shows the minimum, average, and maximum wait times at passport control for an airport over a 13-month period.
Figure 2.9 Showing average within a range in a line graph
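Before a range can be drawn around an average line, each period needs its minimum, average, and maximum. A minimal sketch of that preparation step follows; the wait times and month labels are hypothetical and are not the data behind Figure 2.9.

```python
from statistics import mean

# Hypothetical wait times in minutes, keyed by month; not data from the book.
wait_times = {
    "2014-09": [12, 18, 25, 31],
    "2014-10": [10, 15, 22, 40],
    "2014-11": [8, 14, 19, 27],
}


def summarize(samples: list[float]) -> tuple[float, float, float]:
    """Return (minimum, average, maximum) for one period."""
    return min(samples), mean(samples), max(samples)


# One (min, avg, max) triple per month: the average is the line,
# and min/max bound the shaded range around it.
band = {month: summarize(samples) for month, samples in wait_times.items()}

for month, (low, avg, high) in band.items():
    print(f"{month}: min={low} avg={avg:.1f} max={high}")
```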
Note that when you’re graphing time on the horizontal x-axis of a line graph, the data plotted must be in consistent intervals. I recently saw a graph where the units on the x-axis were decades from 1900 forward (1910, 1920, 1930, etc.) and then switched to yearly after 2010 (2011, 2012, 2013, 2014). This meant that the distance between the decade points and annual points looked the same. This is a misleading way to show the data. Be consistent in the time points you plot.
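A quick check for this problem is to compare the gaps between neighbouring points on the time axis. The sketch below assumes the axis points are plain integer years; the function name is illustrative.

```python
def has_consistent_intervals(points: list[int]) -> bool:
    """True when every gap between neighbouring x-axis points is the same size."""
    gaps = {later - earlier for earlier, later in zip(points, points[1:])}
    return len(gaps) <= 1


# Decades from 1900 to 2010, then yearly points, as in the graph described above.
mixed = list(range(1900, 2011, 10)) + [2011, 2012, 2013, 2014]
decades_only = list(range(1900, 2011, 10))

print(has_consistent_intervals(mixed))         # prints False
print(has_consistent_intervals(decades_only))  # prints True
```

If the check fails, either resample the data to a single interval or place the points on a true numeric time axis so that a one-year gap is drawn one tenth as wide as a ten-year gap.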
Slopegraphs can be useful when you have two time periods or points of comparison and want to quickly show relative increases and decreases or differences across various categories between the two data points.
The best way to explain the value of and use case for slopegraphs is through a specific example. Imagine that you are analyzing and communicating data from a recent employee feedback survey. To show the relative change in survey categories from 2014 to 2015, the slopegraph might look something like Figure 2.10.
Figure 2.10 Sl...
Slopegraphs pack in a lot of information. In addition to the absolute values (the points), the lines that connect them give you the visual increase or decrease in rate of change (via the slope or direction) without ever having to explain that’s what they are doing, or what exactly a “rate of change” is—rather, it’s intuitive.
Slopegraphs can take a bit of patience to set up because they often aren’t one of the standard graphs included in graphing applications. An Excel template with an example slopegraph and instructions for customized use can be downloaded here: storytellingwithdata.com/slopegraph-template.
Whether a slopegraph will work in your specific situation depends on the data itself. If many of the lines are overlapping, a slopegraph may not work, though in some cases you can still emphasize a single series at a time with success. For example, we can draw attention to the single category that decreased over time
Figure 2.11 Modified slopegraph
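Emphasizing a single series in a slopegraph comes down to computing each category's change between the two periods and giving the line of interest a standout color while muting the rest. The following sketch shows one way to do that; the category names, scores, and color names are all hypothetical, not the survey data from the figures.

```python
# Hypothetical survey scores (percent favorable) as (2014, 2015) pairs.
scores = {
    "Category A": (60, 72),
    "Category B": (55, 58),
    "Category C": (48, 35),
    "Category D": (40, 47),
}


def line_styles(data, highlight="orange", muted="lightgrey"):
    """Give decreasing categories the highlight color and mute everything else."""
    styles = {}
    for name, (start, end) in data.items():
        change = end - start  # the sign gives the direction of the slope
        color = highlight if change < 0 else muted
        styles[name] = {"change": change, "color": color}
    return styles


styles = line_styles(scores)
for name, style in styles.items():
    print(f"{name}: {style['change']:+d} -> {style['color']}")
```

The resulting color per category can then be passed to whatever charting tool draws the lines.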
Bars tend to be my go-to graph type for plotting categorical data, where information is organized into groups.
Bar charts should be leveraged because they are common, as this means less of a learning curve for your audience.
Note that, because of how our eyes compare the relative end points of the bars, it is important that bar charts always have a zero baseline (where the x-axis crosses the y-axis at zero), otherwise you get a false visual comparison.
Figure 2.13 Bar charts must have a zero baseline
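The size of the false comparison can be quantified by comparing the ratio of the drawn bar lengths with the ratio of the underlying values. A minimal sketch, with hypothetical values chosen only to illustrate the effect:

```python
def visual_ratio(a: float, b: float, baseline: float = 0.0) -> float:
    """Ratio of drawn bar lengths (b relative to a) when the axis starts at baseline."""
    return (b - baseline) / (a - baseline)


# Hypothetical values, not taken from Figure 2.13.
small, large = 35.0, 40.0

honest = visual_ratio(small, large)                    # axis starts at zero
truncated = visual_ratio(small, large, baseline=30.0)  # axis starts at 30

print(f"zero baseline: {honest:.2f}x")    # prints zero baseline: 1.14x
print(f"baseline 30:   {truncated:.2f}x")  # prints baseline 30:   2.00x
```

With the axis starting at 30, the taller bar is drawn twice as long as the shorter one even though the underlying value is only about 14% larger.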
The y-axis labels that were placed on the right-hand side of the original visual were moved to the left (so we see how to interpret the data before we get to the actual data).
The data labels that were originally outside of the bars were pulled inside to reduce clutter.
If I were plotting this data outside of this specific lesson, I might omit the y-axis entirely and show only the data labels within the...
Graph axis vs. data labels
When graphing data, a common decision to make is whether to preserve the axis labels or eliminate the axis and instead label the data points directly. In making this decision, consider the level of specificity needed. If you want your audience to focus on big-picture trends, think about preserving the axis but deemphasizing it by making it grey. If the specific numerical values are important, it may be better to label the data points directly. In this latter case, it’s usually best to omit the axis to avoid the inclusion of redundant information. Always consider how
Bar charts must have a zero baseline. Note that this rule does not apply to line graphs. With line graphs, since the focus is on the relative position in space (rather than the length from the baseline or axis), you can get away with a nonzero baseline. Still, you should approach with caution: make it clear to your audience that you are using a nonzero baseline and take context into account so you don’t overzoom and make minor changes or differences appear significant.
Misleading in this manner by inaccurately visualizing data is not OK. Beyond ethical concerns, it is risky territory. All it takes is one discerning audience member to notice the issue (for example, the y-axis of a bar chart beginning at something other than zero) and your entire argument will be thrown out the window, along with your credibility.
In general, the bars should be wider than the white space between the bars. You don’t want the bars to be so wide, however, that your audience wants to compare areas instead of lengths.
Like line graphs, vertical bar charts can be single series, two series, or multiple series. Note that as you add more series of data, it becomes more difficult to focus on one at a time and pull out insight, so use multiple series bar charts with caution.
Be aware also that there is visual grouping that happens as a result of the spacing in bar charts having more than one data series. This makes the relative order of the categorization important. Consider what you want your audience to be able to compare, and structure your categorization hierarchy to make that as easy as possible.
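The effect of the categorization hierarchy can be seen by nesting the same records two ways: whichever dimension is innermost ends up side by side within a group and is therefore the easiest to compare. The sketch below uses hypothetical records and field names.

```python
from collections import defaultdict

# Hypothetical records as (region, year, value); not data from the book.
records = [
    ("North", 2014, 10),
    ("North", 2015, 12),
    ("South", 2014, 8),
    ("South", 2015, 11),
]

FIELDS = {"region": 0, "year": 1}


def group_bars(rows, outer, inner):
    """Nest values as {outer key: {inner key: value}}; inner keys sit side by side."""
    grouped = defaultdict(dict)
    for row in rows:
        grouped[row[FIELDS[outer]]][row[FIELDS[inner]]] = row[2]
    return dict(grouped)


by_region = group_bars(records, "region", "year")  # easy to compare years within a region
by_year = group_bars(records, "year", "region")    # easy to compare regions within a year

print(by_region)
print(by_year)
```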