Data visualization is increasingly at the center of how we digest information. The last several decades have seen an explosion in the use of charts, and a recognition of the incredible ability of the human mind to process data visually. The rise of visualization has coincided, probably not coincidentally, with a formalization and deeper consideration of just what works best when attempting to convey information in graphical form.
Perhaps no person is more responsible for giving data visualization a scientific foundation than the statistician William Cleveland. His studies on graphical perception, the cognitive processes people use to understand a chart, are among the earliest attempts to study visualization and develop a theory of how it should be best done.
The cleaner, minimalist charts in vogue today owe a great debt to Cleveland’s work. His research is also the ultimate reason most data visualizers have a fondness for bar charts and scatter plots, and tend to avoid pie charts and stacked bars.
As a statistician working in the early 1980s, William Cleveland was deeply concerned about the “largely unscientific” manner in which statisticians and others were visualizing data. Although charts had been used to represent data since the 18th century, there was very little theory or research about how it should be done. In Cleveland’s view, most contemporary ideas about “proper” visualization were unstructured wisdom. He believed the conventions and best practices of data visualization -- a tool widely used by scientists and engineers -- should be backed up by data.
He was not alone. Noted statisticians David Cox and William Kruskal had also called for theoretical and empirical foundations on how to best use graphs. Cleveland would answer this call.
In 1984, Cleveland and his colleague Robert McGill published the seminal paper “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods.” This paper, which has now been cited thousands of times by academics, remains a touchstone for data visualization researchers and practitioners.
In Graphical Perception, Cleveland and McGill detailed the common cognitive tasks that happen when somebody reads a chart. They then evaluated how well study subjects performed these tasks, depending on features of the graph.
For example, Cleveland and McGill claim that when people look at a bar chart, the main task is judging “position on a common scale” -- assessing which bar goes higher on the scale, how much higher, etc. When people look at a map in which states are shaded according to a certain variable, the main task is assessing “color saturation” -- assessing which shape is more saturated, and how much more saturated. The following figure from their paper displays what they believed to be the common “Elementary perceptual tasks” that people are asked to complete when looking at charts. “Color saturation,” at the bottom, is not illustrated to avoid the “nuisance” of color reproduction.
After laying out this “task” paradigm for thinking about charts, the remainder of Graphical Perception focuses on understanding how skilled people are at each of these tasks. The authors ran a number of randomized controlled trials to assess how accurately people perceive the information on a bar chart (position on common scale), pie chart (angle), stacked bar chart (area), colored maps and shaded maps (color saturation and shading), and others.
Perhaps most famously, they had students look at a variety of two-valued bar charts and pie charts, and asked them to estimate what percentage the smaller value was of the larger value. Subjects consistently read the bar charts more accurately than the pie charts. This research would mark the beginning of the end for pie charts -- an already rarely used form -- in serious quantitative research.
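To make the judgment task concrete, here is a minimal Python sketch of how a single response in this kind of experiment could be scored. It assumes the log-scale error measure commonly attributed to Cleveland and McGill, log2(|judged percent - true percent| + 1/8). The function names and the two example guesses are hypothetical and made up purely for illustration; they are not data from the paper.

```python
import math


def true_percent(smaller: float, larger: float) -> float:
    """Percentage that the smaller value is of the larger value."""
    return 100.0 * smaller / larger


def log_abs_error(judged_percent: float, actual_percent: float) -> float:
    """Log-scale accuracy score (assumed form): lower means more accurate.

    The 1/8 offset keeps the logarithm defined when a guess is exactly right.
    """
    return math.log2(abs(judged_percent - actual_percent) + 1 / 8)


# Hypothetical chart showing the values 30 and 80.
actual = true_percent(30, 80)  # 37.5

# Made-up guesses, for illustration only (not results from the study).
bar_chart_guess = 40.0
pie_chart_guess = 50.0

bar_error = log_abs_error(bar_chart_guess, actual)
pie_error = log_abs_error(pie_chart_guess, actual)

print(f"true percent:    {actual:.1f}")
print(f"bar chart error: {bar_error:.3f}")
print(f"pie chart error: {pie_error:.3f}")
```

Averaging a score like this across many subjects and charts is one way to compare chart types: the form with the lower average error is the one people read more accurately.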