How William Cleveland Turned Data Visualization Into a Science

Data visualization is increasingly at the center of how we digest information. The last several decades have seen an explosion in the use of charts, and a recognition of the incredible ability of the human mind to process data visually. The rise of visualization has coincided, probably not coincidentally, with a formalization and deeper consideration of just what works best when attempting to convey information in graphical form.

Perhaps no person is more responsible for giving data visualization a scientific foundation than the statistician William Cleveland. His studies on graphical perception, the cognitive processes people use to understand a chart, are among the earliest attempts to study visualization and to develop a theory of how it is best done.

The cleaner, minimalist charts in vogue today owe a great debt to Cleveland’s work. His research is also the ultimate reason most data visualizers have a fondness for bar charts and scatter plots, and tend to avoid pie charts and stacked bars.

As a statistician working in the early 1980s, William Cleveland was deeply concerned about the “largely unscientific” manner in which statisticians and others were visualizing data. Although charts had been used to represent data since the 18th century, there was very little theory or research about how it should be done. In Cleveland’s view, most contemporary ideas about “proper” visualization were unstructured wisdom. He believed the conventions and best practices of data visualization -- a tool widely used by scientists and engineers -- should be backed up by data.

He was not alone. Noted statisticians David Cox and William Kruskal had also called for theoretical and empirical foundations on how best to use graphs. Cleveland would answer this call.

In 1984, Cleveland and his colleague Robert McGill published the seminal paper Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods. This paper, which has now been cited thousands of times by academics, remains a touchstone for data visualization researchers and practitioners.

In Graphical Perception, Cleveland and McGill detailed the common cognitive tasks that happen when somebody reads a chart. They then evaluated how well study subjects performed these tasks, depending on features of the graph.

For example, Cleveland and McGill claim that when people look at a bar chart, the main task is judging “position on a common scale” -- assessing which bar goes higher on the scale, how much higher, etc. When people look at a map in which states are shaded according to a certain variable, the main task is assessing “color saturation” -- judging which shape is more saturated, and by how much. The following figure from their paper displays what they believed to be the common “Elementary perceptual tasks” that people are asked to complete when looking at charts. “Color saturation,” at the bottom, is not illustrated to avoid the “nuisance” of color reproduction.

After laying out this “task” paradigm for thinking about charts, the remainder of Graphical Perception focuses on understanding how skilled people are at each of these tasks. The authors ran a number of randomized controlled trials to assess how accurately people perceive the information on a bar chart (position on a common scale), pie chart (angle), stacked bar chart (length), colored maps and shaded maps (color saturation and shading), and others.

Perhaps most famously, they had students look at a variety of two-valued bar charts and pie charts, and asked them to estimate what percentage the lesser value was of the greater value. Subjects consistently read the bar charts more accurately than the pie charts. This research would mark the beginning of the end for pie charts -- an already rarely used form -- in serious quantitative research.
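To make the judgment task concrete, here is a minimal sketch of how a single response might be scored. It assumes the accuracy measure reported in the 1984 paper, log2(|judged percent - true percent| + 1/8). The function names and the sample values are hypothetical and chosen only for illustration; they are not data from the study.

```python
import math


def true_percentage(lesser: float, greater: float) -> float:
    """Percentage that the lesser value is of the greater value."""
    if greater <= 0 or lesser < 0 or lesser > greater:
        raise ValueError("expected 0 <= lesser <= greater and greater > 0")
    return 100.0 * lesser / greater


def log_abs_error(judged_pct: float, true_pct: float) -> float:
    """Score one judgment as log2(|judged - true| + 1/8).

    The added 1/8 keeps the logarithm defined when a judgment is exact.
    Lower scores mean more accurate readings.
    """
    return math.log2(abs(judged_pct - true_pct) + 0.125)


# Hypothetical example: the two charted values are 30 and 80, and a
# subject guesses that the smaller is 40% of the larger.
actual = true_percentage(30.0, 80.0)
score = log_abs_error(40.0, actual)
print(f"true percentage: {actual}, error score: {score:.3f}")
```

Averaging a score like this over many subjects and charts, separately for each chart type, is one way to compare how accurately bar charts and pie charts are read.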

 


