Visual Revelations
Chapter 1. How to Display Data Badly
The Aim of Good Data Graphics is to Display Data Accurately and Clearly
How to display data badly:
- Don’t show much data
- Show the data inaccurately
- Obfuscate the data
Don’t Show Data
Rule 1 Show as little data as possible (minimize the data density)
Edward Tufte defines two measures:
- data density index : the number of numbers plotted per square inch
- data/ink ratio : the amount of ink used to graph the data divided by the total amount of ink in the graph.
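As a rough illustration of the two measures, the sketch below computes them from hand-supplied quantities. The function names and the example figures (20 numbers on a 5 x 4 inch plot, ink amounts in arbitrary units) are made up for illustration and do not come from the book.

```python
def data_density(numbers_plotted: int, width_in: float, height_in: float) -> float:
    """Data density index: numbers plotted per square inch of graph."""
    area = width_in * height_in
    if area <= 0:
        raise ValueError("graph area must be positive")
    return numbers_plotted / area


def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Data/ink ratio: ink spent on the data divided by all ink in the graph."""
    if total_ink <= 0:
        raise ValueError("total ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data ink must lie between 0 and total ink")
    return data_ink / total_ink


# Hypothetical plot: 20 numbers on a 5 x 4 inch graph,
# with 2 units of ink on the data out of 40 units overall.
density = data_density(20, 5.0, 4.0)
ratio = data_ink_ratio(2.0, 40.0)
print(f"data density = {density:.2f} numbers per square inch")
print(f"data/ink ratio = {ratio:.2f}")
```

A graph that follows Rule 1 pushes the first number toward zero; one that hides its data in a heavy grid pushes the second toward zero.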
When a graph contains little or no information, the plot can look empty and thus raise the viewer’s suspicions that nothing is being communicated. -> Chartjunk
Rule 2 Hide what data you do show
Hiding the data in the grid
The data/ink ratio for this plot is close to zero.
Idea:
Graphic -> metaphor -> architecture
grid -> scaffolding surrounding the building under construction
Leaving the grid in detracts from both a graph’s beauty and its ability to communicate (though grids are now largely anachronistic, since modern computer-generated graphs don’t need them).
Hiding the data in the scale
We can hardly see the increase in the number of private schools.
Expanding the scale shows the data for the number of private elementary schools.
Show Data Inaccurately
Rule 3 Ignore the visual metaphor altogether
graphos (the analogue of textual typos)
Reversing the metaphor in mid-graph while changing scales on both axes
in fact
Redone with a consistent scale and visual metaphor
Rule 4 Only order matters
The distortion
(Note to self: learn how to add superscripts to 0.44 and 2.06 here)
$\dfrac{1.00-.44}{.44}=1.27$
$\dfrac{22.00-2.06}{2.06}=9.68$
lie factor = 9.68/1.27 = 7.62
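The arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the function name `relative_change` is made up here, and reading the first ratio as the effect in the data and the second as the effect in the drawn sizes is an assumption, since the notes do not label them.

```python
def relative_change(large: float, small: float) -> float:
    """Relative change from the smaller value to the larger one."""
    return (large - small) / small


# Assumed roles: first pair = effect in the data, second pair = effect as drawn.
data_effect = relative_change(1.00, 0.44)
graphic_effect = relative_change(22.00, 2.06)

# Lie factor at full precision, and from the rounded intermediates used above.
lie_factor = graphic_effect / data_effect
lie_factor_rounded = round(graphic_effect, 2) / round(data_effect, 2)

print(f"data effect        = {data_effect:.2f}")
print(f"graphic effect     = {graphic_effect:.2f}")
print(f"lie factor         = {lie_factor:.2f}")
print(f"lie factor (rounded inputs) = {lie_factor_rounded:.2f}")
```

Carrying full precision gives about 7.61; the 7.62 above comes from dividing the already rounded 9.68 by 1.27. Either way the drawing exaggerates the change by a factor of more than seven.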
“the old goosing up the effect by squaring the eyeball trick”
IDEA:
Lie in one place if you can tell the truth in another?
Real power of numbers is in their magnitudes.
Rule 5 Graph data out of context
The value of a fact shrinks enormously without context. Knowing that more than 600,000 Americans were killed in the Civil War is horrifying, but its impact is multiplied when you learn that it is 200,000 more than the number of Americans killed in World War II.
Hiding the effect by careful choice of scale and origin. (Take care with the scale and origin.)
Obfuscate the Data
Rule 6 Change scales in mid-axis
Changing scales in mid-axis can make large differences look small and make exponential changes look linear.
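To see how a scale that changes along the axis flattens exponential growth, the sketch below maps data values to drawn heights by interpolating between labelled ticks. The helper `drawn_height`, the tick layouts, and the doubling series are all hypothetical and chosen only to make the effect visible.

```python
from bisect import bisect_right


def drawn_height(value, tick_values, tick_positions):
    """Map a data value to a drawn height by interpolating between labelled ticks."""
    if not tick_values[0] <= value <= tick_values[-1]:
        raise ValueError("value lies outside the axis range")
    # Index of the tick interval that contains the value.
    i = min(bisect_right(tick_values, value) - 1, len(tick_values) - 2)
    v0, v1 = tick_values[i], tick_values[i + 1]
    p0, p1 = tick_positions[i], tick_positions[i + 1]
    return p0 + (value - v0) / (v1 - v0) * (p1 - p0)


data = [1, 2, 4, 8, 16]          # made-up series that doubles at every step
positions = [0, 1, 2, 3, 4]      # ticks are drawn equally spaced in both cases

honest_ticks = [0, 4, 8, 12, 16]     # equal spacing, equal value steps
distorted_ticks = [1, 2, 4, 8, 16]   # equal spacing, scale changes at every tick

honest = [drawn_height(v, honest_ticks, positions) for v in data]
distorted = [drawn_height(v, distorted_ticks, positions) for v in data]

print("honest axis:   ", honest)
print("distorted axis:", distorted)
```

On the honest axis the drawn heights keep doubling; on the distorted axis they rise by the same amount at every step, so the doubling series plots as a straight line.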
Incomes of doctors vs. other professionals
redone
Rule 7 Emphasize the trivial (ignore the important)
Emphasizing the trivial: Hiding the main effect of sex differences in income through the vertical placement of plots
Redone with the two plots horizontally opposed, showing the size of the sex difference more clearly
IDEA:
Most displays must be arrayed on a two-dimensional surface. A third (or higher) dimension must be rendered with some other sort of metaphor. Such a rendering must be read, not viewed, and such “read” variables are perceived more dimly and less accurately (for example, altitude on a map is often shown with contour lines).
See more: females with 13-15 years of education = males with 0-8 years (in income)
Redone with the large effects of sex and education emphasized and the small time trend suppressed.
Rule 8 Jiggle the baseline
Judgments about the trends in the stocks of the three components other than the United States cannot be made.
The sharp decline in petroleum stock evidenced by the aggregation of “All Other OECD” is missing.
Rule 9 Alabama first!
In tabular presentations the effects of alphabetizing (or presenting the data ordered by some other aspect unrelated to the data) can be particularly profound.
Ordering and spacing the data as a stem-and-leaf diagram provides insights previously invisible.
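A stem-and-leaf diagram is easy to build by hand or in code. The following is a minimal sketch for non-negative integers, using the tens digits as stems and the units digit as leaves; the function name and the sample values are hypothetical, not the data from the book's table.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return stem-and-leaf lines for non-negative integers (stem = tens, leaf = units)."""
    if not values:
        return []
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # Walk every stem in range, so empty stems stay visible as gaps.
    for stem in range(min(leaves), max(leaves) + 1):
        digits = "".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {digits}".rstrip())
    return lines


# Made-up values, listed in no meaningful order (as an alphabetized table would be).
sample = [47, 12, 49, 21, 15, 48]
for line in stem_and_leaf(sample):
    print(line)
```

Sorting puts the values in data order rather than label order, and keeping the empty stems preserves the spacing, so clusters and gaps show up at a glance.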
Rule 10 Label: (a) illegibly, (b) incompletely, (c) incorrectly, and (d) ambiguously
A picture may be worth a thousand words, but it may take a hundred words to make it so.
Rule 11 More is murkier: (a) more decimal places and (b) more dimensions
Just as increasing the number of decimal places can make a table harder to understand, so can increasing the number of dimensions make a graph more confusing.
Rule 12 If it has been done well in the past, think of a new way to do it
Directly combining the colors of two maps into one is wrong (the variables in the different graphics may not have any relationship).
The rules for good display are simple:
- Examine the data carefully enough to know what they have to say, and then let them say it with a minimum of adornment.
- In depicting scale, follow practices of “reasonable regularity”.
- Label clearly and fully.