[NTG-context] Charts, Graphs, Tufte, and ConTeXt
Karl Ove Hufthammer
karloh at mi.uib.no
Thu Jul 27 16:14:23 CEST 2006
Nicolas Grilly wrote:
> Karl Ove Hufthammer <karloh at mi.uib.no> wrote:
>> Yes! R (especially using the new grid and lattice framework) produces
>> excellent charts and graphs, with very sensible default options (much
>> of it based on Cleveland's research).
>
> What is Cleveland's research? Can you provide references on the web?
Cleveland has done much research on graphical perception and the visual
decoding of information from data displays. He was one of the first to
study this scientifically.
Earlier, many people had opinions on various common graphs (e.g., ‘pie
charts are bad – I don’t like them’). Cleveland came along and ran actual
scientific *experiments* to show why some types of graphs are worse than
others for presenting data (e.g., ‘humans are very bad at judging angles
and very good at judging position along a common scale; that’s why pie
charts are terrible and dot plots good at presenting (the same) data’),
and he proposed new graphical displays *based* on this research.
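To make the ‘position along a common scale’ idea concrete, here is a minimal sketch in Python (not R, and not Cleveland’s own code) of a text-only dot plot. The function name `text_dot_plot`, the sample values and the width are all made up for illustration.

```python
def text_dot_plot(values, width=40):
    """Render a crude Cleveland-style dot plot as lines of text.

    values: dict mapping label -> non-negative number.
    Every dot is placed along the same horizontal scale (0 .. max value),
    so categories are compared by position, not by angle or area.
    """
    if not values:
        return []
    top = max(values.values())
    label_width = max(len(label) for label in values)
    lines = []
    # Sort so the largest value comes first, as is usual for dot plots.
    for label, value in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
        # Map the value onto a column index between 0 and width - 1.
        column = round(value / top * (width - 1)) if top > 0 else 0
        # Dotted guide line up to the data point, then the plotting symbol.
        lines.append(f"{label.rjust(label_width)} |{'.' * column}o")
    return lines


# Hypothetical data, invented for this example only.
shares = {"A": 23, "B": 21, "C": 25, "D": 19, "E": 12}
for line in text_dot_plot(shares, width=26):
    print(line)
```

Values as close as 23 and 21 end up in clearly different columns, whereas the corresponding pie slices would be hard to tell apart by angle.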
See for example this very interesting and easy-to-read article:
Title: Graphical Perception: Theory, Experimentation, and
Application to the Development of Graphical Methods
Author(s): William S. Cleveland; Robert McGill
Source: Journal of the American Statistical Association, Vol. 79,
No. 387. (Sep., 1984), pp. 531-554.
Stable URL:
http://links.jstor.org/sici?sici=0162-1459%28198409%2979%3A387%3C531%3AGPTEAA%3E2.0.CO%3B2-Y
Some of Cleveland’s research resulted in novel graphical displays, such as
trellis displays, coplots and dot plots, and much of it resulted in
improvements to common displays. Unfortunately, many of these smaller
improvements and very minor but important details seem to be unknown to
people who design graphing software. Let me mention a few (not too
exciting) examples:
Circles should be used instead of rectangles as plotting symbols, especially
when data points overlap, because overlapping rectangles still look like
rectangles, while overlapping circles look nothing like circles (so the
overlap is easy to spot). Cleveland actually recommended a list of plotting
symbols (for plotting several groups in one plot) for use in scatterplots;
see:
Title: A Model for Studying Display Methods of Statistical
Graphics
Author(s): William S. Cleveland
Source: Journal of Computational and Graphical Statistics, Vol. 2,
No. 4. (Dec., 1993), pp. 323-343.
Stable URL:
http://links.jstor.org/sici?sici=1061-8600%28199312%292%3A4%3C323%3AAMFSDM%3E2.0.CO%3B2-Y
Tick marks should point outwards, not inwards (so they don’t camouflage
data).
The data rectangle should always be slightly smaller than the scale-line
rectangle (the box around the data), again to avoid camouflaging the data.
These are just a few (perhaps less interesting) features of graph design
that R gets right by default, but that many other programs (e.g., gnuplot,
at least for tick marks and data rectangles) do not.
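The data-rectangle rule amounts to padding the axis limits a little beyond the range of the data. Here is a minimal sketch in Python, assuming a 4 percent extension at each end (which, as far as I know, is what R’s default `xaxs = "r"` axis style does before it picks tick positions). The function name `padded_limits` and the sample values are made up for illustration.

```python
def padded_limits(data, fraction=0.04):
    """Return (low, high) axis limits slightly wider than the data range.

    The data rectangle (min .. max of the data) then sits strictly inside
    the scale-line rectangle, so no point lands on the frame.
    fraction=0.04 is an assumption for this sketch.
    """
    low, high = min(data), max(data)
    span = high - low
    if span == 0:
        # All values equal: fall back to padding based on the magnitude.
        span = abs(low) if low != 0 else 1.0
    pad = fraction * span
    return low - pad, high + pad


# Hypothetical data, invented for this example only.
x = [2.0, 3.5, 7.0, 12.0]
print(padded_limits(x))
```

The same function would be applied separately to the x and y values; the smallest and largest points then sit just inside the frame instead of on top of it.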
Much of Cleveland’s research has been summarised in his excellent book:
W.S. Cleveland. The Elements of Graphing Data. Revised edition. 1994.
See also his other book:
W.S. Cleveland. Visualizing Data. 1993.
Other articles of his that may be of interest:
Title: Graphical Perception and Graphical Methods for Analyzing
Scientific Data
Author(s): William S. Cleveland; Robert McGill
Source: Science, New Series, Vol. 229, No. 4716. (Aug. 30, 1985),
pp. 828-833.
Stable URL:
http://links.jstor.org/sici?sici=0036-8075%2819850830%293%3A229%3A4716%3C828%3AGPAGMF%3E2.0.CO%3B2-D
Abstract: Graphical perception is the visual decoding of the
quantitative and qualitative information encoded on
graphs. Recent investigations have uncovered basic
principles of human graphical perception that have
important implications for the display of data. The
computer graphics revolution has stimulated the invention
of many graphical methods for analyzing and presenting
scientific data, such as box plots, two-tiered error bars,
scatterplot smoothing, dot charts, and graphing on a log
base 2 scale.
Title: Graphical Perception: The Visual Decoding of Quantitative
Information on Graphical Displays of Data
Author(s): William S. Cleveland; Robert McGill
Source: Journal of the Royal Statistical Society. Series A
(General), Vol. 150, No. 3. (1987), pp. 192-229.
Stable URL:
http://links.jstor.org/sici?sici=0035-9238%281987%29150%3A3%3C192%3AGPTVDO%3E2.0.CO%3B2-T
Abstract: Studies in graphical perception, both theoretical and
experimental, provide a scientific foundation for the
construction area of statistical graphics. From these
studies a paradigm that has important applications for
practice has begun to emerge. The paradigm is based on
elementary codes: Basic geometric and textural aspects of
a graph that encode the quantitative information. The
methodology that can be invoked to study graphical
perception is illustrated by an investigation of the shape
parameter of a two-variable graph, a topic that has had
much discussion, but little scientific study, for at least
70 years.
Title: The Many Faces of a Scatterplot
Author(s): William S. Cleveland; Robert McGill
Source: Journal of the American Statistical Association, Vol. 79,
No. 388. (Dec., 1984), pp. 807-822.
Stable URL:
http://links.jstor.org/sici?sici=0162-1459%28198412%2979%3A388%3C807%3ATMFOAS%3E2.0.CO%3B2-G
Abstract: The scatterplot is one of our most powerful tools for data
analysis. Still, we can add graphical information to
scatterplots to make them considerably more powerful.
These graphical additions, faces of sorts, can enhance
capabilities that scatterplots already have or can add
whole new capabilities that faceless scatterplots do not
have at all. The additions we discuss here (some new and
some old) are (a) sunflowers, (b) category codes, (c) point
cloud sizings, (d) smoothings for the dependence of $y$ on
$x$ (middle smoothings, spread smoothings, and upper and
lower smoothings), and (e) smoothings for the bivariate
distribution of $x$ and $y$ (pairs of middle smoothings,
sum-difference smoothings, scale-ratio smoothings, and
polar smoothings). The development of these additions is
based in part on a number of graphical principles that can
be applied to the development of statistical graphics in
general.
--
Karl Ove Hufthammer