A data visualisation consists of data symbols, guides, and labels.
We produce data symbols by mapping data values to the visual features of a shape.
Which visual feature we choose depends on:
We can map data values to text data symbols and text on guides (axes and legends).
We can map metadata and data statistics to text labels (titles, captions, and annotations).
The Importance of Text
Text Data Symbols
Text Features
Guides
Labels
The Importance of Text
Without text, it is difficult to extract any useful information from a data visualisation.
Viewers pay a lot of attention to the text on a data visualisation.
Text is initially processed as visual information, but is then processed as verbal information.
Text consists of one or more familiar visual objects.
Text Data Symbols
The crimeDistrictTotal data frame contains the total of offenders by district.
## # A tibble: 6 × 3
## district total avgPop
## <chr> <int> <dbl>
## 1 Auckland City 5583 17438.
## 2 Bay of Plenty 9369 15192.
## 3 Canterbury 8976 23550.
## 4 Central 8700 15491.
## 5 Counties/Manukau 10229 26388.
## 6 Eastern 6221 9382.
We can map each total value to the shape of a text data symbol.
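As a minimal sketch of this mapping, the code below rebuilds the six rows printed above (the avgPop values are the rounded figures from that output, and only these six districts are assumed) and draws each total as text with geom_text(). The name textPlot is just an illustrative choice.

```r
library(ggplot2)

# Six rows copied from the printed output above; avgPop is rounded as printed
crimeDistrictTotal <- data.frame(
  district = c("Auckland City", "Bay of Plenty", "Canterbury",
               "Central", "Counties/Manukau", "Eastern"),
  total = c(5583L, 9369L, 8976L, 8700L, 10229L, 6221L),
  avgPop = c(17438, 15192, 23550, 15491, 26388, 9382)
)

# Map each total to the shape (the characters) of a text data symbol.
# x is a constant, so position along the x-axis carries no information.
textPlot <- ggplot(crimeDistrictTotal) +
  geom_text(aes(x = 1, y = district, label = total))

print(textPlot)
```

Because the characters spell out the value itself, a viewer can read each total back exactly, which is what mapping back to the raw data means here.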
We can map back from text data symbols to the raw data.
Text is appropriate for representing quantitative data.
Differences and ratios can be calculated, but that involves much greater cognitive load.
Text is appropriate for representing qualitative data.
Text can express zero and negative values.
Text has spectacular congruence.
Text data symbols are terrible for visual summaries/data statistics.
Text Features
We can also map data values to other visual features of text data symbols: position, size, angle, and colour.
A text data symbol often involves a redundant mapping.
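One way to picture a redundant mapping is to map the same variable to both the label and the position of the text. The sketch below assumes the six rows of crimeDistrictTotal shown earlier and maps total to both the label and the x position; redundantPlot is a hypothetical name.

```r
library(ggplot2)

# Six rows copied from the printed output earlier; avgPop is rounded as printed
crimeDistrictTotal <- data.frame(
  district = c("Auckland City", "Bay of Plenty", "Canterbury",
               "Central", "Counties/Manukau", "Eastern"),
  total = c(5583L, 9369L, 8976L, 8700L, 10229L, 6221L),
  avgPop = c(17438, 15192, 23550, 15491, 26388, 9382)
)

# total is mapped twice: to the characters of the text (label)
# and to the horizontal position of the text (x)
redundantPlot <- ggplot(crimeDistrictTotal) +
  geom_text(aes(x = total, y = district, label = total))

print(redundantPlot)
```

The position supports quick visual comparison between districts, while the text supports reading off exact values.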
The shape of text is also affected by the font.
The font family describes an overall style.
The fontface describes whether the text is bold, italic, or plain.
The font families "sans", "serif", and "mono" are always available.
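As a small illustration, geom_text() accepts family and fontface as fixed settings. The sketch below assumes the six rows of crimeDistrictTotal shown earlier and uses one of the always-available families; fontPlot is a hypothetical name.

```r
library(ggplot2)

# Six rows copied from the printed output earlier; avgPop is rounded as printed
crimeDistrictTotal <- data.frame(
  district = c("Auckland City", "Bay of Plenty", "Canterbury",
               "Central", "Counties/Manukau", "Eastern"),
  total = c(5583L, 9369L, 8976L, 8700L, 10229L, 6221L),
  avgPop = c(17438, 15192, 23550, 15491, 26388, 9382)
)

# family and fontface are set outside aes(), so they apply to every label
fontPlot <- ggplot(crimeDistrictTotal) +
  geom_text(aes(x = total, y = district, label = total),
            family = "serif", fontface = "bold")

print(fontPlot)
```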
Selecting a custom font family is easy as long as you are using the right graphics device.
Guides
Guides are visual representations of scales, such as axes and legends.
Data values are mapped to data symbols such as lines to create tick marks, grid lines, and legend keys.
Data values are mapped to text data symbols to create tick labels and legend labels.
Axes and legends are generated automatically.
The scale functions, like scale_x_continuous() and scale_colour_manual(), allow control of the details.
breaks specifies the tick mark locations.
labels specifies the tick labels, either explicitly or with a function like scales::label_comma().
guide takes a function like guide_axis() or guide_legend() to control the placement and layout of the axis or legend.
Direct labelling draws text data symbols in proximity to other data symbols rather than on a separate guide.
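The breaks, labels, and guide arguments can be sketched together in one call to scale_x_continuous(). The example assumes the six rows of crimeDistrictTotal shown earlier; the break locations and the 45 degree label angle are arbitrary choices for illustration, and scalePlot is a hypothetical name.

```r
library(ggplot2)

# Six rows copied from the printed output earlier; avgPop is rounded as printed
crimeDistrictTotal <- data.frame(
  district = c("Auckland City", "Bay of Plenty", "Canterbury",
               "Central", "Counties/Manukau", "Eastern"),
  total = c(5583L, 9369L, 8976L, 8700L, 10229L, 6221L),
  avgPop = c(17438, 15192, 23550, 15491, 26388, 9382)
)

scalePlot <- ggplot(crimeDistrictTotal) +
  geom_col(aes(x = total, y = district)) +
  scale_x_continuous(
    breaks = c(0, 5000, 10000),        # tick mark locations
    labels = scales::label_comma(),    # tick labels with thousands separators
    guide = guide_axis(angle = 45)     # layout of the axis: rotated tick labels
  )

print(scalePlot)
```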
Labels
Labels include titles, captions, and annotations.
A plot title may describe the overall purpose of the plot, its overall message, or both.
Axis titles and guide titles describe the variables that are being mapped and can include metadata such as the units of measurement.
An annotation may be used to describe an important feature within a plot.
We can map back from labels to metadata.
Labels are excellent for visual summaries/data statistics.
Labels and data symbols often work to support each other.
We can map back from labels to data statistics.
The labs() function can be used to specify the plot title, subtitle, and caption.
library(ggplot2)

# Build the title, subtitle, and caption as a reusable object;
# paste() joins the pieces with single spaces
labels <-
    labs(title=paste("Youth Crime has Declined",
                     "in all Districts over the last Decade"),
         subtitle="(with small uptick in the last two years)",
         caption=paste('Number of youth offenders aged between',
                       '14 and 16 from 2010 to 2020 in',
                       'New Zealand.\nSource: New Zealand',
                       'Ministry of Justice "Youth Justice',
                       'Indicator Report", 2021.'))
xlab() and ylab() set the axis titles.
Use NULL to remove the title.
annotate() adds a geom layer whose visual features are given directly as values rather than mapped from a data frame.
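A minimal sketch of these three functions, assuming the six rows of crimeDistrictTotal shown earlier: the axis title text, the annotation wording, and the name labelPlot are illustrative choices, and the annotation describes only the six districts listed here.

```r
library(ggplot2)

# Six rows copied from the printed output earlier; avgPop is rounded as printed
crimeDistrictTotal <- data.frame(
  district = c("Auckland City", "Bay of Plenty", "Canterbury",
               "Central", "Counties/Manukau", "Eastern"),
  total = c(5583L, 9369L, 8976L, 8700L, 10229L, 6221L),
  avgPop = c(17438, 15192, 23550, 15491, 26388, 9382)
)

labelPlot <- ggplot(crimeDistrictTotal) +
  geom_col(aes(x = total, y = district)) +
  xlab("Total number of offenders") +   # set the x-axis title
  ylab(NULL) +                          # remove the y-axis title
  # A text annotation placed at fixed coordinates, not mapped from the data
  annotate("text", x = 10229, y = "Counties/Manukau",
           label = "Highest of these six districts",
           hjust = 1.05, colour = "white")

print(labelPlot)
```

The annotation is a label that reports a data statistic (the maximum), while the bar it sits on is the data symbol it supports.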
Titles and/or captions may provide top-down goals and tasks that direct attention.
Summary
Text is an essential component of any data visualisation.
We can use text data symbols to represent data values with great accuracy, but only for a small number of data values.
We can use text labels to represent much more complex and abstract information, including metadata and data statistics.
Labels are a very effective way to direct attention.
Exercises
Identify the role of each text element within this data visualisation.
What {ggplot2} functions would you use to draw each text element?
Can you see anything wrong with this data visualisation?
Source: New Zealand Listener, April 27-May 3 2024.