Effective Data Visualisation with R
Review

Paul Murrell
The University of Auckland

Data Symbols, Guides, and Labels

A Very Simple Visual Model

Map Data Values to Visual Features

{ggplot2}

library(ggplot2)
# Year -> x position, Rate -> bar height, Age -> fill colour (bars side by side)
ggplot(rates) +
    geom_col(aes(x=Year, y=Rate, fill=Age), position="dodge")
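The `rates` data frame is not shown on the slide. As a minimal sketch, the call above can be run against a hypothetical stand-in: the column names (`Year`, `Rate`, `Age`) come from the call itself, but the values and age groups below are invented purely for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for `rates`: column names match the call above,
# but every value here is made up for illustration only.
rates <- data.frame(
    Year = factor(rep(c(2019, 2020, 2021), each = 2)),
    Age  = rep(c("Under 25", "25 and over"), times = 3),
    Rate = c(12, 8, 11, 7, 9, 6)
)

# Each data value is mapped to a visual feature of a bar:
# Year -> x position, Rate -> height, Age -> fill colour
p <- ggplot(rates) +
    geom_col(aes(x = Year, y = Rate, fill = Age), position = "dodge")
p
```

Each row of the data frame becomes one bar, so six rows produce six bars in three dodged pairs.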

Quantitative Visual Features

Qualitative Visual Features
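One way to read the distinction between the two kinds of visual feature is sketched below. It assumes that "quantitative" features are those that convey magnitude (such as position) and "qualitative" features are those that only distinguish groups (such as colour hue and shape). The data frame `d` and its columns are hypothetical and the values are invented.

```r
library(ggplot2)

# Hypothetical data: two numeric columns and one categorical column (values invented)
d <- data.frame(
    height = c(150, 160, 170, 180),
    weight = c(50, 60, 72, 85),
    group  = c("a", "b", "a", "b")
)

# Quantitative features: numeric values -> x and y position
# Qualitative features: category -> colour hue and point shape
p <- ggplot(d) +
    geom_point(aes(x = height, y = weight, colour = group, shape = group),
               size = 3)
p
```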

Accuracy

Capacity

Combining Visual Features

Multi-Value Data Symbols

Multivariate Data Symbols

Design

  • Simple design adjustments can improve effectiveness as well as aesthetic appeal.

library(ggplot2)
library(ggtext)  # provides element_markdown() for the styled title

ggplot(crimeGenderTotal) +
    geom_col(aes(y=gender, x=total, fill=gender)) +
    # Bar colours match the coloured words in the title, so no legend is needed
    scale_fill_manual(values=c("pink", "skyblue")) +
    labs(title=paste('<span style="color: skyblue">**Males**</span>',
                     '<span style="color: white">Responsible for More Offending than</span>',
                     '<span style="color: pink">**Females**</span>')) +
    # Label only the zero point and start the bars at the panel edge
    scale_x_continuous(breaks=0, expand=expansion(0)) +
    # Dark background; remove grid lines, axis titles, y-axis labels, and legend
    theme(plot.background=element_rect(fill="grey20"),
          plot.title=element_markdown(),
          panel.background=element_blank(),
          panel.grid.major=element_blank(),
          axis.title=element_blank(),
          axis.text.y=element_blank(),
          axis.text.x=element_text(color="white", hjust=0),
          axis.ticks.y=element_blank(),
          legend.position="none",
          plot.margin=unit(rep(1, 4), "cm"))
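To see what the design adjustments change, it can help to compare a default plot with an adjusted one. The sketch below uses only {ggplot2} (no styled title) and a hypothetical stand-in for `crimeGenderTotal`: the column names come from the code above, but the category labels and totals are invented.

```r
library(ggplot2)

# Hypothetical stand-in for `crimeGenderTotal`; labels and totals are invented
crimeGenderTotal <- data.frame(
    gender = c("Female", "Male"),
    total  = c(100, 400)
)

# Default design: grey panel, grid lines, axis titles, and a redundant legend
before <- ggplot(crimeGenderTotal) +
    geom_col(aes(y = gender, x = total, fill = gender))

# Adjusted design: fixed colours, no legend, no grid, dark background
after <- before +
    scale_fill_manual(values = c(Female = "pink", Male = "skyblue")) +
    theme(plot.background = element_rect(fill = "grey20"),
          panel.background = element_blank(),
          panel.grid = element_blank(),
          axis.title = element_blank(),
          axis.text = element_text(colour = "white"),
          legend.position = "none")
after
```

Printing `before` and then `after` gives a side-by-side check of whether each adjustment made the plot easier to read.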

Customisation

  • If we cannot get exactly what we want from {ggplot2} we can draw it ourselves with {grid}.

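library(grid)
library(ggplot2)
library(gggrid)  # provides grid_panel() for adding {grid} drawing to a plot

# Male symbol: a circle with an arrow pointing up and to the right,
# drawn inside a 6mm x 6mm viewport centred at (x, y)
male <- function(x=.5, y=.5) {
    vp <- viewport(x, y, width=unit(6, "mm"), height=unit(6, "mm"),
                   gp=gpar(col="white"))
    grobTree(segmentsGrob(.5, .5, 1, 1),
             segmentsGrob(1, 1, 2/3, 1),
             segmentsGrob(1, 1, 1, 2/3),
             circleGrob(1/3, 1/3, r=1/3, gp=gpar(fill="skyblue")),
             vp=vp)
}
# Female symbol: a circle with a cross below it, in the same size of viewport
female <- function(x=.5, y=.5) {
    vp <- viewport(x, y, width=unit(6, "mm"), height=unit(6, "mm"),
                   gp=gpar(col="white"))
    grobTree(segmentsGrob(2/3, .5, 2/3, -1/3),
             segmentsGrob(1/3, 0, 1, 0),
             circleGrob(2/3, 2/3, r=1/3, gp=gpar(fill="pink")),
             vp=vp)
}
# Place each symbol 6mm inside the end of its bar; this assumes that
# row 1 of the data is the female total and row 2 is the male total
symbols <- function(data, coords) {
    grobTree(male(unit(coords$x[2], "npc") - unit(6, "mm"), coords$y[2]),
             female(unit(coords$x[1], "npc") - unit(6, "mm"), coords$y[1]))
}
ggplot(crimeGenderTotal) +
    geom_col(aes(x=total, y=gender)) +
    grid_panel(symbols, aes(x=total, y=gender))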
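The building blocks used above can be tried on their own, without {ggplot2}. The following is a minimal sketch using only {grid}; the `marker()` helper is hypothetical and simply follows the same pattern of combining a few grobs inside a small, fixed-size viewport.

```r
library(grid)

# Hypothetical helper: a circle with a cross through it, drawn inside a
# 6mm x 6mm viewport centred at (x, y)
marker <- function(x = .5, y = .5, fill = "grey80") {
    vp <- viewport(x, y, width = unit(6, "mm"), height = unit(6, "mm"))
    grobTree(circleGrob(.5, .5, r = .5, gp = gpar(fill = fill)),
             segmentsGrob(0, .5, 1, .5),
             segmentsGrob(.5, 0, .5, 1),
             vp = vp)
}

# Draw two markers on a blank page at different locations
grid.newpage()
grid.draw(marker(.25, .5, fill = "skyblue"))
grid.draw(marker(.75, .5, fill = "pink"))
```

Because the viewport has an absolute size, the marker stays 6mm wide regardless of the size of the page or plot panel it is drawn in.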

Conclusion

  • Some of the information we have shared may seem obvious, but we do not have to look far to see simple mistakes being made.

  • The information we have shared is intended to help you make good decisions about how to visualise your data.

  • The information we have shared is intended to help you critically appraise data visualisations that others have made (and possibly offer suggestions for improvements).

  • The main value of these ideas is as a set of guidelines:

    • Possible reasons why a data visualisation is not working.
    • Things to try in order to improve it.
  • Many guidelines suggest both what to do and what not to do.

  • You always have a test device with you: your own eyes and visual system.

    • If something does not look right, check it out.
    • You should be able to tell if you have made things better (or worse).

Exercises

Exercise

  • What is good and bad about:
    • the mappings of any one data visualisation?
    • the design of any one data visualisation?
    • the design of the whole page?

Exercise

  • Can you produce a {ggplot2}/{grid} version:

    • of any one data visualisation?
    • of all four data visualisations together?
    • of the whole page?
  • Data is not provided for the purple line in the top plot.

  • You will have to replace the map with a different data visualisation.

Exercise

  • Can you produce a data visualisation that best supports the statement in the grey box?