--- title: "Introduction to ggpointless" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Introduction to ggpointless} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", dev = "ragg_png" # <3 ) library(ggpointless) ``` `ggpointless` is a small extension of the [`ggplot2`](https://ggplot2.tidyverse.org/) package that provides additional layers: 1. `geom_pointless()` & `stat_pointless()` 2. `geom_lexis()` & `stat_lexis()` 3. `geom_chaikin()` & `stat_chaikin()` 4. `geom_catenary()` & `stat_catenary()` ## geom_pointless `geom_pointless()` is a layer to easily add minimal emphasis to your plots. The function takes it's power from `stat_pointless()`, which does all the work, but is not usually in the spotlight. ```{r default_pointless} library(ggplot2) library(ggpointless) x <- seq(-pi, pi, length.out = 100) y <- outer(x, 1:5, function(x, y) sin(x * y)) df1 <- data.frame( var1 = x, var2 = rowSums(y) ) p <- ggplot(df1, aes(x = var1, y = var2)) p + geom_pointless(location = c("first", "last", "minimum", "maximum")) ``` As you see, just adding `geom_pointless()` to `ggplot(...)` is not terribly useful on its own but when it teams up with `geom_line()` and friends, hopefully. ```{r location} p <- p + geom_line() p + geom_pointless(location = "all", size = 3) ``` `geom_pointless()` behaves like `geom_point()` does with the addition of a `location` argument. You can set it to `"first"`, `"last"` (the default), `"minimum"`, `"maximum"`, and `"all"`; where `"all"` is just shorthand to select `"first"`, `"last"`, `"minimum"` and `"maximum"`. In addition, you can use the computed variable `location` and map it to an aesthetic, e.g. `color`. ```{r color} p + geom_pointless(aes(color = after_stat(location)), location = "all", size = 3 ) + theme(legend.position = "bottom") ``` ### Order and orientation The locations are determined in the order in which they appear in the data, like `geom_path()` does compared to `geom_line()`. This can be seen in the next example, with sample data kindly taken from the [`geomtextpath`](https://github.com/AllanCameron/geomtextpath/#using-geomtextpath) package: ```{r spiral, fig.show='hold'} x <- seq(5, -1, length.out = 1000) * pi spiral <- data.frame( var1 = sin(x) * 1:1000, var2 = cos(x) * 1:1000 ) p <- ggplot(spiral) + geom_path() + coord_equal(xlim = c(-1000, 1000), ylim = c(-1000, 1000)) + theme(legend.position = "none") p + aes(x = var1, y = var2) + geom_pointless(aes(color = after_stat(location)), location = "all", size = 3 ) + labs(subtitle = "orientation = 'x'") p + aes(y = var1, x = var2) + geom_pointless(aes(color = after_stat(location)), location = "all", size = 3 ) + labs(subtitle = "orientation = 'y'") ``` As you see from the first of the last two examples `"first"` and `"minimum"` overlap, and `"first"` wins over `"minimum"`. If `location` is set to `"all"`, then the order in which points are plotted from top to bottom is: `"first"` > `"last"` > `"minimum"` > `"maximum"`. Otherwise, the order is determined as specified in the `location` argument, which also applies to the order of the legend key labels. ```{r overplotting} cols <- c( "first" = "#f8766d", "last" = "#7cae00", "minimum" = "#00bfc4", "maximum" = "#c77cff" ) df2 <- data.frame( var1 = 1:2, var2 = 1:2 ) p <- ggplot(df2, aes(x = var1, y = var2)) + geom_path() + coord_equal() + scale_color_manual(values = cols) # same as location = 'all' p + geom_pointless(aes(color = after_stat(location)), location = c("first", "last", "minimum", "maximum"), size = 3 ) + labs(subtitle = "same as location = 'all'") ``` ```{r} # reversed order p + geom_pointless(aes(color = after_stat(location)), location = c("maximum", "minimum", "last", "first"), size = 3 ) + labs(subtitle = "custom order") ``` ```{r} # same as location = 'all' again p + geom_pointless(aes(color = after_stat(location)), location = c("maximum", "minimum", "last", "first", "all"), size = 3 ) + labs(subtitle = "same as location = 'all' again") ``` ### Pick a different geom Just like all `stat_*` functions, `stat_pointless()` has a default geom, which is `"point"`. This means in reverse that you can highlight e.g. minimum and maximum in another way, for example with a horizontal line. ```{r stat} set.seed(42) ggplot(data.frame(x = 1:10, y = sample(1:10)), aes(x, y)) + geom_line() + stat_pointless( aes(yintercept = y, color = after_stat(location)), location = c("minimum", "maximum"), geom = "hline" ) + guides(color = guide_legend(reverse = TRUE)) ``` ## Plotting lifelines with geom_lexis `geom_lexis()` draws a lifeline for an event from it's start to it's end. The required aesthetics are `x` and `xend`. Here is an example: ```{r default_lexis} df1 <- data.frame( key = c("A", "B", "B", "C", "D"), x = c(0, 1, 6, 5, 6), y = c(5, 4, 10, 8, 10) ) p <- ggplot(df1, aes(x = x, xend = y, color = key)) + coord_equal() p + geom_lexis() ``` Also, if there is a gap in an event a horizontal line is drawn, which can be hidden setting `gap_filler = FALSE`. ```{r gap_filler, fig.show='hold'} p + geom_lexis(gap_filler = FALSE) ``` You can further style the appearance of your plot using the additional arguments. If you e.g. want to make a visual distinction between the ascending lines and the connecting lines, use `after_stat()` to map the `type` variable to the linetype aesthetic (or any other aesthetic). The variable `type` is created by `geom_lexis()` and takes two values: "solid" and "dotted"; so you might also want to call `scale_linettype_identity`. ```{r after_stat} p + geom_lexis( aes(linetype = after_stat(type)), point_show = FALSE ) + scale_linetype_identity() ``` ### Date and POSIXct classes You see the coordinates on the vertical y-axis show the difference between `x` and `xend` aesthetics. The "magic" of `geom_lexis()` happens in `stat_lexis()` when the input data is transformed and the calculations are performed. ```{r numeric} df1 <- data.frame( start = c(2019, 2021), end = c(2022, 2022), key = c("A", "B") ) ggplot(df1, aes(x = start, xend = end, group = key)) + geom_lexis() + coord_fixed() ``` Keeping in mind that dates are internally represented as the number of days, and the POSIXct class in turn represents seconds since some origin, the y-scale values in the next plots should come as no surprise. ```{r dates-and-times, fig.show='hold'} # Date fun <- function(i, class) as.Date(paste0(i, "-01-01")) df1[, c("start", "end")] <- lapply(df1[, c("start", "end")], fun) p1 <- ggplot(df1, aes(x = start, xend = end, group = key)) + geom_lexis() + labs(y = "days") + coord_fixed() # POSIXct df2 <- df1 df2[, c("start", "end")] <- lapply(df2[, c("start", "end")], as.POSIXct) p2 <- ggplot(df2, aes(x = start, xend = end, group = key)) + geom_lexis() + labs(y = "seconds") + coord_fixed() p1 p2 ``` In order to change the breaks and labels of the vertical scale to, say, years, we make the assumption that 1 year has 365.25 days, or 365.25 * 86400 seconds. ```{r transformation, fig.show='hold'} # years, roughly p1 + scale_y_continuous( breaks = 0:3 * 365.25, # or for p2: 0:3*365.25*86400 labels = function(i) floor(i / 365.25) # floor(i / 365.25*86400) ) + labs(y = "years") ``` ## geom_chaikin The algorithm of the `geom_chaikin()` function iteratively cuts off the ragged corners, as you can see in the example below. ```{r geom-chaikin-intro, fig.show='hold'} set.seed(42) dat <- data.frame( x = seq.int(10), y = sample(15:30, 10) ) p1 <- ggplot(dat, aes(x, y)) + geom_line(linetype = "12") p1 + geom_chaikin() ``` ## geom_catenary A catenary is the curve of an hanging chain or cable under its own weight when supported only at its ends (see [Wikipedia](https://en.wikipedia.org/wiki/Catenary) for more information). ```{r geom-catenary-intro, fig.show='hold'} ggplot(data.frame(x = c(0, 1), y = c(1, 1)), aes(x, y)) + geom_catenary() + ylim(0, 1) ``` `geom_catenary()` calculates a default value for the length of the chain that is set to 2 times the Euclidean distance of all x/y coordinates. Use the `chainLength argument to overwrite the default. ```{r geom-catenary-chainLength, fig.show='hold'} ggplot(data.frame(x = c(0, 1), y = c(1, 1)), aes(x, y)) + geom_catenary(chainLength = 1.5) + ylim(0, 1) ``` Multiple x/y coordinates are supported too. ```{r geom-catenary-chainLength-two-obs, fig.show='hold'} ggplot(data.frame(x = c(0, 1, 4), y = c(1, 1, 1)), aes(x, y)) + geom_catenary() ``` If you set `chainLength` to a value that is too short to connect two points, a straight line is drawn and a message is shown. ```{r geom-catenary-chainLength-minimum, fig.show='hold'} ggplot(data.frame(x = c(0, 1, 4), y = c(1, 1, 1)), aes(x, y)) + geom_catenary(chainLength = 4) ``` If you want to draw a chain where each piece has a different `chainLength` value, remember you can add a list to a ggplot object: ```{r geom-catenary-chainLength-individual, fig.show='hold'} ggplot(data.frame(x = c(0, 1), y = 1), aes(x, y)) + lapply(2:10, function(chainLength) { geom_catenary(chainLength = chainLength) } ) ``` ## Data The following data sets are shipped with the `ggpointless` package: 1. `co2_ml` : [CO~2~ records taken at Mauna Loa](https://gml.noaa.gov/ccgg/trends/data.html) 2. `covid_vac` : [COVID-19 Cases and Deaths by Vaccination Status](https://covid.cdc.gov/covid-data-tracker/#rates-by-vaccine-status) 3. `female_leaders` : [Elected and appointed female heads of state and government](https://en.wikipedia.org/w/index.php?title=List_of_elected_and_appointed_female_heads_of_state_and_government&oldid=1078024588) See the [`vignette("examples")`](https://flrd.github.io/ggpointless/articles/examples.html) for possible use cases.