How to Create a Plot with Two Vertical Dual-Scaled Axes (Left & Right)

Sometimes we need to plot two entirely different variables on the same chart. Perhaps those variables are even measured in different units (e.g. kilograms and hours). The first thing to keep in mind in such a scenario is that this is, in general, a terrible idea. If you don’t understand why and want to read some good articles on the topic, please check both Dual-Scaled Axes in Graphs, Are They Ever the Best Solution? and A Study on Dual-Scale Data Charts. Dual-scaled axes are also one of the mistakes explained in the article Mistakes, we’ve drawn a few.

However, inadvisable as it is, sometimes we absolutely need to create a plot with two differently scaled vertical axes (one on the left and another on the right). Maybe our client needs to condense information as much as possible, or maybe we are under strict restrictions imposed by a scientific journal. So let’s see how we can do this using ggplot2.

First of all, let’s have a look at some data that could benefit from this kind of plot.

data <- readRDS(file = "data/two.y.axes.Rda")
summary(data)
##      group         year           var1            var2      
##  Group 1:49   Min.   :1970   Min.   :1.523   Min.   :53.40  
##  Group 2:49   1st Qu.:1982   1st Qu.:3.230   1st Qu.:59.42  
##  Group 3:49   Median :1994   Median :4.438   Median :62.35  
##  Group 4:49   Mean   :1994   Mean   :4.309   Mean   :63.03  
##  Group 5:49   3rd Qu.:2006   3rd Qu.:5.447   3rd Qu.:65.10  
##  Group 6:49   Max.   :2018   Max.   :6.645   Max.   :76.90  
##  (Other): 0                  NA's   :167     NA's   :100

We have six groups, each with yearly data from 1970 to 2018 for two variables (var1 and var2). One problem is immediately apparent: the two variables have very different ranges. While var1 goes from 1.5 to 6.6, var2 goes from 53.4 to 76.9. If we plot both on the same scale, var1 will sit at the very bottom and var2 at the top, both looking almost flat, with a lot of empty space in between. This is clearly not what we want.
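To put rough numbers on how much vertical space would be wasted, here is a small sketch that uses only the minima and maxima from the summary() output. It assumes a shared axis running exactly from the smallest to the largest value and ignores the small padding ggplot2 adds by default; the variable names are made up for this illustration.

```r
# Minima and maxima copied from the summary() output above.
var1_range <- c(1.523, 6.645)
var2_range <- c(53.40, 76.90)

# A single shared axis would have to span both variables.
shared_span <- max(var2_range) - min(var1_range)

# Fraction of the shared axis used by each variable and by the gap between them.
share_var1 <- diff(var1_range) / shared_span                 # roughly 7%
share_var2 <- diff(var2_range) / shared_span                 # roughly 31%
share_gap  <- (var2_range[1] - var1_range[2]) / shared_span  # roughly 62%

round(100 * c(var1 = share_var1, gap = share_gap, var2 = share_var2), 1)
```

Under these assumptions, well over half of the plotting area would be empty.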

Fortunately, there is something we can do. We can take the range for one of the variables as our main range, and then transform the range for the other variable so that both variables can be plotted on the same vertical space. We begin by getting the integer ranges for our two variables.

integer_range <- function(x) {
  c(floor(min(x, na.rm = TRUE)), ceiling(max(x, na.rm = TRUE)))
}
(v1 <- integer_range(data$var1))
## [1] 1 7
(v2 <- integer_range(data$var2))
## [1] 53 77

Good, we will take values 1 to 7 to represent var1 on the left axis. Simple enough. But for var2 we probably don’t want a right axis running from 53 to 77. It will look nicer if we just take values from 50 to 80, won’t it? So, let’s manually change this range.

v2 <- c(50, 80)

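And now we can use a really basic linear model to learn how to transform var2 onto the var1 range (or vice versa). Since each range has just two endpoints, the fit is exact: it is simply the straight line that maps 1 to 50 and 7 to 80.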

lmod <- lm(v2 ~ v1)
(coef <- lmod$coefficients)
## (Intercept)          v1 
##          45           5
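As a sanity check, the same coefficients can be derived by hand from the two pairs of endpoints, without lm(). The sketch below also wraps the two directions of the mapping in helper functions; to_left() and to_right() are hypothetical names used only here, not part of the original code.

```r
# Endpoints of the two axes, as chosen in the text.
v1 <- c(1, 7)    # left axis (var1)
v2 <- c(50, 80)  # right axis (var2)

# Slope and intercept of the line through (v1[1], v2[1]) and (v1[2], v2[2]).
slope <- diff(v2) / diff(v1)
intercept <- v2[1] - slope * v1[1]

# Hypothetical helpers for the two directions of the mapping.
to_left  <- function(y2) (y2 - intercept) / slope  # var2 units -> left-axis units
to_right <- function(y1) intercept + slope * y1    # left-axis units -> var2 units

to_left(v2)   # endpoints of the right axis land on 1 and 7
to_right(v1)  # endpoints of the left axis land on 50 and 80
```

The first direction is what the data layers need (var2 drawn on the left-axis scale); the second is what the secondary axis needs (labels shown back in var2 units).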

With this information in place, we can now create our two vertical dual-scaled axes plot.

library(tidyverse)
source("twoyaxes_theme.R")

ggplot(data) + 
  aes(x = year) +
  geom_point(aes(y = var1), color = colors[1], size = 1) + 
  geom_line(aes(y = var1), color = colors[1]) + 
  geom_point(aes(y = (var2 - coef[1]) / coef[2]), color = colors[2], size = 1) + 
  geom_line(aes(y = (var2 - coef[1]) / coef[2]), color = colors[2]) +
  facet_wrap(. ~ group, scales = "free") +
  scale_x_continuous(breaks = seq(1970, 2018, 6), 
                     minor_breaks = seq(1970, 2018, 2),
                     labels = sprintf("'%02d", seq(1970, 2018, 6) %% 100)) +
  scale_y_continuous(breaks = v1[1]:v1[2], limits = v1, position = "left",
                     sec.axis = sec_axis(~ coef[1] + . * coef[2], 
                                         name = "Second variable", 
                                         breaks = seq(v2[1], v2[2], 5))) +
  labs(title = "Plot with Two Vertical Axes (Left & Right)",
       y = "First variable", x = "", 
       caption = "Plot by Albert Mata (@almata).") +
  custom_theme

Some interesting points:

  • The first pair of geom_point() and geom_line() plots the first variable as is.
  • The second pair of geom_point() and geom_line() plots the second variable. This is where we transform our data using the coefficients from the linear model: (var2 - coef[1]) / coef[2] maps var2 onto the left-axis scale.
  • We use facet_wrap() as there are six different groups of data. With scales = "free" we force the axes to be drawn for all six panels (the limits still match, because we set them explicitly).
  • With scale_x_continuous() we set breaks and labels for the X axis. In this example they are hardcoded.
  • With scale_y_continuous() we set breaks and limits for both Y axes. In the sec_axis() part we undo the transformation (coef[1] + . * coef[2]), so the right axis is labeled in var2 units.
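For readers who don’t have the original data file or theme at hand, here is a stripped-down sketch of the same idea. The data are synthetic, the colors are arbitrary, and names such as toy and p_toy are made up for this example; only the 45 + 5 * x mapping comes from the text.

```r
library(ggplot2)

# Synthetic stand-in for one group of the article's data (hypothetical values).
set.seed(1)
toy <- data.frame(
  year = 1970:2018,
  var1 = seq(2, 6, length.out = 49) + rnorm(49, sd = 0.2),
  var2 = seq(75, 55, length.out = 49) + rnorm(49, sd = 1)
)

# Same mapping as in the text: var2 = 45 + 5 * (left-axis value).
intercept <- 45
slope <- 5

p_toy <- ggplot(toy, aes(x = year)) +
  # First variable, drawn directly on the left-axis scale.
  geom_line(aes(y = var1), color = "steelblue") +
  # Second variable, squeezed onto the left-axis scale.
  geom_line(aes(y = (var2 - intercept) / slope), color = "firebrick") +
  scale_y_continuous(
    name = "First variable",
    limits = c(1, 7),
    # The secondary axis undoes the squeeze so its labels are in var2 units.
    sec.axis = sec_axis(~ intercept + . * slope, name = "Second variable")
  )

p_toy
```

The key point is that the transformation appears twice: once in the data layer, and once inverted in sec_axis().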

Please note that colors is just a vector with all the colors used in the plot (the first two being the ones used for the data), and custom_theme is a ggplot2 theme that customizes this plot (basically font families, sizes, colors, and similar settings). As you can see, I keep them in a separate twoyaxes_theme.R file for easier code reuse and more clarity (but feel free to ask if you want to have a look at it).
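The actual twoyaxes_theme.R file is not shown here, so the following is only a guess at what such a file could look like. The color values and theme settings are assumptions; tinting each Y axis to match its data series is one common way to help readers pair a line with its scale, not necessarily what the original file does.

```r
library(ggplot2)

# Hypothetical palette: the first two entries color the two data series.
colors <- c("#1b9e77", "#d95f02", "#333333")

# Hypothetical theme object; it can be added to a plot with `+ custom_theme`.
custom_theme <- theme_minimal(base_size = 11) +
  theme(
    plot.title = element_text(face = "bold", color = colors[3]),
    # Tint each Y axis with the color of the series it belongs to.
    axis.title.y.left = element_text(color = colors[1]),
    axis.text.y.left = element_text(color = colors[1]),
    axis.title.y.right = element_text(color = colors[2]),
    axis.text.y.right = element_text(color = colors[2]),
    panel.grid.minor.y = element_blank()
  )
```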

Finally, remember we can save our plots (e.g. as SVG) by assigning the result of the ggplot() call to a variable (e.g. p) and then using the ggsave() function.

# Saving as SVG requires the svglite package to be installed.
ggsave(filename = "two-y-axes-plot.svg", plot = p, width = 10, height = 6)