Sometimes we may feel the need to plot a chart with information for two totally different variables simultaneously. Perhaps those variables are even measured in two different units (e.g. kilograms and hours). The first thing to keep in mind in such a scenario is that this is, in general, a **terrible idea**. If you don’t understand why and want to read some good articles on the topic, please check both *Dual-Scaled Axes in Graphs, Are They Ever the Best Solution?* and *A Study on Dual-Scale Data Charts*. Dual-scaled axes are also one of the mistakes explained in the article *Mistakes, we’ve drawn a few*.

However, unrecommended as it may be, we may still find that we absolutely need to create a plot with two vertical axes on different scales (one on the left and another on the right). Maybe our client needs to condense information as much as possible, maybe we are under strict restrictions imposed by scientific journals, etc. So let’s see how we can do this using `ggplot2`.

First of all, let’s have a look at some data that could benefit from this kind of plot.

```
data <- readRDS(file = "data/two.y.axes.Rda")
summary(data)
```

```
## group year var1 var2
## Group 1:49 Min. :1970 Min. :1.523 Min. :53.40
## Group 2:49 1st Qu.:1982 1st Qu.:3.230 1st Qu.:59.42
## Group 3:49 Median :1994 Median :4.438 Median :62.35
## Group 4:49 Mean :1994 Mean :4.309 Mean :63.03
## Group 5:49 3rd Qu.:2006 3rd Qu.:5.447 3rd Qu.:65.10
## Group 6:49 Max. :2018 Max. :6.645 Max. :76.90
## (Other): 0 NA's :167 NA's :100
```

We see there are six groups, and for each of them we have historical data (from 1970 to 2018) for two different variables (here `var1` and `var2`). But we immediately notice a problem: the two variables have very different ranges. While `var1` goes from 1.5 to 6.6, `var2` goes from 53.4 to 76.9. If we show both variables in the same plot using the same scale, `var1` will be at the very bottom and `var2` will be at the top. Both will look almost flat, with a lot of empty space in the middle. This is clearly not what we want.
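To put a number on that empty space, here is a small sketch that uses only the minimum and maximum values from the `summary()` output above. The variable names (`var1_range`, `shared_span`, and so on) are made up for this illustration. On a shared axis, `var1` would fill roughly 7% of the height and `var2` roughly 31%, with the rest being the gap between them.

```r
# Extremes copied from the summary() output above.
var1_range <- c(1.523, 6.645)
var2_range <- c(53.40, 76.90)

# A shared axis would have to cover both variables at once.
shared_span <- diff(range(c(var1_range, var2_range)))

# Fraction of the shared axis occupied by each variable and by the gap.
var1_share <- diff(var1_range) / shared_span
var2_share <- diff(var2_range) / shared_span
gap_share  <- (var2_range[1] - var1_range[2]) / shared_span

round(100 * c(var1 = var1_share, var2 = var2_share, gap = gap_share), 1)
```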

Fortunately, there is something we can do. We can take the range for one of the variables as our main range, and then transform the range for the other variable so that both variables can be plotted on the same vertical space. We begin by getting the integer ranges for our two variables.

```
# Smallest integer range that contains all non-missing values of x
integer_range <- function(x) {
  c(floor(min(x, na.rm = TRUE)), ceiling(max(x, na.rm = TRUE)))
}
```

`(v1 <- integer_range(data$var1))`

`## [1] 1 7`

`(v2 <- integer_range(data$var2))`

`## [1] 53 77`

Good, we will take values 1 to 7 to represent `var1` on the left axis. Simple enough. But for `var2` we probably don’t want a right axis running from 53 to 77. It will look nicer if we just take values from 50 to 80, won’t it? So let’s manually change this range.

`v2 <- c(50, 80)`
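If we would rather not hardcode the nicer range, one option is to widen the range outward to a multiple of some step. The `round_range()` helper below is a hypothetical sketch, not part of the original code, and the step of 10 is just an assumption that happens to suit `var2`.

```r
# Hypothetical helper: widen a range outward to the nearest multiples of `step`.
round_range <- function(x, step = 10) {
  c(floor(min(x, na.rm = TRUE) / step) * step,
    ceiling(max(x, na.rm = TRUE) / step) * step)
}

# Extremes of var2 copied from the summary() output above.
round_range(c(53.40, 76.90), step = 10)
## [1] 50 80
```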

And now we can use a really basic linear model to learn how we can transform `var2` to use the `var1` range (or vice versa).

```
lmod <- lm(v2 ~ v1)
(coef <- lmod$coefficients)
```

```
## (Intercept) v1
## 45 5
```
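Since the model is fitted on just two points, it is simply the straight line through (1, 50) and (7, 80), so the same coefficients can be worked out by hand. The sketch below does that and wraps both directions of the transformation in two small functions. The names `to_left()` and `to_right()` are made up for this illustration.

```r
v1 <- c(1, 7)    # range shown on the left axis
v2 <- c(50, 80)  # range shown on the right axis

# Line through the two (v1, v2) points: same result as lm(v2 ~ v1).
slope     <- diff(v2) / diff(v1)        # (80 - 50) / (7 - 1) = 5
intercept <- v2[1] - slope * v1[1]      # 50 - 5 * 1 = 45

# Hypothetical helper names, not part of the original code.
to_left  <- function(y) (y - intercept) / slope  # var2 units -> var1 scale
to_right <- function(y) intercept + slope * y    # var1 scale -> var2 units

to_left(c(50, 65, 80))
## [1] 1 4 7
```

Applying `to_right()` to the output of `to_left()` gives back the original values, which is exactly what the secondary axis relies on.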

With this information in place, we can now create our two vertical dual-scaled axes plot.

```
library(tidyverse)
source("twoyaxes_theme.R")  # defines colors and custom_theme

ggplot(data) +
  aes(x = year) +
  # First variable, plotted as is against the left axis
  geom_point(aes(y = var1), color = colors[1], size = 1) +
  geom_line(aes(y = var1), color = colors[1]) +
  # Second variable, rescaled into the var1 range
  geom_point(aes(y = (var2 - coef[1]) / coef[2]), color = colors[2], size = 1) +
  geom_line(aes(y = (var2 - coef[1]) / coef[2]), color = colors[2]) +
  facet_wrap(. ~ group, scales = "free") +
  scale_x_continuous(breaks = seq(1970, 2018, 6),
                     minor_breaks = seq(1970, 2018, 2),
                     labels = sprintf("'%02d", seq(1970, 2018, 6) %% 100)) +
  # The right axis undoes the rescaling so its labels are in var2 units
  scale_y_continuous(breaks = v1[1]:v1[2], limits = v1, position = "left",
                     sec.axis = sec_axis(~ coef[1] + . * coef[2],
                                         name = "Second variable",
                                         breaks = seq(v2[1], v2[2], 5))) +
  labs(title = "Plot with Two Vertical Axes (Left & Right)",
       y = "First variable", x = "",
       caption = "Plot by Albert Mata (@almata).") +
  custom_theme
```

Some interesting points:

- The first pair of `geom_point()` and `geom_line()` plots the first variable.
- The second pair of `geom_point()` and `geom_line()` plots the second variable. This is where we transform our data using the coefficients from the linear model we’ve calculated.
- We use `facet_wrap()` as there are six different groups of data. With `scales = "free"` we force all axes to be drawn for all six plots.
- With `scale_x_continuous()` we set breaks and labels for the X axis. In this example they are hardcoded.
- With `scale_y_continuous()` we set breaks and limits for both Y axes. In the `sec_axis()` part we undo the transformation.
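The plotting code above depends on a data file and a theme script, so here is a stripped-down sketch that should run on its own. Everything in it is an assumption made for illustration: the `toy` data frame is random data, there are only two groups, and the two colors are arbitrary stand-ins for the `colors` vector. Only the rescaling and the `sec_axis()` call mirror the original.

```r
library(ggplot2)

set.seed(1)  # synthetic data, invented only so the example is self-contained
toy <- data.frame(
  group = rep(c("Group 1", "Group 2"), each = 49),
  year  = rep(1970:2018, times = 2)
)
toy$var1 <- runif(nrow(toy), min = 1.5, max = 6.6)
toy$var2 <- runif(nrow(toy), min = 53.4, max = 76.9)

v1 <- c(1, 7)
v2 <- c(50, 80)
coefs <- unname(lm(v2 ~ v1)$coefficients)  # intercept 45, slope 5

p <- ggplot(toy, aes(x = year)) +
  # First variable on its own scale (left axis)
  geom_line(aes(y = var1), color = "steelblue") +
  # Second variable rescaled into the var1 range
  geom_line(aes(y = (var2 - coefs[1]) / coefs[2]), color = "firebrick") +
  facet_wrap(~ group, scales = "free") +
  scale_y_continuous(
    name = "First variable", breaks = v1[1]:v1[2], limits = v1,
    # The right axis maps var1-scale positions back to var2 units
    sec.axis = sec_axis(~ coefs[1] + . * coefs[2],
                        name = "Second variable",
                        breaks = seq(v2[1], v2[2], 5))
  )
p
```

Because every `var2` value lies between 50 and 80, every rescaled value lands between 1 and 7, so nothing is dropped by `limits = v1`.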

Please note that `colors` is just a vector with all the colors used in the plot (the first two being the ones used for the data), and `custom_theme` holds a `ggplot2` theme that customizes this plot (basically font families, sizes, colors, and other similar stuff). As you can see, I keep them in a separate `twoyaxes_theme.R` file for easier code reuse and more clarity (but feel free to ask if you want to have a look at it).

Finally, remember we can save our plots (e.g. as SVG) by assigning the result of the `ggplot()` call to a variable (e.g. `p`) and then using the `ggsave()` function.

`ggsave(filename = "two-y-axes-plot.svg", plot = p, width = 10, height = 6)`
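As a quick way to try the saving step in isolation, the sketch below builds a throwaway plot and writes it to a temporary folder. The toy plot and the `out_file` name are assumptions for illustration. It saves a PNG rather than an SVG because `ggsave()` relies on the `svglite` package for SVG output, which may not be installed.

```r
library(ggplot2)

# Toy plot standing in for the dual-axis plot.
p <- ggplot(data.frame(x = 1:10, y = (1:10)^2), aes(x, y)) +
  geom_line()

# ggsave() picks the graphics device from the file extension.
out_file <- file.path(tempdir(), "two-y-axes-plot.png")
ggsave(filename = out_file, plot = p, width = 10, height = 6)

file.exists(out_file)
```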