# Getting started with emmeans

*This post was last updated on 2021-11-04.*

Package **emmeans** (formerly known as **lsmeans**) is enormously useful for folks wanting to do post hoc comparisons among groups after fitting a model. It has a very thorough set of vignettes (see the vignette topics here), is very flexible with a ton of options, and works out of the box with a lot of different model objects (and can be extended to others 👍).

I’ve been consistently recommending **emmeans** to students fitting models in R. However, often times students struggle a bit to get started using the package, possibly due to the sheer amount of flexibility and information in the vignettes.

I’ve put together some basic examples for using **emmeans**, meant to be a complement to the vignettes. Specifically this post will demonstrate a few of the built-in options for some standard post hoc comparisons; I will write a separate post about custom comparisons in **emmeans**.

*Disclaimer: This post is about using a package in R and so unfortunately does not focus on appropriate statistical practice for model fitting and post hoc comparisons.*

## Table of Contents

- R packages
- The dataset and model
- Built in comparisons with emmeans()
- All pairwise comparisons
- Back-transforming results
- Changing the multiple comparisons adjustment
- Confidence intervals for comparisons
- Putting results in a data.frame
- Within group comparisons
- Main effects comparisons
- Treatment vs control example
- Alternative code for comparisons
- Just the code, please

# R packages

I will load **magrittr** for the pipe in addition to **emmeans**.

```
library(emmeans) # v. 1.7.0
library(magrittr) # v. 2.0.1
```

# The dataset and model

I’ve made a small dataset to use in this example.

The response variable is `resp`

, which comes from the log-normal distribution, and the two crossed factors of interest are `f1`

and `f2`

. Each factor has two levels: a control called `c`

as well as a second non-control level.

```
dat = data.frame(resp = c(1.6,0.3,3,0.1,3.2,0.2,0.4,0.4,2.8,
0.7,3.8,3,0.3,14.3,1.2,0.5,1.1,4.4,0.4,8.4),
f1 = factor(c("a","a","a","a","a",
"a","a","a","a","a","c","c","c","c","c",
"c","c","c","c","c")),
f2 = factor(c("1","c","1","c","1",
"c","1","c","1","c","1","c","1","c","1",
"c","1","c","1","c")))
str(dat)
```

```
# 'data.frame': 20 obs. of 3 variables:
# $ resp: num 1.6 0.3 3 0.1 3.2 0.2 0.4 0.4 2.8 0.7 ...
# $ f1 : Factor w/ 2 levels "a","c": 1 1 1 1 1 1 1 1 1 1 ...
# $ f2 : Factor w/ 2 levels "1","c": 1 2 1 2 1 2 1 2 1 2 ...
```

The model I will use is a linear model with a log-transformed response variable and the two factors and their interaction as explanatory variables. This is the “true” model since I created these data so I’m skipping all model checks (which I would not do in a real analysis).

Note I use `log(resp)`

in the model rather than creating a new log-transformed variable. This will allow me to demonstrate one of the convenient options available in `emmeans()`

later.

`fit1 = lm(log(resp) ~ f1 + f2 + f1:f2, data = dat)`

# Built in comparisons with emmeans()

The **emmeans** package has helper functions for commonly used post hoc comparisons (aka *contrasts*). For example, we can do pairwise comparisons via `pairwise`

or `revpairwise`

, treatment vs control comparisons via `trt.vs.ctrl`

or `trt.vs.ctrlk`

, and even consecutive comparisons via `consec`

.

The available built-in functions for doing comparisons are listed in the documentation for `?"contrast-methods"`

.

# All pairwise comparisons

One way to use `emmeans()`

is via formula coding for the comparisons. The formula is defined in the `specs`

argument.

In my first example I do all pairwise comparisons for all combinations of `f1`

and `f2`

. The built-in function `pairwise`

is put on the left-hand side of the formula of the `specs`

argument. The factors with levels to compare among are on the right-hand side. Since I’m doing all pairwise comparisons, the combination of `f1`

and `f2`

are put in the formula.

The model object is passed to the first argument in `emmeans()`

, `object`

.

`emm1 = emmeans(fit1, specs = pairwise ~ f1:f2)`

Using the formula in this way returns an object with two parts. The first part, called `emmeans`

, is the estimated marginal means along with the standard errors and confidence intervals. We can pull these out with dollar sign notation, which I demonstrate below.

These results are all on the *model* scale, so in this case these are estimated mean log response for each `f1`

and `f2`

combination. Note the message that `emmeans()`

gives us about results being on the log scale in the output. It knows the model is on the log scale because I used `log(resp)`

as the response variable.

`emm1$emmeans`

```
# f1 f2 emmean SE df lower.CL upper.CL
# a 1 0.569 0.445 16 -0.374 1.512
# c 1 -0.102 0.445 16 -1.045 0.842
# a c -1.278 0.445 16 -2.221 -0.334
# c c 1.335 0.445 16 0.392 2.279
#
# Results are given on the log (not the response) scale.
# Confidence level used: 0.95
```

The second part of the output, called `contrasts`

, contains the comparisons of interest. It is this section that we are generally most interested in when answering a question about differences among groups. You can see which comparison is which via the `contrast`

column.

These results are also on the model scale (and we get the same message in this section), and we’ll want to put them on the original scale.

The comparisons are accompanied by statistical tests of the null hypothesis of “no difference”, but lack confidence interval (CI) limits by default. We’ll need to get these.

The `emmeans()`

package automatically adjusts for multiple comparisons. Since we did all pairwise comparisons the package used a Tukey adjustment. The type of adjustment can be changed.

`emm1$contrasts`

```
# contrast estimate SE df t.ratio p.value
# a 1 - c 1 0.671 0.629 16 1.065 0.7146
# a 1 - a c 1.847 0.629 16 2.934 0.0434
# a 1 - c c -0.766 0.629 16 -1.217 0.6253
# c 1 - a c 1.176 0.629 16 1.869 0.2795
# c 1 - c c -1.437 0.629 16 -2.283 0.1438
# a c - c c -2.613 0.629 16 -4.152 0.0038
#
# Results are given on the log (not the response) scale.
# P value adjustment: tukey method for comparing a family of 4 estimates
```

# Back-transforming results

Since I used a log transformation I can express the results as multiplicative differences in medians on the original (data) scale.

We can always back-transform estimates and CI limits by hand, but in `emmeans()`

we can use the `type`

argument for this. Using `type = "response"`

will return results on the original scale. This works when the transformation is explicit in the model (e.g., `log(resp)`

) and works similarly for link functions in generalized linear models.

You’ll see the message changes in the output once I do this, indicating things were back-transformed from the model scale. We also are reminded that the tests were done on the model scale.

In the `contrast`

column in the `contrasts`

section we can see the expression of the comparisons has changed from additive comparisons (via subtraction) shown above to multiplicative comparisons (via division).

`emmeans(fit1, specs = pairwise ~ f1:f2, type = "response")`

```
# $emmeans
# f1 f2 response SE df lower.CL upper.CL
# a 1 1.767 0.786 16 0.688 4.538
# c 1 0.903 0.402 16 0.352 2.321
# a c 0.279 0.124 16 0.108 0.716
# c c 3.800 1.691 16 1.479 9.763
#
# Confidence level used: 0.95
# Intervals are back-transformed from the log scale
#
# $contrasts
# contrast ratio SE df null t.ratio p.value
# a 1 / c 1 1.9553 1.2306 16 1 1.065 0.7146
# a 1 / a c 6.3396 3.9900 16 1 2.934 0.0434
# a 1 / c c 0.4648 0.2926 16 1 -1.217 0.6253
# c 1 / a c 3.2422 2.0406 16 1 1.869 0.2795
# c 1 / c c 0.2377 0.1496 16 1 -2.283 0.1438
# a c / c c 0.0733 0.0461 16 1 -4.152 0.0038
#
# P value adjustment: tukey method for comparing a family of 4 estimates
# Tests are performed on the log scale
```

# Changing the multiple comparisons adjustment

The `adjust`

argument can be used to change the type of multiple comparisons adjustment. All available options are listed and described in the documentation for `summary.emmGrid`

under the section *P-value adjustments*.

One option is to skip multiple comparisons adjustments all together, using `adjust = "none"`

. If we use this the message about multiple comparisons disappears (since we didn’t use one).

```
emm1.1 = emmeans(fit1, specs = pairwise ~ f1:f2, type = "response", adjust = "none")
emm1.1
```

```
# $emmeans
# f1 f2 response SE df lower.CL upper.CL
# a 1 1.767 0.786 16 0.688 4.538
# c 1 0.903 0.402 16 0.352 2.321
# a c 0.279 0.124 16 0.108 0.716
# c c 3.800 1.691 16 1.479 9.763
#
# Confidence level used: 0.95
# Intervals are back-transformed from the log scale
#
# $contrasts
# contrast ratio SE df null t.ratio p.value
# a 1 / c 1 1.9553 1.2306 16 1 1.065 0.3025
# a 1 / a c 6.3396 3.9900 16 1 2.934 0.0097
# a 1 / c c 0.4648 0.2926 16 1 -1.217 0.2412
# c 1 / a c 3.2422 2.0406 16 1 1.869 0.0801
# c 1 / c c 0.2377 0.1496 16 1 -2.283 0.0365
# a c / c c 0.0733 0.0461 16 1 -4.152 0.0008
#
# Tests are performed on the log scale
```

# Confidence intervals for comparisons

We will almost invariably want to report confidence intervals for any comparisons of interest. We need a separate function to get these. Here is an example using the `confint()`

function with the default 95% CI (the confidence level can be changed, see `?confint.emmGrid`

). I use the pipe to pass the `contrasts`

into the `confint()`

function.

```
emm1.1$contrasts %>%
confint()
```

```
# contrast ratio SE df lower.CL upper.CL
# a 1 / c 1 1.9553 1.2306 16 0.5150 7.424
# a 1 / a c 6.3396 3.9900 16 1.6696 24.072
# a 1 / c c 0.4648 0.2926 16 0.1224 1.765
# c 1 / a c 3.2422 2.0406 16 0.8539 12.311
# c 1 / c c 0.2377 0.1496 16 0.0626 0.903
# a c / c c 0.0733 0.0461 16 0.0193 0.278
#
# Confidence level used: 0.95
# Intervals are back-transformed from the log scale
```

The `confint()`

function returns confidence intervals but gets rid of the statistical tests. Some people will want to also report the test statistics and p-values. In this case, we can use `summary()`

instead of `confint()`

, with `infer = TRUE`

.

```
emm1.1$contrasts %>%
summary(infer = TRUE)
```

```
# contrast ratio SE df lower.CL upper.CL null t.ratio p.value
# a 1 / c 1 1.9553 1.2306 16 0.5150 7.424 1 1.065 0.3025
# a 1 / a c 6.3396 3.9900 16 1.6696 24.072 1 2.934 0.0097
# a 1 / c c 0.4648 0.2926 16 0.1224 1.765 1 -1.217 0.2412
# c 1 / a c 3.2422 2.0406 16 0.8539 12.311 1 1.869 0.0801
# c 1 / c c 0.2377 0.1496 16 0.0626 0.903 1 -2.283 0.0365
# a c / c c 0.0733 0.0461 16 0.0193 0.278 1 -4.152 0.0008
#
# Confidence level used: 0.95
# Intervals are back-transformed from the log scale
# Tests are performed on the log scale
```

# Putting results in a data.frame

One of the really nice things about `emmeans()`

is that it makes it easy to get the results into a nice format for making tables or graphics of results. This is because the results are converted to a data.frame with `confint()`

or `summary()`

.

If needed, the estimated marginal means can also be put into a data.frame. In this case we can use `as.data.frame()`

to convert the `emmeans`

to a data.frame for plotting or putting into a table of results. We can also use `as.data.frame()`

directly on the contrasts above if we don’t need `confint()`

or `summary()`

(not shown).

```
emm1.1$emmeans %>%
as.data.frame()
```

```
# f1 f2 response SE df lower.CL upper.CL
# 1 a 1 1.7665334 0.7861763 16 0.6876870 4.537879
# 2 c 1 0.9034576 0.4020739 16 0.3517035 2.320806
# 3 a c 0.2786518 0.1240109 16 0.1084753 0.715802
# 4 c c 3.8004222 1.6913362 16 1.4794517 9.762542
```

# Within group comparisons

While we *can* do all pairwise comparisons, there are certainly plenty of situations where the research question dictates that we only want a specific set of comparisons. A common example of this is when we want to compare the levels of one factor within the levels of another. Here I’ll show comparisons among levels of `f1`

for each level of `f2`

.

The only thing that changes is the right-hand side of the `specs`

formula. The code `f1|f2`

translates to “compare levels of `f1`

within each level of `f2`

”.

```
emm2 = emmeans(fit1, specs = pairwise ~ f1|f2, type = "response")
emm2
```

```
# $emmeans
# f2 = 1:
# f1 response SE df lower.CL upper.CL
# a 1.767 0.786 16 0.688 4.538
# c 0.903 0.402 16 0.352 2.321
#
# f2 = c:
# f1 response SE df lower.CL upper.CL
# a 0.279 0.124 16 0.108 0.716
# c 3.800 1.691 16 1.479 9.763
#
# Confidence level used: 0.95
# Intervals are back-transformed from the log scale
#
# $contrasts
# f2 = 1:
# contrast ratio SE df null t.ratio p.value
# a / c 1.9553 1.2306 16 1 1.065 0.3025
#
# f2 = c:
# contrast ratio SE df null t.ratio p.value
# a / c 0.0733 0.0461 16 1 -4.152 0.0008
#
# Tests are performed on the log scale
```

You can see there is no message about a multiple comparisons adjustment in the above set of comparisons. This is because the package default is to correct for the number of comparisons *within* each group instead of across groups. In this case there is only a single comparison in each group.

If we consider the family of comparisons to be all comparisons regardless of group and want to correct for multiple comparisons, we can do so via `rbind.emmGrid`

.

Here is an example of passing `contrasts`

to `rbind()`

to correct for multiple comparisons. The default adjustment is Bonferroni, which can be much too conservative when the number of comparisons is large. You can control the multiple comparisons procedure via `adjust`

.

The results of `rbind()`

can also conveniently be used with `summary()`

, `confint()`

, and/or `as.data.frame()`

.

```
emm2$contrasts %>%
rbind()
```

```
# f2 contrast ratio SE df null t.ratio p.value
# 1 a / c 1.9553 1.2306 16 1 1.065 0.6050
# c a / c 0.0733 0.0461 16 1 -4.152 0.0015
#
# P value adjustment: bonferroni method for 2 tests
# Tests are performed on the log scale
```

# Main effects comparisons

Even if we have multiple factors in the model, complete with an interaction term, we can still do “overall” comparisons among groups if our research question indicated that main effects were important to estimate.

Doing main effects in the presence of an interaction means we *average over* the levels of the other factor(s). The `emmeans()`

function gives both a warning about the interaction and a message indicating which factor was averaged over to remind us of this.

Here is the estimated main effect of `f1`

. Since we are only interested in overall comparisons of that factor it is the only factor given on the right-hand side of the `specs`

formula.

`emmeans(fit1, specs = pairwise ~ f1)`

`# NOTE: Results may be misleading due to involvement in interactions`

```
# $emmeans
# f1 emmean SE df lower.CL upper.CL
# a -0.354 0.315 16 -1.0215 0.313
# c 0.617 0.315 16 -0.0503 1.284
#
# Results are averaged over the levels of: f2
# Results are given on the log (not the response) scale.
# Confidence level used: 0.95
#
# $contrasts
# contrast estimate SE df t.ratio p.value
# a - c -0.971 0.445 16 -2.182 0.0443
#
# Results are averaged over the levels of: f2
# Results are given on the log (not the response) scale.
```

# Treatment vs control example

The **emmeans** package has built-in helper functions for comparing each group mean to the control mean. If the control group is the in the first row of the `emmeans`

section of the output, this set of comparisons can be requested via `trt.vs.ctrl`

.

Note the default multiple comparisons adjustment is a Dunnett adjustment.

`emmeans(fit1, specs = trt.vs.ctrl ~ f1:f2)`

```
# $emmeans
# f1 f2 emmean SE df lower.CL upper.CL
# a 1 0.569 0.445 16 -0.374 1.512
# c 1 -0.102 0.445 16 -1.045 0.842
# a c -1.278 0.445 16 -2.221 -0.334
# c c 1.335 0.445 16 0.392 2.279
#
# Results are given on the log (not the response) scale.
# Confidence level used: 0.95
#
# $contrasts
# contrast estimate SE df t.ratio p.value
# c 1 - a 1 -0.671 0.629 16 -1.065 0.5857
# a c - a 1 -1.847 0.629 16 -2.934 0.0262
# c c - a 1 0.766 0.629 16 1.217 0.4947
#
# Results are given on the log (not the response) scale.
# P value adjustment: dunnettx method for 3 tests
```

Using `trt.vs.ctrl`

means we ended up comparing each group mean to the “a 1” group since it is in the first row. In the example I’m using the control group, “c c”, is actually the *last* group listed in the `emmeans`

section. When the control group is the last group in `emmeans`

we can use `trt.vs.ctrlk`

to get the correct set of comparisons.

`emmeans(fit1, specs = trt.vs.ctrlk ~ f1:f2)`

```
# $emmeans
# f1 f2 emmean SE df lower.CL upper.CL
# a 1 0.569 0.445 16 -0.374 1.512
# c 1 -0.102 0.445 16 -1.045 0.842
# a c -1.278 0.445 16 -2.221 -0.334
# c c 1.335 0.445 16 0.392 2.279
#
# Results are given on the log (not the response) scale.
# Confidence level used: 0.95
#
# $contrasts
# contrast estimate SE df t.ratio p.value
# a 1 - c c -0.766 0.629 16 -1.217 0.4947
# c 1 - c c -1.437 0.629 16 -2.283 0.0935
# a c - c c -2.613 0.629 16 -4.152 0.0021
#
# Results are given on the log (not the response) scale.
# P value adjustment: dunnettx method for 3 tests
```

That gives us what we want in this case. However, if the control group was some other group, like “c 1”, we could use `trt.vs.ctrlk`

with the `ref`

argument to define which row in the `emmeans`

section represents the control group.

The “c 1” group is the second row in the `emmeans`

so we can use `ref = 2`

to define this group as the control group.

`emmeans(fit1, specs = trt.vs.ctrlk ~ f1:f2, ref = 2)`

```
# $emmeans
# f1 f2 emmean SE df lower.CL upper.CL
# a 1 0.569 0.445 16 -0.374 1.512
# c 1 -0.102 0.445 16 -1.045 0.842
# a c -1.278 0.445 16 -2.221 -0.334
# c c 1.335 0.445 16 0.392 2.279
#
# Results are given on the log (not the response) scale.
# Confidence level used: 0.95
#
# $contrasts
# contrast estimate SE df t.ratio p.value
# a 1 - c 1 0.671 0.629 16 1.065 0.5857
# a c - c 1 -1.176 0.629 16 -1.869 0.1937
# c c - c 1 1.437 0.629 16 2.283 0.0935
#
# Results are given on the log (not the response) scale.
# P value adjustment: dunnettx method for 3 tests
```

Finally, if we want to reverse the order of subtraction in the treatment vs control comparisons we can use the `reverse`

argument.

`emmeans(fit1, specs = trt.vs.ctrlk ~ f1:f2, ref = 2, reverse = TRUE)`

```
# $emmeans
# f1 f2 emmean SE df lower.CL upper.CL
# a 1 0.569 0.445 16 -0.374 1.512
# c 1 -0.102 0.445 16 -1.045 0.842
# a c -1.278 0.445 16 -2.221 -0.334
# c c 1.335 0.445 16 0.392 2.279
#
# Results are given on the log (not the response) scale.
# Confidence level used: 0.95
#
# $contrasts
# contrast estimate SE df t.ratio p.value
# c 1 - a 1 -0.671 0.629 16 -1.065 0.5857
# c 1 - a c 1.176 0.629 16 1.869 0.1937
# c 1 - c c -1.437 0.629 16 -2.283 0.0935
#
# Results are given on the log (not the response) scale.
# P value adjustment: dunnettx method for 3 tests
```

# Alternative code for comparisons

The `emmeans()`

package offers the option to do comparisons in two steps instead of in one step the way I have been using it so far. I personally find this alternative most useful when doing custom comparisons, and I think it’s useful to introduce it now so it looks familiar. This alternative keeps the estimated marginal means and the comparisons of interest in separate objects, which can be attractive in some situations.

The first step is to use `emmeans()`

to calculate the marginal means of interest. We still use the formula in `specs`

with the factor(s) of interest on the right-hand side but no longer put anything on the left-hand side of the tilde.

We can still use `type`

in `emmeans()`

but cannot use `adjust`

(since we don’t adjust for multiple comparisons until we’ve actually done comparisons 😉).

```
emm3 = emmeans(fit1, specs = ~ f1:f2, type = "response")
emm3
```

```
# f1 f2 response SE df lower.CL upper.CL
# a 1 1.767 0.786 16 0.688 4.538
# c 1 0.903 0.402 16 0.352 2.321
# a c 0.279 0.124 16 0.108 0.716
# c c 3.800 1.691 16 1.479 9.763
#
# Confidence level used: 0.95
# Intervals are back-transformed from the log scale
```

We then get the comparisons we want in a second step using the `contrast()`

function. We request the comparisons we want via `method`

. When using built-in comparisons like I am here, we give the comparison function name as a string (meaning in quotes). Also see the `pairs()`

function, which is for the special case of all pairwise comparisons.

We can use `adjust`

in `contrast()`

to change the multiple comparisons adjustment.

`contrast(emm3, method = "pairwise", adjust = "none")`

```
# contrast ratio SE df null t.ratio p.value
# a 1 / c 1 1.9553 1.2306 16 1 1.065 0.3025
# a 1 / a c 6.3396 3.9900 16 1 2.934 0.0097
# a 1 / c c 0.4648 0.2926 16 1 -1.217 0.2412
# c 1 / a c 3.2422 2.0406 16 1 1.869 0.0801
# c 1 / c c 0.2377 0.1496 16 1 -2.283 0.0365
# a c / c c 0.0733 0.0461 16 1 -4.152 0.0008
#
# Tests are performed on the log scale
```

We can follow the `contrast()`

argument with `summary()`

or `confint()`

to get the output we want and put them into a data.frame for plotting/saving. Again, I think the real strength of `contrast()`

comes when we want custom comparisons, and I’ll demonstrate these in my next post on custom contrasts.

# Just the code, please

Here’s the code without all the discussion. Copy and paste the code below or you can download an R script of uncommented code from here.

```
library(emmeans) # v. 1.7.0
library(magrittr) # v. 2.0.1
dat = data.frame(resp = c(1.6,0.3,3,0.1,3.2,0.2,0.4,0.4,2.8,
0.7,3.8,3,0.3,14.3,1.2,0.5,1.1,4.4,0.4,8.4),
f1 = factor(c("a","a","a","a","a",
"a","a","a","a","a","c","c","c","c","c",
"c","c","c","c","c")),
f2 = factor(c("1","c","1","c","1",
"c","1","c","1","c","1","c","1","c","1",
"c","1","c","1","c")))
str(dat)
fit1 = lm(log(resp) ~ f1 + f2 + f1:f2, data = dat)
emm1 = emmeans(fit1, specs = pairwise ~ f1:f2)
emm1$emmeans
emm1$contrasts
emmeans(fit1, specs = pairwise ~ f1:f2, type = "response")
emm1.1 = emmeans(fit1, specs = pairwise ~ f1:f2, type = "response", adjust = "none")
emm1.1
emm1.1$contrasts %>%
confint()
emm1.1$contrasts %>%
summary(infer = TRUE)
emm1.1$emmeans %>%
as.data.frame()
emm2 = emmeans(fit1, specs = pairwise ~ f1|f2, type = "response")
emm2
emm2$contrasts %>%
rbind()
emmeans(fit1, specs = pairwise ~ f1)
emmeans(fit1, specs = trt.vs.ctrl ~ f1:f2)
emmeans(fit1, specs = trt.vs.ctrlk ~ f1:f2)
emmeans(fit1, specs = trt.vs.ctrlk ~ f1:f2, ref = 2)
emmeans(fit1, specs = trt.vs.ctrlk ~ f1:f2, ref = 2, reverse = TRUE)
emm3 = emmeans(fit1, specs = ~ f1:f2, type = "response")
emm3
contrast(emm3, method = "pairwise", adjust = "none")
```