Confounders, mediators, moderators and covariates

I recently put together some slides to explain mediators and mediation analysis to some people who knew slightly less than I did on the topic.

I started looking for some nice examples that would describe what a mediator was. I found plenty. Of course, it was also important to pre-empt confusion between similar and related terms, and since mediators and confounders are regularly mixed up I also looked for nice examples of confounders. Again, I found lots of good examples. For completeness I looked for examples of moderators and covariates. What struck me was that there were nice examples for each term separately but I couldn’t find a really comprehensive example that included them all and clearly delineated the differences between them. So I put my mind to it. And failed.

But then I asked someone cleverer than me (one of my former PhD supervisors) and he provided the bones of a nice example to which I have added. I can’t claim that the result is as clean and clear as I would like, but it is the best I have and I would welcome corrections and clarifications.

So here is the basic formulation.

We have an exposure (or treatment if you are trials minded) that we think is associated with an outcome of some type. Mediators and confounders are similar except for the direction of effect between them and the exposure/treatment: a confounder influences the exposure, whereas a mediator is influenced by it. Mediators are additionally characterised by lying on the causal pathway between exposure and outcome. Moderators are simply interaction terms that change the size or direction (or both) of the effect of the exposure on outcome. I have represented them here using a vertical line between exposure and moderator that feeds into an arrow leading to outcome (which isn’t conventional but represents the relationship better). Meanwhile, covariates are variables that might affect outcomes but are not associated with anything else. So far, so confusingly boring. But hopefully grounding this stuff in a concrete example will help. Here’s the one I have.

Okay, so maternal deprivation is associated with mothers giving birth to babies with lower birthweight. This is well established. So well established I can’t find the reference. Let’s move on.

Next the mediator. The causal pathway through which deprivation might act could be through diet, i.e. being poor might mean mothers can’t afford good food and their diet consequently suffers. This sub-optimal diet is the same diet nourishing the unborn child who is smaller as a result. Mediation analysis can formally test whether this hypothesis is true.
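As a rough sketch of what such an analysis involves, here is a simulation using the simple linear "product of coefficients" decomposition. Everything in it is assumed for illustration: the effect sizes, the standardised scales and the helper names `cov` and `slopes2` are hypothetical, and a real analysis would also need standard errors and a proper test, which this omits.

```python
import random

random.seed(1)
N = 20000

# Hypothetical effect sizes on standardised scales (not from any study).
A = -0.5       # deprivation -> diet quality
B = 0.4        # diet quality -> birthweight
DIRECT = -0.2  # deprivation -> birthweight, not via diet

deprivation = [random.gauss(0, 1) for _ in range(N)]
diet = [A * x + random.gauss(0, 1) for x in deprivation]
birthweight = [DIRECT * x + B * m + random.gauss(0, 1)
               for x, m in zip(deprivation, diet)]


def cov(u, v):
    """Sample covariance of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def slopes2(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 (intercept included)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det


# Total effect: birthweight regressed on deprivation alone.
total_hat = cov(deprivation, birthweight) / cov(deprivation, deprivation)
# Path a: diet regressed on deprivation.
a_hat = cov(deprivation, diet) / cov(deprivation, deprivation)
# Direct effect and path b: birthweight regressed on deprivation and diet.
direct_hat, b_hat = slopes2(deprivation, diet, birthweight)
indirect_hat = a_hat * b_hat

print(f"total    {total_hat:.3f}")
print(f"direct   {direct_hat:.3f}")
print(f"indirect {indirect_hat:.3f}")
```

In a linear model fitted this way the total effect splits exactly into the direct part plus the part that travels through diet.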

The confusion between mediators and confounders arises from the fact that both are associated with the exposure and with the outcome. Now the confounder I have chosen is age. According to figure 2 here, there is an association between maternal age and deprivation. Logically, we do not allow deprivation to influence someone’s age, so the arrow only goes from age to deprivation. I like that about this example. We are also assuming that maternal age is associated with low birthweight, which it seems to be.
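To see why a confounder matters, here is a small simulation in which age drives both deprivation and birthweight. The numbers, including the direction of the age effects, are made up for illustration, and `cov` and `slopes2` are hypothetical helper names. The sketch compares the deprivation slope with and without adjusting for age.

```python
import random

random.seed(2)
N = 20000

# Hypothetical values on standardised scales (not from any study).
TRUE_EFFECT = -0.3         # deprivation -> birthweight
AGE_ON_DEPRIVATION = -0.6  # assumed: younger mothers are more deprived
AGE_ON_BIRTHWEIGHT = 0.5   # assumed: older mothers have heavier babies

age = [random.gauss(0, 1) for _ in range(N)]
deprivation = [AGE_ON_DEPRIVATION * c + random.gauss(0, 1) for c in age]
birthweight = [TRUE_EFFECT * x + AGE_ON_BIRTHWEIGHT * c + random.gauss(0, 1)
               for x, c in zip(deprivation, age)]


def cov(u, v):
    """Sample covariance of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def slopes2(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 (intercept included)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det


# Ignoring age: the slope mixes the real effect with the age pathway.
naive_hat = cov(deprivation, birthweight) / cov(deprivation, deprivation)
# Adjusting for age: the slope for deprivation with age held fixed.
adjusted_hat, age_hat = slopes2(deprivation, age, birthweight)

print(f"true effect {TRUE_EFFECT}")
print(f"unadjusted  {naive_hat:.3f}")
print(f"adjusted    {adjusted_hat:.3f}")
```

Under these assumptions the unadjusted slope overstates the harm of deprivation, because part of what it picks up is really the effect of age.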

Smoking is what I have chosen for my moderator. Now this study investigated black smoke (so air pollution rather than cigarette smoke) and did indeed find a significant interaction (so moderation) between deprivation and black smoke exposure on birthweight. Translated to my example, this would mean that deprived mothers who smoke produce babies even smaller than the separate effects of smoking and deprivation would predict. I’m gonna go ahead and call that a win.
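A minimal sketch of what an interaction looks like in data, assuming two yes/no variables and entirely invented effect sizes in grams. With binary variables the interaction is just a difference of differences between the four group means, so no regression machinery is needed.

```python
import random
from statistics import mean

random.seed(3)
N = 40000

# Hypothetical effects in grams (illustrative only, not from any study).
BASE, DEPRIVED, SMOKER, INTERACTION = 3500, -150, -200, -100

# Collect simulated birthweights for each (deprived, smoker) combination.
cells = {(d, s): [] for d in (0, 1) for s in (0, 1)}
for _ in range(N):
    d = random.randint(0, 1)
    s = random.randint(0, 1)
    weight = (BASE + DEPRIVED * d + SMOKER * s
              + INTERACTION * d * s + random.gauss(0, 300))
    cells[(d, s)].append(weight)

m = {k: mean(v) for k, v in cells.items()}

# Effect of deprivation within each smoking group.
effect_in_nonsmokers = m[(1, 0)] - m[(0, 0)]
effect_in_smokers = m[(1, 1)] - m[(0, 1)]
# The interaction is how much the effect changes between the groups.
interaction_hat = effect_in_smokers - effect_in_nonsmokers

print(f"deprivation effect, non-smokers {effect_in_nonsmokers:.0f} g")
print(f"deprivation effect, smokers     {effect_in_smokers:.0f} g")
print(f"interaction                     {interaction_hat:.0f} g")
```

If there were no moderation, the deprivation effect would be the same in both groups and the last number would hover around zero.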

Finally, there may be additional covariates that we want to control for such as maternal height. This could just as easily be called genetic factors or something else, but the point being that maybe there are things we need to account for as we think they are related to outcome (but nothing else).
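To contrast this with the confounder case, here is a sketch where maternal height affects birthweight but is unrelated to deprivation. The effect sizes are hypothetical and `cov` and `slopes2` are illustrative helper names. Adjusting should barely move the deprivation estimate, but it should soak up unexplained variation in the outcome.

```python
import random

random.seed(4)
N = 20000

# Hypothetical values on standardised scales (not from any study).
EFFECT = -0.3           # deprivation -> birthweight
HEIGHT_EFFECT = 0.8     # maternal height -> birthweight

deprivation = [random.gauss(0, 1) for _ in range(N)]
height = [random.gauss(0, 1) for _ in range(N)]  # independent of deprivation
birthweight = [EFFECT * x + HEIGHT_EFFECT * z + random.gauss(0, 1)
               for x, z in zip(deprivation, height)]


def cov(u, v):
    """Sample covariance of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def slopes2(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 (intercept included)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det


unadjusted_hat = cov(deprivation, birthweight) / cov(deprivation, deprivation)
adjusted_hat, height_hat = slopes2(deprivation, height, birthweight)

# Residual variance with and without the covariate in the model.
resid_unadj = [y - unadjusted_hat * x
               for x, y in zip(deprivation, birthweight)]
resid_adj = [y - adjusted_hat * x - height_hat * z
             for x, z, y in zip(deprivation, height, birthweight)]
var_unadj = cov(resid_unadj, resid_unadj)
var_adj = cov(resid_adj, resid_adj)

print(f"unadjusted slope {unadjusted_hat:.3f}, residual variance {var_unadj:.2f}")
print(f"adjusted slope   {adjusted_hat:.3f}, residual variance {var_adj:.2f}")
```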

Of course while ~~cherry picking~~ systematically reviewing the literature I found this which observes that there is an interaction between maternal age and deprivation. This would indicate that age may be a moderator as well as a confounder. And smoking and deprivation are also linked. And this really is where it becomes clear that in most situations the pathway diagram is complicated and uncertain. Some of the relationships will be stronger than others, there will be associations between practically all of the variables and there will be feedback loops all over the place.

Moreover, I have come to the conclusion that mediators and confounders are kind of intuitive. People understand them when given an example. Clear examples of interactions/moderation are rarer and I suppose that is because interactions are counter-intuitive. They are effects that are different from the sum of their parts and real-life examples that everyone recognises are not commonplace (prove me wrong commenters – I’d love some better examples). [UPDATE: I have come up with an example that parents might appreciate. You have two kids. Each on their own is well behaved and adorable. Put them together and you might expect double the cuteness, but that is often not the case: they bicker and argue and vie for supremacy. Their cuteness is not the sum of their cutenesses; they are each less well-behaved than on their own. This is your classic qualitative interaction.]

Nevertheless, and I’d like to finish on an ill-deserved upbeat tone, the discipline of clearly describing what we think the relationships and pathways are is incredibly powerful and important to hone. In particular, pre-specifying the order in which we expect things to happen allows us to identify and test causal hypotheses. In most situations, the process of committing this to paper is the hardest part. It is only once that step has been made that statistics can really begin to be applied (of course statistical thinking can help you get there too). To paraphrase Indiana Jones’ dad*, once the pathway is clear the appropriate statistical approach often presents itself. (I would like to thank Zoe Kelson and Daniel Farewell for constructive comments on earlier drafts.)

* I couldn’t get a picture of Indiana Jones’ dad with this text. Here’s one of Han Solo instead.

18 thoughts on “Confounders, mediators, moderators and covariates”

EpiAnn says:

July 7, 2015 at 3:52 pm

Found this through google while looking for mediation moderation vs. confounding interaction–suspect I’ll be a regular visitor!

1. markjameskelson says:
  
  July 7, 2015 at 4:51 pm
  
  Thanks Ann. Must blog more regularly!
  
hopeLESS says:

October 5, 2015 at 9:02 am

wow, this is exactly what I am looking for. Thank you. I have no suspicion to become a regular…

Andrew says:

March 24, 2016 at 8:41 pm

OMG a diagram – thank you thank you thank you.

paul says:

May 12, 2016 at 10:42 am

Thanks so much for this posting! I have been searching for this information as this issue was recently raised in response to some analyses I have been working on.
I was wondering if you might be able to provide a citation that discusses this? Any information would be great! Thanks!

1. markjameskelson says:
  
  May 12, 2016 at 11:05 am
  
  Hi Paul,
  
  I’m afraid I don’t have a single citation that covers all of this stuff. If I come across one I will update.
  
  Mark
  
Learner says:

May 25, 2016 at 2:59 am

beautifully explained. Thanks a ton !
A few more questions, please:
1. Partial and complete mediation?
I think what you’ve mentioned is partial mediation and the example of complete mediation is:
x1 affects x2 which affects Y.
x1 has no direct impact on Y.

Please correct, if I am wrong. Please also provide suitable example.

2. Also, somewhere I found: Intervening variables are also called mediating variables.

3. And there is a phrase ‘control variable’. So is a control variable a moderator or a mediating variable?

Waiting for your answer, as nowhere I can find such a simple explanation. Thanks again.

1. markjameskelson says:
  
  May 25, 2016 at 10:36 am
  
  Hi Learner!
  
  1. Yes, your interpretation of partial and complete mediation is correct. My example does indeed have partial mediation. I guess an example of full mediation would be something like x1 = “time spent outdoors”, x2 = “being bitten by mosquitoes” and Y being “contracting Zika virus”. This assumes that Zika is only delivered through mosquitoes. You could see that time spent outdoors might increase your chance of contracting Zika, but only through being bitten. I suppose I would say that full mediation is similar to confounding (i.e. the relationship is entirely driven by something else). The only difference is the direction of the relationship. Thanks for this comment.
  
  2. Yes, mediator, mediating variable, intervening variable and intermediary variable are all synonyms
  
  3. A control variable is a term from basic science which refers to a variable held constant throughout the trial in both arms. So, if you were checking the effect of fertilisers on plant growth, you would want to keep the exposure to light the same in both groups. Light exposure would be a control variable.
  
  Thanks
  
  Mark
  
AnalyticAscent says:

May 25, 2016 at 9:50 am

I guess I’m not the only one to find this site via web search of proper terminology!

Recently I started a text analysis project that aims to check whether a study controlled for confounders that might distort the true connection between two variables.

I’m doing this by using python scripts to check for the absence or presence of keywords associated with those third variables.

This is the best overview of how those statistical terms relate to one another, I’ll cite this in the future if anyone asks what they mean. Thank you for posting this!

1. markjameskelson says:
  
  May 25, 2016 at 10:37 am
  
  Thanks AnalyticAscent,
  
  See the response to Learner for some more synonyms. Sounds like a challenging project. Best of luck.
  
  Mark
  
Stephen says:

June 19, 2016 at 12:49 am

My understanding is moderator is the same as effect modifier, and effect modifiers are also in the causal pathway as mediators. Am I right?

1. markjameskelson says:
  
  June 19, 2016 at 7:11 am
  
  Yes, I think most people use moderator in the same way as effect modifier.
  You could have an effect modifier that was not a mediator (using the Baron and Kenny approach).
  
Amy says:

September 24, 2016 at 8:52 pm

About to give a talk on this topic, thanks so much for doing a nice job on summarizing the differences!

S says:

February 27, 2017 at 12:01 pm

Hi, I was looking for an explanation of the difference between mediators and confounders and came across your blog. Thank you! The diagram is especially helpful to understand the difference.

Data on a Bagel says:

March 15, 2017 at 1:28 pm

Hi,

I thought of a real-life example of a moderator that I think is pretty clear, and helps me when thinking of interaction:

For asbestos exposure, the outcome of lung cancer is moderated by tobacco smoke. The risk of lung cancer for those who are exposed to asbestos AND who are smokers is 50-90 times higher than for those who are exposed to asbestos and who are not smokers.

As a historical footnote, the Lorillard company used to sell a brand called “Kent”, which had the most dangerous form of asbestos in the filter in the 1950s. They sold 11.4 billion of those cigarettes.

https://www.asbestos.com/asbestos/smoking/

1. markjameskelson says:
  
  March 15, 2017 at 2:20 pm
  
  Thanks for this, helpful example
  
Alexander says:

May 4, 2018 at 4:11 am

I just came across this… thank you Mark.

I do have an important comment. A confounder must not be on the causal pathway. In contrast, a mediator must be on the causal pathway.

Why does this matter? In analysis, such as regression, if you control for a variable as if it were a confounder, but it is actually on the causal pathway, your results could show false positives, false negatives, possibly even a false protective effect from a toxic exposure. If you control for a mediator you can have the same problem. However a mediator can be valuable for explaining how an exposure affects an outcome.

This business of getting causal paths right and not controlling for things on a causal path is really important for getting causality right. E.g. if you manipulate X you can expect a specific effect on Y.

You said something similar in your last paragraph, which I really like.

1. markjameskelson says:
  
  May 4, 2018 at 9:04 am
  
  Thanks Alexander. This is a very good point. Much appreciated.
  