Cross Validated: Does causation imply correlation?
[+162] [9] Matthew
[2012-04-11 20:00:40]
[ correlation causality ]
[ https://stats.stackexchange.com/questions/26300/does-causation-imply-correlation ]

Correlation does not imply causation, as there could be many explanations for the correlation. But does causation imply correlation? Intuitively, I would think that the presence of causation means there is necessarily some correlation. But my intuition has not always served me well in statistics. Does causation imply correlation?

(6) Problem is, if you look up "imply" in a dictionary you'll see both "suggest" and "necessitate." - rolando2
(8) Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'. xkcd.com/552 - jchristie
(1) The question itself doesn't appear to be looking for a specific, factual answer, as indicated by the use of the word imply. The reference above is like an ultimate maybe. Or more like a probably but I can't prove it. - jchristie
[+122] [2012-04-12 18:14:24] Artem Kaznatcheev [ACCEPTED]

As many of the answers above have stated, causation does not imply linear correlation. Since a lot of the correlation concepts come from fields that rely heavily on linear statistics, correlation is usually taken to mean linear correlation. The Wikipedia article [1] is a decent source for this; I really like this image:

Correlation examples

Look at some of the figures in the bottom row, for instance the parabola-ish shape in the 4th example. This is roughly what happens in @StasK's answer (with a little bit of noise added). $Y$ can be fully caused by $X$, but if the relationship is nonlinear and symmetric, you can still have a correlation of 0.

The word you are looking for is mutual information [2]: this is sort of the general non-linear version of correlation. In that case, your statement would be true: causation implies high mutual information.

[1] http://en.wikipedia.org/wiki/Correlation_and_dependence
[2] http://en.wikipedia.org/wiki/Mutual_information
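This contrast between correlation and mutual information can be checked numerically. The sketch below is my own illustration, not part of the answer: it draws $X$ from a standard normal, sets $Y = X^2$, and estimates both the Pearson correlation and a crude histogram-based mutual information (the bin count and sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)
y = x ** 2  # y is fully determined (caused) by x

# Pearson correlation is near zero despite the deterministic relationship.
r = np.corrcoef(x, y)[0, 1]

# Crude plug-in estimate of mutual information (in nats) from a 2-D histogram.
def mutual_information(a, b, bins=30):
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)   # marginal of a
    py = p.sum(axis=0, keepdims=True)   # marginal of b
    nz = p > 0
    return np.sum(p[nz] * np.log(p[nz] / (px @ py)[nz]))

mi = mutual_information(x, y)
print(f"corr = {r:.3f}, estimated MI = {mi:.2f} nats")
```

The correlation comes out essentially zero while the mutual-information estimate is clearly positive, which is exactly the distinction the answer draws.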

(3) It's usually but not always true that high mutual information accompanies causation. See @gung's answer where "if the cause is perfectly correlated with another causal variable with exactly the opposite effect." - Neil G
(6) The argument of two causes with opposite effects that always cancels each other doesn't make much sense to me as a cause. I can always assume there are unicorns causing something, and gremlins cancelling their efforts perfectly; I avoid this since it's silly. But maybe I am misunderstanding your point. - Artem Kaznatcheev
(11) His example is more extreme than it needs to be. It's possible for you to have Boolean variables $A, B$, and $C$ such that $A$ and $B$ are causes of $C$, and $C = A + B$ (mod 2). Then, absent of knowledge of $B$, $A$ and $C$ have no mutual information. $B$ is an undiscovered confounder — what you are calling "gremlins" even though it is something very common. - Neil G
@NeilG that is a good example, much clearer to me. I think it still stretches the meaning of 'cause', though, since A only causes C with B fixed (and if we fix B then A and C still have high mutual info), but I see your point much more now. Thanks for pointing out this shortcoming! - Artem Kaznatcheev
When we say that A and B are the causes of C, we mean that: if some external force (or agent) intervenes and changes A or B, then C will change, but if an external force changes C, then A or B don't change. So, if we "suppose that A and B are the causes of C", then A doesn't stop being a cause just because we as observers don't observe B! I would say that causation implies high mutual information in some context. - Neil G
(4) @NeilG I agree with your first sentence, but not the second. Just because A & B cause C doesn't mean that A causes C and B causes C. I don't see why cause has to be distributive over &. - Artem Kaznatcheev
(4) The reason that A is nevertheless a cause of C is because changing A will still change C. So, C is dependent on A even when we don't observe B. - Neil G
It (the mutual information of A and C, given B) can be quantified with conditional mutual information. - Piotr Migdal
Also, if you choose an axis on which to measure the spin of particle, there is no mutual information between that choice and the spin of an entangled particle, but there are interpretations of quantum mechanics in which the former causes the latter to change. - Acccumulation
Why does causation imply mutual information? Does there not exist pseudorandom processes that in distribution give statistical independence inspite of certain variables in the process being causally related? - Galen
[+52] [2012-04-11 23:59:18] StasK

The strict answer is "no, causation does not necessarily imply correlation".

Consider $X\sim N(0,1)$ and $Y=X^2\sim\chi^2_1$. Causation does not get any stronger: $X$ determines $Y$. Yet the correlation between $X$ and $Y$ is 0. Proof: The (joint) moments of these variables are $E[X]=0$; $E[Y]=E[X^2]=1$; $${\rm Cov}[X,Y]=E[(X-0)(Y-1)] = E[XY]-E[X]\cdot 1 = E[X^3]-E[X]=0$$ using the property of the standard normal distribution that its odd moments are all equal to zero (this can easily be derived from its moment-generating function, say). Hence, the correlation is equal to zero.

To address some of the comments: the only reason this argument works is that the distribution of $X$ is centered at zero and symmetric around 0. In fact, any other distribution with these properties that has a sufficient number of moments would have worked in place of $N(0,1)$, e.g., the uniform on $(-10,10)$ or the Laplace $\sim \exp(-|x|)$. An oversimplified argument is that for every positive value of $X$, there is an equally likely negative value of $X$ of the same magnitude, so when you square $X$, you can't say that greater values of $X$ are associated with greater or smaller values of $Y$. However, if you take, say, $X\sim N(3,1)$, then $E[X]=3$, $E[Y]=E[X^2]=10$, $E[X^3]=36$, and ${\rm Cov}[X,Y]=E[XY]-E[X]E[Y]=36-30=6\neq0$. This makes perfect sense: for each value of $X$ below zero, there is a far more likely value $-X$ above zero, so larger values of $X$ are associated with larger values of $Y$. (The latter has a non-central $\chi^2$ distribution [1]; you can pull the variance from the Wikipedia page and compute the correlation if you are interested.)

[1] http://en.wikipedia.org/wiki/Noncentral-chi-squared-distribution
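Both moment calculations above are easy to confirm by simulation. This quick sketch is my own addition, not StasK's; the sample size is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1_000_000

# Centered case: X ~ N(0, 1), Y = X^2. Theory: Cov(X, Y) = E[X^3] = 0.
x0 = rng.normal(0.0, 1.0, n)
r0 = np.corrcoef(x0, x0 ** 2)[0, 1]

# Shifted case: X ~ N(3, 1). Theory: Cov(X, Y) = 36 - 30 = 6.
x3 = rng.normal(3.0, 1.0, n)
cov3 = np.cov(x3, x3 ** 2)[0, 1]

print(f"corr(X, X^2) for N(0,1): {r0:.4f}")   # ~ 0
print(f"cov(X, X^2)  for N(3,1): {cov3:.3f}")  # ~ 6
```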

Could you explain more fully why $X$ and $Y$ are uncorrelated in this example (or provide a useful reference or keyword that I could use to find more info)? It isn't immediately apparent to me. Thanks. - DQdlM
When I have more time, I will give the full argument with expected values, then. - StasK
(2) @DQdlM: The standard normal random variable has vanishing odd central moments, due to the evenness of the density. Matthew: The answer is no, as StasK has demonstrated, because correlation is not the only type of dependence. - Emre
(3) @DQdlM: see the bottom middle graph in the first image on the Wikipedia correlation page. That's StasK's case. It only works when x is equally distributed about the origin (ie, if $X\sim N(3,1)$, correlation will be fairly high) - naught101
(1) It should also be noted that this answer only applies to linear correlation/covariance. There are other measures of correlation that are non-linear, such as distance correlation, and you can see from that page that StasK's answer won't apply (the distance covariance is non-zero for the $Y=X^2$ case). - naught101
@naught101 your statement "correlation will be fairly high" seems to directly contradict this as an answer, or did I misunderstand? - user10547
@JoshuaDrake, see my edit. Correlation will be positive. - StasK
(1) Won't any distribution with first and third moments of zero work (no symmetry needed)? - cardinal
(3) P.S. I'm so glad you posted this answer. It was hard to believe the question went so long without this answer. This was the exact example that came to my mind when I saw this question, but didn't get the time to write it up. I'm glad you did take the time. Cheers. - cardinal
(3) @cardinal: yeah, I guess we all learned these kinds of simple counterexamples in grad school... and yes, from the derivation of the covariance, you only need the first and the third moments to be zero. If you have a non-trivial example of an asymmetric distribution that has a zero third moment (finely tuned probability masses over five or six points do not count), I'd be very curious to see it though. - StasK
(3) Here 'causality' is being assumed to be expressible as a function. That is, $X$ causes $Y$ if and only if there exists a measurable function, $f$, such that $Y=f(X)$. I guess we could spend the rest of our lives discussing the validity of this argument. - user10525
(1) @Procrastinator: To make much sense of the question from a mathematical framework we have to codify the notion of causality in some way. This doesn't seem too far removed from the way we view randomness itself through the lens of measure theory. I agree that this is a somewhat strict interpretation, but, for example, if we introduce further "causes" then it is sensible to model the situation as a map from a product space to the range. What other ways did you have in mind? - cardinal
(1) @Procrastinator: That's not it: X causes Y if and only if X and Y are dependent and an intervention at Y renders them independent. See Pearl's Causality for details. - Neil G
@cardinal: Causality has been formalized by Pearl. - Neil G
(2) @NeilG: Yes, I am familiar with Pearl and his work, which is not without its controversies. :) - cardinal
@cardinal: :) I'm actually interested in reading about the controversies. I've heard of them, but haven't read much. If you ever find some time to pass along something for me to read, I'd be grateful… :) - Neil G
(1) @NeilG: You might consider asking that as a separate question! I could point you in some direction(s), but you're likely to get answers from others more qualified than myself on the matter. - cardinal
Lack of dependence implies lack of causality. For jointly normal variables, dependence means non-zero correlation. Therefore the term correlation is popularly (but wrongly) used instead of dependence. StasK provides a nice counterexample. So for jointly normal variables, causation implies correlation. - Hunaphu
[+34] [2012-04-11 20:17:37] Fomite

Essentially, yes.

Correlation does not imply causation because there could be other explanations for a correlation beyond cause. But in order for A to be a cause of B, they must be associated in some way, meaning there is a correlation between them, though that correlation does not necessarily need to be linear.

As some of the commenters have suggested, it's likely more appropriate to use a term like 'dependence' or 'association' rather than correlation. Though as I've mentioned in the comments, I've seen "correlation does not mean causation" in response to analysis far beyond simple linear correlation, and so for the purposes of the saying, I've essentially extended "correlation" to any association between A and B.


(20) I tend to reserve the word correlation for linear correlation, and use dependence for nonlinear relations that may or may not have linear correlation. - Memming
(4) @Memming I would too, save for the fact that people trot out "Correlation does not imply causation" re: fairly complex non-linear association. - Fomite
Memming is right. You need to define correlation if you don't mean Pearson correlation. - Neil G
@NeilG There is correlation beyond Pearson. Spearman comes to mind. - Fomite
(1) @NeilG Or for that matter, one may be able to get a linear Pearson correlation by transforming one variable or the other. The problem is the adage itself is over-simplified. - Fomite
(1) @EpiGrad: Both good points. In common parlance, correlation is just more of A coincides with more B. I think your answer would benefit from making your use of a broad definition of correlation clear. - Neil G
@NeilG See the edit :) - Fomite
Consider a button that causes a coin to flip and where this is the only means of flipping the coin. There is perfect causation of the state of the coin and the pushing of the button. But there is no correlation whatsoever. Two things can be causally linked with no correlation whatsoever between them. Another example would be the outputs of a perfect thermostat and the temperature of a room. The former is a causal factor of the latter, but they do not correlate at all (since the temperature is constant). - David Schwartz
[+33] [2017-09-07 01:59:21] Carlos Cinelli

Things are definitely nuanced here. Causation does not imply correlation nor even statistical dependence, at least not in the simple way we usually think about them, or in the way some answers are suggesting (just transforming $X$ or $Y$ etc).

Consider the following causal model:

$$ X \rightarrow Y \leftarrow U $$

That is, both $X$ and $U$ cause $Y$.

Now let:

$$ X \sim \text{Bernoulli}(0.5)\\ U \sim \text{Bernoulli}(0.5) \\ Y = 1 - X - U + 2XU $$

Suppose you don't observe $U$. Notice that $P(Y|X) = P(Y)$. That is, even though $X$ causes $Y$ (in the non-parametric structural equation sense), you don't see any dependence! You can apply any non-linear transformation you want and it won't reveal any dependence, because there isn't any marginal dependence between $Y$ and $X$ here.

The trick is that even though $X$ and $U$ cause $Y$, marginally their average causal effect is zero. You only see the (exact) dependence when conditioning both on $X$ and $U$ together (that also shows that $X\perp Y$ and $U\perp Y$ does not imply $\{X, U\} \perp Y$). So, yes, one could argue that, even though $X$ causes $Y$, the marginal causal effect of $X$ on $Y$ is zero, so that's why we don't see dependence of $X$ and $Y$. But this just illustrates how nuanced the problem is, because $X$ does cause $Y$, not just in the way you naively would think (it interacts with $U$).

So, in short, I would say that: (i) causality suggests dependence; but (ii) the dependence is functional/structural dependence, and it may or may not translate into the specific kind of statistical dependence you are thinking of.
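The model above is small enough to enumerate exactly. Here is a sketch (my own addition) listing all four equally likely $(X, U)$ configurations, verifying both claims: $Y$ is marginally independent of $X$, yet fully determined by the pair $(X, U)$:

```python
import itertools

# All four equally likely (X, U) states of the model Y = 1 - X - U + 2XU,
# i.e. Y = 1 exactly when X == U (an XNOR gate).
states = [(x, u, 1 - x - u + 2 * x * u)
          for x, u in itertools.product([0, 1], repeat=2)]

# Marginally: P(Y=1 | X=x) = 1/2 for both values of x,
# so observing X alone tells you nothing about Y.
p_y1_given_x0 = sum(1 for x, u, y in states if x == 0 and y == 1) / 2
p_y1_given_x1 = sum(1 for x, u, y in states if x == 1 and y == 1) / 2

# Jointly: (X, U) determines Y exactly.
determined = all(y == (1 if x == u else 0) for x, u, y in states)

print(p_y1_given_x0, p_y1_given_x1, determined)  # 0.5 0.5 True
```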


(2) Carlos, is it correct to say that if we know the full set of variables involved in the causal model, this problem (statistical invisibility) disappears? - markowitz
(2) @markowitz you would need to observe everything to the deterministic level, thus not a very realistic scenario. - Carlos Cinelli
I interpret your answer as "yes". You are right, the situation that I supposed is unrealistic; I'm aware of that. However, the question was only about the logic you described, and the goal was to grasp it. My conviction was something like "causation implies statistical association", and other answers on this page sound like this. After all, your example is also slightly unrealistic, but it is not uninteresting for that reason. It seems to me that, in general as well, causation without statistical association is slightly unrealistic but theoretically interesting. - markowitz
(1) @markowitz the "statistical invisibility" happens when the model is not faithful to the graph. For exact cancellation, this depends on a specific choice of parameterization, so some people argue it is indeed unlikely. However, near cancellation might be plausible since it depends on a neighborhood of parameters, so it all depends on context. The point here is just that you need to make your causal assumptions explicit because, logically, causation does not imply association by itself -- you need extra assumptions. - Carlos Cinelli
[+24] [2012-04-11 20:32:37] Peter Flom

Adding to @EpiGrad's answer: I think, for a lot of people, "correlation" will imply "linear correlation", and the concept of nonlinear correlation might not be intuitive.

So, I would say "no they don't have to be correlated but they do have to be related". We are agreeing on the substance, but disagreeing on the best way to get the substance across.

One example of such causation (at least people think it's causal) is that between income and the likelihood of answering your phone. It is known that people at both ends of the income spectrum are less likely to answer their phones than people in the middle. It is thought that the causal pattern is different for the poor (e.g., avoiding bill collectors) and the rich (e.g., avoiding people asking for donations).


[+13] [2012-04-11 20:28:01] gung - Reinstate Monica

The cause and the effect will be correlated unless there is no variation at all in the incidence and magnitude of the cause and no variation at all in its causal force. The only other possibility would be if the cause is perfectly correlated with another causal variable with exactly the opposite effect. Basically, these are thought-experiment conditions. In the real world, causation will imply dependence in some form (although it might not be linear correlation).


+1 for reminding us of the equilibrium situation (which might be more common than you suggest) - conjugateprior
@ConjugatePrior, I would be curious if you knew of such a situation. I can't think of one in real life, just speculation for the sake of conceptual clarity. - gung - Reinstate Monica
(1) Some theories actually imply this, e.g. many game theory models. Some empirical situations where you can't discern a difference (although there would actually be one 'in gung-italics' as it were :-) include 'neutral' no gene change scenarios when evolutionary selection pressure at two levels point in different directions. - conjugateprior
(1) I like the first exception, but not the second exception. I like to think that flipping the switch causes the light to go on, but if I happen to only flip the switch during a blackout nothing happens. Perhaps there was not really a causal relation. - emory
gung, you're talking about general correlation here, right? Most people with a less than advanced stats education are likely to equate that with linear correlation. I think you should clarify that somewhere. - naught101
(1) @naught101, you raise a good point, which has been discussed elsewhere on this page. I have edited my answer. However, when I've worked with people, I don't think they have a strong conception of correlation as necessarily linear, even though I tell them that. Although they wouldn't put it in these terms, I think most people understand 'correlation' as closer to 'function of'. Nonetheless, I should be clearer in my use of terms, and should've been from the start. - gung - Reinstate Monica
(2) @emory: the cause of the light coming on is actually the closure of the electrical circuit (which is caused by the flicking of the switch, with the environmental conditions including a functioning grid). During a blackout, flicking the switch doesn't close the circuit, because it's broken elsewhere. So in a sense, the blackout is the "opposite" effect that gung was talking about (ie. light is on, blackout turns it off). It could also be thought of as a nullifying effect. - naught101
(1) @gung I agree with you about the common interpretation of 'correlated'. I've found people usually use it to mean 'monotonically associated with' or more loosely 'predictable from'. - conjugateprior
[+9] [2019-01-23 13:00:26] Lizzie Silver

There are great answers here. Artem Kaznatcheev [1], Fomite [2] and Peter Flom [3] point out that causation would usually imply dependence rather than linear correlation. Carlos Cinelli [4] gives an example where there's no marginal dependence, because of how the generating function is set up.

I want to add a point about how this dependence can disappear in practice, in the kinds of datasets that you might well work with. Situations like Carlos's example are not limited to mere "thought-experiment conditions".

Dependencies vanish in self-regulating processes. Homeostasis, for example, ensures that your internal body temperature remains independent of the room temperature. External heat influences your body temperature directly, but it also activates the body's cooling systems (e.g., sweating) which keep the body temperature stable. If we sampled temperature at extremely short intervals and with extremely precise measurements, we would have a chance of observing the causal dependence, but at normal sampling rates, body temperature and external temperature appear independent.

Self-regulating processes are common in biological systems; they are produced by evolution. Mammals that fail to regulate their body temperature are removed by natural selection. Researchers who work with biological data should be aware that causal dependencies may vanish in their datasets.

[1] https://stats.stackexchange.com/a/26370/57345
[2] https://stats.stackexchange.com/a/26302/57345
[3] https://stats.stackexchange.com/a/26306/57345
[4] https://stats.stackexchange.com/a/301823/57345
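The homeostasis point can be illustrated with a toy simulation (my own sketch; the coefficients, noise level, and regulation rule are all invented for illustration). Strong regulation shrinks the causally driven variation in "body temperature" to a few hundredths of a degree, so ordinary sensor noise of 0.1 degrees hides the dependence almost entirely:

```python
import numpy as np

rng = np.random.default_rng(1)
T, setpoint = 50_000, 37.0
external = 20 + 10 * rng.standard_normal(T)  # hypothetical outdoor temperature

body = np.empty(T)
b = setpoint
for t in range(T):
    heat_in = 0.001 * (external[t] - b)      # weak external heat influx
    regulation = -0.9 * (b - setpoint)       # strong push back toward setpoint
    b = b + heat_in + regulation
    body[t] = b

# The latent dependence is real (external heat drives what little variation
# body temperature has), but regulation keeps that variation around 0.01 degC.
# At a realistic measurement precision (say 0.1 degC of sensor noise),
# the observed correlation collapses toward zero.
measured = body + 0.1 * rng.standard_normal(T)
r_true = np.corrcoef(external, body)[0, 1]
r_obs = np.corrcoef(external, measured)[0, 1]
print(f"latent corr = {r_true:.2f}, observed corr = {r_obs:.2f}")
```

Under these made-up parameters the latent correlation is close to 1 while the observed one is an order of magnitude smaller, which is the practical "vanishing" the answer describes.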

[0] [2021-12-08 14:58:55] SolingerStuebchen

The answer is: Causation does not imply (linear) correlation.

Assume we have the causal graph: $X \rightarrow Y$, where $X$ is a cause of $Y$, such that, if $X < 0$ we have $Y=X$ and else (if $X \geq 0$) we have $Y=-X$.

Clearly, $X$ is a cause of $Y$. However, when you compute the correlation between instances of $X$ and $Y$, e.g., for the points $X$=[-5,-4,-3,-2,-1,0,1,2,3,4,5] and $Y$=[-5,-4,-3,-2,-1,0,-1,-2,-3,-4,-5], the correlation $corr(X,Y)$ will be 0, even though there is a true causal relationship between $X$ and $Y$, in which the value of $Y$ is solely determined by the value of $X$.

You can try it out here: https://ncalculators.com/statistics/covariance-calculator.htm using the above data vectors.
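The same check can be done directly instead of with the online calculator; a minimal sketch (using numpy for convenience):

```python
import numpy as np

# The data vectors from the answer: Y = X for X < 0, Y = -X for X >= 0.
x = np.array([-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5], dtype=float)
y = np.where(x < 0, x, -x)

# The positive and negative halves cancel, so the correlation is 0
# (up to floating-point roundoff).
r = np.corrcoef(x, y)[0, 1]
print(f"corr(X, Y) = {r}")
```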


[0] [2023-05-20 02:52:30] Alex Michael

I'll add a less technical answer here for the less statistically inclined audience:

One variable (say, $X$) can positively influence another variable (say, $Y$), while not being associated with $Y$, or even being negatively associated with $Y$, if there are confounding factors that distort the association between $X$ and $Y$.

For example, suppose that the very best doctors are put in wards with the highest-needs patients. While doctor quality itself reduces patients' death rates, doctor quality could actually be positively correlated with death rates, because of the confounding variable of the patients' needs.
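A small simulation of this scenario (my own sketch; the assignment rule, coefficients, and risk model are all invented for illustration) shows the sign flip: higher quality lowers the death risk within each ward, yet correlates positively with deaths overall because the best doctors see the sickest patients:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000

severity = rng.integers(0, 2, n)  # 0 = low-needs ward, 1 = high-needs ward
# Hypothetical assignment rule: the best doctors mostly go to high-needs wards.
quality = np.clip(severity + rng.normal(0, 0.3, n), 0, 2)
# Structural model: severity raises death risk, quality lowers it.
death_risk = 0.05 + 0.30 * severity - 0.10 * quality
deaths = (rng.random(n) < death_risk).astype(float)

r_marginal = np.corrcoef(quality, deaths)[0, 1]                       # > 0
r_low = np.corrcoef(quality[severity == 0], deaths[severity == 0])[0, 1]  # < 0
r_high = np.corrcoef(quality[severity == 1], deaths[severity == 1])[0, 1]  # < 0
print(f"marginal: {r_marginal:+.2f}, "
      f"within low-needs: {r_low:+.2f}, within high-needs: {r_high:+.2f}")
```

Conditioning on the confounder (patient needs) recovers the protective effect of quality; ignoring it reverses the sign.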

