Can we nevertheless extract causality from correlation?
I would argue that, theoretically, we cannot. Practically speaking, however, we frequently settle for “very, very convincing correlations” as indicative of causation. A correlation may be persuasively described as causation if three conditions are met:
Completeness: The association itself (R²) is 100%. When we observe X, we always observe Y.
No bias: The association between X and Y is not affected by a third, omitted variable, Z.
Temporality: X temporally precedes Y.
I feel like you have this backwards. In the assignment Y := 2X, each unit of Y is caused by half a unit of X. In a game where we flip a coin at fair odds, if you have increased your wealth 8× in 3 tosses, that was caused by you getting heads on every toss. Theoretically, establishing causality is trivial.
The problem comes when we try to do so practically, because reality is full of surprising detail.
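To make the coin example concrete, here is a minimal sketch. It assumes (this is my reading, not stated above) that you stake your whole wealth on every toss at even odds: heads doubles it, tails wipes it out. The function name `final_multiplier` is made up for illustration.

```python
from itertools import product

def final_multiplier(tosses):
    """Wealth multiplier after a sequence of tosses.

    Assumption: the whole stake rides on every toss at even odds,
    so heads doubles the wealth and tails loses all of it.
    """
    wealth = 1
    for toss in tosses:
        wealth = wealth * 2 if toss == "H" else 0
    return wealth

# Enumerate all 2**3 possible sequences of three tosses.
outcomes = {"".join(seq): final_multiplier(seq) for seq in product("HT", repeat=3)}

# Which sequences end with 8x the starting wealth?
winners = [seq for seq, mult in outcomes.items() if mult == 8]
print(winners)  # ['HHH']
```

Inside the model, the rules fully determine the outcome, so reading off the cause of the 8× result is just enumeration.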
> No bias: The association between X and Y is not affected by a third, omitted variable, Z.
This is, practically speaking, the difficult condition. I'm not so convinced the others are necessary (practically speaking, anyway) but you should read Pearl if you're into this!
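A toy simulation of why the omitted variable is the hard part, as a sketch only. Everything here is an assumption for illustration: Z drives both X and Y, X has no effect on Y at all, and the noise is Gaussian. The helpers `pearson` and `partial_corr` are hand-written, not from any library.

```python
import math
import random

def pearson(a, b):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000

# Assumed data-generating process: Z causes X and Z causes Y; X does NOT cause Y.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson(x, y)
r_xy_given_z = partial_corr(x, y, z)
print(f"corr(X, Y)     = {r_xy:.3f}")
print(f"corr(X, Y | Z) = {r_xy_given_z:.3f}")
```

The raw correlation comes out clearly positive, while the correlation after controlling for Z is close to zero. The catch is that this adjustment only works because Z was measured here; in practice you have to know about Z first.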
You probably also need at least:
- Y does not appear when X does not
- We need an overwhelming sample size containing examples of both X and not X
- The experiment and data collection are trivially repeatable (so that we don't need to rely on trust)
- The experiment, data collection and analysis must be easy to understand and sensible in every way without leaving room for error
And as another commenter already pointed out: you can't really rule out the existence of an unknown Z.
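The first two checks in the list above are mechanical enough to sketch in code. The function name `check_conditions` and the toy observations are invented for illustration. Passing these checks says nothing about an unknown Z.

```python
def check_conditions(observations):
    """Run simple checks on a list of (x, y) boolean pairs.

    x is True when X was observed, y is True when Y was observed.
    """
    y_with_x = [y for x, y in observations if x]
    y_without_x = [y for x, y in observations if not x]
    return {
        # The sample must contain both X and not-X cases.
        "has_x_and_not_x": bool(y_with_x) and bool(y_without_x),
        # Whenever X is observed, Y is observed too.
        "y_whenever_x": bool(y_with_x) and all(y_with_x),
        # Y does not appear when X does not.
        "no_y_without_x": bool(y_without_x) and not any(y_without_x),
    }

# Made-up observations: the last pair shows Y without X.
toy = [(True, True), (True, True), (False, False), (False, True)]
result = check_conditions(toy)
print(result)
```

With this toy data the last check fails, because one observation has Y without X.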
In general you assume DAGs (directed acyclic graphs), i.e. acyclic causality. Cyclical relations must be resolved through distinct temporal steps, i.e. u_t0 causes v_t1 and v_t1 causes u_t2. When your measurement precision only captures the simultaneous effects of u on v and v on u, you have a problem.
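A small sketch of that unrolling, using Python's standard `graphlib` module to test for cycles. The helpers `is_acyclic` and `unroll` and the node naming scheme are my own choices for illustration.

```python
from graphlib import CycleError, TopologicalSorter

def is_acyclic(edges):
    """Return True if the (cause, effect) edge list forms a DAG."""
    sorter = TopologicalSorter()
    for cause, effect in edges:
        sorter.add(effect, cause)  # effect depends on cause
    try:
        sorter.prepare()  # raises CycleError if a cycle exists
        return True
    except CycleError:
        return False

def unroll(edges, steps):
    """Replace each edge a -> b by a_t -> b_(t+1) for t = 0 .. steps-1."""
    return [
        (f"{a}_t{t}", f"{b}_t{t + 1}")
        for t in range(steps)
        for a, b in edges
    ]

cyclic = [("u", "v"), ("v", "u")]   # u causes v and v causes u
unrolled = unroll(cyclic, steps=2)  # contains u_t0 -> v_t1 and v_t1 -> u_t2

print(is_acyclic(cyclic))    # False
print(is_acyclic(unrolled))  # True
```

The unrolled graph is only usable if the measurements can tell t0, t1 and t2 apart. If they can't, the time-indexed nodes collapse back into the cyclic graph.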