
I have reflected on a good definition of causality and would be curious if anyone has thoughts or critiques of it. I am repasting part of my essay below. (https://alexpetralia.com/2023/02/25/statistics-only-gives-co...)

--

Can we nevertheless extract causality from correlation?

I would argue that, theoretically, we cannot. Practically speaking, however, we frequently settle for “very, very convincing correlations” as indicative of causation. A correlation may be persuasively described as causation if three conditions are met:

Completeness: The association itself (R²) is 100%. When we observe X, we always observe Y.

No bias: The association between X and Y is not affected by a third, omitted variable, Z.

Temporality: X temporally precedes Y.



I feel like you have this backwards. In the assignment Y:=2X, each unit of Y is caused by half a unit of X. In the game where we flip a coin at fair odds, if you have increased your wealth by 8× in 3 tosses, that was caused by you getting heads every toss. Theoretically establishing causality is trivial.
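To make the coin example concrete, here is a minimal sketch that enumerates every 3-toss sequence. It assumes a double-or-nothing game (the whole stake is bet each toss, heads doubles it, tails loses it); that betting rule and the name `final_multiplier` are my assumptions, not something stated above.

```python
from itertools import product

def final_multiplier(tosses):
    """Wealth multiplier after betting everything on each toss (assumed rule)."""
    wealth = 1.0
    for toss in tosses:
        # Heads doubles the stake; tails wipes it out.
        wealth = wealth * 2 if toss == "H" else 0.0
    return wealth

# All 2**3 possible sequences and the multiplier each one produces.
outcomes = {"".join(seq): final_multiplier(seq) for seq in product("HT", repeat=3)}

# Sequences that end with 8x the starting wealth.
eightfold = [seq for seq, mult in outcomes.items() if mult == 8.0]
print(eightfold)
```

Under that rule only one sequence reaches 8×, which is why the cause can be read straight off the model.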

The problem comes when we try to do so practically, because reality is full of surprising detail.

> No bias: The association between X and Y is not affected by a third, omitted variable, Z.

This is, practically speaking, the difficult condition. I'm not so convinced the others are necessary (practically speaking, anyway) but you should read Pearl if you're into this!
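A small simulation may help show why the omitted variable is the hard part. In this sketch Z drives both X and Y, and X has no effect on Y at all. The coefficients, noise levels, sample size and the helper `pearson` are all assumptions chosen for illustration.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)
n = 20000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]  # X depends only on Z
y = [zi + rng.gauss(0, 1) for zi in z]  # Y depends only on Z, not on X

r_xy = pearson(x, y)

# Adjust for Z by removing its contribution. The coefficient 1 is known here
# only because we wrote the simulation; with real data it must be estimated.
x_resid = [xi - zi for xi, zi in zip(x, z)]
y_resid = [yi - zi for yi, zi in zip(y, z)]
r_adjusted = pearson(x_resid, y_resid)

print(round(r_xy, 2), round(r_adjusted, 2))
```

The raw correlation comes out sizeable even though X never touches Y, and it vanishes once Z is accounted for. The catch in practice is that you can only adjust for a Z you know about and have measured.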


You are missing one crucial additional condition:

- No colliders have been included in the analysis; conditioning on one would introduce an appearance of causality that does not exist
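A rough sketch of the collider problem, under assumed numbers: X and Y are generated independently, both feed into C, and we then look only at cases where C is large. The threshold, noise scale and the helper `pearson` are illustrative assumptions.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(1)
n = 20000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]  # generated independently of x
# C is a collider: both X and Y cause it.
c = [xi + yi + rng.gauss(0, 0.5) for xi, yi in zip(x, y)]

r_all = pearson(x, y)

# Condition on the collider by keeping only rows where C is large.
kept = [(xi, yi) for xi, yi, ci in zip(x, y, c) if ci > 1.0]
r_selected = pearson([p[0] for p in kept], [p[1] for p in kept])

print(round(r_all, 2), round(r_selected, 2))
```

In the full sample X and Y are uncorrelated, but within the selected rows a clear negative association appears, purely as an artifact of selecting on C.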


You probably also need at least:

- Y does not appear when X does not
- We need an overwhelming sample size containing examples of both X and not X
- The experiment and data collection are trivially repeatable (so that we don't need to rely on trust)
- The experiment, data collection and analysis must be easy to understand and sensible in every way without leaving room for error

And as another commenter already pointed out: You can't really eradicate the existence of an unknown Z


Lightning doesn't cause fire because I have observed fire created by matches under a blue sky.

(I've also observed lightning that was not followed by fire. We really need to stop wasting money on lightning rods.)


Ruling out all Z is the almost-impossible part. It's hard to prove a negative, especially with incomplete information.


What of the double slit experiment, where observation changes the outcome? Do we call observation the cause of the outcome?


In general you assume DAGs, i.e. non-cyclical causality. Cyclical relations must be resolved through distinct temporal steps, i.e. u_t0 causes v_t1 and v_t1 causes u_t2. When your measurement precision only captures simultaneous effects of both u on v and v on u you have a problem.
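The time-indexing trick can be sketched with the standard library's `graphlib`. The node names follow the comment above; the helper `is_acyclic` and the dictionary layout (node mapped to its set of direct causes) are my own choices for illustration.

```python
from graphlib import TopologicalSorter, CycleError

def is_acyclic(causes):
    """Return True if the graph (node -> set of direct causes) has no cycle."""
    try:
        list(TopologicalSorter(causes).static_order())
        return True
    except CycleError:
        return False

# Without time indices, u causes v and v causes u: a cycle, so not a DAG.
collapsed = {"v": {"u"}, "u": {"v"}}

# With distinct time steps, u_t0 causes v_t1 and v_t1 causes u_t2.
unrolled = {"v_t1": {"u_t0"}, "u_t2": {"v_t1"}}

collapsed_ok = is_acyclic(collapsed)
unrolled_ok = is_acyclic(unrolled)
causal_order = list(TopologicalSorter(unrolled).static_order())

print(collapsed_ok, unrolled_ok, causal_order)
```

The collapsed graph is what you are effectively left with when measurements are too coarse to separate t0, t1 and t2, and that is the case where the DAG assumption breaks down.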



