[HN Gopher] ML Beyond Curve Fitting: An Intro to Causal Inferenc...
       ___________________________________________________________________
        
       ML Beyond Curve Fitting: An Intro to Causal Inference and Do-
       Calculus (2018)
        
       Author : lnyan
       Score  : 65 points
        Date   : 2021-06-21 09:02 UTC (1 day ago)
        
 (HTM) web link (www.inference.vc)
 (TXT) w3m dump (www.inference.vc)
        
       | benibela wrote:
       | Recently I finished a PhD about causal inference. How can that
       | help my life or career?
        
       | clircle wrote:
       | What are the greatest successes of causal inference? What
       | problems have been solved by the do-calculus or potential
       | outcomes frameworks?
       | 
       | I'm pretty sure the link between smoking and cancer was
       | established before causal inference came about.
        
         | fny wrote:
         | Propensity score matching is probably the one area with the
         | greatest utility. There's a lot of good literature on the
         | subject and it's fairly popular in econometrics and health
         | analytics.
        
         | Fomite wrote:
         | Keep in mind that the timeline from medical research being
         | published to entering practice can sometimes be measured in
         | decades, and causal inference is swimming against the RCT-heavy
         | currents of medical research. Several journals I publish in,
         | for example, expressly forbid causal language for non-RCTs
         | (much to my consternation as a mathematical modeler).
         | 
         | Many of the tools of causal inference were also relatively
         | inaccessible until fairly recently.
         | 
         | I'd argue that the potential outcomes frameworks are really
         | useful from a philosophical standpoint in teaching students,
          | and things like the target trial framework have been doing
         | useful work in conceptualizing observational studies in the
         | context of a hypothetical trial (and recognizing that trials
         | are themselves essentially a special case of cohort studies).
         | 
         | The major source of potential is the places where you can't
         | ethically randomize interventions, yet still need to produce
         | evidence.
        
           | [deleted]
        
           | clircle wrote:
           | Thanks for this. I agree that potential outcomes is very
           | useful, at least on a pedagogical level. I have been reading
           | Imbens and Rubin's book and loving it.
        
         | yenwel wrote:
          | I think causal inference is successful where it is
          | impossible to do an intervention and force a randomized
          | trial on your population. The classical approach, as in
          | agriculture, where you set up a designed field trial to find
          | out the interactions and additive effects, is sometimes not
          | possible. Say you want to check the effect of some economic
          | policy change, or a medical treatment that it would be
          | unethical to refuse to some part of the population.
        
           | cracker_jacks wrote:
           | Can you give an example of how you can get away without an
           | intervention?
        
             | pjmorris wrote:
             | Have a look at the literature on how smoking was
             | established as a cause for cancer. You can't ethically
             | intervene to have non-smokers smoke long enough to develop
              | lung cancer. A lot of money and intellectual effort was
              | spent arguing that correlation does not equal causation
              | in this case.
             | 
             | I'm no expert on the literature here, but Peter Norvig
             | mentions the smoking-cancer example in his article on
             | experiment design [0]. He gets to the same place the
              | causality people do: observational studies.
             | 
             | [0] https://norvig.com/experiment-design.html
        
             | efm wrote:
             | I recommend the book (free online):
             | https://www.hsph.harvard.edu/miguel-hernan/causal-
              | inference-... and the associated Coursera course.
              | Getting causality out of observational data is both
              | simple and subtle.
        
               | 0101010110 wrote:
               | This is true only for a small subset of Causal DAGs even
               | within this 'Causal Calculus'. It can't account for
               | circular causality or discontinuous relationships. That's
               | not to diminish your suggestion, only to contextualise
               | it.
        
               | benibela wrote:
                | There are new theories for cycles:
               | https://www.eur.nl/sites/corporate/files/2018-07/mooij-
               | pup-2...
        
             | dwohnitmok wrote:
             | You can't really, at least not in the sense that I think
             | most people think of it.
             | 
             | You basically need to make some assumptions that are
             | broadly equivalent to assuming you've already correctly
             | guessed certain parts of the underlying causal structure.
             | So in a certain sense you're kind of begging the question,
             | in a way that you wouldn't need to do if you had the
             | ability to do interventions/randomized trials.
             | 
             | That being said causal inference techniques are still very
             | valuable in making explicit exactly what assumptions you're
             | making and how those affect your final conclusion and
             | therefore how to minimize the impact of those assumptions.
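
           [A minimal synthetic sketch of the point above: under an
           _assumed_ causal structure Z -> T, Z -> Y, T -> Y with no
           hidden confounders, the backdoor adjustment formula recovers
           the interventional effect from observational data. All
           numbers and variable names here are invented for
           illustration.]

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Assumed (guessed) causal structure: Z -> T, Z -> Y, T -> Y,
# and crucially no unobserved confounders -- this is the assumption
# that "begs the question" without a randomized trial.
z = rng.binomial(1, 0.5, n)                        # confounder
t = rng.binomial(1, np.where(z == 1, 0.8, 0.2))    # treatment depends on z
y = rng.binomial(1, 0.2 + 0.3 * t + 0.3 * z)       # outcome; true effect = 0.3

# Naive (confounded) contrast: E[Y | T=1] - E[Y | T=0]
naive = y[t == 1].mean() - y[t == 0].mean()

# Backdoor adjustment: E[Y | do(T=t)] = sum_z E[Y | T=t, Z=z] P(Z=z)
def do_effect(t_val):
    return sum(y[(t == t_val) & (z == zv)].mean() * (z == zv).mean()
               for zv in (0, 1))

ate = do_effect(1) - do_effect(0)   # close to the true 0.3; naive is inflated
```

           [The adjusted estimate matches the true effect only because
           the assumed graph is correct; with a hidden confounder the
           same formula would silently give a wrong answer.]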
        
               | 0101010110 wrote:
               | The rules also provide a framework within which you can
               | rule out some causal relationships. So they at least go
               | some way to confirming which hypotheses can't be correct
               | given the data.
        
             | labcomputer wrote:
             | At a high level:
             | 
             | The core idea behind a RCT is that the characteristics of a
             | "unit" (a patient) can't affect which treatment is
             | selected. On average, people who got treatment A are
             | statistically the same as those who got treatment B. So you
             | can assume any difference in outcome is a result of the
             | treatment.
             | 
             | One of the simpler ways to do causal inference is by
             | pairwise matching:
             | 
             | You try to identify what variables make patients different.
             | Then find pairs of units which are "the same" but received
             | different treatments. After the pairing process, your
             | treatment and control groups should ("should" is doing some
             | heavy lifting here) now be statistically "the same" _by
              | construction_. Recall that this is what we were going for
             | in an RCT. If you did everything right, you can now apply
             | all the normal statistical machinery that you would apply
             | to an RCT.
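
           [A toy sketch of the matching procedure just described, on
           synthetic data: greedy 1-to-1 nearest-neighbour matching on
           a single covariate, with a caliper. Globally optimal
           matching is harder, as noted below; every number here is
           made up for illustration.]

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

age = rng.integers(20, 70, n)                    # covariate we match on
p_treat = 1 / (1 + np.exp(-(age - 45) / 10))     # older units more often treated
t = rng.binomial(1, p_treat)
y = 50.0 + 0.5 * age + 5.0 * t + rng.normal(0, 2, n)   # true effect = 5

treated = np.where(t == 1)[0]
control = np.where(t == 0)[0]

# Greedy nearest-neighbour matching: each treated unit takes the
# closest unused control, discarded if no control is within the caliper.
used = set()
diffs = []
for i in treated:
    candidates = [j for j in control if j not in used]
    if not candidates:
        break
    j = min(candidates, key=lambda k: abs(int(age[i]) - int(age[k])))
    if abs(int(age[i]) - int(age[j])) <= 2:      # caliper: reject bad matches
        used.add(j)
        diffs.append(y[i] - y[j])

matched_effect = float(np.mean(diffs))           # near the true 5
```

           [Note how many treated units go unmatched at the old end of
           the age range, where controls are scarce -- exactly the
           "throwing away data" problem in the list below.]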
             | 
             | The challenge is:
             | 
             | 1. Identifying all the variables that make units alike.
             | 
             | 2. You tend to throw away a lot of data, which reduces your
             | statistical power. Even when the treatment classes are
             | balanced, a given unit in class A may not pair up well with
             | any unit from class B.
             | 
             | 3. (Related to 2) Finding globally-optimal pairs of closest
             | matches can be hard.
             | 
              | 4. (Also related to 2) You need _at least some_ overlap
              | between the groups. Sometimes the treatment and control
              | are just so different that _nobody_ pairs up very well.
             | 
             | In some sense, the pairing process is just a re-weighting
             | of your data. People who are similar to someone in the
             | other group have a large weight. People who are unlike the
             | other group have a low weight.
             | 
             | You can generalize that idea a bit and reinvent what's
             | called Inverse Propensity Score Weighting. In this case,
             | you try to model a unit's propensity to receive a
             | treatment, and then use 1/propensity as that unit's weight.
             | 
             | The intuition is: If the model says you were likely to
             | receive treatment B (you have a low propensity for A) and
             | you actually received treatment A, then you are likely to
             | pair up with someone who actually received B. So we should
             | up-weight you.
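
           [A compact synthetic sketch of the inverse-propensity idea
           just described. With a binary confounder the "propensity
           model" reduces to an empirical frequency; in practice one
           would fit, e.g., a logistic regression. All numbers are
           invented for illustration.]

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

z = rng.binomial(1, 0.5, n)                        # observed confounder
t = rng.binomial(1, np.where(z == 1, 0.8, 0.2))    # propensity depends on z
y = 10.0 + 2.0 * t + 3.0 * z + rng.normal(0, 1, n)  # true effect = 2

# Propensity e(z) = P(T=1 | Z=z), here just the empirical rate per stratum.
e = np.array([t[z == 0].mean(), t[z == 1].mean()])[z]

# Inverse propensity weighting: up-weight units that were unlikely
# to receive the treatment they actually got.
ipw_ate = np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))

naive = y[t == 1].mean() - y[t == 0].mean()        # confounded estimate
```

           [The weighted contrast lands near the true effect of 2,
           while the naive contrast absorbs the confounding from Z.]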
        
           | clircle wrote:
           | I'm aware that causal inference is a popular technique in
           | econometrics, and other places where we cannot conduct
           | experiments. What I'm not aware of, is if these techniques
           | have produced highly useful and reliable inferences. (Putting
           | on my counterfactual hat) Are there examples of observational
           | studies that would have failed to change public policy
           | without the techniques of causal inference?
        
       ___________________________________________________________________
       (page generated 2021-06-22 23:01 UTC)