[HN Gopher] Seven basic rules for causal inference
       ___________________________________________________________________
        
       Seven basic rules for causal inference
        
       Author : RafelMri
       Score  : 174 points
       Date   : 2024-08-16 07:14 UTC (3 days ago)
        
 (HTM) web link (pedermisager.org)
 (TXT) w3m dump (pedermisager.org)
        
       | lordnacho wrote:
        | This is brilliant. The whole causal inference thing is
        | something I only came across after university; either I missed
        | it or it is a hole in the curriculum, which is surprising
        | because it seems incredibly fundamental to our understanding
        | of the world.
       | 
        | The thing that made me read into it was a quite interesting
        | sentence from lesswrong, saying that the common idea that
        | correlation does not imply causation is wrong. It's not wrong
        | in the face-value sense; it's wrong in the sense that you can
        | in fact use correlations to learn something about causation,
        | and there turns out to be a whole field of study here.
        
         | Vecr wrote:
         | When did you go to university? The terminology here came from
         | Pearl 2000, and it probably took years and years after that to
         | diffuse out.
        
           | lordnacho wrote:
           | I thought Pearl was writing from 1984 onwards?
           | 
           | I was at university around the millennium.
        
             | janto wrote:
             | Causality (2000) made the topic accessible (to students and
             | lecturers) as a single book.
        
         | jerf wrote:
         | "correlation does not imply causation is wrong"
         | 
          | That's a specific instance of a more general problem with
          | the "logical fallacies", which is that most of them are
          | written to be true in an absolutist, Aristotelian frame. It
          | is true that if two things are correlated you cannot
          | therefore infer a rigid 100% chance that there is a
          | causative relationship there. And that's how Aristotelian
          | logic works; everything is either True or False, and if
          | there is anything else it is at most "Indeterminate", with
          | absolutely, positively, no in-betweens or probabilities or
          | anything else.
         | 
         | However, consider the canonical "logical fallacy":
          | 
          |     1. A -> B.
          |     2. B
          |     3. Therefore, A.
         | 
         | It is absolutely a logical fallacy in the Aristotelian sense.
         | Just because B is there does not mean A is. However,
          | _probabilistically_, if you are uncertain about A, the
          | presence of B _can_ be used to _update_ your expected
          | probability of A. After all, this is exactly what Bayes' rule
         | is for!
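          | 
          | As a minimal sketch in R (the numbers are made up purely for
          | illustration):
          | 
          |     p_a      <- 0.01   # prior P(A)
          |     p_b_a    <- 0.90   # P(B | A): A usually produces B
          |     p_b_nota <- 0.10   # P(B | not A)
          | 
          |     # total probability of observing B
          |     p_b <- p_b_a * p_a + p_b_nota * (1 - p_a)
          | 
          |     # Bayes' rule: posterior P(A | B)
          |     p_a_b <- p_b_a * p_a / p_b
          |     p_a_b   # ~0.083: seeing B lifts P(A) from 1% to ~8%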
         | 
         | Many of the "fallacies" can be rewritten to be useful
         | probabilistically, and aren't quite _as_ fallacious as their
         | many internet devotees fancy.
         | 
         | It is certainly reasonable to be "suspicious" about
         | correlations. There often is a "there" there. Of course,
         | whether you can ever figure out what the "there" is is quite a
         | different question; https://gwern.net/everything really gets in
         | your way. (I also recommend https://gwern.net/causality ).
         | 
          | The upshot is basically 1. the glib dismissal that correlation
          | != causation is, well, too glib and throws away too many
          | things, but 2. it is still true that you generally can't
          | assume causation either. The reality of the situation is
          | exceedingly complicated.
        
           | boxfire wrote:
            | I liked the way Pearl phrased it originally: a calculus of
            | anti-correlations implies causation. That makes the nature
            | of the analysis clear and doesn't set off the classical
            | mind's alarm bells.
        
             | cubefox wrote:
             | Unfortunately this calculus is exceedingly complicated and
             | I haven't even seen a definition of "a causes b" in terms
             | of this calculus. One problem is that Pearl and others make
             | use of the notion of "d-separation". This allows for
              | elegant proofs but is hard to understand. I once found a
              | paper that replaced d-separation with equivalent but more
              | intuitive assumptions about common causes, but I have
              | since forgotten the source.
             | 
              | By the way, there is also an alternative to causal graphs,
              | namely "finite factored sets" by Scott Garrabrant. More
              | alternatives probably exist, though I don't know much
              | about their (dis)advantages.
        
           | geye1234 wrote:
           | I don't disagree with the substance of your comment, but want
           | to clarify something.
           | 
            | Lesswrong promulgated a seriously misleading view of
            | Aristotle as some fussy logician who never observed reality
            | and was unaware of probability, chance, the unknown, and so
            | on. This is entirely false. Aristotle repeats, again and
            | again and again,
           | that we can only seek the degree of certainty that is
            | appropriate for a given subject matter. In the _Ethics_,
           | perhaps his most-read work, he says this, or something like
           | it, at least five times.
           | 
           | I mention this because your association of the words
           | "absolutist" and "Aristotelian" suggests your comment may
           | have been influenced by this.
           | 
           | ISTM that there are two entirely different discussions taking
           | place here, not opposed to each other. "Aristotelian" logic
           | tends to be more concerned with ontology -- measles causes
           | spots, therefore if he has measles, then he will have spots.
           | Whereas the question of probability is entirely
           | epistemological -- we know he has spots, which may indicate
           | he has measles, but given everything else we know about his
           | history and situation this seems unlikely; let's investigate
           | further. Both describe reality, and both are useful.
           | 
            | So the fallacies are _entirely_ fallacious: I don't think
            | your point gainsays this. But I agree that, _to us_, B may
           | suggest A, and it is then that the question of probability
           | comes into play.
           | 
           | Aquinas, who was obviously greatly influenced by Aristotle,
           | makes a similar point somewhere IIRC (I think in SCG when
           | he's explaining why the ontological argument for God's
           | existence fails), so it's not as if this is a new discovery.
        
             | jerf wrote:
             | I consider Aristotelian logic to be a category. It is the
             | Newtonian physics of the logic world; if your fancier logic
             | doesn't have some sort of correspondence principle to
             | Aristotelian logic, something has probably gone wrong. (Or
             | you're so far out in whacky logic land you've left
             | correspondence to the real universe behind you. More power
             | to you, as long as you are aware you've done that.) And
             | like Newton, being the first to figure it out labels
             | Aristotle as a certifiable genius.
             | 
             | See also Euclid; the fact that his geometry turns out not
             | to be The Geometry does not diminish what it means to have
             | blazed that trail. And it took centuries for anyone to find
             | an alternative; that's quite an accomplishment.
             | 
              | If I have a backhanded criticism hiding in my comment, it
              | isn't pointed at Aristotle but at the school system, which
              | may teach some super basic logic at some point and
              | accidentally teach people that that's all logic is, in
              | much the same way stats class accidentally teaches people
              | that everything is uniformly randomly distributed (it
              | makes the homework problems easier, but it reduces the
              | education's value in the real world). That leaves people
              | fairly vulnerable to the lists of fallacies they may find
              | on the internet, and unequipped to realize that they only
              | apply in certain ways, in certain cases. I don't know that
              | I've ever seen such a list that points out the fallacies
              | have some validity in a probabilistic sense. There are
              | also fallacies that are just plain fallacious even so, but
              | I don't generally see them segmented off or anything.
        
               | 082349872349872 wrote:
               | > _...it took centuries for anyone to find an
               | alternative..._
               | 
               | Pedantry: s/centuries/millennia/ (roughly 21 of the
               | former, 2 of the latter?)
               | 
               | EDIT: does anyone remember the quote about problems
               | patiently waiting for our understanding to improve?
        
         | currymj wrote:
          | Rigorous causal inference methods are only now starting to
          | diffuse into the undergraduate curriculum, after gradually
          | becoming part of the mainstream in a lot of social science
          | fields.
          | 
          | Judea Pearl is in some respects a little grandiose, but I
          | think he is right to express shock that it took almost a
          | century to develop to this point, given how long the basic
          | tools of probability and statistics have been mature.
        
       | currymj wrote:
       | Rule 2 ("causation creates correlation") would be strongly
       | disputed by a lot of people. It relies on the assumption of
       | "faithfulness" which is not discussed until the bottom of the
       | article.
       | 
       | This is a very innocent sounding assumption but it's actually
       | quite strong. In particular it may be violated when there are
       | control systems or strategic agents as part of the system you
        | want to study -- which is often the case for causal inference.
        | In such scenarios (e.g. the famous thermostat example) you can
        | have strong causal links that are invisible in the data.
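        | 
        | A minimal sketch in R of the thermostat idea (an idealized
        | controller that exactly cancels the disturbance; not from the
        | article):
        | 
        |     n <- 1000
        |     outside <- rnorm(n)                # disturbance
        |     heater  <- -outside                # controller cancels it
        |     indoor  <- outside + heater + rnorm(n, sd = 0.01)
        | 
        |     cor(heater, indoor)    # ~0 despite heater -> indoor
        |     cor(outside, indoor)   # ~0 despite outside -> indoor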
        
         | apwheele wrote:
         | This was my thought as well.
         | 
         | I don't like showing the scatterplots in these examples, as
         | "correlation" _I think_ is more associated with the correlation
         | coefficient than the more generic independence that the author
         | means in this scenario. E.g. a U shape in the scatterplot may
         | have a zero correlation coefficient but is not conditionally
         | independent.
        
           | bdjsiqoocwk wrote:
           | > E.g. a U shape in the scatterplot may have a zero
           | correlation coefficient but is not conditionally independent.
           | 
           | Ok this is correct, but has nothing to do with causality.
           | Whether or not two variables are correlated and whether or
           | not they are independent, and when one does or doesn't imply
           | the other, is a conversation that can be had without
           | resorting to the concept of causality at all. And in fact
           | that's how the subject is taught at an introductory level
            | basically 100% of the time.
        
             | cubefox wrote:
             | > Ok this is correct, but has nothing to do with causality.
             | 
             | It does. Dependence and independence have a lot to do with
             | causation, as the article explains.
             | 
             | > Whether or not two variables are correlated and whether
             | or not they are independent, and when one does or doesn't
             | imply the other, is a conversation that can be had without
             | resorting to the concept of causality at all.
             | 
             | Yes, but this is irrelevant. It's like saying "whether or
             | not someone is married is a conversation that can be had
             | without resorting to the concept of a bachelor at all".
             | 
             | You can talk about (in)dependence without talking about
             | causation, but you can't talk in detail about causation
             | without talking about (in)dependence.
        
           | ordu wrote:
           | From the article:
           | 
           |  _> NB: Correlated does not mean linearly correlated_
           | 
           |  _> For simplicity, I have used linear correlations in all
           | the example R code. In real life, however, the pattern of
            | correlation/association/mutual information we should expect
           | depends entirely on the functional form of the causal
           | relationships involved._
        
             | dash2 wrote:
             | The standard mathematical definition of correlation means
             | linear correlation. If you are talking about non-
             | independence, it would be better to use that language. This
             | early mistake made me think the author is not really an
             | expert.
        
               | cubefox wrote:
               | What is an appropriate measure of (in)dependence though,
               | if not Pearson correlation? Such that you feed a scatter
               | plot into the formula for this measure, and if the
               | measure returns 0 dependence, the variables are
               | independent.
        
               | currymj wrote:
               | it's a tough problem.
               | 
               | there are various schemes for estimating mutual
               | information from samples. if you do that and mutual
               | information is very close to zero, then I guess you can
               | claim the two rvs are independent. But these estimators
               | are pretty noisy and also often computationally
               | frustrating (the ones I'm familiar with require doing a
               | bunch of nearest-neighbor search between all the points).
               | 
                | I agree with the OP that it's better to say "non-
                | independence" and avoid confusion; at the same time, I
                | disagree that linear correlation is actually the
                | standard definition. In many fields, especially those
                | where nobody ever expects linear relationships, it is
                | not, and everybody uses "correlated" to mean "not
                | independent".
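                | 
                | A crude plug-in sketch in R, using histogram binning
                | rather than the kNN estimators mentioned above
                | (biased; illustration only):
                | 
                |     mi_binned <- function(x, y, bins = 20) {
                |       joint <- table(cut(x, bins), cut(y, bins)) /
                |         length(x)
                |       px <- rowSums(joint)
                |       py <- colSums(joint)
                |       pxy <- outer(px, py)
                |       nz <- joint > 0
                |       sum(joint[nz] * log(joint[nz] / pxy[nz]))
                |     }
                | 
                |     x <- rnorm(1e4)
                |     y <- x^2 + rnorm(1e4, sd = 0.1)
                |     mi_binned(x, y)           # clearly > 0
                |     mi_binned(x, rnorm(1e4))  # near 0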
        
               | cubefox wrote:
               | Yeah. It would be simpler to talk about causal graphs if
               | the nodes represented only events instead of arbitrary
               | variables, because independence between events is much
               | simpler to determine: X and Y are independent iff P(X) *
               | P(Y) = P(X and Y). For events there also exists a measure
               | of dependence: The so-called odds ratio. It is not
               | influenced by the marginal probabilities, unlike Pearson
               | correlation (called "phi coefficient" for events) or
                | pointwise mutual information. Of course, in practice,
                | simplifying variables down to events is usually not an
                | option.
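                | 
                | A quick check in R, with a made-up joint distribution
                | over two events:
                | 
                |     #             Y     not-Y
                |     p <- matrix(c(0.30, 0.20,    # X
                |                   0.10, 0.40),   # not-X
                |                 nrow = 2, byrow = TRUE)
                | 
                |     # odds ratio; equals 1 iff X, Y independent
                |     (p[1,1] * p[2,2]) / (p[1,2] * p[2,1])   # 6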
        
               | rlpb wrote:
               | That seems a bit harsh. People can independently become
               | experts without being familiar with the terminology used
                | by existing experts. Further, if the piece is intended
                | for a non-expert audience, it may even be deliberate to
                | loosen the experts' definitions and stay precise by
                | leaving a note about that instead, which is apparently
                | exactly what this author did.
        
               | dash2 wrote:
               | It's much better to use vocabulary consistently with what
               | everyone else does in the field. Then you don't need to
               | add footnotes correcting yourself. And if you are not
               | familiar with what everyone else means by correlation,
               | you're very unlikely to be an expert. This is not like
               | that Indian mathematician who reinvented huge chunks of
               | mathematics.
        
               | rlpb wrote:
               | > It's much better to use vocabulary consistently with
               | what everyone else does in the field.
               | 
               | Fine, but...
               | 
               | > And if you are not familiar with what everyone else
               | means by correlation, you're very unlikely to be an
               | expert.
               | 
               | Perhaps, but this is not relevant. If there's a problem
               | with this work, then that problem can be criticized
               | directly. There is no need, and it is not useful, to
               | infer "expertise" by indirect means.
        
           | currymj wrote:
          | This is a separate issue and also a good point. "Correlation"
          | sometimes means "Pearson's correlation coefficient" and
          | sometimes means "anything but completely independent", and
          | it's often unclear which. In this context I mean the latter.
        
         | BenoitP wrote:
          | I'd argue you both could be right. Your comment could lead to
          | a definition of intelligence: organisms capable of causally
          | influencing deterministic systems to their advantage can be
          | marked as intelligent, and the complexity involved would
          | determine the degree of intelligence.
          | 
          | Your point is also great in that it pinpoints the notion of
          | agency scopes. In all the causal DAGs it feels like there are
          | implicit regions: ones where we can influence or not,
          | intervene or not, observe or not, and where one is
          | responsible or not.
          | 
          | An intelligent agent is one capable of modelling a system,
          | influencing it, and biasing it such that it can reach and
          | exploit an existing corner case of it. I say corner case
          | because of entropy and Murphy's law: for a given energy,
          | there are far more disadvantageous states than advantageous
          | ones. And the intelligence of a system is the complexity
          | required to wield the entropy reduction of an energy source.
        
           | joe_the_user wrote:
            | Two problems with this: 1. There are many other ways that
            | correlation can fail to imply causation. 2. The phenomenon
            | the GP describes doesn't require broad intelligence, just
            | reactiveness -- a thermostat or a guided missile could have
            | this.
        
         | bubblyworld wrote:
         | For anyone else who went down a rabbit hole - this paper
         | describes the problem control systems present for these
         | methodologies:
         | https://www.sciencedirect.com/science/article/abs/pii/B97801...
         | 
         | (paywalled link, but it's available on a well-known useful
         | website)
        
         | SpaceManNabs wrote:
         | My fav way to intuit this is this example
         | 
         | https://stats.stackexchange.com/questions/85363/simple-examp...
         | 
         | Blew my mind the first time I saw it.
         | 
         | Not the same definitions one to one (author specifically talks
         | about correlation vs linear correlation) but same idea.
        
         | kyllo wrote:
         | Indeed, causally linked variables need not be correlated in
         | observed data; bias in the opposite direction of the causal
         | effect may approximately equal or exceed it in magnitude and
         | "mask" the correlation. Chapter 1 of this popular causal
         | inference book demonstrates this with a few examples:
         | https://mixtape.scunning.com/01-introduction#do-not-confuse-...
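          | 
          | A minimal sketch in R of such masking (coefficients chosen
          | so the confounding exactly cancels the causal effect):
          | 
          |     n <- 1e5
          |     u <- rnorm(n)              # confounder
          |     x <- u + rnorm(n)          # U -> X
          |     y <- x - 2 * u + rnorm(n)  # X -> Y (+1), U -> Y (-2)
          | 
          |     cor(x, y)            # ~0: confounding masks the effect
          |     coef(lm(y ~ x + u))  # adjusting for U recovers ~1 on x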
        
       | Vecr wrote:
       | Are the assumptions "No spurious correlation", "Consistency", and
       | "Exchangeability" ever actually true? If a dataset's big enough
       | you should generally be able to find at least one weird
       | correlation, and the others are limits of doing statistics in the
       | real world.
        
         | levocardia wrote:
         | Some situations guarantee certain assumptions: Randomization,
         | for example, guarantees exchangeability.
        
       | shiandow wrote:
       | This is missing my favourite rule.
       | 
       | 0. The directions of all arrows not part of a collider are
       | statistically meaningless.
        
         | Vecr wrote:
         | What's not part of a collider? Good luck with your memory in
         | that case.
        
           | 082349872349872 wrote:
            | I'm guessing they mean that given a bunch of correlated
            | nodes but no collider (in which case the causal graph must
            | be a tree of some sort) you not only don't know whether the
            | tree is bushy or linear, you don't even know which node may
            | be the root.
           | 
           | (bushy trees, of which there are very many compared with
           | linear ones, would be an instance of Gwern's model* of
           | confounds being [much] more common than causality?)
           | 
           | * https://news.ycombinator.com/item?id=41291636
        
             | Vecr wrote:
              | Right, but your memory functions as a collider; if there
              | are literally no colliders anywhere, you by definition
              | won't be able to remember anything.
        
       | dkga wrote:
       | I highly suggest this paper here for a more complete view of
       | causality that nests do-calculus (at least in economics):
       | 
       | Heckman, JJ and Pinto, R. (2024): "Econometric causality: The
       | central role of thought experiments", Journal of Econometrics,
       | v.243, n.1-2.
        
         | fn-mote wrote:
         | Why should you look this paper up? It argues that certain
         | approaches from statistics and computer science are limited,
         | and (essentially) that economists have a better approach. YMMV,
         | but the criticisms are specific (whether or not you buy the
         | "fix").
         | 
         | From the paper:
         | 
         | > Each of the recent approaches holds value for limited classes
         | of problems. [...] The danger lies in the sole reliance on
         | these tools, which eliminates serious consideration of
         | important policy and interpretation questions. We highlight the
         | flexibility and adaptability of the econometric approach to
         | causality, contrasting it with the limitations of other causal
         | frameworks.
        
       | Rhapso wrote:
        | I'm keeping this link, taking a backup, and handing it out
        | whenever I can. It is succinct and effective.
        | 
        | These are concepts I find myself constantly having to explain
        | and teach, and they are critical to problem solving.
        
       | 082349872349872 wrote:
       | Can these seven be reduced to three basic rules?
       | 
       | - controlling for a node increases correlation among pairs where
       | both are ancestors
       | 
       | - controlling for a node does not affect (the lack of)
       | correlation among pairs where at least one is categorically
       | unrelated (shares no ancestry with that node)
       | 
       | - controlling for a node decreases correlation among pairs where
       | both are related but at least one is not an ancestor
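        | 
        | A minimal sketch in R of the first and third rules (Z is a
        | collider, M a mediator; illustration only):
        | 
        |     n <- 1e5
        |     x <- rnorm(n); y <- rnorm(n)
        |     z <- x + y + rnorm(n)   # collider: x -> z <- y
        |     m <- x + rnorm(n)       # mediator: x -> m
        |     w <- m + rnorm(n)       #           m -> w
        | 
        |     cor(x, y)   # ~0 before controlling
        |     # controlling for the collider z creates correlation:
        |     cor(resid(lm(x ~ z)), resid(lm(y ~ z)))   # negative
        | 
        |     cor(x, w)   # positive before controlling
        |     # controlling for the mediator m removes it:
        |     cor(resid(lm(x ~ m)), resid(lm(w ~ m)))   # ~0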
        
       | raymondh wrote:
       | Is there a simple R example for Rule 4?
        
         | elsherbini wrote:
          | It is sort of tautological:
          | 
          |     # variable A has three causes: C1, C2, C3
          |     C1 <- rnorm(100)
          |     C2 <- rnorm(100)
          |     C3 <- rnorm(100)
          | 
          |     A <- ifelse(C1 + C2 + C3 > 1, 1, 0)
          | 
          |     cor(A, C1)
          |     cor(A, C2)
          |     cor(A, C3)
          | 
          |     # If we set the values of A ourselves...
          |     A <- sample(c(1, 0), 100, replace = TRUE)
          | 
          |     # ...then A no longer correlates with its natural causes
          |     cor(A, C1)
          |     cor(A, C2)
          |     cor(A, C3)
        
       | abeppu wrote:
       | At the bottom, the author mentions that by "correlation" they
       | don't mean "linear correlation", but all their diagrams show the
       | presence or absence of a clear linear correlation, and code
       | examples use linear functions of random variables.
       | 
       | They offhandedly say that "correlation" means "association" or
       | "mutual information", so why not just do the whole post in terms
       | of mutual information? I _think_ the main issue with that is just
       | that some of these points become tautologies -- e.g. the first
       | point,  "independent variables have zero mutual information" ends
       | up being just one implication of the definition of mutual
       | information.
        
         | jdhwosnhw wrote:
          | This isn't a correction to your post, but a clarification for
          | other readers: correlation implies dependence, but dependence
          | does not imply correlation. By contrast, two variables share
          | non-zero mutual information if and only if they are dependent.
        
           | islewis wrote:
           | Could you give some examples of dependence without
           | correlation?
        
             | xtacy wrote:
             | You can check the example described here:
             | https://stats.stackexchange.com/questions/644280/stable-
             | viol...
             | 
             | Judea Pearl's book also goes into the above in some detail,
             | as to why faithfulness might be a reasonable assumption.
        
             | abeppu wrote:
             | A clear graphical set of illustrations is the bottom row in
             | this famous set: https://en.wikipedia.org/wiki/Correlation#
             | /media/File:Correl...
             | 
             | They have clear dependence; if you imagine fixing
             | ("conditioning") x at a particular value and looking at the
             | distribution of y at that value, it's different from the
             | overall distribution of y (and vice versa). But the
             | familiar linear correlation coefficient wouldn't indicate
             | anything about this relationship.
        
             | kyllo wrote:
             | > A sailor is sailing her boat across the lake on a windy
             | day. As the wind blows, she counters by turning the rudder
             | in such a way so as to exactly offset the force of the
             | wind. Back and forth she moves the rudder, yet the boat
             | follows a straight line across the lake. A kindhearted yet
             | naive person with no knowledge of wind or boats might look
             | at this woman and say, "Someone get this sailor a new
             | rudder! Hers is broken!" He thinks this because he cannot
             | see any relationship between the movement of the rudder and
             | the direction of the boat.
             | 
             | https://mixtape.scunning.com/01-introduction#do-not-
             | confuse-...
        
             | gweinberg wrote:
              | Imagine your data points look like a U. There's no
              | (linear) correlation between x and y; you are equally
              | likely to have a high value of y when x is high or low.
              | But low values of y are associated with medium values of
              | x, and a high value of y means x will be very high or
              | very low.
        
             | crystal_revenge wrote:
             | I mentioned it in another comment, but the most trivial
             | example is:
             | 
             | X ~ Unif(-1,1)
             | 
             | Y = X^2
             | 
             | In this case X and Y have a correlation of 0.
        
           | westurner wrote:
           | By that measure, all of these Spurious Correlations indicate
            | _insignificant_ dependence, which isn't of utility:
           | https://www.tylervigen.com/spurious-correlations
           | 
            | Isn't it possible to contrive an example where a test of
            | pairwise dependence causes the statistician to err by
            | excluding relevant variables from tests of more complex
            | relations?
           | 
            | Trying to remember which of these factor both P(A|B) and
            | P(B|A) into the test.
        
             | abeppu wrote:
             | I think you're using the word "insignificant" in a possibly
             | misleading or confusing way.
             | 
             | I think in this context, the issue with the spurious
             | correlations from that site is that they're all time series
             | for overlapping periods. Of course, the people who
             | collected these understood that time was an important
             | causal factor in all these phenomena. In the graphical
             | language of this post:
             | 
             | T --> X_i
             | 
             | T --> X_j
             | 
             | Since T is a common cause to both, we should expect to see
             | a mutual information between X_i, X_j. In the paradigm
             | here, we could try to control for T and see if a
             | relationship persists (i.e. perhaps in the same month,
             | collect observations for X_i, X_j in each of a large number
              | of locales), and get a signal on whether the shared
              | dependence on time is the _only_ link.
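              | 
              | A minimal sketch in R (made-up trends; T is the shared
              | cause):
              | 
              |     t  <- 1:200
              |     xi <- 0.5 * t + rnorm(200, sd = 10)    # T -> X_i
              |     xj <- -0.3 * t + rnorm(200, sd = 10)   # T -> X_j
              | 
              |     cor(xi, xj)   # strongly negative, purely via T
              |     # controlling for T removes the association:
              |     cor(resid(lm(xi ~ t)), resid(lm(xj ~ t)))   # ~0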
        
         | bjornsing wrote:
         | I'd be more interested in those tautologies nonetheless. Much
         | better than literally untrue statements that I have to somehow
         | decipher.
        
       | levocardia wrote:
       | >Controlling for a collider leads to correlation
       | 
       | This is a big one that most people are not aware of. Quite often,
       | in economics, medicine, and epidemiology, you'll see researchers
       | adjust for everything in their regression model: income, physical
       | activity, education, alcohol consumption, BMI, ... without
       | realizing that they could easily be inducing collider bias.
       | 
       | A much better, but rare, approach is to sit down with some
       | subject matter experts and draft up a DAG - directed acyclic
       | graph - that makes your assumptions about the causal structure of
       | the problem explicit. Then determine what needs to be adjusted
       | for in order to get a causal estimate of the effect. When you're
       | explicit about your causal assumptions, it makes it easier for
       | other researchers to propose different causal structures, and see
       | if your results still hold up under alternative causal
       | structures.
       | 
       | The DAGitty tool [1] has some cool examples.
       | 
       | [1] https://www.dagitty.net/dags.html
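        | 
        | For what it's worth, the dagitty R package can do the "what to
        | adjust for" step programmatically. A sketch with a confounder
        | C and a collider K (my toy graph, not one of the site's
        | examples):
        | 
        |     library(dagitty)   # install.packages("dagitty")
        |     g <- dagitty("dag { C -> X ; C -> Y ;
        |                         X -> Y ;
        |                         X -> K ; Y -> K }")
        |     adjustmentSets(g, exposure = "X", outcome = "Y")
        |     # returns { C }; the collider K is in no adjustment set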
        
         | kyllo wrote:
          | Collider bias, or "Berkson's paradox", is a fun one; there
          | are lots of examples of it in everyday life:
          | https://en.wikipedia.org/wiki/Berkson%27s_paradox
        
       | chrsig wrote:
       | > Rule 8: Controlling for a causal descendant (partially)
       | controls for the ancestor
       | 
       | perhaps this is a quaint or wildly off base question, but an
       | honest one, please forgive any ignorance:
       | 
        | Isn't this essentially defining the partial derivative? Could
        | one arrive at the calculus definition of a partial derivative
        | by following this?
        
         | bubblyworld wrote:
         | You probably could if you interpret that sentence very
         | creatively. But I think it's useful to remember that this is
         | mathematics, and words like "control", "descendant" and
         | "ancestor" have specific technical meanings (all defined in the
         | article, I believe).
         | 
         | The technical meaning of that sentence has to do with
         | probability theory (probability distributions, correlation,
         | conditionals), and not so much calculus (differentiable
         | functions, limits, continuity).
        
       | nomilk wrote:
        | Humble reminder of how easy R is to use. Download and install R
        | for your operating system: https://cran.r-project.org/bin/
        | 
        | Start it in the terminal by typing:
        | 
        |     R
        | 
        | Copy/paste the code from the article to see it run!
        
         | curiousgal wrote:
          | Can't use R without RStudio. It's so much better than the
          | terminal.
        
           | nomilk wrote:
           | Agree RStudio makes R a dream, but isn't necessary for
           | someone to run the code in the article =)
        
           | throwway_278314 wrote:
           | really??? I've developed in R for over a decade using two
           | terminal windows. One runs vim, the other runs R. Keyboard
           | shortcuts to send R code from vim to R.
           | 
           | first google hit if you want to try this yourself:
           | https://www.freecodecamp.org/news/turning-vim-into-an-r-
           | ide-...
           | 
           | Sooooooo much better than "notebooks". Hating on "notebooks"
           | today.
        
         | carlmr wrote:
         | >Humble reminder of how easy R is to use.
         | 
         | I had to learn R for a statistics course. This was a long time
         | ago. But coming from a programming background I never found any
         | other mainstream language as hard to grok as R.
         | 
         | Has this become better? Is it just me that doesn't get it?
        
         | incognito124 wrote:
         | R is my least favorite language to use, thanks to the uni
         | courses that force it
         | 
         | https://github.com/ReeceGoding/Frustration-One-Year-With-R
        
       | crystal_revenge wrote:
       | > Independent variables are not correlated
       | 
        | But it's important to remember that _dependent_ variables can
        | also be _not correlated_. That is, _no correlation_ does _not_
        | imply independence.
       | 
       | Consider this trivial case:
       | 
       | X ~ Uniform(-1,1)
       | 
       | Y = X^2
       | 
       | Cor(X,Y) = 0
       | 
       | Despite the fact that Y's value is absolutely determined by the
       | value of X.
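        | 
        | A quick numerical check in R (the sample correlation will be
        | near zero, not exactly zero):
        | 
        |     x <- runif(1e5, -1, 1)
        |     y <- x^2
        |     cor(x, y)   # ~0, yet y is a deterministic function of x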
        
         | TheRealPomax wrote:
         | This is also why it's important to _look at your plots_.
         | Because simply looking at your scatter plot makes it really
          | obvious what methods you _can't_ use, even if it doesn't
         | really tell you anything about what you _should_ use.
        
         | antognini wrote:
         | The author is using "correlation" in a somewhat non-standard
         | way. He isn't referring to linear correlation as you are, but
         | any sort of nonzero mutual information between the two
         | variables. So in his usage those two variables are "correlated"
         | in your example.
        
       ___________________________________________________________________
       (page generated 2024-08-19 23:00 UTC)