[HN Gopher] Beautiful Probability
___________________________________________________________________
Beautiful Probability
Author : codeAligned
Score : 54 points
Date : 2024-02-23 18:43 UTC (4 hours ago)
(HTM) web link (www.readthesequences.com)
(TXT) w3m dump (www.readthesequences.com)
| jawarner wrote:
| Isn't that Edwin T. Jaynes example just p-hacking? If only 1 out
| of 100 experiments produces a statistically significant result,
| and you only report the one, I would intuitively consider that
| evidence to be worth less. Can someone more versed in Bayesian
| statistics better explain the example?
| skulk wrote:
| I find the original discussion to be far more interesting than
| whatever I just read in TFA:
| https://books.google.com.mx/books?id=sLz0CAAAQBAJ&pg=PA13&lp...
| usgroup wrote:
| Yeah, generally Jaynes's book is very nice and easy to read for
| this sort of material.
| abeppu wrote:
| > One who thinks that the important question is: "Which
| quantities are random?" is then in this situation. For the
| first researcher, n was a fixed constant, r was a random
| variable with a certain sampling distribution. For the second
| researcher, r/n was a fixed constant (approximately), and n
| was the random variable, with a very different sampling
| distribution. Orthodox practice will then analyze the two
| experiments in different ways, and will in general draw
| different conclusions about the efficacy of the treatment
| from them.
|
| But so then the data _are_ different between the two
| experiments, because they were observing different random
| variables -- so why is it concerning if they arrive at
| different conclusions? In fact, the _fact that the 2nd
| experiment finished_ is also an observation on its own (e.g.
| if the treatment was in fact a dangerous poison, perhaps it
| would have been infeasible for the 2nd researcher to reach
| their stopping criteria).
| usgroup wrote:
| Well, no, because it's talking about either a fixed sample size
| or stopping when a % total is reached. Neither necessarily
| implies a favourable p-value.
|
| I think the author means that the two methods happen to be
| equivalent in the data they collect, yet may draw different
| conclusions based on their initial assumptions. The question is
| how to make coherent sense of that.
|
| At level 1 depth it's insightful.
|
| At level 2 depth it's a straw man.
|
| At level 3 depth, just keep drinking until you're back at level
| 1 depth.
| tech_ken wrote:
| > The other ... decided he would not stop until he had data
| indicating a rate of cures definitely greater than 60%
|
| I believe that "definitely greater than 60%" is supposed to
| imply that the researcher is stopping when the p-value for
| their H_A (theta >= 60%) falls below alpha, i.e. an optional
| stopping ("p-hacking") situation.
| Terr_ wrote:
| I think the point is that the different _planned_ stopping
| rules of each researcher--their subjective thoughts--should
| _not_ affect what we consider the objective or mathematical
| significance of their otherwise-identical process and results.
| (Not unless humans have psychic powers.)
|
| It's illogical to deride one of those two result-sets as
| telling us less about the objective universe just because the
| researcher had a different private intent (e.g. "p-hacking")
| for stopping at n=100.
|
| _________________
|
| > According to old-fashioned statistical procedure [...] It's
| quite possible that the first experiment will be "statistically
| significant," the second not. [...]
|
| > But the likelihood of a given state of Nature producing the
| data we have seen, has nothing to do with the researcher's
| private intentions. So whatever our hypotheses about Nature,
| the likelihood ratio is the same, and the evidential impact is
| the same, and the posterior belief should be the same, between
| the two experiments. At least one of the two Old Style methods
| must discard relevant information--or simply do the wrong
| calculation--for the two methods to arrive at different
| answers.
| lalaithion wrote:
| If you have two researchers, and one is "trying" to p-hack by
| repeating an experiment with different parameters, and one is
| trying to avoid p-hacking by preregistering their parameters,
| you might expect the paper published by the latter one to be
| more reliable.
|
| However, if you know that the first researcher just happened to
| get a positive result on their first try (and therefore didn't
| actually have to modify parameters), Bayesian math says that
| their intentions didn't matter, only their result. If, however,
| they did 100 experiments and chose the best one, then their
| intentions... still don't matter! but their behavior does
| matter, and so we can discount their paper.
|
| Now, if you _only_ know their intentions but not their final
| behavior (because they didn't say how many experiments they did
| before publishing), then their intentions matter because we can
| predict their behavior based on their intentions. But once you
| know their behavior (how many experiments they attempted), you
| no longer care about their intentions; the data speaks for
| itself.
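|
| To put rough numbers on the "chose the best one of 100" case
| (an illustrative sketch, not from the comment): under a true
| null each experiment's p-value is uniform on [0, 1], so
| publishing the best of 100 is almost guaranteed to clear any
| fixed significance threshold.
|
|   import random
|
|   random.seed(0)
|   ALPHA, RUNS = 0.05, 10_000
|
|   # One pre-registered experiment per paper:
|   one_shot = sum(random.random() < ALPHA
|                  for _ in range(RUNS)) / RUNS
|
|   # Run 100 experiments, publish only the smallest p-value:
|   best_of_100 = sum(min(random.random() for _ in range(100)) < ALPHA
|                     for _ in range(RUNS)) / RUNS
|
|   print(one_shot)      # ~0.05
|   print(best_of_100)   # ~0.99  (analytically 1 - 0.95**100)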
| d0mine wrote:
| The Bayesian approach sounds like a religion (one true way).
|
| There is nothing unusual about different mathematical
| methods/models producing different results; e.g., the number of
| roots even for the same quadratic equation may depend on
| "private" thoughts such as whether complex roots are of interest
| (sometimes they are, sometimes they aren't). All models are
| wrong; some are useful.
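|
| A tiny sketch of the root-counting example (assumes Python with
| sympy; not part of the original comment):
|
|   from sympy import symbols, solveset, S
|
|   x = symbols("x")
|   # Same equation, different "private" choice of domain:
|   print(solveset(x**2 + 1, x, domain=S.Reals))      # EmptySet
|   print(solveset(x**2 + 1, x, domain=S.Complexes))  # {-I, I}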
| usgroup wrote:
| Yeah I'd agree at some depth. We don't talk enough about
| integers, rationals and real numbers and what they imply for
| our "normative rationality" or "epistemological commitment".
| But aside from the integers, everything else is totally
| suspicious.
| biomcgary wrote:
| One of my priors: "a group of people who look like a faith-
| based community, but claim not to be one, should not be
| trusted".
| lalaithion wrote:
| > the number of roots even for the same quadratic equation may
| depend on "private" thoughts such as whether complex roots are
| of interest
|
| You are confusing ambiguity in a problem statement, due to
| human language being imprecise, with two well-specified,
| identical experiments yielding different conclusions due to
| the intentions of the human carrying them out.
|
| Is arithmetic a religion because there's "one true way" of
| adding integers?
| kevindamm wrote:
| I can think of at least two ways to add integers: the
| categorical way that applies a mapping from the set into
| itself, and the set-theoretic way that deals with unwrapping
| and rewrapping successor relations. The latter is sometimes
| resorted to in heavily-relational contexts like Datalog.
| lalaithion wrote:
| Yes, this is addressed in the original article... there are
| multiple "lawful" ways of adding integers which all give
| the same results, and likewise in probability all "lawful"
| ways of analyzing data should give the same results. If you
| have two different ways of adding numbers which give
| different results, one is not lawful.
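|
| A small sketch of "different lawful ways that agree" (a toy
| example, restricted to non-negative integers): built-in
| addition versus a Peano-style successor definition.
|
|   def peano_add(a: int, b: int) -> int:
|       # a + 0 = a ;  a + S(b) = S(a + b)
|       return a if b == 0 else peano_add(a, b - 1) + 1
|
|   assert all(peano_add(a, b) == a + b
|              for a in range(50) for b in range(50))
|   print("both definitions agree on every case checked")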
| usgroup wrote:
| So you know when you believe something and then you update your
| belief because you get some evidence?
|
| Yeah, and then you stack some beliefs on top of that.
|
| And then you discover the evidence wasn't actually true. Remind
| me again what the normative Bayesian update looks like in that
| instance.
|
| Unfortunately it's turtles all the way down.
| nerdponx wrote:
| > you discover the evidence wasn't actually true
|
| Not really going to vouch for the normative Bayesian approach,
| but you might just consider this new (strong) evidence for
| applying an update.
| crdrost wrote:
| The precise claim (I believe) is that the prior update which
| you had, made some assumptions about the correct way to
| phrase your perceptions.
|
| That is, you say, for the update, "the probability that this
| trial came out with X successes given everything else that I
| take for granted, and also that the hypothesis is true" vs.
| "the probability that this trial came out with X successes
| given everything else that I take for granted, and also that
| the hypothesis is false." So you actually say in both cases
| the fragment, "this trial came out with X successes."
|
| What happens if _it didn't really_? Well, the proper
| Bayesian approach is to state that you phrased this fragment
| wrong. You _actually_ needed to qualify "the probability
| that _I saw_ this trial come out with X successes given ...
| ", and those probabilities might have been different than the
| trial actually coming out with X successes.
|
| OK but what happens if _that didn't really, either_. Well,
| the proper Bayesian approach is to state that you phrased the
| fragment _doubly_ wrong. You _actually_ needed to qualify it
| as "the probability that _I thought I saw_ this trial come
| out with X successes given... ". So now you are properly
| guarded, like a good Bayesian, against the possibility that
| maybe you sneezed while you were reading the experiment
| results and even though you saw 51, it got scrambled in your
| head and you thought you saw 15.
|
| OK but what happens if _that didn't really, either either_.
| You _thought_ that you thought that you saw something, but
| actually you didn't think you saw anything, because you were
| in The Matrix or had dementia or any number of other things
| that mess with our perceptions of ourselves. So you, good
| Bayesian that you wish to be, needed to qualify this thing
| extra!
|
| The idea is that Bayesianism is one of those "if all you have
| is a hammer you see everything as a nail" type of things.
| It's not that you can't see a screw as a really inefficient
| nail, that is totally one valid perspective on screwness.
| It's also not that the hammer doesn't have any valid uses. It
| does, it's very useful, but when you start trying to chase
| all of human rationality with it, you start to run into some
| really weird issues.
|
| For instance, the proper Bayesian view of intuitions is that
| they are a form of evidence (because what else would they
| be), and that they are extremely reliable when they point to
| lawlike metaphysical statements (otherwise we have trouble
| with "1 + 1 = 2" and "reality is not self-contradictory" and
| other metaphysical laws that we take for granted) but
| correspondingly unreliable when, say, we intuit things other
| than metaphysical laws, such as the existence of a monster in
| the closet or a murderer hiding under the bed or that the
| only explanation for our missing (actually misplaced) laptop
| is that someone must have stolen it in the middle of the
| night." You need to do this to build up the "ground truth"
| that allows you to get to the vanilla epistemology stuff that
| you then take for granted like "okay we can run experiments
| to try to figure out stuff about the world, and those
| experiments say that the monster in the closet isn't actually
| there."
| jawarner wrote:
| Real world systems are complicated. In theory, you could do
| belief propagation to update your beliefs through the whole
| network, if your brain worked something like a Bayesian
| network.
| biomcgary wrote:
| Natural selection didn't wire our brains to work like a
| Bayesian network. If it had, wouldn't it be easier to make
| converts to the Church of Reverend Bayes? /s
|
| Alternatively, brains ARE Bayesian networks with hard coded
| priors that cannot be changed without CRISPR.
| cyanydeez wrote:
| This just sounds like logical Tetris.
| lalaithion wrote:
| P(B|I saw E, P) = P(I saw E|B,P) * P(B|P) / P(I saw E|P)
|
| P(B|E was false, I saw E, P)
|     = P(E was false|B,I saw E,P) * P(B|P,I saw E)
|       / P(E was false|P, I saw E)
|
| This is a pretty basic application of Bayes' theorem.
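|
| With made-up numbers (illustrative only), the second line is
| exactly the "what if the evidence wasn't actually true" update:
|
|   p_B = 0.5                       # prior
|   p_saw_B, p_saw_notB = 0.6, 0.3  # P(I saw E | B), P(I saw E | not B)
|   p_B_saw = (p_saw_B * p_B /
|              (p_saw_B * p_B + p_saw_notB * (1 - p_B)))
|   print(round(p_B_saw, 3))        # 0.667: the report favours B
|
|   # Now learn that E was in fact false.  Toy assumption: a false
|   # E that was nonetheless "seen" is more likely if B is false.
|   p_false_B, p_false_notB = 0.2, 0.5
|   p_B_final = (p_false_B * p_B_saw /
|                (p_false_B * p_B_saw +
|                 p_false_notB * (1 - p_B_saw)))
|   print(round(p_B_final, 3))      # 0.444: back below the prior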
| usgroup wrote:
| Love it: p(I saw E) and p(I didn't really see E).
|
| Just move the argument one level down: "I saw E is false" and
| it turns out so is "E is false". So then? Add "E was false
| was false"?
|
| Turtles all the way down.
|
| At some point something has to be "true" in order to
| conditionalise on it.
| AbrahamParangi wrote:
| I'm confused in that I don't see how this is troubling. Yes, the
| two experimenters rolled dice and got the same result, but it's
| as if one of them was rolling a 6-sided die and the other a
| 20-sided one. Each experiment is not a result per se but a sample
| from a distribution.
|
| How you infer the shape of that distribution based on the
| experiment is a function of the distribution of all courses your
| experiment could have taken. This set of paths is different in
| each case, which means the inference we make must also be
| different.
|
| There is no inconsistency. The confusion seems to be in assuming
| that the experimental result was a true statement about the
| nature of the world rather than a true statement about simply
| what happened.
|
| edit: This seems to me to be a specific case of a general class
| of difficult thinking where you ask yourself: "what are all the
| worlds that I might be in that are consistent with what I'm
| presently observing".
| lalaithion wrote:
| If you see two people roll a d20 and get a 20, you get to say
| "wow, that was unlikely" to both of them, even if one of them
| privately admits they were going to quickly re-roll their die
| if they got below a 10. What matters is their actual behavior
| (identical in the example) not their intentions. The d6 vs d20
| version is different because their behavior is different.
| ninthcat wrote:
| Unlikely in what probability space? We only see one version
| of reality so the probabilities that we assign to any outcome
| are based on a prior choice of probability space. That is why
| the researchers' intent matters.
| AbrahamParangi wrote:
| Yes, indeed.
| lalaithion wrote:
| Both events have the same probability of happening: 1/20.
| The fact that the researcher intended to do something in a
| reality that didn't happen isn't relevant.
| ninthcat wrote:
| If you want to know whether a drug is more effective than
| placebo, the answer to that question depends on both the
| data collected in a study and the initial study design.
| There's a reason why it's meaningless to say "that was
| unlikely" after somebody says they were born on January
| 1, or after getting a two-factor code that is the same
| number six times. There's nothing special about those
| particular events except for the fact that we noticed
| them. Since we live in a single instance of the universe
| where they have already happened, they have probability
| 1. At the same time, on any given instance they have
| probability 1/365ish or 1/10000. The difference between
| these two interpretations of the probability is the same
| difference as having a good experimental design vs a
| flawed experimental design where you repeat the
| experiment until you get the results you want to see.
| AbrahamParangi wrote:
| Let's imagine that we ran it as a simulation and we ran it a
| million times. The two people would have a different
| distribution of results. If you ignore the intention, you
| ignore reality as if that intention were not a part of it.
|
| Do you not notice that your inference is less accurate using
| this line of reasoning? Does that not suggest that it's
| simply wrong?
| usgroup wrote:
| This is well put. Coincidentally, in the example the results
| are the same, but they need not be. Given repeated
| experiments with the same intentions, one may expect
| different distributions.
|
| However, one could just move the argument up a level and
| manufacture a case of different intentions leading to the
| same distributions and then ask the same question.
| kgwgk wrote:
| > Coincidentally, in the example the results are the same,
| but they need not be.
|
| The question is whether we should draw different
| conclusions when the results are the same. I don't think
| that anyone has any issues with drawing different
| conclusions when the results are different!
| lalaithion wrote:
| Imagine you have a machine that rolls a d20 and lies if
| the die comes up 1-19, and tells the truth on a 20.
| Should you trust this machine usually? No. But if you can
| _see that the die comes up 20_ then you should trust it.
| The fact that it sometimes might lie doesn't mean that
| you should distrust the machine if you can see that in
| this case it's telling the truth.
| lalaithion wrote:
| What do you mean by 'results'?
|
| They would not have different distributions of results on
| their first die roll.
|
| They would have different distributions of results on their
| reported die roll.
|
| If I am looking at their first die roll, the fact that they
| would have different reported die rolls doesn't matter!
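|
| A quick simulation sketch of that distinction (illustrative,
| plain Python): the secret "re-roll below 10" policy changes the
| distribution of the reported roll but not of the first roll,
| and it never fires in the worlds where the first roll actually
| came up 20.
|
|   import random
|
|   random.seed(0)
|   N = 100_000
|   first, reported = [], []
|   for _ in range(N):
|       f = random.randint(1, 20)
|       r = f if f >= 10 else random.randint(1, 20)  # secret re-roll
|       first.append(f)
|       reported.append(r)
|
|   print(first.count(20) / N)     # ~0.05, same as an honest roller
|   print(reported.count(20) / N)  # ~0.07, inflated by the policy
|   # Conditioned on having watched the first roll come up 20,
|   # the report is a 20 with certainty -- the intention never
|   # had a chance to act.
|   print(sum(r == 20 for f, r in zip(first, reported) if f == 20)
|         / first.count(20))       # 1.0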
| kgwgk wrote:
| The question is whether we should draw different conclusions
| from one set of observations depending not just on what we are
| observing but also on different ways to define "what are all
| the worlds that I might be in that are consistent with what I'm
| presently observing".
| birdofhermes wrote:
| As other commenters have pointed out, any given introductory
| chapter in a book on Bayesian statistics, including Jaynes', is
| better exposition than this. I found _Probability Theory: The
| Logic of Science_ very easy to follow and very well-written.
|
| I had a similar experience when I finally found a copy of
| Barbour's _The End of Time_ and discovered, much to my chagrin,
| that it wasn't nearly as mystical or complicated as EY makes it
| seem in the Timeless Physics "sequence". Barbour's account was
| much more readable and much easier to understand.
|
| Yudkowsky just isn't that great of a popular science writer. It's
| not his specialty, so this shouldn't be surprising.
| lalaithion wrote:
| Here's a link:
| http://www.med.mcgill.ca/epidemiology/hanley/bios601/Gaussia...
|
| And if you want to read what he has to say on the optional
| stopping problem, you can scroll down to page 196 of the PDF
| (printed page 166) to the heading "6.9.1 Digression on optional
| stopping".
|
| I don't personally think Jaynes is much easier to read than
| Yudkowsky, but he's definitely more rigorous.
| bdjsiqoocwk wrote:
| Meaningless drivel.
| lalaithion wrote:
| From _Probability Theory: The Logic of Science_:
|
| > Then the possibility seems open that, for different priors,
| different functions r(x1,..., xn) of the data may take on the
| role of sufficient statistics. This means that use of a
| particular prior may make certain particular aspects of the data
| irrelevant. Then a different prior may make different aspects of
| the data irrelevant. One who is not prepared for this may think
| that a contradiction or paradox has been found.
|
| I think this explains one of the confusions many commenters have:
| for an experimenter who repeats observations until they reach
| their desired ratio r/(n-r), the ratio r/(n-r) is not a
| sufficient statistic! But when we have an experimenter who has a
| pre-registered n, then the ratio r/(n-r) is a sufficient statistic.
| However, in either case,
|
| > We did not include n in the conditioning statements in
| p(D|θ I) because, in the problem as defined, it is from the data
| D that we learn both n and r. But nothing prevents us from
| considering a different problem in which we decide in advance how
| many trials we shall make; then it is proper to add n to the
| prior information and write the sampling probability as
| p(D|nθ I). Or, we might decide in advance to continue the
| Bernoulli trials until we have achieved a certain number r of
| successes, or a certain log-odds u = log[r/(n - r)]; then it
| would be proper to write the sampling probability as p(D|rθ I)
| or p(D|uθ I), and so on. Does this matter for our conclusions
| about θ?
|
| > In deductive logic (Boolean algebra) it is a triviality that AA
| = A; if you say: 'A is true' twice, this is logically no
| different from saying it once. This property is retained in
| probability theory as logic, since it was one of our basic
| desiderata that, in the context of a given problem, propositions
| with the same truth value are always assigned the same
| probability. In practice this means that there is no need to
| ensure that the different pieces of information given to the
| robot are independent; our formalism has automatically the
| property that redundant information is not counted twice.
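|
| A sketch of that point in code (toy numbers; assumes numpy and
| scipy): the pre-registered-n design (binomial) and a "stop at
| the r-th success" design (negative binomial, a close cousin of
| the "stop above 60%" rule) have different sampling
| distributions, yet as functions of theta they are proportional,
| so the same observed data give the same posterior under any
| prior.
|
|   import numpy as np
|   from scipy.stats import binom, nbinom
|
|   n, r = 100, 70                   # same observed data either way
|   theta = np.linspace(0.01, 0.99, 99)
|
|   lik_fixed_n = binom.pmf(r, n, theta)       # P(r | n, theta)
|   lik_fixed_r = nbinom.pmf(n - r, r, theta)  # P(n-r failures
|                                              #   before r-th success)
|
|   # Against the same (flat) prior the posteriors coincide:
|   post_n = lik_fixed_n / lik_fixed_n.sum()
|   post_r = lik_fixed_r / lik_fixed_r.sum()
|   print(np.allclose(post_n, post_r))         # True
|
|   # ...because the likelihood ratio is constant in theta:
|   ratio = lik_fixed_n / lik_fixed_r
|   print(np.allclose(ratio, ratio[0]))        # True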
| roenxi wrote:
| That seems a bit long-winded, since this situation is a direct
| result of Bayes' theorem. It seems to me equivalent to say:
|
| Bayes' Theorem holds because it can be proven. Therefore,
| situations can be constructed where considering identical data
| without considering priors gives nonsense conclusions. For
| example, if we happen to know as a prior that P(outcome of
| experiment is a certain ratio) = P(experiment is completed),
| then that must be considered when interpreting the results.
___________________________________________________________________
(page generated 2024-02-23 23:00 UTC)