[HN Gopher] Bayesian Statistics: The three cultures
       ___________________________________________________________________
        
       Bayesian Statistics: The three cultures
        
       Author : luu
       Score  : 174 points
       Date   : 2024-07-26 17:15 UTC (5 hours ago)
        
 (HTM) web link (statmodeling.stat.columbia.edu)
 (TXT) w3m dump (statmodeling.stat.columbia.edu)
        
       | thegginthesky wrote:
       | I miss the college days where professors would argue endlessly on
       | Bayesian vs Frequentist.
       | 
        | The article is succinct and explains why even my Bayesian
        | professors had different approaches to research and
        | analysis. I never knew about the third camp, Pragmatic
        | Bayes, but it is definitely in line with one professor's
        | research, which was very thorough about probability fit
        | and the many iterations needed to get the prior and joint
        | PDF just right.
       | 
        | Andrew Gelman has a very cool talk, "Andrew Gelman - Bayes,
        | statistics, and reproducibility (Rutgers, Foundations of
        | Probability)", which I highly recommend to any data
        | scientist.
        
         | spootze wrote:
         | Regarding the frequentist vs bayesian debates, my slightly
         | provocative take on these three cultures is
         | 
         | - subjective Bayes is the strawman that frequentist academics
         | like to attack
         | 
         | - objective Bayes is a naive self-image that many Bayesian
         | academics tend to possess
         | 
         | - pragmatic Bayes is the approach taken by practitioners that
         | actually apply statistics to something (or in Gelman's terms,
         | do science)
        
           | refulgentis wrote:
           | I see, so academics are frequentists (attackers) or objective
           | Bayes (naive), and the people Doing Science are pragmatic
           | (correct).
           | 
           | The article gave me the same vibe, nice, short set of labels
           | for me to apply as a heuristic.
           | 
           | I never really understood this particular war, I'm a
           | simpleton, A in Stats 101, that's it. I guess I need to bone
           | up on Wikipedia to understand what's going on here more.
        
             | Yossarrian22 wrote:
              | Academics can be pragmatic; I've known ones who've
              | used both Bayesian statistics and MLE.
        
             | sgt101 wrote:
             | Bayes lets you use your priors, which can be very helpful.
             | 
             | I got all riled up when I saw you wrote "correct", I can't
             | really explain why... but I just feel that we need to keep
             | an open mind. These approaches to data are choices at the
             | end of the day... Was Einstein a Bayesian? (spoiler: no)
        
               | refulgentis wrote:
                | You're absolutely right; I'm trying to walk a
                | delicate tightrope that doesn't end with me giving
                | my unfiltered "you're wrong, so let's end the
                | conversation" response.
               | 
               | Me 6 months ago would have written: "this comment is
               | unhelpful and boring, but honestly, that's slightly
               | unfair to you, as it just made me realize how little help
               | the article is, and it set the tone. is this even a real
               | argument with sides?"
               | 
               | For people who want to improve on this aspect of
               | themselves, like I did for years:
               | 
               | - show, don't tell (ex. here, I made the oddities more
               | explicit, enough that people could reply to me spelling
               | out what I shouldn't.)
               | 
               | - Don't assert anything that wasn't said directly, ex.
               | don't remark on the commenter, or subjective qualities
               | you assess in the comment.
        
               | 0cf8612b2e1e wrote:
               | Using your priors is another way of saying you know
               | something about the problem. It is exceedingly difficult
               | to objectively analyze a dataset without interjecting any
               | bias. There are too many decision points where something
               | needs to be done to massage the data into shape. Priors
               | is just an explicit encoding of some of that knowledge.
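                | 
                | A toy sketch of that explicit encoding (scipy;
                | numbers are made up): a Beta prior on a conversion
                | rate makes the "I know something" part visible.
                | 
                |   from scipy import stats
                | 
                |   k, n = 7, 20   # observed: 7 conversions in 20
                | 
                |   # Prior encoding "rates historically near 20%"
                |   prior = stats.beta(2, 8)          # mean 0.2
                |   # Conjugate update: Beta(a + k, b + n - k)
                |   post = stats.beta(2 + k, 8 + n - k)
                |   print(post.mean())  # 0.3, between prior and 7/20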
        
             | thegginthesky wrote:
              | Frequentist and Bayesian approaches are both correct
              | when there is scientific rigor in the research and
              | methodology. Both can be wrong if the research is
              | whack or sloppy.
        
               | slashdave wrote:
                | I've used both in some papers and reported two
                | results (why not?). The golden rule in my mind is
                | to fully describe your process and assumptions,
                | then let the reader decide.
        
             | runarberg wrote:
              | I understand the war between Bayesians and
              | frequentists. Frequentist methods have been misused
              | for over a century now to justify all sorts of
              | pseudoscience and hoaxes (and have created a fair
              | share of honest mistakes), so it is understandable
              | that people would come forward and claim there must
              | be a better way.
             | 
              | What I don't understand is the war between naive
              | Bayes and pragmatic Bayes. If it is real, it seems
              | like an extension of philosophers vs. engineers.
              | Scientists should see value in both. Naive Bayes is
              | important to the philosophy of science, without which
              | a lot of junk science would go unscrutinized for far
              | too long, and engineers should be able to see the
              | value of philosophers saving them work by debunking
              | wrong science before they start to implement theories
              | which simply will not work in practice.
        
           | DebtDeflation wrote:
            | A few things I wish I knew when I took Statistics
            | courses at university some 25 or so years ago:
           | 
            | - Statistical significance testing and hypothesis
            | testing are two completely different approaches, with
            | different philosophies behind them, developed by
            | different groups of people; they kinda do the same
            | thing, but not quite, and textbooks tend to completely
            | blur this distinction.
           | 
           | - The above approaches were developed in the early 1900s in
           | the context of farms and breweries where 3 things were true -
           | 1) data was extremely limited, often there were only 5 or 6
           | data points available, 2) there were no electronic computers,
           | so computation was limited to pen and paper and slide rules,
            | and 3) the cost in terms of time and money of running
            | experiments (e.g., planting a crop differently and
            | waiting for harvest) was enormous.
           | 
            | - The majority of classical statistics was focused on
            | two simple questions - 1) what can I reliably say about
            | a population based on a sample taken from it, and 2)
            | what can I reliably say about the differences between
            | two populations based on the samples taken from each?
            | That's it. An enormous mathematical apparatus was built
            | around answering those two questions in the context of
            | the limitations in point #2.
        
             | ivan_ah wrote:
             | That was a nice summary.
             | 
              | The data-poor and computation-poor context of old-
              | school statistics definitely biased the methods
              | towards the "recipe" approach scientists are supposed
              | to follow mechanically, where each recipe is some
              | predefined sequence of steps, justified by an
              | analytical approximation to a sampling distribution
              | (given lots of assumptions).
             | 
             | In modern computation-rich days, we can get away from the
             | recipes by using resampling methods (e.g. permutation tests
             | and bootstrap), so we don't need the analytical
             | approximation formulas anymore.
             | 
             | I think there is still room for small sample methods
             | though... it's not like biological and social sciences are
             | dealing with very large samples.
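              | 
              | As a minimal sketch of the resampling idea (numpy
              | only; the data here is made up for illustration), a
              | permutation test for a difference in group means:
              | 
              |   import numpy as np
              | 
              |   rng = np.random.default_rng(0)
              |   a = np.array([4.1, 5.2, 6.3, 5.9, 4.8])  # group A
              |   b = np.array([5.5, 6.8, 7.1, 6.2, 7.4])  # group B
              | 
              |   observed = b.mean() - a.mean()
              |   pooled = np.concatenate([a, b])
              | 
              |   # Re-randomize group labels many times and see how
              |   # often a difference this large arises by chance.
              |   diffs = []
              |   for _ in range(10_000):
              |       perm = rng.permutation(pooled)
              |       diffs.append(perm[5:].mean() - perm[:5].mean())
              | 
              |   p = np.mean(np.abs(diffs) >= abs(observed))
              |   print(f"permutation p-value ~ {p:.3f}")
              | 
              | No analytical approximation needed: the sampling
              | distribution is built empirically from the data
              | itself.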
        
         | RandomThoughts3 wrote:
          | I'm always puzzled by this because, while I come from a
          | country where the frequentist approach generally
          | dominates, the fight with Bayesians basically doesn't
          | exist. It's just a bunch of mathematical theories and
          | tools. Just use what's useful.
         | 
         | I'm still convinced that Americans tend to dislike the
         | frequentist view because it requires a stronger background in
         | mathematics.
        
           | parpfish wrote:
           | I don't think mathematical ability has much to do with it.
           | 
           | I think it's useful to break down the anti-Bayesians into
           | statisticians and non-statistician scientists.
           | 
            | The former are mathematically savvy enough to
            | understand Bayes but object on philosophical grounds;
            | the latter don't care about the philosophy so much as
            | they feel that an attack on frequentism is an attack on
            | their previous research, and they take it personally.
        
             | mturmon wrote:
             | This is a reasonable heuristic. I studied in a program that
             | (for both philosophical and practical reasons) questioned
             | whether the Bayesian formalism should be applied as widely
             | as it is. (Which for many people is, basically everywhere.)
             | 
             | There are some cases, that do arise in practice, where you
             | can't impose a prior, and/or where the "Dutch book"
             | arguments to justify Bayesian decisions don't apply.
        
           | thegginthesky wrote:
            | It's because practitioners of one camp say that the
            | other camp is wrong and question its methodology. And
            | in academia, questioning one's methodology is akin to
            | saying one is dumb.
            | 
            | To understand both camps, I summarize them like this.
           | 
            | Frequentist statistics has very sound theory but is
            | misapplied through heuristics, rules of thumb, and
            | prepared tables. It's very easy to use any method and
            | hack the p-value away to get statistically significant
            | results.
           | 
            | Bayesian statistics has an interesting premise and
            | inference methods, but until recently, with the
            | advancement of computing power, it was near impossible
            | to run the simulations needed to validate the complex
            | distributions used, the goodness of fit, and so on. And
            | even today, some Bayesian statisticians don't question
            | their priors and iterate on their research.
            | 
            | I recommend using both methods whenever convenient and
            | fitting for the problem at hand.
        
           | runarberg wrote:
            | I think the distaste Americans have for frequentism has
            | much more to do with the history of science. The
            | Eugenics movement had a massive influence on science in
            | America, and it used frequentist methods to justify (or
            | rather validate) its scientific racism. Authors like
            | Gould brought this up in the 1980s, particularly in
            | relation to factor analysis and intelligence testing,
            | and were kind of proven right when Herrnstein and
            | Murray published _The Bell Curve_ in 1994.
           | 
            | The p-hacking exposures of the 1990s only cemented the
            | notion that it is very easy to get away with junk
            | science using frequentist methods to unjustly validate
            | your claims.
           | 
            | That said, frequentist statistics is still the default
            | in the social sciences, which ironically is where the
            | damage was the worst.
        
             | lupire wrote:
             | What is the protection against someone using a Bayesian
             | analysis but abusing it with hidden bias?
        
               | analog31 wrote:
               | My knee jerk reaction is replication, and studying a
               | problem from multiple angles such as experimentation and
               | theory.
        
               | runarberg wrote:
                | I'm sure there are creative ways to misuse Bayesian
                | statistics, although I think it is harder to hide
                | your intentions as you do so. With frequentist
                | approaches your intentions become obscured in the
                | whole mess of computations, and at the end of it
                | you get to claim a simple "objective" truth because
                | the p-value shows < 0.05. In Bayesian statistics
                | the data you put into it is front and center: _the
                | chance of my theory being true given this data is
                | greater than 95%_ (or was it the chance of getting
                | this data given my theory?). In reality most hoaxes
                | and junk science happened because of bad data which
                | didn't get scrutinized until much too late
                | (scrutinizing the data is what Gould did).
               | 
                | But I think the crux of the matter is that bad
                | science has been demonstrated with frequentist
                | methods and is now a part of our history. So people
                | must either find a way to fix the frequentist
                | approaches or throw them out for something
                | different. Bayesian statistics is that something
                | different.
        
             | TeaBrain wrote:
             | I don't think the guy's basic assertion is true that
             | frequentist statistics is less favored in American
             | academia.
        
           | bb86754 wrote:
           | I can attest that the frequentist view is still very much the
           | mainstream here too and fills almost every college curriculum
           | across the United States. You may get one or two Bayesian
           | classes if you're a stats major, but generally it's
           | hypothesis testing, point estimates, etc.
           | 
          | Regardless, the idea that frequentist stats requires a
          | stronger background in mathematics is just flat-out
          | silly; I'm not even sure what you mean by that.
        
             | blt wrote:
            | I also thought it was silly, but maybe they mean that
            | frequentist methods still have analytical solutions in
            | some settings where Bayesian methods must resort to
            | Monte Carlo?
        
           | ordu wrote:
            | I'd suggest you read "The Book of Why"[1]. It is mostly
            | about Judea Pearl's next creation, causality, but he
            | also covers the Bayesian approach, the history of
            | statistics, his motivation behind Bayesian statistics,
            | and some success stories.
            | 
            | Reading the book would serve you much better than
            | applying "Hanlon's Razor"[2] because you see no other
            | explanation.
           | 
           | [1] https://en.wikipedia.org/wiki/The_Book_of_Why
           | 
           | [2] https://en.wikipedia.org/wiki/Hanlon's_razor
        
           | gnulinux wrote:
            | This statement is correct only in a very basic,
            | fundamental sense, but it disregards research practice.
            | Let's say you're a mathematician who studies analysis
            | or algebra. Sure, technically there is no fundamental
            | reason for constructive logic and classical logic to
            | "compete"; you can simply choose whichever one is
            | useful for the problem you're solving. In fact,
            | {constructive + LEM + choice axioms} is equivalent to
            | classical math, so why not just study constructive
            | math, since it's a higher level of abstraction, and add
            | those axioms "later" when you have a particular
            | application?
           | 
           | In reality, on a human level, it doesn't work like that
           | because, when you have disagreements on the very foundations
           | of your field, although both camps can agree that their
           | results do follow, the fact that their results (and thus
           | terminology) are incompatible makes it too difficult to
           | research both at the same time. This basically means,
           | practically speaking, you need to be familiar with both, but
           | definitely specialize in one. Which creates hubs of different
           | sorts of math/stats/cs departments etc.
           | 
            | If you're, for example, working on constructive
            | analysis, you'll have to spend a tremendous amount of
            | energy on understanding contemporary techniques like
            | localization just to work around a basic logical axiom
            | that is likely irrelevant to a lot of applications.
            | Really, this is like wanting to understand the
            | mathematical properties of binary arithmetic (Z/2Z) but
            | studying group theory in general day-to-day. Sure,
            | Z/2Z is a group, but you're really interested in a
            | single, tiny, finite abelian group, and now you need to
            | do a whole bunch of work on non-abelian groups,
            | infinite groups, non-cyclic groups etc. just to ignore
            | all those facts.
        
         | bunderbunder wrote:
         | Link to talk: https://youtu.be/xgUBdi2wcDI
        
           | mturmon wrote:
           | Thank you.
           | 
            | In fact, the whole talk series
            | (https://foundationsofprobabilityseminar.com/) and channel
            | (https://www.youtube.com/@foundationsofprobabilitypa2408/vide...)
            | seem interesting.
        
       | brcmthrowaway wrote:
       | Where does Deep Learning come in?
        
         | thegginthesky wrote:
          | Most models derive from Machine Learning principles that
          | mix classic probability theory, Frequentist and Bayesian
          | statistics, and lots of Computer Science fundamentals.
          | But there have been advancements in Bayesian Inference
          | and Bayesian Deep Learning; check out frameworks like
          | Pyro (built on top of PyTorch).
          | 
          | Edit: corrected my sentence, but see 0xdde's reply for
          | better info.
        
           | 0xdde wrote:
           | I could be wrong, but my sense is that ML has leaned Bayesian
           | for a very long time. For example, even Bishop's widely used
           | book from 2006 [1] is Bayesian. Not sure how Bayesian his new
           | deep learning book is.
           | 
            | [1] https://www.microsoft.com/en-us/research/publication/pattern...
        
             | thegginthesky wrote:
              | I stand corrected! It was my impression that many
              | methods used in ML, such as Support Vector Machines,
              | Decision Trees, Random Forests, Boosting, Bagging and
              | so on, have very deep roots in Frequentist methods,
              | although current CS implementations lean heavily on
              | optimizations such as Gradient Descent.
              | 
              | Giving a cursory look into Bishop's book, I see that
              | I am wrong, as there are deep roots in Bayesian
              | Inference as well.
              | 
              | On another note, I find it very interesting that
              | there's not a bigger emphasis on using the correct
              | distributions in ML models, as the methods are much
              | more concerned with optimizing objective functions.
        
         | tfehring wrote:
         | An implicit shared belief of all of the practitioners the
         | author mentions is that they attempt to construct models that
         | correspond to some underlying "data generating process".
         | Machine learning practitioners may use similar models or even
         | the same models as Bayesian statisticians, but they tend to
         | evaluate their models primarily or entirely based on their
         | predictive performance, _not_ on intuitions about why the data
         | is taking on the values that it is.
         | 
         | See Breiman's classic "Two Cultures" paper that this post's
         | title is referencing:
         | https://projecteuclid.org/journals/statistical-science/volum...
        
         | vermarish wrote:
         | At a high level, Bayesian statistics and DL share the same
         | objective of fitting parameters to models.
         | 
         | In particular, _variational inference_ is a family of
         | techniques that makes these kinds of problems computationally
         | tractable. It shows up everywhere from variational
         | autoencoders, to time-series state-space modeling, to
         | reinforcement learning.
         | 
         | If you want to learn more, I recommend reading Murphy's
         | textbooks on ML: https://probml.github.io/pml-book/book2.html
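          | 
          | A crude illustration of the core idea (numpy only; the
          | model and numbers are a made-up toy): fit a Gaussian
          | q(mu) to a conjugate normal-mean posterior by maximizing
          | a Monte Carlo estimate of the ELBO. Real VI uses
          | gradients; grid search just keeps the sketch short.
          | 
          |   import numpy as np
          | 
          |   rng = np.random.default_rng(0)
          |   x = rng.normal(1.0, 1.0, size=20)  # data, sigma = 1
          |   n = len(x)
          | 
          |   # Model: mu ~ N(0,1), x_i ~ N(mu,1). Exact posterior:
          |   post_var = 1.0 / (1.0 + n)
          |   post_mean = x.sum() * post_var
          | 
          |   def elbo(m, s, n_mc=2000):
          |       mu = rng.normal(m, s, size=n_mc)  # draws from q
          |       ll = -0.5 * ((x[None] - mu[:, None])**2).sum(1)
          |       lp = -0.5 * mu**2                 # log prior
          |       lq = -0.5 * ((mu - m) / s)**2 - np.log(s)
          |       return (ll + lp - lq).mean()      # up to consts
          | 
          |   cands = ((elbo(m, s), m, s)
          |            for m in np.linspace(0, 2, 41)
          |            for s in np.linspace(0.05, 1, 40))
          |   _, m_best, s_best = max(cands)
          |   print((m_best, s_best), (post_mean, post_var**0.5))
          |   # q's best (m, s) ~ the exact posterior mean and sd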
        
         | klysm wrote:
          | Not sure why this is being downvoted, as it's mentioned
          | peripherally in the article. I think it's primarily used
          | as an extreme example of a model whose inner mechanism is
          | entirely inscrutable.
        
         | samch93 wrote:
         | A (deep) NN is just a really complicated data model, the way
         | one treats the estimation of its parameters and prediction of
         | new data determines whether one is a Bayesian or a frequentist.
         | The Bayesian assigns a distribution to the parameters and then
         | conditions on the data to obtain a posterior distribution based
         | on which a posterior predictive distribution is obtained for
         | new data, while the frequentist treats parameters as fixed
         | quantities and estimates them from the likelihood alone, e.g.,
         | with maximum likelihood (potentially using some hacks such as
         | regularization, which themselves can be given a Bayesian
         | interpretation).
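          | 
          | That last parenthetical is easy to make concrete (a
          | sketch in numpy, with made-up data): ridge regression is
          | exactly the posterior mean/mode under a Gaussian prior on
          | the weights in a linear-Gaussian model.
          | 
          |   import numpy as np
          | 
          |   rng = np.random.default_rng(0)
          |   X = rng.normal(size=(50, 3))
          |   y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=50)
          |   lam = 2.0  # ridge penalty
          | 
          |   # Frequentist view: penalized maximum likelihood.
          |   w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3),
          |                             X.T @ y)
          | 
          |   # Bayesian view: w ~ N(0, I/lam), y|w ~ N(Xw, I).
          |   # The Gaussian posterior has mean (= mode):
          |   S = np.linalg.inv(X.T @ X + lam * np.eye(3))
          |   w_map = S @ (X.T @ y)
          | 
          |   print(np.allclose(w_ridge, w_map))  # True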
        
       | tonymet wrote:
       | A priori distributions are a form of stereotyping. How do people
       | reconcile that?
        
         | lupire wrote:
         | A Bayesian analysis lets you see how the posterior varies as a
         | function of the prior, instead of forcing you to pick a prior
         | before you start.
         | 
         | The tighter the range of this function, the more confidence you
         | have in the result.
         | 
         | You can never know anything if you absolutely refuse to have a
         | prior, because that gives division by 0 in the posterior.
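          | 
          | A toy version of that sensitivity check (scipy, made-up
          | data): sweep a family of priors and watch how much the
          | posterior moves.
          | 
          |   from scipy import stats
          | 
          |   k, n = 12, 40  # 12 successes in 40 trials
          | 
          |   for a, b in [(1, 1), (2, 2), (5, 5), (30, 10)]:
          |       post = stats.beta(a + k, b + n - k)  # conjugate
          |       print(f"Beta({a},{b}) prior -> posterior mean "
          |             f"{post.mean():.3f}")
          | 
          |   # Weak priors leave the posterior near 12/40 = 0.3;
          |   # the Beta(30,10) prior (worth 40 pseudo-trials) still
          |   # drags it toward its own mean of 0.75.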
        
         | klysm wrote:
         | What? Maybe in a very specific context where you are modeling
         | joint distributions of people and traits, but that's barely a
         | critique of the method itself.
        
           | tonymet wrote:
           | it's not a critique of the method
        
       | mjhay wrote:
       | The great thing about Bayesian statistics is that it's
       | subjective. You don't have to be in the subjectivist school. You
       | can choose your own interpretation based on your (subjective)
       | judgment.
       | 
       | I think this is a strength of Bayesianism. Any statistical work
       | is infused with the subjective judgement of individual humans. I
       | think it is more objective to not shy away from this immutable
       | fact.
        
         | klysm wrote:
         | The appropriateness of each approach is very much a function of
         | what is being modeled and the corresponding consequences for
         | error.
        
           | mjhay wrote:
           | Of course. The best approach for a particular problem depends
           | on your best judgment.
           | 
           | I guess that means I'm in the pragmatist school in this
           | article's nomenclature (I'm a big fan of Gelman and all the
           | other stats folks there), but what one thinks is pragmatic is
           | also subjective.
        
       | tfehring wrote:
       | The author is claiming that Bayesians vary along two axes: (1)
       | whether they generally try to inform their priors with their
       | knowledge or beliefs about the world, and (2) whether they
       | iterate on the functional form of the model based on its
       | goodness-of-fit and the reasonableness and utility of its
       | outputs. He then labels 3 of the 4 resulting combinations as
       | follows:
        |       +---------------+-----------+--------------+
        |       |               | iteration | no iteration |
        |       +---------------+-----------+--------------+
        |       | informative   | pragmatic | subjective   |
        |       | uninformative |     -     | objective    |
        |       +---------------+-----------+--------------+
       | 
       | My main disagreement with this model is the empty bottom-left box
       | - in fact, I think that's where most self-labeled Bayesians in
       | industry fall:
       | 
       | - Iterating on the functional form of the model (and therefore
       | the assumed underlying data generating process) is generally
       | considered obviously good and necessary, in my experience.
       | 
       | - Priors are _usually_ uninformative or weakly informative,
       | partly because data is often big enough to overwhelm the prior.
       | 
       | The need for iteration feels so obvious to me that the entire "no
       | iteration" column feels like a straw man. But the author, who
       | knows far more academic statisticians than I do, explicitly says
       | that he had the same belief and "was shocked to learn that
       | statisticians didn't think this way."
        
         | klysm wrote:
         | The no iteration thing is very real and I don't think it's even
         | for particularly bad reasons. We iterate on models to make them
         | better, by some definition of better. It's no secret that
         | scientific work is subject to rather perverse incentives around
         | thresholds of significance and positive results. Publish or
         | perish. Perverse incentives lead to perverse statistics.
         | 
         | The iteration itself is sometimes viewed directly as a problem.
         | The "garden of forking paths", where the analysis depends on
         | the data, is viewed as a direct cause for some of the
         | statistical and epistemological crises in science today.
         | 
         | Iteration itself isn't inherently bad. It's just that the
         | objective function usually isn't what we want from a scientific
         | perspective.
         | 
         | To those actually doing scientific work, I suspect iterating on
         | their models feels like they're doing something unfaithful.
         | 
          | Furthermore, I believe a lot of these issues are strongly
          | related to the flawed epistemological framework on which
          | many scientific fields seem to have converged: p<0.05
          | means it's true, otherwise it's false.
         | 
         | edit:
         | 
         | Perhaps another way to characterize this discomfort is by the
         | number of degrees of freedom that the analyst controls. In a
         | Bayesian context where we are picking priors either by belief
         | or previous data, the analyst has a _lot_ of control over how
         | the results come out the other end.
         | 
         | I think this is why fields have trended towards a set of
         | 'standard' tests instead of building good statistical models.
         | These take most of the knobs out of the hands of the analyst,
         | and generally are more conservative.
        
           | slashdave wrote:
            | In particle physics, it was quite fashionable (and may
            | still be) to iterate on blinded data (data deliberately
            | altered by a secret random number, and/or relying
            | entirely on Monte Carlo simulation).
        
             | klysm wrote:
              | Interesting, I wasn't aware of that. Another thing
              | I've only briefly read about is registering studies
              | in advance, quite literally preventing iteration.
        
             | bordercases wrote:
             | Yeah it's essentially a way to reflect parsimonious
             | assumptions so that your output distribution can be
             | characterized as a law.
        
           | j7ake wrote:
            | Iteration is necessary for any analysis. To safeguard
            | yourself from overfitting, be sure to have a holdout
            | dataset that isn't touched until the end.
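            | 
            | A minimal version of that safeguard (numpy; sizes are
            | arbitrary):
            | 
            |   import numpy as np
            | 
            |   rng = np.random.default_rng(0)
            |   idx = rng.permutation(1000)     # row indices
            | 
            |   # Lock away 20% before any model iteration begins.
            |   holdout, dev = idx[:200], idx[200:]
            |   # ...iterate freely on dev; score on holdout once.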
        
           | joeyo wrote:
            | > Iteration itself isn't inherently bad. It's just
            | > that the objective function usually isn't what we
            | > want from a scientific perspective.
           | 
           | I think this is exactly right and touches on a key difference
           | between science and engineering.
           | 
           |  _Science_ : Is treatment A better than treatment B?
           | 
           |  _Engineering_ : I would like to make a better treatment B.
           | 
           | Iteration is harmful for the first goal yet essential for the
           | second. I work in an applied science/engineering field where
           | both perspectives exist. (and are necessary!) Which specific
            | path is taken for any given experiment or analysis will
            | depend on which goal one is trying to achieve. Conflict
            | will sometimes arise when it's not clear which of these
            | two objectives is the important one.
        
             | jiggawatts wrote:
              | There is _no difference_ between comparing A versus B
              | and comparing B1 versus B2. The data collection
              | process and the mathematical methods are (typically)
              | identical, or subject to the same issues.
             | 
             | E.g.: profiling an existing application and tuning its
             | performance is comparing two products, it just so happens
             | that they're different versions of the same series. If you
             | compared it to a competing vendor's product you should use
             | the same mathematical analysis process.
        
         | Onavo wrote:
          | Interesting; in my experience modern ML runs almost
          | entirely on pragmatic Bayes. You find your ELBO, you
          | choose the latest latent-variable model du jour that best
          | fits your problem domain (these days it's all
          | transformers), and then you start running experiments.
        
           | tfehring wrote:
           | I think each category of Bayesian described in the article
           | generally falls under Breiman's [0] "data modeling" culture,
           | while ML practitioners, even when using Bayesian methods,
           | almost invariably fall under the "algorithmic modeling"
           | culture. In particular, the article's definition of pragmatic
           | Bayes says that "the model should be consistent with
           | knowledge about the underlying scientific problem and the
           | data collection process," which I don't consider the norm in
           | ML at all.
           | 
           | I do think ML practitioners in general align with the
           | "iteration" category in my characterization, though you could
           | joke that that miscategorizes people who just use (boosted
           | trees|transformers) for everything.
           | 
            | [0] https://projecteuclid.org/journals/statistical-science/volum...
        
       | derbOac wrote:
       | I never liked the clubs you were expected to put yourself in,
       | what "side" you were on, or the idea that problems in science
       | that we see today could somehow be reduced to the inferential
       | philosophy you adopt. In a lot of ways I see myself as
       | information-theoretic in orientation, so maybe objective
       | Bayesian, although it's really neither frequentist nor Bayesian.
       | 
        | This three cultures idea is a bit of sleight of hand in my
        | opinion, as the "pragmatic" culture isn't really exclusive
        | of subjective or objective Bayesianism, and in that sense
        | it says nothing about how you should approach prior
        | specification or interpretation. Maybe Gelman would say a
        | better term is "flexibility" or something, but then that
        | leaves the question of when you go objective, when you go
        | subjective, and why.
       | Seems better to formalize that than leave it as a bit of smoke
       | and mirrors. I'm not saying some flexibility about prior
       | interpretation and specification isn't a good idea, just that I'm
       | not sure that approaching theoretical basics with the answer
       | "we'll just ignore the issues and pretend we're doing something
       | different" is quite the right answer.
       | 
       | Playing a bit of devil's advocate too, the "pragmatic" culture
       | reveals a bit about _why_ Bayesianism is looked at with a bit of
       | skepticism and doubt.  "Choosing a prior" followed by "seeing how
       | well everything fits" and then "repeating" looks a lot like model
       | tweaking or p-hacking. I know that's not the intent, and it's
       | impossible to do modeling without tweaking, but if you approach
       | things that way, the prior just looks like one more degree of
       | freedom to nudge things around and fish with.
       | 
       | I've published and edited papers on Bayesian inference, and my
       | feeling is that the problems with it have never been in the
       | theory, which is solid. It's in how people use and abuse it in
       | practice.
        
       | bayesian_trout wrote:
        | If you want to get an informed opinion on modern
        | Frequentist methods, check out the book "In All Likelihood"
        | by Yudi Pawitan.
       | 
       | In an early chapter it outlines, rather eloquently, the
       | distinctions between the Frequentist and Bayesian paradigms and
       | in particular the power of well-designed Frequentist or
       | likelihood-based models. With few exceptions, an analyst should
       | get the same answer using a Bayesian vs. Frequentist model if the
       | Bayesian is actually using uninformative priors. In the worlds I
       | work in, 99% of the time I see researchers using Bayesian methods
       | they are also claiming to use uninformative priors, which makes
       | me wonder if they are just using Bayesian methods to sound cool
       | and skip through peer review.
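        | 
        | That equivalence is easy to see in a toy case (a sketch
        | with scipy; numbers are made up): with a flat prior, the
        | posterior mode coincides with the MLE.
        | 
        |   import numpy as np
        |   from scipy import stats
        | 
        |   k, n = 27, 100            # binomial data
        |   mle = k / n               # frequentist estimate
        | 
        |   # Flat Beta(1,1) prior: posterior is Beta(1+k, 1+n-k),
        |   # whose mode (1+k-1)/(2+n-2) is exactly k/n.
        |   grid = np.linspace(0.001, 0.999, 999)
        |   post = stats.beta(1 + k, 1 + n - k).pdf(grid)
        |   print(mle, grid[np.argmax(post)])  # both ~ 0.27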
       | 
       | One potential problem with Bayesian statistics lies in the fact
       | that for complicated models (100s or even 1000s of parameters) it
       | can be extremely difficult to know if the priors are truly
       | uninformative in the context of a particular dataset. One has to
       | wait for models to run, and when systematically changing priors
       | this can take an extraordinary amount of time, even when using
       | high powered computing resources. Additionally, in the Bayesian
       | setting it becomes easy to accidentally "glue" a model together
       | with a prior or set of priors that would simply bomb out and give
       | a non-positive definite hessian in the Frequentist world (read: a
       | diagnostic telling you that your model is likely bogus and/or too
       | complex for a given dataset). One might scoff at models of this
       | complexity, but that is the reality in many applied settings, for
       | example spatio-temporal models facing the "big n" problem or for
       | stuff like integrated fisheries assessment models used to assess
       | status and provide information on stock sustainability.
       | 
        | So my primary beef with Bayesian statistics (and I say this
        | as someone who teaches graduate-level courses on Bayesian
        | inference) is that it can very easily be misused by non-
        | statisticians and beginners, particularly given the
        | extremely flexible software currently available to non-
        | statisticians like biologists etc. In general, though, both
        | paradigms are subjective, and Gelman's argument that it is
        | turtles (i.e., subjectivity) all the way down is spot on
        | and really resonates with me.
        
         | usgroup wrote:
          | +1 for "In All Likelihood", but it should be noted that
          | the book explains a third approach which doesn't lean on
          | either subjective or objective probability.
        
           | bayesian_trout wrote:
           | fair :)
        
         | kgwgk wrote:
         | > So my primary beef with Bayesian statistics (...) is that it
         | can very easily be misused by non-statisticians and beginners
         | 
         | Unlike frequentist statistics? :-)
        
       | davidgerard wrote:
       | > Subjective Bayes
       | 
       | > I'm not sure if anyone ever followed this philosophy strictly,
       | nor do I know if anyone would register their affiliation as
       | subjective Bayesian these days.
       | 
       | lol the lesswrong/rationalist "Bayesians" do this _all the time_.
       | 
       | * I have priors
       | 
       | * YOU have biases
       | 
       | * HE is a toxoplasmotic culture warrior
        
       | prmph wrote:
       | So my theory is that probability is an ill-defined, unfalsifiable
       | concept. And yet, it _seems_ to model aspects of the world pretty
       | well, empirically. However, might it be leading us astray?
       | 
        | Consider the statement p(X) = 0.5 (the probability of event
        | X is 0.5). What does this actually mean? Is it a
        | proposition? If so, is it falsifiable? And how?
        | 
        | If it is not a proposition, what does it actually mean? If
        | someone with more knowledge can chime in here, I'd be
        | grateful. I've got much more to say on this, but only after
        | I hear from those with a rigorous grounding in the theory.
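        | 
        | For concreteness, the frequentist reading treats p(X) = 0.5
        | as a claim about a limiting relative frequency, which a
        | simulation can probe but never strictly falsify (a toy
        | sketch, numpy):
        | 
        |   import numpy as np
        | 
        |   rng = np.random.default_rng(0)
        |   flips = rng.random(1_000_000) < 0.5  # model: p(X) = 0.5
        | 
        |   # The observed frequency approaches 0.5, but any finite
        |   # sample is logically consistent with other values, so
        |   # the claim is only ever tested statistically.
        |   print(flips.mean())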
        
       ___________________________________________________________________
       (page generated 2024-07-26 23:00 UTC)