[HN Gopher] The Surprising Predictability of Long Runs (2012) [pdf]
       ___________________________________________________________________
        
       The Surprising Predictability of Long Runs (2012) [pdf]
        
       Author : alexmolas
       Score  : 37 points
       Date   : 2024-10-11 14:40 UTC (2 days ago)
        
 (HTM) web link (www.csun.edu)
 (TXT) w3m dump (www.csun.edu)
        
       | nuancebydefault wrote:
       | I once saw a chart on some website showing the distribution of
       | flat-tire events. Often one goes 10 years without a flat and then
       | suddenly gets 2 or 3 in a single year. Mathematically, the chance
       | of such clustering is quite high.
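
A rough sketch of that clustering claim, assuming flat tires arrive as a Poisson process. The rate of 0.2 flats per year and the 30-year driving span are hypothetical values chosen for illustration; neither comes from the comment or the chart it mentions.

```python
import math


def prob_no_events(rate_per_year, years):
    """P(zero events in `years` years) for a Poisson process."""
    return math.exp(-rate_per_year * years)


def prob_cluster_year(rate_per_year, years, min_events=2):
    """P(at least one of `years` calendar years has >= min_events events)."""
    # Probability that a single year has fewer than min_events events.
    p_quiet = sum(
        math.exp(-rate_per_year) * rate_per_year ** k / math.factorial(k)
        for k in range(min_events)
    )
    # Disjoint years are independent under the Poisson assumption.
    return 1.0 - p_quiet ** years


if __name__ == "__main__":
    rate = 0.2  # hypothetical: one flat tire every five years on average
    print(f"P(no flat in 10 years)              = {prob_no_events(rate, 10):.2f}")
    print(f"P(some year with 2+ flats in 30 yr) = {prob_cluster_year(rate, 30):.2f}")
```

Under these assumed numbers, a flat-free decade happens roughly 14% of the time and a year with two or more flats shows up in about 41% of 30-year spans, so neither pattern is unusual.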
        
       | fastaguy88 wrote:
       | One of the major breakthroughs in Bioinformatics was the
       | recognition that local similarity scores (which can be thought of
       | as runs of positive sequence similarity) are extreme-value
       | distributed.[0] The logic of that discovery uses almost exactly
       | the same mathematical argument as this paper [1], indeed I
       | recognized some of the same equations.
       | 
       | It is difficult to overstate the importance of this discovery for
       | biology, as today, the vast vast majority of protein functional
       | inferences for newly sequenced genomes are based on the
       | statistics of long runs of sequence similarity.
       | 
       | [0] https://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html
       | [1] https://www.pnas.org/doi/epdf/10.1073/pnas.87.6.2264
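
The parallel between the two results can be sketched as below. The symbols K, lambda, m and n follow the usual Karlin-Altschul notation and are not defined in the comment itself; K and lambda depend on the scoring system and sequence composition.

```latex
% Expected number of success runs of length >= x in n Bernoulli(p) trials, q = 1 - p:
E(x) \approx n\,q\,p^{x} = n\,q\,e^{-\lambda x}, \qquad \lambda = \ln(1/p)

% Setting E(x) = 1 gives the typical longest run:
x = \log_{1/p}(n q)

% Karlin-Altschul form for a local alignment score S between sequences of lengths m and n:
E(S) = K\,m\,n\,e^{-\lambda S}, \qquad
P(S' \ge S) \approx 1 - \exp\!\left(-K\,m\,n\,e^{-\lambda S}\right)
```

In both cases the expected count of high-scoring events decays exponentially in the score, which is what leads to an extreme-value distribution for the maximum.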
        
       | wenc wrote:
       | This is an interesting finding. There are two takeaways from the
       | paper.
       | 
       | 1. The typical length L of the longest streak in an independent
       | Bernoulli process with success probability p (with q = 1-p) over
       | n trials can easily be calculated.
       | 
       | L = log_{1/p} (n*q)
       | 
       | 2. This estimate becomes more accurate as p decreases, because
       | the distribution of L is an extreme value distribution which gets
       | more concentrated as p decreases.
       | 
       | This means that for low values of p, L becomes more predictable.
       | 
       | I don't know how this result will change my life, but at least
       | now I know that I can predict streaks if I know p.
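
A minimal simulation sketch of the two takeaways: it compares the estimate log_{1/p}(n*q) with simulated longest streaks and prints their spread for two values of p. The function names, trial count, repeat count and seed are illustrative choices, not taken from the paper.

```python
import math
import random


def predicted_longest_run(n, p):
    """Estimate quoted in the thread: log base 1/p of (n * q), with q = 1 - p."""
    q = 1.0 - p
    return math.log(n * q) / math.log(1.0 / p)


def longest_run(trials):
    """Length of the longest unbroken run of truthy values."""
    best = current = 0
    for success in trials:
        current = current + 1 if success else 0
        best = max(best, current)
    return best


def simulate_longest_runs(n, p, repeats, seed=0):
    """Longest success streak in each of `repeats` sequences of n Bernoulli(p) trials."""
    rng = random.Random(seed)
    return [longest_run(rng.random() < p for _ in range(n)) for _ in range(repeats)]


def mean_and_std(values):
    """Mean and population standard deviation of a list of numbers."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    n, repeats = 10_000, 100
    for p in (0.5, 0.1):
        mean, std = mean_and_std(simulate_longest_runs(n, p, repeats))
        print(f"p={p}: predicted {predicted_longest_run(n, p):.2f}, "
              f"simulated mean {mean:.2f}, std dev {std:.2f}")
```

The spread of the simulated longest streak should come out noticeably smaller for p = 0.1 than for p = 0.5, which is the concentration effect described in point 2.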
        
         | jonathan_landy wrote:
         | First thing it makes me want to do is qualify success rates
         | among individuals, e.g. investors. Some are quite successful,
         | but more so than you'd expect given equal randomness?
        
       | treetalker wrote:
       | Long streaks, not runners' long runs (which are also surprisingly
       | predictable).
        
         | mcswell wrote:
         | Unless of course a streaker does a long run.
        
       | DiscourseFan wrote:
       | Interesting, but the paper's methodology suffers in certain
       | respects by conflating real probabilities with theoretical
       | probabilities.
       | 
       | Roulette, for instance, is only _theoretically_ 1/38, but in
       | actuality all roulette tables have imperfections such that
       | certain numbers almost always get hit more than others; even
       | certain colors, under extraordinary circumstances.
       | 
       | One could say: well, but isn't this the case for all
       | probabilities? Not so: in the case of the lottery, the spread of
       | numbers people tend to choose may not be so random, but the
       | drawing itself is as close to random as possible. A run on the
       | lottery is very different from a run on a roulette table and a
       | run in baseball, or even a run in elections: there are forces,
       | even if they aren't necessarily measurable, that determine these
       | things and strict probabilistic analysis has no hold on these
       | forces. It's almost certainly the case that a hurricane will hit
       | Florida in September of 2025: even though nobody can precisely
       | predict it, nobody would bet against it. It's just the same way
       | with almost all chance in society, except for that which is
       | already controlled from the outset.
        
         | notahacker wrote:
         | > actuality all roulette tables have imperfections such that
         | certain numbers almost always get hit more than others; even
         | certain colors, under extraordinary circumstances
         | 
         | Seems unlikely these imperfections are enough to shift it
         | _significantly_ from 1/38. Any variation in the geometry of a
         | roulette table small enough to be non-obvious is tiny compared
         | with the variation in croupier action, and casinos would likely
         | _notice_ any very-long-run deviation in the size of their edge
         | (which is contingent upon customers hitting the zero pocket(s)
         | with a certain frequency).
        
           | conformist wrote:
           | Yes, but it's potentially more subtle - there's a competition
           | between professional players and casinos, in particular
           | online. There's an interesting Bloomberg story on this:
           | https://www.bloomberg.com/news/newsletters/2023-04-06/meet-n...
        
       | SoftTalker wrote:
       | Randomness doesn't look random.
        
       ___________________________________________________________________
       (page generated 2024-10-13 22:01 UTC)