[HN Gopher] The unreasonable effectiveness of conditional probabilities
___________________________________________________________________
The unreasonable effectiveness of conditional probabilities
Author : kqr
Score : 134 points
Date : 2023-02-22 16:04 UTC (6 hours ago)
(HTM) web link (two-wrongs.com)
(TXT) w3m dump (two-wrongs.com)
| Hitton wrote:
| The first problem is known as "Coupon collector's problem".
| [deleted]
| ketanmaheshwari wrote:
| The unreasonable effectiveness of articles having the phrase
| "Unreasonable Effectiveness" in their title on HN:
| https://hn.algolia.com/?q=unreasonable+effectiveness
| nickponline wrote:
| I wrote a blog about the general case
| https://nickp.svbtle.com/general-collectors-problem
| gnulinux wrote:
| Other commenters made other points, but I also want to note
| that this article is terrible for another reason. For the first
| problem, they worked the math all the way up to deriving the
| recursive expression, and only then moved to computing the value
| with Python. The reason this is bad is that writing the Python
| code to calculate ~8.3564 was easy in the first place. They did
| basic mathematical analysis that wasn't even necessary, then
| moved to code to actually solve the problem. To be instructional,
| they could have computed the 8.333 solution mathematically (as
| e.g. dgacmu or vecter in this thread did) and only then moved to
| Python for an alternative solution. As written, it's not only
| confusing but also incomplete for both approaches.
| photochemsyn wrote:
| While I wouldn't dream of posting ChatGPT content, here are some
| _questions_ you could give to ChatGPT to make this all a fair
| bit clearer:
|
| First, what is an expectation value, as defined in the field of
| statistics?
|
| Second, is there some reason an expectation value wouldn't behave
| like a strange attractor from dynamical systems theory and sort
| of wander about, rather than settle down to a precise value?
|
| Third, what's the connection (if any) between conditional
| probability and expectation values?
|
| For the first example, a more interesting real world-problem
| would be one where the total number of different outcomes (toys)
| is unknown - say, counting the number of species of beetle that
| exist on planet Earth by steadily sampling all the habitats one
| can find - how long would the tails be?
|
| Variations of the latter examples might be odds that vary over
| time in conjunction with costs-of-entry that vary over time -
| what's the acceptable range of odds & costs to justify playing
| with some expectation of at least coming out even? That's a bit
| more like real-world markets.
| throwaway81523 wrote:
| I think you were downvoted (not by me) because your questions,
| except for the one about strange attractors, are about basic
| topics in probability theory. So you should probably read a
| book or take a class. The one about strange attractors is more
| interesting and maybe not studied enough, but you could look at
| https://en.wikipedia.org/wiki/El_Farol_Bar_problem for a
| related example.
| dgacmu wrote:
| I find this a quite terrible explanation of a fairly simple
| concept, but perhaps I haven't had enough coffee. Here's an
| alternative try:
|
| First, for the computer scientists out there, this is the Coupon
| Collector's problem - so you probably know already it's Θ(N log
| N). That doesn't give us a precise answer but it should guide our
| intuition - it should take _something_ like 4 * log(4) ~= 8 draws
| to get what we want. [1]
|
| Being exact:
|
| When you already have 3 of the 4 toys, what's the chance you
| get the 4th? It's 1 in 4. The expected number of draws to get an
| item with P=1/4 is 4.
|
| Generalizing, the expected number of tries to get an item with
| probability p is 1/p. When you have 2 toys, you have a 1/2
| probability of getting one of the two you haven't seen already,
| etc. When you have no toys, you always get something you don't
| have.
|
| So the only slightly tricky case is when you have 1 toy, where
| you have a 3 in 4 chance of getting something new. 1/(3/4) = 1
| 1/3.
|
| So the answer is very simply 4 + 2 + 1 1/3 + 1 = 8.3333.
|
| More on the Coupon Collector's Problem:
| https://en.wikipedia.org/wiki/Coupon_collector%27s_problem
|
| [1] The actual bounds use ln, not log2, but from a "getting
| intuition" perspective log2 gets us into the right range.
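[Editor's note: dgacmu's arithmetic (1 + 4/3 + 2 + 4) can be sanity-checked in a couple of lines of Python. This is a sketch, not code from the article; the function name is mine.]

```python
def expected_draws(n):
    # With k distinct toys in hand, the chance the next draw is new is
    # (n - k) / n, so the expected wait for that new toy is n / (n - k).
    return sum(n / (n - k) for k in range(n))

print(expected_draws(4))  # 1 + 4/3 + 2 + 4 = 8.333...
```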
| amluto wrote:
| I agree - the OP's explanation is terrible. What does this even
| mean:
|
| > From a fundamentals perspective, the expected number of meals
| is a function of the probability of finding the fourth toy in
| any specific meal.
|
| I can barely parse it. The probability of finding the nth toy
| before the nth meal is 0, so is the answer a function of 0? Or
| maybe the probability of finding the toy with index 4 in a
| specific meal is 1/4, and the answer is a function of 1/4?
|
| Anyway, if you drink another coffee and stop reducing your
| fractions, you get a nicer way to say the answer :) . After
| getting n different toys, the probability of a duplicate on the
| next try is n/4, so the probability of a new toy is 1-n/4 =
| (4-n)/4, and it takes 4/(4-n) draws on expectation to go from n
| to n+1 toys. So to go from 0 toys to 4 toys takes 4 * (1/4 +
| 1/3 + 1/2 + 1/1) tries on expectation, because you can just add
| the expected number of tries for each new toy.
| kgwgk wrote:
| >> From a fundamentals perspective, the expected number of
| meals is a function of the probability of finding the fourth
| toy in any specific meal.
|
| > I can barely parse it. The probability of finding the nth
| toy before the nth meal is 0, so is the answer a function of
| 0?
|
| The probability of finding the fourth toy in the first meal
| is 0%.
|
| The probability of finding the fourth toy in the second meal
| is 0%.
|
| The probability of finding the fourth toy in the third meal
| is 0%.
|
| The probability of finding the fourth toy in the fourth meal
| is 9.4% or whatever.
|
| The probability of finding the fourth toy in the fifth meal
| is 14.1% or whatever.
|
| Etc.
|
| The expected number of meals is 1 x 0% + 2 x 0% + 3 x 0% + 4
| x 9.4% + 5 x 14.1% + 6 x ... + ...
| [deleted]
| kgwgk wrote:
| More precisely:
|
| The probability of finding the fourth toy in the first meal
| is 0%.
|
| The probability of finding the fourth toy in the second
| meal is 0%.
|
| The probability of finding the fourth toy in the third meal
| is 0%.
|
| The probability of finding the fourth toy in the fourth
| meal is 3/32 = 9.4%.
|
| The probability of finding the fourth toy in the fifth meal
| is 9/64 = 14.1%.
|
| The probability of finding the fourth toy in the sixth meal
| is 75/512 = 14.6%.
|
| The probability of finding the fourth toy in the seventh
| meal is 135/1024 = 13.2%.
|
| The probability of finding the fourth toy in the eighth
| meal is 903/8192 = 11.0%.
|
| The probability of finding the fourth toy in the ninth meal
| is 1449/16384 = 8.8%.
|
| Etc.
|
| The expected number of meals is 1 x 0% + 2 x 0% + 3 x 0% +
| 4 x 9.4% + 5 x 14.1% + 6 x 14.6% + 7 x 13.2% + 8 x 11.0% +
| 9 x 8.8% + ...
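[Editor's note: those per-meal probabilities can be generated exactly with a short dynamic program over how many distinct toys you currently hold. A sketch; the function name is mine, not from the thread.]

```python
from fractions import Fraction

def completion_probs(n, meals):
    """P(the last of n toys arrives at exactly meal m), for m = 1..meals."""
    dist = [Fraction(0)] * n   # dist[k] = P(currently hold k distinct toys)
    dist[0] = Fraction(1)
    out = []
    for _ in range(meals):
        new = [Fraction(0)] * n
        done = Fraction(0)
        for k, p in enumerate(dist):
            new[k] += p * Fraction(k, n)              # duplicate: stay at k
            if k + 1 < n:
                new[k + 1] += p * Fraction(n - k, n)  # new toy: k -> k+1
            else:
                done += p * Fraction(1, n)            # drew the final missing toy
        out.append(done)
        dist = new
    return out

for meal, p in enumerate(completion_probs(4, 9), start=1):
    print(meal, p, float(p))
```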
| onos wrote:
| A way to see the scaling this commenter quoted:
|
| Suppose there are on average N items left to be seen at time t.
| Then using a continuous time approximation:
|
| dN/dt = -N / N_tot
|
| This gives
|
| -dN / N = dt / N_tot
|
| Integrate from N_tot down to a small value a on the left, and
| from 0 to t on the right:
|
| log(N_tot / a) = t / N_tot,
|
| Solving for t gives the scaling quoted.
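[Editor's note: numerically, the exact expectation N·H_N and this continuum estimate N·ln(N/a) (taking a = 1) track each other as N grows. A quick sketch, not from the thread:]

```python
import math

def exact_expected(n):
    # Exact coupon-collector answer: n * H_n (harmonic number)
    return n * sum(1 / k for k in range(1, n + 1))

def continuum_estimate(n, a=1):
    # t = N_tot * log(N_tot / a) from the ODE above
    return n * math.log(n / a)

for n in (4, 100, 10_000):
    print(n, round(exact_expected(n), 2), round(continuum_estimate(n), 2))
```

The ratio approaches 1 slowly (the gap is the Euler–Mascheroni constant times N), so this is a scaling argument rather than a precise answer.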
| vecter wrote:
| To put it in a different way, construct a Markov chain with
| these states and transitions:
|
| (A_0) -> (A_1) -> (A_2) -> (A_3) -> (A_4)
|
| State A_i = you have i unique toys. Each state from 1-4 also
| has a transition to itself.
|
| The probability of going from A_i -> A_{i+1} = P(A_{i,i+1}) =
| (1 - i/4), which gives 1, 3/4, 2/4, and 1/4. The transition
| probability of staying in the same state is the complement,
| i/4, although we don't use those values in our calculations.
|
| The expected time to reach A_4 is E[A_0 -> A_1] + E[A_1 -> A_2]
| + E[A_2 -> A_3] + E[A_3 -> A_4] = 1/P(A_{0,1}) + 1/P(A_{1,2}) +
| 1/P(A_{2,3}) + 1/P(A_{3,4}) = 1 + 4/3 + 2 + 4 = 8 1/3. This is
| true due both to linearity of expectation and the fact that
| each state only either stays the same or advances. If any state
| could transition to another state besides itself or the
| subsequent one, you'd have to do a more complex calculation.
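[Editor's note: for chains where states can also move backwards or skip ahead, the general tool is the linear system (I - Q)t = 1, where Q is the transient-to-transient transition block. A minimal sketch of such a solver (written for illustration; on this chain it reproduces 8 1/3):]

```python
def expected_absorption_time(Q):
    # Solve (I - Q) t = 1 by Gauss-Jordan elimination; t[i] is the
    # expected number of steps to absorption starting from state i.
    m = len(Q)
    A = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(m)] + [1.0]
         for i in range(m)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(m):
            if r != col and A[r][col] != 0.0:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [A[i][m] / A[i][i] for i in range(m)]

# Transient states A_0..A_3: stay with probability i/4, advance with (4-i)/4.
# (The advance out of A_3 leads to the absorbing state A_4, so it is not in Q.)
Q = [[i / 4 if j == i else ((4 - i) / 4 if j == i + 1 else 0.0)
     for j in range(4)] for i in range(4)]
print(expected_absorption_time(Q)[0])  # ≈ 8.3333
```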
| tunesmith wrote:
| After 9 purchases, what's the percentage certainty you'd have
| all four toys? If you had to pre-purchase all n at once from
| the outset, what's the value of n to be 99.9% certain you'd get
| all four?
| travisjungroth wrote:
| Estimates from 67 million simulations
|
| After 9: 71.13%
|
| p99.9: 29
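[Editor's note: a smaller version of such a simulation, as a sketch; with far fewer trials the estimates are rougher than the 67-million-run numbers above.]

```python
import random

def meals_to_complete(n, rng):
    # Draw uniformly with replacement until all n toys have been seen.
    seen, meals = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        meals += 1
    return meals

def estimate(trials=100_000, seed=0):
    rng = random.Random(seed)
    draws = sorted(meals_to_complete(4, rng) for _ in range(trials))
    share_done_by_9 = sum(d <= 9 for d in draws) / trials
    p999 = draws[int(0.999 * trials)]
    return share_done_by_9, p999

done_by_9, p999 = estimate()
print(done_by_9, p999)
```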
| tunesmith wrote:
| Thanks for doing that. I suspected those answers were very
| different than the 8.3, which I'm guessing must have been
| around the 50th percentile. The way the initial question
| was asked, it seemed to suggest you had a pretty high
| probability of all four after 8.3, which isn't true.
| wizofaus wrote:
| Doesn't it depend on the number and distribution of coupons
| printed/toys manufactured, though? To take an extreme example: if
| there are 4 different toys and only 10 of each are
| manufactured, then the probability of getting 4 all the same is
| surely somewhat lower than it would be if you assume
| essentially an infinite toy supply where the chance of getting
| a particular toy is always 1/4.
| mlyle wrote:
| We're effectively assuming that the supply of toys is
| infinite and that each toy is equally likely.
|
| Or, to put it slightly more formally, we're choosing each
| time from 4 equally likely possibilities, with replacement.
| Imagine you're drawing from a bag with 4 toys inside. How
| long until you've drawn each of the 4 toys, if you replace
| the toy in the bag after each choice?
| whack wrote:
| My initial hunch was also to calculate the answer as 1 + 4/3 +
| 2 + 4. Ie, "average tries to get 1st toy, plus average tries to
| get 2nd toy, plus ...."
|
| However, my memory of probability is very fuzzy and I wasn't
| sure if my above intuition was mathematically rigorous. Ie, can
| the expected values of these 4 events be simply summed in order
| to get the combined expected value?
|
| For anyone else curious about this, here's the mathematical
| proof that you can indeed do so. Ie, proof that E(X+Y) = E(X) +
| E(Y). Where X can be defined as the number of tries to get the
| 1st toy, Y is the number of tries to get the 2nd toy, and so
| on: https://www.milefoot.com/math/stat/rv-sums.htm
|
| In this specific example, X and Y are independent random
| variables. But if my reading of the above proof is correct, you
| can still sum the expected values even if X and Y happen to be
| correlated.
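[Editor's note: that last point — linearity of expectation needs no independence — is easy to sanity-check by brute-force enumeration of a tiny, deliberately dependent example. A sketch illustrating the general fact, not the toy problem:]

```python
from fractions import Fraction
from itertools import product

# Two fair coin flips; X = first flip, Y = X * second flip, so Y depends on X.
outcomes = list(product([0, 1], repeat=2))
p = Fraction(1, len(outcomes))
E_X = sum(p * a for a, b in outcomes)
E_Y = sum(p * a * b for a, b in outcomes)
E_sum = sum(p * (a + a * b) for a, b in outcomes)
print(E_X, E_Y, E_sum)  # E[X + Y] = E[X] + E[Y] despite the dependence
```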
| vecter wrote:
| Linearity of expectation is of course true in general, but in
| this case summing the expectations of the individual
| transition probabilities is only valid because the only valid
| state transitions are going from state i -> i+1 (owning i
| unique toys to i+1 unique toys) or just staying in state i.
| If you could magically go from state i to any other state
| (you lose unique toys or gain more than one unique toy), then
| this simple calculation would not be correct.
| allanrbo wrote:
| Published 2023-03-18. Futuristic :-)
| wirrbel wrote:
| Conditional probabilities are reasonably effective because they
| are the mathematical underpinning of probabilities.
|
| They are not unreasonably effective.
|
| It's just that sampling from probability distributions is
| computationally expensive, which has led to the development of
| frequentist statistics (as opposed to Bayesian statistics).
| Frequentist statistics is unreasonably effective, since it makes
| so many assumptions that it's hard to do a frequentist statistical
| analysis without violating some of the assumptions the methods
| make about the data distributions, etc.
| [deleted]
| travisjungroth wrote:
| On the first problem, if you're already going to estimate the
| answer with finite iterations over an infinite series, I think
| you should just skip ahead to the simulation that was used to
| test it.
|
| The header for the test section says "Validation" but that's not
| what's happening. It's not taking some answer and seeing if it's
| correct. It's coming to the answer in a different way, in a way
| that's faster to write, faster to run and easier to understand
| than the other code.
|
| Easier to extend and modify, too. Uneven weights for the toys?
| Easy. Want the P99 case? The standard deviation? Easy. Simulate
| it ten million times and see what comes out.
|
| I've just been hyped on simulations over calculations for
| probabilities lately and this is wonderful confirmation bias.
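[Editor's note: for example, uneven toy weights are a one-argument change to a simulation. A sketch; the function names are mine.]

```python
import random

def meals_until_complete(weights, rng):
    # Weighted draws with replacement until every toy has been seen.
    n = len(weights)
    seen, meals = set(), 0
    while len(seen) < n:
        seen.add(rng.choices(range(n), weights=weights)[0])
        meals += 1
    return meals

def summarize(weights, trials=20_000, seed=1):
    rng = random.Random(seed)
    draws = sorted(meals_until_complete(weights, rng) for _ in range(trials))
    mean = sum(draws) / trials
    p99 = draws[int(0.99 * trials)]
    return mean, p99

print(summarize([1, 1, 1, 1]))  # mean ≈ 8.33 for equal weights
print(summarize([3, 1, 1, 1]))  # rarer toys push both numbers up
```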
| danuker wrote:
| "The Unreasonable Effectiveness of Monte Carlo Methods"
| LegionMammal978 wrote:
| Depending on how much precision you'd like for your answer,
| Monte Carlo methods can converge unbearably slowly, since the
| standard deviation is proportional to 1/sqrt(N) as you add more
| samples. I recall once having to run 100 billion simulations to
| get the mean of a certain quantity to within a few decimal
| places. I'd take an infinite series over that any day of the
| week (well, as long as it's not something as slow as the
| Leibniz series for pi/4).
|
| Of course, such precision relative to the standard deviation is
| rarely necessary for most practical situations.
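[Editor's note: the 1/sqrt(N) behavior is easy to see directly. A sketch that Monte-Carlo-estimates E[U^2] = 1/3 for U uniform on [0, 1] and reports the standard error at growing sample sizes:]

```python
import random
import statistics

def mc_mean(f, n, rng):
    xs = [f(rng.random()) for _ in range(n)]
    # Standard error of the mean shrinks like sigma / sqrt(n).
    return statistics.fmean(xs), statistics.stdev(xs) / n ** 0.5

rng = random.Random(42)
results = {n: mc_mean(lambda u: u * u, n, rng)
           for n in (100, 10_000, 1_000_000)}
for n, (est, se) in results.items():
    print(n, est, se)
```

Each 100x increase in samples buys only one more decimal digit, which is exactly the "unbearably slow" convergence described above.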
| dekhn wrote:
| As a friend in grad school used to say... "An infinitely long
| Monte Carlo simulation recapitulates the underlying
| probability distribution perfectly, but we don't want to wait
| that long."
| travisjungroth wrote:
| Yeah, it's the practicality. Most business decisions don't
| need more than three sig figs [source?]. We only tend to think
| about precision on things that already have it.
| 0cf8612b2e1e wrote:
| Unless you are trying to prove the existence of a subatomic
| particle, I am not sure what business decisions have
| anything close to that level of certainty.
| 0cf8612b2e1e wrote:
| Just making a simulation and codifying assumptions is hugely
| helpful. Forcing SMEs to rationalize a number and its spread
| has so much more utility than a closed form calculation which
| people have to take on faith.
| jrumbut wrote:
| Yeah I think simulations are massively underused.
|
| They might not help you prepare for a black swan type event,
| but a lot of times we're not ready for very normal problems. I
| suspect a day or two per year spent doing basic modeling and
| simulation would benefit a lot of teams, even back of the
| envelope stuff. Then you can return to the exercise a year later
| and see if your assumptions were on point or what turned out to
| be much different than expected and refine the model and
| parameters of the simulation.
| college_physics wrote:
| > They might not help you prepare for a black swan type event
|
| at some point probabilistic simulations were all the rage in
| the financial industry and then the financial crisis happened
|
| it is a fine art to figure out how to elicit the benefits of
| Monte Carlo simulation to put some structure around the
| future without falling into a mental trap that blinds you to
| non-quantifiable scenarios
| travisjungroth wrote:
| They're actually great for preparing for black swan events.
| talks about them in "The Black Swan".
|
| Usually the rules of the game aren't as clearly stated as in
| this problem. Sometimes there is a literal coupon contest, or
| a node randomly pairs with another node. But usually you're
| looking at history to try to understand the odds and then
| forecast some futures. The problem with doing that with
| statistics is it's not very receptive to "fiddling", even if
| it's Bayesian. With a simulation it's generally easier to
| throw something weird in there, like an effect orders of
| magnitude larger, dynamic effects, or a totally new rule.
| __anon-2023__ wrote:
| For the collectible toy problem, how much money would you save if
| you paired up with someone else and purchased meals until you
| both had all 4 toys? My intuition says going from 1 collector to
| 2 collectors would help significantly (like queuing theory).
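[Editor's note: one way to make this concrete — under the assumption (mine, not the commenter's) that the pair pools every purchase and trades duplicates freely, so together they need two copies of each toy:]

```python
import random

def meals_for_group(copies_needed, n_toys, rng):
    # The pooled group is done once every toy has shown up
    # `copies_needed` times across their combined meals.
    counts = [0] * n_toys
    meals = 0
    while min(counts) < copies_needed:
        counts[rng.randrange(n_toys)] += 1
        meals += 1
    return meals

def average(copies_needed, trials=20_000, seed=2):
    rng = random.Random(seed)
    return sum(meals_for_group(copies_needed, 4, rng)
               for _ in range(trials)) / trials

solo = average(1)      # one collector: about 8.33 meals
pair = average(2) / 2  # per-person cost when pooling
print(solo, pair)
```

Under that assumption the per-person cost does drop noticeably below the solo 8.33, matching the queueing-theory intuition: one person's duplicate is often the other's missing toy.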
| jonnycomputer wrote:
| Pet peeve. Just exactly what about it is unreasonable?
| sedundnes wrote:
| For "Game of Stopped Dice" the article states "we would keep our
| winnings on the first throw only if they are 5 or 6". Would it
| not be better to cash out even at $4 and play again using those
| winnings?
___________________________________________________________________
(page generated 2023-02-22 23:00 UTC)