[HN Gopher] Adventures in Probability
       ___________________________________________________________________
        
       Adventures in Probability
        
       Author : kiyanwang
       Score  : 144 points
       Date   : 2024-11-04 07:25 UTC (7 days ago)
        
 (HTM) web link (buttondown.com)
 (TXT) w3m dump (buttondown.com)
        
       | vdvsvwvwvwvwv wrote:
       | This was a lot of fun. By skipping over the formulaic details and
       | proof and explaining the lay of the land, it makes a good
       | starting point to explore further.
       | 
       | Either mathematically or just some python dice rolls.
       | 
       | Really good. Same ethos as fast.ai courses.
       | 
       | Finally, that you can add the /3 and /5 to get a /8 distribution
       | makes intuitive sense to me. / means lambda.
       | 
       | This is because if you have people arriving at a train station
       | you could split them by eye colour, and there is no reason a
       | particular eye colour would cause dependencies. (Assuming
       | spherical cows: families arriving excepted. Assume it is downtown
       | rush hour.)
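
The additivity claim is easy to check with a few Python dice rolls. This is a minimal sketch, assuming rates of 3 and 5 events per unit time (the /3 and /5 above) and a hypothetical helper `poisson_arrivals`. It merges two independent processes and compares the merged stream with what a rate-8 process should look like.

```python
import random

def poisson_arrivals(rate, horizon, rng):
    """Arrival times of a Poisson process on [0, horizon), built from exponential gaps."""
    times = []
    t = rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times

rng = random.Random(0)
horizon = 10_000.0
a = poisson_arrivals(3.0, horizon, rng)
b = poisson_arrivals(5.0, horizon, rng)
merged = sorted(a + b)  # superposition: just pool the two event streams

merged_rate = len(merged) / horizon
gaps = [t2 - t1 for t1, t2 in zip(merged, merged[1:])]
mean_gap = sum(gaps) / len(gaps)
print(f"merged rate ~ {merged_rate:.3f} (expected 8)")
print(f"mean gap    ~ {mean_gap:.4f} (expected 1/8 = 0.125)")
```

The check only looks at the overall rate and the mean gap, so it is a sanity test rather than a proof that the merged stream is Poisson.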
        
       | asah wrote:
       | I can't help but wonder if real systems have additional (perhaps
       | subtle) signals, which can be provided to a neural network, which
       | then outperforms these simple algorithms.
       | 
       | For example, customers arrive at the grocery store in clusters
       | due to traffic lights, schools getting out, etc. Even without
       | direct signals, a NN could potentially pickup on these "rules"
       | given other inputs, e.g. time of day, weather, etc.
       | 
       | ?
        
         | vdvsvwvwvwvwv wrote:
         | Lgtm; A NN is literally a probability distribution producer.
        
         | tech_ken wrote:
         | > For example, customers arrive at the grocery store in
         | clusters due to traffic lights, schools getting out, etc.
         | 
         | You're kind of just describing seasonality components and
         | exogenous regressors; RNNs do actually function quite well for
         | demand forecasting of this type but even simple models (Holt-
         | Winters or a Bayesian state space model or something) can be
         | really effective
        
       | shiandow wrote:
       | Poisson processes are neat, they always end up working nicely in
       | ways that many other distributions/processes very much don't.
       | 
       | Splitting a Poisson process into two lower rate processes is a
       | neat trick. Even better is that you can do the same to convert a
       | Poisson process into one with a _variable_ rate, provided that
       | rate is lower than the original (original may be variable as
       | well).
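
The variable-rate trick is usually called thinning. This is a minimal sketch, assuming a constant base rate of 10 and a made-up rate function `target_rate(t)` that never exceeds it. Each base event at time t is kept with probability target_rate(t) / base_rate.

```python
import math
import random

def thin(base_times, base_rate, target_rate, rng):
    """Keep each event at time t with probability target_rate(t) / base_rate."""
    return [t for t in base_times if rng.random() < target_rate(t) / base_rate]

def target_rate(t):
    # Hypothetical variable rate, always between 1 and 7 (below the base rate of 10).
    return 4.0 + 3.0 * math.sin(t)

rng = random.Random(1)
base_rate = 10.0
horizon = 2_000 * math.pi  # a whole number of periods, so the sine term averages out

# Base process: homogeneous Poisson arrivals via exponential gaps.
base_times = []
t = rng.expovariate(base_rate)
while t < horizon:
    base_times.append(t)
    t += rng.expovariate(base_rate)

kept = thin(base_times, base_rate, target_rate, rng)
expected = 4.0 * horizon  # integral of target_rate over [0, horizon]
busy = sum(1 for t in kept if math.sin(t) > 0)   # events in the high-rate half-periods
quiet = len(kept) - busy                         # events in the low-rate half-periods
print(len(kept), "events kept; expected about", round(expected))
print("high-rate halves:", busy, "low-rate halves:", quiet)
```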
       | 
       | And the fact that the partial sums of a bunch of exponential
       | distributions results in the same distribution of values as
       | picking Poisson(lambda * time) values uniformly at random is pure
       | magic.
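
That equivalence can be checked numerically. This is a minimal sketch, assuming a rate of 2 and a window of length 5 (both arbitrary). It builds the event times once from cumulative exponential gaps, and once by drawing a Poisson(lambda * time) count and scattering that many uniform points. It then compares the count of events landing in a sub-interval, which should have mean and variance near 2 either way. The helper names are made up for illustration.

```python
import math
import random

RATE, WINDOW, TRIALS = 2.0, 5.0, 20_000
rng = random.Random(2)

def by_exponential_gaps():
    """Event times in [0, WINDOW) as partial sums of exponential gaps."""
    times, t = [], rng.expovariate(RATE)
    while t < WINDOW:
        times.append(t)
        t += rng.expovariate(RATE)
    return times

def poisson_draw(mean):
    """Poisson sample by inverting the CDF (adequate for small means)."""
    u, k = rng.random(), 0
    p = math.exp(-mean)
    cdf = p
    while u > cdf and k < 200:  # the cap guards against floating-point stalls
        k += 1
        p *= mean / k
        cdf += p
    return k

def by_uniform_scatter():
    """Draw the count first, then place that many points uniformly."""
    n = poisson_draw(RATE * WINDOW)
    return sorted(rng.uniform(0.0, WINDOW) for _ in range(n))

def count_stats(sampler, lo=1.0, hi=2.0):
    """Mean and variance of the number of events landing in [lo, hi)."""
    counts = [sum(lo <= t < hi for t in sampler()) for _ in range(TRIALS)]
    mean = sum(counts) / TRIALS
    var = sum((c - mean) ** 2 for c in counts) / (TRIALS - 1)
    return mean, var

gaps_mean, gaps_var = count_stats(by_exponential_gaps)
scatter_mean, scatter_var = count_stats(by_uniform_scatter)
print(f"exponential gaps: mean={gaps_mean:.3f} var={gaps_var:.3f}")
print(f"uniform scatter:  mean={scatter_mean:.3f} var={scatter_var:.3f}")
```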
        
         | tmoertel wrote:
         | Another neat property of Poisson processes is that when raced
         | against one another, they win in proportion to their underlying
         | rates. This property is the basis of a clever random sampling
          | algorithm that works well in SQL:
          | 
          |     SELECT *
          |     FROM Population
          |     WHERE weight > 0
          |     ORDER BY -LN(1.0 - RANDOM()) / weight
          |     LIMIT 100  -- Sample size.
         | 
         | For an explanation of how it works, see
         | https://blog.moertel.com/posts/2024-08-23-sampling-with-sql....
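
The same ordering trick carries over outside SQL. This is a minimal Python sketch, assuming a small made-up population of (name, weight) pairs and a hypothetical helper `weighted_sample`. Each row gets an exponential arrival time with rate equal to its weight, and the earliest arrivals form the sample.

```python
import math
import random
from collections import Counter

def weighted_sample(rows, k, rng):
    """Sample k rows without replacement, favoring larger weights.

    Mirrors ORDER BY -LN(1.0 - RANDOM()) / weight LIMIT k.
    """
    keyed = [(-math.log(1.0 - rng.random()) / w, name)
             for name, w in rows if w > 0]
    keyed.sort()  # earliest simulated arrival first
    return [name for _, name in keyed[:k]]

rng = random.Random(3)
population = [("a", 1.0), ("b", 2.0), ("c", 7.0)]  # made-up weights summing to 10
trials = 50_000

# Race the rows many times and record who arrives first.
wins = Counter(weighted_sample(population, 1, rng)[0] for _ in range(trials))
for name, weight in population:
    print(name, f"won {wins[name] / trials:.3f}, weight share {weight / 10.0:.1f}")
```

With k = 1 the win frequencies should track the weight shares, which is the "win in proportion to their underlying rates" property in action.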
        
           | foldU wrote:
           | (author of OP) That post of yours was actually what got me
           | tooling around with this stuff again :) it's a really
           | excellent one
        
             | tmoertel wrote:
             | Thanks! That was very kind of you to say. Whenever I write
             | stuff like that, I wonder, "Does anyone find this useful?"
             | It helps to hear every once in a while that the answer is
             | sometimes yes.
        
           | gmfawcett wrote:
           | Nice article!
        
       | exmadscientist wrote:
       | > I think if I were in charge of presenting this material to
       | students I'd do it by introducing the concept of memorylessness
       | and by showing how good memorylessness is, how many wonderful
       | things you can do with it. And then one day I'd be like, "well,
       | it sure would be nice if we had any distributions like that!" and
       | then whirl around with my piece of chalk to deliver the exciting
       | news that we _do_. Exactly one, in fact.
       | 
       | Incidentally, this also goes for the determinant of a matrix.
       | It's got a lot of neat and desirable properties, and it turns out
       | to be the _only_ thing that does. When it was finally taught to
       | me this way, those weird algorithms we use to compute this
       | seemingly-arbitrary number finally made sense. (And, in fact,
       | this is the easiest way to prove that all those algorithms have
       | to be computing the _same_ seemingly-arbitrary number. Because
       | the algorithms preserve the properties that define The
       | Determinant, and The Determinant is the _unique_ thing that
       | preserves all of those properties, so must those algorithms all
       | be computing The Determinant, no matter how different they might
       | look.)
       | 
       | So I can vouch that this style of explanation really does work,
       | at least for people like me.
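
To make the uniqueness point concrete, here is a minimal sketch with two deliberately different procedures, cofactor expansion and Gaussian elimination with row swaps, applied to an arbitrary 4x4 matrix. The function names and the matrix are illustrative only. Exact `Fraction` arithmetic is used so the two results can be compared with ==.

```python
from fractions import Fraction

def det_cofactor(m):
    """Determinant by recursive cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det_cofactor(minor)
    return total

def det_elimination(m):
    """Determinant by Gaussian elimination, tracking row swaps."""
    a = [[Fraction(x) for x in row] for row in m]
    n, sign, det = len(a), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # swapping two rows flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            # Adding a multiple of one row to another leaves the determinant unchanged.
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return sign * det

matrix = [
    [2, 0, 1, 3],
    [1, 4, 0, 2],
    [0, 5, 3, 1],
    [6, 1, 2, 0],
]
print(det_cofactor(matrix), det_elimination(matrix))
```

The two functions share no code path, yet they have to print the same number.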
        
         | AgentMatt wrote:
         | Can you give some examples for algorithms which aren't
         | obviously logically connected but use the determinant for its
         | nice properties?
        
         | jackthetab wrote:
         | Any sources that discuss this viewpoint wrt the determinant?
         | Seems I'm still at the "seemingly-arbitrary number" stage.
        
           | traes wrote:
           | The entire 3blue1brown series[0] on linear algebra is well
           | worth watching, it has really intuitive graphical
           | explanations of a bunch of concepts. Here's the one on
           | determinants in particular[1].
           | 
           | TL;DW the determinant represents how much you scale the
           | area/volume/hypervolume (depending on dimension) of a shape
           | by applying a matrix transformation to each point.
           | 
            | [0] https://www.youtube.com/watch?v=fNk_zzaMoSs&list=PLZHQObOWTQ...
            | 
            | [1] https://www.youtube.com/watch?v=Ip3X9LOh2dk&list=PLZHQObOWTQ...
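
The area-scaling reading is easy to verify in two dimensions. This is a minimal sketch, assuming an arbitrary 2x2 matrix. It transforms the corners of the unit square, measures the resulting parallelogram with the shoelace formula, and compares the result with the determinant.

```python
def shoelace_area(points):
    """Signed area of a polygon given its vertices in order."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0

def apply(matrix, point):
    """Multiply a 2x2 matrix by a 2D point."""
    (a, b), (c, d) = matrix
    x, y = point
    return (a * x + b * y, c * x + d * y)

matrix = ((3.0, 1.0), (1.0, 2.0))  # arbitrary example
unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]  # area 1

image = [apply(matrix, p) for p in unit_square]
det = matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]
print("determinant:", det)
print("area of transformed square:", shoelace_area(image))
```

The shoelace area is signed, so a matrix that flips orientation (a reflection, say) gives a negative value on both sides of the comparison.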
        
           | exmadscientist wrote:
           | If you like textbooks, try Section 1.3 of Artin's _Algebra_
           | (find it at https://media.githubusercontent.com/media/storage
           | lfs/books/m... among others).
           | 
           | Do be warned that _Algebra_ is a... high-octane... text
           | written for serious math students and can be... powerful.
        
         | nerdponx wrote:
         | Matrix multiplication and the Gaussian distribution are also
         | like that. A lot of things are like that. I really dislike that
         | this approach is not a core tool for teaching in math.
        
       | 2-3-7-43-1807 wrote:
       | I doubt that this model of a queue and the processing of its
          | items by overlaying two independent Poisson processes is
       | statistically valid (one for items arriving, the other for
       | processing those items). The processing starts only after the
       | respective item arrived in the queue - so it's not independent -
       | and this needs to be modeled accordingly or it requires a proof
       | that this is equivalent to the suggested overlaying approach -
       | that wouldn't be obvious or trivial.
        
         | travisjungroth wrote:
         | It might be trivial if you consider it a window of an infinite
         | process.
        
           | 2-3-7-43-1807 wrote:
           | the infinite process only solves the problem of having a
           | processed event happen before anything to process has entered
           | the queue - as far as I can tell.
        
         | ocular-rockular wrote:
         | As an introduction to the topic it functions very well though.
         | It doesn't matter whether it's valid or not. In fact, I would
         | say that diving immediately into the validity of some bullshit
         | independence assumptions and other nonsense is where you lose
         | most students (it definitely lost me).
         | 
         | I think flawed examples lead to a great way of scaffolding
         | towards the "true" nontrivial answer in a teaching setting at
         | least... I am still exceptionally bitter at how I was taught
         | and forced to learn stochastics and it was very much through a
         | purely theoretical, proof driven, abstract lens with very
         | crappy examples that were more of an afterthought... because of
         | course the theory is all you need to make sense of it!
        
           | 2-3-7-43-1807 wrote:
           | This bullshit independence is one of the most fundamental and
           | important concepts of probability theory and that other
           | nonsense is also relevant cause especially with statistics
           | it's easy to concoct a model which only seems to be correct
           | ... but in fact isn't.
        
         | carlmr wrote:
         | >The processing starts only after the respective item arrived
         | in the queue
         | 
         | Further you can't process anything if the queue is empty. So it
         | breaks down in this most obvious of cases.
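
One standard way to handle this is the M/M/1 model, where the service clock only runs while something is in the system. This is a minimal sketch, assuming an arrival rate of 3 and a service rate of 5 (illustrative values). Thanks to memorylessness, the next event can be drawn as a race between whichever clocks are currently active. It is a textbook construction, not necessarily what the article does.

```python
import random

def simulate_mm1(arrival_rate, service_rate, n_events, rng):
    """Time-averaged number of items in an M/M/1 system."""
    t, in_system, area = 0.0, 0, 0.0
    for _ in range(n_events):
        # The service clock only exists when there is something to process.
        active_service = service_rate if in_system > 0 else 0.0
        total_rate = arrival_rate + active_service
        dt = rng.expovariate(total_rate)
        area += in_system * dt
        t += dt
        # Racing exponentials: each clock wins in proportion to its rate.
        if rng.random() < arrival_rate / total_rate:
            in_system += 1
        else:
            in_system -= 1
    return area / t

rng = random.Random(4)
lam, mu = 3.0, 5.0
observed = simulate_mm1(lam, mu, 400_000, rng)
rho = lam / mu
print(f"simulated mean in system:  {observed:.3f}")
print(f"M/M/1 formula rho/(1-rho): {rho / (1 - rho):.3f}")
```

When the system is empty the only active clock is the arrival clock, so a "processed" event can never fire before there is something to process.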
        
       | alexpotato wrote:
       | >my professor projected video of himself writing on a piece of
       | paper before a very large auditorium, and that guy was left-
       | handed, and so his hand would cover his notes for like the entire
       | time and it was impossible to see what he was writing. I only
       | figured out that this was why it was so unpleasant like halfway
       | through the class.
       | 
       | So many of my college math classes had some version of this
       | professor who took a fascinating subject like linear algebra,
       | statistics or algorithms and made it into a slog. The fact that
       | most stats is taught by getting students to just memorize random
       | ideas rather than building up a holistic and intuitive view
       | really is a travesty.
       | 
       | Also makes sense why so many people, even though they took stats
       | in college, have such a poor understanding of probability.
        
         | tiahura wrote:
         | Left-handed I could handle. Opaque accents seem to warrant some
         | sort of consumer protection action by authorities.
        
         | dfxm12 wrote:
         | _The fact that most stats is taught by getting students to just
         | memorize random ideas rather than building up a holistic and
         | intuitive view really is a travesty._
         | 
          | We don't need first principles thinking for everything.
         | Granted, this comment is a bit vague, so I don't know how
         | exactly you were taught or which type of class we're talking
         | about, but generally, you can accept some axioms in applied
         | mathematics classes. If we're talking the bare minimum classes
         | like in the article (for apparently a business degree), this is
         | likely a general applied prob/stat. Things tend to get more in
         | depth with more advanced pure mathematics courses.
        
           | bunderbunder wrote:
           | There's a middle path between rote memorization of outcomes,
           | and building everything up from first principles. And I'm
           | guessing it's probably what the parent poster had in mind.
           | 
           | A great statistics textbook along these lines is Principles
           | of Statistics by MG Bulmer. It's one of those Dover classic
           | textbooks that you can get for cheap. This book assumes you
           | already know basic calculus and combinatorics. It then goes
           | through a series of practical problems, and shows how you can
           | use calculus or combinatorics to solve them. And, along the
           | way, an intuitive and holistic perspective on statistics
           | begins to form.
           | 
           | The overall effect is great. It's a lot like a 3blue1brown
           | video series, only from the 1960s, and with problem sets.
        
           | ceh123 wrote:
            | We don't need first principles thinking every time, but
           | having an understanding of why you can't just test 100
           | variations of your hypothesis and accept p=0.05 as
           | "statistically significant" is important.
           | 
           | Additionally it's quite useful to have the background to
           | understand the differences between Pearson correlation and
            | Spearman rank, or why you might want to use Welch's t-test vs
            | Student's, etc.
           | 
           | Not that you should know all of these things off the top of
           | your head necessarily, but you should have the foundation to
           | be able to quickly learn them, and you should know what
           | assumptions the tests you're using actually make.
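
The arithmetic behind the first point is short. This is a minimal sketch, assuming 100 independent tests of true null hypotheses at a 0.05 threshold, and assuming a continuous test statistic so that each p-value is uniform on [0, 1] under the null.

```python
import random

ALPHA, N_TESTS = 0.05, 100

# Chance that at least one of N_TESTS true-null tests comes out "significant".
analytic = 1 - (1 - ALPHA) ** N_TESTS

# Monte Carlo check: under the null, each p-value is a uniform draw.
rng = random.Random(5)
trials = 20_000
hits = sum(
    any(rng.random() < ALPHA for _ in range(N_TESTS))
    for _ in range(trials)
)
simulated = hits / trials

print(f"analytic:  {analytic:.4f}")
print(f"simulated: {simulated:.4f}")
print(f"Bonferroni per-test threshold: {ALPHA / N_TESTS}")
```

Bonferroni is only one of several common corrections. It is shown here because it is the simplest to state.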
        
           | MajimasEyepatch wrote:
           | I get where you're coming from, and obviously there are
           | practical limitations on how deep one can and should go in an
           | introductory class. But my recollection of AP Statistics 15
           | years ago is that, because the exam and therefore curriculum
           | was so focused on running various tests on a TI-84, I learned
           | way more about using this one specific graphing calculator
           | than about statistics. I got a high score on the exam, but I
           | never felt like I understood any of it until I got to college
           | and took a statistics course that actually used calculus to
           | show what was going on.
        
           | xanderlewis wrote:
           | Intuition isn't synonymous with working from first
           | principles. You can have a very intuitive understanding of
           | something you only understand at a higher level. Indeed, this
           | is true for many applied mathematicians.
        
         | throw18376 wrote:
         | the reality is the vast majority of students have no interest
         | in seeing the beauty of any mathematical or technical field.
         | 
         | they want the professor to tell them the passwords they need to
         | memorize. then on the exam they repeat the passwords and get an
         | A. this is understandable though because they are under a lot
         | of pressure and these days nobody can afford to fail.
         | 
         | if the teaching style deviates from this they become annoyed,
         | leave poor course reviews, and that professor has a hard time.
         | 
         | the professor could overcome this by being "good" -- when the
         | students say a professor is "good" they mean it is easy to get
         | an A.
        
           | bunderbunder wrote:
           | In health care there have been studies that find an _inverse_
           | correlation between patient satisfaction scores and patient
            | outcomes. I don't know if the same is true in education, but
           | I'd believe it.
        
       | dariosalvi78 wrote:
       | My MSc thesis in 2004 was about this: a probabilistic model for
       | least-recently used queues, based on Poisson processes, for
       | network packet flows. The model worked OK on a couple of
       | datasets I could get at that time. I tried to publish the work
       | but got rejected a couple of times, then I gave up. If anyone
       | wants to read it (even try to publish) I am happy to share it.
        
       | sriram_malhar wrote:
       | You can't combine rate of arriving with rate of leaving, can you?
       | Leaving is dependent on arriving, so the latter distribution is
       | dependent on the former.
        
         | signa11 wrote:
          | Little's law is quite instructive here. Assuming that a system
          | is stable, you can.
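
Little's law says L = lambda * W: the average number in the system equals the arrival rate times the average time spent in it. This is a minimal sketch, assuming a single FIFO server with arrival rate 3 and service rate 5 (illustrative values). It measures all three quantities separately on one simulated run.

```python
import random

rng = random.Random(6)
lam, mu, n = 3.0, 5.0, 50_000

# Arrival and departure times for a single FIFO server.
arrivals, departures = [], []
t, last_departure = 0.0, 0.0
for _ in range(n):
    t += rng.expovariate(lam)
    start = max(t, last_departure)  # wait for the server if it is busy
    last_departure = start + rng.expovariate(mu)
    arrivals.append(t)
    departures.append(last_departure)

horizon = departures[-1]  # observe until the last item has left

# L: time-average of the number in the system, integrated event by event.
events = sorted([(a, +1) for a in arrivals] + [(d, -1) for d in departures])
area, count, prev = 0.0, 0, 0.0
for time, step in events:
    area += count * (time - prev)
    count, prev = count + step, time
L = area / horizon

arrival_rate = n / horizon                                  # observed lambda
W = sum(d - a for a, d in zip(arrivals, departures)) / n    # mean time in system
print(f"L = {L:.4f}, lambda * W = {arrival_rate * W:.4f}")
```

On a run that starts and ends empty, the two sides agree up to rounding regardless of the arrival and service distributions, since both are the same total time-in-system divided by the horizon.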
        
       | sidcool wrote:
       | How does one know the application of such a Math concept for a
       | particular software problem? I couldn't guess in a million years.
        
         | manvillej wrote:
         | you go to school for it. Stats, applied mathematics, operations
         | research, industrial engineering.
         | 
         | I went for industrial engineering. we learned the math as pure
         | math, then the math as free language problems, then how to
         | identify and collect data to identify their attributes, then
         | simulate and verify those processes, then test for variations
         | in the underlying assumptions of those processes.
         | 
         | They never really did teach me to code well in a language that
         | was useful, I had to pick that one up myself.
        
       ___________________________________________________________________
       (page generated 2024-11-11 23:01 UTC)