[HN Gopher] An alternative construction of Shannon entropy
       ___________________________________________________________________
        
       An alternative construction of Shannon entropy
        
       Author : rkp8000
       Score  : 121 points
       Date   : 2024-11-13 16:45 UTC (5 days ago)
        
 (HTM) web link (rkp.science)
 (TXT) w3m dump (rkp.science)
        
       | xigoi wrote:
       | The site is unreadable on mobile because it disables overflow on
       | the equations (which it shows as images, even though it's 2024
       | and all modern browsers support MathML).
        
         | masfuerte wrote:
         | Today I learned! I'd missed the news that MathML was back in
         | Chrome. I've been publishing web pages using MathML for years,
         | along with a note that the equations don't render in Chrome. I
         | can finally remove the note.
        
       | kgwgk wrote:
       | Seems similar to
       | https://en.wikipedia.org/wiki/Principle_of_maximum_entropy#T...
        
       | ivan_ah wrote:
       | Nice!
       | 
       | The key step of the derivation is counting the "number of ways"
       | to get the histogram with bar heights L1, L2, ... Ln for a total
       | of L observations.
       | 
       | I had to think a bit why the provided formula is true:
       | choose(L,L1) * choose(L-L1,L2) * ... * choose(Ln,Ln)
       | 
        | The story I came up with for the first term is that in a
        | sequence of length L, you need to choose the L1 locations that
        | will get the symbol x1, so there are choose(L,L1) ways to do
        | that. Next you have L-L1 remaining spots to fill, and L2 of
        | those need to have the symbol x2, hence the choose(L-L1,L2)
        | term, etc.
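        | 
        | A quick numerical sanity check of that counting argument (my
        | own minimal sketch in Python, not from the article; the helper
        | name is just illustrative):
        | 
        |     from math import comb, factorial
        | 
        |     def ways(counts):
        |         # Multiply the choose(remaining, Li) terms, placing each
        |         # symbol's Li copies into the spots still unfilled.
        |         remaining = sum(counts)
        |         total = 1
        |         for c in counts:
        |             total *= comb(remaining, c)
        |             remaining -= c
        |         return total
        | 
        |     counts = [3, 2, 1]  # L1, L2, L3 with L = 6
        |     # Agrees with the multinomial coefficient L! / (L1! L2! L3!)
        |     print(ways(counts))                                           # 60
        |     print(factorial(6) // (factorial(3) * factorial(2) * factorial(1)))  # 60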
        
       | canjobear wrote:
       | Interestingly this has a rather frequentist flavor. The
       | probabilities end up coming from frequency ratios in very large
       | samples.
        
         | derbOac wrote:
         | That line of inferential philosophy is very Jaynesian, for
         | those who are interested.
        
       | IdealeZahlen wrote:
        | Calling this an 'alternative' construction seems like coming
        | full circle, since this line of combinatorial argument is how
        | Boltzmann arrived at his H-function in the first place, which
        | in turn inspired Shannon's entropy.
        
         | rkp8000 wrote:
         | Yep! This relationship is well known in statistical mechanics.
         | I was just surprised that in many years of intersecting with
         | information theory in other fields (computational neuroscience
         | in particular) I'd never come across it before, even though IMO
         | it provides an insightful perspective.
        
       | ziofill wrote:
        | Eh, ok, but the trick is then taking the limit L -> infinity
        | and using Stirling's approximation, which is what Shannon did.
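        | 
        | A rough numerical illustration of that limit (my own sketch,
        | not from the article): (1/L) * log W approaches
        | -sum p_i log p_i as L grows, where W is the number of
        | sequences with a given histogram.
        | 
        |     from math import lgamma, log
        | 
        |     def log_multinomial(counts):
        |         # log of L! / (L1! ... Ln!), via log-gamma to avoid overflow
        |         L = sum(counts)
        |         return lgamma(L + 1) - sum(lgamma(c + 1) for c in counts)
        | 
        |     p = [0.5, 0.3, 0.2]
        |     for L in (100, 10_000, 1_000_000):
        |         counts = [round(pi * L) for pi in p]
        |         print(L, log_multinomial(counts) / L)  # tends to H(p)
        | 
        |     print("H(p) =", -sum(pi * log(pi) for pi in p))  # ~1.0297 nats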
        
       | mturmon wrote:
       | This is what I learned as the "theory of types" from Cover and
       | Thomas, chapter 11, from original work by Imre Csiszar. See just
       | under "theorem 6" in
       | 
       | https://web.stanford.edu/class/ee376a/files/2017-18/lecture_...
       | 
        | The key (which is not in the OP) is not the construction of
        | E[-log p], but being able to prove that the "typical set"
        | exists (with arbitrarily high probability) and that its size
        | is roughly 2^(nH).
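        | 
        | A small simulation of that idea (my own sketch, not from the
        | lecture notes): for long i.i.d. sequences, -(1/n) log2 p(x^n)
        | concentrates around H, which is what makes the typical set
        | carry essentially all the probability.
        | 
        |     import random
        |     from math import log2
        | 
        |     p = {"a": 0.5, "b": 0.3, "c": 0.2}
        |     H = -sum(q * log2(q) for q in p.values())
        | 
        |     n = 10_000
        |     symbols, weights = zip(*p.items())
        |     for _ in range(3):
        |         seq = random.choices(symbols, weights=weights, k=n)
        |         # Per-symbol log-probability of this particular sequence
        |         print(-sum(log2(p[s]) for s in seq) / n)
        | 
        |     print("H =", H)  # ~1.485 bits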
        
       ___________________________________________________________________
       (page generated 2024-11-18 23:02 UTC)