[HN Gopher] An alternative construction of Shannon entropy
___________________________________________________________________
An alternative construction of Shannon entropy
Author : rkp8000
Score : 121 points
Date : 2024-11-13 16:45 UTC (5 days ago)
(HTM) web link (rkp.science)
(TXT) w3m dump (rkp.science)
| xigoi wrote:
| The site is unreadable on mobile because it disables overflow on
| the equations (which it shows as images, even though it's 2024
| and all modern browsers support MathML).
| masfuerte wrote:
| Today I learned! I'd missed the news that MathML was back in
| Chrome. I've been publishing web pages using MathML for years,
| along with a note that the equations don't render in Chrome. I
| can finally remove the note.
| kgwgk wrote:
| Seems similar to
| https://en.wikipedia.org/wiki/Principle_of_maximum_entropy#T...
| ivan_ah wrote:
| Nice!
|
| The key step of the derivation is counting the "number of ways"
| to get the histogram with bar heights L1, L2, ... Ln for a total
| of L observations.
|
| I had to think a bit about why the provided formula is true:
| choose(L,L1) * choose(L-L1,L2) * ... * choose(Ln,Ln)
|
| The story I came up with for the first term is that in a
| sequence of length L, you need to choose the L1 locations that
| will get the symbol x1, so there are choose(L,L1) ways to do that.
| Next you have L-L1 remaining spots to fill, and L2 of those need
| to have the symbol x2, hence the choose(L-L1,L2) term, etc.
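|
| A quick sanity check in Python (my own sketch, not from the
| article) that this telescoping product of binomials collapses to
| the multinomial coefficient L!/(L1!*...*Ln!):
|
|     import math
|
|     def ways_product(counts):
|         # choose(L,L1) * choose(L-L1,L2) * ..., as in the story above
|         remaining, total = sum(counts), 1
|         for c in counts:
|             total *= math.comb(remaining, c)
|             remaining -= c
|         return total
|
|     def ways_multinomial(counts):
|         # closed form: L! / (L1! * L2! * ... * Ln!)
|         result = math.factorial(sum(counts))
|         for c in counts:
|             result //= math.factorial(c)
|         return result
|
|     counts = [3, 5, 2]  # example histogram, L = 10 observations
|     assert ways_product(counts) == ways_multinomial(counts) == 2520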
| canjobear wrote:
| Interestingly this has a rather frequentist flavor. The
| probabilities end up coming from frequency ratios in very large
| samples.
| derbOac wrote:
| That line of inferential philosophy is very Jaynesian, for
| those who are interested.
| IdealeZahlen wrote:
| Calling this an 'alternative' construction seems like coming full
| circle, since this line of combinatorial argument is how Boltzmann
| arrived at his H-function in the first place, which in turn
| inspired Shannon's entropy.
| rkp8000 wrote:
| Yep! This relationship is well known in statistical mechanics.
| I was just surprised that in many years of intersecting with
| information theory in other fields (computational neuroscience
| in particular) I'd never come across it before, even though IMO
| it provides an insightful perspective.
| ziofill wrote:
| Eh, ok, but the trick is then taking the limit L -> infty and
| using Stirling's approximation, which is what Shannon did.
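|
| For what it's worth, a sketch of that limit (my own write-up,
| using W for the multinomial count from the comment above and
| writing p_i = L_i / L):
|
|     \begin{aligned}
|     \ln W &= \ln L! - \sum_i \ln L_i! \\
|           &\approx (L \ln L - L) - \sum_i (L_i \ln L_i - L_i) \\
|           &= L \ln L - \sum_i L_i \ln L_i \\
|           &= -\sum_i L_i \ln \frac{L_i}{L} \\
|           &= -L \sum_i p_i \ln p_i = L \, H(p),
|     \end{aligned}
|
| so (1/L) ln W -> H(p) as L -> infty with L_i / L -> p_i, which is
| where the Shannon formula drops out.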
| mturmon wrote:
| This is what I learned as the "theory of types" from Cover and
| Thomas, chapter 11, from original work by Imre Csiszar. See just
| under "theorem 6" in
|
| https://web.stanford.edu/class/ee376a/files/2017-18/lecture_...
|
| The key (which is not in the OP) is not the construction of
| E log(p), but being able to prove that the "typical set" exists
| (with arbitrarily high probability) and that the entropy
| determines its size.
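|
| A rough numerical illustration of that size statement (my own
| sketch, not taken from the lecture notes): the number of length-n
| sequences with a fixed empirical distribution grows like 2^(n*H),
| so (1/n) * log2 of the type class size approaches H(p).
|
|     import math
|
|     def log2_type_class_size(counts):
|         # log2 count of length-n sequences with these exact counts
|         n = sum(counts)
|         log_size = (math.lgamma(n + 1)
|                     - sum(math.lgamma(c + 1) for c in counts))
|         return log_size / math.log(2)
|
|     def entropy_bits(probs):
|         return -sum(p * math.log2(p) for p in probs if p > 0)
|
|     probs = [0.5, 0.3, 0.2]
|     for n in (10, 100, 1000, 10000):
|         counts = [round(p * n) for p in probs]  # these sum to n here
|         print(n, log2_type_class_size(counts) / n, entropy_bits(probs))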
___________________________________________________________________
(page generated 2024-11-18 23:02 UTC)