[HN Gopher] What Is Entropy?
___________________________________________________________________
What Is Entropy?
Author : ainoobler
Score : 302 points
Date : 2024-07-22 18:33 UTC (1 days ago)
(HTM) web link (johncarlosbaez.wordpress.com)
(TXT) w3m dump (johncarlosbaez.wordpress.com)
| illuminant wrote:
| Entropy is the distribution of potential over negative potential.
|
| This could be said "the distribution of whatever may be over the
| surface area of where it may be."
|
| This is erroneously taught in conventional information theory as
| "the number of configurations in a system" or the available
| information that has yet to be retrieved. Entropy includes the
| unforeseen and the out-of-scope.
|
| Entropy is merely the predisposition to flow from high to low
| pressure (potential). That is it. Information is a form of
| potential.
|
| Philosophically what are entropy's guarantees?
|
| - That there will always be a super-scope, which may interfere in
| ways unanticipated;
|
| - everything decays; the only mystery is when and how.
| mwbajor wrote:
| All definitions of entropy stem from one central, universal
| definition: Entropy is the amount of energy unable to be used
| for useful work. Or, put more precisely: entropy describes the
| fact that not all of the energy consumed can be used for work.
| ajkjk wrote:
| There's a good case to be made that the information-theoretic
| definition of entropy is the most fundamental one, and the
| version that shows up in physics is just that concept as
| applied to physics.
| rimunroe wrote:
| My favorite course I took as part of my physics degree was
| statistical mechanics. It leaned way closer to information
| theory than I would have expected going in, though in
| retrospect that should have been obvious.
|
| Unrelated: my favorite bit from any physics book is
| probably still the introduction of the first chapter of
| "States of Matter" by David Goodstein: "Ludwig Boltzmann,
| who spent much of his life studying statistical mechanics,
| died in 1906, by his own hand. Paul Ehrenfest, carrying on
| the work, died similarly in 1933. Now it is our turn to
| study statistical mechanics."
| galaxyLogic wrote:
| That would mean that information theory is not part of
| physics, right? So Information Theory and Entropy are
| part of metaphysics?
| ajkjk wrote:
| Well it's part of math, which physics is already based
| on.
|
| Whereas metaphysics is, imo, "stuff that's made up and
| doesn't matter". Probably not the most standard take.
| galaxyLogic wrote:
| I'm wondering, isn't Information Theory as much part of
| physics as Thermodynamics is?
| ajkjk wrote:
| Not really. Information theory applies to anything
| probability applies to, including many situations that
| aren't "physics" per se. For instance it has a lot to do
| with algorithms and data as well. I think of it as being
| at the level of geometry and calculus.
| kgwgk wrote:
| Would you say that Geometry is as much a part of physics
| as Optics is?
| imtringued wrote:
| Yeah, people seemingly misunderstand that the entropy
| applied to thermodynamics is simply an aggregate statistic
| that summarizes the complex state of the thermodynamic
| system as a single real number.
|
| The fact that entropy always rises, etc., has nothing to do
| with the statistical concept of entropy itself. It simply
| is an easier way to express the physics concept that
| individual atoms spread out their kinetic energy across a
| large volume.
| ziofill wrote:
| I think what you describe is the application of entropy in
| the thermodynamic setting, which doesn't apply to "all
| definitions".
| mitthrowaway2 wrote:
| This definition is far from universal.
| ziofill wrote:
| > Entropy includes the unforeseen and the out-of-scope.
|
| Mmh, no it doesn't. You need to define your state space,
| otherwise it's an undefined quantity.
| kevindamm wrote:
| But it is possible to account for the unforeseen (or out-of-
| vocabulary) by, for example, a Good-Turing estimate. This
| satisfies your demand for a fully defined state space while
| also being consistent with GP's definition.
| illuminant wrote:
| You are referring to the conceptual device you believe bongs
| to you and your equations. Entropy creates attraction and
| repulsion, even causing working bias. We rely upon it for our
| system functions.
|
| Undefined is uncertainty is entropic.
| fermisea wrote:
| Entropy is a measure, it doesn't create anything. This is
| highly misleading.
| senderista wrote:
| > bongs
|
| indeed
| axblount wrote:
| Baez seems to use the definition you call erroneous: "It's easy
| to wax poetic about entropy, but what is it? I claim it's the
| amount of information we don't know about a situation, which in
| principle we could learn."
| eoverride wrote:
| This answer is as confident as it is wrong, and it is full of
| gibberish.
|
| Entropy is not a "distribution", it's a functional that maps a
| probability distribution to a scalar value, i.e. a single
| number.
|
| It's the negative mean log-probability of a distribution.
|
| It's an elementary statistical concept, independent of physical
| concepts like "pressure", "potential", and so on.
| illuminant wrote:
| It sounds like log-probability is the manifold surface area.
|
| Distribution of potential over negative potential. Negative
| potential is the "surface area", and available potential
| distributes itself "geometrically". All this is iterative
| obviously, some periodicity set by universal speed limit.
|
| It really doesn't sound like you disagree with me.
| Jun8 wrote:
| A well known anecdote reported by Shannon:
|
| "My greatest concern was what to call it. I thought of calling it
| 'information,' but the word was overly used, so I decided to call
| it 'uncertainty.' When I discussed it with John von Neumann, he
| had a better idea. Von Neumann told me, 'You should call it
| entropy, for two reasons. In the first place your uncertainty
| function has been used in statistical mechanics under that name,
| so it already has a name. In the second place, and more
| important, no one really knows what entropy really is, so in a
| debate you will always have the advantage.'"
|
| See the answers to this MathOverflow SE question
| (https://mathoverflow.net/questions/403036/john-von-neumanns-...)
| for references on the discussion of whether Shannon's entropy is
| the same as the one from thermodynamics.
| BigParm wrote:
| Von Neumann was the king of kings
| tonetegeatinst wrote:
| It's odd... as someone interested in but not fully immersed in
| the sciences, I see his name pop up everywhere.
| farias0 wrote:
| I've seen many people arguing he's the most intelligent
| person that ever lived
| wrycoder wrote:
| Some say Hungarians are actually aliens.
| jack_pp wrote:
| https://slatestarcodex.com/2017/05/26/the-atomic-bomb-
| consid...
| bee_rider wrote:
| He was really brilliant, made contributions all over the
| place in the math/physics/tech field, and had a sort of
| wild and quirky personality that people love telling
| stories about.
|
| A funny quote about him from Edward "a guy with multiple
| equations named after him" Teller:
|
| > Edward Teller observed "von Neumann would carry on a
| conversation with my 3-year-old son, and the two of them
| would talk as equals, and I sometimes wondered if he used
| the same principle when he talked to the rest of us."
| strogonoff wrote:
| Are there many von-Neumann-like multidisciplinaries
| nowadays? It feels like unless one is razor-sharp and fully
| into one field, one is not taken seriously by those who
| made their careers in it (and who have the last word
| on it).
| i_am_proteus wrote:
| There have been a very small number of thinkers as
| publicly accomplished as von Neumann _ever._ One other
| who comes to mind is Carl F. Gauss.
| strogonoff wrote:
| Is it fair to say that the number of publicly
| accomplished multidisciplinaries alive at a particular
| moment is not rising as might be expected, in
| proportion to the total number of suitably educated
| people?
| djd3 wrote:
| Euler.
|
| JVM was one of the smartest ever, but Euler was there
| centuries before and shows up in so many places.
|
| If I had a Time Machine I'd love to get those two
| together for a stiff drink and a banter.
| passion__desire wrote:
| Genius Edward Teller Describes 1950s Genius John Von
| Neumann
|
| https://youtu.be/Oh31I1F2vds?t=189 Describes von
| Neumann's struggle in his final days, when he could no
| longer think. Thinking was the activity he loved the most.
| bee_rider wrote:
| I think there are none. The world has gotten too
| complicated for that. It was early days in quantum
| physics, information theory, and computer science. I
| don't think it is early days in anything that
| consequential anymore.
| adrianN wrote:
| It's the early days in a lot of fields, but they tend to
| be fiendishly difficult like molecular biology or
| neuroscience.
| ricksunny wrote:
| More than that, as professionals' career paths in fields
| develop, the organisations they work for specialize,
| becoming less amenable to the generalist. ('Why should we
| hire this mathematician who is also an expert in legal
| research? Their attention is probably divided, and
| meanwhile we have a 100% mathematician in the candidate
| pool fresh from an expensive dedicated PhD program with a
| growing family to feed.')
|
| I'm obviously using the archetype of Leibniz here as an
| example but pick your favorite polymath.
| bee_rider wrote:
| Are they fiendishly difficult or do we just need a von
| Neumann to come along and do what he did for quantum
| mechanics to them?
| Salgat wrote:
| Centuries ago, the limitation of most knowledge was the
| difficulty in discovery; once known, it was accessible to
| most scholars. Take Calculus, which is taught in every
| high school in America. The problem is, we're getting to
| a point where new fields are built on such extreme
| requirements that even the known knowledge is extremely
| hard for talented university students to learn, let alone
| what is required to discover and advance that field.
| Until we are able to augment human intelligence, the days
| of the polymath advancing multiple fields are mostly
| over. I would also argue that the standards for peer-
| reviewed papers and for obtaining PhDs have significantly
| dropped (due to the incentive structure to spam as many
| papers as possible), which is only hurting the
| advancement of knowledge.
| lachlan_gray wrote:
| IMO they do exist, but the popular attitude that it's not
| possible anymore is the issue, not a lack of genius. If
| everyone has a built in assumption that it can't happen
| anymore, then we will naturally prune away social
| pathways that enable it.
| rramadass wrote:
| An Introduction here :
| https://www.youtube.com/watch?v=IPMjVcLiNKc
| complaintdept wrote:
| Even mortals such as ourselves can apply some of Von
| Neumann's ideas in our everyday lives:
|
| https://en.m.wikipedia.org/wiki/Fair_coin#Fair_results_from
| _...
| vinnyvichy wrote:
| So much so, he has his own entropy!
|
| https://en.wikipedia.org/wiki/Von_Neumann_entropy
| penguin_booze wrote:
| He's a certified Martian:
| https://en.wikipedia.org/wiki/The_Martians_(scientists).
| zeristor wrote:
| I was hoping the Wikipedia article might explain why this
| might have been.
| cubefox wrote:
| https://emilkirkegaard.dk/en/2022/11/a-theory-of-
| ashkenazi-g...
| bglazer wrote:
| Emil Kirkegaard is a self-described white nationalist
| eugenicist who thinks the age of consent is too high. I
| wouldn't trust anything he has to say.
| YeGoblynQueenne wrote:
| No need for ad hominems. This suffices to place doubt on
| the article's premises (and therefore any conclusion):
|
| >> This hasn't been strictly shown mathematically, but I
| think it is true.
| cubefox wrote:
| > Emil Kirkegaard is a self-described white nationalist
|
| That's simply a lie.
|
| > who thinks the age of consent is too high
|
| Too high in which country? Such laws vary strongly, even
| by US state, and he is from Denmark. Anyway, this has
| nothing to do with the topic at hand.
| anthk wrote:
| In Spain it used to be as low as 13 a few decades ago; but
| that law was obviously written before the rural exodus from
| inner Spain into the cities (from the 60's to almost the
| 80's), when children from early puberty got to work/help
| on the farm/fields or at home, and by age 14 they had far
| more duties and accountabilities than today. And yes,
| that yielded more maturity.
|
| Thus, the law had to be raised to 16 for more urban/civilized
| times. Although, depending on closeness in age/maturity
| (such as 15-19, as happened in a recent case), the young
| adult had the charges totally dropped.
| dekhn wrote:
| I really liked the approach my stat mech teacher used. In nearly
| all situations, entropy just ends up being the log of the number
| of ways a system can be arranged
| (https://en.wikipedia.org/wiki/Boltzmann%27s_entropy_formula)
| although I found it easiest to think in terms of pairs of dice
| rolls.
| petsfed wrote:
| And this is what I prefer too, although with the clarification
| that it's the number of ways that a system can be arranged
| _without changing its macroscopic properties_.
|
| It's, unfortunately, not very compatible with Shannon's usage in
| any but the shallowest sense, which is why it stays firmly in
| the land of physics.
| enugu wrote:
| Assuming each of the N microstates for a given macrostate is
| equally probable with probability p = 1/N, the Shannon entropy
| is -Σ p·log(p) = -N·(1/N)·log(1/N) = log(N), which is the
| physics interpretation.
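|
| (A quick numerical check of that identity, as a Python sketch:)
|
|     import math
|
|     def shannon_entropy(probs):
|         # H = -sum_i p_i * log(p_i); zero-probability terms drop out
|         return -sum(p * math.log(p) for p in probs if p > 0)
|
|     N = 8
|     uniform = [1.0 / N] * N
|     print(shannon_entropy(uniform), math.log(N))  # both ~2.0794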
|
| In the continuous version, you would get log(V) where V is
| the volume in phase space occupied by the microstates for a
| given macrostate.
|
| Liouville's theorem that the volume is conserved in phase
| space implies that a macroscopic process can move all the
| microstates of a macrostate A into a macrostate B only if
| the volume of B is bigger than the volume of A. This
| implies that the entropy of B should be bigger than the
| entropy of A, which is the Second Law.
| cubefox wrote:
| The second law of thermodynamics is time-asymmetric, but
| the fundamental physical laws are time-symmetric, so from
| them you can only predict that the entropy of B should be
| bigger than the entropy of A _irrespective of whether B is
| in the future or the past of A._ You need the additional
| assumption (Past Hypothesis) that the universe started in a
| low entropy state in order to get the second law of
| thermodynamics.
|
| > If our goal is to predict the future, it suffices to
| choose a distribution that is uniform in the Liouville
| measure given to us by classical mechanics (or its quantum
| analogue). If we want to reconstruct the past, in contrast,
| we need to conditionalize over trajectories that also
| started in a low-entropy past state -- that's the "Past
| Hypothesis" that is required to get stat mech off the
| ground in a world governed by time-symmetric fundamental
| laws.
|
| https://www.preposterousuniverse.com/blog/2013/07/09/cosmol
| o...
| kgwgk wrote:
| The second law of thermodynamics is about systems that
| are well described by a small set of macroscopic
| variables. The evolution of an initial macrostate
| prepared by an experimenter who can control only the
| macrovariables is reproducible. When a thermodynamical
| system is prepared in such a reproducible way the
| preparation is happening in the past, by definition.
|
| The second law is about how part of the information that
| we had about a system - constrained to be in a macrostate
| - is "lost" when we "forget" the previous state and
| describe it using just the current macrostate. We know the
| past more precisely than the future - the previous state is
| in the past by definition.
| kgwgk wrote:
| > not very compatible with Shannon's usage in any but the
| shallowest sense
|
| The connection is not so shallow, there are entire books
| based on it.
|
| "The concept of information, intimately connected with that
| of probability, gives indeed insight on questions of
| statistical mechanics such as the meaning of irreversibility.
| This concept was introduced in statistical physics by
| Brillouin (1956) and Jaynes (1957) soon after its discovery
| by Shannon in 1948 (Shannon and Weaver, 1949). An immense
| literature has since then been published, ranging from
| research articles to textbooks. The variety of topics that
| belong to this field of science makes it impossible to give
| here a bibliography, and special searches are necessary for
| deepening the understanding of one or another aspect. For
| tutorial introductions, somewhat more detailed than the
| present one, see R. Balian (1991-92; 2004)."
|
| https://arxiv.org/pdf/cond-mat/0501322
| petsfed wrote:
| I don't dispute that the math is compatible. The problem is
| the interpretation thereof. When I say "shallowest", I mean
| the implications of each are very different.
|
| Insofar as I'm aware, there is no information-theoretic
| equivalent to the 2nd or 3rd laws of thermodynamics, so the
| intuition a student works up from physics about how and why
| entropy matters just doesn't transfer. Likewise, even if an
| information science student is well versed in the concept
| of configuration entropy, that's 15 minutes of one lecture
| in statistical thermodynamics. There's still the rest of
| the course to consider.
| abetusk wrote:
| Also known as "the number of bits to describe a system". For
| example, 2^N equally probable states, N bits to describe each
| state.
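|
| (A one-off numerical check, as a Python sketch:)
|
|     import math
|
|     N = 10
|     states = 2 ** N
|     # 2^N equally probable states: the entropy is N bits.
|     print(-states * (1 / states) * math.log2(1 / states))  # 10.0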
| Lichtso wrote:
| The "can be arranged" is the tricky part. E.g. you might know
| from context that some states are impossible (where the
| probability distribution is zero), even though they
| combinatorially exist. That changes the entropy for you.
|
| That is why information and entropy are different things.
| Entropy is what you know you do not know. That knowledge of the
| magnitude of the unknown is what is being quantified.
|
| Also, this is the point where I think the article is wrong (or
| not precise enough), as it would include the unknown unknowns,
| which are not entropy IMO:
|
| > I claim it's the amount of information we don't know about a
| situation
| slashdave wrote:
| Exactly. If you want to reuse the term "entropy" in
| information theory, then fine. Just stop trying to make a
| physical analogy. It's not rigorous.
| akira2501 wrote:
| I spend time just staring at the graph on this page.
|
| https://en.wikipedia.org/wiki/Thermodynamic_beta
| Tomte wrote:
| PBS Spacetime's entropy playlist:
| https://youtube.com/playlist?list=PLsPUh22kYmNCzNFNDwxIug8q1...
| foobarian wrote:
| A bit off-color but classic:
| https://www.youtube.com/watch?v=wgltMtf1JhY
| drojas wrote:
| My definition: Entropy is a measure of the accumulation of non-
| reversible energy transfers.
|
| Side note: All reversible energy transfers involve an increase in
| potential energy. All non-reversible energy transfers involve a
| decrease in potential energy.
| snarkconjecture wrote:
| That definition doesn't work well because you can have changes
| in entropy even if no energy is transferred, e.g. by exchanging
| some other conserved quantity.
|
| The side note is wrong in letter and spirit; turning potential
| energy into heat is one way for something to be irreversible,
| but neither of those statements is true.
|
| For example, consider an iron ball being thrown sideways. It
| hits a pile of sand and stops. The iron ball is not affected
| structurally, but its kinetic energy is transferred (almost
| entirely) to heat energy. If the ball is thrown slightly
| upwards, potential energy increases but the process is still
| irreversible.
|
| Also, the changes of potential energy in corresponding parts of
| two Carnot cycles are directionally the same, even if one is
| ideal (reversible) and one is not (irreversible).
| space_oddity wrote:
| However, while your definition effectively captures a
| significant aspect of entropy, it might be somewhat limited in
| scope.
| ooterness wrote:
| For information theory, I've always thought of entropy as
| follows:
|
| "If you had a really smart compression algorithm, how many bits
| would it take to accurately represent this file?"
|
| i.e., highly repetitive inputs compress well because they don't
| have much entropy per bit. Modern compression algorithms are good
| enough on most data to be used as a reasonable approximation for
| the true entropy.
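|
| A rough way to see this with an off-the-shelf compressor (a Python
| sketch; zlib is only a loose stand-in for a "really smart"
| compression algorithm):
|
|     import os
|     import zlib
|
|     repetitive = b"abc" * 10000       # highly structured input
|     random_ish = os.urandom(30000)    # incompressible input, same size
|
|     for name, data in [("repetitive", repetitive),
|                        ("random", random_ish)]:
|         out = zlib.compress(data, 9)
|         # Compressed size is a (loose) upper bound on source entropy.
|         print(name, len(data), "->", len(out), "bytes,",
|               round(8 * len(out) / len(data), 2), "bits/byte")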
| space_oddity wrote:
| The essence of entropy as a measure of information content
| glial wrote:
| I felt like I finally understood Shannon entropy when I realized
| that it's a subjective quantity -- a property of the observer,
| not the observed.
|
| The entropy of a variable X is the amount of information required
| to drive the observer's uncertainty about the value of X to zero.
| As a correlate, your uncertainty and mine about the value of the
| same variable X could be different. This is trivially true, as we
| could each have received different information about X. H(X)
| should be H_{observer}(X), or even better, H_{observer, time}(X).
|
| As clear as Shannon's work is in other respects, he glosses over
| this.
| JumpCrisscross wrote:
| > _it's a subjective quantity -- a property of the observer,
| not the observed_
|
| Shannon's entropy is a property of the source-channel-receiver
| system.
| glial wrote:
| Can you explain this in more detail?
|
| Entropy is calculated as a function of a probability
| distribution over possible messages or symbols. The sender
| might have a distribution P over possible symbols, and the
| receiver might have another distribution Q over possible
| symbols. Then the "true" distribution over possible symbols
| might be another distribution yet, call it R. The mismatch
| between these is what leads to various inefficiencies in
| coding, decoding, etc [1]. But both P and Q are beliefs about
| R -- that is, they are properties of observers.
|
| [1] https://en.wikipedia.org/wiki/Kullback-
| Leibler_divergence#Co...
| rachofsunshine wrote:
| This doesn't really make entropy itself observer dependent.
| (Shannon) entropy is a property of a distribution. It's just
| that when you're measuring different observers' beliefs, you're
| looking at different distributions (which can have different
| entropies the same way they can have different means,
| variances, etc).
| mitthrowaway2 wrote:
| Entropy is a property of a distribution, but since math does
| sometimes get applied, we also attach distributions to
| _things_ (eg. the entropy of a random number generator, the
| entropy of a gas...). Then when we talk about the entropy of
| those things, those entropies are indeed subjective, because
| different subjects will attach different probability
| distributions to that system depending on their information
| about that system.
| stergios wrote:
| "Entropy is a property of matter that measures the degree
| of randomization or disorder at the microscopic level", at
| least when considering the second law.
| mitthrowaway2 wrote:
| Right, but the very interesting thing is it turns out
| that what's random to me might not be random to you! And
| the reason that "microscopic" is included is because
| that's a shorthand for "information you probably don't
| have about a system, because your eyes aren't that good,
| or even if they are, your brain ignored the fine details
| anyway."
| canjobear wrote:
| Some probability distributions are objective. The
| probability that my random number generator gives me a
| certain number is given by a certain formula. Describing it
| with another distribution would be wrong.
|
| Another example, if you have an electron in a superposition
| of half spin-up and half spin-down, then the probability to
| measure up is objectively 50%.
|
| Another example, GPT-2 is a probability distribution on
| sequences of integers. You can download this probability
| distribution. It doesn't represent anyone's beliefs. The
| distribution has a certain entropy. That entropy is an
| objective property of the distribution.
| mitthrowaway2 wrote:
| Of those, the quantum superposition is the only one that
| has a chance at being considered objective, and it's
| still only "objective" in the sense that (as far as we
| know) your description provided as much information as
| anyone can possibly have about it, so nobody can have a
| more-informed opinion and all subjects agree.
|
| The others are both partial-information problems which
| are very sensitive to knowing certain hidden-state
| information. Your random number generator gives you a
| number that _you_ didn't expect, and for which a formula
| describes your best guess based on available incomplete
| information, but the computer program that generated it knew
| which one to choose and would not have picked any other.
| Anyone who knew the hidden state of the RNG would
| also have assigned a different probability to that number
| being chosen.
| cubefox wrote:
| A more plausible way to argue for objectiveness is to say
| that some probability distributions are objectively more
| rational than others given the same information. E.g.
| when seeing a symmetrical die it would be irrational to
| give 5 a higher probability than the others. Or it seems
| irrational to believe that the sun will explode tomorrow.
| canjobear wrote:
| You might have some probability distribution in your head
| for what will come out of GPT-2 on your machine at a
| certain time, based on your knowledge of the random seed.
| But that is not the GPT-2 probability distribution, which
| is objectively defined by model weights that you can
| download, and which does not correspond to anyone's
| beliefs.
| financltravsty wrote:
| The probability distribution is subjective for both parts
| -- because it, once again, depends on the observer
| observing the events _in order to build a probability
| distribution._
|
| E.g. your random number generator generates 1, 5, 7, 8, 3
| when you run it. It generates 4, 8, 8, 2, 5 when I run
| it. I.e. we have received different information about the
| random number generator to build our _subjective_
| probability distributions. The level of entropy of our
| probability distributions is high because we have so
| little information to be certain about the
| representativeness of our distribution sample.
|
| If we continue running our random number generator for a
| while, we will gather more information, thus reducing
| entropy, and our probability distributions will both
| start converging _towards_ an objective "truth." If we
| ran our random number generators for a theoretically
| infinite amount of time, we will have reduced entropy to
| 0 and have a perfect and objective probability
| distribution.
|
| But this is impossible.
| canjobear wrote:
| Would you say that all claims about the world are
| subjective, because they have to be based on someone's
| observations?
|
| For example my cat weighs 13 pounds. That seems
| objective, in the sense that if two people disagree, only
| one can be right. But the claim is based on my
| observations. I think your logic leads us to deny that
| anything is objective.
| davidmnoll wrote:
| Right but in chemistry class the way it's taught via Gibbs
| free energy etc. makes it seem as if it's an intrinsic
| property.
| waveBidder wrote:
| that's actually the normal view; saying that info and stat
| mech entropy are the same is the outlier position, most
| popularized by Jaynes.
| kmeisthax wrote:
| If information-theoretical and statistical mechanics
| entropies are NOT the same (or at least, deeply
| connected) then what stops us from having a little guy[0]
| sort all the particles in a gas to extract more energy
| from them?
|
| [0] https://en.wikipedia.org/wiki/Maxwell%27s_demon
| xdavidliu wrote:
| Sounds like a non-sequitur to me; what are you implying
| about the Maxwell's demon thought experiment vs the
| comparison between Shannon and stat-mech entropy?
| canjobear wrote:
| Entropy in physics is usually the Shannon entropy of the
| probability distribution over system microstates given
| known temperature and pressure. If the system is in
| equilibrium then this is objective.
| kergonath wrote:
| Entropy in Physics is usually either the Boltzmann or
| Gibbs entropy, both of whom were dead before Shannon was
| born.
| enugu wrote:
| That's not a problem, as the GP's post is trying to state
| a mathematical relation not a historical attribution.
| Often newer concepts shed light on older ones. As Baez's
| article says, Gibbs entropy is Shannon's entropy of an
| associated distribution(multiplied by the constant k).
| kergonath wrote:
| It is a problem because all three come with baggage. Almost
| none of the things discussed in this thread are valid when
| discussing actual physical entropy, even though the equations
| are superficially similar. And then there are lots of people
| being confidently wrong because they assume that it's just
| one concept. It really is not.
| enugu wrote:
| Don't see how the connection is superficial. Even the
| classical macroscopic definition of entropy as ΔS = ∫ dQ/T
| can be derived from the information theory perspective, as
| Baez shows in the article (using entropy-maximizing
| distributions and Lagrange multipliers). If you have a more
| specific critique, it would be good to discuss.
| im3w1l wrote:
| In classical physics there is no real objective
| randomness. Particles have a defined position and
| momentum and those evolve deterministically. If you
| somehow learned these then the shannon entropy is zero.
| If entropy is zero then all kinds of things break down.
|
| So now you are forced to consider e.g. temperature an
| impossibility without quantum-derived randomness, even
| though temperature does not really seem to be a quantum
| thing.
| kgwgk wrote:
| > Particles have a defined position and momentum
|
| Which we don't know precisely. Entropy is about not
| knowing.
|
| > If you somehow learned these then the shannon entropy
| is zero.
|
| Minus infinity. Entropy in classical statistical
| mechanics is proportional to the logarithm of the volume
| in phase space. (You need an appropriate extension of
| Shannon's entropy to continuous distributions.)
|
| > So now you are forced to consider e.g. temperature an
| impossibility without quantum-derived randomness
|
| Or you may study statistical mechanics :-)
| kergonath wrote:
| > Which we don't know precisely. Entropy is about not
| knowing.
|
| No, it is not about not knowing. This is an instance of
| the intuition from Shannon's entropy does not translate
| to statistical Physics.
|
| It is about the number of possible microstates, which is
| completely different. In Physics, entropy is a property
| of a bit of matter, it is not related to the observer or
| their knowledge. We can measure the enthalpy change of a
| material sample and work out its entropy without knowing
| a thing about its structure.
|
| > Minus infinity. Entropy in classical statistical
| mechanics is proportional to the logarithm of the volume
| in phase space.
|
| No, 0. In this case, there is a single state with p = 1,
| and S = -k Σ p ln(p) = 0.
|
| This is the same if you consider the phase space because
| then it is reduced to a single point (you need a bit of
| distribution theory to prove it rigorously but it is
| somewhat intuitive).
|
| The probability p of a microstate is always between 0
| and 1, therefore p ln(p) is always negative and S is
| always positive.
|
| You get the same using Boltzmann's approach, in which
| case Ω = 1 and S = k ln(Ω) is also 0.
|
| > (You need an appropriate extension of Shannon's entropy
| to continuous distributions.)
|
| Gibbs' entropy.
|
| > Or you may study statistical mechanics
|
| Indeed.
| kgwgk wrote:
| > possible microstates
|
| Conditional on the known macrostate. Because we don't
| know the precise microstate - only which microstates are
| possible.
|
| If your reasoning is that << experimental entropy can be
| measured so it's not about that >> then it's not about
| macrostates and microstates either!
| nyssos wrote:
| > In Physics, entropy is a property of a bit of matter,
| it is not related to the observer or their knowledge. We
| can measure the enthalpy change of a material sample and
| work out its entropy without knowing a thing about its
| structure.
|
| Enthalpy is also dependent on your choice of state
| variables, which is in turn dictated by which observables
| you want to make predictions about: whether two
| microstates are distinguishable, and thus whether they are
| part of the same macrostate, depends on the tools you
| have for distinguishing them.
| enugu wrote:
| > If entropy is zero then all kinds of things break down.
|
| Entropy is a macroscopic variable and if you allow
| microscopic information, strange things can happen! One
| can move from a high entropy macrostate to a low entropy
| macrostate if you choose the initial microstate
| carefully. But this is not a reliable process which you
| can reproduce experimentally, ie. it is not a
| thermodynamic process.
|
| A thermodynamic process P is something which takes a
| macrostate A to a macrostate B, independent of which
| microstate a0, a1, a2, ... in A you started off with. If
| the process depends on the microstate, then it wouldn't be
| something we would recognize, as we are looking from the
| macro perspective.
| IIAOPSW wrote:
| Yeah but distributions are just the accounting tools to keep
| track of your entropy. If you are missing one bit of
| information about a system, your understanding of the system
| is some distribution with one bit of entropy. Like the
| original comment said, the entropy is the number of bits
| needed to fill in the unknowns and bring the uncertainty down
| to zero. Your coin flips may be unknown in advance to you,
| and thus you model it as a 50/50 distribution, but in a
| deterministic universe the bits were present all along.
| dist-epoch wrote:
| Trivial example: if you know the seed of a pseudo-random number
| generator, a sequence generated by it has very low entropy.
|
| But if you don't know the seed, the entropy is very high.
| rustcleaner wrote:
| Theoretically, it's still only the entropy of the seed-space
| + time-space it could have been running in, right?
| sva_ wrote:
| https://archive.is/9vnVq
| canjobear wrote:
| What's often lost in the discussions about whether entropy is
| subjective or objective is that, if you dig a little deeper,
| information theory gives you powerful tools for relating the
| objective and the subjective.
|
| Consider cross entropy of two distributions H[p, q] = -Σ p_i
| log q_i. For example maybe p is the real frequency distribution
| over outcomes from rolling some dice, and q is your belief
| distribution. You can see the p_i as representing the objective
| probabilities (sampled by actually rolling the dice) and the
| q_i as your subjective probabilities. The cross entropy is
| measuring something like how surprised you are on average when
| you observe an outcome.
|
| The interesting thing is that H[p, p] <= H[p, q], which means
| that if your belief distribution is wrong, your cross entropy
| will be higher than it would be if you had the right beliefs,
| q=p. This is guaranteed by the concavity of the logarithm. This
| gives you a way to compare beliefs: whichever q gets the lowest
| H[p,q] is closer to the truth.
|
| You can even break cross entropy into two parts, corresponding
| to two kinds of uncertainty: H[p, q] = H[p] + D[p||q]. The
| first term is the entropy of p and it is the aleatoric
| uncertainty, the inherent randomness in the phenomenon you are
| trying to model. The second term is KL divergence and it tells
| you how much additional uncertainty you have as the result of
| having wrong beliefs, which you could call epistemic
| uncertainty.
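|
| A small numerical sketch of that decomposition (Python; the die and
| the "wrong beliefs" here are just made-up numbers for illustration):
|
|     import math
|
|     def cross_entropy(p, q):
|         # H[p, q] = -sum_i p_i log q_i
|         return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)
|
|     def entropy(p):
|         return cross_entropy(p, p)
|
|     def kl(p, q):
|         # D[p||q] = sum_i p_i log(p_i / q_i)
|         return cross_entropy(p, q) - entropy(p)
|
|     p = [1/6] * 6                                 # the real die
|     q = [0.25, 0.25, 0.125, 0.125, 0.125, 0.125]  # wrong beliefs about it
|
|     print(entropy(p))             # H[p, p], the best achievable
|     print(cross_entropy(p, q))    # H[p, q] >= H[p, p]
|     print(entropy(p) + kl(p, q))  # H[p] + D[p||q], equals H[p, q]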
| bubblyworld wrote:
| Thanks, that's an interesting perspective. It also highlights
| one of the weak points in the concept, I think, which is that
| this is only a tool for updating beliefs to the extent that
| the underlying probability space ("ontology" in this analogy)
| can actually "model" the phenomenon correctly!
|
| It doesn't seem to shed much light on when or how you could
| update the underlying probability space itself (or when to
| change your ontology in the belief setting).
| bsmith wrote:
| Couldn't you just add a control (PID/Kalman filter/etc) to
| converge on the stability of some local "most" truth?
| bubblyworld wrote:
| Could you elaborate? To be honest I have no idea what
| that means.
| _hark wrote:
| I think what you're getting at is the construction of the
| sample space - the space of outcomes over which we define
| the probability measure (e.g. {H,T} for a coin, or
| {1,2,3,4,5,6} for a die).
|
| Let's consider two possibilities:
|
| 1. Our sample space is "incomplete"
|
| 2. Our sample space is too "coarse"
|
| Let's discuss 1 first. Imagine I have a special die that
| has a hidden binary state which I can control, which forces
| the die to come up either even or odd. If your sample space
| is only which side faces up, and I randomize the hidden
| state appropriately, it appears like a normal die. If your
| sample space is enlarged to include the hidden state, the
| entropy of each roll is reduced by one bit. You will not be
| able to distinguish between a truly random die and a die
| with a hidden state if your sample space is incomplete. Is
| this the point you were making?
|
| On 2: Now let's imagine I can only observe whether the die
| comes up even or odd. This is a coarse-graining of the
| sample space (we get strictly less information - or, we
| only get some "macro" information). Of course, a coarse-
| grained sample space is necessarily an incomplete one! We
| can imagine comparing the outcomes from a normal die to
| one which rolls an even or odd number with equal
| probability, except that it cycles through the microstates
| deterministically, e.g. an equal chance of {odd, even},
| but, given that outcome, it always goes to the next face
| in the sequence {(1->3->5), (2->4->6)}.
|
| Incomplete or coarse sample spaces can indeed prevent us
| from inferring the underlying dynamics. Many processes can
| have the same apparent entropy on our sample space from
| radically different underlying processes.
| bubblyworld wrote:
| Right, this is exactly what I'm getting at - learning a
| distribution over a fixed sample space can be done with
| Bayesian methods, or entropy-based methods like the OP
| suggested, but I'm wondering if there are methods that
| can automatically adjust the sample space as well.
|
| For well-defined mathematical problems like dice rolling
| and fixed classical mechanics scenarios and such, you
| don't need this I guess, but for any real-world problem I
| imagine half the problem is figuring out a good sample
| space to begin with. This kind of thing must have been
| studied already, I just don't know what to look for!
|
| There are some analogies to algorithms like NEAT, which
| automatically evolves a neural network architecture while
| training. But that's obviously a very different context.
| _hark wrote:
| We could discuss completeness of the sample space, and we
| can also discuss completeness of the _hypothesis space_.
|
| In Solomonoff Induction, which purports to be a theory of
| universal inductive inference, the "complete hypothesis
| space" consists of all computable programs (note that all
| current physical theories are computable, so this
| hypothesis space is very general). Then induction is
| performed by keeping all programs consistent with the
| observations, weighted by two terms: the program's prior
| likelihood, and the probability that the program assigns to
| the observations (the programs can be deterministic and
| assign probability 1).
|
| The "prior likelihood" in Solomonoff Induction is the
| program's complexity (well, 2^(-Complexity), where the
| complexity is the length of the shortest representation
| of that program.
|
| Altogether, the procedure looks like: maintain a belief
| which is a mixture of all programs consistent with the
| observations, weighted by their complexity and the
| likelihood they assign to the data. Of course, this
| procedure is still limited by the sample/observation
| space!
|
| That's our best formal theory of induction in a nutshell.
| canjobear wrote:
| This kind of thinking will lead you to ideas like
| algorithmic probability, where distributions are defined
| using universal Turing machines that could model anything.
| bubblyworld wrote:
| Amazing! I had actually heard about solomonoff induction
| before but my brain didn't make the connection. Thanks
| for the shortcut =)
| tel wrote:
| You can sort of do this over a suitably large (or infinite)
| family of models all mixed, but from an epistemological POV
| that's pretty unsatisfying.
|
| From a practical POV it's pretty useful and common (if you
| allow it to describe non- and semi-parametric models too).
| Agentus wrote:
| Correct anything that's wrong here. Cross entropy is the
| comparison of two distributions, right? Is the objectivity
| sussed out in relation to the overlap cross-section? And is
| the subjectivity sussed out not on average but as deviations
| from the average? Just trying to understand it in my own
| framework, which might be wholly off the mark.
| vinnyvichy wrote:
| Baez has a video (accompanying, imho), with slides
|
| https://m.youtube.com/watch?v=5phJVSWdWg4&t=17m
|
| He illustrates the derivation of Shannon entropy with pictures
| of trees
| IIAOPSW wrote:
| To shorten this for you with my own (identical) understanding:
| "entropy is just the name for the bits you don't have".
|
| Entropy + Information = Total bits in a complete description.
| CamperBob2 wrote:
| It's an objective quantity, but you have to be very precise in
| stating what the quantity describes.
|
| Unbroken egg? Low entropy. There's only one way the egg can
| exist in an unbroken state, and that's it. You could represent
| the state of the egg with a single bit.
|
| Broken egg? High entropy. There are an arbitrarily-large number
| of ways that the pieces of a broken egg could land.
|
| A list of the locations and orientations of each piece of the
| broken egg, sorted by latitude, longitude, and compass bearing?
| Low entropy again; for any given instance of a broken egg,
| there's only one way that list can be written.
|
| Zip up the list you made? High entropy again; the data in the
| .zip file is effectively random, and cannot be compressed
| significantly further. Until you unzip it again...
|
| Likewise, if you had to transmit the (uncompressed) list over a
| bandwidth-limited channel. The person receiving the data can
| make no assumptions about its contents, so it might as well be
| random even though it has structure. Its entropy is effectively
| high again.
| kragen wrote:
| shannon entropy is subjective for bayesians and objective for
| frequentists
| marcosdumay wrote:
| The entropy is objective if you completely define the
| communication channel, and subjective if you weave the
| definition away.
| kragen wrote:
| the subjectivity doesn't stem from the definition of the
| channel but from the model of the information source.
| what's the prior probability that you _intended_ to say
| 'weave', for example? that depends on which model of your
| mind we are using. frequentists argue that there is an
| objectively correct model of your mind we should always
| use, and bayesians argue that it depends on our _prior
| knowledge_ about your mind
| marcosdumay wrote:
| > he glosses over this
|
| All of information theory is relative to the channel. This bit
| is well communicated.
|
| What he glosses over is the definition of "channel", since it's
| obvious for electromagnetic communications.
| niemandhier wrote:
| My goto source for understanding entropy: http://philsci-
| archive.pitt.edu/8592/1/EntropyPaperFinal.pdf
| prof-dr-ir wrote:
| If I were to write a book with that title, I would get to the
| point a bit faster, probably as follows.
|
| Entropy is _just_ a number you can associate with a probability
| distribution. If the distribution is discrete, so you have a set
| p_i, i = 1..n, which are each positive and sum to 1, then the
| definition is:
|
| S = - sum_i p_i log( p_i )
|
| Mathematically we say that entropy is a real-valued function on
| the space of probability distributions. (Elementary exercises:
| show that S >= 0 and it is maximized on the uniform
| distribution.)
|
| That is it. I think there is little need for all the mystery.
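|
| (A brute-force numerical illustration of the definition and the two
| exercises, as a Python sketch rather than a proof:)
|
|     import math
|     import random
|
|     def S(probs):
|         # S = -sum_i p_i log(p_i); terms with p_i = 0 contribute 0.
|         return -sum(p * math.log(p) for p in probs if p > 0)
|
|     n = 5
|     uniform = [1 / n] * n
|
|     # S >= 0, and no randomly drawn distribution beats the uniform one.
|     for _ in range(10000):
|         w = [random.random() for _ in range(n)]
|         p = [x / sum(w) for x in w]
|         assert 0 <= S(p) <= S(uniform) + 1e-12
|
|     print(S(uniform), math.log(n))  # the maximum is log(n)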
| kgwgk wrote:
| That covers one and a half of the twelve points he discusses.
| prof-dr-ir wrote:
| Correct! And it took me just one paragraph, not the 18 pages
| of meandering (and I think confusing) text that it takes the
| author of the pdf to introduce the same idea.
| kgwgk wrote:
| You didn't introduce any idea. You said it's "just a
| number" and wrote down a formula without any explanation or
| justification.
|
| I concede that it was much shorter though. Well done!
| bdjsiqoocwk wrote:
| Haha you reminded me of that idea in software engineering
| that "it's easy to make an algorithm faster if you accept
| that at times it might output the wrong result; in fact
| you can make it infinitely fast"
| rachofsunshine wrote:
| The problem is that this doesn't get at many of the intuitive
| properties of entropy.
|
| A different explanation (based on macro- and micro-states)
| makes it intuitively obvious why entropy is non-decreasing with
| time or, with a little more depth, what entropy has to do with
| temperature.
| prof-dr-ir wrote:
| The above evidently only suffices as a definition, not as an
| entire course. My point was just that I don't think any other
| introduction beats this one, especially for a book with the
| given title.
|
| In particular it has always been my starting point whenever I
| introduce (the entropy of) macro- and micro-states in my
| statistical physics course.
| mjw_byrne wrote:
| That doesn't strike me as a problem. Definitions are often
| highly abstract and counterintuitive, with much study
| required to understand at an intuitive level what motivates
| them. Rigour and intuition are often competing concerns, and
| I think definitions should favour the former. The definition
| of compactness in topology, or indeed just the definition of
| a topological space, are examples of this - at face value,
| they're bizarre. You have to muck around a fair bit to
| understand why they cut so brilliantly to the heart of the
| thing.
| nabla9 wrote:
| Everyone who sees that formula can immediately see that it
| leads to the principle of maximum entropy.
|
| Just like everyone seeing Maxwell's equations can immediately
| see that you can derive the speed of light classically.
|
| Oh dear. The joy of explaining the little you know.
| prof-dr-ir wrote:
| As of this moment there are six other top-level comments
| which each try to define entropy, and frankly they are all
| wrong, circular, or incomplete. Clearly the very _definition_
| of entropy is confusing, and the _definition_ is what my
| comment provides.
|
| I never said that all the other properties of entropy are now
| immediately visible. Instead I think it is the only universal
| starting point of any reasonable discussion or course on the
| subject.
|
| And lastly I am frankly getting discouraged by all the
| dismissive responses. So this will be my last comment for the
| day, and I will leave you in the careful hands of, say, the
| six other people who are obviously so extremely knowledgeable
| about this topic. /s
| mitthrowaway2 wrote:
| So the only thing you need to know about entropy is that it's
| _a real-valued number you can associate with a probability
| distribution_? And that's it? I disagree. There are several
| numbers that can be associated with a probability distribution,
| and entropy is an especially useful one, but to understand why
| entropy is useful, or why you'd use that function instead of a
| different one, you'd need to know a few more things than just
| what you've written here.
| Maxatar wrote:
| Exactly, saying that's all there is to know about entropy is
| like saying all you need to know about chess are the rules
| and all you need to know about programming is the
| syntax/semantics.
|
| Knowing the plain definition or the rules is nothing but a
| superficial understanding of the subject. Knowing how to use
| the rules to actually do something meaningful, having a
| strategy, that's where meaningful knowledge lies.
| FabHK wrote:
| In particular, the expectation (or variance) of a real-valued
| random variable can also be seen as "a real-valued number you
| can associate with a probability distribution".
|
| Thus, GP's statement is basically: "entropy is like
| expectation, but different".
| prof-dr-ir wrote:
| Of course that is not my statement. See all my other replies
| to identical misinterpretations of my comment.
| senderista wrote:
| Many students will want to know where the minus sign comes
| from. I like to write the formula instead as S = sum_i p_i log(
| 1 / p_i ), where (1 / p_i) is the "surprise" (i.e., expected
| number of trials before first success) associated with a given
| outcome (or symbol), and we average it over all outcomes (i.e.,
| weight it by the probability of the outcome). We take the log
| of the "surprise" because entropy is an extensive quantity, so
| we want it to be additive.
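|
| (A two-line check of the rewrite, as a Python sketch:)
|
|     import math
|
|     p = [0.5, 0.25, 0.125, 0.125]
|     print(-sum(x * math.log2(x) for x in p))     # -sum p log p
|     print(sum(x * math.log2(1 / x) for x in p))  # mean log-"surprise"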
| mensetmanusman wrote:
| Don't forget it's the only measure of the arrow of time.
| kgwgk wrote:
| One could also say that it's just a consequence of the
| passage of time (as in getting away from a boundary
| condition). The decay of radioactive atoms is also a measure
| of the arrow of time - of course we can say that's the same
| thing.
|
| CP violation may (or may not) be more relevant regarding the
| arrow of time.
| kaashif wrote:
| That definition is on page 18; I agree it could've been reached
| a bit faster, but a lot of the preceding material is motivation,
| puzzles, and examples.
|
| This definition isn't the end goal, the physics things are.
| klysm wrote:
| The definition by itself, without intuition about its
| applications, is of little use.
| bubblyworld wrote:
| Thanks for defining it rigorously. I think people are getting
| offended on John Baez's behalf because his book obviously
| covers a lot more - like _why_ does this particular number seem
| to be so useful in so many different contexts? How could you
| have motivated it a priori? Etcetera, although I suspect you
| know all this already.
|
| But I think you're right that a clear focus on the maths is
| useful for dispelling misconceptions about entropy.
| kgwgk wrote:
| Misconceptions about entropy are misconceptions about
| physics. You can't dispel them by focusing on the maths and
| ignoring the physics entirely - especially if you just write
| an equation without any conceptual discussion, not even
| mathematical.
| bubblyworld wrote:
| I didn't say to _only_ focus on the mathematics. Obviously
| wherever you apply the concept (and it's applied to much
| more than physics) there will be other sources of
| confusion. But just knowing that entropy is a property of a
| distribution, not a state, already helps clarify your
| thinking.
|
| For instance, you know that the question "what is the
| entropy of a broken egg?" is actually meaningless, because
| you haven't specified a distribution (or a set of
| micro/macro states in the stat mech formulation).
| kgwgk wrote:
| Ok, I don't think we disagree. But knowing that entropy
| is a property of a distribution given by that equation is
| far from "being it" as a definition of the concept of
| entropy in physics.
|
| Anyway, it seems that - like many others - I just
| misunderstood the "little need for all the mystery"
| remark.
| bubblyworld wrote:
| Right, I see what you're saying. I agree that there is a
| lot of subtlety in the way entropy is actually used in
| practice.
| prof-dr-ir wrote:
| > is far from "being it" as a definition of the concept
| of entropy in physics.
|
| I simply do not understand why you say this. Entropy in
| physics is defined using _exactly_ the same equation. The
| only thing I need to add is the choice of probability
| distribution (i.e. the choice of ensemble).
|
| I really do not see a better "definition of the concept
| of entropy in physics".
|
| (For quantum systems one can nitpick a bit about density
| matrices, but in my view that is merely a technicality on
| how to extend probability distributions to Hilbert
| spaces.)
| kgwgk wrote:
| I'd say that the concept of entropy "in physics" is about
| (even better: starts with) the choice of a probability
| distribution. Without that you have just a number
| associated with each probability distribution -
| distributions without any physical meaning so those
| numbers won't have any physical meaning either.
|
| But that's fine, I accept that you may think that it's
| just a little detail.
|
| (Quantum mechanics has no mystery either.
|
| ih/2pi dA/dt = AH - HA
|
| That's it. The only thing one needs to add is a choice of
| operators.)
| prof-dr-ir wrote:
| Sarcasm aside, I really do not think you are making much
| sense.
|
| Obviously one first introduces the relevant probability
| distributions (at least the micro-canonical ensemble).
| But once you have those, your comment still does not
| offer a better way to introduce entropy other than what I
| wrote. What did you have in mind?
|
| In other words, how did you think I should change this
| part of my course?
| eointierney wrote:
| Ah JCB, how I love your writing, you are always so very generous.
|
| Your This Week's Finds were a hugely enjoyable part of my
| undergraduate education and beyond.
|
| Thank you again.
| dmn322 wrote:
| This seems like a great resource for referencing the various
| definitions. I've tried my hand at developing an intuitive
| understanding: https://spacechimplives.substack.com/p/observers-
| and-entropy. TLDR - it's an artifact of the model we're using. In
| the thermodynamic definition, the energy accounted for in the
| terms of our model is information. The energy that's not is
| entropic energy. Hence it's not "usable" energy, and the
| process isn't reversible.
| zoenolan wrote:
| Hawking on the subject
|
| https://youtu.be/wgltMtf1JhY
| bdjsiqoocwk wrote:
| Hmmm, I've noticed that the list of things that contribute to
| entropy omits particles which under "normal circumstances" on
| Earth exist in bound states; for example, it doesn't mention W
| bosons or gluons. But in some parts of the universe they're not
| bound but in a different state of matter, e.g. quark-gluon
| plasma. I wonder how, or if, this was taken into account.
| yellowcake0 wrote:
| Information entropy is literally the strict lower bound on how
| efficiently information can be communicated (expected number of
| transmitted bits) if the probability distribution which generates
| this information is known; that's it. Even in contexts such as
| calculating the information entropy of a bit string, or the
| English language, you're just taking this data and constructing
| some empirical probability distribution from it using the
| relative frequencies of zeros and ones or letters or n-grams or
| whatever, and then calculating the entropy of that distribution.
|
| I can't say I'm overly fond of Baez's definition, but far be it
| from me to question someone of his stature.
| arjunlol wrote:
| DS = DQ/T
| utkarsh858 wrote:
| I sometimes ponder where new entropy/randomness is coming from.
| For example, if we take the earliest state of the universe as an
| infinitely dense point which expanded, there must have been some
| randomness, or say variety, which led it to expand in a non-
| uniform way, which in turn led to the dominance of matter over
| anti-matter, the creation of galaxies, clusters etc. If we take
| an isolated system in which certain static particles are
| present, can it be the case that a small subset of the particles
| will acquire motion and thus introduce entropy? Can entropy be
| induced automatically, at least on a quantum level? If anyone
| can help me understand this it would be very helpful, and it
| could help explain the origin of the universe in a better way.
| pseidemann wrote:
| I saw this video, which explained it for me (it's German; maybe
| the automatic subtitles will work for you):
| https://www.youtube.com/watch?v=hrJViSH6Klo
|
| He argues that the randomness you are looking for comes from
| quantum fluctuations, and if this randomness did not exist, the
| universe would probably never have "happened".
| empath75 wrote:
| Symmetry breaking is the general phenomenon that underlies most
| of that.
|
| The classic example is this:
|
| Imagine you have a perfectly symmetrical sombrero[1], and
| there's a ball balanced on top of the middle of the hat.
| There's no preferred direction it should fall in, but it's
| _unstable_. Any perturbation will make it roll downhill and
| come to rest in a stable configuration on the brim of the hat.
| The symmetry of the original configuration is now broken, but
| it's stable.
|
| 1: https://m.media-
| amazon.com/images/I/61M0LFKjI9L.__AC_SX300_S...
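|
| A quick numerical sketch of that picture, with an arbitrary toy
| potential standing in for the sombrero:
|
|     from math import sqrt
|
|     # Toy radial "Mexican hat" potential: V(r) = -r**2 + r**4
|     # (coefficients chosen arbitrarily for illustration)
|     def dV(r):   # slope of the potential
|         return -2 * r + 4 * r ** 3
|
|     def d2V(r):  # curvature of the potential
|         return -2 + 12 * r ** 2
|
|     # at the peak: slope 0 but curvature -2, unstable equilibrium
|     print(dV(0.0), d2V(0.0))
|
|     # on the "brim" r = 1/sqrt(2): slope ~0, curvature ~4, stable
|     r_brim = 1 / sqrt(2)
|     print(dV(r_brim), d2V(r_brim))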
| tasteslikenoise wrote:
| I've always favored this down-to-earth characterization of the
| entropy of a discrete probability distribution. (I'm a big fan of
| John Baez's writing, but I was surprised glancing through the PDF
| to find that he doesn't seem to mention this viewpoint.)
|
| Think of the distribution as a histogram over some bins. Then,
| the entropy is a measurement of, if I throw many many balls at
| random into those bins, the probability that the distribution of
| balls over bins ends up looking like that histogram. What you
| usually expect to see is a uniform distribution of balls over
| bins, so the entropy measures the probability of other rare
| events (in the language of probability theory, "large deviations"
| from that typical behavior).
|
| More specifically, if P = (P1, ..., Pk) is some distribution,
| then the probability that throwing N balls (for N very large)
| gives a histogram looking like P is about 2^(-N * [log(k) -
| H(P)]), where H(P) is the entropy. When P is the uniform
| distribution, then H(P) = log(k), the exponent is zero, and the
| estimate is 1, which says that by far the most likely histogram
| is the uniform one. That is the largest possible entropy, so any
| other histogram has probability 2^(-c*N) of appearing for some c
| > 0, i.e., is very unlikely, and exponentially more so the more
| balls we throw, but the entropy measures just how much. "Less
| uniform" distributions are less likely, so the entropy also
| measures a certain notion of uniformity. In large deviations
| theory this specific claim is called "Sanov's theorem" and the
| role the entropy plays is that of a "rate function."
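|
| A quick numerical check of that 2^(-N * [log(k) - H(P)]) estimate
| (a hedged Python sketch; the function names are made up, and the
| agreement is only up to the polynomial factors Sanov's theorem
| ignores):
|
|     from math import lgamma, log, log2
|
|     def exact_log2_prob(counts):
|         # log2 of the exact multinomial probability of seeing
|         # precisely these bin counts when sum(counts) balls are
|         # thrown uniformly at random into len(counts) bins
|         n, k = sum(counts), len(counts)
|         log_coeff = lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)
|         return (log_coeff - n * log(k)) / log(2)
|
|     def sanov_rate(counts):
|         # the exponent log(k) - H(P_hat), with P_hat = counts/N
|         n, k = sum(counts), len(counts)
|         H = -sum((c / n) * log2(c / n) for c in counts if c > 0)
|         return log2(k) - H
|
|     for scale in (1, 10, 100, 1000):
|         counts = [30 * scale, 20 * scale, 10 * scale]
|         N = sum(counts)
|         # the middle column approaches the right one as N grows
|         print(N, -exact_log2_prob(counts) / N, sanov_rate(counts))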
|
| The counting interpretation of entropy that some people are
| talking about is related, at least at a high level, because the
| probability in Sanov's theorem is the number of outcomes that
| "look like P" divided by the total number, so the numerator there
| is indeed counting the number of configurations (in this case of
| balls and bins) having a particular property (in this case
| looking like P).
|
| There are lots of equivalent definitions and they have different
| virtues, generalizations, etc, but I find this one especially
| helpful for dispelling the air of mystery around entropy.
| vinnyvichy wrote:
| Hey, did you mean _relative entropy_ ~ rate function ~ KL
| divergence? That might be more familiar to the ML enthusiasts
| here, and might get them curious about Sanov's theorem and large
| deviations.
| tasteslikenoise wrote:
| That's right, here log(k) - H(p) is really the relative
| entropy (or KL divergence) between p and the uniform
| distribution, and all the same stuff is true for a different
| "reference distribution" of the probabilities of balls
| landing in each bin.
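|
| A small sanity check of that identity (throwaway Python, with an
| arbitrary example distribution):
|
|     from math import log2
|
|     p = [0.5, 0.25, 0.125, 0.125]   # arbitrary distribution
|     k = len(p)
|     H = -sum(pi * log2(pi) for pi in p)
|     KL = sum(pi * log2(pi / (1 / k)) for pi in p)
|     print(log2(k) - H, KL)          # both print 0.25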
|
| For discrete distributions the "absolute entropy" (just sum
| of -p log(p) as it shows up in Shannon entropy or statistical
| mechanics) is in this way really a special case of relative
| entropy. For continuous distributions, say over real numbers,
| the analogous quantity (integral of -p log(p)) isn't a
| relative entropy since there's no "uniform distribution over
| all real numbers". This still plays an important role in
| various situations and calculations...but, at least to my
| mind, it's a formally similar but conceptually separate
| object.
| vinnyvichy wrote:
| The book might disappoint some..
|
| >I have largely avoided the second law of thermodynamics ...
| Thus, the aspects of entropy most beloved by physics popularizers
| will not be found here.
|
| But personally, this bit is the most exciting to me.
|
| >I have tried to say as little as possible about quantum
| mechanics, to keep the physics prerequisites low. However,
| Planck's constant shows up in the formulas for the entropy of the
| three classical systems mentioned above. The reason for this is
| fascinating: Planck's constant provides a unit of volume in
| position-momentum space, which is necessary to define the entropy
| of these systems. Thus, we need a tiny bit of quantum mechanics
| to get a good approximate formula for the entropy of hydrogen,
| even if we are trying our best to treat this gas classically.
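|
| (For a concrete instance of what that passage points at: in one
| standard form of the Sackur-Tetrode formula for a classical
| monatomic ideal gas,
|
|     S = N k [ ln( (V/N) * (4 pi m U / (3 N h^2))^(3/2) ) + 5/2 ],
|
| the h^3 hiding in the second factor is exactly that unit of
| position-momentum volume; without it the argument of the
| logarithm would not even be dimensionless.)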
| foobarbecue wrote:
| How do you get to the actual book / tweets? The link just takes
| me back to the foreword...
| vishnugupta wrote:
| http://math.ucr.edu/home/baez/what_is_entropy.pdf
| GoblinSlayer wrote:
| There's a fundamental nature to entropy, but as usual it's not
| very enlightening for the poor monkey brain, so to explain it you
| need to enumerate all of its high-level behavior; but that high-
| level behavior is accidental and can't be summarized in a concise
| form.
| space_oddity wrote:
| This complexity underscores the richness of the concept
| ctafur wrote:
| The way I understand it is with an analogy to probability. To me,
| events are to microscopic states as a random variable is to
| entropy.
| ctafur wrote:
| My first contact with entropy was in chemistry and
| thermodynamics and I didn't get it. Actually I didn't get
| anything from engineering thermodynamics books such as Cengel's,
| and so on.
|
| You must go to statistical mechanics or information theory to
| understand entropy. Or try these PRICELESS NOTES from Prof.
| Suo:
| https://docs.google.com/document/d/1UMwpoDRZLlawWlL2Dz6YEomy...
| jsomedon wrote:
| Am I the only one who can't download the PDF, or is the file
| server down? I can see the blog page, but when I try downloading
| the ebook it just doesn't work...
|
| If the file server is down, could anyone upload the ebook for
| download?
| tromp wrote:
| Closely related recent discussion on The Second Law of
| Thermodynamics (2011) (franklambert.net):
|
| https://news.ycombinator.com/item?id=40972589
| ThrowawayTestr wrote:
| MC Hawking already explained this
|
| https://youtu.be/wgltMtf1JhY
| ccosm wrote:
| "I have largely avoided the second law of thermodynamics, which
| says that entropy always increases. While fascinating, this is so
| problematic that a good explanation would require another book!"
|
| For those interested I am currently reading "Entropy Demystified"
| by Arieh Ben-Naim which tackles this side of things from much the
| same direction.
| suoduandao3 wrote:
| I like the formulation of 'the amount of information we don't
| know about a system that we could in theory learn'. I'm surprised
| there's no mention of the Copenhagen interpretation's interaction
| with this definition; under a lot of QM theories, 'unavailable
| information' is different from available information.
| tsoukase wrote:
| After years of thought I dare to say the 2nd law of
| thermodynamics is a tautology: "entropy increases" means every
| system tends toward its most probable macrostate, which is to say
| that the most probable state is the most probable.
| tel wrote:
| I think that's right, though it's non-obvious that more
| probable systems are disordered. At least as non-obvious as
| Pascal's triangle is.
|
| Which is to say, worth saying from a first-principles POV, but
| not all that startling.
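|
| A tiny counting illustration of that Pascal's-triangle point (a
| Python sketch with arbitrary numbers):
|
|     from math import comb
|
|     N = 100                # coin flips standing in for particles
|     total = 2 ** N         # number of equally likely microstates
|
|     # microstates whose macrostate (number of heads) is within
|     # 10 of the 50/50 split
|     near_half = sum(comb(N, k) for k in range(40, 61))
|     print(near_half / total)    # ~0.96
|
|     # microstates in a visibly "ordered" macrostate: >= 90 heads
|     ordered = sum(comb(N, k) for k in range(90, 101))
|     print(ordered / total)      # ~1.5e-17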
___________________________________________________________________
(page generated 2024-07-23 23:10 UTC)