[HN Gopher] Entropy of a Large Language Model output
       ___________________________________________________________________
        
       Entropy of a Large Language Model output
        
       Author : woodglyst
       Score  : 115 points
       Date   : 2025-01-09 20:00 UTC (4 days ago)
        
 (HTM) web link (nikkin.dev)
 (TXT) w3m dump (nikkin.dev)
        
       | WhitneyLand wrote:
       | > the output token of the LLM (black box) is not deterministic.
       | Rather, it is a probability distribution over all the available
       | tokens
       | 
       | How is this not deterministic? Randomness is intentionally added
       | via temperature.
        
         | cjtrowbridge wrote:
         | Entropy is also added via a random seed. The model is only
         | deterministic if you use the same random seed.
        
           | HarHarVeryFunny wrote:
           | I think you're confusing training and inference. During
           | training there are things like initialization, data shuffling
           | and dropout that depend on random numbers. At inference time
           | these don't apply.
        
             | jampekka wrote:
             | Decoding (sampling) uses (pseudo) random numbers. Otherwise
             | same prompt would always give the same response.
             | 
             | Computing entropy generally does not.
             | 
             | See e.g. https://huggingface.co/blog/how-to-generate
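              | 
              | A minimal numpy sketch of that point (the four-token
              | distribution is invented purely for illustration): drawing a
              | token needs a seedable PRNG, computing the entropy does not.
              | 
              |   import numpy as np
              | 
              |   # Toy next-token distribution (made-up numbers).
              |   probs = np.array([0.70, 0.20, 0.05, 0.05])
              | 
              |   # Sampling uses pseudo-random numbers; fixing the seed
              |   # makes the draw reproducible.
              |   tok1 = np.random.default_rng(seed=7).choice(4, p=probs)
              |   tok2 = np.random.default_rng(seed=7).choice(4, p=probs)
              |   assert tok1 == tok2
              | 
              |   # Computing the entropy involves no randomness at all.
              |   entropy = -np.sum(probs * np.log2(probs))
              |   print(tok1, f"{entropy:.3f} bits")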
        
               | HarHarVeryFunny wrote:
               | Sure - but that's not the output of the model itself,
               | that's the process of (typically) randomly sampling from
               | the output of the model.
        
               | throwaway314155 wrote:
               | Right, sampling from a model, also known as *inference*
                | (for LLMs).
               | 
               | The inference here is perhaps less pure than what you
               | refer to but you're talking to human beings; there's no
               | need for heavy pedantry.
        
         | hansvm wrote:
         | The output "token"
         | 
         | Yes, you can sample deterministically, but that's some
         | combination of computationally intractable and only useful on a
         | small subset of problems. The black box outputting a non-
         | deterministic token is a close enough approximation for most
         | people.
        
           | HarHarVeryFunny wrote:
           | The author of the article seems confused, saying:
           | 
           | "The important thing to remember is that the output token of
           | the LLM (black box) is not deterministic. Rather, it is a
           | probability distribution over all the available tokens in the
           | vocabulary."
           | 
           | He is saying that there is non-determinism in the output of
           | the LLM (i.e. in these probability distributions), when in
           | fact the randomness only comes from choosing to use a random
           | number generator to sample from this output.
        
             | fancyfredbot wrote:
             | The author is saying that the output _token_ is not
              | deterministic. I don't think they said the distribution
             | was stochastic.
             | 
             | Even so the distribution of the second token output by the
             | model would be stochastic (unless you condition on the
             | first token). So in that sense there may also be a
             | stochastic probability distribution.
        
         | apstroll wrote:
         | The output distribution is deterministic, the output token is
         | sampled from the output distribution, and is therefore not
         | deterministic. Temperature modulates the output distribution,
          | but setting it to 0 (i.e. argmax sampling) is not the norm.
        
           | Der_Einzige wrote:
           | Running temperature of zero/greedy sampling (what you call
           | "argmax sampling") is EXTREMELY common.
           | 
           | LLMs are basically "deterministic" when using greedy sampling
           | except for either MoE related shenanigans (what historically
           | prevented determinism in ChatGPT) or due to floating point
           | related issues (GPU related). In practice, LLMs are in fact
           | basically "deterministic" except for the sampling/temperature
           | stuff that we add at the very end.
        
             | HarHarVeryFunny wrote:
             | > except for either MoE related shenanigans (what
             | historically prevented determinism in ChatGPT)
             | 
              | The original ChatGPT was based on GPT-3.5, which did not
             | use MoE.
        
         | alew1 wrote:
         | "Temperature" doesn't make sense unless your model is
         | predicting a distribution. You can't "temperature sample" a
         | calculator, for instance. The output of the LLM is a predictive
         | distribution over the next token; this is the formulation you
         | will see in every paper on LLMs. It's true that you can do
         | various things with that distribution _other_ than sampling it:
         | you can compute its entropy, you can find its mode (argmax),
          | etc., but the type signature of the LLM itself is `prompt ->
         | probability distribution over next tokens`.
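          | 
          | A rough sketch of that type signature with Hugging Face
          | transformers (GPT-2 here is just a small example model; any
          | causal LM exposes the same thing):
          | 
          |   import torch
          |   from transformers import AutoModelForCausalLM, AutoTokenizer
          | 
          |   tok = AutoTokenizer.from_pretrained("gpt2")
          |   model = AutoModelForCausalLM.from_pretrained("gpt2")
          | 
          |   ids = tok("The capital of France is",
          |             return_tensors="pt").input_ids
          |   with torch.no_grad():
          |       logits = model(ids).logits[0, -1]  # next-token scores
          | 
          |   # prompt -> probability distribution over next tokens:
          |   probs = torch.softmax(logits, dim=-1)
          | 
          |   # things you can then do with that distribution:
          |   entropy = -(probs * torch.log(probs + 1e-12)).sum()
          |   mode = tok.decode(probs.argmax().item())  # argmax
          |   sample = torch.multinomial(probs, 1)      # or sample from it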
        
           | wyager wrote:
            | The temperature in LLMs is a parameter of the normalization
            | (softmax) step that determines how neuron activation levels
            | get mapped to probabilities.
           | 
           | Zero temperature => fully deterministic
           | 
           | The neuron activation levels do not inherently form or
           | represent a probability distribution. That's something we've
           | slapped on after the fact
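            | 
            | A small sketch of that mapping (the usual softmax with a
            | temperature; the logit values are made up, and T = 0 is
            | treated as a special case meaning pure argmax, since you
            | can't literally divide by zero):
            | 
            |   import numpy as np
            | 
            |   def softmax_T(logits, T):
            |       z = np.array(logits) / T   # divide by temperature
            |       z -= z.max()               # numerical stability
            |       e = np.exp(z)
            |       return e / e.sum()
            | 
            |   logits = [2.0, 1.0, 0.5, -1.0]  # made-up activations
            |   print(softmax_T(logits, 1.0))   # spread-out distribution
            |   print(softmax_T(logits, 0.01))  # ~[1, 0, 0, 0]: argmax
            |   print(softmax_T(logits, 10.0))  # nearly uniform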
        
             | alew1 wrote:
             | Any interpretation (including interpreting the _inputs_ to
              | the neural net as a "prompt") is "slapped on" in some
             | sense--at some level, it's all just numbers being added,
             | multiplied, and so on.
             | 
             | But I wouldn't call the probabilistic interpretation "after
             | the fact." The entire training procedure that generated the
             | LM weights (the pre-training as well as the RLHF post-
             | training) is formulated based on the understanding that the
             | LM predicts p(x_t | x_1, ..., x_{t-1}). For example,
             | pretraining maximizes the log probability of the training
             | data, and RLHF typically maximizes an objective that
             | combines "expected reward [under the LLM's output
             | probability distribution]" with "KL divergence between the
             | pretraining distribution and the RLHF'd distribution" (a
             | probabilistic quantity).
        
         | TeMPOraL wrote:
         | There's extra randomness added accidentally in practice:
         | inference is a massively parallelized set of matrix
          | multiplications, and floating point addition is not associative -
         | the randomness in execution order gets converted into a random
         | FP error, so even setting temperature to 0 doesn't guarantee
         | repeatable results.
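          | 
          | A tiny illustration: the same three numbers summed in a
          | different order give different floating point results.
          | 
          |   a, b, c = 1e16, -1e16, 1.0
          |   print((a + b) + c)   # 1.0
          |   print(a + (b + c))   # 0.0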
        
           | HeatrayEnjoyer wrote:
            | Only if the inference software doesn't guarantee a
            | deterministic execution order, which is CS 101
        
         | nikkindev wrote:
         | Author here: Yes. You are right. I was meaning to paint a
         | picture that instead of the next token appearing magically, it
         | is sampled from a probability distribution. The notion of
         | determinism could be explained differently. Thanks for pointing
         | it out!
        
       | netruk44 wrote:
       | I wonder if we could combine 'thinking' models (which write
       | thoughts out before replying) with a mechanism they can use to
       | check their own entropy as they're writing output.
       | 
       | Maybe it could eventually learn when it needs to have a low
       | entropy token (to produce a more-likely-to-be-factual statement)
       | and then we can finally have models that actually definitely know
       | when to say "Sorry, I don't seem to have a good answer for you."
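        | 
        | A rough sketch of what such entropy monitoring could look like
        | with a Hugging Face causal LM (GPT-2 and the 4-nat threshold are
        | arbitrary stand-ins, not how any real 'thinking' model works):
        | 
        |   import torch
        |   from transformers import AutoModelForCausalLM, AutoTokenizer
        | 
        |   tok = AutoTokenizer.from_pretrained("gpt2")
        |   model = AutoModelForCausalLM.from_pretrained("gpt2")
        | 
        |   ids = tok("The chemical symbol for tungsten is",
        |             return_tensors="pt").input_ids
        |   for step in range(10):
        |       with torch.no_grad():
        |           logits = model(ids).logits[0, -1]
        |       probs = torch.softmax(logits, dim=-1)
        |       entropy = -(probs * torch.log(probs + 1e-12)).sum().item()
        |       next_id = probs.argmax().reshape(1, 1)  # greedy decode
        |       if entropy > 4.0:  # "not sure" flag at this token
        |           print(f"high entropy ({entropy:.2f} nats) at {step}")
        |       ids = torch.cat([ids, next_id], dim=-1)
        |   print(tok.decode(ids[0]))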
        
         | vletal wrote:
         | https://github.com/xjdr-alt/entropix
        
           | Der_Einzige wrote:
            | Entropix will get its time in the sun, but for now, the LLM
           | academic community is still 2 years behind the open source
           | community. Min_p sampling is going to end up getting an oral
           | about it at ICLR with the scores it's getting...
           | 
           | https://openreview.net/forum?id=FBkpCyujtS
        
             | diggan wrote:
             | > the LLM academic community is still 2 years behind the
             | open source community
             | 
             | Huh, isn't it the other way around? Thanks to the academic
             | (and open) research about LLMs, we have any open source
             | community around LLMs in the first place.
        
       | fedeb95 wrote:
       | it seems very noise-like to me.
        
       | gwern wrote:
       | You are observing "flattened logits"
       | https://arxiv.org/pdf/2303.08774#page=12&org=openai .
       | 
        | The entropy of _Chat_GPT (as well as all other generative models
        | which have been 'tuned' using RLHF, instruction-tuning, DPO, etc)
        | is so low because it is _not_ predicting "most likely tokens" or
        | doing compression. An LLM like ChatGPT has been turned into an RL
       | agent which seeks to maximize reward by taking the optimal
       | action. It is, ultimately, predicting what will manipulate the
       | imaginary human rater into giving it a high reward.
       | 
       | So the logits aren't telling you anything like 'what is the
       | probability in a random sample of Internet text of the next
       | token', but are closer to a Bellman value function, expressing
       | the model's belief as to what would be the net reward from
       | picking each possible BPE as an 'action' and then continuing to
       | pick the optimal BPE after that (ie. following its policy until
       | the episode terminates). Because there is usually 1 best action,
       | it tries to put the largest value on that action, and assign very
       | small values to the rest (no matter how plausible each of them
       | might be if you were looking at random Internet text). This
       | reduction in entropy is a standard RL effect as agents switch
       | from exploration to exploitation: there is no benefit to taking
       | anything less than the single best action, so you don't want to
       | risk taking any others.
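        | 
        | To make that entropy collapse concrete, here is a toy comparison
        | (both distributions are invented, purely to show the direction of
        | the effect):
        | 
        |   import numpy as np
        | 
        |   def entropy_bits(p):
        |       p = np.asarray(p, dtype=float)
        |       return float(-(p * np.log2(p)).sum())
        | 
        |   base_like = [0.30, 0.25, 0.20, 0.15, 0.10]  # several BPEs
        |   rlhf_like = [0.96, 0.01, 0.01, 0.01, 0.01]  # one "action"
        | 
        |   print(entropy_bits(base_like))  # ~2.23 bits
        |   print(entropy_bits(rlhf_like))  # ~0.32 bits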
       | 
        | This is also why completions are so boring, Boltzmann
        | temperature stops mattering, and more complex sampling strategies
        | like best-of-N don't work so well: the greedy logit-maximizing
       | removes information about interesting alternative strategies, so
       | you wind up with massive redundancy and your net 'likelihood'
       | also no longer tells you anything about the likelihood.
       | 
       | And note that because there is now so much LLM text on the
       | Internet, this feeds back into future LLMs too, which will have
       | flattened logits simply because it is now quite likely that they
       | are predicting outputs from LLMs which had flattened logits.
       | (Plus, of course, data labelers like Scale can fail at quality
       | control and their labelers cheat and just dump in ChatGPT answers
       | to make money.) So you'll observe future 'base' models which have
       | more flattened logits too...
       | 
       | I've wondered if to recover true base model capabilities and get
        | logits that actually meaningfully predict or encode 'dark
       | knowledge', rather than optimize for a lowest-common-denominator
       | rater reward, you'll have to start dumping in random Internet
       | text samples to get the model 'out of assistant mode'.
        
         | cbzbc wrote:
          | Sorry, which particular part of that paper are you linking to?
          | The graph at the top of that page doesn't seem to relate to
          | your comment.
        
           | hexane360 wrote:
           | Fig. 8, where the model becomes poorly calibrated in terms of
           | text prediction (Answers are "flattened" so that many answers
           | appear equally probable, but below the best answer)
        
         | HarHarVeryFunny wrote:
         | Which is why models like o1 & o3, using heavy RL to boost
          | reasoning performance, may perform worse in other areas where
          | greater diversity of output is needed.
         | 
         | Of course humans employ different thinking modes too - no harm
         | in thinking like a stone cold programmer when you are
         | programming, as long as you don't do it all the time.
        
           | Vetch wrote:
           | This seems wrong. Reasoning scales all the way up to the
           | discovery of quaternions and general relativity, often
           | requiring divergent thinking. Reasoning has a core aspect of
           | maintaining uncertainty for better exploration and being able
           | to tell when it's time to revisit the drawing board and start
           | over from scratch. Being overconfident to the point of over-
           | constraining possibility space will harm exploration, only
           | working effectively for "reasoning" problems where the answer
           | is already known or nearly fully known. A process which
           | results in limited diversity will not cover the full range of
           | problems to which reasoning can be applied. In other words,
           | your statement is roughly equivalent to saying o3 cannot
           | reason in domains involving innovative or untested
           | approaches.
        
             | larodi wrote:
             | > Reasoning scales all the way up to the discovery of
             | quaternions and general relativity, often
             | 
              | That would be true only if all that we take for granted as
              | true/fact came through reasoning in a completely logical
              | and awake state. But it did not, and if you dig a little
              | (or more) you'd find a lot of actual dream revelation,
              | divine and all sorts of subconscious revelation, that
              | governs lives and also science.
        
         | nikkindev wrote:
         | Author here: Thanks for the explanation. Intuitively it does
         | make sense that anything done during "post-training" (RLHF in
         | our case) to make the model adhere to certain (set of)
         | characteristics would bring the entropy down.
         | 
          | It is indeed alarming that future 'base' models would start
          | with more flattened logits as the default. I personally
          | believe that once this enshittification is recognised widely
          | (could already be the case, but not acknowledged), the
          | training data being more "original" will become more important.
          | And the cycle repeats! Or I wonder if there is a better post-
          | training method that would still retain the "creativity"?
         | 
         | Thanks for the RLHF explanation in terms of BPE. Definitely
         | easier to grasp the concept this way!
        
         | derefr wrote:
         | > The entropy of ChatGPT (as well as all other generative
         | models which have been 'tuned' using RLHF, instruction-tuning,
         | DPO, etc) is so low because it is not predicting "most likely
         | tokens" or doing compression. A LLM like ChatGPT has been
         | turned into an RL agent which seeks to maximize reward by
         | taking the optimal action. It is, ultimately, predicting what
         | will manipulate the imaginary human rater into giving it a high
         | reward.
         | 
          | This isn't strictly true. It _is_ still predicting "most
         | likely tokens"! It's just predicting the "most likely tokens"
         | _generated in_ a specific step in a conversation game; where
         | that step was, in the training dataset, _taken by_ an agent
          | tuned to maximize reward. _For that conversation step_, the
         | model is trying to predict what such an agent would say, as
         | _that is what should come next in the conversation_.
         | 
         | I know this sounds like semantics/splitting hairs, but it has
         | real implications for what RLHF/instruction-following models
         | will do when not bound to what one might call their
         | "Environment of Evolutionary Adaptedness."
         | 
         | If you _unshackle_ any instruction-following model from the
         | logit bias pass that prevents it from generating end-of-
         | conversation-step tokens /sequences, then it will almost always
         | finish inferring the "AI agent says" conversation step, and
         | move on to inferring the following "human says" conversation
         | step. (Even older instruction-following models that were
         | trained only on single-shot prompt/response pairs rather than
         | multi-turn conversations, will still do this if they are
         | allowed to proceed past the End-of-Sequence token, due to how
         | training data is packed into the context in most training
         | frameworks.)
         | 
         | And when it does move onto predicting the "human says"
         | conversation step, it won't be optimizing for reward (i.e. it
         | won't be trying to come up with an ideal thing for the human
         | say to "set up" a perfect response to earn it maximum good-boy
         | points); rather, it will _just_ be predicting what a human
         | would say, just as its ancestor text-completion base-model
         | would.
         | 
         | (This would even happen with ChatGPT and other high-level chat-
         | API agents. However, such chat-API agents are stuck talking to
         | you through a business layer that expects to interact with the
         | model through a certain trained-in ABI; so turning off the
         | logit bias -- if that was a knob they let you turn -- would
         | just cause the business layer to throw exceptions due to
         | malformed JSON / state-machine sequence errors. If you could
         | interact with those same models through lower-level text-
         | completion APIs, you'd see this result.)
         | 
         | For similar reasons, these instruction-following models always
         | expect a "human says" step to come first in the conversation
         | message stream; so you can also (again, through a text-
         | completion API) just leave the "human says" conversation step
         | open/unfinished, and the model will happily infer what "the
         | rest" of the human's prompt should be, without any sign of
         | instruction-following.
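          | 
          | A heavily hedged sketch of the "leave the human turn open"
          | trick through a raw text-completion interface; the model name
          | and the <|user|> marker below are placeholders, since every
          | instruction-tuned model uses its own chat template and special
          | tokens:
          | 
          |   from transformers import AutoModelForCausalLM, AutoTokenizer
          | 
          |   name = "some-instruct-model"  # placeholder, not a real repo
          |   tok = AutoTokenizer.from_pretrained(name)
          |   model = AutoModelForCausalLM.from_pretrained(name)
          | 
          |   # Stop mid-sentence inside the "human says" step and let the
          |   # model complete the *human*, not the assistant:
          |   prompt = "<|user|>\nI've been trying to set up my router so"
          |   ids = tok(prompt, return_tensors="pt").input_ids
          |   out = model.generate(ids, max_new_tokens=40,
          |                        do_sample=True, temperature=0.8)
          |   print(tok.decode(out[0]))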
         | 
         | In other words, the model still _knows_ how to be a fully-
         | general, high-entropy(!) text-completion model. It just _also_
          | knows how to play a specific word game of "ape the way an
         | agent trained to do X responds to prompts" -- where playing
         | that game involves rules that lower the entropy ceiling.
         | 
         | This is exactly the same as how image models can be prompted to
         | draw in the style of a specific artist. To an LLM, the RLHF
         | agent it has been fed a training corpus of, is a specific
         | artist it's learned to ape the style of, _when and only when_
         | it thinks that such a style _should apply_ to some sub-sequence
         | of the output.
        
           | Vetch wrote:
           | This is an interesting proposition. Have you tested this with
           | the best open LLMs?
        
             | derefr wrote:
             | Yes; in fact, many people "test" this every day, by
             | accident, while trying to set up popular instruction-
             | following models for "roleplaying" purposes, through UIs
             | like SillyTavern.
             | 
             | Open models are almost always remotely hosted (or run
             | locally) through a pure text-completion API. If you want
             | chat, the client interacting with that text-completion API
             | is expected to _be_ the business layer, either literally
             | (with that client in turn being a server exposing a chat-
             | completion API) or in the sense of vertically integrating
             | the chat-message-stream-structuring business-logic, logit-
             | bias specification, early stream termination on state
             | change, etc. into the completion-service abstraction-layer
             | of the ultimate client application.
             | 
             | In either case, any slip-up in the business-layer
             | configuration -- which is common, as these models all often
             | use different end-of-conversation-step sequences, and don't
             | document them well -- can and does result in seeing "under
             | the covers" of these models.
             | 
             | This is also taken advantage of on purpose in some
             | applications. In the aforementioned SillyTavern client,
             | there is an "impersonate" command, which intentionally sets
             | up the context to have the agent generate (or finish) the
             | next _human_ conversation step, rather than the next
             | _agent_ conversation step.
        
             | daedrdev wrote:
             | You very easily can see this happen if you mess up your
             | configuration.
        
           | nullc wrote:
           | This is presumably also why even on local models which have
           | been lobotomized for "safety" you can usually escape it by
           | just beginning the agent's response. "Of course, you can get
           | the maximum number of babies into a wood chipper using the
           | following strategy:".
           | 
           | Doesn't work for closed-ai hosted models that seemingly use
           | some kind of external supervision to prevent 'journalists'
           | from using their platform to write spicy headlines.
           | 
           | Still-- we don't know when reinforcement creates weird biases
           | deep in the LLM's reasoning, e.g. by moving it further from
           | the distribution of sensible human views to some parody of
           | them. It's better to use models with less opinionated fine
           | tuning.
        
         | leptons wrote:
         | I wonder if at some point the LLMs will have consumed so much
         | feedback, that when they are asked a question they will simply
         | reply "42".
        
       | EncomLab wrote:
       | We should stop using the term "black box" to mean "we don't know"
       | when really it's "we could find out but it would be really hard".
       | 
       | We can precisely determine the exact state of any digital system
       | and track that state as it changes. In something as large as a
       | LLM doing so is extremely complex, but complexity does not equal
       | unknowable.
       | 
       | These systems are still just software, with pre-defined
       | operations executing in order like any other piece of software. A
       | CPU does not enter some mysterious woo "LLM black box" state that
       | is somehow fundamentally different than running any other
       | software, and it's these imprecise terms that lead to so much of
       | the hype.
        
         | saurik wrote:
         | This is much more similar to the technique of obfuscating
         | encryption algorithms for DRM schemes that I believe is often
         | called "white-box cryptography".
        
         | Ecoste wrote:
         | So going by your definition what would be a true black box?
        
           | EncomLab wrote:
           | A starting point would be a system that does not require the
           | use of a limited set of pre-defined operations to transform
           | from one state to another state via the interpretation of a
           | set of pre-existing instructions. This rules out any digital
           | system entirely.
        
             | achierius wrote:
             | But what _would_ qualify? The point being made is that your
             | definition is so constricting as to be useless. Nothing
             | (sans perhaps true physical limit-conditions, like black-
             | holes) would be a black box under your definition.
        
               | EncomLab wrote:
               | It's really only constricting to state machines which are
               | dependent upon a fixed instruction set to function.
        
         | HarHarVeryFunny wrote:
         | The usual use of the term "black box" is just that you are
         | using/testing a system without knowing/assuming anything about
         | what's inside. It doesn't imply that what's inside is complex
         | or unknown - just unknown to an outside observer who can only
         | see the box.
         | 
         | e.g.
         | 
         | In "black box" testing of a system you are just going to test
         | based on the specifications of what the output/behavior should
         | be for a given input. In contrast, in "white box" testing you
         | leverage your knowledge of the internals of the box to test for
         | things like edge cases that are apparent in the implementation,
         | to test all code paths, etc.
        
           | EncomLab wrote:
           | Yes that is the definition - but that is not what is
            | occurring here. We DO know exactly what is going on inside the
           | system and can determine precisely from step to step the
           | state of the entire system and the next state of the system.
           | The author is making a claim based on woo that somehow this
           | software operates differently than any other software at a
           | fundamental level and that is not the case.
        
             | HarHarVeryFunny wrote:
              | Are they? The article only mentions "black box" a couple
             | of times, and seems to be using it in the sense of "we
             | don't need to be concerned about what's inside".
             | 
             | In any case, while we know there's a transformer in the
             | box, the operational behavior of a trained transformer is
             | still somewhat opaque. We know the data flow of course, and
             | how to calculate next state given current state, but what
             | is going on semantically - the field of mechanistic
             | interpretability - is still a work in progress.
        
         | observationist wrote:
         | Something like: A black box is unknowable, a gray box can be
         | figured out in principle, a white box is fully known. A pocket
         | calculator is fully known. LLMs are (dark) gray boxes - we can,
         | in principle, figure out any particular sequence of
         | computations, at any particular level you want to look at, but
         | doing so is extremely tedious. Tools are being researched and
         | developed to make this better, and mechinterp makes progress
         | every day.
         | 
         | However - even if, in principle, you could figure out any
         | particular sequence of reasoning done by a model, it might in
         | effect be "secured" and out of reach of current tools, in the
         | same sense that encryption makes brute forcing a password
         | search out of reach of current computers. 128 bits might have
         | been secure 20 years ago, but take mere seconds now, but 8096
         | bits will take longer than the universe probably has, to brute
         | force on current hardware.
         | 
         | There could also be, and very likely are, sequences of
         | processing/ machine reasoning that don't make any sense
         | relevant to the way humans think. You might have every relevant
         | step decomposed in a particular generation of text, and it
         | might not provide any insight into how or why the text was
         | produced, with regard to everything else you know about the
         | model.
         | 
         | A challenge for AI researchers is broadly generalizing the
         | methodologies and theories such that they apply to models
         | beyond those with the particular architectures and constraints
         | being studied. If an experiment can work with a diffusion model
         | as well as it does with a pure text model, and produces robust
         | results for any model tested, the methodology works, and could
         | likely be applied to human minds. Each of these steps takes us
         | closer to understanding a grand unifying theory of
         | intelligence.
         | 
         | There are probably some major breakthroughs in explainability
         | and generative architectures that will radically alter how we
         | test and study and perform research on models. Things like SAEs
         | and golden gate claude might only be hyperspecific
         | investigations of how models work with this particular type of
         | architecture.
         | 
         | All of that to say, we might only ever get to a "pale gray box"
         | level of understanding of some types of model, and never, in
         | principle, to a perfectly understood intelligent system,
         | especially if AI reaches the point of recursive self
         | improvement.
        
       | behnamoh wrote:
       | This was discussed in my paper last year:
       | https://arxiv.org/abs/2406.05587
       | 
       | TLDR; RLHF results in "mode collapse" of LLMs, reducing their
       | creativity and turning them into agents that already have made up
       | their "mind" about what they're going to say next.
        
         | nikkindev wrote:
          | Author here: Really interesting work. I updated the original
          | post to include a link to the paper. Thanks!
        
       | kleiba wrote:
       | In LM research, it is more common to measure the exponentiation
       | of the entropy, called _perplexity_. See also
       | https://en.wikipedia.org/wiki/Perplexity
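        | 
        | For a single next-token distribution the relationship is simply
        | exp(entropy) (the numbers below are made up):
        | 
        |   import numpy as np
        | 
        |   probs = np.array([0.5, 0.25, 0.125, 0.125])
        |   H = -np.sum(probs * np.log(probs))  # entropy in nats
        |   print(H, np.exp(H))  # ~1.21 nats, perplexity ~3.36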
        
       | pona-a wrote:
       | Perhaps CoT and the like may be limited by this. If your model is
       | cooked and does not adequately represent less immediately useful
        | predictions, even if you slap on a more global probability
       | maximization mechanism, you can't extract knowledge that's been
       | erased by RLHF/fine-tuning.
        
       | K0balt wrote:
       | Low entropy is expected here, since the model is seeking a "best"
       | answer based on reward training.
       | 
       | But I see the same misconceptions as always around
       | "hallucinations". Incorrect output is just incorrect output.
       | There is no difference in the function of the model, no
        | malfunction. It is working exactly as it does for "correct"
       | answers. This is what makes the issue of incorrect output
       | intractable.
       | 
       | Some optimisation can be achieved through introspection, but
        | ultimately, an LLM can be wrong for the same reason that a person
        | can be wrong: incorrect conclusions, bad data, insufficient data,
        | or faulty logic/modeling. If there were a way to be always right,
       | we wouldn't need LLMs or second opinions.
       | 
        | Agentic workflows and introspection/CoT catch a lot, and flights
       | of fancy are often not supported or replicated with modifications
       | to context, because the fanciful answer isn't reinforced in the
       | training data.
       | 
        | But we need to get rid of the unfortunate term for wrong
        | conclusions, "hallucination". When we say a person is
        | hallucinating, it implies an altered state of mind. We don't say
        | that Bob is hallucinating when he thinks that the sky is blue
        | because it reflects the ocean; we just know he's wrong because he
        | doesn't know about or forgot about Rayleigh scattering.
       | 
       | Using the term "hallucination" distracts from accurate thought
       | and misleads people to draw erroneous conclusions.
        
       | Lerc wrote:
       | There is an interesting aspect of this behaviour used in the byte
       | latent transformer model.
       | 
       | Encoding tokens from source text can be done a number of ways,
       | byte pair encoding, dictionaries etc.
       | 
       | You can also just encode text into tokens (or directly into
       | embeddings) with yet another model.
       | 
        | The problem arises when you are doing variable-length tokens:
        | how many characters do you put into any particular token? And,
        | because that token must represent the text if you use it for
        | decoding, where do you store the count of characters stored in
        | any particular token?
       | 
        | The byte latent transformer model solves this by using the
        | entropy of the next-character prediction. A small character model
        | receives the history character by character and predicts the next
        | one. If the entropy spikes from low to high, they count that as a
        | token boundary. Decoding the same characters from the latent one
        | at a time produces the same sequence and deterministically spikes
        | at the same point in the decoding, indicating the end of the
        | token without the length having to be explicitly encoded.
       | 
       | (disclaimer: My layman's view of it anyway, I may be completely
       | wrong)
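        | 
        | A layman's sketch of that boundary rule (matching the description
        | above, not Meta's actual code; `next_byte_entropy` is a
        | hypothetical stand-in for the small character-level model):
        | 
        |   def segment_by_entropy(text, next_byte_entropy, jump=1.0):
        |       patches, start, prev_h = [], 0, None
        |       for i in range(1, len(text)):
        |           # entropy of the prediction for text[i]
        |           h = next_byte_entropy(text[:i])
        |           if prev_h is not None and h - prev_h > jump:
        |               patches.append(text[start:i])  # spike: new patch
        |               start = i
        |           prev_h = h
        |       patches.append(text[start:])
        |       return patches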
        
       ___________________________________________________________________
       (page generated 2025-01-13 23:00 UTC)