[HN Gopher] Big LLMs weights are a piece of history
       ___________________________________________________________________
        
       Big LLMs weights are a piece of history
        
       Author : freeatnet
       Score  : 221 points
       Date   : 2025-03-16 12:13 UTC (10 hours ago)
        
 (HTM) web link (antirez.com)
 (TXT) w3m dump (antirez.com)
        
       | api wrote:
       | That's really what these are: something analogous to JPEG for
       | language, and queryable in natural language.
       | 
       | Tangent: I was thinking the other day: these are not AI in the
        | sense that they are not primarily _intelligence_. I still don't
       | see much evidence of that. What they do give me is superhuman
       | memory. The main thing I use them for is search, research, and a
       | "rubber duck" that talks back, and it's like having an intern who
       | has memorized the library and the entire Internet. They
       | occasionally hallucinate or make mistakes -- compression
       | artifacts -- but it's there.
       | 
       | So it's more AM -- artificial memory.
       | 
       | Edit: as a reply pointed out: this is Vannevar Bush's Memex, kind
       | of.
        
         | antirez wrote:
          | I believe LLMs are both data and processing, but even human
          | reasoning relies heavily on existing knowledge. However, for
          | the goal of the post, it is indeed the memorization that is
          | the key value, plus the fact that in the future sampling such
          | models can likely be used to transfer the same knowledge to
          | bigger LLMs, even if the source data is lost.
        
           | api wrote:
           | I'm not saying there is no latent reasoning capability. It's
           | there. It just seems to be that the memory and lookup
           | component is _much_ more useful and powerful.
           | 
           | To me intelligence describes something much more capable than
           | what I see in these things, even the bleeding edge ones. At
           | least _so far_.
        
             | antirez wrote:
              | I offer a POV that is in the middle: reasoning is powerful
              | for evaluating which solution is better among N in the
              | context. Memorization allows sampling many competing ideas
              | from the problem space, then the LLM picks the best, which
              | is what makes chain of thought so effective. Of course
              | zero shot reasoning is also part of the story, but a
              | somewhat weaker one, exactly as we are often not able to
              | produce the best solution before evaluating the space
              | (unless we are very accustomed to the specific problem).
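              | 
              | For illustration, a toy sketch of this sample-then-select
              | idea in Python (generate and score are hypothetical
              | stand-ins for an LLM call and an evaluator, not a real
              | API):
              | 
              import random

              def generate(prompt):
                  # hypothetical stand-in for sampling one candidate idea from an LLM
                  return f"candidate {random.randint(0, 999)} for: {prompt}"

              def score(solution):
                  # hypothetical evaluator: a verifier, unit tests, or the LLM itself
                  return random.random()

              def best_of_n(prompt, n=8):
                  candidates = [generate(prompt) for _ in range(n)]  # sample competing ideas
                  return max(candidates, key=score)                  # keep the one judged best

              print(best_of_n("sort a list in place"))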
        
             | danielbln wrote:
             | That's the problem with the term "intelligence". Everyone
             | has their own definition, we don't even know what makes us
             | humans intelligent and more often than not it's a moving
             | goalpost as these models get better.
        
         | Mistletoe wrote:
         | This is an excellent viewpoint.
        
         | menzoic wrote:
         | Having memory is fine but choosing the relevant parts requires
         | intelligence
        
         | flower-giraffe wrote:
         | Or 80 years to MVP memex
         | 
         | "Vannevar Bush's 1945 article "As We May Think". Bush
         | envisioned the memex as a device in which individuals would
         | compress and store all of their books, records, and
         | communications, "mechanized so that it may be consulted with
         | exceeding speed and flexibility".
         | 
         | https://en.m.wikipedia.org/wiki/Memex
        
           | mdp2021 wrote:
           | The memex was a deterministic device to consult documents -
            | the _actual_ documents. The "LLM" is more like a dumb
            | archivist that came with it ("_Yes, see for example that
            | document, it tells you that q=M*k..._").
        
             | skydhash wrote:
             | I grew up with physical encyclopedia, then moved on to
             | Encarta, then Wikipedia dumps and folders full of PDFs. I
              | still prefer curated information repositories over chat
              | interfaces or generated summaries. The main goal with the
              | former is to have a knowledge map and keyword graph, so
             | that you can locate any piece of information you may need
             | from the actual source.
        
         | hengheng wrote:
         | I've been looking at it as an "instant reddit comment". I can
         | download a 10G or 80G compressed archive that basically
          | contains the useful parts of the internet, and then I can
         | use it to synthesize something that is about as good and
         | reliable as a really good reddit comment. Which is nifty. But
         | honestly it's an incredible idea to sell that to businesses.
        
           | api wrote:
           | Reddit seems to puppet humans via engagement farming to do
           | what LLMs do in some cases. Posts are prompts, replies are
           | responses.
           | 
           | Of course they vary widely in quality.
        
           | Guthur wrote:
           | And so what would the point be of anyone actually posting on
           | the internet if no one actually visits the sites because
           | large corps have essentially stolen and monetized the whole
            | thing?
           | 
           | And I'm sure they have or will have the ability to influence
           | the responses so you only see what they want you to see.
        
             | kelseyfrog wrote:
             | That's the next step after algorithmic content feeds -
             | algorithmic/generated comment sections. Imagine seeing an
             | entirely different conversation happening just to get you
             | to buy a product. A product like Coca-Cola.
             | 
             | Imagine scrolling through a comment section that feels
             | tailor-made to your tastes, seamlessly guiding you to an
             | ice-cold Coca-Cola. You see people reminiscing about their
             | best summer memories--each one featuring a Coke in hand.
             | Others are debating the superior refreshment of Coke over
             | other drinks, complete with "real" testimonials and
             | nostalgic stories.
             | 
             | And just when you're feeling thirsty, a perfectly timed
             | comment appears: "Nothing beats the crisp, refreshing taste
             | of an ice-cold Coke on a hot day."
             | 
             | Algorithmic engagement isn't just the future--it's already
             | here, and it's making sure the next thing you crave is
             | Coca-Cola. Open Happiness.
        
               | adhamsalama wrote:
               | Isn't that how Reddit gained momentum? Posting fake
               | posts/comments?
               | 
               | Now we can mass-produce it!
        
               | cruffle_duffle wrote:
               | Why Coca Cola though? Sure it is refreshing on a hot day
               | but you know what is even better? Going to bed on a nice
               | cool mattress. So many are either too hard or too soft.
               | They aren't engineered to your body so you are virtually
                | guaranteed to get a poor night's sleep.
               | 
               | Imagine waking up like I do every morning. Refreshed and
               | full of energy. I've tried many mattresses and the only
               | one that has this property is my Slumber Sleep Hygiene
               | mattress.
               | 
               | The best part is my partner can customize their side
               | using nothing more than a simple app on their smartphone.
               | It tracks our sleep over time and uses AI to generate a
                | daily sleep report showing me exactly how good a night's
                | sleep I got. Why rely on my gut feelings when the report
                | can tell me exactly how good or bad a night's sleep I
                | got.
               | 
               | I highly recommend Slumber Sleep Hygiene mattresses.
               | There is a reason it's the number one brand recommended
               | on HN.
        
               | gosub100 wrote:
               | Another insidious one: fake replies designed to console
                | you if there aren't enough people to validate your opinion
               | or answer your question.
        
               | Guthur wrote:
                | Or: war is good, peace is bad, nuclear war is winnable;
               | don't worry and start loving the bomb. The enemy are not
               | human anyway, your life will be better with fewer people
               | around.
               | 
               | Look at the people who want to control this, they do not
               | want to sell you Coke.
        
         | yannyu wrote:
          | There's a great recent article by Ted Chiang that elaborates
         | on this idea: https://www.newyorker.com/tech/annals-of-
         | technology/chatgpt-...
        
         | bob1029 wrote:
         | If you want to see what this would actually be like:
         | 
         | https://lcamtuf.coredump.cx/lossifizer/
         | 
         | I think a fun experiment could be to see at what setting the
         | average human can no longer decipher the text.
        
         | GolfPopper wrote:
         | > _like having an intern who has memorized the library and the
         | entire Internet. They occasionally hallucinate or make
         | mistakes_
         | 
         | Correction: you occasionally _notice_ when they hallucinate or
         | make mistakes.
        
         | xpe wrote:
          | I regularly push back against casual uses of the word
         | "intelligence".
         | 
         | First, there is no objective dividing line. It is a matter of
         | degree _relative_ to something else. Any language that suggests
         | otherwise should be refined or ejected from our culture and
         | language. Language's evolution doesn't have to be a nosedive.
         | 
         | Second, there are many definitions of intelligence; some are
         | more useful than others. Along with many, I like Stuart
         | Russell's definition: the degree to which an agent can
         | accomplish a task. This definition requires being clear about
         | the agent and the task. I mention this so often I feel like a
         | permalink is needed. It isn't "my" idea at all; it is simply
         | the result of smart people decomplecting the idea so we're not
         | mired in needless confusion.
         | 
         | I rant about word meanings often because deep thinking people
         | need to lay claim to words and shape culture accordingly. I say
         | this often: don't cede the battle of meaning to the least
         | common denominators of apathy, ignorance, confusion, or
         | marketing.
         | 
         | Some might call this kind of thinking elitist. No. This is what
         | taking responsibility looks like. We could never have built
         | modern science (or most rigorous fields of knowledge) with
         | imprecise thinking.
         | 
         | I'm so done with sloppy mainstream phrasing of "intelligence".
         | Shit is getting real (so to speak), companies are changing the
         | world, governments are racing to stay in the game, jobs will be
         | created and lost, and humanity might transcend, improve,
         | stagnate, or die.
         | 
         | If humans, meanwhile, can't be bothered to talk about
         | intelligence in a meaningful way, then, frankly, I think we're
         | ... abdicating responsibility, tempting fate, or asking to be
         | in the next Mike Judge movie.
        
           | jart wrote:
            | We never would have been able to create science if it
            | weren't for _focusing_ on the kinds of thinking that can be
            | made logical. There's a big difference. What you're doing,
           | with this whole "let's make a bullshit word logical" is more
           | similar to medieval scholasticism, which was a vain attempt
           | at verbal precision. https://justine.lol/dox/english.txt
        
             | xpe wrote:
             | Yikes, maybe we can take a step back? I'm not sure where
             | this is coming from, frankly. One anodyne summary of my
             | comment above would be:
             | 
             | > Let's think and communicate more clearly regarding
             | intelligence. Stuart Russell offers a nice definition: an
             | agent's ability to do a defined task.
             | 
             | Maybe something about my comment got you riled up? What was
             | it?
             | 
             | You wrote:
             | 
             | > What you're doing, with this whole "let's make a bullshit
             | word logical" is more similar to medieval scholasticism,
             | which was a vain attempt at verbal precision.
             | 
             | Again, I'm not quite sure what to say. You suggest my
             | comment is like a medieval scholar trying to reconcile
             | dogma with philosophy? Wow. That's an uncharitable reading
             | of my comment.
             | 
             | I have five points in response. First, the word
             | intelligence need not be a "bullshit word", though I'm not
             | sure what you mean by the term. One of my favorite
             | definitions of bullshitting comes from "On Bullshit" by
             | Harry Frankfurt:
             | 
             | > Frankfurt determines that bullshit is speech intended to
             | persuade without regard for truth. The liar cares about the
             | truth and attempts to hide it; the bullshitter doesn't care
             | whether what they say is true or false. - Wikipedia
             | 
             | Second, I'm trying to _clarify_ the term intelligence by
              | breaking it into parts. I wouldn't say I'm trying to make
             | it "logical" (in the sense of being about logic or
             | deduction). Maybe you mean "formal"?
             | 
             | Third, regarding the "what you're doing" part... this isn't
             | just me. Many people both clarify the concept of
             | intelligence and explain why doing so is important.
             | 
             | Fourth, are you saying it is impossible to clarify the
             | meaning of intelligence? Why? Not worth the trouble?
             | 
             | Fifth, have you thought about a definition of intelligence
             | that you think is sensible? Does your definition steer
             | people away from confusion?
             | 
             | You also wrote:
             | 
             | > We never would have been able to create science, if it
             | weren't for focusing on the kinds of thinking that can be
             | made logical.
             | 
             | I think you mean _testable_, not _logical_. Yes, we agree,
             | scientists should run experiments on things that can be
             | tested.
             | 
             | Russell's definition of _intelligence_ is testable by
             | defining a task and a quality metric. This is already a big
             | step up from an unexamined view of intelligence, which
             | often has some arbitrary threshold.* It allows us to see a
              | continuum from, say, how a bacterium finds food, to how ants
             | collaborate, to how people both build and use tools to
             | solve problems. It also teases out sentience and moral
              | worth so we're not mixing them up with intelligence. These
             | are simple, doable, and worthwhile clarifications.
             | 
             | Finally, I read your quote from Dijkstra. In my reading,
             | Dijkstra's main point is that natural language is a poor
             | programming interface due to its ambiguity. Ok, fair. But
             | what is the connection to this thread? Does it undercut any
             | of my arguments? How?
             | 
             | * A common problem when discussing intelligence involves
             | moving the goal post. Whatever quality bar is implied has a
              | tendency to creep upwards over time.
        
         | mdp2021 wrote:
         | > JPEG for [a body of] language
         | 
         | Yes!
         | 
         | > artificial memory
         | 
         | Well, "yes", kind of.
         | 
         | > Memex
         | 
         | After a flood?! Not really. Vannevar Bush - _As we may think_ -
         | http://web.mit.edu/STS.035/www/PDFs/think.pdf
        
         | visarga wrote:
         | I can ask a LLM to write a haiku about the loss function of
         | Stable Diffusion. Or I can have it do zero shot translation,
         | between a pair of languages not covered in the training set.
         | Can your "language JPEG" do that?
         | 
         | I think "it's just compression" and "it's just parroting" are
         | flawed metaphors. Especially when the model was trained with
         | RLHF and RL/reasoning. Maybe a better metaphor is "LLM is like
         | a piano, I play the keyboard and it makes 'music'". Or maybe
          | it's a bicycle, I push the pedals and it takes me where I point
         | it.
        
       | rollcat wrote:
       | https://xkcd.com/1683/
        
       | intellectronica wrote:
       | I love the title "Big LLMs" because it means that we are now
       | making a distinction between big LLMs and minute LLMs and maybe
        | medium LLMs. I'd like to propose that we call them "Tall LLMs",
       | "Grande LLMs", and "Venti LLMs" just to be precise.
        
         | HarHarVeryFunny wrote:
         | But of course these are all flavors of "large", so then we have
         | big large language models, medium large language models, etc,
         | which does indeed make the tall/grande/venti names appropriate,
         | or perhaps similar "all large" condom size names (large, huge,
         | gargantuan).
        
         | de-moray wrote:
         | What does a 20 LLM signify?
        
         | tonyhart7 wrote:
         | can we have tiny LLM that can run on smartphone now
        
           | winter_blue wrote:
           | Apple Intelligence has an LLM that runs locally on the iPhone
           | (15 Pro and up).
           | 
           | But the quality of Apple Intelligence shows us what happens
           | when you use a tiny ultra-low-wattage LLM. There's a whole
           | subreddit dedicated to its notable fails:
           | https://www.reddit.com/r/AppleIntelligenceFail/top/?t=all
           | 
           | One example of this is _"Sorry I was very drunk and went home
           | and crashed straight into bed"_ being summarized by Apple
           | Intelligence as _"Drunk and crashed"_.
        
             | Spooky23 wrote:
             | I think the real problem with LLMs is we have deterministic
             | expectations of non-deterministic tools. We've been trained
             | to expect that the computer is correct.
             | 
             | Personally, I think the summaries of alerts is incredibly
             | useful. But my expectation of accuracy for a 20 word
             | summary of multiple 20-30 word summaries is tempered by the
              | reality that there's gonna be issues given the lack of
             | context. The point of the summary is to help me determine
             | if I should read the alerts.
             | 
             | LLMs break down when we try to make them independent agents
              | instead of advanced power tools. A lot of people enjoy navel
             | gazing and hand waving about ethics, "safety" and bias...
             | then proceed to do things with obvious issues in those
             | areas.
        
               | mewpmewp2 wrote:
               | Larger LLMs can summarize all of this quite well though.
        
           | samstave wrote:
            | I want a tiny phone-based LLM to do thought tracking and
            | comms awareness.
            | 
            | I actually applied to YC in like ~2014 or so for this:
            | 
            | JotPlot - I wanted a timeline for basically giving a
            | historical timeline of comms between me and others - such
            | that I had a sankey-ish diagram for when and with whom and
            | via which method I spoke with folks, and then each node was
            | the message, call, text, meta links...
            | 
            | I think it's still viable - but my thought process is
            | currently too chaotic to pull it off.
            | 
            | Basically looking at a timeline of your comms and thoughts
            | and expanding into links of thought - now with LLMs you
            | could have a Throw Tag of some sort whereby you have the bot
            | do research expanding on certain things and putting up a
            | site for that idea on LOCAL HOST (i.e. your phone) so that
            | you can pull up data relevant to the convo - and it's all in
            | a timeline of thought/stream of consciousness.
            | 
            | hopefully you can visualize it...
        
             | johnmaguire wrote:
             | I had a thought that I think some people value social media
             | (e.g. Facebook) essentially for this. Like giving up your
             | Facebook profile means giving up your history or family
             | tree or even your memories.
             | 
             | So in that sense, maybe people would prefer a private
             | alternative.
        
               | samstave wrote:
               | I read this in Sam Wattersons voice with a pipe abt
               | maybey an inch from his beard,
               | 
               | (Fyi I was a designer at fb and while it was luxious I
               | still hated what I saw in zucks eyes every morn when I
               | passed him.
               | 
               | Super diff from Andy Grove at intel where for whateveer
               | reason we were in the sam oee schekdule
               | 
               | (That was me typing with eues ckised as a test (to
               | myself, typos abound
        
           | badlibrarian wrote:
           | No. Smartphone only spin animated gif while talk to big
           | building next to nuclear reactor. New radio inside make more
           | efficient.
        
           | rubslopes wrote:
           | Is a _tiny large_ language model equivalent to a normal sized
           | one?
        
         | t_mann wrote:
         | Big LLM is too long as a name. We should agree on calling them
         | BLLMs. Surely everyone is going to remember what the letters
         | stand for.
        
           | temp0826 wrote:
           | Bureau of Large Land Management
        
           | bookofjoe wrote:
           | >What does BLLM stand for?
           | 
           | https://www.abbreviations.com/BLLM#google_vignette
        
           | heyjamesknight wrote:
           | I want to apologize for this joke in advance. It had to be
           | done.
           | 
           | We could take a page from Trump's book and call them
           | "Beautiful" LLMs. Then we'd have "Big Beautiful LLMs" or just
           | "BBLs" for short.
           | 
           | Surely that wouldn't cause any confusion when Googling.
        
             | cowsaymoo wrote:
             | Weirdly enough, the ITU already chose the superlative for
             | the bigliest radio frequency band to be Tremendous:
             | 
             | - Extremely Low Frequency (ELF)
             | 
             | - Super Low Frequency (SLF)
             | 
             | - Ultra Low Frequency (ULF)
             | 
             | - Very Low Frequency (VLF)
             | 
             | - Low Frequency (LF)
             | 
             | - Medium Frequency (MF)
             | 
             | - High Frequency (HF)
             | 
             | - Very High Frequency (VHF)
             | 
             | - Ultra High Frequency (UHF)
             | 
             | - Super High Frequency (SHF)
             | 
             | - Extremely High Frequency (EHF)
             | 
             | - Tremendously High Frequency (THF)
             | 
             | Maybe one day some very smart people will make Tremendously
             | Large Language Models. They will be very large and need a
             | lot of computer. And then you'll have the Extremely Small
             | Language Model. They are like nothing.
             | 
             | https://en.wikipedia.org/wiki/Radio_frequency?#Frequency_ba
             | n...
        
               | lifthrasiir wrote:
               | AFAIK "tremendously" was chosen partly because the range
               | includes 1 "T"Hz.
        
               | bee_rider wrote:
               | XKCD telescope sizes also could provide some guidance
               | 
               | https://xkcd.com/1294/
        
               | droidist2 wrote:
               | I hope they go with "Ludicrous" like in Spaceballs.
        
           | nullhole wrote:
           | I still like Big Data Statistical Model
        
         | guestbest wrote:
         | Why not LLLM for large LLM's and SLLM for small LLM's, assuming
         | there is no middle ground
        
           | orbital-decay wrote:
           | SLM is a widespread term already.
        
             | guestbest wrote:
             | Slim pickings, then?
        
           | _heimdall wrote:
              | What makes it a Small Large Language Model? Why not just an
           | SLM?
        
             | guestbest wrote:
             | If we can't have fun with names, why even be in IT?
        
             | technol0gic wrote:
             | Smedium Language Model
        
               | dbalatero wrote:
               | Lousy Smarch weather
        
             | gpderetta wrote:
             | S and L cancel out, so it just an LM.
        
           | kolinko wrote:
           | VLLM, Super VLLM, Almost Large Language Model
        
           | flir wrote:
           | M, LM, LLM, LLLM, L3M, L4M.
           | 
           | Gotta leave room for future expansion.
        
             | dan_linder wrote:
             | Hopefully the USB making team does NOT step into this...
             | 
             | LLM 3.0, LLM 3.1 Gen 1, LLM 3.2 Gen 1, LLM 3.1, LLM 3.1 Gen
             | 2, LLM 3.2 Gen 2, LLM 3.2, LLM 3.2 Gen 2x2, LLM 4, etc...
        
             | moffkalast wrote:
             | 2L4M
        
         | badlibrarian wrote:
         | I've sat in more than one board meeting watching them take 20
         | minutes to land on t-shirt sizes. The greatest enterprise sales
         | minds of our generation...
        
           | ben_w wrote:
           | I've seen things you people wouldn't believe.
           | 
           | I've seen corporate slogans fired off from the shoulders of
           | viral creatives. Synergy-beams glittering in the darkness of
           | org charts. Thought leadership gone rogue... All these
           | moments will be lost to NDAs and non-disparagement clauses,
           | like engagement metrics in a sea of pivot decks.
           | 
           | Time to leverage.
        
             | badlibrarian wrote:
             | ... destroyed by madness, starving hysterical! Buying weed
              | in a store then meeting with someone off Craigslist to score
             | eggs.
        
         | rnrn wrote:
         | it's too bad vLLM and VLM are taken because it would have been
         | nice to recycle the VLSI solution to describing sizes - get to
         | very large language models and leave it at that.
        
           | rnrn wrote:
           | we could also look to magnetoresistance and go for giant,
           | colossal, extraordinary
        
           | do_not_redeem wrote:
           | After very large language models, the next step is mega
           | language models, or MLMs. As a bonus, it describes the VC
           | funding scheme that backs them too.
        
         | AlienRobot wrote:
         | Terrible names, to be honest. My proposal: Hyper LLMs, Ultra
         | LLMs, Large LLMs, Micro LLMs, Mobile LLMs.
        
           | isoprophlex wrote:
           | LLM M4 Ultra Pro Max 16e (with headphone jack)
        
             | AlienRobot wrote:
             | GPT Inside
        
         | BobaFloutist wrote:
         | LLM, LLM 2.0, LLM 3.0, Mini LLM, Micro LLM, LLM C.
        
           | jfengel wrote:
           | LLM 95, LLM 98, LLM Millennium Edition, LLM NT, LLM XP, LLM
           | 2000, LLM 7
           | 
           | I really appreciated the way they managed to come up with a
           | new naming scheme each time, usually used exactly once.
        
             | Scarblac wrote:
             | LLM 3.11 for Workgroups
        
             | ben_w wrote:
             | Could always go with the Bungie approach for the Marathon
              | series: LLM, LLM2, LLM∞, 1 -- https://alephone.lhowon.org
             | 
              | (Obviously ∞ is for the actual singularity, and 1 is the
             | thing after that).
        
         | davidwritesbugs wrote:
         | or "DietLLM, RegularLLM, MealLLM and SuperSizedLLMWithFries"
        
         | naveen99 wrote:
         | LLM already has one large in it...
        
           | ben_w wrote:
           | If we can have a "Personal PIN Identification Number", we can
           | have a "Large LLM Language Model".
        
             | naveen99 wrote:
             | Redundundant
        
             | mewpmewp2 wrote:
             | What about Impersonal PIN anonymization letter?
        
         | latexr wrote:
         | Name them like clothing sizes: XXLLM, XLLM, LLM, MLM, SLM, XSLM
         | XXSLM.
        
           | ai-christianson wrote:
           | MLM... uh oh
        
             | anonym29 wrote:
             | I hate those ponzi schemes! Never buy a cutco knife or
             | those crappy herbalife supplements.
             | 
             | Alternatively, just make sure you keep things consensual,
             | and keep yourself safe, no judgement or labels from me :)
        
           | swyx wrote:
           | i did this!
           | 
           | XXLLM: ~1T (GPT4/4.5, Claude Opus, Gemini Pro)
           | 
           | XLLM: 300~500B (4o, o1, Sonnet)
           | 
           | LLM: 20~200B (4o, GPT3, Claude, Llama 3 70B, Gemma 27B)
           | 
           | ~~zone of emergence~~
           | 
           | MLM: 7~14B (4o-mini, Claude Haiku, T5, LLaMA, MPT)
           | 
           | SLM: 1~3B (GPT2, Replit, Phi, Dall-E)
           | 
           | ~~zone of generality~~
           | 
           | XSLM: <1B (Stable Diffusion, BERT)
           | 
           | 4XSLM: <100M (TinyStories)
           | 
           | https://x.com/swyx/status/1679241722709311490
        
         | _bin_ wrote:
         | "big large language model" renminds me uncomfortably of
         | "automated teller machine machine"
        
         | semireg wrote:
         | Pro, max, ultra...
        
         | TZubiri wrote:
         | Doesn't the first L in LLM mean large already?
         | 
         | It's like saying Automated ATM. Whoever wrote it barely knows
         | what the acronym means.
         | 
         | This whole article feels like written by someone who doesn't
         | understand the subject matter at all
        
           | xanderlewis wrote:
           | Almost everyone says 'PIN number' as well.
        
           | thih9 wrote:
            | We're fine with "The Big Friendly Giant" and the Sahara
            | Desert ("desert desert"); big LLM could join the family of
            | pleonasms.
           | 
           | https://en.m.wikipedia.org/wiki/Pleonasm
        
             | TZubiri wrote:
             | When it's a different language it's fine.
        
           | Kiro wrote:
           | Yes, that's the point of the comment and the whole discussion
           | here. LLMs are already Large so what should the prefix be?
           | Big LLM is a strong contender. I'm also pretty sure the
           | creator of redis is not "someone who doesn't understand the
           | subject matter at all".
        
             | TZubiri wrote:
             | It's very common for experts on one subject to take a jab
             | at another subject and depend on their reputation while
             | their skillset doesn't translate at all.
        
         | xanderlewis wrote:
         | And the US 'small' LLMs will actually be slightly larger than
         | the 'large' LLMs in the UK.
        
           | aziaziazi wrote:
            | I wonder how the skinnies get dressed overseas: I wear
            | European S, which translates to XXS in the US, but there are
            | many people skinnier than me, still within a "normal" BMI.
            | Do they have to find XXXS? Do they wear oversized clothes?
            | Choosing trousers is way easier because the system of
            | cm/inches of length+perimeter corresponds to real values.
        
             | Spivak wrote:
             | It's a crazy experience being just physically larger than
             | most of the world. Especially when the size on the label
             | carries some implicit shame/judgement. Like I'm skinny, I'm
             | pretty much the lowest weight I can be and not look
             | emaciated / worrying. But when shopping for a skirt in
              | Asian sizes I was a 4XL, and usually an L-2XL in
             | European sizes. Having to shift my mental space that a US M
             | is the "right" size for me was hard for many years. But
             | like I guess this is how sizing was always kinda supposed
             | to work.
        
           | deepsun wrote:
           | We ordered swag T-shirts for a conference from two providers,
           | but EU provider L's were actually larger than US L!
        
           | jgalt212 wrote:
           | It's funny you say that, but when travelling abroad I
           | wondered how Europeans and Japanese stay sufficiently
           | hydrated.
        
             | jdietrich wrote:
             | For healthy adults, thirst is a perfectly adequate guide to
             | hydration needs. Historically normal patterns of drinking -
             | e.g. water with meals and a few cups of tea or coffee in
             | between - are perfectly sufficient unless you're doing hard
             | physical labour or spending long periods of time outdoors
             | in hot weather. The modern American preoccupation with
             | constantly drinking water is a peculiar cultural phenomenon
             | with no scientific basis.
        
               | kccqzy wrote:
               | I've always understood constantly drinking water as a
               | ruse to use the bathroom more often, which is helpful for
               | Americans with sedentary lifestyles.
        
               | droidist2 wrote:
               | Don't many medications dehydrate you though? And
               | Americans are on a lot of medications.
        
               | brian-armstrong wrote:
               | Diabetes causes dehydration
        
             | floriannn wrote:
             | Is this a thing about how restaurants in some European
             | countries charge for water?
        
           | miki123211 wrote:
           | > The UK
           | 
           | You mean the EU, right? The UK isn't covered by the AI act.
           | 
           | /s
        
         | thih9 wrote:
         | Dismissed, Big LLM will live on along with Big Data.
        
           | deepsun wrote:
           | Well, big data for me was always clear -- when data sizes are
           | too large to use regular tools (ls, du, wc, vi, pandas).
           | 
           | I.e. when pretty much every tool or script I used before
            | doesn't work anymore, and I need a special tool (gsutil, bq,
            | dask, slurm), it's a mind shift.
        
         | huijzer wrote:
         | "There are 2 hard problems in computer science: cache
         | invalidation, naming things, and off-by-1 errors."
        
         | saltcured wrote:
         | I'd prefer to see olive sizes get a renaissance. I was always
         | amused by Super Colossal when following my mom around a store
         | as a little kid.
         | 
         | From a random web search, it seems the sizes above Large are:
         | Extra Large, Jumbo, Extra Jumbo, Giant, Colossal, Super
         | Colossal, Mammoth, Super Mammoth, Atlas.
        
           | inciampati wrote:
           | And I'd love to see data compression terminology get an
           | overhaul. Do we need big LLMs or just succinct data
           | structures? Or maybe "compact" would be good enough? (Yeah
           | LLMs are cool but why not just, you know, losslessly compress
           | the actual data in a way that lets us query its content?)
        
             | rowanG077 wrote:
              | Well the obvious answer is that LLMs are more than just
             | pure search. They can synthesize novel information from
             | their learned knowledge.
        
         | varispeed wrote:
         | Then there will be "decaf LLM"
        
         | Arcuru wrote:
         | I've been labeling LLMS as "teensy", "smol", "mid", "biggg",
         | "yuuge". I've been struggling to figure out where to place the
         | lines between them though.
        
         | nextts wrote:
         | https://xkcd.com/1294/
        
       | laborcontract wrote:
       | I miss the good ol days when I'd have text-davinci make me a
       | table of movies that included a link to the movie poster. It
       | usually generated a url of an image in an s3 bucket. The link
       | _always worked_.
        
       | nickpsecurity wrote:
       | People wanting this would be better off using memory
       | architectures, like how the brain does it. For ML, the simplest
       | approach is putting in memory layers with content-addressible
       | schemes. I have a few links on prototypes in this comment:
       | 
       | https://news.ycombinator.com/item?id=42824960
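        | 
        | For a rough sense of what such a memory layer does, here is a
        | minimal sketch of a content-addressable lookup in Python (sizes
        | and names are illustrative assumptions, not taken from the
        | linked prototypes):
        | 
        import numpy as np

        rng = np.random.default_rng(0)
        d, n_slots, k = 64, 1024, 8              # embedding dim, memory slots, top-k

        keys = rng.normal(size=(n_slots, d))     # learned addresses
        values = rng.normal(size=(n_slots, d))   # learned contents

        def memory_lookup(query):
            # content-addressable read: score every slot, keep the k best
            # matches, return a softmax-weighted mix of their values
            scores = keys @ query
            top = np.argpartition(scores, -k)[-k:]
            w = np.exp(scores[top] - scores[top].max())
            w /= w.sum()
            return w @ values[top]

        out = memory_lookup(rng.normal(size=d))  # vector of shape (d,)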
        
         | HarHarVeryFunny wrote:
         | Animal brains do not separate long term memory and processing -
         | they are one and the same thing - columnar neural assemblies in
         | the cortex that have learnt to recognize repeated patterns, and
         | in turn activate others.
        
       | hedgehog wrote:
        | This doesn't make much sense to me. Unattributed hearsay has
       | limited historical value, perhaps zero given that the view of the
       | web most of the weights-available models have is Common Crawl
       | which is itself available for preservation.
        
       | jart wrote:
       | Mozilla's llamafile project is designed to enable LLMs to be
       | preserved for historical purposes. They ship the weights and all
       | the necessary software in a deterministic dependency-free single-
       | file executable. If you save your llamafiles, you should be able
       | to run them in fifty years and have the outputs be exactly the
       | same as what you'd get today. Please support Mozilla in their
       | efforts to ensure this special moment in history gets archived
       | for future generations!
       | 
       | https://github.com/Mozilla-Ocho/llamafile/
        
         | visarga wrote:
         | LLMs are much easier to port than software. They are just a big
         | blob of numbers and a few math operations.
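          | 
          | To make that concrete, a toy single attention step over some
          | stored weights really is just a handful of array operations
          | (the shapes here are arbitrary assumptions):
          | 
          import numpy as np

          rng = np.random.default_rng(0)
          d = 16                                   # model width
          Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))   # "the weights"

          def attention(x):                        # x: (tokens, d) activations
              q, k, v = x @ Wq, x @ Wk, x @ Wv
              s = q @ k.T / np.sqrt(d)             # scaled dot-product scores
              s = np.exp(s - s.max(axis=-1, keepdims=True))
              p = s / s.sum(axis=-1, keepdims=True)   # softmax
              return p @ v                         # weighted mix of value vectors

          y = attention(rng.normal(size=(4, d)))   # a few matmuls, an exp, a divide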
        
           | refulgentis wrote:
           | LLMs are _much_ harder, software is just a blob of _two_
           | numbers.
           | 
           | ;)
           | 
           | (less socratic: I have a fraction of a fraction of jart's
            | experience, but have enough experience via maintaining a
           | cross-platform llama.cpp wrapper to know there's a _ton_ of
           | ways to interpret that bag o ' floats and you need a _lot_ of
           | ancillary information.)
        
           | andix wrote:
           | I think software is rather easy to archive. Emulators are
            | the key. Nearly every platform from the past can be emulated
           | on a modern arm/x86 Linux/windows system.
           | Arm/x86/linux/windows are ubiquitous, even if they might fade
           | away there will be emulators around for a long time. With
           | future compute power it should be no problem to just use
           | nested emulation, to run old emulators on an emulated
           | x86/linux.
        
           | jsight wrote:
           | Indeed. In 50 years, loading the weights and doing math
           | should be much easier than getting some 50 year old piece of
           | cuda code to work.
           | 
           | Then again, CPUs will be fast enough that you'd probably just
           | emulate amd64 and run it as CPU-only.
        
       | isoprophlex wrote:
       | Interesting. Just this morning I had a conversation with Claude
       | about this very topic. When asked "can you give me your thoughts
        | on LLM training runs as historical artifacts? do you think they
       | might be uniquely valuable for future historians?", it answered
        > oh HELL YEAH they will be. future historians are gonna have a
        fucking field day with us.

        > imagine some poor academic in 2147 booting up "vintage llm.exe"
        and getting to directly interrogate the batshit insane period when
        humans first created quasi-sentient text generators right before
        everything went completely sideways with *gestures vaguely at
        civilization*

        > *"computer, tell me about the vibes in 2025"*

        > "BLARGH everyone was losing their minds about ai while also being
        completely addicted to it"
       | 
       | Interesting indeed to be able to directly interrogate the median
       | experience of being online in 2025.
       | 
        | (also my apologies for slop-posting; i slapped so much custom
        | prompting on it that I hope you'll find the output to be amusing
       | enough)
        
         | tryauuum wrote:
         | what's the prompt?
        
       | dmos62 wrote:
       | Enjoy the insight, but the title makes my eye twitch. How about
       | "LLM weights are pieces of history"?
        
         | lblume wrote:
         | Small LLM weights are not really interesting though. I am
          | currently training GPT-2-small-sized models for a scientific
          | project right now, and their world models are just not good
          | enough to generate any kind of real insight about the world
          | they were trained in, except for corpus biases.
        
           | dmos62 wrote:
           | A collection of newspapers is generally a better source than
           | a single leaflet, but even a leaflet is a piece of history.
        
       | GeoAtreides wrote:
        | Just as the map isn't the territory, summaries are not the
        | content, nor are a library's filings the actual books.
       | 
       | If I want to read a post, a book, a forum, I want to read exactly
       | that, not a simulacrum built by arcane mathematical algorithms.
        
         | visarga wrote:
         | The counter perspective is that this is not a book, it's an
         | interactive simulation of that era. The model is trained on
          | everything, which means it acts like a mirror of ourselves. I
         | find it fascinating to explore the mind-space it captured.
        
         | defgeneric wrote:
         | While the post talks about big LLMs as a valuable "snapshot" of
         | world knowledge, the same technology can be used for lossless
         | compression: https://bellard.org/ts_zip/.
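          | 
          | The core trick, roughly, is that an arithmetic coder driven by
          | a language model spends about -log2 p(token) bits per token,
          | so sharper predictions mean smaller files. A toy sketch of
          | that cost calculation (the model here is a made-up stand-in,
          | not ts_zip's):
          | 
          import math

          def compressed_bits(tokens, next_prob):
              # next_prob(prefix, token) -> model probability of token given prefix
              return sum(-math.log2(next_prob(tokens[:i], t))
                         for i, t in enumerate(tokens))

          # toy model (an assumption, not ts_zip's): confident about repeats,
          # clueless otherwise; a real LLM is far sharper, hence the compression
          toy = lambda prefix, tok: 0.5 if prefix and tok == prefix[-1] else 0.01

          print(compressed_bits(["a", "a", "a", "b"], toy))  # ideal coding cost in bits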
        
       | fl4tul4 wrote:
       | > Scientific papers and processes that are lost forever as
       | publishers fail, their websites shut down.
       | 
       | I don't think the big scientific publishers (now, in our time)
       | will ever fail, they are RICH!
        
         | bookofjoe wrote:
         | So was the Roman Empire
        
         | thayne wrote:
          | Perhaps a shorter-term risk is that the publishers consider some
         | papers less profitable, so they stop preserving them.
        
         | Legend2440 wrote:
         | That means nothing. Big companies fail all the time. There is
         | no guarantee any of them will be here in 50 years, let alone
         | 500.
        
       | dr_dshiv wrote:
       | "We should regard the Internet Archive as one of the most
       | valuable pieces of modern history; instead, many companies and
       | entities make the chances of the Archive to survive, and
       | accumulate what otherwise will be lost, harder and harder. I
       | understand that the Archive headquarters are located in what used
       | to be a church: well, there is no better way to think of it than
       | as a sacred place."
       | 
       | Amen. There is an active effort to create an Internet Archive
       | based in Europe, just... in case.
        
         | ttul wrote:
         | Well, it did establish a new HQ in Canada...
         | 
         | https://vancouversun.com/news/local-news/the-internet-archiv...
         | 
         | (Edited: apparently just a new HQ and not THE HQ)
        
           | thrance wrote:
           | With this belligerent maniac in the White House who recently
           | doubled-down on his wish to annex Canada [1], I wouldn't feel
           | safe relocating there if the goal is to flee the US.
           | 
           | [1] https://www.nbcnews.com/politics/donald-trump/trump-
           | quest-co...
        
         | badlibrarian wrote:
         | Anyone who takes even an hour to audit anything about the
         | Internet Archive will soon come to a very sad conclusion.
         | 
         | The physical assets are stored in the blast radius of an oil
         | refinery. They don't have air conditioning. Take the tour and
         | they tell you the site runs slower on hot days. Great mission,
         | but atrociously managed.
         | 
         | Under attack for a number of reasons, mostly absurd. But a few
         | are painfully valid.
        
           | floam wrote:
           | I realized recently, who needs torrents? I can get a good rip
           | of any movie right there.
        
             | aziaziazi wrote:
             | I understand what you describe is prohibited in many
             | jurisdictions, however I'm curious about the technical
              | aspect: in my experience they host the html but often not
              | the assets, especially big pictures, and I guess most movie
              | files are bigger than pictures. Do you use a special trick
             | to host/find them?
        
               | badlibrarian wrote:
                | No. And every video game ever made is available for
                | download as well. If you even have to download it: they
                | pride themselves on making many of them playable in
                | browser with just a click.
               | 
               | Copyright issues aside (let's avoid that mess) I was
               | referring to basic technical issues with the site. Design
               | is atrocious, search doesn't work, you can click 50
               | captures of a site before you find one that actually
               | loads, obvious data corruption, invented their own schema
               | instead of using a standard one and don't enforce it, API
               | is insane and usually broken, uploader doesn't work
               | reliably, don't honor DMCA requests, ask for photo id and
               | passports then leak them ...
               | 
               | It's the worst possible implementation of the best
               | possible idea.
        
               | yuvalr1 wrote:
               | And yet, it's the best we currently have. I donate to
                | them. We can come up with demands about how it should be
                | managed, but that should not prevent us from helping them.
        
               | badlibrarian wrote:
               | If you poke around at what US government agencies are
               | doing, and what European countries and non-profits are
               | doing, or even do a deep dive into what your local
               | library offers, you may find they no longer lead the
               | pack.
               | 
               | They didn't even ask for donations until they
               | accidentally set fire to their building annex. People
               | offered to help (SF was apparently booming that year) and
               | of course they promptly cranked out the necessary PHP to
               | accept donations.
               | 
               | Now it's become part of the mythology. But throwing petty
               | cash at a plane in a death spiral doesn't change gravity.
               | They need to rehabilitate their reputation and partner
               | with organizations who can help them achieve their
               | mission over the long term. I personally think they need
               | to focus on archival, legal long-term preservation and
               | archival, before sticking their neck out any further. If
               | this means no more Frogger in the browser, so be it.
               | 
               | I certainly don't begrudge anyone who donates, but asking
               | for $17 on the same page as copyrighted game ROMs and
               | glitchy scans of comic books isn't a long-term strategy.
        
           | dr_dshiv wrote:
           | Their yearly budget is less than the budget of just the SF
           | library system.
        
             | badlibrarian wrote:
             | Then maybe they should've figured out how to keep hard
             | drives in a climate controlled environment before they
             | decided to launch a bank.
             | 
             | https://ncua.gov/newsroom/press-release/2016/internet-
             | archiv...
        
         | blmurch wrote:
         | Yup! We're here and looking to do good work with Cultural
         | Heritage and Research Organizations in Europe. I'm very happy
         | to be working with the Internet Archive once again after a 20
         | year long break.
         | 
         | https://www.stichtinginternetarchive.nl/
        
       | Havoc wrote:
       | I wonder whether it'll become like pre-WW2 steel that doesn't
       | have nuclear contamination.
       | 
        | Just with pre-LLM knowledge.
        
       | guybedo wrote:
       | fwiw i've added a summary of the discussion here:
       | https://extraakt.com/extraakts/67d708bc9844db151612d782
        
       | dstroot wrote:
       | Isn't big LLM training data actually the most analogous to the
       | internet archive? Shouldn't the title be "Big LLM training data
       | is a piece of history"? Especially at this point in history since
       | a large portion of internet data going forward will be LLM
       | generated and not human generated? It's kind of the last snapshot
       | of human-created content.
        
         | antirez wrote:
          | The problem is, where are these 20T tokens that are being used
         | for this task? No way to access them. I hope that at least
         | OpenAI and a few more have solid historical storage of the
         | tokens they collect.
        
       | bossyTeacher wrote:
       | So large large language model?
        
       | blinky81 wrote:
       | "big large" lol
        
       | almosthere wrote:
       | Split the wayback machine away from its book copyright lawsuit
       | stuff and you don't have to worry.
        
       | codr7 wrote:
       | I find it very depressing to think that the only traces left from
        | all the creativity will end up being AI slop, the worst use case
       | ever.
       | 
       | I feel like the more people use GenAI, the less intelligent they
       | become. Like the rest of this society, they seem designed to suck
        | the life force out of humans and return useless crap instead.
        
       | andix wrote:
       | I think it's fine that not everything on the internet is archived
       | forever.
       | 
       | It has always been like that, in the past people wrote on paper,
       | and most of it was never archived. At some point it was just
       | lost.
       | 
       | I inherited many boxes of notes, books and documents from my
       | grandparents. Most of it was just meaningless to me. I had to
       | throw away a lot of it and only kept a few thousand pages of
       | various documents. The other stuff is just lost forever. And
       | that's probably fine.
       | 
       | Archives are very important, but nowadays the most difficult part
       | is to select what to archive. There is so much content added to
       | the internet every second, only a fraction of it can be archived.
        
       | throwaway48476 wrote:
        | The internet training data for LLMs is valuable history we're
        | losing one dead webadmin at a time. The regurgitated slop, less
       | so.
        
       | pama wrote:
        | I would be curious to know if it would be possible to reconstruct
       | approximate versions of popular common subsets of internet
       | training data by using many different LLMs that may have happened
        | to read the same info. Does anyone know of pointers to math
        | papers about such things?
        
       ___________________________________________________________________
       (page generated 2025-03-16 23:01 UTC)