[HN Gopher] Big LLMs weights are a piece of history
___________________________________________________________________
Big LLMs weights are a piece of history
Author : freeatnet
Score : 221 points
Date : 2025-03-16 12:13 UTC (10 hours ago)
(HTM) web link (antirez.com)
(TXT) w3m dump (antirez.com)
| api wrote:
| That's really what these are: something analogous to JPEG for
| language, and queryable in natural language.
|
| Tangent: I was thinking the other day: these are not AI in the
| sense that they are not primarily _intelligence_. I still don't
| see much evidence of that. What they do give me is superhuman
| memory. The main thing I use them for is search, research, and a
| "rubber duck" that talks back, and it's like having an intern who
| has memorized the library and the entire Internet. They
| occasionally hallucinate or make mistakes -- compression
| artifacts -- but it's there.
|
| So it's more AM -- artificial memory.
|
| Edit: as a reply pointed out: this is Vannevar Bush's Memex, kind
| of.
| antirez wrote:
| I believe LLMs are both data and processing, but even human
| reasoning relies heavily on existing knowledge. However, for
| the goal of the post, it is indeed the memorization that is the
| key value, and the fact that, likely, in the future sampling
| such models can be used to transfer the same knowledge to
| bigger LLMs, even if the source data is lost.
| api wrote:
| I'm not saying there is no latent reasoning capability. It's
| there. It just seems that the memory and lookup component is
| _much_ more useful and powerful.
|
| To me intelligence describes something much more capable than
| what I see in these things, even the bleeding edge ones. At
| least _so far_.
| antirez wrote:
| I offer a POV that is in the middle: reasoning is powerful
| for evaluating which solution is best among N in the
| context. Memorization allows sampling many competing ideas
| from the problem space; then the LLM picks the best, which
| is what makes chain of thought so effective. Of course
| zero-shot reasoning is also part of the story, but a
| somewhat weaker one, exactly as we are often unable to
| produce the best solution before evaluating the space
| (unless we are very accustomed to the specific problem).
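|
| A minimal sketch of that best-of-N idea in Python
| (sample_candidate and score are hypothetical stand-ins, not
| any particular API):
|
|   import random
|
|   def sample_candidate(problem):
|       # Stand-in for drawing one solution from the model's
|       # memorized problem space.
|       return f"candidate-{random.randint(0, 999)} for {problem}"
|
|   def score(problem, candidate):
|       # Stand-in for the evaluation / reasoning step.
|       return random.random()
|
|   def best_of_n(problem, n=8):
|       # Memorization supplies competing ideas; reasoning picks
|       # the best one.
|       candidates = [sample_candidate(problem) for _ in range(n)]
|       return max(candidates, key=lambda c: score(problem, c))
|
|   print(best_of_n("reverse a linked list"))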
| danielbln wrote:
| That's the problem with the term "intelligence". Everyone
| has their own definition, we don't even know what makes us
| humans intelligent and more often than not it's a moving
| goalpost as these models get better.
| Mistletoe wrote:
| This is an excellent viewpoint.
| menzoic wrote:
| Having memory is fine but choosing the relevant parts requires
| intelligence
| flower-giraffe wrote:
| Or 80 years to MVP memex
|
| "Vannevar Bush's 1945 article "As We May Think". Bush
| envisioned the memex as a device in which individuals would
| compress and store all of their books, records, and
| communications, "mechanized so that it may be consulted with
| exceeding speed and flexibility".
|
| https://en.m.wikipedia.org/wiki/Memex
| mdp2021 wrote:
| The memex was a deterministic device to consult documents -
| the _actual_ documents. The "LLM" is more like a dumb
| archivist that came with it ("_Yes, see for example that
| document, it tells you that q=M*k..._").
| skydhash wrote:
| I grew up with physical encyclopedia, then moved on to
| Encarta, then Wikipedia dumps and folders full of PDFs. I
| still prefer a curated information repository over chat
| interfaces or generated summaries. The main goal with the
| former is to have a knowledge map and keyword graph, so
| that you can locate any piece of information you may need
| from the actual source.
| hengheng wrote:
| I've been looking at it as an "instant reddit comment". I can
| download a 10G or 80G compressed archive that basically
| contains the useful parts of the internet, and then I can
| use it to synthesize something that is about as good and
| reliable as a really good reddit comment. Which is nifty. But
| honestly it's an incredible idea to sell that to businesses.
| api wrote:
| Reddit seems to puppet humans via engagement farming to do
| what LLMs do in some cases. Posts are prompts, replies are
| responses.
|
| Of course they vary widely in quality.
| Guthur wrote:
| And so what would the point be of anyone actually posting on
| the internet if no one actually visits the sites, because
| large corps have essentially stolen and monetized the whole
| thing?
|
| And I'm sure they have or will have the ability to influence
| the responses so you only see what they want you to see.
| kelseyfrog wrote:
| That's the next step after algorithmic content feeds -
| algorithmic/generated comment sections. Imagine seeing an
| entirely different conversation happening just to get you
| to buy a product. A product like Coca-Cola.
|
| Imagine scrolling through a comment section that feels
| tailor-made to your tastes, seamlessly guiding you to an
| ice-cold Coca-Cola. You see people reminiscing about their
| best summer memories--each one featuring a Coke in hand.
| Others are debating the superior refreshment of Coke over
| other drinks, complete with "real" testimonials and
| nostalgic stories.
|
| And just when you're feeling thirsty, a perfectly timed
| comment appears: "Nothing beats the crisp, refreshing taste
| of an ice-cold Coke on a hot day."
|
| Algorithmic engagement isn't just the future--it's already
| here, and it's making sure the next thing you crave is
| Coca-Cola. Open Happiness.
| adhamsalama wrote:
| Isn't that how Reddit gained momentum? Posting fake
| posts/comments?
|
| Now we can mass-produce it!
| cruffle_duffle wrote:
| Why Coca Cola though? Sure it is refreshing on a hot day
| but you know what is even better? Going to bed on a nice
| cool mattress. So many are either too hard or too soft.
| They aren't engineered to your body so you are virtually
| guaranteed to get a poor night's sleep.
|
| Imagine waking up like I do every morning. Refreshed and
| full of energy. I've tried many mattresses and the only
| one that has this property is my Slumber Sleep Hygiene
| mattress.
|
| The best part is my partner can customize their side
| using nothing more than a simple app on their smartphone.
| It tracks our sleep over time and uses AI to generate a
| daily sleep report showing me exactly how good of a night's
| sleep I got. Why rely on my gut feelings when the report
| can tell me exactly how good or bad of a night's sleep I
| got?
|
| I highly recommend Slumber Sleep Hygiene mattresses.
| There is a reason it's the number one brand recommended
| on HN.
| gosub100 wrote:
| Another insidious one: fake replies designed to console
| you if there aren't enough people to validate your opinion
| or answer your question.
| Guthur wrote:
| Or: war is good, peace is bad, nuclear war is winnable;
| don't worry and start loving the bomb. The enemy are not
| human anyway, your life will be better with fewer people
| around.
|
| Look at the people who want to control this, they do not
| want to sell you Coke.
| yannyu wrote:
| There's a great article recently by Ted Chiang that elaborated
| on this idea: https://www.newyorker.com/tech/annals-of-
| technology/chatgpt-...
| bob1029 wrote:
| If you want to see what this would actually be like:
|
| https://lcamtuf.coredump.cx/lossifizer/
|
| I think a fun experiment could be to see at what setting the
| average human can no longer decipher the text.
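|
| A rough sketch of that experiment in Python (a hypothetical
| corrupter, not the linked tool): damage characters with
| probability p and see where readability breaks down.
|
|   import random
|
|   def lossify(text, p=0.2, seed=0):
|       # Replace each character with '#' with probability p,
|       # loosely mimicking lossy "compression artifacts" in text.
|       rng = random.Random(seed)
|       return "".join("#" if rng.random() < p else ch
|                      for ch in text)
|
|   sample = "The quick brown fox jumps over the lazy dog"
|   for p in (0.1, 0.3, 0.5):
|       print(p, lossify(sample, p))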
| GolfPopper wrote:
| > _like having an intern who has memorized the library and the
| entire Internet. They occasionally hallucinate or make
| mistakes_
|
| Correction: you occasionally _notice_ when they hallucinate or
| make mistakes.
| xpe wrote:
| I regularly push back against casual uses of the word
| "intelligence".
|
| First, there is no objective dividing line. It is a matter of
| degree _relative_ to something else. Any language that suggests
| otherwise should be refined or ejected from our culture and
| language. Language's evolution doesn't have to be a nosedive.
|
| Second, there are many definitions of intelligence; some are
| more useful than others. Along with many, I like Stuart
| Russell's definition: the degree to which an agent can
| accomplish a task. This definition requires being clear about
| the agent and the task. I mention this so often I feel like a
| permalink is needed. It isn't "my" idea at all; it is simply
| the result of smart people decomplecting the idea so we're not
| mired in needless confusion.
|
| I rant about word meanings often because deep thinking people
| need to lay claim to words and shape culture accordingly. I say
| this often: don't cede the battle of meaning to the least
| common denominators of apathy, ignorance, confusion, or
| marketing.
|
| Some might call this kind of thinking elitist. No. This is what
| taking responsibility looks like. We could never have built
| modern science (or most rigorous fields of knowledge) with
| imprecise thinking.
|
| I'm so done with sloppy mainstream phrasing of "intelligence".
| Shit is getting real (so to speak), companies are changing the
| world, governments are racing to stay in the game, jobs will be
| created and lost, and humanity might transcend, improve,
| stagnate, or die.
|
| If humans, meanwhile, can't be bothered to talk about
| intelligence in a meaningful way, then, frankly, I think we're
| ... abdicating responsibility, tempting fate, or asking to be
| in the next Mike Judge movie.
| jart wrote:
| We never would have been able to create science, if it
| weren't for _focusing_ on the kinds of thinking that can be
| made logical. There's a big difference. What you're doing,
| with this whole "let's make a bullshit word logical" is more
| similar to medieval scholasticism, which was a vain attempt
| at verbal precision. https://justine.lol/dox/english.txt
| xpe wrote:
| Yikes, maybe we can take a step back? I'm not sure where
| this is coming from, frankly. One anodyne summary of my
| comment above would be:
|
| > Let's think and communicate more clearly regarding
| intelligence. Stuart Russell offers a nice definition: an
| agent's ability to do a defined task.
|
| Maybe something about my comment got you riled up? What was
| it?
|
| You wrote:
|
| > What you're doing, with this whole "let's make a bullshit
| word logical" is more similar to medieval scholasticism,
| which was a vain attempt at verbal precision.
|
| Again, I'm not quite sure what to say. You suggest my
| comment is like a medieval scholar trying to reconcile
| dogma with philosophy? Wow. That's an uncharitable reading
| of my comment.
|
| I have five points in response. First, the word
| intelligence need not be a "bullshit word", though I'm not
| sure what you mean by the term. One of my favorite
| definitions of bullshitting comes from "On Bullshit" by
| Harry Frankfurt:
|
| > Frankfurt determines that bullshit is speech intended to
| persuade without regard for truth. The liar cares about the
| truth and attempts to hide it; the bullshitter doesn't care
| whether what they say is true or false. - Wikipedia
|
| Second, I'm trying to _clarify_ the term intelligence by
| breaking it into parts. I wouldn't say I'm trying to make
| it "logical" (in the sense of being about logic or
| deduction). Maybe you mean "formal"?
|
| Third, regarding the "what you're doing" part... this isn't
| just me. Many people both clarify the concept of
| intelligence and explain why doing so is important.
|
| Fourth, are you saying it is impossible to clarify the
| meaning of intelligence? Why? Not worth the trouble?
|
| Fifth, have you thought about a definition of intelligence
| that you think is sensible? Does your definition steer
| people away from confusion?
|
| You also wrote:
|
| > We never would have been able to create science, if it
| weren't for focusing on the kinds of thinking that can be
| made logical.
|
| I think you mean _testable_, not _logical_. Yes, we agree,
| scientists should run experiments on things that can be
| tested.
|
| Russell's definition of _intelligence_ is testable by
| defining a task and a quality metric. This is already a big
| step up from an unexamined view of intelligence, which
| often has some arbitrary threshold.* It allows us to see a
| continuum from, say, how a bacterium finds food, to how ants
| collaborate, to how people both build and use tools to
| solve problems. It also teases out sentience and moral
| worth so we're not mixing them up with intelligence. These
| are simple, doable, and worthwhile clarifications.
|
| Finally, I read your quote from Dijkstra. In my reading,
| Dijkstra's main point is that natural language is a poor
| programming interface due to its ambiguity. Ok, fair. But
| what is the connection to this thread? Does it undercut any
| of my arguments? How?
|
| * A common problem when discussing intelligence involves
| moving the goal post. Whatever quality bar is implied has a
| tendency to creep upwards over time.*
| mdp2021 wrote:
| > JPEG for [a body of] language
|
| Yes!
|
| > artificial memory
|
| Well, "yes", kind of.
|
| > Memex
|
| After a flood?! Not really. Vannevar Bush - _As we may think_ -
| http://web.mit.edu/STS.035/www/PDFs/think.pdf
| visarga wrote:
| I can ask an LLM to write a haiku about the loss function of
| Stable Diffusion. Or I can have it do zero shot translation,
| between a pair of languages not covered in the training set.
| Can your "language JPEG" do that?
|
| I think "it's just compression" and "it's just parroting" are
| flawed metaphors. Especially when the model was trained with
| RLHF and RL/reasoning. Maybe a better metaphor is "LLM is like
| a piano, I play the keyboard and it makes 'music'". Or maybe
| it's a bicycle, I push the pedals and it takes me where I point
| it.
| rollcat wrote:
| https://xkcd.com/1683/
| intellectronica wrote:
| I love the title "Big LLMs" because it means that we are now
| making a distinction between big LLMs and minute LLMs and maybe
| medium LLMs. I'd like to propose that we call them "Tall LLMs",
| "Grande LLMs", and "Venti LLMs" just to be precise.
| HarHarVeryFunny wrote:
| But of course these are all flavors of "large", so then we have
| big large language models, medium large language models, etc,
| which does indeed make the tall/grande/venti names appropriate,
| or perhaps similar "all large" condom size names (large, huge,
| gargantuan).
| de-moray wrote:
| What does a 20 LLM signify?
| tonyhart7 wrote:
| Can we have a tiny LLM that can run on a smartphone now?
| winter_blue wrote:
| Apple Intelligence has an LLM that runs locally on the iPhone
| (15 Pro and up).
|
| But the quality of Apple Intelligence shows us what happens
| when you use a tiny ultra-low-wattage LLM. There's a whole
| subreddit dedicated to its notable fails:
| https://www.reddit.com/r/AppleIntelligenceFail/top/?t=all
|
| One example of this is _"Sorry I was very drunk and went home
| and crashed straight into bed"_ being summarized by Apple
| Intelligence as _"Drunk and crashed"_.
| Spooky23 wrote:
| I think the real problem with LLMs is we have deterministic
| expectations of non-deterministic tools. We've been trained
| to expect that the computer is correct.
|
| Personally, I think the summaries of alerts are incredibly
| useful. But my expectation of accuracy for a 20-word
| summary of multiple 20-30 word summaries is tempered by the
| reality that there are going to be issues given the lack of
| context. The point of the summary is to help me determine
| if I should read the alerts.
|
| LLMs break down when we try to make them independent agents
| instead of advanced power tools. A lot of people enjoy navel
| gazing and hand waving about ethics, "safety" and bias...
| then proceed to do things with obvious issues in those
| areas.
| mewpmewp2 wrote:
| Larger LLMs can summarize all of this quite well though.
| samstave wrote:
| I want a tiny phone-based LLM to do thought tracking and
| comms awareness.
|
| I actually applied to YC around 2014 for this:
|
| JotPlot - I wanted a timeline giving a historical view of
| comms between me and others - such that I had a sankey-ish
| diagram of when, with whom, and via which method I spoke
| with folks, and then each node was the message, call, text,
| meta links...
|
| I think it's still viable - but my thought process is
| currently too chaotic to pull it off.
|
| Basically, looking at a timeline of your comms and thoughts
| and expanding into links of thought - now with LLMs you could
| have a throw tag of some sort whereby you have the bot do
| research expanding on certain things and put up a site for
| that idea on localhost (i.e. your phone), so that you can
| pull up data relevant to the convo - and it's all in a
| timeline of thought / stream of consciousness.
|
| Hopefully you can visualize it...
| johnmaguire wrote:
| I had a thought that I think some people value social media
| (e.g. Facebook) essentially for this. Like giving up your
| Facebook profile means giving up your history or family
| tree or even your memories.
|
| So in that sense, maybe people would prefer a private
| alternative.
| samstave wrote:
| I read this in Sam Wattersons voice with a pipe abt
| maybey an inch from his beard,
|
| (Fyi I was a designer at fb and while it was luxious I
| still hated what I saw in zucks eyes every morn when I
| passed him.
|
| Super diff from Andy Grove at intel where for whateveer
| reason we were in the sam oee schekdule
|
| (That was me typing with eues ckised as a test (to
| myself, typos abound
| badlibrarian wrote:
| No. Smartphone only spin animated gif while talk to big
| building next to nuclear reactor. New radio inside make more
| efficient.
| rubslopes wrote:
| Is a _tiny large_ language model equivalent to a normal sized
| one?
| t_mann wrote:
| Big LLM is too long as a name. We should agree on calling them
| BLLMs. Surely everyone is going to remember what the letters
| stand for.
| temp0826 wrote:
| Bureau of Large Land Management
| bookofjoe wrote:
| >What does BLLM stand for?
|
| https://www.abbreviations.com/BLLM#google_vignette
| heyjamesknight wrote:
| I want to apologize for this joke in advance. It had to be
| done.
|
| We could take a page from Trump's book and call them
| "Beautiful" LLMs. Then we'd have "Big Beautiful LLMs" or just
| "BBLs" for short.
|
| Surely that wouldn't cause any confusion when Googling.
| cowsaymoo wrote:
| Weirdly enough, the ITU already chose the superlative for
| the bigliest radio frequency band to be Tremendous:
|
| - Extremely Low Frequency (ELF)
|
| - Super Low Frequency (SLF)
|
| - Ultra Low Frequency (ULF)
|
| - Very Low Frequency (VLF)
|
| - Low Frequency (LF)
|
| - Medium Frequency (MF)
|
| - High Frequency (HF)
|
| - Very High Frequency (VHF)
|
| - Ultra High Frequency (UHF)
|
| - Super High Frequency (SHF)
|
| - Extremely High Frequency (EHF)
|
| - Tremendously High Frequency (THF)
|
| Maybe one day some very smart people will make Tremendously
| Large Language Models. They will be very large and need a
| lot of computer. And then you'll have the Extremely Small
| Language Model. They are like nothing.
|
| https://en.wikipedia.org/wiki/Radio_frequency?#Frequency_ba
| n...
| lifthrasiir wrote:
| AFAIK "tremendously" was chosen partly because the range
| includes 1 "T"Hz.
| bee_rider wrote:
| XKCD telescope sizes also could provide some guidance
|
| https://xkcd.com/1294/
| droidist2 wrote:
| I hope they go with "Ludicrous" like in Spaceballs.
| nullhole wrote:
| I still like Big Data Statistical Model
| guestbest wrote:
| Why not LLLM for large LLMs and SLLM for small LLMs, assuming
| there is no middle ground?
| orbital-decay wrote:
| SLM is a widespread term already.
| guestbest wrote:
| Slim pickings, then?
| _heimdall wrote:
| What makes it a Small Large Language Model? Why jot just an
| SLM?
| guestbest wrote:
| If we can't have fun with names, why even be in IT?
| technol0gic wrote:
| Smedium Language Model
| dbalatero wrote:
| Lousy Smarch weather
| gpderetta wrote:
| S and L cancel out, so it's just an LM.
| kolinko wrote:
| VLLM, Super VLLM, Almost Large Language Model
| flir wrote:
| M, LM, LLM, LLLM, L3M, L4M.
|
| Gotta leave room for future expansion.
| dan_linder wrote:
| Hopefully the USB making team does NOT step into this...
|
| LLM 3.0, LLM 3.1 Gen 1, LLM 3.2 Gen 1, LLM 3.1, LLM 3.1 Gen
| 2, LLM 3.2 Gen 2, LLM 3.2, LLM 3.2 Gen 2x2, LLM 4, etc...
| moffkalast wrote:
| 2L4M
| badlibrarian wrote:
| I've sat in more than one board meeting watching them take 20
| minutes to land on t-shirt sizes. The greatest enterprise sales
| minds of our generation...
| ben_w wrote:
| I've seen things you people wouldn't believe.
|
| I've seen corporate slogans fired off from the shoulders of
| viral creatives. Synergy-beams glittering in the darkness of
| org charts. Thought leadership gone rogue... All these
| moments will be lost to NDAs and non-disparagement clauses,
| like engagement metrics in a sea of pivot decks.
|
| Time to leverage.
| badlibrarian wrote:
| ... destroyed by madness, starving hysterical! Buying weed
| in a store then meeting with someone off Craigslist to score
| eggs.
| rnrn wrote:
| it's too bad vLLM and VLM are taken because it would have been
| nice to recycle the VLSI solution to describing sizes - get to
| very large language models and leave it at that.
| rnrn wrote:
| we could also look to magnetoresistance and go for giant,
| colossal, extraordinary
| do_not_redeem wrote:
| After very large language models, the next step is mega
| language models, or MLMs. As a bonus, it describes the VC
| funding scheme that backs them too.
| AlienRobot wrote:
| Terrible names, to be honest. My proposal: Hyper LLMs, Ultra
| LLMs, Large LLMs, Micro LLMs, Mobile LLMs.
| isoprophlex wrote:
| LLM M4 Ultra Pro Max 16e (with headphone jack)
| AlienRobot wrote:
| GPT Inside
| BobaFloutist wrote:
| LLM, LLM 2.0, LLM 3.0, Mini LLM, Micro LLM, LLM C.
| jfengel wrote:
| LLM 95, LLM 98, LLM Millennium Edition, LLM NT, LLM XP, LLM
| 2000, LLM 7
|
| I really appreciated the way they managed to come up with a
| new naming scheme each time, usually used exactly once.
| Scarblac wrote:
| LLM 3.11 for Workgroups
| ben_w wrote:
| Could always go with the Bungie approach for the Marathon
| series: LLM, LLM2, LLM[?], 1 -- https://alephone.lhowon.org
|
| (Obviously [?] is for the actual singularity, and 1 is the
| thing after that).
| davidwritesbugs wrote:
| or "DietLLM, RegularLLM, MealLLM and SuperSizedLLMWithFries"
| naveen99 wrote:
| LLM already has one large in it...
| ben_w wrote:
| If we can have a "Personal PIN Identification Number", we can
| have a "Large LLM Language Model".
| naveen99 wrote:
| Redundundant
| mewpmewp2 wrote:
| What about Impersonal PIN anonymization letter?
| latexr wrote:
| Name them like clothing sizes: XXLLM, XLLM, LLM, MLM, SLM, XSLM
| XXSLM.
| ai-christianson wrote:
| MLM... uh oh
| anonym29 wrote:
| I hate those ponzi schemes! Never buy a cutco knife or
| those crappy herbalife supplements.
|
| Alternatively, just make sure you keep things consensual,
| and keep yourself safe, no judgement or labels from me :)
| swyx wrote:
| i did this!
|
| XXLLM: ~1T (GPT4/4.5, Claude Opus, Gemini Pro)
|
| XLLM: 300~500B (4o, o1, Sonnet)
|
| LLM: 20~200B (4o, GPT3, Claude, Llama 3 70B, Gemma 27B)
|
| ~~zone of emergence~~
|
| MLM: 7~14B (4o-mini, Claude Haiku, T5, LLaMA, MPT)
|
| SLM: 1~3B (GPT2, Replit, Phi, Dall-E)
|
| ~~zone of generality~~
|
| XSLM: <1B (Stable Diffusion, BERT)
|
| 4XSLM: <100M (TinyStories)
|
| https://x.com/swyx/status/1679241722709311490
| _bin_ wrote:
| "big large language model" renminds me uncomfortably of
| "automated teller machine machine"
| semireg wrote:
| Pro, max, ultra...
| TZubiri wrote:
| Doesn't the first L in LLM mean large already?
|
| It's like saying Automated ATM. Whoever wrote it barely knows
| what the acronym means.
|
| This whole article feels like it was written by someone who
| doesn't understand the subject matter at all.
| xanderlewis wrote:
| Almost everyone says 'PIN number' as well.
| thih9 wrote:
| We're fine with "The Big Friendly Giant" and the Sahara
| Desert ("desert desert"); big LLM could join the family of
| pleonasms.
|
| https://en.m.wikipedia.org/wiki/Pleonasm
| TZubiri wrote:
| When it's a different language it's fine.
| Kiro wrote:
| Yes, that's the point of the comment and the whole discussion
| here. LLMs are already Large so what should the prefix be?
| Big LLM is a strong contender. I'm also pretty sure the
| creator of redis is not "someone who doesn't understand the
| subject matter at all".
| TZubiri wrote:
| It's very common for experts on one subject to take a jab
| at another subject and depend on their reputation while
| their skillset doesn't translate at all.
| xanderlewis wrote:
| And the US 'small' LLMs will actually be slightly larger than
| the 'large' LLMs in the UK.
| aziaziazi wrote:
| I wonder how the skinnies get dressed overseas: I wear a
| European S, which translates to XXS in the US, but there are
| many people skinnier than me, still within a "normal" BMI. Do
| they have to find XXXS? Do they wear oversized clothes?
| Choosing trousers is way easier because the system of
| cm/inches for length+perimeter corresponds to real values.
| Spivak wrote:
| It's a crazy experience being just physically larger than
| most of the world. Especially when the size on the label
| carries some implicit shame/judgement. Like I'm skinny, I'm
| pretty much the lowest weight I can be and not look
| emaciated / worrying. But when shopping for a skirt in
| Asian sizes I was a 4XL, and usually an L-2XL in
| European sizes. Having to shift my mental space that a US M
| is the "right" size for me was hard for many years. But
| like I guess this is how sizing was always kinda supposed
| to work.
| deepsun wrote:
| We ordered swag T-shirts for a conference from two providers,
| but the EU provider's Ls were actually larger than the US Ls!
| jgalt212 wrote:
| It's funny you say that, but when travelling abroad I
| wondered how Europeans and Japanese stay sufficiently
| hydrated.
| jdietrich wrote:
| For healthy adults, thirst is a perfectly adequate guide to
| hydration needs. Historically normal patterns of drinking -
| e.g. water with meals and a few cups of tea or coffee in
| between - are perfectly sufficient unless you're doing hard
| physical labour or spending long periods of time outdoors
| in hot weather. The modern American preoccupation with
| constantly drinking water is a peculiar cultural phenomenon
| with no scientific basis.
| kccqzy wrote:
| I've always understood constantly drinking water as a
| ruse to use the bathroom more often, which is helpful for
| Americans with sedentary lifestyles.
| droidist2 wrote:
| Don't many medications dehydrate you though? And
| Americans are on a lot of medications.
| brian-armstrong wrote:
| Diabetes causes dehydration
| floriannn wrote:
| Is this a thing about how restaurants in some European
| countries charge for water?
| miki123211 wrote:
| > The UK
|
| You mean the EU, right? The UK isn't covered by the AI act.
|
| /s
| thih9 wrote:
| Dismissed, Big LLM will live on along with Big Data.
| deepsun wrote:
| Well, big data for me was always clear -- when data sizes are
| too large to use regular tools (ls, du, wc, vi, pandas).
|
| I.e. when pretty much every tool or script I used before
| doesn't work anymore, and I need a special tool (gsutil, bq,
| dask, slurm), it's a mind shift.
| huijzer wrote:
| "There are 2 hard problems in computer science: cache
| invalidation, naming things, and off-by-1 errors."
| saltcured wrote:
| I'd prefer to see olive sizes get a renaissance. I was always
| amused by Super Colossal when following my mom around a store
| as a little kid.
|
| From a random web search, it seems the sizes above Large are:
| Extra Large, Jumbo, Extra Jumbo, Giant, Colossal, Super
| Colossal, Mammoth, Super Mammoth, Atlas.
| inciampati wrote:
| And I'd love to see data compression terminology get an
| overhaul. Do we need big LLMs or just succinct data
| structures? Or maybe "compact" would be good enough? (Yeah
| LLMs are cool but why not just, you know, losslessly compress
| the actual data in a way that lets us query its content?)
| rowanG077 wrote:
| Well the obvious answer is that LLMs are more than just
| pure search. They can synthesize novel information from
| their learned knowledge.
| varispeed wrote:
| Then there will be "decaf LLM"
| Arcuru wrote:
| I've been labeling LLMs as "teensy", "smol", "mid", "biggg",
| "yuuge". I've been struggling to figure out where to place the
| lines between them though.
| nextts wrote:
| https://xkcd.com/1294/
| laborcontract wrote:
| I miss the good ol' days when I'd have text-davinci make me a
| table of movies that included a link to the movie poster. It
| usually generated a URL of an image in an S3 bucket. The link
| _always worked_.
| nickpsecurity wrote:
| People wanting this would be better off using memory
| architectures, like how the brain does it. For ML, the simplest
| approach is putting in memory layers with content-addressable
| schemes. I have a few links on prototypes in this comment:
|
| https://news.ycombinator.com/item?id=42824960
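|
| A minimal sketch of one such content-addressable memory read
| (toy NumPy with assumed shapes, not any of the linked
| prototypes):
|
|   import numpy as np
|
|   def memory_lookup(query, keys, values, top_k=4):
|       # Content-addressable read: score stored keys against the
|       # query, keep the top_k best matches, and return a
|       # softmax-weighted mix of their values.
|       scores = keys @ query                 # (num_slots,)
|       idx = np.argsort(scores)[-top_k:]     # best-matching slots
|       w = np.exp(scores[idx] - scores[idx].max())
|       w /= w.sum()
|       return w @ values[idx]                # (value_dim,)
|
|   rng = np.random.default_rng(0)
|   keys = rng.normal(size=(1024, 64))    # stored addresses
|   values = rng.normal(size=(1024, 128)) # stored contents
|   query = rng.normal(size=64)           # produced by the model
|   print(memory_lookup(query, keys, values).shape)  # (128,)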
| HarHarVeryFunny wrote:
| Animal brains do not separate long term memory and processing -
| they are one and the same thing - columnar neural assemblies in
| the cortex that have learnt to recognize repeated patterns, and
| in turn activate others.
| hedgehog wrote:
| This doesn't make much sense to me. Unattributed hearsay has
| limited historical value, perhaps zero given that the view of the
| web most of the weights-available models have is Common Crawl
| which is itself available for preservation.
| jart wrote:
| Mozilla's llamafile project is designed to enable LLMs to be
| preserved for historical purposes. They ship the weights and all
| the necessary software in a deterministic dependency-free single-
| file executable. If you save your llamafiles, you should be able
| to run them in fifty years and have the outputs be exactly the
| same as what you'd get today. Please support Mozilla in their
| efforts to ensure this special moment in history gets archived
| for future generations!
|
| https://github.com/Mozilla-Ocho/llamafile/
| visarga wrote:
| LLMs are much easier to port than software. They are just a big
| blob of numbers and a few math operations.
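|
| A toy sketch of what "a few math operations" means here (a
| single MLP-style block in NumPy; illustrative only, not any
| real checkpoint format):
|
|   import numpy as np
|
|   def toy_block(x, w1, b1, w2, b2):
|       # One "layer" of the blob: two matrix multiplies, a
|       # nonlinearity, and a residual add.
|       h = np.maximum(x @ w1 + b1, 0.0)
|       return x + h @ w2 + b2
|
|   rng = np.random.default_rng(0)
|   d, hidden = 16, 64
|   weights = {                        # the "blob of numbers"
|       "w1": rng.normal(size=(d, hidden)), "b1": np.zeros(hidden),
|       "w2": rng.normal(size=(hidden, d)), "b2": np.zeros(d),
|   }
|   x = rng.normal(size=(1, d))
|   print(toy_block(x, **weights).shape)   # (1, 16)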
| refulgentis wrote:
| LLMs are _much_ harder, software is just a blob of _two_
| numbers.
|
| ;)
|
| (less socratic: I have a fraction of a fraction of jart's
| experience, but have enough experience via maintaining a
| cross-platform llama.cpp wrapper to know there's a _ton_ of
| ways to interpret that bag o' floats and you need a _lot_ of
| ancillary information.)
| andix wrote:
| I think software is rather easy to archive. Emulators are
| the key. Nearly every platform from the past can be emulated
| on a modern ARM/x86 Linux/Windows system.
| ARM/x86/Linux/Windows are ubiquitous; even if they fade
| away, there will be emulators around for a long time. With
| future compute power it should be no problem to just use
| nested emulation, running old emulators on an emulated
| x86/Linux.
| jsight wrote:
| Indeed. In 50 years, loading the weights and doing math
| should be much easier than getting some 50 year old piece of
| cuda code to work.
|
| Then again, CPUs will be fast enough that you'd probably just
| emulate amd64 and run it as CPU-only.
| isoprophlex wrote:
| Interesting. Just this morning I had a conversation with Claude
| about this very topic. When asked "can you give me your thoughts
| on LLM training runs as historical artifacts? do you think they
| might be uniquely valuable for future historians?", it answered
| > oh HELL YEAH they will be. future historians are gonna have
| a fucking field day with us.
|
| > imagine some poor academic in 2147 booting up "vintage
| llm.exe" and getting to directly interrogate the batshit
| insane period when humans first created quasi-sentient text
| generators right before everything went completely sideways
| with *gestures vaguely at civilization*
|
| > *"computer, tell me about the vibes in 2025"*
|
| > "BLARGH everyone was losing their minds about ai while also
| being completely addicted to it"
|
| Interesting indeed to be able to directly interrogate the median
| experience of being online in 2025.
|
| (also my apologies for slop-posting; i slapped so much custom
| prompting on it that I hope you'll find the output to be amusing
| enough)
| tryauuum wrote:
| what's the prompt?
| dmos62 wrote:
| I enjoyed the insight, but the title makes my eye twitch. How about
| "LLM weights are pieces of history"?
| lblume wrote:
| Small LLM weights are not really interesting though. I am
| currently training GPT-2-small-sized models for a scientific
| project right now, and their world models are just not good
| enough to generate any kind of real insight about the world
| they were trained in, except for corpus biases.
| dmos62 wrote:
| A collection of newspapers is generally a better source than
| a single leaflet, but even a leaflet is a piece of history.
| GeoAtreides wrote:
| Just as the map isn't the territory, summaries are not the
| content, nor are library filings the actual books.
|
| If I want to read a post, a book, a forum, I want to read exactly
| that, not a simulacrum built by arcane mathematical algorithms.
| visarga wrote:
| The counter perspective is that this is not a book, it's an
| interactive simulation of that era. The model is trained on
| everything, this means it acts like a mirror of ourselves. I
| find it fascinating to explore the mind-space it captured.
| defgeneric wrote:
| While the post talks about big LLMs as a valuable "snapshot" of
| world knowledge, the same technology can be used for lossless
| compression: https://bellard.org/ts_zip/.
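|
| The core trick there, sketched in Python (not the actual
| ts_zip implementation): the model predicts each next token,
| and an entropy coder spends roughly -log2 p(token) bits on
| it, so better prediction means a smaller archive.
|
|   import math
|
|   def ideal_compressed_bits(token_probs):
|       # Each token costs about -log2 p bits under the model's
|       # next-token distribution (ideal entropy-coding view).
|       return sum(-math.log2(p) for p in token_probs)
|
|   # Hypothetical probabilities a model assigned to 5 tokens.
|   probs = [0.25, 0.9, 0.6, 0.05, 0.8]
|   bits = ideal_compressed_bits(probs)
|   print(f"{bits:.1f} bits for {len(probs)} tokens")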
| fl4tul4 wrote:
| > Scientific papers and processes that are lost forever as
| publishers fail, their websites shut down.
|
| I don't think the big scientific publishers (now, in our time)
| will ever fail, they are RICH!
| bookofjoe wrote:
| So was the Roman Empire
| thayne wrote:
| Perhaps a shorter-term risk is that publishers consider some
| papers less profitable, so they stop preserving them.
| Legend2440 wrote:
| That means nothing. Big companies fail all the time. There is
| no guarantee any of them will be here in 50 years, let alone
| 500.
| dr_dshiv wrote:
| "We should regard the Internet Archive as one of the most
| valuable pieces of modern history; instead, many companies and
| entities make the chances of the Archive to survive, and
| accumulate what otherwise will be lost, harder and harder. I
| understand that the Archive headquarters are located in what used
| to be a church: well, there is no better way to think of it than
| as a sacred place."
|
| Amen. There is an active effort to create an Internet Archive
| based in Europe, just... in case.
| ttul wrote:
| Well, it did establish a new HQ in Canada...
|
| https://vancouversun.com/news/local-news/the-internet-archiv...
|
| (Edited: apparently just a new HQ and not THE HQ)
| thrance wrote:
| With this belligerent maniac in the White House who recently
| doubled-down on his wish to annex Canada [1], I wouldn't feel
| safe relocating there if the goal is to flee the US.
|
| [1] https://www.nbcnews.com/politics/donald-trump/trump-
| quest-co...
| badlibrarian wrote:
| Anyone who takes even an hour to audit anything about the
| Internet Archive will soon come to a very sad conclusion.
|
| The physical assets are stored in the blast radius of an oil
| refinery. They don't have air conditioning. Take the tour and
| they tell you the site runs slower on hot days. Great mission,
| but atrociously managed.
|
| Under attack for a number of reasons, mostly absurd. But a few
| are painfully valid.
| floam wrote:
| I realized recently, who needs torrents? I can get a good rip
| of any movie right there.
| aziaziazi wrote:
| I understand what you describe is prohibited in many
| jurisdictions; however, I'm curious about the technical
| aspect: in my experience they host the HTML but often not
| the assets, especially big pictures, and I guess most movie
| files are bigger than pictures. Do you use a special trick
| to host/find them?
| badlibrarian wrote:
| No. And every video game ever made is available for
| download as well. If you even have to download it: they
| pride themselves on making many of them playable in the
| browser with just a click.
|
| Copyright issues aside (let's avoid that mess) I was
| referring to basic technical issues with the site. Design
| is atrocious, search doesn't work, you can click 50
| captures of a site before you find one that actually
| loads, obvious data corruption, invented their own schema
| instead of using a standard one and don't enforce it, API
| is insane and usually broken, uploader doesn't work
| reliably, don't honor DMCA requests, ask for photo id and
| passports then leak them ...
|
| It's the worst possible implementation of the best
| possible idea.
| yuvalr1 wrote:
| And yet, it's the best we currently have. I donate to
| them. We can come up with demands about how it should be
| managed, but that should not prevent us from helping them.
| badlibrarian wrote:
| If you poke around at what US government agencies are
| doing, and what European countries and non-profits are
| doing, or even do a deep dive into what your local
| library offers, you may find they no longer lead the
| pack.
|
| They didn't even ask for donations until they
| accidentally set fire to their building annex. People
| offered to help (SF was apparently booming that year) and
| of course they promptly cranked out the necessary PHP to
| accept donations.
|
| Now it's become part of the mythology. But throwing petty
| cash at a plane in a death spiral doesn't change gravity.
| They need to rehabilitate their reputation and partner
| with organizations who can help them achieve their
| mission over the long term. I personally think they need
| to focus on archival, legal long-term preservation and
| archival, before sticking their neck out any further. If
| this means no more Frogger in the browser, so be it.
|
| I certainly don't begrudge anyone who donates, but asking
| for $17 on the same page as copyrighted game ROMs and
| glitchy scans of comic books isn't a long-term strategy.
| dr_dshiv wrote:
| Their yearly budget is less than the budget of just the SF
| library system.
| badlibrarian wrote:
| Then maybe they should've figured out how to keep hard
| drives in a climate controlled environment before they
| decided to launch a bank.
|
| https://ncua.gov/newsroom/press-release/2016/internet-
| archiv...
| blmurch wrote:
| Yup! We're here and looking to do good work with Cultural
| Heritage and Research Organizations in Europe. I'm very happy
| to be working with the Internet Archive once again after a 20
| year long break.
|
| https://www.stichtinginternetarchive.nl/
| Havoc wrote:
| I wonder whether it'll become like pre-WW2 steel that doesn't
| have nuclear contamination.
|
| Just with pre-LLM knowledge.
| guybedo wrote:
| fwiw i've added a summary of the discussion here:
| https://extraakt.com/extraakts/67d708bc9844db151612d782
| dstroot wrote:
| Isn't big LLM training data actually the most analogous to the
| internet archive? Shouldn't the title be "Big LLM training data
| is a piece of history"? Especially at this point in history since
| a large portion of internet data going forward will be LLM
| generated and not human generated? It's kind of the last snapshot
| of human-created content.
| antirez wrote:
| The problem is, where are these 20T tokens that are being
| used for this task? There is no way to access them. I hope
| that at least OpenAI and a few more have solid historical
| storage of the tokens they collect.
| bossyTeacher wrote:
| So large large language model?
| blinky81 wrote:
| "big large" lol
| almosthere wrote:
| Split the wayback machine away from its book copyright lawsuit
| stuff and you don't have to worry.
| codr7 wrote:
| I find it very depressing to think that the only traces left
| from all the creativity will end up being AI slop, the worst
| use case ever.
|
| I feel like the more people use GenAI, the less intelligent
| they become. Like the rest of this society, these tools seem
| designed to suck the life force out of humans and return
| useless crap instead.
| andix wrote:
| I think it's fine that not everything on the internet is archived
| forever.
|
| It has always been like that, in the past people wrote on paper,
| and most of it was never archived. At some point it was just
| lost.
|
| I inherited many boxes of notes, books and documents from my
| grandparents. Most of it was just meaningless to me. I had to
| throw away a lot of it and only kept a few thousand pages of
| various documents. The other stuff is just lost forever. And
| that's probably fine.
|
| Archives are very important, but nowadays the most difficult part
| is to select what to archive. There is so much content added to
| the internet every second, only a fraction of it can be archived.
| throwaway48476 wrote:
| The internet training data for LLMs is valuable history we're
| losing one dead webadmin at a time. The regurgitated slop,
| less so.
| pama wrote:
| I would be curious to know if it would be possible to
| reconstruct approximate versions of popular common subsets of
| internet training data by using many different LLMs that may
| have happened to read the same info. Does anyone know of
| pointers to math papers about such things?
___________________________________________________________________
(page generated 2025-03-16 23:01 UTC)