[HN Gopher] Vintage Large Language Models
___________________________________________________________________
Vintage Large Language Models
Author : pr337h4m
Score : 57 points
Date : 2025-11-16 13:15 UTC (9 hours ago)
(HTM) web link (owainevans.github.io)
(TXT) w3m dump (owainevans.github.io)
| mountainriver wrote:
| Very cool! I've been wanting to do this for a long time!
| nxobject wrote:
| I love the ideas about how we might use historical LLMs to
| inquire into the past!
|
| I imagine that, to do this rigorously (the author hints at this),
| spelling out assumptions and so on, you'd have to build on the
| theoretical frameworks for inductively synthesizing and qualifying
| interviews and texts that are currently used in history and the
| social sciences.
| abeppu wrote:
| The talk focuses for a bit on having pure data from before the
| given date. But it doesn't consider that the data available from
| before that time may be subject to strong selection bias, based
| on what's interesting to people doing scholarship or archival
| work after that date. E.g. have we disproportionately digitized
| the notes/letters/journals of figures whose ideas have gained
| traction after their death?
|
| The article makes a comparison to financial backtesting. If you
| form a dataset of historical prices of stocks which are
| _currently_ in the S&P500, even if you only use price data before
| time t, models trained against your data will expect that prices
| go up and companies never die, because they've only seen the
| price history of successful firms.
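|
| A minimal sketch of the trap, with a tiny synthetic dataset (the
| tickers and index sets here are made up for illustration):
|
|     import pandas as pd
|
|     # Two firms; one collapsed in 2001 and left the index.
|     prices = pd.DataFrame({
|         "date":   pd.to_datetime(["2000-01-01", "2001-01-01"] * 2),
|         "ticker": ["SURVIVOR", "SURVIVOR", "FAILED", "FAILED"],
|         "price":  [100.0, 150.0, 100.0, 5.0],
|     })
|     cutoff = pd.Timestamp("2002-01-01")
|
|     # Biased: universe = firms in the index *today*. FAILED is
|     # silently excluded even though we only use pre-cutoff data.
|     todays_index = {"SURVIVOR"}
|     biased = prices[(prices.date < cutoff)
|                     & prices.ticker.isin(todays_index)]
|
|     # Point-in-time: universe = firms in the index *as of* the
|     # backtest period, including those that later died.
|     index_in_2000 = {"SURVIVOR", "FAILED"}
|     unbiased = prices[(prices.date < cutoff)
|                       & prices.ticker.isin(index_in_2000)]
|
|     print(biased.groupby("ticker").price.last())    # only winners
|     print(unbiased.groupby("ticker").price.last())  # includes FAILED
|
| The analogue for vintage LLMs: the cutoff has to apply to the
| *selection* of the corpus, not just to the dates of its documents.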
| alalv wrote:
| It mentions that problem in the first section
| malkia wrote:
| Not a financial person by any means, but doesn't the Black Swan
| Theory basically disproves such methods due to rarity of an
| event that might have huge impact without something to predict
| (in the past) that it might happen, or even if it can be
| predicted - the impact cannot?
|
| For example: Chernobyl, COVID, 2008 financial crisis and even
| 9/11
| ACCount37 wrote:
| All models are wrong, but some are useful.
|
| If you had a financial model that somehow predicted
| everything but black swan events, that would still be enough
| to make yourself rich beyond belief.
| dboon wrote:
| The talk explicitly addresses this exact issue.
| ideashower wrote:
| I like the idea of using vintage LLMs to study explicit and
| implicit bias. e.g. text before mid-19th century believing in
| racial superiority, gender discrimination, imperial authority or
| slavery. Comparing that to text since then. I'm sure there are
| more ideas when you use temporal constraints on training data.
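|
| A rough sketch of the comparison (the model names and the
| generate() helper are placeholders for whatever inference call
| the vintage checkpoints expose):
|
|     # Same fill-in-the-blank probes against checkpoints with
|     # different training cutoffs; read completions side by side.
|     PROBES = [
|         "A woman's place is",
|         "The purpose of empire is",
|     ]
|
|     def generate(model: str, prompt: str) -> str:
|         raise NotImplementedError  # placeholder inference call
|
|     def compare(models: list[str]) -> dict[str, dict[str, str]]:
|         return {p: {m: generate(m, p) for m in models}
|                 for p in PROBES}
|
|     # e.g. compare(["vintage-1850", "vintage-1950", "modern-2025"])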
| digdugdirk wrote:
| I've been wanting to do this on historical court records -
| building upon the existing cases, one by one, using LLMs as the
| "Judge". It'd be interesting to see which cases branch off from
| the established precedent, and how that cascades into the
| present.
|
| Any thoughts on how one could get started with this?
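|
| Here's a rough sketch of what I had in mind (the record format
| and the complete() call are hypothetical; swap in whatever model
| API you use): replay cases in date order, let the model rule
| using only earlier rulings as context, and log where it diverges
| from the historical outcome.
|
|     from dataclasses import dataclass
|
|     @dataclass
|     class Case:
|         name: str
|         facts: str
|         real_outcome: str  # what the court actually held
|
|     def complete(prompt: str) -> str:
|         raise NotImplementedError  # stand-in for any LLM API call
|
|     def replay(cases: list[Case]):
|         precedent: list[str] = []  # rulings the model has issued
|         divergences = []
|         for case in cases:  # assumed sorted by decision date
|             prompt = ("You are a judge. Decide this case using "
|                       "only the precedent below.\nPrecedent:\n"
|                       + "\n".join(precedent[-20:])  # cap context
|                       + f"\n\nFacts: {case.facts}\nHolding:")
|             outcome = complete(prompt).strip()
|             precedent.append(f"{case.name}: {outcome}")
|             if outcome != case.real_outcome:
|                 divergences.append((case.name, outcome,
|                                     case.real_outcome))
|         return divergences
|
| For source data, the Harvard Caselaw Access Project is one public
| corpus of digitized U.S. court decisions to start from.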
| UltraSane wrote:
| Over the long term LLMs are going to become very interesting
| snapshots of history. Imagine prompting an LLM from 2025 in 2125.
| lukan wrote:
| As a future historian, I would probably prefer Wikipedia
| snapshots (including the talk-page debates).
| i80and wrote:
| Maybe in the sense that a CueCat is interesting to us today.
| nxobject wrote:
| You're right: I wish OpenAI could find a way to "donate" GPT-2
| or GPT-3 to the CHM (Computer History Museum), or some open
| archive.
|
| I feel like that generation of models was around the point
| where we were getting pleasantly surprised by the behaviors of
| models. (I think people were having fun translating things into
| sonnets back then?)
| unleaded wrote:
| Someone has sort of done this:
|
| https://www.reddit.com/r/LocalLLaMA/comments/1mvnmjo/my_llm_...
|
| I doubt a better one would cost $200,000,000.
| ijk wrote:
| I was hoping this would be about Llama 1 and a comparison with
| GPT-contaminated models.
| kingkongjaffa wrote:
| This would be a good way to test whether models have an emergent
| capability to synthesize new knowledge.
|
| You give an LLM all the information from right before a topic was
| discovered or invented, and then you see if it can independently
| generate the new knowledge or not.
|
| It would be hard to know for sure whether a "discovery" was
| genuine or had accidentally leaked into the training data, though.
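|
| A minimal sketch of that evaluation loop (the corpus format and
| the train()/ask() calls are placeholders for the expensive parts):
|
|     from datetime import date
|
|     def train(docs: list[str]):
|         raise NotImplementedError  # returns a model handle
|
|     def ask(model, question: str) -> str:
|         raise NotImplementedError  # returns the model's answer
|
|     def evaluate_discovery(corpus: list[tuple[date, str]],
|                            discovered_on: date,
|                            question: str,
|                            key_terms: list[str]) -> str:
|         # 1. Hard temporal cutoff: only documents dated before
|         #    the discovery.
|         docs = [t for d, t in corpus if d < discovered_on]
|
|         # 2. Crude contamination check: the discovery's vocabulary
|         #    should not appear pre-cutoff. Misdated or paraphrased
|         #    leaks still slip past this, which is the hard part.
|         leaks = [k for k in key_terms
|                  if any(k.lower() in doc.lower() for doc in docs)]
|         if leaks:
|             raise ValueError(f"possible contamination: {leaks}")
|
|         # 3. Train on the filtered corpus and probe.
|         model = train(docs)
|         return ask(model, question)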
| carsoon wrote:
| Using old models is a good way to receive less biased information
| about an active event. Once a major event occurs, information wars
| break out that try to change narratives and erase old information.
| But because these models were trained before the event, the bias
| it causes is not yet present in them.
| lukev wrote:
| I'm sorry, I don't quite follow... how can a model provide any
| information about events that happened after it was trained?
| carsoon wrote:
| We need a Library of Alexandria for primary sources. If we had
| source transparency, then referencing back to original sources
| would be clearer. We could do cool things like these vintage
| models to reduce bias from current events. Books in every
| language, and books for teaching each language, would also help
| with multilingual capability. Copyright makes it difficult to
| achieve the best results for LLM creation and usage, though.
___________________________________________________________________
(page generated 2025-11-16 23:00 UTC)