[HN Gopher] Obituary for Cyc
___________________________________________________________________
Obituary for Cyc
Author : todsacerdoti
Score : 158 points
Date : 2025-04-08 19:13 UTC (3 hours ago)
(HTM) web link (yuxi-liu-wired.github.io)
(TXT) w3m dump (yuxi-liu-wired.github.io)
| pvg wrote:
| A big Cyc thread about a year ago
| https://news.ycombinator.com/item?id=40069298
| zitterbewegung wrote:
| You can run a version of Cyc that was released online as OpenCyc:
| https://github.com/asanchez75/opencyc . This dates from when a
| version of the system was posted on SourceForge; the GitHub repo
| has the dataset, the KB, and the inference engine. Note that it
| is written in an old version of Java.
| pfdietz wrote:
| It would be cool to try to generate the kind of "knowledge" Cyc
| contains automatically, from LLMs.
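|
| A rough sketch of what that might look like (the prompt, the
| `ask_llm` stand-in, and the assertion format are all assumptions,
| not an existing pipeline):
|
|     # Sketch: mine Cyc-style assertions from an LLM and keep only
|     # lines that already look well-formed. `ask_llm` is a stand-in
|     # for a real completion API; it returns canned text here so
|     # the demo runs.
|     import re
|
|     def ask_llm(prompt: str) -> str:
|         return ("(is-a pizza food)\n"
|                 "glue is tasty\n"
|                 "(is-a dough ingredient)")
|
|     PROMPT = ("List common-sense facts about '{topic}', one "
|               "s-expression per line, using only "
|               "(is-a <thing> <category>).")
|
|     def mine_assertions(topic: str) -> list[str]:
|         raw = ask_llm(PROMPT.format(topic=topic))
|         well_formed = re.compile(r"^\(is-a \S+ \S+\)$")
|         # Anything that doesn't parse would need human review
|         # before entering a knowledge base.
|         return [ln.strip() for ln in raw.splitlines()
|                 if well_formed.match(ln.strip())]
|
|     print(mine_assertions("pizza"))
|     # ['(is-a pizza food)', '(is-a dough ingredient)']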
| eob wrote:
| Or vice versa - perhaps some subset of the "thought chains" of
| Cyc's inference system could be useful training data for LLMs.
| euroderf wrote:
| When I first learned about LLMs, what came to mind is some
| sort of "meeting of the minds" with Cyc. 'Twas not to be,
| apparently.
| imglorp wrote:
| I view Cyc's role there as a RAG for common sense
| reasoning. It might prevent models from advising glue on
| pizza.
|
|     (is-a 'pizza 'food)
|     (not (is-a 'glue 'food))
|     (for-all i ingredients (assert-is-a i 'food))
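|
| A minimal sketch of such a check in Python (the tiny KB and the
| helper names are invented for illustration):
|
|     # Toy "common sense" filter: flag any recipe ingredient that
|     # is not known to be a food. KB contents are made up.
|     KB_IS_A = {
|         ("pizza", "food"),
|         ("cheese", "food"),
|         ("tomato", "food"),
|         ("glue", "adhesive"),
|     }
|
|     def is_a(thing, category):
|         return (thing, category) in KB_IS_A
|
|     def bad_ingredients(ingredients):
|         # Violations of (for-all i ingredients (is-a i 'food)).
|         return [i for i in ingredients if not is_a(i, "food")]
|
|     print(bad_ingredients(["cheese", "tomato", "glue"]))  # ['glue']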
| jes5199 wrote:
| Sure, but the bigger models don't make these trivial
| mistakes, and I'm not sure that translating the LLM's English
| sentences into Lisp and trying to check them is going to
| be more accurate than just training the models better.
| vannevar wrote:
| I would argue that Lenat was at least directionally correct in
| understanding that sheer volume of data (in Cyc's case, rules and
| facts) was the key in eventually achieving useful intelligence. I
| have to confess that I once criticized the Cyc project for
| creating an ever-larger pile of sh*t and expecting a pony to
| emerge, but that's sort of what has happened with LLMs.
| cmrdporcupine wrote:
| I suspect at some point the pendulum will again swing back the
| other way and symbolic approaches will have some kind of
| breakthrough and become trendy again. And, I bet it will likely
| have something to do with accelerating these systems with
| hardware, much like GPUs have done for neural networks, in
| order to crunch really large quantities of facts
| whiplash451 wrote:
| Or maybe program synthesis combined with LLMs might be the way?
| luma wrote:
| The Bitter Lesson has a few things to say about this.
|
| http://www.incompleteideas.net/IncIdeas/BitterLesson.html
| kevin_thibedeau wrote:
| Real AGI will need a way to reason about factual knowledge.
| An ontology is a useful framework for establishing facts
| without inferring them from messy human language.
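|
| For instance, even a toy ontology lets you establish a fact by
| walking a type hierarchy rather than parsing prose (the hierarchy
| below is invented for illustration):
|
|     # Toy ontology: direct is-a links plus a transitive lookup.
|     PARENTS = {
|         "margherita": "pizza",
|         "pizza": "food",
|         "food": "physical_object",
|     }
|
|     def is_a(thing, category):
|         # Follow is-a links upward until we reach the category
|         # or run out of parents.
|         while thing in PARENTS:
|             thing = PARENTS[thing]
|             if thing == category:
|                 return True
|         return False
|
|     print(is_a("margherita", "food"))  # True, via pizza -> food
|     print(is_a("glue", "food"))        # False; not in the hierarchy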
| IshKebab wrote:
| These guys are trying to combine symbolic reasoning with LLMs
| somehow: https://www.symbolica.ai/
| specialgoodness wrote:
| check out Imandra's platform for neurosymbolic AI -
| https://www.imandra.ai/
| chubot wrote:
| That's hilarious, but at least Llama was trained on libgen, an
| archive of most books and publications by humanity, no? Except
| for the ones which were not digitized I guess
|
| So there is probably a big pile of Reddit comments, twitter
| messages, and libgen and arxiv PDFs I imagine
|
| So there is some shit, but also painstakingly encoded knowledge
| (ie writing), and yeah it is miraculous that LLMs are right as
| often as they are
| ChadNauseam wrote:
| It's a miracle, but it's all thanks to the post-training.
| When you think about it, for so-called "next token predictors",
| LLMs talk in a way that almost no one actually talks, with
| perfect spelling and use of punctuation. The post-training
| somehow is able to get them to predict something along the
| lines of what a reasonably intelligent assistant with perfect
| grammar would say. LLMs are probably smarter than is exposed
| through their chat interface, since it's unlikely the post-
| training process is able to get them to impersonate the
| smartest character they'd be capable of impersonating.
| chubot wrote:
| I dunno, I actually think, say, Claude AI SOUNDS smarter than
| it is right now.
|
| It has phenomenal recall. I just asked it about
| "SmartOS", something I knew about, vaguely, in ~2012, and
| it gave me a pretty darn good answer. On that particular
| subject, I think it probably gave a better answer than
| anyone I could e-mail, call, or text right now.
|
| It was significantly more informative than Wikipedia:
| https://en.wikipedia.org/wiki/SmartOS
|
| But I still find it easy to stump it and get it to
| hallucinate, which makes it seem dumb.
|
| It is like a person with good manners and a lot of memory,
| who is extremely good at comparisons.
|
| But I would not say it is "smart" at coming up with new
| ideas or anything.
|
| I do think a key point is that a "text calculator" is doing
| a lot of work ... i.e., summarization and comparison are
| extremely useful things. They can accelerate thinking.
| baq wrote:
| https://ai-2027.com/ postulates that a good enough LLM will
| rewrite itself using rules and facts... sci-fi, but so is
| chatting with a matrix multiplication.
| josephg wrote:
| I doubt it. The human mind is a probabilistic computer, at
| every level. There's no set definition for what a chair is.
| It's fuzzy. Some things are obviously in the category, and
| some are at the periphery of it. (E.g., is a stool a chair? Is a
| log next to a campfire a chair? How about a tree stump in the
| woods? Etc.) This kind of fuzzy reasoning is the rule, not
| the exception when it comes to human intuition.
|
| There's no way to use "rules and facts" to express concepts
| like "chair" or "grass", or "face" or "justice" or really
| anything. Any project trying to use deterministic symbolic
| logic to represent the world fundamentally misunderstands
| cognition.
| veqq wrote:
| So you're just ignoring all the probabilistic and fuzzy
| Prologs, etc., which do precisely that?
| https://github.com/lab-v2/pyreason
| jgalt212 wrote:
| > The human mind is a probabilistic computer, at every
| level.
|
| Fair enough, but an airplane's wing is not very similar to
| a bird's wing.
| josephg wrote:
| That argument would hold a lot more weight if Cyc could
| fly. But as this article points out, decades of work and
| millions of dollars have utterly failed to get it off the
| ground.
| photonthug wrote:
| > There's no way to use "rules and facts" to express
| concepts like "chair" or "grass", or "face" or "justice" or
| really anything. Any project trying to use deterministic
| symbolic logic to represent the world fundamentally
| misunderstands cognition.
|
| Are you sure? In terms of theoretical foundations for AGI,
| AIXI is probabilistic, but Gödel machines are proof-based,
| and I think they'd meet the criteria for deterministic /
| symbolic. Non-monotonic and temporal logics also exist,
| where chairness exists as a concept that might be revoked
| if 2 or more legs are missing. If you really want to get
| technical, then by allowing logics with continuous time and
| changing discrete truth values you can probably manufacture
| a fuzzy logic where time isn't considered but truth/certainty
| values are continuous. Your ideas about logic might be too
| simple; it's more than just Aristotle.
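|
| As a rough illustration of continuous truth values (a toy sketch,
| not any particular system; the features and weights are made up):
|
|     # Toy fuzzy logic: "chairness" as a truth value in [0, 1].
|     def fuzzy_and(*values):
|         # Gödel t-norm: conjunction as the minimum truth value.
|         return min(values)
|
|     def chairness(has_seat, legs, has_back):
|         # Degree to which the leg count supports "chair".
|         leg_support = max(0.0, min(1.0, legs / 4))
|         return fuzzy_and(has_seat, leg_support, has_back)
|
|     print(chairness(has_seat=1.0, legs=4, has_back=1.0))  # 1.0
|     print(chairness(has_seat=1.0, legs=3, has_back=0.2))  # 0.2 (stool-ish)
|     print(chairness(has_seat=0.6, legs=0, has_back=0.0))  # 0.0 (tree stump)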
| dartharva wrote:
| > Lenat personally did not release the source code of his PhD
| project or EURISKO, remained unimpressed with open source, and
| disliked academia as much as academia disliked him. Most open
| information concerning Cyc had been deliberately removed circa
| 2015, at the moment when Cycorp pivoted to commercial
| applications.
|
| Makes one wonder how much research being open makes a
| difference to its real-world success in the current age. Cyc's
| competitors (LLMs etc.) arguably owe a lot of their success to
| open public participation. Perhaps things would have been
| different had Lenat been more open with the project?
| Legend2440 wrote:
| Probably not, tbh. The issue with Cyc is that it required huge
| amounts of manual effort to create the rules, while LLMs can
| learn their own rules from raw data.
|
| There was no machine intelligence in Cyc, just human
| intelligence.
| lenerdenator wrote:
| So more of an expert system?
| dragonwriter wrote:
| Cyc was exactly an expert system (and those were exactly as
| central an "AI" technology as LLMs are today, a few rounds
| of AI hype ago.)
| ted_dunning wrote:
| Very much so.
| moralestapia wrote:
| >while LLMs can learn their own rules from raw data
|
| Supervised vs. unsupervised, but LLMs haven't made any new
| discoveries on their own ... yet.
| giardini wrote:
| _> Lenat personally did not release the source code of his PhD
| project or EURISKO... <_
|
| Now that Lenat is dead, can his PhD project code and EURISKO
| code be released?
| ChuckMcM wrote:
| I had the funny thought that this is exactly what a sentient AI
| would write "stop looking here, there is nothing to see, move
| along." :-)
|
| I (like vannevar, apparently) didn't feel Cyc was going anywhere
| useful; there were ideas there, but not coherent enough to form a
| credible basis for even a hypothesis of how a system could be
| constructed that would embody them.
|
| I was pretty impressed by McCarthy's blocks world demo. Later he
| and a student formalized some of the rules for creating
| 'context'[1] for AI to operate within; I continue to think that
| will be crucial to solving some of the mess that LLMs create.
|
| For example, the early failures of LLMs suggesting that you could
| make salad crunchy by adding rocks were a classic context failure:
| data from the context of 'humor' and data from the context of
| 'recipes' intertwined. Because existing models have no context
| during training, there is nothing in the model that 'tunes' the
| output based on context. And you get rocks in your salad.
|
| [1]
| https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&d...
| toisanji wrote:
| I think there could be a next-generation Cyc. Current LLMs make
| too many mistakes, and grounding them with an AI ontology could
| be really interesting. I wrote more about it here:
| https://blog.jtoy.net/understanding-cyc-the-ai-database/
| mcphage wrote:
| > Cyc grew to contain approximately 30 million assertions at a
| cost of $200 million and 2,000 person-years. Yet despite Lenat's
| repeated predictions of imminent breakthrough, it never came.
|
| That seems like pretty small potatoes compared to how much has
| been spent on LLMs these days.
|
| Or to put it another way: if global funding for LLM development
| had been capped at $200m, how many of them would even exist?
| masfuerte wrote:
| It's funny, because AI companies are currently spending
| fortunes on mathematicians, physicists, chemists, software
| engineers, etc. to create good training data.
|
| Maybe this money would be better spent on creating a Lenat-
| style ontology, but I guess we'll never know.
| throwanem wrote:
| We may. LLMs are capable, even arguably at times inventive,
| but lack the ability to test against ground truth;
| ontological reasoners can never exceed the implications of
| the ground truth they're given, but within that scope reason
| perfectly. These seem like complementary strengths.
| gwern wrote:
| Language models repeatedly delivered practical, real-world
| economic value at every step of the way from at least n-grams
| on. (Remember the original 'unreasonable effectiveness of
| data'?) The applications were humble and weren't like "write
| all my code for me and then immanentize the eschaton", but they
| were real things like spelling error detection & correction,
| text compressors, voice transcription boosters, embeddings for
| information retrieval, recommenders, knowledge graph creation
| (ironically enough), machine translation services, etc. In
| contrast, Yuxi goes through the handful of described Cyc use-
| cases from their entire history, and it's not impressive.
| zozbot234 wrote:
| > That seems like pretty small potatoes compared to how much
| has been spent on LLMs these days.
|
| It seems to be a pretty high cost, at more than $6 per
| assertion ($200 million / 30 million assertions is roughly
| $6.70 each). Wikidata - the closest thing we have to a
| "backbone for the Semantic Web" - contains around 1.6 billion
| bare assertions describing 115M real-world entities, and that's
| a purely volunteer project.
| Animats wrote:
| Cyc is going great, according to the web site. "The Next
| Generation of Enterprise AI"[1]
|
| Lenat himself died in 2023. Despite this, he is listed as the
| only member of the "leadership team".[2]
|
| [1] https://cyc.com/
|
| [2] https://cyc.com/leadership-team/
| timClicks wrote:
| > The secretive nature of Cyc has multiple causes. Lenat
| personally did not release the source code of his PhD project or
| EURISKO, remained unimpressed with open source, and disliked
| academia as much as academia disliked him.
|
| One thing that's not mentioned here, but something that I took
| away from Wolfram's obituary of Lenat
| (https://writings.stephenwolfram.com/2023/09/remembering-doug...)
| was that Lenat was very easily distracted ("Could we somehow
| usefully connect [Wolfram|Alpha and the Wolfram Language] to CYC?
| ... But when I was at SXSW the next year Doug had something else
| he wanted to show me. It was a math education game.").
|
| My armchair diagnosis is untreated ADHD. He might have had
| discussing the internals of CYC on his todo list since its first
| prototype, but the draft was never ready.
| kazinator wrote:
| It is worth plodding ahead with symbolic AI research because:
|
| - much less resource hungry / planet warming
|
| - auditable chains of inference
| ted_dunning wrote:
| Of course there is still the downside that it doesn't work very
| well.
| bsder wrote:
| I think Cyc should worry that I can't tell whether this is real
| or an April Fools' joke ...
| krick wrote:
| I have nothing to say about Cyc (apart from the fact that the
| bitter lesson is bitter indeed, but has never been debunked so
| far; and, also, I hate LLMs). But this line (from the
| accompanying GitHub) deserves some attention, IMO:
|
| > Due to the lack of long-term stability of IA, I have taken the
| liberty to scrape those from IA and kept them here for safe
| keeping.
|
| I mean, he is not wrong, and when I want to preserve something
| code-related, GitHub comes to mind as a safe place... which is
| crazy, if you think about it for a moment. And the fact that we
| are starting to use GitHub as a backup for IA is almost comically
| grotesque. I don't have a solution, but it's worth remembering
| that we sure have a problem.
| thot_experiment wrote:
| What does it mean to you to hate LLMs?
| ted_dunning wrote:
| Have you ever not hated LLMs?
| gibsonf1 wrote:
| Yep, they completely missed the boat. They tried to use concepts
| without actually modeling concepts, making a huge mess of
| contradicting statements which actually didn't model the world.
| Using a word in a statement does not a concept make!
| photonthug wrote:
| I enjoyed this read and agree Lenat was a grifter, which is easy
| to see based on contracts and closed source. But I dislike how
| the article seems tilted towards a hit piece against search,
| heuristics, reasoning, symbolic approaches in AI, and even
| _striving_ for explainable/understandable systems. It's a
| subtext throughout, so perhaps I'm misinterpreting it... but the
| neats vs the scruffies thing is just not really productive, and
| there seems to be no real reason for the "either/or" mentality.
|
| To put some of this into starker contrast... 40 years, 200 million
| dollars, and broken promises is the cost burned on something
| besides ML? Wait, isn't the current approach burning that kind of
| cash in a weekend, and aren't we proudly backdating deep learning
| to ~1960 every time someone calls it "new"? Is a huge volume of
| inscrutable weights, with unknown sources, generated at huge
| cost, really "better" than closed source in terms of
| transparency? Are we not very busy building agents and critics
| very much like Minsky's society of mind while we shake our heads
| and say he was wrong?
|
| This write-up also appears to me as if it were kind of punching
| down. A "hostile assessment" in an "obituary" is certainly easy
| in hindsight, especially if business is booming in your
| (currently very popular) neighborhood. If you don't want to
| punch down, and you really want to go on record as saying
| logic/search are completely dead-ended and AGI won't ever touch
| the stuff... it would probably look more like critiquing
| symbolica.ai, saying that nothing like scallop-lang / pyreason
| will ever find any use-cases, etc.
| anon291 wrote:
| > even striving for explainable/understandable systems
|
| It's been some 6,000-8,000 years since the advent of writing and
| we still cannot explain or understand human intelligence, and
| yet we expect to be able to understand a machine that is close
| to or surpasses human intelligence? Isn't the premise
| fundamentally flawed?
| photonthug wrote:
| I think I'd remain interested in more conclusive proof one
| way or the other, since by your logic everything that's
| currently unknown is unknowable.
|
| Regardless of whether the project of explainable /
| understandable succeeds though, everyone should agree it's a
| worthy goal. Unless, that is, you like the idea of stock
| markets and resource planning for cities and whole societies
| under the control of technology that's literally
| indistinguishable from oracles speaking to a whispering wind.
| I'd prefer that someone else is able to hear/understand/check
| their math or their arguments. Speaking of 6,000-8,000 years
| since something happened, oracles and mystical crap like that
| should be forgotten relics of a bygone era rather than an
| explicit goal for the future.
| lambdaone wrote:
| In spite of all of this, Cycorp is still in business, and has
| pivoted to healthcare automation, including, apparently,
| insurance denials. I wonder if the full Cyc knowledge base will
| ever end up being released to the public domain, or whether it
| will simply fade away into nonexistence as proprietary data.
| ted_dunning wrote:
| Lenat beat Musk at his own game. Musk has only been promising
| "full self driving next year" for 10 years.
|
| Doug Lenat managed to make similar hopeless promises for nearly a
| half century.
| ein0p wrote:
| Cyc would have to model conditional distributions on a massive
| scale in order to be practical and useful. Coincidentally,
| P(t_n | t_{n-1}, ..., t_0) is exactly what LLMs model.
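|
| In code terms, that conditional is just a normalized score over
| the vocabulary given the prefix (a toy illustration; the
| vocabulary and scores are made up):
|
|     import math
|
|     # Toy next-token distribution P(t_n | t_{n-1}, ..., t_0): the
|     # model maps the prefix to one score per vocabulary item, and
|     # a softmax turns the scores into probabilities. Scores here
|     # are invented.
|     SCORES = {"food": 2.0, "cheese": 1.5, "glue": -3.0}
|
|     def next_token_probs(scores):
|         z = sum(math.exp(s) for s in scores.values())
|         return {tok: math.exp(s) / z for tok, s in scores.items()}
|
|     print(next_token_probs(SCORES))
|     # {'food': ~0.62, 'cheese': ~0.38, 'glue': ~0.004}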
___________________________________________________________________
(page generated 2025-04-08 23:00 UTC)