[HN Gopher] Obituary for Cyc
       ___________________________________________________________________
        
       Obituary for Cyc
        
       Author : todsacerdoti
       Score  : 158 points
       Date   : 2025-04-08 19:13 UTC (3 hours ago)
        
 (HTM) web link (yuxi-liu-wired.github.io)
 (TXT) w3m dump (yuxi-liu-wired.github.io)
        
       | pvg wrote:
       | A big Cyc thread about a year ago
       | https://news.ycombinator.com/item?id=40069298
        
       | zitterbewegung wrote:
        | You can run a version of Cyc that was released online as
        | OpenCyc: https://github.com/asanchez75/opencyc . A version of
        | the system was posted on SourceForge, and this GitHub repo
        | carries the dataset, the KB, and the inference engine. Note
        | that it is written in an old version of Java.
        
       | pfdietz wrote:
        | It would be cool to try to generate Cyc-style "knowledge"
        | automatically from LLMs.
        
         | eob wrote:
         | Or vice versa - perhaps some subset of the "thought chains" of
         | Cyc's inference system could be useful training data for LLMs.
        
           | euroderf wrote:
           | When I first learned about LLMs, what came to mind is some
           | sort of "meeting of the minds" with Cyc. 'Twas not to be,
           | apparently.
        
             | imglorp wrote:
             | I view Cyc's role there as a RAG for common sense
             | reasoning. It might prevent models from advising glue on
              | pizza.
              | 
              |     (is-a 'pizza 'food)
              |     (not (is-a 'glue 'food))
              |     (for-all i ingredients (assert-is-a i 'food))
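              | 
              | A rough sketch of that kind of check in Python (the kb
              | dict and is_a helper here are invented for illustration,
              | not Cyc's actual API):
              | 
              |     # Toy ontology: each term maps to its parents.
              |     kb = {
              |         "pizza": {"food"},
              |         "mozzarella": {"cheese"},
              |         "cheese": {"food"},
              |         "glue": {"adhesive"},
              |     }
              | 
              |     def is_a(term, kind):
              |         # Walk parent links transitively.
              |         seen, todo = set(), [term]
              |         while todo:
              |             t = todo.pop()
              |             if t == kind:
              |                 return True
              |             if t in seen:
              |                 continue
              |             seen.add(t)
              |             todo.extend(kb.get(t, ()))
              |         return False
              | 
              |     def check_ingredients(items):
              |         # Flag anything the KB cannot prove is a food.
              |         return [i for i in items if not is_a(i, "food")]
              | 
              |     print(check_ingredients(["mozzarella", "glue"]))
              |     # -> ['glue']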
        
               | jes5199 wrote:
                | Sure, but the bigger models don't make these trivial
                | mistakes, and I'm not sure that translating the LLM's
                | English sentences into Lisp and trying to check them is
                | going to be more accurate than just training the models
                | better.
        
       | vannevar wrote:
       | I would argue that Lenat was at least directionally correct in
       | understanding that sheer volume of data (in Cyc's case, rules and
        | facts) was the key to eventually achieving useful intelligence. I
       | have to confess that I once criticized the Cyc project for
       | creating an ever-larger pile of sh*t and expecting a pony to
       | emerge, but that's sort of what has happened with LLMs.
        
         | cmrdporcupine wrote:
         | I suspect at some point the pendulum will again swing back the
         | other way and symbolic approaches will have some kind of
          | breakthrough and become trendy again. And I bet it will
          | likely have something to do with accelerating these systems
          | with hardware, much like GPUs have done for neural
          | networks, in order to crunch really large quantities of
          | facts.
        
           | whiplash451 wrote:
            | Or maybe program synthesis combined with LLMs might be the
            | way?
        
           | luma wrote:
           | The Bitter Lesson has a few things to say about this.
           | 
           | http://www.incompleteideas.net/IncIdeas/BitterLesson.html
        
           | kevin_thibedeau wrote:
           | Real AGI will need a way to reason about factual knowledge.
           | An ontology is a useful framework for establishing facts
           | without inferring them from messy human language.
        
           | IshKebab wrote:
           | These guys are trying to combine symbolic reasoning with LLMs
           | somehow: https://www.symbolica.ai/
        
             | specialgoodness wrote:
              | Check out Imandra's platform for neurosymbolic AI:
              | https://www.imandra.ai/
        
         | chubot wrote:
          | That's hilarious, but at least Llama was trained on libgen,
          | an archive of most books and publications by humanity, no?
          | Except for the ones that were not digitized, I guess.
          | 
          | So there is probably a big pile of Reddit comments, Twitter
          | messages, and libgen and arXiv PDFs in there, I imagine.
          | 
          | So there is some shit, but also painstakingly encoded
          | knowledge (i.e. writing), and yeah, it is miraculous that
          | LLMs are right as often as they are.
        
           | ChadNauseam wrote:
           | It's a miracle, but it's all thanks to the post-training.
            | When you think about it, for so-called "next token
            | predictors", LLMs talk in a way that almost no one actually
            | talks, with perfect spelling and use of punctuation. The
            | post-training somehow is able to get them to predict
            | something along the lines of what a reasonably intelligent
            | assistant with perfect grammar would say. LLMs are probably
            | smarter than is exposed
           | through their chat interface, since it's unlikely the post-
           | training process is able to get them to impersonate the
           | smartest character they'd be capable of impersonating.
        
             | chubot wrote:
              | I dunno, I actually think, say, Claude AI SOUNDS smarter
              | than it is right now.
              | 
              | It has phenomenal recall. I just asked it about
              | "SmartOS", something I knew about, vaguely, in ~2012,
              | and it gave me a pretty darn good answer. On that
              | particular subject, I think it probably gave a better
              | answer than anyone I could e-mail, call, or text right
              | now.
              | 
              | It was significantly more informative than Wikipedia:
              | https://en.wikipedia.org/wiki/SmartOS
              | 
              | But I still find it easy to stump it and get it to
              | hallucinate, which makes it seem dumb.
              | 
              | It is like a person with good manners and a lot of
              | memory who is extremely good at comparisons.
              | 
              | But I would not say it is "smart" at coming up with new
              | ideas or anything.
              | 
              | I do think a key point is that a "text calculator" is
              | doing a lot of work... i.e. summarization and comparison
              | are extremely useful things. They can accelerate
              | thinking.
        
         | baq wrote:
         | https://ai-2027.com/ postulates that a good enough LLM will
         | rewrite itself using rules and facts... sci-fi, but so is
         | chatting with a matrix multiplication.
        
           | josephg wrote:
           | I doubt it. The human mind is a probabilistic computer, at
           | every level. There's no set definition for what a chair is.
           | It's fuzzy. Some things are obviously in the category, and
           | some are at the periphery of it. (Eg is a stool a chair? Is a
           | log next to a campfire a chair? How about a tree stump in the
           | woods? Etc). This kind of fuzzy reasoning is the rule, not
           | the exception when it comes to human intuition.
           | 
           | There's no way to use "rules and facts" to express concepts
           | like "chair" or "grass", or "face" or "justice" or really
           | anything. Any project trying to use deterministic symbolic
           | logic to represent the world fundamentally misunderstands
           | cognition.
        
             | veqq wrote:
              | So you're just ignoring all the probabilistic, fuzzy,
              | etc. Prologs, which do precisely that?
              | https://github.com/lab-v2/pyreason
        
             | jgalt212 wrote:
             | > The human mind is a probabilistic computer, at every
             | level.
             | 
             | Fair enough, but an airplane's wing is not very similar to
             | a bird's wing.
        
               | josephg wrote:
               | That argument would hold a lot more weight if Cyc could
               | fly. But as this article points out, decades of work and
               | millions of dollars have utterly failed to get it off the
               | ground.
        
             | photonthug wrote:
             | > There's no way to use "rules and facts" to express
             | concepts like "chair" or "grass", or "face" or "justice" or
             | really anything. Any project trying to use deterministic
             | symbolic logic to represent the world fundamentally
             | misunderstands cognition.
             | 
              | Are you sure? In terms of theoretical foundations for
              | AGI, AIXI is probabilistic, but Gödel machines are
              | proof-based, and I think they'd meet the criteria for
              | deterministic / symbolic. Non-monotonic and temporal
              | logics also exist, where chairness is a concept that
              | might be revoked if 2 or more legs are missing. If you
              | really want to get technical, by allowing logics with
              | continuous time and changing discrete truth values, you
              | can probably manufacture a fuzzy logic where time isn't
              | considered but truth/certainty values are continuous.
              | Your ideas about logic might be too simple; it's more
              | than just Aristotle.
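              | 
              | To make "continuous truth values" concrete, here is a
              | toy fuzzy-logic sketch in Python (the membership
              | degrees are invented just for the example):
              | 
              |     # Fuzzy truth values live in [0, 1]; the usual
              |     # connectives become min / max.
              |     def f_and(a, b): return min(a, b)
              |     def f_or(a, b): return max(a, b)
              | 
              |     # Invented membership degrees.
              |     sittable = {"chair": 1.0, "stool": 0.9,
              |                 "log": 0.6, "stump": 0.5}
              |     has_back = {"chair": 1.0, "stool": 0.1,
              |                 "log": 0.0, "stump": 0.0}
              | 
              |     for x in ["chair", "stool", "log", "stump"]:
              |         # "comfy chair" = sittable AND has a back
              |         comfy = f_and(sittable[x], has_back[x])
              |         # "chair-ish" = sittable OR has a back
              |         chairish = f_or(sittable[x], has_back[x])
              |         print(x, comfy, chairish)
              |     # e.g. stool -> comfy 0.1, chair-ish 0.9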
        
       | dartharva wrote:
       | > Lenat personally did not release the source code of his PhD
       | project or EURISKO, remained unimpressed with open source, and
       | disliked academia as much as academia disliked him. Most open
       | information concerning Cyc had been deliberately removed circa
       | 2015, at the moment when Cycorp pivoted to commercial
       | applications.
       | 
        | Makes one wonder how much difference being open makes to a
        | research project's real-world success in the current age. Cyc's
        | competitors (LLMs etc.) arguably owe a lot of their success to
        | open public participation. Perhaps things would have been
        | different had Lenat been more open with the project?
        
         | Legend2440 wrote:
         | Probably not, tbh. The issue with Cyc is that it required huge
         | amounts of manual effort to create the rules, while LLMs can
         | learn their own rules from raw data.
         | 
         | There was no machine intelligence in Cyc, just human
         | intelligence.
        
           | lenerdenator wrote:
           | So more of an expert system?
        
             | dragonwriter wrote:
              | Cyc was exactly an expert system (and those were exactly
              | as central an "AI" technology as LLMs are today, a few
              | rounds of AI hype ago).
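              | 
              | For anyone who hasn't met one: an expert system is
              | basically if-then rules run by an inference engine
              | over a fact base. A toy forward-chaining sketch in
              | Python (rules and facts invented here, nothing from
              | Cyc's actual KB), which also records the chain of
              | inferences it makes:
              | 
              |     # A rule fires when all its premises are known,
              |     # adding its conclusion as a new fact.
              |     facts = {"isa(fido, dog)", "isa(dog, mammal)"}
              |     rules = [
              |         ({"isa(fido, dog)", "isa(dog, mammal)"},
              |          "isa(fido, mammal)"),
              |         ({"isa(fido, mammal)"},
              |          "warm_blooded(fido)"),
              |     ]
              | 
              |     # Forward chaining: fire rules until nothing new
              |     # is derived; keep the chain so it is auditable.
              |     chain = []
              |     changed = True
              |     while changed:
              |         changed = False
              |         for prem, concl in rules:
              |             if prem <= facts and concl not in facts:
              |                 facts.add(concl)
              |                 chain.append((prem, concl))
              |                 changed = True
              | 
              |     for prem, concl in chain:
              |         print(sorted(prem), "=>", concl)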
        
             | ted_dunning wrote:
             | Very much so.
        
           | moralestapia wrote:
            | > while LLMs can learn their own rules from raw data
           | 
           | Supervised vs. unsupervised, but LLMs haven't made any new
           | discoveries on their own ... yet.
        
         | giardini wrote:
         | _> Lenat personally did not release the source code of his PhD
         | project or EURISKO... <_
         | 
         | Now that Lenat is dead, can his PhD project code and EURISKO
         | code be released?
        
       | ChuckMcM wrote:
        | I had the funny thought that this is exactly what a sentient AI
        | would write: "stop looking here, there is nothing to see, move
        | along." :-)
        | 
        | I (like vannevar apparently) didn't feel Cyc was going anywhere
        | useful; there were ideas there, but not coherent enough to form a
       | credible basis for even a hypothesis of how a system could be
       | constructed that would embody them.
       | 
        | I was pretty impressed by McCarthy's blocks-world demo. Later he
        | and a student formalized some of the rules for creating
        | 'context'[1] for AI to operate within, and I continue to think
        | that will be crucial to solving some of the mess that LLMs
        | create.
       | 
        | For example, the early failure of LLMs suggesting that you could
        | make salad crunchy by adding rocks was a classic context failure:
       | data from the context of 'humor' and data from the context of
       | 'recipes' intertwined. Because existing models have no context
       | during training, there is nothing in the model that 'tunes' the
       | output based on context. And you get rocks in your salad.
       | 
       | [1]
       | https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&d...
        
       | toisanji wrote:
        | I think there could be a next-generation Cyc. Current LLMs make
        | too many mistakes, and grounding them with an AI ontology could
        | be really interesting. I wrote more about it here:
       | https://blog.jtoy.net/understanding-cyc-the-ai-database/
        
       | mcphage wrote:
       | > Cyc grew to contain approximately 30 million assertions at a
       | cost of $200 million and 2,000 person-years. Yet despite Lenat's
       | repeated predictions of imminent breakthrough, it never came.
       | 
       | That seems like pretty small potatoes compared to how much has
       | been spent on LLMs these days.
       | 
       | Or to put it another way: if global funding for LLM development
       | had been capped at $200m, how many of them would even exist?
        
         | masfuerte wrote:
         | It's funny, because AI companies are currently spending
         | fortunes on mathematicians, physicists, chemists, software
         | engineers, etc. to create good training data.
         | 
         | Maybe this money would be better spent on creating a Lenat-
         | style ontology, but I guess we'll never know.
        
           | throwanem wrote:
           | We may. LLMs are capable, even arguably at times inventive,
           | but lack the ability to test against ground truth;
           | ontological reasoners can never exceed the implications of
           | the ground truth they're given, but within that scope reason
           | perfectly. These seem like complementary strengths.
        
         | gwern wrote:
         | Language models repeatedly delivered practical, real-world
         | economic value at every step of the way from at least n-grams
         | on. (Remember the original 'unreasonable effectiveness of
         | data'?) The applications were humble and weren't like "write
         | all my code for me and then immanentize the eschaton", but they
         | were real things like spelling error detection & correction,
         | text compressors, voice transcription boosters, embeddings for
         | information retrieval, recommenders, knowledge graph creation
         | (ironically enough), machine translation services, etc. In
         | contrast, Yuxi goes through the handful of described Cyc use-
         | cases from their entire history, and it's not impressive.
        
         | zozbot234 wrote:
         | > That seems like pretty small potatoes compared to how much
         | has been spent on LLMs these days.
         | 
          | It seems to be a pretty high cost, at more than $6 per
          | assertion. Wikidata - the closest thing we have to a "backbone
          | for the Semantic Web" - contains around 1.6G bare assertions
          | describing 115M real-world entities, and that's a purely
          | volunteer project.
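          | 
          | Back-of-the-envelope, using the figures quoted upthread (a
          | quick Python check, nothing more):
          | 
          |     cost = 200_000_000          # dollars
          |     asserts = 30_000_000
          |     wd_claims = 1_600_000_000
          |     wd_items = 115_000_000
          | 
          |     print(cost / asserts)       # ~6.67 dollars/assertion
          |     print(wd_claims / wd_items) # ~13.9 claims per entity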
        
       | Animats wrote:
       | Cyc is going great, according to the web site. "The Next
       | Generation of Enterprise AI"[1]
       | 
       | Lenat himself died in 2023. Despite this, he is listed as the
       | only member of the "leadership team".[2]
       | 
       | [1] https://cyc.com/
       | 
       | [2] https://cyc.com/leadership-team/
        
       | timClicks wrote:
       | > The secretive nature of Cyc has multiple causes. Lenat
       | personally did not release the source code of his PhD project or
       | EURISKO, remained unimpressed with open source, and disliked
       | academia as much as academia disliked him.
       | 
       | One thing that's not mentioned here, but something that I took
       | away from Wolfram's obituary of Lenat
       | (https://writings.stephenwolfram.com/2023/09/remembering-doug...)
       | was that Lenat was very easily distracted ("Could we somehow
       | usefully connect [Wolfram|Alpha and the Wolfram Language] to CYC?
       | ... But when I was at SXSW the next year Doug had something else
       | he wanted to show me. It was a math education game.").
       | 
        | My armchair diagnosis is untreated ADHD. He might have had
        | discussing the internals of CYC on his todo list since its first
        | prototype, but the draft was never ready.
        
       | kazinator wrote:
       | It is worth plodding ahead with symbolic AI research because:
       | 
       | - much less resource hungry / planet warming
       | 
       | - auditable chains of inference
        
         | ted_dunning wrote:
         | Of course there is still the downside that it doesn't work very
         | well.
        
       | bsder wrote:
        | I think Cyc should worry that I can't tell whether this is real
        | or an April Fools' joke...
        
       | krick wrote:
        | I have nothing to say about Cyc (apart from the fact that the
        | bitter lesson is bitter indeed, but has never been debunked so
        | far; and, also, I hate LLMs). But this line (from the
        | accompanying GitHub) deserves some attention, IMO:
       | 
       | > Due to the lack of long-term stability of IA, I have taken the
       | liberty to scrape those from IA and kept them here for safe
       | keeping.
       | 
        | I mean, he is not wrong, and when I want to preserve something
        | code-related, GitHub comes to mind as a safe place... which is
        | crazy, if you think about it for a moment. And the fact that we
        | are starting to use GitHub as a backup for IA is almost comically
        | grotesque. I don't have a solution, but it's worth remembering
        | that we sure have a problem.
        
         | thot_experiment wrote:
         | What does it mean to you to hate LLMs?
        
           | ted_dunning wrote:
           | Have you ever not hated LLMs?
        
       | gibsonf1 wrote:
       | Yep, they completely missed the boat. They tried to use concepts
       | without actually modeling concepts, making a huge mess of
       | contradicting statements which actually didn't model the world.
       | Using a word in a statement does not a concept make!
        
       | photonthug wrote:
       | I enjoyed this read and agree Lenat was a grifter, which is easy
       | to see based on contracts and closed source. But I dislike how
       | the article seems tilted towards a hit piece against search,
       | heuristics, reasoning, symbolic approaches in AI, and even
        | _striving_ for explainable/understandable systems. It's a
        | subtext throughout, so perhaps I'm misinterpreting it... but the
        | neats vs. the scruffies thing is just not really productive, and
       | there seems to be no real reason for the "either/or" mentality.
       | 
        | To put some of this into starker contrast... 40 years, 200 million
       | dollars, and broken promises is the cost burned on something
        | besides ML? Wait, isn't the current approach burning that kind of
       | cash in a weekend, and aren't we proudly backdating deep-learning
       | to ~1960 every time someone calls it "new"? Is a huge volume of
       | inscrutable weights, with unknown sources, generated at huge
       | costs, really "better" than closed-source in terms of
       | transparency? Are we not very busy building agents and critics
        | very much like Minsky's Society of Mind while we shake our heads
       | and say he was wrong?
       | 
       | This write-up also appears to me as if it were kind of punching
       | down. A "hostile assessment" in an "obituary" is certainly easy
       | in hindsight, especially if business is booming in your
        | (currently very popular) neighborhood. If you didn't want to
        | punch down, and really wanted to go on record as saying
        | logic/search are completely dead-ended and AGI won't ever touch
        | the stuff... it would probably look more like critiquing
       | symbolica.ai, saying that nothing like scallop-lang / pyreason
       | will ever find any use-cases, etc.
        
         | anon291 wrote:
         | > even striving for explainable/understandable systems
         | 
          | It's been some 6,000-8,000 years since the advent of writing,
          | and we still cannot explain or understand human intelligence,
          | and yet we expect to be able to understand a machine that is
          | close to or surpasses human intelligence? Isn't the premise
          | fundamentally flawed?
        
           | photonthug wrote:
           | I think I'd remain interested in more conclusive proof one
           | way or the other, since by your logic everything that's
           | currently unknown is unknowable.
           | 
           | Regardless of whether the project of explainable /
           | understandable succeeds though, everyone should agree it's a
            | worthy goal. Unless, that is, you like the idea of stock
            | markets, resource planning for cities, and whole societies
            | under the control of technology that's literally
            | indistinguishable from oracles speaking to a whispering
            | wind. I'd prefer someone else be able to
            | hear/understand/check their math or their arguments.
            | Speaking of 6,000-8,000 years since something happened,
            | oracles and mystical crap like that should be forgotten
            | relics of a bygone era rather than an explicit goal for
            | the future.
        
       | lambdaone wrote:
        | In spite of all of this, Cycorp is still in business, and has
        | pivoted to healthcare automation, including, apparently,
        | insurance denials. I wonder if the full Cyc knowledge base will
        | ever end up being released into the public domain, or whether it
        | will simply fade away into nonexistence as proprietary data.
        
       | ted_dunning wrote:
       | Lenat beat Musk at his own game. Musk has only been promising
       | "full self driving next year" for 10 years.
       | 
       | Doug Lenat managed to make similar hopeless promises for nearly a
       | half century.
        
       | ein0p wrote:
       | Cyc would have to model conditional distributions on a massive
        | scale in order to be practical and useful. Coincidentally,
        | P(t_n | t_(n-1), ..., t_0) is exactly what LLMs model.
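        | 
        | For concreteness, a toy bigram estimate of that conditional
        | distribution in Python (tiny invented corpus; a real LLM
        | conditions on the whole prefix with a network, not counts):
        | 
        |     from collections import Counter, defaultdict
        | 
        |     # Invented toy corpus.
        |     corpus = "the cat sat on the mat the cat ate".split()
        | 
        |     # Count next-token frequencies per previous token.
        |     counts = defaultdict(Counter)
        |     for prev, nxt in zip(corpus, corpus[1:]):
        |         counts[prev][nxt] += 1
        | 
        |     def p(nxt, prev):
        |         # Estimate P(t_n | t_(n-1)) from the counts.
        |         total = sum(counts[prev].values())
        |         return counts[prev][nxt] / total
        | 
        |     print(p("cat", "the"))  # 2/3: "the" -> "cat" twice
        |     print(p("mat", "the"))  # 1/3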
        
       ___________________________________________________________________
       (page generated 2025-04-08 23:00 UTC)