[HN Gopher] Will scaling work?
       ___________________________________________________________________
        
       Will scaling work?
        
       Author : saliagato
       Score  : 188 points
       Date   : 2023-12-27 12:54 UTC (10 hours ago)
        
 (HTM) web link (www.dwarkeshpatel.com)
 (TXT) w3m dump (www.dwarkeshpatel.com)
        
       | hokeone wrote:
       | >Furthermore, the fact that LLMs seem to need such a stupendous
       | amount of data to get such mediocre reasoning indicates that they
       | simply are not generalizing. If these models can't get anywhere
       | close to human level performance with the data a human would see
       | in 20,000 years, we should entertain the possibility that
        | 2,000,000,000 years worth of data will also be insufficient.
       | There's no amount of jet fuel you can add to an airplane to make
       | it reach the moon.
       | 
       | Never thought about it in this sense. Is he wrong?
        
         | gchamonlive wrote:
         | I don't think he is wrong. I also don't think the goal of LLMs
         | is to reproduce human intelligence. That is, we don't need
          | human-like intelligence in a box for a tool to be useful. So
         | this assertion could be right and still miss the point of this
         | tech in my opinion.
         | 
         | Edit: to expand, if the goal is AGI then yes we need all the
         | help we can get. But even so, AGI is in a totally different
         | league compared to human intelligence, they might as well be a
         | different species.
        
           | jpk wrote:
           | The context of the fine article is scaling LLMs into AGI.
           | It's not about whether the tool is useful or not, as
           | usefulness is a threshold well before AGI. Some folks are
           | spooked that LLMs are a few optimizations away from the
           | singularity, and the article just discusses some reasons why
           | that probably isn't the case.
        
             | gchamonlive wrote:
             | The article is really good! I was responding to "is he
             | wrong" part of the comment, not the article itself.
        
           | tremarley wrote:
           | We don't need human-like intelligence in a box for a tool to
            | be useful, but human-like intelligence is what many companies
            | are spending billions to try and achieve.
        
           | dartos wrote:
           | This. I don't think LLMs are anywhere near sci-fi AGI (think
            | I, Robot). It's such a vague term anyway, AGI.
           | 
           | LLMs provide some really nice text generation, summarization,
           | and outstanding semantic search. It's drop dead easy to make
           | a natural language interface to anything now.
           | 
            | That's a big deal. That's what's going to give this tech its
           | longevity, imo.
        
         | gitfan86 wrote:
         | Over the past year there have been advances in making models
         | smaller while keeping performance high.
         | 
          | So if that continues then he is wrong, unless he is defining
          | LLMs in a strict way that does not include new improvements in
          | the future.
        
         | barrenko wrote:
         | he's not wrong, and yet he's not right.
        
         | az226 wrote:
         | LLMs are closer to discoveries on the spectrum than inventions.
         | Nobody predicted or planned the many emergent capabilities
         | we've seen. Almost like magic. Now is a period of moving along
          | the axis toward invention, with much intentional design,
         | architecture, and feature development alongside testing and
         | evaluation. We are far from done with LLMs, plenty of room for
         | many more discoveries, lots to explore. It's definitely a
         | precursor to AGI. They offer a platform to build and scale data
         | sets and test beds.
         | 
         | We haven't had ML models this large before. There's innovation
         | in architecture but we often come back to the bitter lesson:
         | more data.
         | 
         | We're likely going to see experimentation with language models
         | to learn from few examples. Fine tuning pretrained LLMs shows
         | they have quite a remarkable ability to learn from few
         | examples.
         | 
         | Liquid AI has a new learning architecture for dynamic learning
         | and much smaller models.
         | 
          | Some people seem mad about the bitter lesson: they want their
          | models based on human-designed features to work better, but so
          | far more data usually wins.
         | 
         | I think the next evolution here is in increasing the quality of
         | the training data and giving it more structure. I suspect the
         | right setup can seed emergent capabilities.
        
           | gitfan86 wrote:
           | The trick is to make many LLMs work together in feedback
            | loops. Some small, some big.
           | 
           | That will get us to what was previously known as AGI. The
           | definition of AGI will change, but we will have systems that
            | outperform humans in most ways.
        
             | nnoremap wrote:
             | Isaiah 7:14 (NIV): "Therefore the Lord himself will give
             | you a sign: The virgin will conceive and give birth to a
             | son, and will call him Immanuel."
        
           | beardedwizard wrote:
           | > It's definitely a precursor to AGI.
           | 
           | What are you basing this claim on? There is no intelligence
           | in an LLM, only humans fooled by randomness.
        
             | pmontra wrote:
             | Maybe we've been fooling each other since forever too.
             | 
             | However whatever we're doing seems to be different from
             | what LLMs do, at least because of the huge difference in
             | how we train.
             | 
             | It's possible that it will end up like airplanes and birds.
             | Airplanes can bring us to the other side of the world in a
             | day by burning a lot of fuel. Birds can get there too in a
             | much longer time and more cheaply. They can also land on a
             | branch of a tree. Airplanes can't and it's too risky for
             | drones.
        
             | blackoil wrote:
             | > only humans fooled by randomness
             | 
             | Is there another kind?
        
             | MeImCounting wrote:
             | This is such an interesting take. What do you classify as
             | intelligence?
             | 
              | From my perspective there's intelligence in a how-to manual.
             | 
             | It seems like maybe you mean consciousness? Or creativity?
        
           | nsagent wrote:
           | You might want to reconsider your stance on emergent
           | abilities in LLMs considering the NeurIPS 2023 best paper
           | winner is titled:
           | 
           | "Are Emergent Abilities of Large Language Models a Mirage?"
           | 
           | https://arxiv.org/abs/2304.15004
           | https://blog.neurips.cc/2023/12/11/announcing-the-
           | neurips-20...
        
         | auggierose wrote:
         | And yet, we reached the moon, and I would say airplanes were a
         | necessary step on the way, even if only for psychological
         | reasons. For airplanes we had at least an example in nature,
         | birds. But I am not aware of any animal that travelled from
         | earth to the moon on its own, except us.
        
           | manojlds wrote:
           | We are talking of LLMs, not whether we will be able to reach
           | AGI or not.
        
             | ImHereToVote wrote:
             | Airplanes in this analogy are essentially the collection of
             | matrix multiplications that emulate reasoning in a very
             | rough but useful manner in an LLM.
             | 
             | It's unclear whether a rocket ship is a multimodal neural
              | net, or some sort of swarm of LLMs in an adversarial
              | relationship, or something completely novel. Regardless, we
              | might be as far from LLMs to ASIs as airplanes are from
              | rocket ships. Or not.
        
           | Eddy_Viscosity2 wrote:
           | But we didn't use airplanes to get there. It needed a new
           | approach, different propulsion, different fuel, different
           | attitude control, etc. etc.
           | 
           | LLM may be a necessary step to get to AGI, but it (probably)
           | won't be the one that achieves that goal.
        
             | auggierose wrote:
             | I doubt that LLMs will give us AGI. But they have already
             | given us more intelligence from a computer than I would
             | have imagined to see during my lifetime.
        
           | syndacks wrote:
           | Sorry, but what's with HN's obsession with analogies? You see
           | this in almost every comment section where someone tries to
           | dis/prove a point using an analogy. I get the allure but it's
           | intellectually brittle; by the time someone starts to argue
            | off the second or third iteration of the original analogy,
            | the forest has been lost for the trees.
        
           | Jensson wrote:
           | What if LLMs are hot air balloons of flight, or kites of
           | flight? Kites and hot air balloons didn't really lead to
           | getting to the moon, they are a very different tangent.
        
           | majkinetor wrote:
           | > But I am not aware of any animal that travelled from earth
           | to the moon on its own, except us.
           | 
           | Tardigrades might :)
        
         | MPSimmons wrote:
         | I don't think the data is the weakness.
         | 
          | We're using the Transformer architecture right now. There's no
          | reason to think there won't be further discoveries in AI as
         | impactful as "Attention is All You Need".
         | 
         | We may be due for another "AI Winter" where we don't see
         | dramatic improvement across the board. We may not. Regardless,
         | LLMs using the Transformer architecture may not have human
         | level intelligence, but they _are_ useful, and they'll continue
         | to be useful. In the 90s, even during the AI winter, we were
         | able to use Bayesian classification for such common tasks as
         | email filtering. There's no reason we can't continue to use
         | Transformer architecture LLMs for common purposes too. Content
          | production alone makes it worthwhile.
         | 
         | We don't _need_ AGI, it just seems like the direction we are
         | heading as a species. If we don't get there, it's fine. No need
         | to throw the baby out with the bath water.
        
         | Der_Einzige wrote:
         | Even the largest LLM has had less "total information" than most
         | humans take in through all of their senses over their lifetime.
          | In a single day a baby takes in a continuous stream of, among
          | other things, high-quality video and audio, and does a large
          | amount of processing on it. Much of that, for very young
          | babies, is unsupervised learning (clustering), where the baby
          | learns that object A and object B are different despite
          | knowing nothing else about their properties.
         | 
          | Humans can learn using every ML learning paradigm in every
         | modality: unsupervised, self-supervised, semi-supervised,
         | supervised, active, reinforcement based, and anything else I
         | might be missing. Current LLMs are stuck with "self-supervised"
         | with the occasional reinforced (RLHF) or supervised (DPO)
          | cherry on top at the end. Non-multi-modal LLMs operate with one
         | modality. We are hardly scratching the surface on what's
         | possible with multi-modal LLMs today. We are hardly scratching
         | the surface for training data for these models.
         | 
          | The overwhelming majority of today's LLMs are vastly
         | undertrained and exhibit behavior of undertrained systems.
         | 
         | The claim from the OP about scale not giving us further
         | emergent properties flies in the face of all of what we know
         | about this field. Expect further significant gains despite nay-
         | sayers claiming it's impossible.
        
           | haltist wrote:
           | You are obviously a believer so you should know I know how to
           | build AGI with a patented and trademarked architecture called
           | "panoptic computronium cathedral"(tm). Tell all your friends
           | about it. I only need $80B to achieve AGI.
        
         | lumost wrote:
         | The Phi paper and various approaches to distilling from GPT-4
          | demonstrate that the training data, and plausibly the order of
          | presentation, matter.
         | 
          | The challenge is that we understand neither which set of
          | data is most beneficial for training, nor how it could be
          | efficiently ordered without triggering computationally
          | infeasible problems. However, we do know how to massively scale
         | up training.
        
         | espadrine wrote:
         | Demis Hassabis of Deepmind echoes a similar sentiment[0]:
         | 
         | > _I still think there are missing things with the current
         | systems. [...] I regard it a bit like the Industrial Revolution
         | where there was all these amazing new ideas about energy and
         | power and so on, but it was fueled by the fact that there were
         | dead dinosaurs, and coal and oil just lying in the ground.
         | Imagine how much harder the Industrial Revolution would have
         | been without that. We would have had to jump to nuclear or
         | solar somehow in one go. [In AI research,] the equivalent of
         | that oil is just the Internet, this massive human-curated
          | artefact. [...] And of course, we can draw on that. And there's
          | just a lot more information there, I think, it turns out
         | than any of us can comprehend, really. [...] [T]here's still
         | things I think that are missing. I think we're not good at
         | planning. We need to fix factuality. I also think there's room
         | for memory and episodic memory._
         | 
         | [0]: https://cbmm.mit.edu/video/cbmm10-panel-research-
         | intelligenc...
        
           | skippyboxedhero wrote:
           | His view of the Industrial Revolution is completely wrong.
           | 
           | Societies pre-IR had multiple periods where energy usage
           | increased significantly, some of them based specifically
           | around coal. No IR.
           | 
           | Early IR was largely based around the usage of water power,
           | not coal. IR was pure innovation, people being able to
           | imagine and create the impossible, it was going straight to
           | nuclear already.
           | 
           | Ironically, someone who is an innovator believes the very
           | anti-innovation narrative of the IR (very roughly, this is
           | the anti-Eurocentric stuff that began appearing in the
           | 2000s...the world has moved on since then as these theories
           | are obviously wrong). Nothing tells you more about how busted
           | modern universities are than this fact.
        
             | archon1410 wrote:
             | Has the narrative moved on? The historian and blogger Bret
             | Devereaux presents a view on a 2022 blog post that seems to
             | back up what the Deepmind CEO is saying.
             | 
             | > The specificity matters here because each innovation in
             | the chain required not merely the discovery of the
             | principle, but also the design and an economically viable
             | use-case to all line up in order to have impact.
             | 
             | https://acoup.blog/2022/08/26/collections-why-no-roman-
             | indus...
        
             | pighive wrote:
              | I am very curious about what you mentioned, but not able to
              | comprehend it. Can you ELI5? Are you saying the fossil-fuel-
              | based industrial revolution is not as significant as it was,
              | or that we could have jumped directly to a higher-level fuel?
        
             | joe_the_user wrote:
             | _Societies pre-IR had multiple periods where energy usage
             | increased significantly, some of them based specifically
             | around coal. No IR._
             | 
             | That's a straight up misstatement of the parent argument -
              | the parent argued that coal was necessary, not that coal was
              | sufficient. True or not, the argument isn't refuted by the
             | IR starting with water power either.
             | 
             | And pairing this with "anti-woke" jabs is discourse-
             | diminishing stuff. The theory that petroleum was a key
             | ingredient of the IR is much older than that (I don't even
             | agree with it but it's better than "pure innovation"
             | fluff).
        
       | YetAnotherNick wrote:
       | > '5 OOMs off'
       | 
        | I think Google, Microsoft and Facebook could easily have 5 OOMs
        | more data than the entire public web combined if we just count
        | text. The majority of people don't have any content on the
        | public web except for personal photos. A minority has a few
        | public social media posts, and it is rare for people to write a
        | blog or research paper etc.
       | And almost everyone has some content written in mail or docs or
       | messaging.
        
         | nmca wrote:
         | From the article, and relevant here:
         | 
         | I'm worried that when people hear '5 OOMs off', how they
         | register it is, "Oh we have 5x less data than we need - we just
         | need a couple of 2x improvements in data efficiency, and we're
         | golden". After all, what's a couple OOMs between friends?
         | 
         | No, 5 OOMs off means we have 100,000x less data than we need.
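          | 
          | (A worked version of that arithmetic, as a minimal
          | illustrative sketch: 5 OOMs is a factor of 10^5, i.e. roughly
          | 16.6 successive 2x improvements, not "a couple".)
          | 
          |     import math
          | 
          |     gap = 10 ** 5                # 5 OOMs = a factor of 100,000
          |     doublings = math.log2(gap)   # ~16.6 successive 2x improvements
          |     print(gap, round(doublings, 1))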
        
           | YetAnotherNick wrote:
            | I meant 100,000x. At least for everyone I know, they have
            | 100,000x more data in mail/messaging/docs/notes/meetings etc.
            | than their blog or any public site they own. Hell, I would
            | even say that if you just had all the meetings from Zoom, it
            | would be a few orders of magnitude more than the entire
            | public web.
        
             | saulpw wrote:
             | If I have 1MB on my blog, 100,000x would be 100GB. Just,
             | no. OOMs are not to be trifled with.
        
               | YetAnotherNick wrote:
                | How many people have blogs? How many people have sent a
                | message or created a Google Doc? The answer could easily
                | be 10,000x the number of people having blogs. Also I was just
               | counting text content as I mentioned.
               | 
                | For reference, there are 175,000 authors on Medium
                | compared to billions using WhatsApp or Gmail, a
                | difference of around 50,000x.
        
         | HarHarVeryFunny wrote:
         | Maybe, and certainly with the current trend of synthetic data
         | they can also create it, but I don't think quantity of data
         | beyond what something like GPT-4 has been trained on will in of
         | itself change much other than reducing brittleness by providing
         | coverage of remaining knowledge gaps.
         | 
         | Quality of data (which I believe is at least part of why
         | synthetic data is being used) can perhaps make more of a
         | difference and perhaps at least partly compensate in a crude
          | way for these models' lack of outlier rejection and any
         | generalization prediction-feedback loop. Just feed them
         | consistent correct data in the first place.
        
       | ralusek wrote:
       | Almost everything interesting about AI so far has been unexpected
       | emergent behavior, and huge gains through minor insights. While I
        | don't doubt that the current architecture likely has a
        | ceiling below that of peak human intelligence in certain
       | dimensions, it's already surpassed it in some, and there are
       | still gains to be made in others through things like synthetic
       | data.
       | 
       | I also don't understand the claims that it doesn't generalize. I
       | currently use it to solve problems that I can absolutely
       | guarantee were not in its training set, and it generalizes well
       | enough. I also think that one of the easiest ways to get it to
       | generalize better would simply be through giving it synthetic
       | data which demonstrates the process of generalizing.
       | 
       | It also seems foolish to extrapolate on what we have under the
       | assumption that there won't be key insights/changes in
       | architecture as we get to the limitations of synthetic data
       | wins/multi-modal wins.
        
         | jahnu wrote:
         | > problems that I can absolutely guarantee were not in its
         | training set
         | 
         | Can you share the strongest example?
        
           | Jabrov wrote:
           | Pretty much any coding problem in a unique or private
           | codebase
        
             | Der_Einzige wrote:
             | There is a difference between interpolation, which the
             | majority of humans are performing daily with coding in
             | private codebases, and genuine extrapolation, which is
             | difficult to prove and difficult to find in high
             | dimensional spaces. LLMs may not be able to easily
              | extrapolate (and when they do, it's due to high
             | temperature), but they can interpolate extremely well, and
             | most human growth and innovation today comes from novel
             | interpolations, which are what LLMs are excellent at.
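              | 
              | (A low-dimensional, made-up illustration of that
              | distinction: a model fit on samples of a function does
              | fine when queried inside the training range
              | (interpolation) and falls apart outside it
              | (extrapolation).)
              | 
              |     import numpy as np
              | 
              |     rng = np.random.default_rng(0)
              |     x_train = rng.uniform(0, 2 * np.pi, 200)   # training range
              |     y_train = np.sin(x_train)
              | 
              |     coeffs = np.polyfit(x_train, y_train, deg=7)
              | 
              |     inside, outside = np.pi, 3 * np.pi
              |     print(np.polyval(coeffs, inside), np.sin(inside))    # close to 0
              |     print(np.polyval(coeffs, outside), np.sin(outside))  # far from 0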
        
             | jahnu wrote:
             | I asked for the strongest example the OP can share in order
             | to evaluate their claim. If it's so obvious to the OP that
             | generalisation is happening then it should be easy to
             | provide a strong example, right?
        
         | nsagent wrote:
         | I mentioned this to another commenter as well:
         | 
         | You might want to reconsider your stance on emergent abilities
         | in LLMs considering the NeurIPS 2023 best paper winner is
         | titled:
         | 
         | "Are Emergent Abilities of Large Language Models a Mirage?"
         | 
         | https://arxiv.org/abs/2304.15004
         | https://blog.neurips.cc/2023/12/11/announcing-the-neurips-20...
        
           | Der_Einzige wrote:
           | Papers which get accepted with honors are not necessarily
           | more truthful than papers which have been rejected. Yann
              | LeCun goes on Twitter like any other grad student around
              | NeurIPS or ICML/ICMR and bitterly complains when one of his
              | (many) papers is rejected. Who's more likely to be correct
              | here? Yann LeCun (the TOP NLP scholar in our field by
           | citations, who does claim that most emergent capabilities are
           | real in other papers), or a NeurIPS best paper winner? My bet
           | is on Yann.
           | 
           | Also, consider that some work gets a lot of positivity not
           | for the work itself, but for the people who wrote it. Timnit
              | Gebru's work was effectively ignored until she got famous
              | for her spat with Jeff Dean at Google. Her citations have
           | exploded as a result, and I don't think that most in the
           | field think that the "stochastic parrot" paper was especially
           | good, and certainly not her other papers which include
           | significant amounts of work dedicated to claiming that LLM
              | training is really bad for the environment (despite a single
              | jet flight taking AI researchers to conferences being worse
              | for the environment than LLM training was around the time
              | that paper was written). Doesn't matter that the paper was wrong; it's
           | now highly cited because you get brownie points for citing
           | her work in grievance studies influenced subfields of AI.
        
             | peteradio wrote:
             | Yann LeCun through Meta is incentivized towards maximizing
             | capital return based on local maxima. That is how all
             | business works, there is not really a direct incentive to
             | pushing boundaries beyond what can be immediately
             | monetized.
        
             | nsagent wrote:
             | Please at least read the paper before appealing to
              | authority. It is a well-designed set of experiments that
              | clearly demonstrates that the notion of a "phase change"
              | (rapid shift in capabilities), as popularized by many
              | people claiming emergence, is actually a gradual improvement
             | with more data.
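              | 
              | (A minimal sketch of that argument with made-up numbers:
              | if per-token accuracy improves smoothly with scale, an
              | all-or-nothing metric like exact match over a k-token
              | answer behaves roughly like accuracy^k, which looks like
              | a sudden "emergent" jump even though the underlying
              | capability improves gradually.)
              | 
              |     import numpy as np
              | 
              |     # Hypothetical smooth per-token accuracy vs. model scale
              |     # (log10 of training compute); purely illustrative numbers.
              |     log_scale = np.linspace(18, 26, 9)
              |     per_token = 1 / (1 + np.exp(-(log_scale - 22)))
              | 
              |     k = 10                        # answer length in tokens
              |     exact_match = per_token ** k  # nonlinear, all-or-nothing metric
              | 
              |     for s, p, em in zip(log_scale, per_token, exact_match):
              |         print(f"1e{s:.0f} FLOPs: per-token={p:.2f}, exact-match={em:.3f}")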
             | 
              | But if you do want to appeal to LeCun as an authority, then
              | maybe you'll accept these (re)tweets, which clearly
             | indicate he finds the insights from the paper to be valid:
             | 
             | https://nitter.1d4.us/ylecun/status/1736479356917063847
             | https://nitter.1d4.us/rao2z/status/1736464000836309259
             | (retweeted)
             | 
             | As for Timnit, I think you have your timeline confused.
             | Model cards are what put her on the map for most general
             | NLP researchers, which predates her difficulties at Google.
             | 
             | 2018: Model cards paper was put on arXiv
             | https://arxiv.org/abs/1810.03993
             | 
             | 2019: Major ML organizations start using model cards
             | https://github.com/openai/gpt-2/blob/master/model_card.md
             | 
             | 2020: Model cards become fairly standard
             | https://blog.research.google/2020/07/introducing-model-
             | card-...
             | 
             | Dec 2020: Timnit is let go from the ethics team at Google
             | https://www.bbc.com/news/technology-55187611
             | 
             | EDIT:formatting
        
             | visarga wrote:
             | Gebru and her "Stochastic Parrots" did a big disservice to
              | AI safety, turning the debate into a shit-show of identity
              | politics. Now she has her own institute; it was a move up
              | for her career. Her Twitter spats with Yann LeCun were
              | legendary. She literally sent him to educate himself and
             | refused to debate him.
        
         | crowbahr wrote:
         | Latest research shows emergent behavior is illusory. It doesn't
          | preclude future emergence, but current models show 0 emergent
         | behavior.
         | 
         | To me the most interesting aspect of LLMs is the way that they
         | reveal cognitive 0-days in humans.
         | 
         | The human race needs patches to cognitive firmware to deal with
         | predictive text... Which is a fascinating revelation to me.
          | Sure, it's backed up by decades of psych research, but it's
         | interesting to watch it play out on such a large scale.
        
           | gitfan86 wrote:
           | When a human makes a mistake it is a "cognitive 0-day" but
           | when an LLM does something correctly it is "illusory"?
        
             | crowbahr wrote:
             | The cognitive 0-day is not the way that humans act like
             | LLMs, it's the way humans anthropomorphize LLMs. It's the
             | blind faith that LLMs do more than they do.
             | 
             | The illusion of emergence is fact not fiction. The
             | cognitive biases exposed by stochastic parrots are fact not
             | fiction.
        
               | gitfan86 wrote:
               | That is no different than saying beauty is only real if
               | it is 100% natural. A woman who wears makeup and colors
               | her hair is just an illusion of beauty.
               | 
               | It is a philosophical argument to say that a machine
               | isn't truly intelligent because it isn't using the same
                | type of neural network as a human.
        
               | discreteevent wrote:
               | Parent is saying that with something as sophisticated as
               | intelligence it's not enough to say that if it behaves
                | like a duck it's a duck (which is what you seem to be
               | saying and which the parent calls a 0-day).
               | 
                | There are some really good bullshitters who have led smart
                | people into deep trouble. These bullshitters behaved
               | really like ducks but they weren't ducks. The duck test
               | just isn't good enough.
               | 
               | The -1 day is where people say that because LLMs behave
               | like humans then humans must be based on the same tech. I
               | just wonder if these people have ever debugged a complex
               | system only to discover that their initial model of how
               | it worked was way off.
        
               | gitfan86 wrote:
               | That is a new definition of intelligence that you are
               | using. You are saying that even when something can
               | outperform humans in the SAT or other tests of
               | intelligence, it isn't actually intelligent due to it not
                | being a carbon-based lifeform.
        
               | discreteevent wrote:
               | No. I gave the example of a bullshitter - who is usually
               | a carbon based life form.
        
           | afpx wrote:
           | What about papers like these that suggest creation of task-
           | oriented manifolds?
           | 
           | https://www.biorxiv.org/content/10.1101/764258v3
        
         | HarHarVeryFunny wrote:
         | > I also don't understand the claims that it doesn't
         | generalize. I currently use it to solve problems that I can
         | absolutely guarantee were not in its training set, and it
         | generalizes well enough. I also think that one of the easiest
         | ways to get it to generalize better would simply be through
         | giving it synthetic data which demonstrates the process of
         | generalizing.
         | 
         | I don't think what LLMs are currently doing is really
         | generalizing, but rather:
         | 
         | 1) Multiple occurrences of something in the dataset are
         | mutually statistically reinforcing. This isn't generalization
         | (abstraction) but rather reinforcement through repetition.
         | 
         | 2) Multiple different statistical patterns are being
         | recalled/combined in novel ways such that it seems able to
         | "correctly" respond to things out of dataset, but really this
          | is only due to these novel combinations, not due to it having
          | abstracted its knowledge and applying a more general (or
          | analogical) rule than is present in its individual training
         | points.
        
       | nemo44x wrote:
       | Where in the hype cycle are we for LLMs? Are we in the late
       | stages of the rise or over the peak and beginning the slide?
        
         | crowbahr wrote:
         | Still on the climb imo
        
         | kevindamm wrote:
         | If you have the answer to that question you could make some
         | very lucrative investments.
        
         | collaborative wrote:
         | LLMs are still too expensive to run and therefore can't be
         | supported by ads. If costs get lower we'll see them being
         | pushed _a lot_ more
        
       | jsnell wrote:
       | The original title ("will scaling work?") seems like a much more
       | accurate description of the article than the editorialized "why
       | scaling will not work" that this got submitted with. The
       | conclusion of the article is not that scaling won't work! It's
       | the opposite, the author thinks that AGI before 2040 is more
       | likely than not.
        
         | bee_rider wrote:
         | It might be nice to modify the title a bit though, to indicate
         | that it is about AGI.
         | 
         | Obviously scaling works in general, just ask anyone in HPC,
         | haha.
        
       | PlasmonOwl wrote:
       | Author is leveraging mental inflexibility to generate an
       | emotional response of denial. Sure, his points are correct but
       | are constrained. Let's remove 2 constraints and reevaluate:
       | 
        | 1 - Babies learn much more with much less
        | 2 - Video training data can, in theory, be made at incredible rates
       | 
        | The question becomes: why is the author focusing on approaches
       | in AI investigated in like 2012? Does the author think SOTA is
       | text only? Are OpenAI or other market leaders only focusing on
       | text? Probably not.
        
       | berniedurfee wrote:
        | I think there's a huge assumption here that more LLM scaling will
        | lead to AGI.
       | 
       | Nothing I've seen or learned about LLMs leads me to believe that
       | LLMs are in fact a pathway to AGI.
       | 
       | LLMs trained on more data with more efficient algorithms will
       | make for more interesting tools built with LLMs, but I don't see
       | this technology as a foundation for AGI.
       | 
       | LLMs don't "reason" in any sense of the word that I understand
       | and I think the ability to reason is table stakes for AGI.
        
         | cortic wrote:
          | If humans are basically evolved LLMs, which I think is likely,
          | reasoning will be an emergent property of LLMs within context
         | with appropriate weights.
        
           | enieslobby wrote:
           | Why do you think humans are basically evolved LLMs? Honest
           | question, would love to read more about this viewpoint.
        
             | cortic wrote:
              | Look at a year-old baby: there is no logic, no reasoning,
             | no real consciousness, just basic algorithms and data input
             | ports. It takes ten years of data sets before these
             | emergent properties start to develop, and another ten years
             | before anything of value can be output.
        
               | berniedurfee wrote:
               | I strongly disagree. Kids, even infants, show a
               | remarkable degree of sophistication in relation to an
               | LLM.
               | 
               | I admit that humans don't progress much behaviorally,
               | outside of intellect, past our teen years; we're very
               | instinct driven.
               | 
               | But still, I think even very young children have a spark
               | that's something far beyond rote token generation.
               | 
               | I think it's typical human hubris (and clever marketing)
               | to believe that we can invent AGI in less than 100 years
               | when it took nature millions of years to develop.
               | 
               | Until we understand consciousness, we won't be able to
               | replicate it and we're a very long way from that leap.
        
               | visarga wrote:
               | Humans are not very smart, individually, and over a
                | single lifetime. We become smart as a species over tens of
               | millennia of gathering experience and sharing it through
               | language.
               | 
               | What LLMs learn is exactly the diff between primitive
               | humans and us. It's such a huge jump a human alone can't
                | make it. If we were smarter we would have figured out
               | the germ theory of disease sooner, as we were dying from
               | infections.
               | 
               | So don't praise the learning abilities of little
                | children; without language and social support they would
               | not develop very much. We develop not just by our DNA and
               | direct experiences but also by assimilating past
               | experiences through language. It's a huge cache of
               | crystallized intelligence from the past, without which we
               | would not rule this planet.
               | 
                | That's also why I agree LLMs are stalling: we can't
                | quickly scale the organic text inputs by a few more
                | orders of magnitude. So there must be a different way to
                | learn, and that is by putting AI in contact with
               | environments and letting it do its own actions and learn
               | from its mistakes just like us.
               | 
               | I believe humans are "just" contextual language and
               | action models. We apply language to understand, reason
               | and direct our actions. We are GPTs with better feedback
               | from outside, and optimized for surviving in this
               | environment. That explains why we need so few samples to
                | learn: the hard work has been done by many previous
                | generations, and brains are fit for their own culture.
               | 
               | So the path forward will imply creating synthetic data,
               | and then somehow evaluating the good from the bad. This
               | will be task specific. For coding, we can execute tests.
               | For math, we can use theorem provers to validate. But for
               | chemistry we need simulations or labs. For physics, we
               | need the particle accelerator to get feedback. But for
               | games - we can just use the score - that's super easy,
               | and already led to super-human level players like
               | AlphaZero.
               | 
               | Each topic has its own slowness and cost. It will be a
               | slow grind ahead. And it can't be any other way, AI and
               | AGI are not magic. They must use the scientific method to
               | make progress just like us.
        
               | RandomLensman wrote:
               | Humans do more than just enhance predictive capabilities.
               | It is also a very strong assumption that we are optimised
               | for survival in many or all aspects (even unclear what
               | that means). Some things could be totally incidental and
               | not optimised. I find appeals to evolutionary
               | optimisation very tricky and often fraught.
        
               | spzb wrote:
               | Have you ever met a baby? They're nothing like an LLM.
               | For starters, they learn without using language. By one
               | year old they've taught themselves to move around the
               | physical world. They've started to learn cause and
               | effect. They've learned where "they" end and "the rest of
               | the world" begins. All an LLM has "learnt" is that some
               | words are more likely to follow others.
        
               | visarga wrote:
               | Why not? We have multi-modal models as well. Not pure
               | text.
        
               | timacles wrote:
               | This comment is just sad. What are you even talking
                | about? Have you ever seen a 1-year-old?
        
             | lumost wrote:
             | An LLM is simply a model which given a sequence, predicts
             | the rest of the sequence.
             | 
             | You can accurately describe any AGI or reasoning problem as
             | an open domain sequence modeling problem. It is not an
             | unreasonable hypothesis that brains evolved to solve a
             | similar sequence modeling problem.
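              | 
              | (A toy sketch of "given a sequence, predict the rest": a
              | character-level bigram model fit by counting, then greedy
              | autoregressive continuation. Entirely illustrative; real
              | LLMs replace the count table with a large neural network
              | over tokens.)
              | 
              |     import numpy as np
              | 
              |     text = "the cat sat on the mat. the cat sat on the hat."
              |     chars = sorted(set(text))
              |     idx = {c: i for i, c in enumerate(chars)}
              | 
              |     # Count how often each character follows each other character.
              |     counts = np.ones((len(chars), len(chars)))  # add-one smoothing
              |     for a, b in zip(text, text[1:]):
              |         counts[idx[a], idx[b]] += 1
              |     probs = counts / counts.sum(axis=1, keepdims=True)
              | 
              |     def continue_sequence(prompt, n=20):
              |         out = prompt
              |         for _ in range(n):
              |             # Greedily append the most likely next character.
              |             out += chars[int(np.argmax(probs[idx[out[-1]]]))]
              |         return out
              | 
              |     print(continue_sequence("the c"))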
        
               | RandomLensman wrote:
               | In the broader sense that is tricky as accurate
               | prediction is not always the right metric (otherwise we'd
               | still be using epicycles for the planets).
        
               | lumost wrote:
               | It depends on the goal, epicycles don't tell you about
               | the nature of heavenly bodies - but they do let you keep
               | an accurate calendar for a reasonable definition of
               | accurate. I'm not sure whether I need deep understanding
               | of intelligence to gain economic benefit from AI.
        
               | timacles wrote:
               | > It is not an unreasonable hypothesis that brains
               | evolved to solve a similar sequence modeling problem.
               | 
               | The real world is random, requires making decisions on
               | incomplete information in situations that have never
               | happened before. The real world is not a sequence of
               | tokens.
               | 
               | Consciousness requires instincts in order to prioritize
                | the endless streams of information. One thing people don't
               | want to accept about any AI is that humans always have to
               | tell it WHAT to think about. Our base reptilian brains
               | are the core driver behind all behavior. AI cannot learn
                | that.
        
               | hutzlibu wrote:
               | "Consciousness requires instincts in order to prioritize
               | the endless streams of information. "
               | 
               | What if "instinct" is also just (pretrained) model
               | weight?
               | 
               | The human brain is very complex and far from understood
                | and definitely does NOT work like an LLM. But it likely
                | shares some core concepts. Neural networks were
                | inspired by the brain's neurons and synapses, after all.
        
               | timacles wrote:
               | > What if "instinct" is also just (pretrained) model
               | weight?
               | 
               | Sure - then it will take the same amount of energy to
               | train as our reptilian and higher brains took. That means
               | trillions of real life experiences over millions of
               | years.
        
               | jodrellblank wrote:
               | Not at all, it took life hundreds of millions of years to
               | develop brains that could work with language, and took us
               | tens of thousands of years to develop languages and
               | writing and universal literacy. Now computers can print
               | it, visually read it, speech-to-text transcribe it,
               | write/create/generate it coherently, text-to-speech
               | output it, translate between languages, rewrite in
               | different styles, explain other writings, and that only
               | took - well, roughly one human lifetime since computers
               | became a thing.
        
               | measured_step wrote:
               | How do our base reptilian brains reason? We don't know
               | the specifics, but unless it's magic, then it's
               | determined by some kind of logic. I doubt that logic is
               | so unique that it can't eventually be reproduced in
               | computers.
        
             | cortic wrote:
             | My first answer was a bit hasty, let me try again;
             | 
             | We are clearly a product of our past experience (in LLMs
             | this is called our datasets). If you go back to the
             | beginning of our experiences, there is little identity,
             | consciousness, or ability to reason. These things are
             | learned indirectly, (in LLMs this is called an emergent
             | property). We don't learn indiscriminately, evolved
             | instinct, social pressure and culture guide and bias our
             | data consumption (in LLMs this is called our weights).
             | 
              | I can't think of any other way our minds could work; on
              | some level they _must_ function like an LLM, language
              | perhaps supplemented with general data, but the principle
              | being the same. Every new idea has been an abstraction or
              | supposition of someone's current dataset, which is why
             | technological and general societal advancement has not been
             | linear but closer to exponential.
        
               | Jensson wrote:
               | Genes encode a ton of behaviors, you can't just ignore
               | that. Tabula rasa doesn't exist among humans.
               | 
               | > If you go back to the beginning of our experiences,
               | there is little identity, consciousness, or ability to
               | reason.
               | 
                | That is because babies' brains aren't properly developed.
                | There is nothing preventing a fully conscious being from
                | being born; you see that among animals etc. A newborn
               | foal is a fully functional animal for example. Genes
               | encode the ability to move around, identify objects,
               | follow other beings, collision avoidance etc.
        
               | cortic wrote:
               | >Genes encode a ton of behaviors, you can't just ignore
               | that.
               | 
               | I'm not ignoring that, I'm just saying that in LLMs we
                | call these things weights. And I don't want to downplay
                | the importance of weights; it's probably a significant
               | difference between us and other hominids.
               | 
               | But even if you considered some behaviors to be more akin
                | to the server or interface or preprocessing in LLMs, it still
               | wouldn't detract from the fact that the vast majority of
               | the things that make us autonomous logical sentient
               | beings come about through a process that is very similar
               | to the core workings of LLMs. I'm also not saying that
               | all animal brains function like LLMs, though that's an
               | interesting thought to consider.
        
           | tgv wrote:
           | So you think we were originally trained on 300B tokens, those
           | were then ingrained in our synapses, and then we evolved?
        
           | lossolo wrote:
            | Reasoning and intelligence exist without language.
        
             | cortic wrote:
              | You know, I assumed that was true until right now. But I
             | can't think of a single example of reason and intelligence
             | existing without any form of language. Even insects have
             | rudimentary language, and in fact reasoning and
             | intelligence seem to scale with the complexity of language,
             | both by species and within species.
        
               | Jensson wrote:
                | Do slime molds have a language? Slime molds can learn and
               | adapt to environments, so it is intelligent and can do
               | rudimentary reasoning, but I doubt it communicates that
               | information to other slime molds.
               | 
                | It is a very different kind of life form though, so many
                | things that apply to other complex beings don't apply
                | to them. Being a large single cell means that they learn
                | by changing their proteins and other internals, very hard
               | for us humans to reason about and understand since it is
               | so alien compared to just having nerve cells with
               | physical connections.
        
               | cortic wrote:
                | Not sure I would say a slime mold has reason and
                | intelligence... or if I would, then so does a river. Also
                | I think that how it changes its proteins could be
               | considered a language, without stretching the definition
               | of language any more than we have already stretched the
               | definition of reason and intelligence.
        
               | Jensson wrote:
                | Why is a slime mold like a river but a human isn't? Slime mold
                | can predict temperature changes in its environment and
                | react before they happen, which isn't something a river
               | could do.
               | 
               | So your statement just seems to be your bias thinking
                | that a slime mold couldn't possibly do any reasoning.
                | Cells are much smarter than most think.
               | 
               | Edit: Anyway, apparently slime molds can communicate what
               | they learn by sharing those proteins. So they do have a
               | language, it is like a primitive version of how human
               | bodies cells communicate. So your point still stands,
               | reasoning seems to go hand in hand with communication. If
               | you can reason then it is worth it to share those
               | conclusions with your friends and family.
               | 
               | They also taught slime molds to cross a bridge for food,
               | and it learned to do it. Then they got the slime mold to
               | tell other slime molds and now those also knew how to
               | cross the bridge. It is pretty cool that slime molds can
               | be that smart.
               | 
               | https://asknature.org/strategy/brainless-slime-molds-
               | both-le...
        
         | whoami_nr wrote:
         | Why next-token prediction is enough for AGI - Ilya Sutskever -
         | https://www.youtube.com/watch?v=YEUclZdj_Sc
        
           | bamboozled wrote:
           | Ilya can feel the AGI
        
           | passion__desire wrote:
           | We need planning. Imagine doing planning like this "drone in
           | a forest" in a different domain like "migrate this project
           | from python to rust".
           | 
           | https://youtu.be/m89bNn6RFoQ?t=71
        
           | lewhoo wrote:
           | I really don't think there's an explanation there. All
            | Sutskever says is that the idea is to ask an LLM to be the smartest
           | being on the planet and it magically happens.
        
         | lkbm wrote:
         | I guess it's an "assumption", but it's an assumption that's
         | directly challenged in the article:
         | 
         | > But of course we don't actually care directly about
         | performance on next-token prediction. The models already have
         | humans beat on this loss function. We want to find out whether
         | these scaling curves on next-token prediction actually
         | correspond to true progress towards generality.
         | 
         | And:
         | 
         | > Why is it impressive that a model trained on internet text
         | full of random facts happens to have a lot of random facts
         | memorized? And why does that in any way indicate intelligence
         | or creativity?
         | 
         | And:
         | 
         | > So it's not even worth asking yet whether scaling will
         | continue to work - we don't even seem to have evidence that
         | scaling has worked so far.
        
           | berniedurfee wrote:
           | The conclusion that AGI will happen in 2040 is what I'm
           | arguing against. I think 4020 is maybe a better estimate.
           | 
           | I don't feel like we're anywhere close given that we can't
           | even yet meaningfully define reasoning or consciousness... or
           | as another commenter put it, what is it that differentiates
           | us so significantly from other animals.
        
         | __MatrixMan__ wrote:
         | We do have systems that reason. Prolog comes to mind. It's a
         | niche tool, used in isolated cases by relatively few people. I
         | think that the other candidates are similar: proof assistants,
         | physics simulators, computational chemistry and biology
         | workflows, CAD, etc.
         | 
         | When we get to the point where LLMs are able to invoke these
         | tools for a user, even if that user has no knowledge of them,
         | and are able to translate the results of that reasoning back
         | into the user's context... That'll start to smell like AGI.
         | 
         | The other piece, I think, is going to be improved cataloging of
         | human reasoning. If you can ask a question and get the answer
         | that a specialist who died fifty years ago would've given you
         | because that specialist was a heavy AI user and so their
         | specialty was available for query... That'll also start to
         | smell like AGI.
         | 
         | The foundations have been there for 30 years, LLMs are the
         | paint job, the door handles, and the windows.
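          | 
          | (A minimal sketch of what "systems that reason" means here,
          | in the spirit of Prolog: forward chaining over facts with an
          | if-then rule. The facts and the grandparent rule are made up
          | for illustration.)
          | 
          |     facts = {("parent", "alice", "bob"), ("parent", "bob", "carol")}
          | 
          |     # Rule: parent(X, Y) and parent(Y, Z) => grandparent(X, Z)
          |     def forward_chain(facts):
          |         derived = set(facts)
          |         parents = [f for f in facts if f[0] == "parent"]
          |         for (_, x, y) in parents:
          |             for (_, y2, z) in parents:
          |                 if y == y2:
          |                     derived.add(("grandparent", x, z))
          |         return derived
          | 
          |     print(("grandparent", "alice", "carol") in forward_chain(facts))  # True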
        
           | cchance wrote:
            | Ya, I feel like the issue is people think an LLM will someday
            | "wake up". No, LLMs will just be multimodal and developed to
            | use tools, and a software ecosystem around them will end up
            | using the LLM to reason about how to execute; basically the LLM
           | will be the internal monologue of whatever the AGI looks
           | like.
        
             | __MatrixMan__ wrote:
             | Agreed. I think it's more likely that we'll reach a point
             | where their complexity is so great that no single person
             | can usefully reason about their outputs in relation to
             | their structure.
             | 
             | Not so much a them waking up as an us falling asleep.
        
           | lossolo wrote:
           | > We do have systems that reason. Prolog comes to mind. It's
           | a niche tool, used in isolated cases by relatively few
           | people. I think that the other candidates are similar: proof
           | assistants, physics simulators, computational chemistry and
           | biology workflows, CAD, etc.
           | 
            | I think OP meant another definition of reason, because by your
            | definition a calculator can also reason. These are tools
            | created by humans that help them reason about stuff by
            | offloading calculations for some of the tasks. They do not
           | reason on their own and they can't extrapolate. They are
           | expert systems.
           | 
           | http://www.incompleteideas.net/IncIdeas/BitterLesson.html
        
             | __MatrixMan__ wrote:
             | If an expert system is not reasoning, and a statistical
             | apparatus like an LLM is not reasoning, then I think the
             | only definition that remains is the rather antiquated one
             | which defines reason as that capability which makes humans
             | unique and separates us from animals.
             | 
             | I don't think it's likely to be a helpful one in this case.
        
               | Jensson wrote:
               | I think he wants "reasoning" to include coming up with
               | rules, not just following them. Humans reason by trying
               | to figure out rules for systems and then seeing whether
               | those rules work well; at a large scale that is called
               | the scientific method, but all humans do it on a small
               | scale, especially as kids.
               | 
               | For a system to be able to solve the same classes of
               | problems humans can solve, it would need to be able to
               | invent its own rules just like humans can.
        
               | berniedurfee wrote:
               | I think that is what I mean by reason. I set the bar for
               | reasoning and AGI pretty high.
               | 
               | Though, I will admit, a system that acts in a way that's
               | indistinguishable from a human will be awfully hard to
               | classify as anything but AGI.
               | 
               | Maybe I'm conflating AGI and consciousness, though given
               | that we don't understand consciousness and there's no
               | clear definition of AGI, maybe they ought to be inclusive
               | of each other until we can figure out how to
               | differentiate them.
               | 
               | Still, one interesting outcome, I think, should
               | consciousness be included in the definition of AGI, is
               | that LLMs are deterministic; if such a system were
               | conscious, that would (maybe) eliminate the notion of
               | free will.
               | 
               | I feel like this whole exercise may end up representing a
               | tiny, microscopic scratch on the surface of what it will
               | actually take to build AGI. It feels like we're
               | extrapolating the capabilities of LLMs far too easily
               | from capable chat bots to full on artificial beings.
               | 
               | We humans are great at imagining the future, but not so
               | good at estimating how long it will take to get there.
        
               | lossolo wrote:
               | Reasoning, in the context of artificial intelligence and
               | cognitive sciences, can be seen as the process of drawing
               | inferences or making decisions based on available
               | information. This doesn't make machines like calculators
               | or LLMs equivalent to human reasoning, but it does
               | suggest they engage in some form of reasoning.
               | 
               | Expert systems, for instance, use a set of if-then rules
               | derived from human expertise to make decisions in
               | specific domains. This is a form of deductive reasoning,
               | albeit limited and highly structured. They don't
               | 'understand' in a human sense but operate within a
               | framework of logic provided by humans.
               | 
               | LLMs, on the other hand, use statistical methods to
               | generate responses based on patterns learned from vast
               | amounts of data. This isn't reasoning in the traditional
               | philosophical sense, but it's a kind of probabilistic
               | reasoning. They can infer, locally generalize, and even
               | 'extrapolate' to some extent within the bounds of their
               | training data. However, this is not the same as human
               | extrapolation, which often involves creativity and a deep
               | understanding of context.
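                | 
                | A toy sketch of that if-then style (illustrative
                | Python, not any real expert-system shell): a few
                | hand-written rules applied by forward chaining, the
                | classic deductive loop described above.
                | 
                |   RULES = [
                |       ({"has_fever", "has_rash"}, "suspect_measles"),
                |       ({"suspect_measles"}, "recommend_isolation"),
                |   ]
                | 
                |   def forward_chain(facts):
                |       facts = set(facts)
                |       changed = True
                |       while changed:
                |           changed = False
                |           for conds, concl in RULES:
                |               if conds <= facts and concl not in facts:
                |                   facts.add(concl)
                |                   changed = True
                |       return facts
                | 
                |   print(forward_chain({"has_fever", "has_rash"}))
                |   # adds suspect_measles, then recommend_isolation
                | 
                | An LLM, by contrast, arrives at its output by scoring
                | likely next tokens rather than by firing explicit
                | rules, which is the distinction being drawn here.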
        
         | ctoth wrote:
         | > I think there's a huge assumption here that more LLM will
         | lead to AGI.
         | 
         | I'm not sure you realize this, but that is literally what this
         | article was written to explore!
         | 
         | I feel like you just autocompleted what you believe about large
         | language models in this thread, rather than engaging with the
         | article. Engagement might look like "I hold the skeptic
         | position because of X, Y, and Z, but I see that the other
         | position has some really good, hard-to-answer points."
         | 
         | Instead, we just got the first thing that came to your mind
         | talking about AI.
         | 
         | In fact, am I talking to a person?
        
           | jeremyjh wrote:
           | I feel like an LLM would do a much better job than GP.
        
             | berniedurfee wrote:
             | Lol, at least then your comment wouldn't have bothered me
             | so much!
        
               | jeremyjh wrote:
               | I'm sorry I hurt your feelings, it wasn't my intention.
                | For what it's worth, I actually think there is a good
               | chance that you are right - that there is something
               | missing in LLMs that still won't be present in bigger
               | LLMs. I mostly meant that an LLM would be more organized
               | around the source material and address specific points.
               | 
               | I actually asked ChatGPT 4 to do so, and it produced the
               | sort of reasonable but unremarkable stuff I've come to
               | expect from it.
        
           | berniedurfee wrote:
           | Lol, yes, in fact, I was reacting to the article.
           | 
           | The point I was trying to make is that I think better LLMs
           | won't lead to AGI. The article focused on the mechanics and
           | technology, but I feel that's missing the point.
           | 
           | The point being, AGI is not going to be a direct outcome of
           | LLM development, regardless of the efficiency or volume of
           | data.
        
             | ctoth wrote:
             | I can interpret this in a couple different ways, and I want
             | to make sure I am engaging with what you said, and not with
             | what I thought you said.
             | 
             | > I think better LLMs won't lead to AGI.
             | 
             | Does this mean you believe that the Transformer
             | architecture won't be an eventual part of AGI? (possibly
             | true, though I wouldn't bet on it)
             | 
             | Does this mean that you see no path for GPT-4 to become an
             | AGI if we just leave it alone sitting on its server? I
             | could certainly agree with that.
             | 
             | Does this mean that something like large language models
             | will not be used for their ability to model the world, or
             | plan, or even just complete patterns as does our own System
             | one in an eventual AGI architecture? I would have a lot
             | more trouble agreeing with that.
             | 
              | In general, it seems like these sequence modelers that
              | actually work right are a big primitive we didn't have in
              | 2016, and they certainly seem to me like an important
              | step; something that will carry us far past human level,
              | whatever that means for textual tasks.
             | 
              | To bring it back to the article, probably pure scale isn't
              | quite the secret sauce, but it's a good 80-90%, and the
              | rest will come from the increased interest, the sheer
              | number of human-level intelligences now working on this
              | problem.
             | 
             | Too bad we haven't scaled safety nearly as fast though!
        
               | berniedurfee wrote:
               | Yes, I suppose my assertion is that LLMs may be a step
               | toward our understanding of what is required to create
               | AGI. But, the technology (the algorithms) will not be
               | part of the eventual solution.
               | 
               | Having said that, I do agree that LLMs will be
               | transformative technology. As important perhaps as the
               | transistor or the wheel.
               | 
               | I think LLMs will accelerate our ability as a species to
               | solve problems even more than the calculator, computer or
               | internet has.
               | 
               | I think the boost in human capability provided by LLMs
               | will help us more rapidly discover the true nature of
               | reasoning, intelligence and consciousness.
               | 
               | But, like the wheel, transistor, calculator, computer and
               | internet; I feel strongly that LLMs will prove to be just
               | another tool and not a foundational technology for AGI.
        
           | dullcrisp wrote:
           | Why does it matter?
        
           | joe_the_user wrote:
            | _I'm not sure you realize this, but that is literally what
            | this article was written to explore!_
            | 
            | Yeah, but its "exploration" answers all the reasonable
            | objections by just extrapolating vague "smartness" (EDITED
            | [1]). "LLMs seem smart, more data will make 'em smarter..."
            | 
            | If _apparent_ intelligence were the only measure of where
            | things are going, we could be certain GPT-5 or whatever would
            | reach AGI. But I don't think many people believe that's the
            | case.
           | 
           | The various critics of LLMs like Gary Marcus make the point
           | that while LLMs increase in ability each iteration, they
           | continue to be weak in particular areas.
           | 
           | My favorite measure is "query intelligence" versus "task
           | accomplishment intelligence". Current "AI" (deep
           | learning/transformers/etc) systems are great at query
           | intelligence but don't seem to scale in their "task
            | accomplishment intelligence" at the same rate. (Notice how
            | "baby AGI", ChatGPT+self-talk, etc. fail to produce actual
            | task intelligence.)
           | 
           | [1] Edited, original "seemed remarkably unenlightening. Lots
           | of generalities, on-the-one-hand-on-the-other descriptions".
           | Actually, reading more closely the article does raise good
           | objections - but still doesn't answer them well imo.
        
             | berniedurfee wrote:
             | I've also heard it said that "apparent" intelligence is
             | good enough to be called "real" intelligence if it's
             | indistinguishable from the real thing. That's where I have
             | a strong feeling that we're missing the true meaning of
             | intelligence, reasoning and consciousness.
             | 
             | As you said, we may very well be a couple iterations away
             | from a chatbot that is truly indistinguishable from a
             | human, but I still strongly assert that even a perfectly
             | coherent chatbot is nothing more than an automaton and we
             | humans are not automatons.
             | 
             | The fact that a couple replies in this thread made me feel
             | defensive and a bit discouraged with their condescending
             | tone is to me an internal reaction that an LLM or similar
             | system will never have. Maybe an appropriate emotional
             | reaction can be calculated and simulated, but I think the
             | nature of the experience itself is truly beyond our current
             | comprehension.
             | 
             | Maybe I'm grasping at the metaphysical to rationalize my
             | fear that we're on the cusp of understanding
             | consciousness... and it turns out to be pretty boring and
             | will be included with Microsoft O365 in a couple years.
        
               | dasil003 wrote:
               | I agree with you, but I think it's more of a
               | philosophical topic (ie. Chinese Room argument) than
               | something that technicians working on raw LLM
               | capabilities usually care to engage in. For them, the
               | Turing Test and utility in applications are the most
               | important thing.
               | 
                | Personally, I don't think we can construct an equivalent
                | intelligence to a human out of silicon. That's not to say
                | AGI is unachievable, or that it can't surpass human
                | intelligence and be superficially indistinguishable from
                | a human, but it will always be different and alien in
               | some way. I believe our intelligence is fundamentally
               | closer to other earth animals descended from common
               | genetic ancestors than it can be to an artificial
               | intelligence. As the creators of AI, we can and will
               | paper over these differences enough to Get The Job
               | Done(tm), but the uncanny valley will always be there if
               | you know where to look.
        
             | jeremyjh wrote:
             | > My favorite measure is "query intelligence" versus "task
             | accomplishment intelligence".
             | 
             | The article does address this regarding abysmal performance
             | on the GitHub PR benchmark. It's one of the big "ifs" for
             | sure.
        
       | mgaunard wrote:
       | I think the more interesting question is how long will people
       | cling to the illusion that LLMs will lead us to AGI?
       | 
       | Maintaining the illusion is important to keep the money flowing
       | in.
        
         | beepbooptheory wrote:
         | While this is certainly true, I think we can't ignore the
         | intense enthusiasm and faith of a large cohort of our peers
         | (or, you know, HN commenters) who believe this to be The Way,
         | and are not necessarily stakeholders in any meaningful sense.
         | Just look at some of the responses even in this thread. It
         | feels like some people just _need_ this, and respond to
         | balanced skepticism as Alyosha does to his brother Ivan.
         | 
          | In part, whether consciously or not, people see the bright
          | future of LLMs as a kind of redemption for the world so far
          | wrought from a Silicon Valley ideology; it's almost too
          | on-the-nose the way ChatGPT "fixes" internet search.
         | 
         | But on a deeper level, consider how many hn posts we saw before
         | chatgpt that were some variation of "I have reached a pinnacle
         | of career accomplishment in the tech world, but I can't find
         | meaning or value in my life." We don't seem to see those posts
         | quite as much with all this AI stuff in the air. People seem to
         | find some kind of existential value in the LLMs, one with an
         | urgency that does not permit skepticism or critique.
         | 
          | And, of course, in this thread alone, there is the constant
          | refrain: "well, perhaps _we_ are large language models
          | ourselves after all..." This reflex to crude Skinnerism says a
          | lot too: there are some who, I think, seek to be able to
          | conquer even themselves; to reduce their inner life to Python
          | code and data, because it is something they can know and
          | understand, and thus have some kind (or sense) of control or
          | insight about it.
         | 
          | I don't want to be harsh in saying this; people need something
          | to believe in. I just think we can't discount how personal all
          | this appears to be for a lot of regular, non-AI-CEO people. It
          | is just extremely interesting, this culture and ideology being
          | built around this tech. To me it rivals the LLMs themselves as
          | a fascinating subject of inquiry.
        
           | visarga wrote:
           | There is no magic in the brain. There is no magic in LLMs.
           | There is just new experience we gain by interacting with the
           | environment and society. And there is the trove of past
           | experience encoded in our books. We got smart by collecting
           | experience, in other words, from outside. The magic in the
           | brain was not in the brain, but everywhere else.
           | 
           | What is experience? We are in state S, and take action A, and
           | observe feedback R. The environment is the teacher, giving us
           | reward signals. We can only increase our knowledge
           | incrementally, by trying our many bad ideas, and sometimes
           | paying with our lives. But we still leave morsels of newly
           | acquired experience for future generations.
           | 
           | We are experience machines, both individually and socially.
           | And intelligence is the distilled experience of the past,
           | encoded in concepts, methods and knowledge. Intelligence is a
           | collective process. None of us could reach our current level
           | without language and society.
           | 
           | Human language is in a way smarter than humans.
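            | 
            | In code, the loop being described is the standard
            | agent-environment loop from reinforcement learning. A
            | minimal toy sketch (the environment and policy here are
            | made up purely for illustration):
            | 
            |   import random
            | 
            |   class CoinEnv:
            |       # Toy environment: reward 1 for a correct guess.
            |       def reset(self):
            |           return "start"                 # state S
            |       def step(self, action):
            |           reward = int(action == random.randint(0, 1))
            |           return "start", reward         # next S, feedback R
            | 
            |   def live(env, policy, steps):
            |       # In state S take action A, observe feedback R,
            |       # and keep the experience for later learning.
            |       state = env.reset()
            |       history = []
            |       for _ in range(steps):
            |           action = policy(state)
            |           state, reward = env.step(action)
            |           history.append((state, action, reward))
            |       return history
            | 
            |   print(live(CoinEnv(), lambda s: random.randint(0, 1), 5))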
        
             | crowbahr wrote:
              | To say there's no magic in the brain drastically
              | _minimizes_ the complexity of the brain.
             | 
             | Your brain is several orders of magnitude more complex than
             | even the largest LLM.
             | 
             | GPT4 has 1 trillion parameters? Big deal. Your brain has 1
              | quadrillion synapses, constantly shifting. Beyond that,
              | synapses carry analog messages, not binary. Each synapse is
             | approximately like 1000 transistors based on the
             | granularity of messaging it can send and receive.
             | 
             | It is temporally complex as well as structurally complex,
             | well beyond anything we've ever made.
             | 
             | I'm strongly in favor of AGI, for what it's worth, but LLMs
             | aren't even scratching the surface. They're nowhere close
              | to a human. They're a mediocre pastiche, and it's just as
              | likely that they're a dead end as it is that they'll ever
              | become AGI.
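              | 
              | Back-of-envelope version of that comparison (very rough
              | numbers, and the synapse-to-transistor equivalence is
              | itself only a loose analogy):
              | 
              |   llm_params  = 1e12  # ~1 trillion (rumoured GPT-4 scale)
              |   synapses    = 1e15  # ~1 quadrillion in a human brain
              |   per_synapse = 1e3   # rough "transistor equivalent" each
              | 
              |   brain_units = synapses * per_synapse     # ~1e18
              |   print(brain_units / llm_params)          # ~1e6
              |   # about six orders of magnitude, on these assumptions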
        
               | visarga wrote:
               | That kind of explains why humans need to absorb less
               | language to train. It still takes 25 years of focused
               | study to become capable of pushing the frontier of
               | knowledge a tiny bit.
        
             | elktown wrote:
             | > There is no magic in the brain.
             | 
             | The amount of hubris we have in our field is deeply
              | embarrassing. Imagine a neuroscientist reading that. The
             | thought makes me blush.
        
               | lern_too_spel wrote:
                | Neuro_scientists_ would agree with GP, otherwise they
                | would be neuro_mystics_ instead of neuro_scientists_.
                | There is no magic. It's all physical processes that we
                | can eventually understand.
        
               | elktown wrote:
                | The entire point was that we do not understand it? That
                | much of how the brain works is "magic" at the moment.
                | 
                | It's _our field_ that plays the alchemist mystic,
                | rambling about AGI/Philosopher's Stone in increasingly
                | unhinged ways, while stirring ML-pots that we have never
                | even tried to prove have a chance of being any more
                | successful than the alchemists were.
        
               | lern_too_spel wrote:
               | Just because we don't understand it doesn't mean it's
               | magic. That's the whole point of science.
               | 
               | > It's _our field_ that are the alchemist mystics,
               | rambling about AGI/Philosopher's Stone in ever
               | increasingly unhinged ways,
               | 
               | These "ramblings" are what scientists call hypotheses.
               | The people making these hypotheses have even proposed how
               | to test them.
        
               | elktown wrote:
               | Even with added quotes you can't stop reading it
               | literally? The total lack of critical thinking due to
               | confirmation bias is just as embarrassing.
        
             | RaftPeople wrote:
             | > _There is no magic in the brain._
             | 
             | Consciousness?
        
           | mgaunard wrote:
           | As much as I personally believe that neural networks do bear
           | a lot of resemblance to the human psyche, and that people are
           | just sophisticated biological machines, I don't see how LLMs
           | are capturing all of our thought processes.
           | 
           | What I say is not just regurgitation of my past experiences;
           | there is a logic to it.
        
         | ryanklee wrote:
          | You say this as if it's settled and obvious that it's an
          | illusion and that only the delusional believe the opposite.
         | 
         | But if it were so settled and obvious there would be a clear
         | line of reasoning to make that plain. And there is not.
         | Instead, there is a very vibrant debate on the topic with tons
         | of nuance and good faith (and bad) on each side, if we want to
         | talk about sides.
         | 
          | And, of course, one implication of this very real and
          | significant inquiry, which needs to be made and requires real
          | contributions from informed individuals, is that whenever
          | anyone is dismissive or reductive regarding the unresolved
          | difficulties, you can be sure they have absolutely no clue what
          | they are talking about.
        
       | slibhb wrote:
       | The best analogy for LLMs (up to and including AGI) is the
       | internet + google search. Imagine explaining the internet/google
       | to someone in 1950. That person might say "Oh my god, everything
       | will change! Instantaneous, cheap communication! The world's
       | information available at light speed! Science will accelerate,
       | productivity will explode!" And yet, 70 years later, things have
       | certainly changed, but we're living in the same world with the
       | same general patterns and limitations. With LLMs I expect
       | something similar. Not a singularity, just a new, better tool
       | that, yes, changes things, increases productivity, but leaves
       | human societies more or less the same.
       | 
       | I'd like to be wrong but I can't help but feel that people
       | predicting a revolution are making the same, understandable
       | mistake as my hypothetical 1950s person.
        
         | jerpint wrote:
         | The internet has allowed us to interact in ways that were
         | inconceivable at the time; think communication and speed of
         | information for one.
         | 
         | When agents start being more reliable I think we will start
         | seeing applications we couldn't possibly anticipate today
        
         | arketyp wrote:
         | My take on this is that much of work and problem solving is
         | about understanding the problem. So I think human abilities
         | will remain the bottleneck. I pose this thought experiment: Is
         | it possible to design an AI system for a monkey which gives it
         | super-monkey abilities?
        
           | red75prime wrote:
            | For a monkey it's impossible to design... pretty much
            | anything besides a few simple tools. So, no. A monkey cannot
            | design a bow, a loom, a tractor, a computer, or an AI of any
            | kind.
            | 
            | We have designed many tools that beat us in various respects.
            | This is an invalid analogy.
        
         | bee_rider wrote:
         | The internet did change things pretty dramatically.
         | 
         | Productivity at information communication tasks just isn't the
         | entire economy.
         | 
         | I think we are massively more productive. Some of the biggest
         | new companies are ad companies (Google, Facebook), or spend a
         | ton of their time designing devices that can't be modified by
         | their users (Apple, Microsoft). Even old fashioned companies
         | like tractor and train companies have time to waste on
         | _preventing users from performing maintenance._ And then the
         | economy has leftover effort to jailbreak all this stuff.
         | 
         | We're very productive, we've just found room for unlimited zero
         | or negative sum behavior.
        
           | imachine1980_ wrote:
            | I feel you are mixing value capture with value generation. If
            | GM produced cars with the same level of margins as Facebook
            | or Google, things would be different. LVMH (the Louis Vuitton
            | group) holds a value equivalent to that of Toyota,
            | Volkswagen, and two-thirds of Ford combined. Louis Vuitton
            | alone was valued higher than Red Hat a few months ago. This
            | doesn't mean that Louis Vuitton is more valuable than Red
            | Hat, but rather that it captures value more effectively than
            | Red Hat.
        
             | eru wrote:
             | > This doesn't mean that Louis Vuitton is more valuable
             | than Red Hat, but rather that it captures Value more
             | effectively than Red Hat.
             | 
             | What definition of 'valuable' are you using here?
        
               | bee_rider wrote:
               | Probably something like market cap (although I guess it
               | would have to be based on the past now that Red Hat has
               | been bought), or there are nebulous measures of brand
               | value out there.
               | 
               | I think it is a fair point TBH, my original comment could
               | have been more clear about this aspect.
        
             | bee_rider wrote:
             | I think I may have just skipped a step or not expressed
             | myself very well.
             | 
             | What I'm saying is, I suspect information technology has
             | made classic production companies vastly more efficient and
             | productive. To the point where we can afford to have
             | massive companies like Facebook that are almost entirely
             | based on value capture.
             | 
             | That's my speculation at least. Your example puts me in a
             | tough spot, in the sense that Louis Vuitton is pretty old
             | and pretty big. I'd have to know more about the company to
             | quibble, and I don't feel like researching it. I wonder if
             | the proportion of their value that comes from pointless
             | fashion branding was originally smaller. Or if the whole
             | pointless fashion branding segment was originally just
             | smaller itself. But I'm just spitballing.
             | 
             | In the past we also had mercenary companies and the like to
             | capture value without producing much, so I could just be
             | wrong.
        
           | HarHarVeryFunny wrote:
           | > The internet did change things pretty dramatically.
           | 
           | For sure - I grew up in the mid-late 70s having to walk to
           | the library to research stuff for homework, parents having to
           | use the yellow-pages to find things, etc.
           | 
           | Maybe smartphones are more of a game changer than desk-bound
           | internet though - a global communication device in your
           | pocket that'll give you driving directions, etc, etc.
           | 
           | BUT ... does the world really FEEL that different now, than
           | pre-internet? Only sort-of - more convenient, more connected,
           | but not massively different in the ways that I imagine other
           | inventions such as industrialization, electricity, cars may
            | have done. The invention of the telephone and radio maybe
            | would have felt a bit like the internet - a convenience that
            | made you feel more connected, and maybe more startling for
            | being the first such capability?
        
             | toast0 wrote:
              | I would say that it feels different because the internet /
              | smartphones are more about giving everyone access to
              | inexpensive, high-bandwidth communication (nearly)
              | everywhere. But high-bandwidth communications have been
              | available everywhere for a long time, if you had a need
              | and were willing to pay for it --- TV news would bounce
              | signals off a satellite for on-scene reports, etc.
        
               | johngossman wrote:
               | It does feel different, but I don't think it's the
               | bandwidth, or even the availability. A newspaper is high
               | bandwidth and fairly inexpensive and ubiquitous but also
               | fairly high latency. The evening TV news was only once a
               | day until the 80s. One big change I noticed was 24-hour
               | news. Suddenly, it felt important to know about things
               | immediately. The web was different because it was
               | interactive--both in the sense that you could swiftly
               | switch between information sources and then in the social
               | media sense that everybody could participate, even if
               | participation meant flame wars.
               | 
               | And historically, TV news isn't that old, especially the
               | 24-hour variety. The Apollo landings and Vietnam War are
               | often cited as landmarks in TV news, where for the first
               | time large numbers of people watched things as they
               | occurred. But it's only about 25 years from those events
               | to Netscape Navigator, where the web became widely
               | available (at least in the developed world). That's a
               | long time in most people's lives, but I wouldn't be
               | surprised if future historians will see TV as something
               | like an early, one-way Internet.
        
             | bee_rider wrote:
             | I don't know really, I was a kid in the 90's.
             | 
              | This is a bit far from the economic aspect, but the world
              | currently seems to be utterly suffused with a looming
              | sense of dread, I think because we have, or know other
              | people have, news notifications in our pockets telling us
              | all about how bad things are.
             | 
             | I don't remember that feeling from the 90's, but then, I
             | was a kid. And of course before that there was the constant
             | fear of nuclear annihilation, which we've only recently
             | brought back really. Maybe growing up in the end of history
             | warped my perspective, haha.
        
               | HarHarVeryFunny wrote:
               | Yes - internet "news" is hardly a positive.
               | 
               | I grew up in the UK, so news was mainly from the BBC
               | which was pretty decent although bad news (e.g. IRA
                | bombings) was still front and center. US TV news doesn't
                | even pretend/try to be unbiased and is all about shock
                | value, reinforcing their viewers' political beliefs and
                | of course advertising (which the BBC didn't have, being
                | state funded).
               | 
               | Internet takes bad news and misinformation to a whole new
               | and massively distorted level.
               | 
               | I gave up watching TV many years ago (nowadays primarily
               | YouTube & Netflix for entertainment), and mostly just
               | skim headlines (e.g. Google news) to get an idea of
               | what's going on.
        
               | stupidcar wrote:
               | People who experienced a stable childhood seem to have a
               | natural tendency to view the period they grew up in as,
               | if not a golden age, then a safer, simpler time. Which
               | makes sense: You're too young to be aware of much of the
               | complexity of the world, and your parents provide most of
               | your essential needs and shield you from a lot of bad
               | stuff.
               | 
               | That's not to say all eras are the same. Clearly there's
               | better and worse times to be alive, but it's hard to be
               | objective about our childhoods.
        
               | HarHarVeryFunny wrote:
               | That's certainly all true, and not just parents shielding
               | you from bad stuff, but the bad stuff just not appearing
               | on the TV or in the newspaper the way it will today on TV
               | or internet. If it was going on then nobody was aware of
               | it, and maybe not a bad thing. Is my life really better
               | for reading about some teenage cartel hitman making human
               | "stew" etc ?
               | 
               | But I do think that perhaps the 70's was a somewhat more
               | decent time than today. Lines have been crossed and
               | levels of violence normalized that it seems really didn't
               | exist back then, or certainly were not as widespread.
               | e.g. I grew up with the IRA constantly in the news -
               | often bombings in the UK as well as violence in Northern
               | Ireland. But, by today's standard the IRA's terrorism was
               | almost quaint and gentlemanly ... they'd plant a bomb,
               | but then call it into the police and/or media so that
               | people could be evacuated - they still created
               | terror/disruption which realistically probably did help
               | them achieve their goals, but without the level of ultra
               | violence and complete disregard for human life that we
               | see today, such as ISIS beheadings posted on FaceBook or
               | Twitter that some people happily watch and forward to
               | their friends, or the 9/11 attack which was really
               | inconceivable beforehand.
        
               | incangold wrote:
               | I was a teenager in the 90s in a house that read the
               | Daily Mail every day, and that could deliver a similar
               | sense of dread.
               | 
               | But at least the dread was about things that seemed
               | vaguely tractable and somewhat local, rather than the
               | dizzyingly complex, global and existential threats the
               | news delivers these days.
               | 
               | And of course not everyone read newspapers as
               | intentionally-alarming as the Mail. Whereas now many more
               | people's information supply is mediated by channels with
               | that brief.
               | 
               | Feels to me like a double-whammy of the alarm-maximising
               | sections of the internet developing at the same time as
               | the climate crisis becomes more imminent, maybe?
        
             | johngossman wrote:
              | I once asked my mom, who grew up in the 1930s (aside: it
              | feels increasingly necessary to specify 19--), what was
              | the biggest technological change she had seen in her
              | lifetime.
             | Her immediate answer was 'indoor plumbing.' But her next
             | answer was the cellphone. She said cars and trains weren't
             | vastly different from when she was a kid, she almost never
             | went on a plane, and that people spent a lot of time
             | watching the TV and listening to the radio, but they used
             | their cellphones more and for far more things.
        
             | scrozart wrote:
             | > does the world really FEEL that different now, than pre-
             | internet?
             | 
             | Yes. You said it yourself: you used to have to WALK
              | somewhere to look things up. Added convenience isn't the
              | only side effect; that walk wasn't instantaneous. During
             | the intervening time, you were stimulated in other ways on
             | your trek. You saw, smelled, and heard things and people
             | you wouldn't have otherwise. You may have tried different
             | routes and learned more about your surroundings.
             | 
             | I imagine you, like I, grew up outside, sometimes with
             | friends from a street or two over, that small distance
             | itself requiring some exploration and learning. Running in
             | fresh air, falling down and getting hurt, brushing it off
             | because there was still more woods/quarry/whatever to see,
             | sneaking, imagining what might lie behind the next
              | hill/building; all of that mattered. The minutiae people are
             | immersed in today is vastly different in societies where
             | constant internet access is available than it was before,
             | and the people themselves are very different for it. My
             | experience with current teens and very young adults
             | indicates they're plenty bright and capable (30-somethings
             | seem mostly like us older folks, IMO), but many lack the
             | ability or desire to focus long enough to obtain real
             | understanding of context and the details supporting it to
             | really EXPERIENCE things meaningfully.
             | 
             | Admittedly anecdotal example: Explaining to someone why the
             | blue-ish dot that forms in the center of the screen in the
             | final scene of Breaking Bad is meaningful, after watching
             | the series together, is very disheartening. Extrapolation
             | and understanding through collation of subtle details seems
             | to be losing ground to black and white binaries easily
             | digested in minutes without further inquiry as to
             | historical context for those options.
             | 
             | I abhor broad generalizations, and parenting plays a large
             | part in this, but I see a concerning detachment among
              | whatever we're calling post-millennials, and that's a major,
             | real world difference coming after consecutive generations
             | of increasing engagement and activism confronting the real
             | problems we face.
        
             | TaylorAlexander wrote:
             | Considering that I work in open source robotics I literally
             | couldn't do my job without the internet. So that feels
             | pretty different!
        
             | hibikir wrote:
             | For me, it's incredibly different. I moved to the US from
             | Spain back when the best internet we could get at home was
             | 3kb/sec, and we liked it (yes kids, close to a million
             | times slower than today). I recall the massive cultural and
             | economic detachment of that move: Minimal shared culture.
             | Major differences in food availability: Often I couldn't
             | even cook what I wanted if I didn't smuggle the
             | ingredients. Connecting with people with shared interests
             | was really difficult, as discovering communities was a lot
             | of work: Even more so in America, where I needed a car for
             | everything, and communities lacked the local gossiping
             | infrastructure that I relied on at home.
             | 
             | Today, I got to do some miniature painting while hanging
             | out on video with someone in England. I get to buy books
              | digitally the same day they are published, and I don't
              | have to travel with a suitcase full of them, plus a CD
              | collection, for a 1-month vacation. My son can talk to his
              | grandma, on video, whenever he likes: too cheap to meter.
              | Food? I can find an importer that already has what I want
              | most of the time, and if not, I can get anything shipped,
              | from anywhere. A board game from Germany, along with some
             | cookies? Trivial. Spanish TV, including soccer games, which
             | before were impossible. My hometown's newspaper, along with
             | one from Madrid, and a few international ones.
             | 
             | An immigrant in the 90s basically left their culture behind
             | with no recourse. Today I can be American, and a Spaniard,
             | at the same time with minimal loss of context by being
             | away. All while working on a product used by hundreds of
             | millions of people, every day, with a team that spans 16
             | timezones, yet manages to have standups.
             | 
             | A lot of people's lives haven't changed that much, because
             | their day to day is still very local. If you work at the
             | oil field, and then go to the local high school to watch
             | your kid's game on friday night, and all your family is
             | local, a big part of your life wouldn't have been so
             | different in the 90s, or even in the 60s. But I look at the
              | things my family did this week that I couldn't have possibly
             | done in 98, and it's most of my life. My dad's brain would
             | have melted if he could hear a description of the things I
             | get to do today that were just sci-fi when he died. It's
             | just that the future involved fewer people wielding katanas
             | in the metaverse than our teenage selves might have liked.
        
             | corethree wrote:
             | It's because the change happened slowly. So it feels like
             | nothing has changed.
             | 
             | Another thing that's changed is engineering. The US has
             | moved up the stack. Engineering is now mostly software
             | development and within that it's mostly web development.
              | Engineering and manufacturing have largely moved overseas to
             | Asia and that's where most of the expertise lies. The only
             | thing off the top of my head that the US still dominates in
             | engineering is software/aerospace/defense. In general
             | though everything else is dominated by Asia, if you want
             | the top hardware technology the US is no longer the place
             | to get it. In Silicon Valley there used to be a good mix of
             | different types of engineers, now everyone is SWE, and most
             | likely doing web stuff. But here's the thing, you most
             | likely wouldn't have noticed this unless you thought hard
             | about it because either you're too young or because the
             | change happened so slowly.
             | 
              | The same will be true for AGI if it comes to fruition. A lot
             | of jobs will be replaced, slowly. Then when AGI replacement
             | reaches saturation most people will be used to the status
             | quo whether it's better or worse. It will seem like nothing
             | has changed.
        
           | Negitivefrags wrote:
           | I remember long ago reading an argument that information
           | technology has not actually increased productivity. I really
           | wish I could find a source for this now, but I just can't
           | seem to find it anywhere on the internet. Here it is anyway:
           | 
           | The administration of the Tax Service uses 4% of the total
           | tax revenue it generates. This percentage has stayed
           | relatively fixed over time.
           | 
           | If IT really improved productivity, wouldn't you expect that
           | that number would decrease, since Tax Administration is
           | presumably an area that we should expect to see great gains
           | from computerisation?
           | 
           | We should be able to do the same amount of work more
           | efficiently with IT, thus decreasing the percentage. If
           | instead the efficiency frees up time allowing more work to be
           | done (because there are people dodging taxes and we need to
           | discover that), then you should expect the amount of tax to
           | increase relatively which should also cause the percentage to
           | decrease.
           | 
           | Therefore IT has not increased productivity.
           | 
            | Either it doesn't increase productivity directly, or it does,
            | but all the efficiency gains are immediately consumed by more
            | useless bureaucracy.
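            | 
            | The shape of the argument, in toy numbers (purely
            | illustrative): if revenue triples and administration cost
            | triples with it, the 4% ratio never moves, so no
            | productivity gain shows up in that measure.
            | 
            |   revenue_then, admin_then = 100.0, 4.0
            |   revenue_now,  admin_now  = 300.0, 12.0
            | 
            |   print(admin_then / revenue_then)   # 0.04
            |   print(admin_now / revenue_now)     # 0.04 -- unchanged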
        
             | FirmwareBurner wrote:
             | _> Either it doesn't do so directly, or it does do so
             | directly, but all the efficiency gains are immediately
             | consumed by more useless beurocracy._
             | 
             | That's how government digitalization has functioned in my
             | country. It hasn't improved things, it just moved all the
             | paper hassle to a digital hassle now where I need to go to
             | Reddit to find out how to use it right and then do a back
             | and forth to get it right. Same with the new digitalization
              | of medical activities: a lot of doctors I know say it
              | actually slows them down instead of making them more
              | productive, as they're now drowning in even more
              | bureaucracy.
             | 
             | So depending on how you design and use your IT systems,
              | they can improve things for you if done well, but they can
             | also slow you down if done poorly. And they're more often
             | done poorly than great because the people in charge of
             | ordering and buying them (governments, managers, execs,
             | bean counters, etc) are not the same people who have to use
             | them every day (doctors, taxpayers, clerks, employees in
             | the trenches, etc).
             | 
             | I kind of feel the same way about the Slack "revolution".
             | It hasn't made me more productive compared to the days when
             | I was using IBM Lotus Sametime. Come to think of it, Slack
             | and Teams, and all these IM apps designed around constant
              | group chatting instead of 1-1, are actually making me less
              | productive since they're full of SO .... MUCH ... NOISE that
             | I need to go out of my way to turn off or tune out in order
             | to get any work done.
             | 
              | The famous F1 aerodynamics engineer Adrian Newey doesn't
              | even use computers: he has his secretary print out his
              | emails every day, which he reads at home and replies to
              | through his secretary the next day, and he draws
              | everything by hand on the drafting board and has the
              | people below him redraw it in CAD and send him the printed
              | simulation results through his secretary. And guess what:
              | his cars have been world-class, winning designs. So more
              | IT and more synchronous communication doesn't necessarily
              | mean more results.
        
             | bee_rider wrote:
             | Hmm, I'm not sure I buy it, because I'm not sure what
             | additional effort applied to tax administration looks like.
             | 
             | Perhaps we could be optimistic about people and assume the
             | amount of real, legitimate tax fraud and evasion is pretty
             | low. If we took the latter scenario you present--increasing
              | efficiency means the same number of people will do more
             | work--and assumed this effort is instead applied to
             | decreasing the number of random errors (which might result
             | in someone overpaying or underpaying), we wouldn't
             | necessarily expect to see a change in the expected value of
             | the taxes. But, it could be "better" in the sense that it
             | is more fair.
        
             | SgtBastard wrote:
             | >Either.
             | 
             | A third option: technology investments improved the
             | efficiency of the previous tax base, which allowed the
             | expansion of the tax base - through additional enforcement
             | activity, increasing the tax base in absolute terms but
             | also returning the overhead to its historical norms.
             | 
              | Without tracking the size of the tax base in inflation-
              | adjusted terms, this is hard to account for.
              | 
              | (Then again, cynically, you're probably right re: useless
              | bureaucratic expansion.)
        
         | _a_a_a_ wrote:
         | > but we're living in the same world with the same general
         | patterns and limitations
         | 
         | seems odd. What 'patterns' and 'limitations' do you still see?
         | Because I see so much has changed.
        
         | lysecret wrote:
          | Good point. To me the internet was just "other people"; what
          | differentiated it is not the 4 people you know but,
          | potentially, (almost) all other people.
          | 
          | With AI, the way I see it, it is just virtual other people. Of
          | course, a bit stranger, but more similar than you think.
        
           | david_allison wrote:
           | There's currently little to no learning or feedback loop due
           | to the relatively small context window sizes.
           | 
           | I've done many language exchanges with people using Google
           | Translate and the lack of improvement/memory of past
           | conversations is a real motivation killer; I'm concerned this
           | will move on to general discourse on the internet with the
           | proliferation of LLMs.
           | 
           | I'm sure many people have already gone around in circles with
           | rules-based customer support. AI can make this worse.
        
         | herval wrote:
         | > And yet, 70 years later, things have certainly changed, but
         | we're living in the same world with the same general patterns
         | and limitations. With LLMs I expect something similar. Not a
         | singularity, just a new, better tool that, yes, changes things,
         | increases productivity, but leaves human societies more or less
         | the same.
         | 
         | by what criteria do you see the world as the same today vs 70
         | years ago?
        
           | ketzo wrote:
           | I mean, very broad strokes, but I can see GP's point.
           | 
           | - people eat plants and animals
           | 
           | - people pay money for goods and services
           | 
           | - there are countries, sometimes they fight, sometimes they
           | work together
           | 
           | - men and women come together to create children, and often
           | raise those children together
           | 
           | etc, etc, etc
           | 
           | The "bones" of what make up a capital-S Society are pretty
           | much the same. None of these things _had_ to stay the same,
           | but they have so far.
        
             | majkinetor wrote:
             | VERY broad strokes. We also still have a Sun, and the
             | stars.
             | 
              | The internet and the last 30 years of tech did change things
             | dramatically. I bet that most people would feel handicapped
             | if they were teleported just 50 years back. We got into
             | this type of life progressively, so people didn't notice
              | the change, even though it was dramatic. The same
              | phenomenon of gradual change happens on a physiological
              | level too; this is no different.
        
             | herval wrote:
             | I mean, has _any_ change in _human history_ impacted those
             | considerably? This argument is like saying we live the same
             | way the cavemen did...
        
               | jodrellblank wrote:
               | I'm not the original commenter, but moving from nomadic
               | tribes to stable settlements, moving from hunter
               | gathering to agriculture, moving from almost everyone
               | subsistence farming to the introduction of money at all,
                | to most people working unrelated jobs and trading
               | money for food[2], moving from multigenerational homes to
               | nuclear families to sending kids to schools and daycares,
               | moving from tribal lands to countries with a national
               | identity of their own which you are supposed to have some
               | kind of loyalty to - over and above the king/warlord you
               | trade protection with.
               | 
               | As well as those, the change from food and goods being
               | scarce to abundant roughly corresponding with the
               | industrial revolution (abundant textiles and clothes) and
                | the early to mid 1900s (factories), labour receding from
                | sunrise-to-sundown toil to a working week with days
               | off (various, but early 1900s official 5 day week[1] and
               | 8 hour day), changing to the more recent thing where both
               | parents have to work to get enough income while the child
               | is away all day, massively increased free time
               | (particularly household chore automation - electricity,
               | light, central heating, food mixers, washing machines,
               | mostly early to mid 1900s).
               | 
               | Compared to those things, the internet gets you something
               | else to read or watch (instead of TV, newspaper, book,
               | radio) and some other way to talk (instead of letter,
               | telegram, postcard, telephone). Yes the organisation of
               | things happens quicker and information comes from farther
               | away, and can be more up to date, but you spend your time
               | sitting in a chair watching or reading (office, home,
               | school) like you did before, you buy things and have them
               | delivered or go collect them (like you did before), you
               | consult maps and directories and consumer advice and
               | government documents (like you did before), you take and
               | share holiday photos (like before). It's different, but
               | it's not _all that different_.
               | 
               | [1] https://www.bbc.co.uk/bitesize/articles/zf22kmn (1932
               | in America)
               | 
               | [2] https://researchbriefings.files.parliament.uk/documen
               | ts/SN03... - the UK had 1.7M people working in farming in
               | 1851, down to 182k today while the population has roughly
               | 4x'd in the same time.
        
               | ketzo wrote:
               | Some people claim AGI will. If you believe in the heights
               | of "singularity" talk, we should expect some pretty
               | fundamental changes to the basics of our lives.
               | 
               | Not sure how much stock I put in that, though.
        
         | HarHarVeryFunny wrote:
         | I think AGI can change the world once it gets way beyond human
         | level both in terms of types of beyond-human "senses" and
         | pattern matching/prediction (i.e. intelligence), but we are
         | nowhere near that yet.
         | 
         | On their current trajectory LLMs are just expert systems that
         | will let certain types of simple job be automated. A potential
         | productivity amplifier similar to having a personal assistant
         | that you can assign tasks to. Handy (more so for people doing
         | desk-bound jobs than others), but not a game changer.
         | 
         | An AGI far beyond human capability could certainly accelerate
         | scientific advance and help us understand the world (e.g. how
         | to combat climate change, how to address international
         | conflicts, how to handle pandemics), and so be very
         | beneficial, but what that would feel like to us is hard to
         | guess. We get used to slowly (or even not so slowly)
         | introduced changes very quickly and just accept them, even
         | though today's tech would look like science fiction 100 years
         | ago.
         | 
         | What would certainly be a game changer, and presumably will
         | eventually come (maybe only in hundreds of years?) would be if
         | humans eventually relinquish control of government, industry,
         | etc to AGIs. Maybe our egos will cause us to keep pretending
         | we're in control - we're the ones asking the oracle, we could
         | pull the plug anytime (we'll tell ourselves) etc, but it'll be
         | a different world if all the decisions are nonetheless coming
         | from something WAY more intelligent than ourselves.
        
           | HarHarVeryFunny wrote:
           | Odd to see this down-voted... I guess my prediction of the
           | future has rubbed someone the wrong way, but if you disagree
           | then why not just reply?!
        
         | cultureswitch wrote:
         | The internet did change things dramatically, but the change
         | wasn't as dramatic as industrialization. And that one matured
         | over two centuries.
        
         | Aerbil313 wrote:
         | Technology is _the_ one force that drives modern human
         | societies, Western ones even more. The world has changed
         | dramatically, especially with smartphones. I suggest reading
         | Ted Kaczynski.
        
         | hardwaregeek wrote:
         | It's important to remember that the internet is still very very
         | new. Like the generation of digital natives are barely in
         | adulthood. Sure, it's existed in some form for about 40 years,
         | but most of the world didn't have access for the longest time.
         | I wouldn't be surprised if we see massive changes in the next
         | 20 years from the people who grew up on the web (specifically
         | people outside the United States and Europe, where access was
         | harder for a long time).
        
           | Jensson wrote:
           | "Digital native" are the people who grew up with computers.
           | Many kids born in 1980's and later grew up with computers in
           | their earliest memories.
           | 
           | I'd call the current generation "Social media natives",
           | because that is the biggest difference from the previous
           | generation. 90s kids grew up with games and communication,
           | but they were free from facebook, youtube and instagram.
        
         | throwup238 wrote:
         | By that standard, nothing has meaningfully changed since
         | agriculture and domesticated animals. We're still killing each
         | other, forming hierarchical societies, passing down stories,
         | eating, drinking, sleeping, and making families - except now
         | we're killing each other from afar with gunpowder, forming
         | those hierarchies using the guise of democracy or whatever,
         | passing down stories in print rather than speech, can use
         | condoms to control when we make families, and so on.
         | 
         | Human civilization has accumulated many layers of systems since
         | then and the internet changed _all of them_ to the point that
         | many are barely recognizable. Just ask someone who's been in
         | prison since before the internet was a thing - there are plenty
         | of them! They have extreme difficulty adapting to the outside
         | world after they've been gone for forty or fifty years.
        
         | amelius wrote:
         | Imagine explaining to someone from 1950 that we now all have a
         | TV-set on our office desks, with 1000+ channels ...
         | 
         | I bet their reaction would be a facepalm.
        
         | beebmam wrote:
         | > leaves human societies more or less the same
         | 
         | My mom, who is 70 years old, regularly tells me how profoundly
         | transformative the internet has been for society.
        
         | lnxg33k1 wrote:
         | Imagine telling those same people in the 50s that all those
         | changes in productivity would come for the benefit of no one
         | since the work week would be the same and purchasing power
         | would decline
        
         | yashap wrote:
         | Depends on the quality of the AGI. If it's legitimately as good
         | or better than humans at almost everything, while being cost
         | effective, it will utterly and completely change society.
         | Humans will be obsolete at almost every job - why pay a human
         | if an AGI can do it as well or better, for free(-ish)? Best
         | case scenario, the AGI is benevolent, traditional work is gone,
         | but we find some post-capitalism system, and new ways to keep
         | life interesting/meaningful. Worst case scenario, pure sci-fi
         | dystopia.
         | 
         | If it's closer to a midpoint between GPT-4 and true human
         | intelligence, then sure, I agree with you, it's a significant
         | change to society but not an overhaul. But if it's actually a
         | human level (or better) general intelligence, it'll be the
         | biggest change to human society maybe ever.
        
       | HarHarVeryFunny wrote:
       | I'm not sure how one can percentage-wise compare scaling and
       | algorithmic advances - per Dwarkesh's prediction that "70%
       | scaling + 30% algorithmic advance" will get us to AGI?!
       | 
       | I think a clearer answer is that scaling alone will certainly NOT
       | get us to AGI. There are some things that are just
       | architecturally missing from current LLMs, and no amount of
       | scaling or data cleaning or emergence will make them magically
       | appear.
       | 
       | Some obvious architectural features from top of my list would
       | include:
       | 
       | 1) Some sort of planning ahead (cf. tree-of-thought rollouts),
       | which could be implemented in a variety of ways; a toy sketch of
       | the rollout idea follows after this list. A simple single-pass
       | feed-forward architecture, even a sophisticated one like a
       | transformer, isn't enough. In humans this might be accomplished
       | by some combination of short-term memory and the thalamo-cortical
       | feedback loop - iterating on one's perception/reaction to
       | something before "drawing conclusions" (i.e. making predictions)
       | based on it.
       | 
       | 2) Online/continual learning so that the model/AGI can learn from
       | its prediction mistakes via feedback from their consequences,
       | even if that is initially limited to conversational feedback in a
       | ChatGPT setting. To get closer to human-level AGI the model would
       | really need some type of embodiment (either robotic or in a
       | physically simulated virtual world) so that its actions and
       | feedback go beyond a world of words and let it learn via
       | experimentation how the real world works and responds. You really
       | don't understand the world unless you can touch/poke/feel it, see
       | it, hear it, smell it, etc. Reading about it in a book/training
       | set isn't the same.
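       | 
       | To make (1) concrete, here is a minimal, hypothetical sketch of
       | planning via rollouts: expand several candidate continuations,
       | score them, and only keep the most promising branches. The
       | llm_continue and score functions below are placeholder stand-ins
       | for a model call and a value/critique step, not any real API:
       | 
       |     # Toy tree-of-thought-style planner (Python sketch).
       |     # llm_continue and score are hypothetical placeholders.
       |     from typing import List
       | 
       |     def llm_continue(state: str, n: int) -> List[str]:
       |         # Placeholder: ask the model for n candidate next steps.
       |         return [f"{state} -> step{i}" for i in range(n)]
       | 
       |     def score(state: str) -> float:
       |         # Placeholder: a value model or self-critique prompt.
       |         return float(len(state))
       | 
       |     def plan(problem: str, depth: int = 3, branch: int = 3,
       |              beam: int = 2) -> str:
       |         """Keep a small frontier of the best partial plans."""
       |         frontier = [problem]
       |         for _ in range(depth):
       |             candidates = [c for s in frontier
       |                           for c in llm_continue(s, branch)]
       |             frontier = sorted(candidates, key=score,
       |                               reverse=True)[:beam]
       |         return frontier[0]  # commit only after looking ahead
       | 
       | A real implementation would branch and backtrack more cleverly,
       | but this is the shape of "thinking before answering" that a
       | single forward pass can't do.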
       | 
       | I think any AGI would also benefit from a real short term memory
       | that can be updated and referred to continuously, although
       | "recalculating" it on each token in a long context window does
       | kind of work. In an LLM-based AGI this could just be an internal
       | context, separate from the input context, but otherwise updated
       | and addressed in the same way via attention.
       | 
       | It depends too on what one means by AGI - is this implicitly
       | human-like (not just human-level) AGI? If so then it seems there
       | are a host of other missing features too. Can we really call
       | something AGI if it's missing animal capabilities such as emotion
       | and empathy (roughly = predicting other's emotions, based on
       | having learnt how we would feel in similar circumstances)? You
       | can have some type of intelligence without emotion, but that
       | intelligence won't extend to fully understanding humans and
       | animals, and therefore being able to interact with them in a way
       | we'd consider intelligent and natural.
       | 
       | Really we're still a long way from this type of human-like
       | intelligence. What we've got via pre-trained LLMs is more like
       | IBM Watson on steroids - an expert system that would do well on
       | Jeopardy and increasingly well on IQ or SAT tests, and can fool
       | people into thinking it's smarter and more human-like than it
       | really is, just as much simpler systems like Eliza could. The
       | Turing test of "can it fool a human" (in a limited Q&A setting)
       | really doesn't indicate any deeper capability than exactly that
       | ability. It's no indication of intelligence.
        
       | revskill wrote:
       | No, humans are not intelligent enough to create a super-
       | intelligent bot in a short time.
       | 
       | My estimate is that it will take about 200 years to get a
       | "human-brain AI" that works.
       | 
       | All ideas should be treated equally, not based on revenue
       | metrics. If everyone could make a Youtube clone, the revenue
       | should be divided equally among all creators; that's the way
       | the world should move forward, instead of monopoly.
       | 
       | Everything will suck, forever.
        
       | tw1984 wrote:
       | LLMs are going to bring tons of cool applications, but AGI is not
       | an application!
       | 
       | You can feed your dog 100,000 times a day, but that won't make it
       | a 1,000kg dog. The whole idea that AGI can be achieved by
       | predicting the next word is just pure marketing nonsense at best.
        
       | xbar wrote:
       | Yes, for some things.
        
       | machiaweliczny wrote:
       | I think there's a need to separate knowledge from the learning
       | algorithm. There needs to be a latent representation of
       | knowledge that models attend to, but the way it's done right now
       | (with my limited understanding) doesn't seem to be it.
       | Transformers seem to only attend to previous text in the
       | context, not to the whole knowledge they possess, which is an
       | obvious limitation IMO. The human brain probably also doesn't
       | attend to its whole knowledge but loads something into context,
       | so maybe it's fixable without changing the architecture.
       | 
       | LLMs can already do data extraction, so one could build a
       | Prolog DB and update it as the model consumes data, then
       | translate any logic problems into Prolog queries. I want to see
       | this in practice.
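       | 
       | A rough, untested sketch of that pipeline, using pyswip as the
       | Prolog bridge - the extraction step is a hypothetical LLM call,
       | and the facts are hard-coded here for illustration:
       | 
       |     # Keep extracted knowledge as Prolog facts and answer logic
       |     # questions as Prolog queries. Needs SWI-Prolog plus pyswip.
       |     from pyswip import Prolog
       | 
       |     def extract_facts(text):
       |         # Hypothetical: prompt an LLM to emit Prolog clauses.
       |         return ["parent(alice, bob)", "parent(bob, carol)"]
       | 
       |     db = Prolog()
       |     db.assertz(
       |         "grandparent(X, Z) :- parent(X, Y), parent(Y, Z)")
       |     for fact in extract_facts("Alice is Bob's mother. "
       |                               "Bob is Carol's father."):
       |         db.assertz(fact)
       | 
       |     # "Who is Carol's grandparent?" as a Prolog query:
       |     print(list(db.query("grandparent(G, carol)")))
       |     # expected output: [{'G': 'alice'}]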
       | 
       | Similarly with the use of logic engines and computation/programs.
       | 
       | I also think that RL can come up with better training functions
       | for LLMs. In the programming domain, for example, one could ask
       | an LLM to think of all possible tests for a given piece of code
       | and evaluate them automatically.
       | 
       | I was also thinking about using a diffusER-style pattern where
       | programming rules are kind of hardcoded (similar to
       | add/replace/delete, but instead an algebra on
       | functions/variables). That's probably not the AGI path but
       | could be good for producing programs.
        
       | sgt101 wrote:
       | >Here's one of the many astounding finds in Microsoft Research's
       | Sparks of AGI paper. They found that GPT-4 could write the LaTex
       | code to draw a unicorn.
       | 
       | A lot of people have tried to replicate this; I have tried. It's
       | very hard to get GPT-4 to draw a unicorn, and asking it to draw
       | an upside-down unicorn is even harder.
        
         | midlightdenight wrote:
         | The model of GPT-4 those researchers had was not the same
         | that's available to the public. It's assumed it was far more
         | capable before alignment training (or whatever it's called).
        
         | nyrikki wrote:
         | Stochastic parrots are going to stochasticate.
         | 
         | The author also cited a few human-assisted efforts as ML-only.
         | 
         | The fact that the author is surprised that GPT is better at
         | falsifying user input while it struggles with new ideas
         | demonstrates that those who are hyping LLMs as getting us
         | closer to strong AI don't know, or are ignoring, the known
         | limitations of problems like automated theorem proving.
         | 
         | I think generative AI is powerful and useful. But the "AGI is
         | near" camp is starting to make it a hard sell, because the
         | general public is discovering the limits and people are trying
         | to force it into inappropriate domains.
         | 
         | Over-parameterization and double descent are great at
         | expanding what these models can do, but I haven't seen
         | anything that justifies the AGI hype yet.
        
         | RaftPeople wrote:
         | A person commenting on this topic at a different site mentioned
         | that there is a lot of content on the internet around how to
         | draw (animals?) with LaTeX and a different tool (can't remember
         | the name), so it's unclear if GPT is just regurgitating or if
         | it's generalizing.
        
       | lossolo wrote:
       | A few more interesting papers not mentioned in the article:
       | 
       | "Faith and Fate: Limits of Transformers on Compositionality"
       | 
       | https://arxiv.org/abs/2305.18654
       | 
       | "Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning
       | Tasks":
       | 
       | https://arxiv.org/abs/2311.09247
       | 
       | "Embers of Autoregression: Understanding Large Language Models
       | Through the Problem They are Trained to Solve"
       | 
       | https://arxiv.org/abs/2309.13638
       | 
       | "Pretraining Data Mixtures Enable Narrow Model Selection
       | Capabilities in Transformer Models"
       | 
       | https://arxiv.org/abs/2311.00871
        
       | cavisne wrote:
       | If the size of the internet is really a bottleneck it seems
       | Google is in quite a strong position.
       | 
       | Assuming they effectively have a log of the internet, then
       | rather than counting the current state of the internet as usable
       | data, we should be thinking about the list of diffs that make up
       | the internet.
       | 
       | Maybe this ends up like Millennium Management, where a key
       | differentiator is having access to deleted datasets.
        
         | nextworddev wrote:
         | True. That said, market structure changes so rapidly that old
         | datasets aren't that useful for most strategies.
        
         | jeremyjh wrote:
         | I'd guess at most they have 5x more data, but it is probably
         | nowhere near that, and the article says 100,000x more data is
         | needed.
        
       | nextworddev wrote:
       | I am in the believer camp for two simple reasons: 1) we haven't
       | even scratched the surface of government-led investment into AI,
       | and 2) AI itself could probably discover better architectures
       | than transformers (willing to bet heavily on this).
        
         | diggan wrote:
         | > AI itself could probably discover better architectures than
         | transformers (willing to bet heavily on this)
         | 
         | Are there any existing cases of LLMs coming up with novel,
         | useful and, specifically, better architectures? Either related
         | to AI/ML itself or any other field.
        
         | jeremyjh wrote:
         | > AI itself could probably discover better architectures than
         | transformers
         | 
         | The entire subject of the article is concerned with what it
         | will take and how likely it is that an AI will ever be able
         | to generate improvements like this.
        
       | zoogeny wrote:
       | I was thinking last night about LLMs with respect to Wittgenstein
       | after watching this interesting discussion of his philosophy by
       | John Searle [1].
       | 
       | I think Wittgenstein's ideas are pertinent to the discussion of
       | the relation of language to intelligence (or reasoning in
       | general). I don't mean this in a technical sense (I recall
       | Chomsky mentioning that almost no ideas from Wittgenstein
       | actually have a place in modern linguistics) but from a
       | metaphysical sense (Chomsky also noted that Wittgenstein was one
       | of his formative influences).
       | 
       | The video I linked is a worthy introduction and not too long so I
       | recommend it to anyone interested in how language might be the
       | key to intelligence.
       | 
       | My personal take, when I see skeptics of LLMs approaching AGI, is
       | that they implicitly reject a Wittgenstein view of metaphysics
       | without actually engaging with it. There is an implicit Cartesian
       | aspect to their world view, where there is either some mental
       | aspect not yet captured by machines (a primitive soul) or some
       | physical process missing (some kind of non-language _system_ ).
       | 
       | Whenever I read skeptical arguments against LLMs they are not
       | credibly evidence based, nor are they credibly theoretical. They
       | almost always come down to the assumption that language alone
       | isn't sufficient. Wittgenstein was arguing long before LLMs were
       | even a possibility that language wasn't just sufficient, it was
       | inextricably linked to reason.
       | 
       | What excites me about scaling LLMs is that we may actually build
       | evidence that supports (or refutes) his metaphysical ideas.
       | 
       | 1.
       | https://www.youtube.com/watch?v=v_hQpvQYhOI&ab_channel=Philo...
        
       | bob1029 wrote:
       | I think the "self-play" path is where the scary-powerful AI
       | solutions will emerge. This implies persistence of state and
       | logic that lives external to the LLM. The language model is just
       | one tool. AGI/ASI/whatever will be a _system_ of tools, of which
       | the LLM might be the _least_ complicated one to worry about.
       | 
       | In my view, domain modeling, managing state, knowing when to
       | transition _between_ states, techniques for final decision
       | making, consideration for the time domain, and prompt engineering
       | are the real challenges.
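       | 
       | As a rough illustration of the "LLM is just one tool inside a
       | stateful system" idea - everything here is a hypothetical
       | placeholder, not a real framework:
       | 
       |     # Toy agent loop: persistent state, the tool registry and
       |     # the transition logic live outside the model; the LLM only
       |     # proposes the next action. call_llm and the tools are
       |     # hypothetical stand-ins.
       |     from dataclasses import dataclass, field
       | 
       |     def call_llm(prompt):
       |         return "SEARCH: cheap GPUs"   # placeholder model call
       | 
       |     TOOLS = {
       |         "SEARCH": lambda arg: f"results for {arg!r}",
       |         "NOTE":   lambda arg: f"saved note: {arg}",
       |     }
       | 
       |     @dataclass
       |     class AgentState:
       |         goal: str
       |         history: list = field(default_factory=list)
       |         done: bool = False
       | 
       |     def step(state):
       |         """One transition: the model proposes, the outer system
       |         decides, records and moves the state machine along."""
       |         reply = call_llm(f"Goal: {state.goal}\n"
       |                          f"History: {state.history}")
       |         if reply.startswith("DONE"):
       |             state.done = True
       |         else:
       |             name, _, arg = reply.partition(": ")
       |             tool = TOOLS.get(name)
       |             result = tool(arg) if tool else "unknown tool"
       |             state.history.append((reply, result))
       |         return state
       | 
       | The point being: the loop, the state, and the stopping rule are
       | ordinary software; the language model is just the piece that
       | fills in the next move.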
        
         | lern_too_spel wrote:
         | It's not necessary for the author's purpose of providing more
         | data. We're only training on one kind of input so far, text,
         | from which these models have built some understanding of the
         | world. Humans train on more inputs, and the data to provide
         | those inputs for training a model is readily available, in far
         | larger quantities than individual human brains consume. Data is
         | not the issue.
        
       ___________________________________________________________________
       (page generated 2023-12-27 23:00 UTC)