[HN Gopher] About That OpenAI "Breakthrough"
       ___________________________________________________________________
        
       About That OpenAI "Breakthrough"
        
       Author : passwordoops
       Score  : 21 points
       Date   : 2023-11-23 16:59 UTC (6 hours ago)
        
 (HTM) web link (garymarcus.substack.com)
 (TXT) w3m dump (garymarcus.substack.com)
        
       | gwnywg wrote:
       | I find it weird to see people calling A* an 'AI technique'... Is
        | BFS or DFS an AI technique too? I thought it was merely a search
        | algorithm...
        
         | svachalek wrote:
         | AI is the cutting edge of CS. When it's well understood it
          | stops being AI. Someday LLMs and neural nets will get the same
          | reaction... who calls an LLM "AI"? It's just a formula!
        
         | ashiban wrote:
          | Most likely for historical reasons: A* was used a lot in game
          | pathfinding for in-game AI characters.
        
         | furyofantares wrote:
         | I first learned it in the famous book "Artificial Intelligence:
          | A Modern Approach" by Stuart Russell and Peter Norvig. I think
          | 3 chapters are dedicated to search. A lot of AI can be framed
          | as search, and a whole lot of historical AI, or mundane
          | present-day AI, is straightforward search.
        
         | Ukv wrote:
         | When it was published, A* was considered within the field of
         | AI. The paper[0]'s authors were from Stanford's AI group.
         | 
         | The "AI effect"[1] describes how the field tends to get left
         | with only the cutting-edge research.
         | 
         | [0]: http://ai.stanford.edu/~nilsson/OnlinePubs-
         | Nils/PublishedPap...
         | 
         | [1]: https://en.wikipedia.org/wiki/AI_effect
        
           | T-A wrote:
           | By way of [1], here [2] is a paper about A* also posted under
           | CS/AI. It introduces "Q* search, a search algorithm that uses
           | deep Q-networks to guide search" and uses it "to solve the
           | Rubik's cube", the thing which Gary Marcus complained about
           | OpenAI not having done with a neural network in 2019.
           | 
            | As pointed out by techbro92 [3], the authors do not seem to
            | be affiliated with OpenAI, but AFAIK nobody claimed that
            | OpenAI _invented_ Q*. It's not hard to imagine OpenAI trying
            | it out with their Rubik's-cube-solving robot hand and then
            | starting to think about other things it might be applied to.
           | 
           | [1] https://news.ycombinator.com/item?id=38395959
           | 
           | [2] https://arxiv.org/abs/2102.04518
           | 
           | [3] https://news.ycombinator.com/item?id=38398243
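            | 
            | For context, a rough sketch of classic A*; the Q* search idea
            | in [2] is, loosely, to let a deep Q-network play the role of
            | the heuristic that guides the search. The neighbors() and
            | heuristic() callables here are generic placeholders, not
            | anything from the paper:
            | 
            |     import heapq
            |     from itertools import count
            | 
            |     def a_star(start, goal, neighbors, heuristic):
            |         # neighbors(n) -> iterable of (next_node, step_cost)
            |         # heuristic(n, goal) -> estimated remaining cost
            |         tie = count()  # tie-breaker so nodes never compare
            |         frontier = [(heuristic(start, goal), next(tie),
            |                      0, start)]
            |         best = {start: 0}
            |         while frontier:
            |             _, _, cost, node = heapq.heappop(frontier)
            |             if node == goal:
            |                 return cost  # cheapest cost found
            |             for nxt, step in neighbors(node):
            |                 new_cost = cost + step
            |                 if new_cost < best.get(nxt, float("inf")):
            |                     best[nxt] = new_cost
            |                     priority = new_cost + heuristic(nxt, goal)
            |                     heapq.heappush(frontier, (priority,
            |                                    next(tie), new_cost, nxt))
            |         return None  # goal unreachable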
        
         | naasking wrote:
         | Search is 100% key to AI, even historically. Prolog has its
          | roots in AI research, and Prolog evaluation combines
          | unification with backtracking search.
        
         | lagrange77 wrote:
         | I think (probabilistic) logic programming also relies on graph
         | traversal/path searching, which is part of the symbolic AI
         | paradigm.
        
         | flohofwoe wrote:
         | Probably because of A*'s popularity for path finding in game
         | development. Everything that imitates human-like behaviour in
         | games is called "AI", even when most of it is just a big messy
          | hairball of ad hoc if-else code under the hood ;)
        
       | Buttons840 wrote:
        | I'm trying to refresh my mind on what Q-learning is, if I may
        | think out loud here?
       | 
        | Q is an evaluation of how good an action is in a given state,
        | assuming the agent acts optimally going forward. In chess, for
        | example, sacrificing a rook to capture a queen would have a high
        | Q value: giving up the rook is bad, but gaining the queen later
        | is better. Q values are supposed to see past immediate
        | consequences and reflect the long-term consequences.
       | 
        | How do we find a Q value though? We can look several steps ahead,
        | or we can look to the end of the game, etc. These aren't real Q
        | values though, because optimal actions probably weren't taken. We
        | can use the Bellman equation, which roughly nudges one Q value a
        | small amount toward the immediate reward plus the _maximum_ Q
        | value over the next possible states, so the highest Q values
        | gradually flow backward into states that can lead to good
        | outcomes (a minimal sketch of this update is below).
       | 
        | I'm trying to think how this would apply to LLMs though. Where do
        | the basic values come from? In chess the basic values are the
        | number of pieces, or whether or not a side ultimately wins. What
        | are the basic values in an LLM? What is the Q agent aiming to do?
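        | 
        | A minimal sketch of the tabular Bellman update described above;
        | the learning rate, discount and epsilon-greedy action choice are
        | illustrative defaults, not anything specific to OpenAI:
        | 
        |     import random
        |     from collections import defaultdict
        | 
        |     # Q[(state, action)] ~ long-term value of taking `action`
        |     # in `state`
        |     Q = defaultdict(float)
        |     alpha = 0.1    # learning rate
        |     gamma = 0.99   # discount factor
        |     epsilon = 0.1  # exploration rate
        | 
        |     def update(state, action, reward, next_state, next_actions):
        |         # Nudge Q(s, a) a small step toward
        |         # reward + gamma * max over next actions
        |         best_next = max((Q[(next_state, a)] for a in next_actions),
        |                         default=0.0)
        |         target = reward + gamma * best_next
        |         Q[(state, action)] += alpha * (target - Q[(state, action)])
        | 
        |     def choose(state, actions):
        |         # Epsilon-greedy: usually exploit, sometimes explore
        |         if random.random() < epsilon:
        |             return random.choice(actions)
        |         return max(actions, key=lambda a: Q[(state, a)])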
        
         | jacquesm wrote:
         | In computer chess, traditionally 'minimax' was the strategy
         | that allowed you to efficiently probe ahead. The only
         | requirement for this is that your evaluation function returns a
         | single (scalar) value.
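          | 
          | A bare-bones sketch of plain minimax, assuming hypothetical
          | legal_moves(), apply() and evaluate() helpers, where evaluate()
          | returns that single scalar value:
          | 
          |     def minimax(position, depth, maximizing):
          |         # legal_moves, apply and evaluate are assumed to be
          |         # supplied by the chess engine
          |         moves = legal_moves(position)
          |         if depth == 0 or not moves:
          |             # scalar evaluation at the search horizon
          |             return evaluate(position)
          |         scores = [minimax(apply(position, m), depth - 1,
          |                           not maximizing) for m in moves]
          |         # the side to move maximizes its score, the opponent
          |         # minimizes it
          |         return max(scores) if maximizing else min(scores)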
        
         | seanhunter wrote:
         | I don't think your intuition about computer chess is going to
         | help you here with transformer architecture.
         | 
          | Usually in transformer models[1], each attention head has three
          | sets of weights, known as q(uery), k(ey) and v(alue), which
          | project each token into query, key and value vectors. I was
          | assuming that the q in Q* referred to the q vector, so
          | q-learning would be training those weights. In a transformer
          | model you don't have an objective function for any sort of
          | state evaluation, so it can't be the Q you're thinking of.
         | 
          | If they've done something funky with the q vector, that could
          | indeed be a breakthrough, since a lot of people feel that we
          | are sort of running out of juice with scaling transformers
          | as-is and need a new architecture to really get a step change
          | in capability. That's pure speculation on my part though.
         | 
          | [1] Here's Vaswani et al., the paper that first set out the
          | transformer architecture: https://arxiv.org/abs/1706.03762
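          | 
          | For reference, a minimal sketch of the scaled dot-product
          | attention from that paper, where Q, K and V are the projected
          | query/key/value matrices of a single head:
          | 
          |     import numpy as np
          | 
          |     def attention(Q, K, V):
          |         # Q, K, V: (seq_len, d) arrays produced by the learned
          |         # query/key/value projections
          |         scores = Q @ K.T / np.sqrt(K.shape[-1])
          |         scores -= scores.max(axis=-1, keepdims=True)
          |         weights = np.exp(scores)
          |         weights /= weights.sum(axis=-1, keepdims=True)
          |         # each output row is a softmax-weighted mix of V's rows
          |         return weights @ V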
        
           | Buttons840 wrote:
           | Good point. I'm thinking of the Q from reinforcement
           | learning, but as you say, this Q is probably different.
        
           | a_wild_dandan wrote:
            | Current architectures have a _lot_ of juice left across
            | several axes, judging by extraordinarily accurate scaling
            | laws. We're nowhere close to a wall.
           | 
            | The industry has been focusing _hard_ on 'System Two'
           | approaches to augment our largely 'System One'-style models,
           | where optimal decision policies make Q-learning a natural
           | approach. The recent unlock here might be related to using
           | the neural nets' general intelligence/flexibility to better
           | broadly estimate their own states/rewards/actions (like
           | humans). [EDIT: To be clear, the Q-learning stuff is my own
           | speculation, whereas the 'System Two' stuff is well known.]
           | 
           | Serendipitously, Karpathy broadly discussed these two issues
            | _yesterday_! Toward the lecture's end:
           | https://www.youtube.com/watch?v=zjkBMFhNj_g
        
         | nuc1e0n wrote:
          | It's probably something to do with pathfinding with
          | constraints in high-dimensional search spaces, I'd guess. Like
          | Quake bots used to do in the late 90s.
        
       | throw310822 wrote:
       | I am surprised by how dismissive the whole post sounds. For
       | example:
       | 
       | > OpenAI could in fact have a breakthrough that fundamentally
       | changes the world
       | 
       | Well, it appears to me that OpenAI _already_ has such a
        | breakthrough: it had it roughly 4 years ago with GPT-2, and it's
       | still scaling it.
       | 
        | Considering that _it's not yet a year since the introduction of
       | the first ChatGPT_, and given the pace at which it's evolving, I
       | would say that the current product is already showing great
       | promise to fundamentally change the world. I would not be
       | surprised if just incremental changes were enough to fulfill that
       | prediction. The impact at this point seems more limited by the
       | ability of society to absorb and process the technology rather
       | than intrinsic limits of the technology itself.
        
         | naasking wrote:
          | The article is from Gary Marcus, a well-known AI skeptic. Not
         | surprising if he's being dismissive.
        
           | psbp wrote:
           | Skeptic and completely reactionary. I had to unfollow him on
           | Twitter because he always has to have a "take" on every AI
            | headline, and he often contradicts himself, flipping between
            | "AI is useless" and "AI is a huge threat".
        
             | throw310822 wrote:
             | Yes, skepticism is healthy and useful, but this sounds more
             | like intellectual dishonesty.
        
         | golol wrote:
         | >Gary Marcus
        
         | lucubratory wrote:
         | >I am surprised by how dismissive the whole post sounds.
         | 
          | I wouldn't be surprised; it's Gary Marcus. He's an academic
          | with a lot of prestige to lose if the LLM approach is actually
          | good/useful/insightful, and he's only widely known publicly now
          | because AI has had a backlash and the media needed an expert to
          | quote for "the other side". Same as the computational
          | linguistics researchers who always get quoted for the same
          | reason.
         | 
          | In general, academics in competing fields - whose funding
          | stands to be tanked or eclipsed by research approaches built on
          | principles they have fundamental academic disagreements with -
          | are going to talk negatively about the tech, no matter what it
          | is achieving. Where I think it can be valuable to listen to
          | them is when they give the technology credit - generally
          | they'll only do that when it's something really undeniable or
          | potentially concerning.
        
         | jksk61 wrote:
         | I don't know, it is an opinion and it seems kind of well-
         | founded (i.e. there's no evidence for groundbreaking research
          | on OpenAI's part except for scaling things up).
        
       | gfodor wrote:
       | Gary Marcus is a grifter and should be ignored.
        
       | nuc1e0n wrote:
        | I used to be friends with some guys who made a Rubik's cube
        | solving robot at college years ago. It was a cool novelty back
        | then, but it's unimpressive for OpenAI to boast about now. That
        | robot could also solve a cube that wasn't specially instrumented
        | the way OpenAI's is here.
        
         | dartos wrote:
         | What are you trying to say?
        
           | nuc1e0n wrote:
            | Just what I said in my parent post: Q-learning is kinda old
            | tech for AI. It works rather well though.
        
       | npalli wrote:
        | Let's see... oh, Gary Marcus. Never mind.
        
       ___________________________________________________________________
       (page generated 2023-11-23 23:01 UTC)