[HN Gopher] About That OpenAI "Breakthrough"
___________________________________________________________________
About That OpenAI "Breakthrough"
Author : passwordoops
Score : 21 points
Date : 2023-11-23 16:59 UTC (6 hours ago)
(HTM) web link (garymarcus.substack.com)
(TXT) w3m dump (garymarcus.substack.com)
| gwnywg wrote:
| I find it weird to see people calling A* an 'AI technique'... Is
| BFS or DFS an AI technique too? I thought it was merely a search
| algorithm...
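|
| For what it's worth, the whole algorithm is small enough to
| sketch here (a minimal version of my own; neighbors() and h()
| are hypothetical helpers you'd supply):
|
|     import heapq
|     from itertools import count
|
|     # Minimal A* sketch: Dijkstra's algorithm plus a heuristic
|     # h(node) estimating the remaining cost to the goal.
|     # neighbors(node) yields (next_node, step_cost) pairs.
|     def a_star(start, goal, neighbors, h):
|         tie = count()  # tie-breaker so the heap never compares nodes
|         frontier = [(h(start), 0, next(tie), start, [start])]
|         best_g = {}
|         while frontier:
|             f, g, _, node, path = heapq.heappop(frontier)
|             if node == goal:
|                 return path
|             if node in best_g and best_g[node] <= g:
|                 continue  # already reached this node more cheaply
|             best_g[node] = g
|             for nxt, cost in neighbors(node):
|                 heapq.heappush(frontier,
|                                (g + cost + h(nxt), g + cost,
|                                 next(tie), nxt, path + [nxt]))
|         return None  # goal unreachable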
| svachalek wrote:
| AI is the cutting edge of CS. When it's well understood it
| stops being AI. Some day LLMs and neural nets will get the same
| reaction... who calls an LLM "AI"? It's just a formula!
| ashiban wrote:
| Most likely for historical reasons: A* was used a lot in game
| pathfinding for in-game AI characters.
| furyofantares wrote:
| I first learned it in the famous book "Artificial Intelligence:
| A Modern Approach" by Russell and Norvig. I think 3 chapters are
| dedicated to search. A lot of AI can be framed as search, and a
| whole lot of historical AI, or mundane present AI, is
| straightforward search.
| Ukv wrote:
| When it was published, A* was considered within the field of
| AI. The paper[0]'s authors were from Stanford's AI group.
|
| The "AI effect"[1] describes how the field tends to get left
| with only the cutting-edge research.
|
| [0]: http://ai.stanford.edu/~nilsson/OnlinePubs-
| Nils/PublishedPap...
|
| [1]: https://en.wikipedia.org/wiki/AI_effect
| T-A wrote:
| By way of [1], here [2] is a paper about A* also posted under
| CS/AI. It introduces "Q* search, a search algorithm that uses
| deep Q-networks to guide search" and uses it "to solve the
| Rubik's cube", the thing which Gary Marcus complained about
| OpenAI not having done with a neural network in 2019.
|
| As pointed out by techbro92 [3] the authors do not seem to be
| affiliated with OpenAI, but AFAIK nobody claimed that OpenAI
| _invented_ Q*. It's not hard to imagine OpenAI trying it out
| with their Rubik's-cube-solving robot hand and then starting to
| think about other things it might be applied to.
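|
| As a cartoon of "using a deep Q-network to guide search" (my
| own sketch, not the paper's code; q_net is a stand-in for the
| learned network scoring (state, action) pairs):
|
|     import heapq
|     from itertools import count
|
|     def q_guided_search(start, is_goal, actions, step, q_net):
|         # Best-first search where a learned Q function, rather
|         # than a hand-written heuristic, orders the frontier.
|         tie = count()
|         frontier = [(0.0, next(tie), start)]
|         seen = {start}
|         while frontier:
|             _, _, state = heapq.heappop(frontier)
|             if is_goal(state):
|                 return state
|             for a in actions(state):
|                 nxt = step(state, a)
|                 if nxt in seen:
|                     continue
|                 seen.add(nxt)
|                 # min-heap, so negate: higher Q = explored sooner
|                 heapq.heappush(frontier,
|                                (-q_net(state, a), next(tie), nxt))
|         return None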
|
| [1] https://news.ycombinator.com/item?id=38395959
|
| [2] https://arxiv.org/abs/2102.04518
|
| [3] https://news.ycombinator.com/item?id=38398243
| naasking wrote:
| Search is 100% key to AI, even historically. Prolog has its
| roots in AI research, and Prolog evaluation is resolution with
| unification, carried out as a backtracking search.
| lagrange77 wrote:
| I think (probabilistic) logic programming also relies on graph
| traversal/path searching, which is part of the symbolic AI
| paradigm.
| flohofwoe wrote:
| Probably because of A*'s popularity for path finding in game
| development. Everything that imitates human-like behaviour in
| games is called "AI", even when most of it is just a big messy
| hairball of ad-hoc if-else code under the hood ;)
| Buttons840 wrote:
| I'm trying to refresh my mind on what Q-learning is, if I may think
| out loud here?
|
| Q is an evaluation of how good a state-action pair is, assuming
| the agent acts optimally going forward. In chess, for example,
| sacrificing a rook to capture a queen would have a high Q value.
| Sac'ing the rook is bad, but gaining the queen later is better.
| Q values are supposed to see past immediate consequences and
| reflect the long-term ones.
|
| How do we find a Q value though? We can look several steps ahead,
| or we can look to the end of the game, etc. These aren't real Q
| values though, because optimal actions probably weren't taken. We
| can use the Bellman equation, which nudges each Q value a small
| step toward the observed reward plus the _maximum_ Q value of
| the next state's actions, so the highest Q values gradually flow
| backward into states that can lead to good outcomes.
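|
| In code the update is tiny. A minimal tabular sketch (my own
| illustration, nothing specific to OpenAI; ALPHA and GAMMA are
| the usual learning rate and discount factor):
|
|     from collections import defaultdict
|
|     ALPHA = 0.1   # learning rate: how far each update moves
|     GAMMA = 0.99  # discount: how much future reward counts
|     Q = defaultdict(float)  # (state, action) -> estimated return
|
|     def q_update(state, action, reward, next_state, next_actions):
|         # Bellman target: reward now, plus the best we believe
|         # we can do from the next state onward.
|         best_next = max((Q[(next_state, a)] for a in next_actions),
|                         default=0.0)
|         target = reward + GAMMA * best_next
|         Q[(state, action)] += ALPHA * (target - Q[(state, action)])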
|
| I'm trying to think how this would apply to LLMs though. Where do
| the basic values come from? In chess the basic values are the
| number of pieces, or whether or not a side ultimately wins. What
| are the basic values in an LLM? What is the Q agent aiming to do?
| jacquesm wrote:
| In computer chess, traditionally 'minimax' was the strategy
| that allowed you to efficiently probe ahead. The only
| requirement for this is that your evaluation function returns a
| single (scalar) value.
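|
| The whole scheme fits in a few lines. A bare-bones sketch (my
| own illustration; is_terminal(), children() and evaluate() are
| hypothetical helpers the game would supply):
|
|     # Minimax: the maximizing player picks the child with the
|     # highest score, the minimizing player the lowest. All it
|     # needs from the game is a scalar evaluate(state).
|     def minimax(state, depth, maximizing):
|         if depth == 0 or is_terminal(state):
|             return evaluate(state)
|         scores = [minimax(child, depth - 1, not maximizing)
|                   for child in children(state)]
|         return max(scores) if maximizing else min(scores)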
| seanhunter wrote:
| I don't think your intuition about computer chess is going to
| help you here with the transformer architecture.
|
| Usually in transformer models[1], each attention head has three
| learned projections, known as q(uery), k(ey) and v(alue).
| I was assuming that the q in q* applied to the q vector, so
| q-learning is training this vector. In a transformer model you
| don't have an objective function for any sort of state
| evaluation so it can't be the Q you're thinking of.
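|
| For reference, this is the role q/k/v play: a numpy sketch of
| the standard scaled dot-product attention from Vaswani et al.
| (my own illustration, nothing Q*-specific):
|
|     import numpy as np
|
|     def attention(Q, K, V):
|         # Compare each token's query against every key, then use
|         # the softmax weights to mix the value vectors.
|         d_k = K.shape[-1]
|         scores = Q @ K.T / np.sqrt(d_k)
|         scores -= scores.max(axis=-1, keepdims=True)  # stability
|         weights = np.exp(scores)
|         weights /= weights.sum(axis=-1, keepdims=True)
|         return weights @ V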
|
| If they've done something funky with the q vector that could
| indeed be a breakthrough since a lot of people feel that we are
| sort of running out of juice with scaling transformers as-is
| and need a new architecture to really have a step change in
| capability. That's pure speculation on my part though.
|
| [1] Here's Vaswani et al, the paper that first set out the
| transformer architecture https://arxiv.org/abs/1706.03762
| Buttons840 wrote:
| Good point. I'm thinking of the Q from reinforcement
| learning, but as you say, this Q is probably different.
| a_wild_dandan wrote:
| Current architectures have a _lot_ of juice left across
| several axes, as extraordinarily accurate scaling laws show.
| We're nowhere close to a wall.
|
| The industry has been focusing _hard_ on 'System Two'
| approaches to augment our largely 'System One'-style models,
| where optimal decision policies make Q-learning a natural
| approach. The recent unlock here might be related to using
| the neural nets' general intelligence/flexibility to estimate
| their own states/rewards/actions more broadly (like humans).
| [EDIT: To be clear, the Q-learning stuff is my own
| speculation, whereas the 'System Two' stuff is well known.]
|
| Serendipitously, Karpathy broadly discussed these two issues
| _yesterday_! Toward the lecture's end:
| https://www.youtube.com/watch?v=zjkBMFhNj_g
| nuc1e0n wrote:
| It's probably something to do with pathfinding under
| constraints in high-dimensional search spaces, I'd guess. Like
| Quake bots used to do in the late 90s.
| throw310822 wrote:
| I am surprised by how dismissive the whole post sounds. For
| example:
|
| > OpenAI could in fact have a breakthrough that fundamentally
| changes the world
|
| Well, it appears to me that OpenAI _already_ has such a
| breakthrough - it had it roughly 4 years ago with GPT-2, and
| it's still scaling it.
|
| Considering that _it's not yet a year since the introduction of
| the first ChatGPT_, and given the pace at which it's evolving, I
| would say that the current product is already showing great
| promise to fundamentally change the world. I would not be
| surprised if just incremental changes were enough to fulfill that
| prediction. The impact at this point seems limited more by
| society's ability to absorb and process the technology than by
| the intrinsic limits of the technology itself.
| naasking wrote:
| The article is from Gary Marcus, a well known AI skeptic. Not
| surprising if he's being dismissive.
| psbp wrote:
| Skeptic and completely reactionary. I had to unfollow him on
| Twitter because he always has to have a "take" on every AI
| headline, and he often contradicts himself, flipping between
| "AI is useless" and "AI is a huge threat".
| throw310822 wrote:
| Yes, skepticism is healthy and useful, but this sounds more
| like intellectual dishonesty.
| golol wrote:
| >Gary Marcus
| lucubratory wrote:
| >I am surprised by how dismissive the whole post sounds.
|
| I wouldn't be surprised; it's Gary Marcus. He's an academic
| with a lot of prestige to lose if the LLM approach is actually
| good/useful/insightful, and he's widely known now only because
| AI has had a backlash and the media needed an expert to
| quote for "the other side". Same as the computational
| linguistics researchers who always get quoted for the same
| reason.
|
| In general, academics in competing fields whose funding
| threatens to get tanked or eclipsed by research approaches that
| work on principles they have fundamental academic disagreements
| with are going to talk negatively about the tech, no matter
| what it is achieving. Where I think it can be valuable to
| listen to them is when they're giving the technology credit -
| generally they'll only do that when it's something really
| undeniable or potentially concerning.
| jksk61 wrote:
| I don't know, it is an opinion and it seems kind of well-
| founded (i.e. there's no evidence for groundbreaking research
| on OpenAI's part except for scaling things up).
| gfodor wrote:
| Gary Marcus is a grifter and should be ignored.
| nuc1e0n wrote:
| I used to be friends with some guys who made a Rubik's-cube-
| solving robot at college years ago. It was a cool novelty back
| then, but it's unimpressive for OpenAI to boast about now. That
| robot could also solve a cube that wasn't specially instrumented
| the way OpenAI's is here.
| dartos wrote:
| What are you trying to say?
| nuc1e0n wrote:
| Just as in my parent post: Q-learning is kinda old tech for
| AI. It works rather well, though.
| npalli wrote:
| Let's see .. oh Gary Marcus. Never mind.
___________________________________________________________________
(page generated 2023-11-23 23:01 UTC)