[HN Gopher] Jeff Dean responds to EDA industry about AlphaChip
       ___________________________________________________________________
        
       Jeff Dean responds to EDA industry about AlphaChip
        
       Author : nsoonhui
       Score  : 213 points
       Date   : 2024-12-01 00:28 UTC (22 hours ago)
        
 (HTM) web link (twitter.com)
 (TXT) w3m dump (twitter.com)
        
       | twothreeone wrote:
       | I get why Jeff would be pressed to comment on this, given he's
       | credited on basically all of "Google Brain" research output. But
       | saying "they couldn't replicate it because they're idiots,
       | therefore it's replicable" is not a rebuttal, just bullying.
       | Sounds like the critics struck a nerve and there's no good way
       | for him to refute the replication problem his research apparently
       | exhibits.
        
         | 1024core wrote:
         | > they couldn't replicate it because they're idiots
         | 
          | If they deviated from the steps needed to replicate (skipping
          | pre-training, using less compute, etc.) and then failed, what's
          | wrong with calling out the flaws in their attempted "replication"?
        
           | twothreeone wrote:
           | It's not a value judgement, just doesn't help his case at
           | all. He'd need to counter the replication problem, but
           | apparently that's not an option. Instead, he's making people
           | who were unable to replicate it look bad, which actually
           | strengthens their criticism.
        
             | skybrian wrote:
             | I don't know how you rebut a flawed paper without making
             | its authors look bad? That would be a general-purpose
             | argument against criticizing papers.
             | 
              | Actually, people _should_ criticize flawed papers. That's
             | how science works! When you publish scientific papers, you
             | should expect criticism if there's something that doesn't
             | look right.
             | 
             | The only way to avoid that is to get critical feedback
              | _before_ publishing the paper, and it's not always
             | possible, so then the scientific debate happens in public.
        
               | twothreeone wrote:
               | The situation here is different though.. If I'm making an
               | existence claim by demonstrating a constructive argument
               | and then being criticized for it, the most effective
               | response to that critique would be a second, alternative
               | construction, not attacking the critic's argument. After
               | all, I'm the one claiming existence.. the burden of proof
               | is on me, not my critics.
        
               | skybrian wrote:
               | I don't know which argument is more constructive, though?
               | Both teams reported what they did. They got different
               | results. Figuring out why is the next step, and pointing
               | out that they did different things seems useful.
               | 
               | Though, the broader question is how useful the results of
               | the original paper are to other people who might do the
               | same thing.
        
             | xpe wrote:
             | > But saying "they couldn't replicate it because they're
             | idiots, therefore it's replicable" is not a rebuttal, just
             | bullying.
             | 
             | > It's not a value judgement, just doesn't help his case at
             | all.
             | 
             | Calling it "bullying" looks like a value judgment to me. Am
             | I missing something?
             | 
             | To me, Dean's response is quite sensible, particularly
              | given his claims that the other papers made serious mistakes
              | and have potential conflicts of interest.
        
               | twothreeone wrote:
               | I'm not saying "Bullying is bad and bullies are bad
               | people", that would be a value judgement. I'm saying
               | bullying is the strictly worse strategy for strengthening
               | his paper's claims in this scenario. The better strategy
               | would be to foster an environment in which people can
               | easily replicate your claims.
        
               | xpe wrote:
               | I think for most people the word "bullying" has a value
               | judgment built-in.
        
               | xpe wrote:
               | In a perfect world, making a paper easier to replicate
               | has advantages, sure. (But it also has costs.)
               | 
               | Second, even a healthy environment can be undermined by
               | lack of skills or resources, intellectual dishonesty, or
               | conflicts of interest.
        
               | xpe wrote:
               | Are you suggesting Dean take a different approach in his
               | response? Are you saying it was already too late given
               | the environment? (I'm also not sure I know what you mean
               | by environment here.)
        
             | danielmarkbruce wrote:
             | they have open source code.
        
         | danpalmer wrote:
         | > But saying "they couldn't replicate it because they're
         | idiots, therefore it's replicable" is not a rebuttal, just
         | bullying
         | 
         | That's not an argument made in the linked tweet. His claim is
         | "they couldn't replicate it because they didn't follow the
         | steps", which seems like a very reasonable claim, regardless of
         | the motivation behind making it.
        
           | bayarearefugee wrote:
           | At the end of the day my question is simply why does anyone
           | care about the drama over this one way or another?
           | 
           | Either the research is as much of a breakthrough as is
           | claimed and Google is about to pull way ahead of all these
           | other "idiots" who can't replicate their method even when it
           | is described to them in detail, or the research is flawed and
           | overblown and not as effective as claimed. This seems like
           | exactly the sort of question the market will quickly decide
           | over the next couple of years and not worth arguing over.
           | 
            | Why do a non-zero number of people have seemingly religious
           | beliefs about this topic on one side or the other?
        
             | refulgentis wrote:
             | > why does anyone care
             | 
             | n.b. you're on a social news site
             | 
             | > pull way ahead of all these other "idiots"
             | 
             | Pulling way ahead sounds sufficient, not necessary. Can we
             | prove it's not the case? Let's say someone says that's why
             | Gemini inference is so cheap. Can we show that's wrong?
             | 
             | > "idiots"
             | 
             | ?
        
             | pclmulqdq wrote:
             | The reason Jeff Dean cares is that his team's improvement
             | compared to standard EDA tools was marginal at best and may
             | have overfitted to a certain class of chips. Thus, he is
             | defending his research because it is not widely accepted.
             | Open source code has been out for years and in that time
             | the EDA companies have largely done their own ML-based
             | approaches that do not match his. He attributes this not to
             | failings in his own research but to the detractors at these
             | companies not giving it a fair chance.
             | 
             | The guys at EDA companies care because Google's result
             | makes them look like idiots when you take the paper at face
             | value, and does advance the state of the art a bit. They
             | have been working hard for marginal improvements, and that
             | some team of ML people can come in and make a big splash
             | with something like this is offensive to them. Furthermore,
             | the result is not that impressive and does not generalize
             | enough to be useful to them (and competent teams at these
             | companies absolutely have checked).
             | 
             | The fact that the result is so minor _is the reason_ that
             | this is so contentious.
        
               | choppaface wrote:
               | The result is minor AND Google spent a (relative) lot of
               | money to achieve it (especially in the eyes of the new
               | CFO). Jeff Dean is desperately trying to save the
               | prestige of the research (in a very insular, Google-y
               | way) because he wants to save the 2017-era economically-
               | not-viable blue sky culture where Tensorflow & the TPU
               | flourished and the transformer was born. But the reality
               | is that Google's core businesses are under attack (anti-
               | trust, Jedi Blue etc), the TPU now has zero chance versus
               | NVidia, and Google is literally no longer growing ads.
               | His financing is about to pop in the next 1-2 years.
               | 
               | https://sparktoro.com/blog/is-google-losing-search-
               | market-sh...
        
               | alsodumb wrote:
                | What makes you say the TPU has zero chance against a
                | growing NVIDIA?
               | 
               | If anything, now is the best time for TPU to grow and I'd
                | say investing in TPU gave Google an edge. Every other
                | large-scale LLM was trained on NVIDIA GPUs; Gemini is the
                | only exception. Every big
               | company is scrambling to make their own hardware in the
               | AI era while Google already has it.
               | 
               | Everyone I know who worked with TPUs loves how well they
               | scale. Sure Jax has a learning curve but it's not a
               | problem, especially given the performance advantages it
               | gives.
        
             | bsder wrote:
              | > Why do a non-zero number of people have seemingly
             | religious beliefs about this topic on one side or the
             | other?
             | 
             | Because lots of engineers are being told by managers "Why
             | aren't we using that tool?" and a bunch of engineers are
             | stuck saying "Because it doesn't actually work." aka
             | "Google is lying through their teeth." to which the
             | response is "Oh, so you know better than Google?" to which
              | the response is "Yeah, actually, I fucking do. Now piss off
              | and let me finish timing closure on this goddamn block that is
             | already 6 weeks late."
             | 
             | Now can you understand why this is a bit contentious?
             | 
             | Marketing "exaggerations" from authority can cause _huge_
             | amounts of grief.
             | 
             | In my little corner of the world, I had to sit and defend
              | against the lies that a startup with famous designers was
             | putting out about power consumption while we were designing
             | similar chips in the space. I had to go toe to toe with
             | Senior VPs over it and I had to stand my ground and defend
             | my team who analyzed things dead on. All this occurred in
             | spite of the fact that they had no silicon. In addition, I
             | _knew_ the famous designers involved would happily lie
              | straight to your face, having worked with them before, having
              | been lied to straight to my face, and having had to clean
              | up the mess when they left the company.
             | 
             | To be fair, it is also the only time I have had a Senior VP
             | remember the kerfuffle and _apologize_ when said startup
             | finally delivered silicon and not only were the real
              | numbers not what they claimed, they weren't even close to
             | the ones we were getting.
        
               | btilly wrote:
               | And do you believe that that is what's happening in this
               | case?
               | 
               | If you have personal experience with Jeff Dean et al that
               | you're willing to share, I'd be interested in hearing
               | about it.
               | 
               | From where I'm sitting it looks like, "Google spent a
               | fortune on deep learning, and got a small but real win.
               | People who don't like Google failed to follow Google's
               | recipe and got a large and easily replicated loss."
               | 
               | It's not even clear that Google's approach is feasible
               | right now for companies not named Google. It is not clear
               | that it works on other classes of chip. It is not clear
               | that the technique will grow beyond what Google already
               | got. It is really not clear that anyone should be jumping
               | on this.
               | 
               | But there is a world of difference between that, and
               | concluding that Google is lying.
        
               | bsder wrote:
               | > From where I'm sitting it looks like, "Google spent a
               | fortune on deep learning, and got a small but real win.
               | People who don't like Google failed to follow Google's
               | recipe and got a large and easily replicated loss."
               | 
               | From where I'm sitting it looks like Google cooked the
               | books maximally, barely beat humans let alone state of
               | the art algorithms, published a crappy article in Nature
               | because it would never have passed editorial muster at
                | something like DAC or an IEEE journal, and now has to
               | browbeat other people who are calling them out on it.
               | 
               | And that's the _best_ interpretation we can cough up.
               | 
                | I'll go further: we don't even have any raw data that
               | says that they actually did beat the humans. Some of the
               | humans I know who run P&R are _REALLY_ good at what they
               | do. The data could be _completely made up_. Given how
               | much scientific fraud has come out lately, I 'm amazed at
               | the number of people defending Google on this.
               | 
               | Where I'm from, we call what Google is doing both "lying"
               | and "bullying".
               | 
               | Look, Google can _easily_ defuse this in all manner of
               | ways. Publish their raw data. Run things on testbenches
               | and benchmarks that the EDA tools vendors have been
               | running on for years. Run things on the open source VLSI
               | designs that they sponsored.
               | 
               | What I suspect happened is that Google's AI group has
               | gotten used to being able to make hyperbolic marketing
               | claims which are difficult to verify. They then poked at
               | place and route, failed, and published an article anyway
               | because someone's promotion is tied to this. They
               | expected that everybody would swallow their glop just
                | like every other time, that it would be mostly ignored, and
                | that the people involved could get their promotions and move
                | on.
               | 
               | Unfortunately, Google is shoveling bullshit around
               | something that has objective answers; real money is at
               | stake; and they're getting rightfully excoriated for it.
               | 
               | Whoops.
        
               | btilly wrote:
               | Look, either the follow-up article did pretraining or
               | not. Jeff Dean is claiming that the importance of
               | pretraining was mentioned 37 times and the follow-up
               | didn't do it. That sounds easy to verify.
               | 
               | Likewise the importance of spending 20x as much money on
               | the training portion seems easy to verify, and
               | significant.
               | 
               | That they would fail to properly test against industry
               | standard workbenches seems reasonable to me. This is a
               | bunch of ML specialists who know nothing about chip
               | design. Their background is beating everyone at Go and
               | setting a new state of the art for protein folding, and
               | not chip design. If you dismiss those particular past
               | accomplishments as hyperbolic marketing, that's your
               | decision. But you aren't going to find a lot of people in
               | these parts who agree with you.
               | 
               | If you think that those were real, but that a bunch of
               | more recent accomplishments are BS, I haven't been
               | following closely enough to have an opinion. The stuff
               | that crossed my radar since AlphaFold is mostly done at
               | places like OpenAI, and not Google.
               | 
               | Regardless, the truth will out. And what Google is
               | claiming for itself here really isn't all that
               | impressive.
        
               | gabegobblegoldi wrote:
               | In this case there were credible claims of fraud from
               | Google insiders. See my comment above.
        
             | joatmon-snoo wrote:
             | > This seems like exactly the sort of question the market
             | will quickly decide over the next couple of years and not
             | worth arguing over.
             | 
             | Discussions like this are _how_ the market decides whether
              | or not this achievement is real.
        
             | throwaway2037 wrote:
             | One constant that I see on HN: they love drama and love to
             | tear down a winner, presumably over jealousy.
        
         | jsnell wrote:
         | It's a good thing that he didn't say that, then.
         | 
          | The tweet just says that the reproduction attempt didn't
         | actually follow the original methodology. There is no claim
         | that the authors of the replication attempt were "idiots" or
         | anything similar, you just made that up. The obviously
         | fallacious logic in "they couldn't replicate it ..., therefore
         | it's replicable" is also a total fabrication on your part.
        
           | twothreeone wrote:
           | A Google Nature Paper has not been replicated for over 3
           | years, but I'm the one fabricating stuff :D
           | 
            | Making a novel claim implies its _claimed_ replicability.
           | 
           | "You did not follow the steps" is calling them idiots.
           | 
           | The only inference I made is that he's pressed to comment. He
           | could have said nothing.. instead he's lashing out publicly,
           | because other people were unable to replicate it. If there's
            | no problem replicating the work, why hasn't that happened? Any
            | other author would be worried if a publication about their
            | work said "it's not replicable", and would be trying their best
            | to help replicate it.. but somehow that doesn't apply to him.
        
             | griomnib wrote:
             | "You can only validate my results if you have an entire
             | Google data center worth of compute available. Since you
             | don't, you can't question us."
        
               | jeffbee wrote:
               | We're actually talking about the difference between Cheng
               | using 8 GPUs and 2 CPUs while Google used 16 GPUs and 40
               | CPUs. These are under-your-desk levels of resources.
                | The Cheng et al. authors are all affiliated with UCSD, which
                | owns the Expanse supercomputer, orders of
                | magnitude larger than what you would need to reproduce
                | the original work. Cheng et al. does not explain why they
               | used fewer resources.
        
               | griomnib wrote:
               | That's a fair complaint then.
        
               | phonon wrote:
               | No it's not. They ran it longer instead.
        
               | jeffbee wrote:
               | The 2022 paper pretty explicitly says that runtime is not
               | a substitute. They say their best result "can only be
               | achieved in our 8-GPU setup".
        
               | phonon wrote:
               | I assume you mean Fig. 6 here?[0]
               | 
               | But that was explicitly limited to 8 hours for all
               | setups. Do they have another paper that shows that you
               | can't increase the number of hours of a smaller GPU setup
               | to compensate?
               | 
               | [0]https://dl.acm.org/doi/pdf/10.1145/3505170.3511478
        
         | make3 wrote:
         | "they couldn't replicate it because they're idiots, therefore
         | it's replicable" That's literally not what he says though. He
         | says, "they didn't replicate it so their conclusions are
         | invalid", which is a completely different thing than what
         | you're accusing him of, and is valid.
        
         | griomnib wrote:
         | I love how he's claiming bias due to his critic's employer. As
         | though working for _Google_ has no conflicts? A company that is
         | desperately hyping each and every "me too" AI development to
         | juice the stock price?
         | 
          | Jeff drank so much kool aid he forgot what water is.
        
           | jeffbee wrote:
           | He's criticizing Markov for not disclosing the conflict, not
           | for the conflict itself. Hiding your affiliation in a
           | scientific publication is far outside the norms of science,
           | and they should be criticized for that. The publication we
           | are discussing -- "That Chip Has Sailed" -- dismisses Markov
            | in a few paragraphs and spends the bulk of its arguments on
           | Cheng.
        
             | griomnib wrote:
              | I know the norms of science; I also know the norms of
              | present-day Google. Nobody from Google should have the gall to
             | accuse others of anything.
        
         | johnfn wrote:
         | > "they couldn't replicate it because they're idiots, therefore
         | it's replicable"
         | 
         | Where does it say that? Dean outlines explicit steps that the
         | authors missed in the tweet.
        
         | iamnotafraid wrote:
          | Interesting point, I give you this
        
         | MicolashKyoka wrote:
         | if they are idiots and couldn't replicate it, it's worth saying
         | it. better that than sugarcoating idiocy until it harms future
         | research.
        
       | nsoonhui wrote:
       | The context of Jeff Dean's response:
       | 
       | https://news.ycombinator.com/item?id=41673769
       | 
       | https://news.ycombinator.com/item?id=41673808
        
       | bsder wrote:
       | The fact that the EDA companies are garbage in no way mitigates
       | the fact that Google continues to peddle unsubstantiated snake
       | oil.
       | 
       | This is easy to debunk from the Google side: release a tool. If
       | you don't want to release a tool, then it's unsubstantiated and
       | you don't get to publish. Simple.
       | 
       | That having been said:
       | 
       | 1) None of these "AI" tools have yet demonstrated the ability to
       | classify "This is datapath", "This is array logic", "This is
       | random logic". This is the _BIG_ win. And it won 't just be a
       | couple of percentage points in area or a couple of days saved
       | when it works--it will be 25%+ in area and months in time.
       | 
        | 2) Saving a couple of percentage points in random logic isn't
        | impressive. If I have the compute power to run EDA tools with a
        | couple of different random seeds, at least one run will likely be
        | a couple percentage points better (see the seed-sweep sketch at
        | the end of this comment).
       | 
       | 3) I really don't understand why they don't do stuff on
        | analog/RF. The patterns are smaller and a much better match for
       | the kind of reinforcement learning that current "AI" is suited
       | for.
       | 
       | I put this snake oil in the same category as "financial advice"--
       | if it worked, they wouldn't be sharing it and would simply be
       | printing money by taking advantage of it.
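        | 
        | Re point 2, a minimal seed-sweep sketch in Python; the
        | run_place_and_route function is a hypothetical stand-in for
        | whatever P&R flow is in use, and its returned numbers are fake
        | (real runs take hours each):
        | 
        |     import random
        | 
        |     def run_place_and_route(seed):
        |         # Hypothetical wrapper: run your P&R flow with this seed
        |         # and return the resulting area (or wirelength, slack).
        |         random.seed(seed)
        |         return 100.0 * (1.0 + random.uniform(-0.02, 0.02))
        | 
        |     runs = {seed: run_place_and_route(seed) for seed in range(8)}
        |     best = min(runs, key=runs.get)
        |     print("best of 8 seeds: %d (area %.2f)" % (best, runs[best]))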
        
         | joshuamorton wrote:
         | As someone who has no skin in the game and is only loosely
         | following this, there is a tool: https://github.com/google-
          | research/circuit_training. The detractors claim to not be
          | able to reproduce Google's results with it (which is what
          | Dean is commenting on), while Google and 1-2 other companies
          | claim to be using it internally with success (e.g. see the
          | end of this article:
         | https://deepmind.google/discover/blog/how-alphachip-
         | transfor...).
        
           | bsder wrote:
           | There are benchmarks in this space. You can also bring your
           | chip designs into the open and show what happens with
           | different tools. You can run the algorithm on the placed
           | designs that you sponsor for open source VLSI to show how
           | much better they are.
           | 
           | None of this has been done. This is _table stakes_ if you
           | want to talk about your EDA algorithm advancement. If this
            | weren't coming out of Google, everybody would laugh it out
           | of the room (see what happened to a similar publication with
           | similar claims from a Chinese source--everybody dismissed it
           | out of hand--rightfully so even though that paper was _MUCH_
           | better than anything Google has promulgated).
           | 
           | Extraordinary claims require extraordinary evidence. Nothing
           | about AlphaChip even reaches _ordinary_ evidence.
           | 
           | If they hadn't gotten a publication in Nature for effectively
           | a failure, this would be _way_ less contentious.
        
             | throwaway2037 wrote:
             | > Nothing about AlphaChip even reaches ordinary evidence.
             | 
              | Your reply is wildly confident and dismissive. If correct,
             | why did Nature choose to publish?
        
               | rowanG077 wrote:
                | Can you stop with this pure appeal to authority?
                | Publishing in Nature is not proof it works. It's only
                | proof that the paper has packaged the claim that it works
                | semi-well.
        
               | gabegobblegoldi wrote:
                | As Markov claims, Nature did not follow its own policy.
                | Since Google's results are only on their designs, no one
                | can replicate them. Nature's review is single-blind, so they
                | probably didn't want to turn down Jeff Dean and risk
                | losing future business from Google.
        
         | xpe wrote:
         | > Google continues to peddle unsubstantiated snake oil
         | 
         | I read your comment, but I'm not following -- or maybe I
         | disagree with it -- I'm not sure yet.
         | 
         | "Snake oil" is an emotionally loaded term that raises the
         | temperature of the conversation. That usually makes having a
         | conversation harder.
         | 
         | From my point of view, AlphaGo, AlphaZero, AlphaFold were
         | significant achievements. Agree? Are you claiming that
         | AlphaChip is not? Are you claiming they are perpetrating some
         | kind of deception or exaggeration? Your numbered points seem
         | like valid criticisms (I haven't evaluated them closely), but
         | even if true, I don't see how they support your "snake oil"
         | claim.
        
           | griomnib wrote:
            | They have literally been caught faking AI demos; they brought
           | distrust on themselves.
        
             | rajup wrote:
              | Really not sure how you're conflating product demos, which
              | are known to be pie in the sky across the industry (not
              | just Google), with peer-reviewed research published in
             | journals. Super basic distinction imho.
        
               | stackghost wrote:
               | >peer reviewed research published in journals
               | 
               | Peer review doesn't mean as much as Elsevier would like
               | you to believe. Plenty of peer-reviewed research is
               | absolute trash.
        
               | throwaway2037 wrote:
               | All of the highest impact papers authored by DeepMind and
               | Google Brain have appeared in Nature, which is the gold
               | standard for peer-reviewed natural science research. What
               | exactly are you trying to claim about Google's peer-
               | reviewed papers?
        
               | stackghost wrote:
               | Nature is just as susceptible to the perverse incentives
               | at play in the academic publishing market as anyone else,
                | and has had its share of controversies over the years,
                | including having to retract papers after they were found
               | to be bogus.
               | 
               | In and of itself, "Being published in a peer reviewed
               | journal" does not place the contents of a paper beyond
               | reproach or criticism.
        
               | gabegobblegoldi wrote:
               | Peer review is not designed to combat fraud.
        
               | nautilius wrote:
               | From personal experience: in Nature Communications the
               | handling editor and editor in chief absolutely do
               | intervene, in my example to suppress a proper lit review
               | that would have revealed the paper under review as much
               | less innovative than claimed.
        
           | 11101010001100 wrote:
            | Their materials discovery paper turned out to have negligible
           | significance.
        
             | xpe wrote:
             | If so, does this qualify as "snake oil"? What do you mean?
             | Snake oil requires exaggeration and deception. Fair?
             | 
             | If a paper / experiment is done with intellectual honesty,
             | great! If it doesn't make a big splash, fine.
        
               | 11101010001100 wrote:
               | The paper is more or less a dead end. If there is another
               | name you want to call it, by all means.
        
               | sanxiyn wrote:
               | I think the paper was probably done honestly, but also
               | very poorly. They claimed synthesis of 36 new materials.
               | When reviewed, for 24/36 "the predicted structure has
               | ordered cations but there is no evidence for order, and a
               | known, disordered version of the compound exists". In
               | fact, with other errors, 36/36 claims were doubtful. This
                | reflects badly on the authors and worse on Nature's peer
                | review process.
               | 
               | https://x.com/Robert_Palgrave/status/1744383962913394758
        
           | bsder wrote:
           | > From my point of view, AlphaGo, AlphaZero, AlphaFold were
           | significant achievements.
           | 
           | These things you mentioned had obvious benchmarks that were
           | _easily_ surpassed by the appropriate  "AI". The evidence
           | that they were better wasn't just significant, it was
           | _obvious_.
           | 
           | This leaves the fact that with what appears to be maximal
           | cooking of the books, the only thing AlphaChip seems to be
           | able to beat is human, manual placement and not anything
           | algorithmic--even from many, many generations ago.
           | 
           | Trying to pass that off as a significant "advance" in a
           | "scientific publication" borders on scientific fraud and
           | should _definitely_ be called out.
           | 
           | The problem here is that I am certain that this is wired to
           | the career trajectories of "Very Important People(tm)" and
           | the fact that it essentially failed miserably is simply not
           | politically allowed.
           | 
           | If they want to lie, they can do that in press releases. If
           | they want published in something reputable, they should have
           | to be able to provide proper evidence for replication.
           | 
           | And, if they can't do that, well, that's an answer itself,
           | no?
        
             | xpe wrote:
             | > Trying to pass that off as a significant "advance" in a
             | "scientific publication" borders on scientific fraud and
             | should definitely be called out.
             | 
             | If true, your stated concerns with the AlphaChip paper --
             | selective benchmarking and potential overselling of results
              | -- reflect poor scientific practice and possible
             | intellectual dishonesty. This does not constitute
             | scientific fraud, which occurs when the underlying
             | method/experiment/rules are faked.
             | 
             | If the paper has issues with how it positions and
             | contextualizes its contribution, criticism is warranted,
             | sure. But don't confuse this with "scientific fraud".
             | 
             | Some context: for as long as benchmark suites have existed,
             | people rightly comment on which benchmarks should be
             | included and how they should be weighted.
        
             | xpe wrote:
             | > "scientific publication"
             | 
              | These air quotes suggest the commenter above doesn't think
              | the paper qualifies as a scientific publication. Such a
             | characterization is unfair.
             | 
             | When I read the Nature article titled "Addendum: A graph
             | placement methodology for fast chip design" [1], I see
             | writing that more than meets the bar for a scientific
             | publication. For example:
             | 
             | > Since publication, we have open-sourced a software
             | repository [21] to fully reproduce the methods described in
             | our paper. External researchers can use this repository to
             | pre-train on a variety of chip blocks and then apply the
             | pre-trained model to new blocks, as was done in our
             | original paper. As part of this addendum, we are also
             | releasing a model checkpoint pre-trained on 20 TPU blocks
             | [22]. For best results, however, we continue to recommend
             | that developers pre-train on their own in-distribution
             | blocks [18], and provide a tutorial on how to perform pre-
             | training with our open-source repository [23].
             | 
             | [1]: https://www.nature.com/articles/s41586-024-08032-5
             | 
             | [18]: Yue, S. et al. Scalability and generalization of
             | circuit training for chip floorplanning. In Proc. 2022
             | International Symposium on Physical Design 65-70 (2022).
             | 
             | [21]: Guadarrama, S. et al. Circuit Training: an open-
             | source framework for generating chip floor plans with
             | distributed deep reinforcement learning. GitHub
             | https://github.com/google-research/circuit_training (2021).
             | 
             | [23]: Guadarrama, S. et al. Pre-training. GitHub
             | https://github.com/google-
             | research/circuit_training/blob/mai... (2021).
        
           | seanhunter wrote:
           | Well here's one exaggeration that was pretty obvious to me
           | straight away as a somewhat disinterested observer. In her
           | status on X Anna Goldie says [1] " AlphaChip was one of the
           | first RL methods deployed to solve a real-world engineering
           | problem". This seems very clearly untrue- for example here's
           | a real-world engineering use of reinforcement learning by
           | google AI themselves from 6 years ago [2] which if you use
           | Anna Goldie's own timeline is 2 years before alphachip.
           | 
           | [1] https://x.com/annadgoldie/status/1858531756506558688
           | 
           | [2] https://youtu.be/W4joe3zzglU?si=mFvZq8gEI6LeEQdC
        
         | xpe wrote:
         | > if it worked, they wouldn't be sharing it and would simply be
         | printing money by taking advantage of it.
         | 
         | Sure, there are some techniques in financial markets that are
         | only valuable when they are not widely known. But claiming this
         | pattern applies universally is incorrect.
         | 
         | Publishing a technique doesn't prove it doesn't work. (Stating
         | it this way makes it fairly obvious.)
         | 
          | DeepMind, like many AI research labs, publishes important and
          | useful research. One might ask "is a lab leaving money on
          | the table by publishing?". Perhaps a better question is "What
         | 'game' is the lab playing and over what time scale?".
        
         | lobochrome wrote:
         | Agreed, in particular on #2
         | 
          | Given infinite time and compute, maybe the approach is
          | significantly better. But that's just not practical. So unless
          | you see dramatic shifts, no one is going to throw away proven
          | results for your new approach, because of the TTM
          | (time-to-market) penalty if it goes wrong.
         | 
         | The EDA industry is (has to be) ultra conservative.
        
           | throwaway2037 wrote:
           | > The EDA industry is (has to be) ultra conservative.
           | 
           | What is special about EDA that requires it to be more
           | conservative?
        
             | achierius wrote:
             | Taping out a chip is an incredibly expensive (7-8 figure)
             | fixed cost. If the chips that come out have too many bugs
             | (say because your PD tools missed up some wiring for 1 in
             | 10,000 blocks) then that money is gone. If you're Intel
             | this is enough to make people doubt the health of your
             | firm; if you're a startup, you're just done.
        
         | raverbashing wrote:
         | Honestly this does not compute
         | 
         | > None of these "AI" tools have yet demonstrated the ability to
         | classify "This is datapath", "This is array logic", "This is
         | random logic".
         | 
         | Sounds like a good objective, one that could be added to
         | training parameters. Or maybe it isn't needed (AI can
          | 'understand' some concepts without explicit tagging)
         | 
         | > If I have the compute power to run EDA tools with a couple of
         | different random seeds, at least one run will likely be a
         | couple percentage points better.
         | 
         | Then do it?! How long does it actually take to run? I know EDA
          | tool creators are bad at some kinds of code optimization (and
         | yes, it's hard) but let's say for a company like Intel, if it
         | takes 10 days to rerun a chip to get 1% better, that sounds
         | like a worthy tradeoff.
         | 
         | > I put this snake oil in the same category as "financial
         | advice"--if it worked, they wouldn't be sharing it and would
         | simply be printing money by taking advantage of it.
         | 
         | Yeah I don't think you understood the problem here. Good
         | financial advice is about balancing risks and returns.
        
         | throwaway2037 wrote:
         | > EDA companies are garbage
         | 
         | I don't understand this comment. Can you please explain? Are
         | they unethical? Or do they write poor software?
        
           | bsder wrote:
           | Yes and yes.
           | 
           | EDA companies are gatekeeping monopolies. They absolutely
           | abuse their monopoly position to extract huge chunks of money
           | out of companies, and are pretty much single-handedly
           | responsible for the fact that the hardware startup ecosystem
           | is moribund compared to that of the software startup
           | ecosystem.
           | 
           | They have been horrible liars about performance and
           | benchmarketing for decades. They dragged their feet miserably
           | over releasing Linux versions of their software because they
            | were extracting money based upon the number of CPU licenses
            | (everything was on Sparc, which was vastly inferior). Their
           | software hasn't really improved all that much over decades--
           | mostly they benefited from Moore's Law. They have made a
           | point of stifling attempts at interoperability and open data
           | exchange. They have bought lots of competitors mostly to just
           | shut them down. I can go on and on.
           | 
           | The EDA companies aren't quite Oracle--but they're not far
           | off.
           | 
           | This is one of the reasons why Google is getting pounded over
           | this--maybe even unfairly. People in the field are _super_
            | sensitive about bullshit claims from EDA vendors--we've
           | heard them _all_ and been on the receiving end of the stick
           | _far_ too many times.
        
             | alexey-salmin wrote:
             | > pretty much single-handedly responsible for the fact that
             | the hardware startup ecosystem is moribund compared to that
             | of the software startup ecosystem.
             | 
             | This was the case before EDA companies even appeared.
             | Hardware is hard because it's manufacturing. You can't
             | "iterate quickly", every iteration costs millions of
             | dollars and so does every mistake.
        
               | bsder wrote:
               | > Hardware is hard because it's manufacturing. You can't
               | "iterate quickly", every iteration costs millions of
               | dollars and so does every mistake.
               | 
               | This is true for injection molding and yet we do that all
               | the time in small businesses.
               | 
               | A mask set for an older technology can be in the range of
               | $50K-$100K. That's right about the same price as
               | injection molds.
               | 
               | The main difference is that Solidworks is about $25K
               | while Cadence, et al, is about a megabuck.
        
             | octoberfranklin wrote:
             | _and are pretty much single-handedly responsible for the
             | fact that the hardware startup ecosystem is moribund_
             | 
             | Yes but not single-handedly -- it's them and the foundries,
             | hand-in-hand.
             | 
             | No startup can compete with Synopsys because TSMC doesn't
             | give out the true design rules to anybody smaller than
              | Apple for FinFET processes. Essentially their DRC+LVS
             | software has become a DRM-encoded version of the design
             | rule manual.
        
             | teleforce wrote:
             | > The EDA companies aren't quite Oracle--but they're not
             | far off.
             | 
              | Agreed with most of what you mentioned, but not with the
              | claim that EDA companies aren't worse than Oracle; at least
              | Oracle is still supporting popular and useful open source
              | projects, namely MySQL, VirtualBox, etc.
             | 
              | What open-source design software are these EDA companies
              | supporting currently, even though most of their software
              | originated from open source EDA software from UC Berkeley,
              | etc.?
        
         | throwup238 wrote:
          | _> if it worked, they wouldn't be sharing it and would simply
         | be printing money by taking advantage of it._
         | 
         | This is a fallacious argument. A better chip design process
         | does not eliminate all other risks like product-market fit or
         | the upfront cost of making masks or chronic mismanagement.
        
       | vighneshiyer wrote:
       | I have published an addendum to an article I wrote about
       | AlphaChip (https://vighneshiyer.com/misc/ml-for-placement/) at
       | the very bottom that addresses this rebuttal from Google and the
       | AlphaChip algorithm in general.
       | 
       | In short, I think the Nature authors have made some reasonable
       | criticisms regarding the training methodology employed by the
       | ISPD authors, but the extreme compute cost and runtime of
       | AlphaChip still makes it non-competitive with commercial
       | autofloorplanners and AutoDMP. Regardless, I think the ISPD
       | authors owe the Nature authors an even more rigorous study that
       | addresses all their criticisms. Even if they just try to evaluate
       | the pre-trained checkpoint that Google published, that would be a
       | useful piece of data to add to the debate.
        
         | wholehog wrote:
         | We're talking 16 GPUs for ~6 hrs for inference, and 48 hrs for
         | pre-training. This is not an exorbitant amount of compute.
         | 
         | A GPU costs $1-2/hr on the cloud market. So, ~$100-200 for
         | inference, and ~$800-1600 for pre-training, which amortizes
         | across chips. Cloud prices are an upper bound -- most CS labs
         | will have way more than this available on premises.
         | 
         | In an industry context, these costs are completely dwarfed by
         | the rest of the chip design process. (For context, the
         | licensing costs alone for most commercial EDA software are in
         | the millions of dollars.)
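          | 
          | As a sanity check on those numbers (a minimal sketch in Python;
          | the GPU count, hours, and $/hr range are the figures quoted
          | above, not measurements):
          | 
          |     GPUS = 16
          |     RATE = (1.0, 2.0)  # assumed cloud price range, $/GPU/hr
          | 
          |     def cost(hours):
          |         return tuple(GPUS * hours * r for r in RATE)
          | 
          |     print("inference (~6 hrs): $%d-%d" % cost(6))   # ~$100-200
          |     print("pre-train (48 hrs): $%d-%d" % cost(48))  # ~$800-1600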
        
           | bushbaba wrote:
            | H100 GPU instances are multiple orders of magnitude more
           | expensive.
        
             | radq wrote:
             | Not true, H100s cost $2-3/GPU/hr on the open market.
        
               | menaerus wrote:
                | Yes, they even go as low as $1/GPU/hr. However, an 8xH100
                | cluster at full utilization draws ~8 kW and costs almost
                | ~$0.5M. A 16xH100 cluster is probably 2x that. How many
                | years before you break even at ~$24/GPU/day of income?
        
               | Jabbles wrote:
               | 7
               | 
               | https://www.google.com/search?q=0.5e6%2F8%2F24%2F365
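                | 
                | (i.e. $500,000 / (8 GPUs x $24/GPU/day x 365 days/yr)
                | ~= 7.1 years, electricity not included)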
        
               | menaerus wrote:
                | Did you really not understand the rhetorical nature of my
                | question, and assume that I can't do 1st grade primary
               | school math?
        
               | solidasparagus wrote:
               | Who cares? That's someone else's problem. I just pay
                | $2-3/hr and the H100s are usable
        
             | YetAnotherNick wrote:
             | H100 GPUs are more or less similar in price/performance. It
             | is 2-3x more expensive per hour for 2-3x higher
             | performance.
        
           | vighneshiyer wrote:
           | You are correct. For commercial use, the GPUs used for
           | training and fine-tuning aren't a problem financially.
           | However, if we wanted to rigorously benchmark AlphaChip
           | against simulated annealing or other floorplanning
           | algorithms, we have to afford the same compute and runtime
           | budget to each algorithm. With 16 GPUs running for 6 hours,
           | you could explore a huge placement space using any algorithm,
           | and it isn't clear if RL will outperform the other ones.
           | Furthermore, the runtime of AlphaChip as shown in the Nature
           | paper and ISPD was still significantly greater than Cadence's
           | concurrent macro placer (even after pre-training, RL requires
           | several hours of fine-tuning on the target problem instance).
           | Arguably, the runtime could go down with more GPUs, but at
           | this point, it is unclear how much value is coming from the
           | policy network / problem embedding vs the ability to explore
           | many potential placements.
        
             | Jabbles wrote:
             | You're saying that if the other methods were given the
             | equivalent amount of compute they might be able to perform
             | as well as AlphaChip? Or at least that the comparison would
             | be fairer?
             | 
             | Are the other methods scalable in that way?
        
               | pclmulqdq wrote:
               | Yes, they are. The other approaches usually look like
               | simulated annealing, which has several hyperparameters
                | that control how much compute is used, and results improve
                | with more compute.
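                | 
                | A minimal sketch of that compute knob in Python (the cost
                | and neighbor functions here are toy placeholders, not a
                | real placement cost model):
                | 
                |     import math, random
                | 
                |     def cost(x): return (x - 3.0) ** 2   # toy cost
                |     def nbr(x): return x + random.uniform(-1, 1)
                | 
                |     MOVES = 20000        # the compute knob
                |     T0, T1 = 1.0, 1e-3   # cooling schedule endpoints
                |     state = best = 10.0
                |     for i in range(MOVES):
                |         t = T0 * (T1 / T0) ** (i / MOVES)  # cooling
                |         cand = nbr(state)
                |         d = cost(cand) - cost(state)
                |         if d < 0 or random.random() < math.exp(-d / t):
                |             state = cand
                |         if cost(state) < cost(best):
                |             best = state
                |     print("best x ~= %.3f" % best)  # approaches 3.0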
        
               | vighneshiyer wrote:
               | Existing mixed-placement algorithms depend on
               | hyperparameters, heuristics, and initial states /
               | randomness. If afforded more compute resources, they can
               | explore a much wider space and in theory come up with
               | better solutions. Some algorithms like simulated
               | annealing are easy to modify to exploit arbitrarily more
               | compute resources. Indeed, I believe the comparison of
               | AlphaChip to alternatives would be fairer if compute
               | resources and allowed runtime were matched.
               | 
               | In fact, existing algorithms such as naive simulated
               | annealing can be easily augmented with ML (e.g. using
               | state embeddings to optimize hyperparameters for a given
               | problem instance, or using a regression model to fine-
               | tune proxy costs to better correlate with final QoR).
               | Indeed, I strongly suspect commercial CAD software is
               | already applying ML in many ways for mixed-placement and
               | other CAD algorithms. The criticism against AlphaChip
               | isn't about rejecting any application of ML to EDA CAD
               | algorithms, but rather the particular formulation they
               | used and objections to their reported results /
               | comparisons.
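                | 
                | To illustrate the proxy-cost calibration idea (a sketch
                | only; the feature names and numbers are made up, and this
                | is not a claim about what any commercial tool does):
                | 
                |     import numpy as np
                | 
                |     # Rows are past placements; columns are proxy metrics
                |     # (wirelength, congestion, density) -- hypothetical.
                |     proxies = np.array([[1.00, 0.30, 0.55],
                |                         [0.92, 0.41, 0.60],
                |                         [1.10, 0.22, 0.50],
                |                         [0.85, 0.50, 0.65]])
                |     qor = np.array([0.74, 0.71, 0.78, 0.69])  # final QoR
                | 
                |     # Least-squares weights so a weighted proxy cost tracks
                |     # the measured post-route QoR more closely.
                |     w, *_ = np.linalg.lstsq(proxies, qor, rcond=None)
                |     calibrated = proxies @ w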
        
         | nemonemo wrote:
         | In the conclusion of the article, you said: "While I concede
         | that there are things the ISPD authors could have done better,
         | their conclusion is still sound. The Nature authors do not
         | address the fact that CMP and AutoDMP outperform CT with far
         | less runtime and compute requirements."
         | 
         | One key argument in the rebuttal against the ISPD article is
         | that the resources used in their comparison were significantly
         | smaller. To me, this point alone seems sufficient to question
         | the validity of the ISPD work's conclusions. What are your
         | thoughts on this?
         | 
         | Additionally, I noticed that the neutral tone of this comment
         | is quite a departure from the strongly critical tone of your
         | article toward the AlphaChip work (words like "arrogance",
         | "disdain", "hyperbole", "belittling", "hostile" for AlphaChip
         | authors, as opposed to "excellent" for a Synopsys VP.) Could
         | you share where this difference in tone originates?
        
           | vighneshiyer wrote:
           | > One key argument in the rebuttal against the ISPD article
           | is that the resources used in their comparison were
           | significantly smaller. To me, this point alone seems
           | sufficient to question the validity of the ISPD work's
           | conclusions. What are your thoughts on this?
           | 
           | I believe this is a fair criticism, and it could be a reason
           | why the ISPD Tensorboard shows divergence during training for
           | some RTL designs. The ISPD authors provide their own
           | justification for their substitution of training time for
           | compute resources in page 11 of their paper
           | (https://arxiv.org/pdf/2302.11014).
           | 
           | I do not think it changes the ISPD work's conclusions however
           | since they demonstrate that CMP and AutoDMP outperform CT wrt
           | QoR and runtime even though they use much fewer compute
           | resources. If more compute resources are used and CT becomes
           | competitive wrt QoR, then it will still lag behind in
           | runtime. Furthermore, Google has not produced evidence that
           | AlphaChip, with their substantial compute resources,
           | outperforms commercial placers (or even AutoDMP). In the
           | recent rebuttal from Google
           | (https://arxiv.org/pdf/2411.10053), the only claim on page 8
           | says Google VLSI engineers preferred RL over humans and
           | commercial placers on a blind study conducted in 2020.
           | Commercial mixed placers, if configured correctly, have
           | become very good over the past 4 years, so perhaps another
           | blind study is warranted.
           | 
           | > Additionally, I noticed that the neutral tone of this
           | comment is quite a departure from the strongly critical tone
           | of your article
           | 
           | I will openly admit my bias is against the AlphaChip work. I
           | referred to the Nature authors as 'arrogant' and 'disdainful'
           | with respect to their statement that EDA CAD engineers are
           | just being bitter ML-haters when they criticize the AlphaChip
           | work. I referred to Jeff Dean as 'belittling' and 'hostile'
           | and using 'hyperbole' with respect to his statements against
           | Igor Markov, which I think is unbecoming of him. I referred
           | to Shankar as 'excellent' with respect to his shrewd business
           | acumen.
        
             | nemonemo wrote:
             | Thank you for your thoughtful response. Acknowledging
             | potential biases openly in a public forum is never easy,
             | and in my view, it adds credibility to your words compared
             | to leaving such matters as implicit insinuations.
             | 
             | That said, on page 8, the paper says that 'standard
             | licensing agreements with commercial vendors prohibit
             | public comparison with their offerings.' Given this
             | inherent limitation, what alternative approach could have
             | been taken to enable a more meaningful comparison between
             | CT and CMP?
        
               | vighneshiyer wrote:
               | So I'm not sure what Google is referring to here. As you
               | can see in the ISPD paper (https://vlsicad.ucsd.edu/Publi
               | cations/Conferences/396/c396.p...) on page 5, they openly
               | compare Cadence CMP with AutoDMP and other algorithms
               | quantitatively. The only obfuscation is with the
               | proprietary GF12 technology, where they can't provide
               | absolute numbers, but only relative ones. Comparison
               | against commercial tools is actually a common practice in
               | academic EDA CAD papers, although usually the exact tool
               | vendor is obfuscated. CAD tool vendors have actually
               | gotten more permissive about sharing tool data and
               | scripts in public over the past few years. However, PDKs
               | have always been under NDAs and are still very
               | restrictive.
               | 
               | Perhaps the Cadence license agreement signed by a
               | corporation is different from the one signed by a
               | university. In such a case, they could partner with a
               | university. But I doubt their license agreement prevents
               | any public comparison. For example, see the AutoDMP paper
               | from NVIDIA (https://d1qx31qr3h6wln.cloudfront.net/public
               | ations/AutoDMP.p...) where on page 7 they openly
               | benchmark their tool against Cadence Innovus. My
               | suspicion is they wish to keep details about the TPU
               | blocks they evaluated under tight wraps.
        
               | nemonemo wrote:
               | The UCSD paper says "We thank ... colleagues at Cadence
               | and Synopsys for policy changes that permit our methods
               | and results to be reproducible and sharable in the open,
               | toward advancement of research in the field." This
               | suggests that there may have been policies restricting
               | publication prior to this work. It would be intriguing to
               | see if future research on AlphaChip could receive a
               | similar endorsement or support from these EDA companies.
        
               | vighneshiyer wrote:
               | Cadence in particular has been quite receptive to
               | allowing academics and researchers to benchmark new
               | algorithms against their tools. They have also been quite
               | permissive with letting people publish TCL scripts for
               | their tools (https://github.com/TILOS-AI-
               | Institute/MacroPlacement/tree/ma...) that in theory
               | should enable precise reproduction of results. To my
               | knowledge, Cadence has been very permissive from 2022
               | onwards, so while Google's objections to publishing data
               | from CMP may have been valid when the Nature paper was
               | published, they are no longer valid today.
        
               | nemonemo wrote:
               | We're not just talking about academia--Google's AlphaChip
               | has the potential to disrupt the balance of the EDA
               | industry's duopoly. It seems unlikely that Google could
               | easily secure the policy or license changes necessary to
               | publish direct comparisons in this context.
               | 
               | If publicizing comparisons of CMPs is as permissible as
               | you suggest, have you seen a publication that directly
               | compares a Cadence macro placement tool with a Synopsys
               | tool? If I were the technically superior party, I'd be
               | eager to showcase the fairest possible comparison,
               | complete with transparent benchmarks and tools. In the
               | CPU design space, we often see standardized benchmarking
               | tools like SPEC microbenchmarks and gaming benchmarks.
               | (And IMO that's part of why AMD could disrupt the PC
               | market.) Does the EDA ecosystem support a similarly open
               | culture of benchmarking for commercial tools?
        
       | AtlasBarfed wrote:
       | How the hell would you verify an AI-generated silicon design?
       | 
       | Like, for a CPU, you want to be sure it behaves properly for the
       | given inputs. Anyone remember that floating point error in, was
       | it Pentium IIs or Pentium IIIs?
       | 
       | I mean, I guess if the chip is designed for AI, and AI output is
       | inherently non-guaranteed anyway, then a non-guaranteed AI chip
       | design doesn't really add any new non-guarantees.
       | 
       | Unless it is...
        
         | lisper wrote:
         | > How the hell would you verify an AI-generated silicon design?
         | 
         | The same way you verify a human-generated one.
         | 
         | > Anyone remember that floating point error in, was it Pentium
         | IIs or Pentium IIIs?
         | 
         | That was 1994. The industry has come a long way in the
         | intervening 30 years.
        
         | gwervc wrote:
         | A well-working CPU is probably beside the point. What's
         | important now is for researchers to publish papers using or
         | speaking about AI. Then for executives and managers to deploy
         | AI in their companies. Then selling AI PCs (somehow, we are
         | already at this step). Whatever the results are. Customer
         | issues will be solved by using more AI (think chatbots) until
         | morale improves.
        
         | quadrature wrote:
         | > How the hell would you verify an AI-generated silicon design?
         | 
         | I think you're asking a different question, but in the context
         | of the OP, researchers are exploring AI for solving
         | deterministic but intractable problems in the field of chip
         | design and not generating designs end to end.
         | 
         | Here's an excerpt from the paper.
         | 
         | "The objective is to place a netlist graph of macros (e.g.,
         | SRAMs) and standard cells (logic gates, such as NAND, NOR, and
         | XOR) onto a chip canvas, such that power, performance, and area
         | (PPA) are optimized, while adhering to constraints on placement
         | density and routing congestion (described in Sections 3.3.6 and
         | 3.3.5). Despite decades of research on this problem, it is
         | still necessary for human experts to iterate for weeks with the
         | existing placement tools, in order to produce solutions that
         | meet multi-faceted design criteria."
         | 
         | The hope is that Reinforcement Learning can find solutions to
         | such complex optimization problems.
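         | 
         | To make "optimization objective" a bit more concrete, here is
         | a toy stand-in for a placement cost, combining a wirelength
         | proxy with a density penalty. This is purely illustrative; the
         | names, weights, and grid are made up, not taken from the
         | paper:
         | 
         |     # Toy placement cost: half-perimeter wirelength (HPWL)
         |     # plus a crude density penalty. Illustrative only.
         |     import numpy as np
         | 
         |     def placement_cost(coords, nets, canvas=(100.0, 100.0),
         |                        bins=10, density_weight=1.0):
         |         """coords: {cell: (x, y)}; nets: lists of cell names."""
         |         # 1) HPWL: for each net, the half-perimeter of the
         |         #    bounding box of the cells it connects.
         |         hpwl = 0.0
         |         for net in nets:
         |             xs = [coords[c][0] for c in net]
         |             ys = [coords[c][1] for c in net]
         |             hpwl += (max(xs) - min(xs)) + (max(ys) - min(ys))
         |         # 2) Density: count cells per grid bin and penalize
         |         #    bins holding more than their fair share of cells.
         |         counts = np.zeros((bins, bins))
         |         for x, y in coords.values():
         |             i = min(int(x / canvas[0] * bins), bins - 1)
         |             j = min(int(y / canvas[1] * bins), bins - 1)
         |             counts[i, j] += 1
         |         target = len(coords) / (bins * bins)
         |         overflow = np.clip(counts - target, 0, None).sum()
         |         return hpwl + density_weight * overflow
         | 
         | An RL agent (or an annealer, or an analytic placer) would then
         | search for coordinates that minimize this kind of cost,
         | subject to the legality and congestion constraints the real
         | tools enforce.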
        
           | throwaway2037 wrote:
           | > Despite decades of research on this problem, it is still
           | necessary for human experts to iterate for weeks with the
           | existing placement tools, in order to produce solutions that
           | meet multi-faceted design criteria.
           | 
           | Ironically, this sounds a lot like building a bot to play
           | StarCraft, which is exactly what AlphaStar did. I had no idea
           | that EDA layout is still so difficult and manual in 2024.
           | This seems like a very worthy area of research.
           | 
           | I am not an expert in AI/ML, but is the ultimate goal: Train
           | on as many open source circuit designs as possible to build a
           | base, then try to solve IC layout problems via reinforcement
           | learning, similar to AlphaStar. Finally, use the trained
           | model to do inference during IC layout?
        
         | asveikau wrote:
         | The famous FPU issue I can think of was in the original
         | Pentium (the 1994 FDIV bug).
        
       | wholehog wrote:
       | The paper: https://arxiv.org/abs/2411.10053
        
       | _cs2017_ wrote:
       | Curious why there's so much emotion and unpleasantness in this
       | dispute? How did it evolve from the boring academic argument
       | about benchmarks, significance, etc to a battle of personal
       | attacks?
        
         | boredatoms wrote:
         | A lot of people work on non-AI implementations
        
           | benreesman wrote:
           | This is a big part of the reason. But it behooves us to ask
           | why a key innovation in a field (and I trust Jeff Dean that
           | this is one, I've never seen any reason to doubt either his
           | integrity or ability) should produce such a reaction. What
           | could make people act not just chagrined that their approach
           | wasn't the end state, but as though it was _existential_ to
           | discredit such an innovation?
           | 
           | Surely all of the people who did the work that the innovation
           | rests on should be confident they will be relevant, involved,
           | comfortable, and safe in the post-innovation world?
           | 
           | And yet it's not clear they should feel this way. Luddism
           | seems an unfounded ideology over the scope of history since
           | the origin of the term. But over the period since "AI"
           | entered the public discussion at the current level? Almost
           | two years exactly? Making the Luddite agenda credible has
           | seemed a ubiquitous talking point.
           | 
           | Over that time frame technical people have been laid off in
           | staggering numbers, a steadily-shrinking number of employers
           | have been slashing headcount and posting EPS beats, and "AI"
           | has been mentioned in every breath. It's so extreme that even
           | sophisticated knowledge of the kinds of subject matter that
           | goes into AlphaChip is (allegedly) useless without access to
           | the Hopper FLOPs.
           | 
           | If the AI Cartel was a little less rapacious, people might be
           | a little more open to embracing the AI revolution.
        
         | jart wrote:
         | If you think this is unpleasant, you should see the
         | environmentalists who try to take a poke at Jeff Dean on
         | Twitter.
        
           | _cs2017_ wrote:
           | Well... I kinda expect some people to be overly emotional.
           | But I just didn't expect this particular group of people to
           | be that.
        
         | RicoElectrico wrote:
         | Making extraordinary claims without a way to replicate them.
         | And then running to the press, which will swallow anything.
         | Because "AI designs AI... umm... I mean chips" sounds
         | futuristic to liberal-arts majors (and apparently to
         | programmers too, whom I'd expect to know better and question
         | everything "AI").
         | 
         | The whole publication process seems dishonest, starting from
         | publishing in Nature (why not ISSCC or something similar?)
        
         | aithrowawaycomm wrote:
         | The issue is that Big Tech commercial incentives around AI have
         | polluted the "boring academic" waters with dishonest
         | infomercials masquerading as journal articles or arXiv
         | preprints[1], and as a direct result contemporary AI research
         | has a much worse "replication crisis" than the social sciences,
         | yet with far fewer legitimate excuses.
         | 
         | Assuming Google isn't lying, a lot of controversy would go away
         | if they actually released their benchmark data for independent
         | people to look at. They are still refusing to do so:
         | https://cacm.acm.org/news/updates-spark-uproar/ Google thinks
         | we should simply accept their conclusions by fiat. And don't
         | forget about this:
         | 
         |     Madden further pointed out that the "30 to 35%" advantage
         |     of RePlAce was consistent with findings reported in a
         |     leaked paper by internal Google whistleblower Satrajit
         |     Chatterjee, an engineer who Google fired in 2022 when he
         |     first tried to publish the paper that discredited the
         |     "superhuman" claims Google was making at the time for its
         |     AI approach to chip design.
         | 
         | It is entirely appropriate to make "personal attacks" against
         | Jeff Dean, because the heart of the criticism is that his
         | personality is dishonest and authoritarian: he publishes
         | suspicious research and fires people who dissent.
         | 
         | [1] Jeff Dean hypocritically sneering about the critique being
         | a conference paper is especially galling. What an unbelievable
         | asshole.
        
         | akira2501 wrote:
         | Money.
        
       | rowanG077 wrote:
       | I don't get it. Why isn't the model open if it works? If it
       | isn't, this is just a fart in the wind. If it is, the findings
       | should be straightforward to replicate.
        
         | amelius wrote:
         | Yes, the community should force Nature to up its standards or
         | ditch it. Software replication should be trivial in this day
         | and age.
        
       | gabegobblegoldi wrote:
       | Additional context: Jeff Dean has been accused of fraud and
       | misconduct in AlphaChip.
       | 
       | https://regmedia.co.uk/2023/03/26/satrajit_vs_google.pdf
        
         | jiveturkey wrote:
         | the link is for a wrongful termination lawsuit, related to the
         | fraud but not a case for the fraud itself. settled may 2024
        
       | oesa wrote:
       | In the tweet Jeff Dean says that Cheng et al. failed to follow
       | the steps required to replicate the work of the Google
       | researchers.
       | 
       | Specifically:
       | 
       | > In particular the authors did no pre-training (despite pre-
       | training being mentioned 37 times in our Nature article), robbing
       | our learning-based method of its ability to learn from other chip
       | designs
       | 
       | But in the Circuit Training Google repo[1] they specifically say:
       | 
       | > Our results training from scratch are comparable or better than
       | the reported results in the paper (on page 22) which used fine-
       | tuning from a pre-trained model.
       | 
       | I may be misunderstanding something here, but which one is it?
       | Did they mess up when they did not pre-train, or did they follow
       | the "steps" described in the original repo and try to get a fair
       | reproduction?
       | 
       | Also, the UCSD group had to reverse-engineer several steps to
       | reproduce the results, so it seems the paper's results weren't
       | reproducible on their own.
       | 
       | [1]: https://github.com/google-
       | research/circuit_training/blob/mai...
        
         | gabegobblegoldi wrote:
         | Markov's paper also has links to Google papers from two
         | different sets of authors that show minimal advantage from
         | pretraining. And given the small number of benchmarks, using a
         | pretrained model from Google whose provenance is not known
         | would be counterproductive. Google likely trained it on all
         | available benchmarks to regurgitate the best solutions of
         | commercial tools.
        
         | cma wrote:
         | Training from scratch could presumably mean including the new
         | design attempts and old designs mixed in.
         | 
         | So no contradiction: pretrain on old designs then finetune on
         | new design, vs train on everything mixed together throughout.
         | Finetuning can cause catastrophic forgetting. Both could have
         | better performance than not including old designs.
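         | 
         | Roughly, the two regimes could look like the toy sketch below.
         | Nothing here is from either paper; the model interface and
         | data names are hypothetical, just to make the distinction
         | concrete:
         | 
         |     import random
         | 
         |     def pretrain_then_finetune(model, old_designs, new_design,
         |                                steps=100):
         |         for d in old_designs:         # pre-train on prior designs
         |             model.update(d)
         |         for _ in range(steps):        # fine-tune on the new design;
         |             model.update(new_design)  # may forget the old ones
         | 
         |     def mixed_from_scratch(model, old_designs, new_design,
         |                            steps=100):
         |         pool = old_designs + [new_design]  # old data stays in the mix
         |         for _ in range(steps):
         |             model.update(random.choice(pool))
         | 
         | If that is what "from scratch" means here, the old designs
         | still contribute in both regimes, which would explain why the
         | two sets of results need not contradict each other.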
        
       | puff_pastry wrote:
       | The biggest disappointment is that these discussions are still
       | happening on Twitter/X. Leave that platform already
        
         | xpe wrote:
         | Sure, we want individuals to act in a way to mitigate
         | collective action problems. But the collective action problem
         | exists (by definition) because individuals are trapped in some
         | variation of a prisoner's dilemma.
         | 
         | So, collective action problems are nearly a statistical
         | certainty across a wide variety of situations. And yet we still
         | "blame" individuals? We should know better.
        
           | pas wrote:
           | So you're saying Jeff, the Head of AI at Google, can't
           | choose a better venue?
           | 
           | He's not the first Jeffrey with a lot of power who doesn't
           | care.
        
             | xpe wrote:
             | > So you're saying Jeff, the Head of AI at Google, can't
             | choose a better venue?
             | 
             | Phrasing it this way isn't useful. Talking about choice in
             | the abstract doesn't help with a game-theoretic analysis.
             | You need costs and benefits too.
             | 
             | There are many people who face something like a prisoner's
             | dilemma (on Twitter, for example). We could assess the
             | cost-benefit of a particular person leaving Twitter. We
             | could even judge them according to some standards (ethical,
             | rational, and so on). But why bother?...
             | 
             | ...Think about major collective action failures. How often
             | are they the result of just one person's decisions? How
             | does "blaming" or "judging" an individual help make a
             | situation better? This effort on blaming could be better
             | spent elsewhere, such as understanding the system and
             | finding leverage points.
             | 
             | There are cases where blaming/guilt can help, but only in
             | the prospective sense: if a person knows they will be
             | blamed and face consequences for an action, it will make
             | that action more costly. This might be enough to deter than
             | decision. But do you think this applies in the context of
             | the "do I leave Twitter?" decision? I'd say very little, if
             | at all.
        
               | pas wrote:
               | Yes, but the game matrix is not that simple. There's a
               | whole gamut of possible actions between defect and sleep
               | with Elon.
               | 
               | Cross-posting to a Mastodon account is not that hard.
               | 
               | I look at this from two viewpoints. One is that it's good
               | that he spends most of his time and energy doing
               | research/management and not getting bogged down in
               | culture war stuff. The other is that those who have all
               | this power ought to wield it a tiny tiny bit more
               | responsibly. (IMHO social influence of the
               | elites/leaders/cool-kids is also among those leverage
               | points you speak of.)
               | 
               | Also, I'm not blaming him. I don't think it's morally
               | wrong to use X. (I think it's mentally harmful, but X is
               | not unique in this. Though character limit does select
               | for "no u" type messages.) I'm at best cynically musing
               | about the claimed helplessness of Jeff Dean with regards
               | to finding a forum.
        
       | segmondy wrote:
       | It's ridiculous how expensive the wrong hire can be
       | https://www.wired.com/story/google-brain-ai-researcher-fired...
        
         | benreesman wrote:
         | It is indeed a big deal to hire people who will commit or
         | contrive at fraud: academic, financial, or otherwise.
         | 
         | But the best (probably only) way to put downward pressure on
         | that is via internal incentives, controls, and culture. You
         | push hard enough for such-and-such percent per cadence with no
         | upper bound and graduate the folks who reliably deliver it
         | without
         | checking if the win was there to begin with? This is scale-
         | invariant: it could be in a pod, a department, a company, a
         | hedge fund that owns much of those companies, a fund of those
         | funds, the federal government.
         | 
         | Sooner or later your leadership is substantially penetrated by
         | the unscrupulous. We see this in academia with the spate of
         | scandals around publications. We see this in finance with, who
         | can even count that high anymore. You see Holmes and SBF in
         | prison, but the folks they funded are still at the apex of
         | relevance, as is everyone from that clique. Anyone who didn't
         | just fall off a turnip truck knows they have carried that
         | ideology with them and have better lawyers now.
         | 
         | There's an old saw that a "fish rots from the head". We can't
         | look at every manner of shadiness and constant scandal from the
         | iconic leaders of our STEM industry and say "good for them,
         | they outsmarted the system" and expect any result other than a
         | broad-spectrum attack on any honest, fair, equitable status
         | quo.
         | 
         | We all voted with our feet (and I did my share of that too
         | before I quit in disgust) for a "might makes right" quasi-
         | religious system of ideals, known variously as Objectivism,
         | Effective Altruism, and Capitalism (of which it is no kind). We
         | shouldn't be surprised that everything is kind of tarnished
         | and sticky now.
         | 
         | The answer today? I don't know. Work for the less bad as
         | opposed to more bad companies, speak out at least anonymously
         | about abuses, listen to the leaders speak in interviews and
         | scrutinize it. I'm open to suggestions.
        
         | fxtentacle wrote:
         | It seems that Chatterjee - the bad guy in your linked article -
         | is now suing Google because he thinks he got canned for
         | pointing out that his boss - Jeff Dean mentioned in the article
         | discussed here - was knowingly publishing fraudulent claims.
         | 
         | "To be clear, we do NOT have evidence to believe that RL
         | outperforms academic state-of-the-art and strongest commercial
         | macro placers. The comparisons for the latter were done so
         | poorly that in many cases the commercial tool failed to run due
         | to installation issues." and that's supposedly a screenshot
         | from an internal presentation done by Jeff Dean.
         | 
         | https://regmedia.co.uk/2023/03/26/satrajit_vs_google.pdf
         | 
         | As an outsider, I find it very difficult to judge if Chatterjee
         | was a bad and expensive hire (because he suppressed good
         | results by coworkers) or if he was a very valuable employee
         | (because he tried to prevent publishing false statements).
        
           | mi_lk wrote:
           | Clearly there's a huge difference between
           | 
           | 1. preventing bad things
           | 
           | 2. preventing bad things in a way that makes all junior
           | members on the receiving end feel bullied
           | 
           | So judging from the article alone, it's either suppressing
           | good results or 2. above, neither of which is valuable in
           | my book
        
             | gabegobblegoldi wrote:
             | The court case provides more details. Looks like the junior
             | researchers and Jeff Dean teamed up and bullied Chatterjee
             | and his team to prevent the fraud from being exposed. IIRC
             | the NYT reported at the time that Chatterjee was fired
             | within an hour of disclosing that he was going to report
             | Jeff Dean to the Alphabet Board for misconduct.
        
           | jiveturkey wrote:
           | case was settled in may 2024
        
       | dang wrote:
       | Related. Others?
       | 
       |  _AI Alone Isn't Ready for Chip Design_ -
       | https://news.ycombinator.com/item?id=42207373 - Nov 2024 (2
       | comments)
       | 
       |  _That Chip Has Sailed: Critique of Unfounded Skepticism Around
       | AI for Chip Design_ -
       | https://news.ycombinator.com/item?id=42172967 - Nov 2024 (9
       | comments)
       | 
       |  _Reevaluating Google's Reinforcement Learning for IC Macro
       | Placement (AlphaChip)_ -
       | https://news.ycombinator.com/item?id=42042046 - Nov 2024 (1
       | comment)
       | 
       |  _How AlphaChip transformed computer chip design_ -
       | https://news.ycombinator.com/item?id=41672110 - Sept 2024 (194
       | comments)
       | 
       |  _Tension Inside Google over a Fired AI Researcher's Conduct_ -
       | https://news.ycombinator.com/item?id=31576301 - May 2022 (23
       | comments)
       | 
       |  _Google is using AI to design chips that will accelerate AI_ -
       | https://news.ycombinator.com/item?id=22717983 - March 2020 (1
       | comment)
        
       | lumb63 wrote:
       | I've not followed this story at all, and have no idea what is
       | true or not, but generally when people use a boatload of
       | adjectives which serve no purpose but to skew opinion, I assume
       | they are not being honest. Using certain words to describe a
       | situation does _not_ make the situation what the author is
       | saying, and if it is as they say, then the actual content should
       | speak for itself.
       | 
       | For instance:
       | 
       | > Much of this unfounded skepticism is driven by a deeply flawed
       | non-peer-reviewed publication by Cheng et al. that claimed to
       | replicate our approach but failed to follow our methodology in
       | major ways. In particular the authors did no pre-training
       | (despite pre-training being mentioned 37 times in our Nature
       | article),
       | 
       | This could easily be written more succinctly, and with less bias,
       | as:
       | 
       | > Much of this skepticism is driven by a publication by Cheng et
       | al. that claimed to replicate our approach but failed to follow
       | our methodology in major ways. In particular the authors did no
       | pre-training,
       | 
       | Calling the skepticism unfounded or deeply flawed does not make
       | it so, and pointing out that a particular publication is not peer
       | reviewed does not make its contents false. The authors would be
       | better served by maintaining a more neutral tone rather than
       | coming off accusatory and heavily biased.
        
       | Keyframe wrote:
       | At this point in time, why wouldn't we immediately give Jeff
       | Dean at least the benefit of the doubt? His track record is
       | second to
       | none, and he's still going strong. Has something happened that
       | cast a shadow on him? Sometimes it is the messenger that brings
       | in the weight.
        
         | gabegobblegoldi wrote:
         | Looks like he aligned himself with the wrong folks here. He is
         | a system builder at heart but not an expert in chip design or
         | EDA. And also not really an ML researcher. Some would say he
         | got taken for a ride by a young charismatic grifter and is now
         | in too deep to back out. His focus on this project didn't help
         | with his case at Google. They moved all the important stuff
         | away from him and gave it to Demis last year and left him with
         | an honorary title. Quite sad really for someone of his
         | accomplishments.
        
           | alsodumb wrote:
           | I mean, Jeff Dean is probably more of an ML researcher than
           | 90% of the ML researchers out there. Sure, he may not be
           | working on state-of-the-art stuff himself, but he's too far
           | up the chain to do that.
        
       ___________________________________________________________________
       (page generated 2024-12-01 23:00 UTC)