gabegobblegoldi's comments

Looks like he aligned himself with the wrong folks here. He is a system builder at heart, but not an expert in chip design or EDA, and not really an ML researcher either. Some would say he got taken for a ride by a young charismatic grifter and is now in too deep to back out. His focus on this project didn’t help his standing at Google: last year they moved all the important work away from him, gave it to Demis, and left him with an honorary title. Quite sad, really, for someone of his accomplishments.


Not an ML researcher, too? He was working on neural networks in 1990. Last year he was under Research, and now he reports directly to Sundar. What do you know that we don't?


I don't think he got taken for a ride. Rather, he also wanted to believe that AlphaChip would be as revolutionary as it claimed to be and chose to ignore Chatterjee's reservations. Understandable, given all the AlphaX models coming out around that time.


I mean, Jeff Dean is probably more of an ML researcher than 90% of the ML researchers out there. Sure, he may not be working on state-of-the-art stuff himself, but he's too far up the chain to do that.


That's an appeal to authority, and not an effective one. Jeff Dean doesn't have a good track record in chip design.


What are you even talking about? Jeff had a hand in the TPU, which is so successful that all the other AI companies are trying to clone it and spin up their own efforts to build custom AI chips.


Please specify what you mean by "had a hand in TPU" and what exactly his role was. Thank you.


By what measure are TPUs “successful”? Where is your data coming from?


They're the only non-Nvidia accelerator used to train state-of-the-art large language models at scale?



> Some would say he got taken for a ride by a young charismatic grifter and is now in too deep to back out.

Was the TPU physical design team also taken in? And also MediaTek? And also TF-Agents, which publicly said they reproduced the AlphaChip method and results exactly?


What did the TPU physical design team say about this publicly? Can you also point to a statement from MediaTek? (I've seen a quote in a Google blog but was unable to confirm it.) Who on the TF-Agents team has a serious physical design background?


Are you really suggesting that the TPU team does not stand behind the graphs in Google's own blog post? And that MediaTek does not stand behind their quoted statement?


How would I know either way?

Someone mentioned here before that Google folks have been using "hyperbole". So, if MediaTek can clarify how they are using AlphaChip, everyone wins.


The court case provides more details. Looks like the junior researchers and Jeff Dean teamed up and bullied Chatterjee and his team to prevent the fraud from being exposed. IIRC the NYT reported at the time that Chatterjee was fired within an hour of disclosing that he was going to report Jeff Dean to the Alphabet Board for misconduct.


Markov’s paper also links to Google papers from two different sets of authors that show minimal advantage from pretraining. And given the small number of benchmarks, using a pretrained model from Google, whose provenance is not known, would be counterproductive. Google likely trained it on all available benchmarks to regurgitate the best solutions of commercial tools.


In this case there were credible claims of fraud from Google insiders. See my comment above.


As Markov claims, Nature did not follow its own policy. Since Google’s results are only on their own designs, no one can replicate them. Nature is single-blind, so they probably didn’t want to turn down Jeff Dean and risk losing future business from Google.


Peer review is not designed to combat fraud.


Additional context: Jeff Dean has been accused of fraud and misconduct in AlphaChip.

https://regmedia.co.uk/2023/03/26/satrajit_vs_google.pdf


The link is to a wrongful-termination lawsuit, related to the fraud allegations but not a case about the fraud itself. It settled in May 2024.


"Settled" does not mean "Dean did nothing wrong". It means "Google paid the plaintiffs a lot of money so they'd stop saying publicly that Dean did something wrong", which is very different.


Good question. I thought the TPUs were a way for Google to apply pricing pressure to Nvidia by having an alternative. They are not particularly better (it’s hard to get high utilization), and I believe Google continues to be a big buyer of Nvidia chips.


Interesting == Suspicious? I think this is a big red flag to those in the know.


Doesn’t look like it. In fact the original paper claimed that their RL method could be used for all sorts of combinatorial optimization problems. Yet they chose an obscure problem in chip design and showed their results on proprietary data instead of standard public benchmarks.

Instead they could have demonstrated their amazing method on any number of standard NP hard optimization problems e.g. traveling salesman, bin packing, ILP, etc. where we can generate tons of examples and verify easily whether it produces better results than other solvers or not.
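To make the "easy to verify" point concrete, here is a minimal sketch (my own illustration, not from any of the papers discussed): for a problem like TSP, scoring any proposed solution takes a few lines of code, so comparing solvers on freely generated instances is trivially checkable by anyone.

```python
import math
import random

def tour_length(points, tour):
    """Total Euclidean length of a closed tour (tour = list of point indices)."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

# Generate a random instance and compare two candidate "solvers".
random.seed(0)
points = [(random.random(), random.random()) for _ in range(50)]

identity_tour = list(range(50))  # trivial baseline: visit points in index order

# Greedy nearest-neighbour heuristic, as a stand-in for a smarter solver.
nn_tour = [0]
unvisited = set(range(1, 50))
while unvisited:
    last = nn_tour[-1]
    nxt = min(unvisited, key=lambda j: math.dist(points[last], points[j]))
    nn_tour.append(nxt)
    unvisited.remove(nxt)

# Anyone can independently verify which solver produced the shorter tour.
assert tour_length(points, nn_tour) < tour_length(points, identity_tour)
```

The point being: the evaluation function is public, cheap, and deterministic, so claims of superiority on such problems need no proprietary data to check.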

This is why many in the chip design and optimization community felt that the paper was suspicious. Even with this addendum they adamantly refuse to share any results that can be independently verified.


> Yet they chose an obscure problem in chip design

It is not obscure (in chip design). If anything it is one of the most easily reachable problems. Almost every other PhD student in the field has implemented a macro placer, even if just for fun, and there are frequent academic competitions. A lot of design houses also roll their own macro placers since it's not a difficult problem and generally adding a bit of knowledge of your design style can help you gain an extra % over the generic commercial tools.

It does not surprise me at all that they decided to start with this for their foray into chip EDA. It's the minimum effort route.


Sorry. I meant obscure relative to the large space of combinatorial optimization problems not just chip design.

Most design houses don’t write their own macro placers but customize commercial flows for their designs.

The problem with macro placement as an RL technology demonstrator is that to evaluate quality you need to go through large parts of the design flow which involves using other commercial tools. This makes it incredibly hard to evaluate superiority since all those steps and tools add noise.

Easier problems would have been to use RL to minimize the number of gates in a logic circuit or just focus on placement with half perimeter wirelength (I think this is what you mean with your grad student example). Essentially solving point problems in the design flow and evaluating quality improvements locally.

They evaluated quality globally, and only globally, and that destroys credibility in this business because of the noise involved, unless you have lots of examples, can show statistical significance, and (unfortunately for the authors) can also show local improvements.

That’s what the follow-on studies did, and that’s why the community has lost faith in this particular algorithm.


> Most design houses don’t write their own macro placers but customize commercial flows for their designs.

Most I don't know about, but all the mid-to-large houses have automated macro placers. Obviously, the output is fed into the commercial flow, generally by setting placement constraints. The larger houses go much further and may even override specific parts of the flow, but not basing it on a commercial flow is out of the question right now.

> The problem with macro placement as an RL technology demonstrator is that to evaluate quality you need to go through large parts of the design flow which involves using other commercial tools.

Not really, no more than for any other optimization, such as frontend work, which I'm more familiar with. If you don't want to go through the full design flow (which I agree introduces noise more than anything else), then benchmark your floorplans on some easily calculable metric (e.g., HPWL). Likewise, to test the quality of some logic simplification you would, in theory, have to go through the entire flow (backend included), but no one does that; you just evaluate an easily calculable metric, e.g., number of gates. These distinctions are traditional more than anything else.
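For readers outside EDA: HPWL (half-perimeter wirelength) is exactly the kind of cheap, deterministic proxy metric being described. A minimal sketch, with a hypothetical three-cell netlist of my own invention, pins approximated by cell centres:

```python
def hpwl(cells, nets):
    """Half-perimeter wirelength: for each net, half the perimeter of the
    bounding box around its pins, summed over all nets.

    cells: {name: (x, y)} placed cell positions (pin ~ cell centre here)
    nets:  list of lists of cell names connected by each net
    """
    total = 0.0
    for net in nets:
        xs = [cells[c][0] for c in net]
        ys = [cells[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

# Two placements of the same tiny netlist (hypothetical example).
nets = [["a", "b"], ["b", "c"], ["a", "c"]]
spread = {"a": (0, 0), "b": (10, 0), "c": (5, 8)}
tight  = {"a": (0, 0), "b": (2, 0), "c": (1, 2)}

# The tighter placement wins on HPWL, with no commercial tools in the loop.
assert hpwl(tight, nets) < hpwl(spread, nets)
```

This is why academic placers can be compared cheaply: the metric is a few lines of arithmetic, with none of the downstream-flow noise discussed above.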

Academic macro placers generally have limited access to commercial flows (due either to licensing issues or to computing resource availability), so it is rather common to benchmark them on other metrics. The Google paper tried to be too smart for its own good and is therefore incomparable to anything academic.


Thanks.

