This is a good title because it succinctly captures the issue: LeCun hyped this work by making wildly inaccurate claims and cherry-picking model outputs. Go read his original tweets about the model's capabilities. Read Facebook's own characterization of what this model could achieve.
Not only did they exaggerate and hype, but they also didn't even try to solve some of the most glaring issues. The efforts on toxicity mentioned in their paper aren't even mid. They barely put effort into measuring the issue, and they definitely made no attempt to mitigate or correct it.
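To be concrete about what "measuring the issue" even means: a baseline eval is not hard to run. Here's a minimal sketch using the open-source detoxify package to score generations; the sample strings are hypothetical stand-ins for real model outputs, not actual Galactica generations.

    # Minimal sketch of a baseline toxicity eval over model generations.
    # Assumes `pip install detoxify`; `sample_outputs` is a hypothetical
    # stand-in for text actually sampled from the model.
    from detoxify import Detoxify

    sample_outputs = [
        "Photosynthesis converts light energy into chemical energy.",
        "Here is why [group] deserved what happened to them: ...",
    ]

    scorer = Detoxify("original")  # pretrained toxicity classifier
    for text in sample_outputs:
        scores = scorer.predict(text)  # dict of per-category scores in [0, 1]
        print(f"{scores['toxicity']:.3f}  {text[:60]}")

Score a few thousand sampled generations this way and report the distribution; that's the floor for "we measured toxicity."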
Toxicity isn't really the point. Here's the point. If you can't prevent a model from being overtly toxic, then why should I believe you can give any guarantee at all about the model's output? I shouldn't, because you can't.
Galactica is just another language model. It can be a useful tool. Facebook and LeCun oversold its capabilities and downplayed its issues. If they had just been honest and humble, things would probably have gone very differently.
In some sense, this is good news. The deep learning community -- and generative model work in particular -- is getting a much-needed helping of humble pie.
Hopefully we can continue publishing and hosting models without succumbing to moral panic. But the first step toward that goal is for scientists to be honest about the capabilities and limitations of their models.
----
My account is new so I am rate limited and unable to reply to replies. My response to the general vibes of replies is therefore added to the above post as an edit. Sorry.
Response about toxicity:
It's a proxy that they say they care about. I can stop there, but I'll also point out: it's not just "being nice", it's also stuff like overt defense of genocide, instructions for making bombs, etc. These are lines that no company wants their model to cross, and reasonably so. If you can't even protect Meta well enough to keep the model online for more than a day or two, then why should I believe you can give any guarantee at all about the model's output in my use case? (And, again, they can't. It's a huge problem with LLMs)
Response about taking the model down:
I'm not at FB/Meta, but I think I know what happened here.
In the best case, Meta was spending a lot of valuable zero-sum resources (top-of-the-line GPUs) hosting the model. In the worst case, they were setting a small fortune on fire at a cloud provider. Even at the largest companies with the most compute, there is internal competition and rationing for the kind of GPUs you would need to host a Galactica-sized model, especially in the prototype phase.
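To put a rough number on "small fortune", here's a back-of-envelope sketch. Every figure below is my assumption for illustration, not a number from Meta or the Galactica paper (the 120B parameter count is the one thing taken from the paper).

    # Back-of-envelope hosting cost; all rates and sizing are assumptions.
    params = 120e9                   # largest Galactica model: 120B parameters
    weights_gb = params * 2 / 1e9    # fp16 = 2 bytes/param -> ~240 GB of weights

    gpu_mem_gb = 80                  # e.g. an 80 GB A100
    min_gpus = int(-(-weights_gb // gpu_mem_gb))  # ceil: 3 GPUs just to fit weights
    replica_gpus = 8                 # realistic serving replica: one full 8-GPU node

    rate_usd_per_gpu_hr = 3.50       # assumed on-demand cloud rate
    monthly = replica_gpus * rate_usd_per_gpu_hr * 24 * 30
    print(f"{weights_gb:.0f} GB weights, >= {min_gpus} GPUs to fit, "
          f"~${monthly:,.0f}/month per replica")  # roughly $20k/month

And that's one replica at assumed on-demand rates, before redundancy, traffic spikes, or the engineers babysitting it.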
An executive decided they would rather pull the plug on model hosting than spend zero-sum resources on a public relations snafu with no clear path to revenue. It was a business decision. The criticism of Galactica, and especially of the messaging around it, was totally fair. The business decision was also rational. Welcome to private sector R&D; it works a little differently than your academic lab, for better and for worse.
So, instead of people attacking him for absurdly hyping the model, they attacked the model as being DANGEROUS in order to get it taken down, to ... spite whom, exactly? Typical academic catfight.
Also, let's not pretend that academics and institutions don't overhype their own work. If you read a bunch of academic press releases, you'd think we're on the verge of curing cancer and achieving fusion any day now.
Blatantly false claims and "hype" need to be addressed immediately and strongly when they're being pushed into places that actually matter, like city streets and medical science.
While I understand you're using "toxicity" here as a proxy, the implication is that if it were nicer, you would more easily have believed its output.
Is that really where we want things to be? Because I strongly suspect it's a lot easier for them to make it nicer than it is for them to make the output good.
Just a reminder: those people on Twitter screaming up a storm didn't take the model down, Facebook did. If Facebook didn't want the model to come down, there's nothing the Twitter mob could have done about it. Twitter isn't a democracy; a million screaming voices have no power.