Anthropic has been slow to deploy their models at scale. For a long time, it was virtually impossible to get access to their API for any serious work without making a substantial financial commitment. Whether that was due to safety concerns or simply the fact that their models were not cost-effective or scalable, I don't know. Today, we have many capable models that are not only on par with but in many cases substantially better than what Anthropic has to offer. Heck, some of them are even open-source. Over the course of a year, Anthropic has lost some footing.
So of course, being a little late due to a poorly executed strategy, they will be playing the status game now. Let's face it, though: these models are not more dangerous than Wikipedia or the Internet. These models are not custodians of ancient knowledge on how to cook meth. This information is public knowledge. I'm not saying that companies like Anthropic have no responsibility to safeguard certain kinds of easy access to knowledge, but this is not going to cause a human extinction event. In other words, the safety and alignment work done today resembles an Internet filter, to put it mildly.
Yes, there will be a need for more research in safety, for sure, but this is not something any company can do in isolation and in the shadows. People already have access to LLMs, and some of these models are as moldable as it gets. Safety and alignment have a lot to do with safe experimentation, and there is no better time to experiment safely than today because LLMs are simply not good enough to be considered dangerous. At the same time, they provide interesting capabilities to explore safety boundaries.
What I would like to see more of is not a handful of people deciding what counts as safe, because they simply don't know and will have blind spots like anyone else, but access to a platform where safety concerns can be explored openly with the wider community.
Hi. Anthropic is a 3-year-old company that, until last week's release of GPT-4o by a company almost 10 years old, had the most capable model in the world, Opus, for a period of two months. With regard to availability, we had a huge amount of inbound interest in our 1P API, but our model was consistently available on Amazon Bedrock throughout the last year. The 1P API has been open to all for the last few months.
No open weights model is currently within the performance class of the frontier models: GPT-4*, Opus, and Gemini Pro 1.5, though it’s possible that could change.
We are structured as a public benefit corporation formed to ensure that the benefits of AI are shared by everyone; safety is our mission, and we have a board structure that puts the Responsible Scaling Policy and our policy mission at the fore. We have consistently communicated publicly about safety since our inception.
We have shared all of our safety research openly and consistently; our dictionary learning work, in particular, is a cornerstone of that sharing.
The ASL-3 threshold discussed in the blog post addresses emerging harms, including bioweapons and offensive cyber capabilities. We agree that information available via web search is not a harm increased by LLMs, and we state that explicitly in the RSP.
I’d encourage you to read the blog post and the RSP.
> We are structured as a public benefit corporation formed to ensure that the benefits of AI are shared by everyone; safety is our mission, and we have a board structure that puts the Responsible Scaling Policy and our policy mission at the fore. We have consistently communicated publicly about safety since our inception.
Nothing against Anthropic, but as we all watch OpenAI become not so open, this statement has to be taken with a huge grain of salt. How do you stay committed to safety when your shareholders are focused on profit? At the end of the day, you have a business to run.
> Let's face it, though: these models are not more dangerous than Wikipedia or the Internet. These models are not custodians of ancient knowledge on how to cook meth. This information is public knowledge.
I don't think this is the right frame of reference for the threat model. An organized group of moderately intelligent and dedicated people can certainly access public information to figure out how to produce methamphetamine. An AI might make it easy for a disorganized or insane person to procure the chemicals and follow simple instructions to make meth.
But the threat here isn't meth, or the AI saying something impolite or racist. The danger is that it could provide simple, effective instructions on how to shoot down a passenger airplane, or poison a town's water supply, or (the paradigmatic example) how to build a virus to kill all the humans. Organized groups of people that purposefully cause mass casualty events are rare, but history shows they can be effective. The danger is that unaligned/uncensored intelligent AI could place those capabilities in the hands of deranged homicidal individuals, and these are far more common.
I don't know that gatekeeping or handicapping AI is the best long-term solution. It may be that the best protection from AI in the hands of malevolent actors is to make AI available to everyone. I do think that AI is developing at such a pace that something truly dangerous is far closer than most people realize. It's something to take seriously.
>Yes, there will be a need for more research in safety, for sure, but this is not something any company can do in isolation and in the shadows.
Looking through Anthropic's publication history, their work on alignment and safety has been pretty out in the open and collaborative with the other major AI labs.
I'm not certain your view is especially contrarian here, as it mostly aligns with research Anthropic are already doing, openly talking about, and publishing. Some of the points you've made are addressed in detail in the post you've replied to.