Anthropic has amazing scientists and engineers, but when it comes to results that align with the narrative of LLMs being conscious, intelligent, or something similar, they tend to blow those results out of proportion.
Edit: In my opinion, at least. Maybe they would say that if models are exhibiting that stuff 20% of the time nowadays, then we’re a few years away from that reaching > 50%, or make some other argument that I would probably disagree with.
Not necessarily meaningless, but maybe relative, i.e. a person who generally replaces non-Apple laptops every X years would replace MacBooks every Y years, with Y > X
Mixture of Experts isn't using multiple models with different specialties. It's more like a sparsity technique: you massively increase the number of parameters but use only a subset of the weights in each forward pass.
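To make the sparsity point concrete, here's a toy sketch in plain Python. It assumes top-k routing with a softmax over the selected experts, which is one common setup and not necessarily what any particular model does. All names (`moe_forward`, `router`, `experts`) and sizes are made up for illustration, and each "expert" is just a single weight matrix.

```python
import math
import random

def make_matrix(rng, d_in, d_out):
    # Random d_out x d_in weight matrix (toy stand-in for a trained layer).
    return [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]

def matvec(w, x):
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]

def moe_forward(x, router, experts, k=2):
    # The router scores every expert, but only the top-k are evaluated.
    logits = matvec(router, x)
    top = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:k]
    gates = softmax([logits[i] for i in top])
    out = [0.0] * len(experts[0])
    for g, i in zip(gates, top):
        y = matvec(experts[i], x)  # weights of unselected experts are never touched
        out = [o + g * yi for o, yi in zip(out, y)]
    return out, top

rng = random.Random(0)
D, N_EXPERTS, K = 4, 8, 2  # hypothetical toy sizes
router = make_matrix(rng, D, N_EXPERTS)
experts = [make_matrix(rng, D, D) for _ in range(N_EXPERTS)]
x = [rng.gauss(0, 1) for _ in range(D)]

y, used = moe_forward(x, router, experts, k=K)
total_params = N_EXPERTS * D * D
active_params = K * D * D
print(f"experts used: {used}; expert weights touched: {active_params} of {total_params}")
```

The layer holds 8 experts' worth of parameters, but each input only pays the compute cost of 2. In real models this routing typically happens per token inside each MoE layer, so different tokens in the same sequence can hit different experts.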
Yep, I even wrote in to Crawford to let him know how hard it slapped. I watched the whole thing with my brother and I'll treasure the experience forever, haha.