It is ultimately all speculation until DeepSeek releases their own 145B MoE model; then we can compare the activations and results.