If it's fully open source, where's the code for training it? The implementation - at least, theirs - is also not trivial, as they've mentioned optimising below the CUDA level to get maximum throughput out of their cluster.

I'm very appreciative of what they've done, but it's open weights and methodology, not open source.