- Single-digit TPS on the rare occasion it responds, and frequent complete hangs (maybe 1 in 20 requests even completes)
- 4k input token cap (vs native 128k context window)
- No pricing
- Unstated rate limits
It genuinely seems like they spun up a single H100 cluster to enable the headline of this post and help form a narrative, then left it at that. Definitely not meant to provide access to R1 in any serious way.