Hacker News | no_flaks_given's comments

Not many use cases actually need websockets. We're still building new shit in sync python and avoiding the complexity of all the other bullshit

What I want to see is an Anthropic + Cerebras partnership.

Haiku becomes a fucking killer at 2000 tokens/second.

Charge me double idgaf


But Turso is a for profit company that's bound to rug pull eventually.

So SQLite is still the bar.


This model is heavily quantized and the quality isn't great, but that's necessary because, just like everyone else except Nvidia and AMD,

they shat the bed. They went for super fast compute and not much memory, assuming that models would plateau at a few billion parameters.

Last year 70b parameters was considered huge, and a good place to standardize around.

Today we have 1T parameter models, and we know quality still scales with parameter count.

So next year we might have 10T parameter LLMs and these guys will still be playing catch up.

All that matters for inference right now is how many HBM chips you can stack, and that's it.
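The back-of-envelope behind that claim: single-stream decode is usually memory-bandwidth bound, since every parameter has to be read from HBM once per generated token. A rough sketch (the bandwidth figure and byte counts below are illustrative assumptions, not vendor specs):

```python
def decode_tokens_per_sec(params_billion, bytes_per_param, hbm_bandwidth_tbs):
    """Upper bound on tokens/sec, assuming every weight is read once per token."""
    model_bytes = params_billion * 1e9 * bytes_per_param
    return hbm_bandwidth_tbs * 1e12 / model_bytes

# 70B model in fp16 on ~3.35 TB/s of HBM (roughly one H100-class card):
print(round(decode_tokens_per_sec(70, 2, 3.35)))    # ~24 tokens/sec

# Same bandwidth, 1T parameters: ~14x more weight bytes to stream,
# so tokens/sec drops proportionally:
print(round(decode_tokens_per_sec(1000, 2, 3.35)))  # ~2 tokens/sec
```

Which is why stacking more HBM (capacity and bandwidth) dominates everything else for big-model inference, and why a compute-heavy, memory-light design struggles once models hit a trillion parameters.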


Cerebras doesn't normally quantize the models. Do you have more information about this?


