Hacker News | linolevan's comments

Played around with the code to implement a little bit of SIMD. Was able to squeeze out a decent improvement: ~250 fps avg, ~140 low, ~333 high (on an M4). Looks pretty straightforward to add threading as well. Cool stuff! Could work to bring more GPU stuff back down to the CPU.

Oops! Looks like we posted at the same time.

> Does bonus usage count against my weekly usage limit?

> No. The additional usage you get during off-peak hours doesn’t count toward any weekly usage limits on your plan.


So the first 100% of 5-hour usage is billed against weekly usage at normal rates, but the second, additional 100% is not counted?

I just watched my "weekly limit" get used while I ran a Claude Code command.

I'm not sure how to square that with the quote you gave.


Did you exhaust the five-hour usage limit already? As I understand it, the “additional usage” refers to anything beyond the standard five-hour usage limit.

Did… you copy paste this from another discussion? I’ve read this comment before.

Me too. This is funny

According to the providers that I keep track of, Cumulus is typically pretty price competitive, except for MiniMax, where DeepInfra and Together are much cheaper, and GLM-5, where DeepInfra and z.AI's own hosting are much cheaper.

(Also, technically, Qwen3 8B with Novita is in first place, but barely.)


Can we get docs on context length and output length? (It looks like you mention a "Max tokens (chat)" of 128k, but it's unclear what that means.) Also, it looks like your docs page is out of date compared to your playground page.

Also, a piece of feedback: it kind of sucks to have GLM/MiniMax/Kimi on separate API endpoints. I assume it's a game you play to get lower latency on routing for popular models, but from a consumer perspective it's not great.


Thank you for the feedback. Taking note of this!

Looks like this is at least their second release; they had one other LLM before this.

This is awesome. Almost all of these are believable even if you're looking pretty carefully. I need this on a Fire Stick or something.

This is awesome. Got completely lost reading this and was struggling to figure out where I got this link from. Amazing story.

Yep, I recall one of the big components being libc i18n
