Hacker News | vkkhare's comments

updated the link


Isn't the inference cost of running these models at scale challenging? Currently it feels like small LLMs (1B-4B) perform well on simpler agentic workflows. There are definitely some constraints, but that seems much easier than paying for big cloud clusters to run these tasks. I believe it also distributes the cost more uniformly.


It is very likely that you consume less power running a 1B LLM on an Nvidia supercluster than you do trying to download and run the same model on a smartphone. I don't think people understand just how fast the server hardware is compared to what is in their pocket.

We'll see companies push for tiny on-device models as a novelty, but even the best of those aren't very good. I firmly believe that GPUs are going to stay relevant even as models scale down, since they're still the fastest and most power-efficient solution.


We will be open sourcing the entire platform soon. This blog post shows how we built the AI app on our platform:

https://www.nimbleedge.com/blog/how-to-run-kokoro-tts-model-...


We have just started open sourcing the on-device AI platform.

We have started with the GitHub repo for our custom Kokoro TTS model. It is essentially a batched implementation of Kokoro that also supports streaming (rough sketch of the idea below).

https://github.com/nimbleEdge/kokoro
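
Roughly, the shape of it looks like this (a minimal sketch; split_sentences, synthesize_batch, and the chunk sizes are hypothetical placeholders, not the repo's actual API):

    # Minimal sketch of batched TTS with streaming output.
    # synthesize_batch is a hypothetical placeholder for the model call;
    # the actual API in the repo may differ.
    import re
    from typing import Iterator, List

    import numpy as np

    SAMPLE_RATE = 24000

    def split_sentences(text: str) -> List[str]:
        # Naive sentence splitter; batching happens per sentence.
        return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    def synthesize_batch(sentences: List[str]) -> List[np.ndarray]:
        # Placeholder: one forward pass over the whole batch, returning
        # one float32 PCM array per sentence (here: silence of rough length).
        return [np.zeros(SAMPLE_RATE * max(1, len(s) // 15), dtype=np.float32)
                for s in sentences]

    def stream_tts(text: str, batch_size: int = 4) -> Iterator[np.ndarray]:
        sentences = split_sentences(text)
        for i in range(0, len(sentences), batch_size):
            # The model runs once per batch for throughput...
            audio_chunks = synthesize_batch(sentences[i:i + batch_size])
            for chunk in audio_chunks:
                # ...but chunks are yielded as soon as they exist,
                # so playback can start before the full text is done.
                yield chunk

    for chunk in stream_tts("Hello there. This is a streaming demo. Goodbye."):
        print(len(chunk), "samples ready")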

We will soon share the discord community too.


maybe try local p2p networking and compute offloading?

Check out https://nimbleedge.ai for a cool demo

This is our repository https://github.com/NimbleEdge/RecoEdge


I would love to talk more about it and understand your take. Is there a way I can reach out to you? I have been reading up on Web3 and forming my own opinion on its merits/demerits.


Sure, join our Discord at https://discord.gg/dnsxyz or ping me on Twitter (@DNS)


What do people think of the automatic differentiation support Facebook was working on for Kotlin?

They called it differentiable programming https://ai.facebook.com/blog/paving-the-way-for-software-20-...
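
For anyone unfamiliar with the term: the idea is that ordinary code in the language becomes differentiable, e.g. via forward-mode AD with dual numbers. A toy illustration of the concept (in Python for brevity; this is not Facebook's Kotlin implementation):

    # Toy forward-mode autodiff with dual numbers, just to illustrate
    # what "differentiable programming" means as a language feature.
    from dataclasses import dataclass

    @dataclass
    class Dual:
        val: float  # f(x)
        der: float  # f'(x)

        def __add__(self, other: "Dual") -> "Dual":
            return Dual(self.val + other.val, self.der + other.der)

        def __mul__(self, other: "Dual") -> "Dual":
            # Product rule
            return Dual(self.val * other.val,
                        self.val * other.der + self.der * other.val)

    def f(x: Dual) -> Dual:
        # Ordinary-looking code: f(x) = x*x + x
        return x * x + x

    x = Dual(3.0, 1.0)   # seed dx/dx = 1
    y = f(x)
    print(y.val, y.der)  # 12.0 7.0  (f'(3) = 2*3 + 1)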


I believe edge computing is one of the game-changing technologies of the decade. I am not sure whether it falls under the purview of Web3 or not. For example, one of the libraries I implemented was for training ML models on user devices instead of in the cloud, preserving privacy while keeping personalization (minimal sketch of the idea below).

https://github.com/NimbleEdge/RecoEdge
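
The gist of the on-device approach: raw interaction data never leaves the phone, only a weight update does. A minimal sketch under that assumption (the linear model, local_train_step, and the delta-upload shape are illustrative, not RecoEdge's actual interface):

    # Sketch of a single on-device training step: the raw data never
    # leaves the device, only a weight delta does.
    import numpy as np

    def local_train_step(global_weights: np.ndarray,
                         features: np.ndarray,  # user's on-device data
                         labels: np.ndarray,
                         lr: float = 0.01) -> np.ndarray:
        # One SGD step on a linear model, computed entirely on the device.
        preds = features @ global_weights
        grad = features.T @ (preds - labels) / len(labels)
        # Only this delta (not features/labels) gets sent back
        # for aggregation into the global model.
        return -lr * grad

    rng = np.random.default_rng(0)
    global_weights = np.zeros(4)
    X = rng.normal(size=(32, 4))                                   # stays on device
    y = X @ np.array([0.5, -1.0, 0.2, 0.0]) + 0.1 * rng.normal(size=32)
    delta = local_train_step(global_weights, X, y)
    print("weight delta to upload:", delta)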


That's what I am worried about. From its description the intent looks good, but being aligned to only one industry or vertical defeats the purpose of ubiquity. People wouldn't trust it if all they see is a volatile currency.


Marc is reading this ;p

