What kind of issues did you have with streaming? I also set up ollama on fly.io, and had no issues getting streaming to work.
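
For reference, reading the stream from ollama's HTTP API is just consuming newline-delimited JSON. A minimal sketch in Python (the host and model name here are placeholders, not necessarily what either of us is running):

    import json
    import requests  # assumes the `requests` package is installed

    # Placeholder host -- substitute your fly.io app's address.
    OLLAMA_URL = "http://localhost:11434/api/generate"

    with requests.post(
        OLLAMA_URL,
        json={"model": "llama3", "prompt": "Why is the sky blue?", "stream": True},
        stream=True,
    ) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            if not line:
                continue
            chunk = json.loads(line)  # each line is one JSON object
            print(chunk.get("response", ""), end="", flush=True)
            if chunk.get("done"):
                break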

For the LLM itself, I just used a custom startup script that downloaded the model once ollama was up. It's the same thing I'd do on a local cluster though. I'm not sure how fly could make it better unless they offered direct integration with ollama or some other inference server?
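
Roughly along these lines, as a sketch (shown in Python rather than my actual script; the model name and port are placeholders):

    import time
    import requests  # assumes the `requests` package is installed

    OLLAMA_HOST = "http://localhost:11434"  # placeholder port
    MODEL = "llama3"                        # placeholder model name

    # Poll until the ollama server answers.
    for _ in range(60):
        try:
            requests.get(OLLAMA_HOST, timeout=2)
            break
        except requests.ConnectionError:
            time.sleep(2)
    else:
        raise SystemExit("ollama never came up")

    # /api/pull streams progress as newline-delimited JSON; drain it until done.
    with requests.post(
        f"{OLLAMA_HOST}/api/pull",
        json={"name": MODEL},
        stream=True,
        timeout=None,
    ) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            pass  # could log pull progress here
    print(f"{MODEL} pulled and ready")

The only moving parts are waiting for the server to answer and then asking /api/pull to fetch the weights, so there isn't much for fly to abstract away.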


