I would love to train an LLM from scratch to help me with some problems that they're not good at, but I can't, because it costs thousands of dollars to do so. You probably can't either, or can only in a very limited capacity (agents, or maybe LoRA).

A while back, I didn't even know those problems existed. It took me a while to understand them, why they're interesting, and why lots of people spend time on them.

I have tried to adapt the problems to the LLMs as well, such as shaping the problem to look more like something they're already trained on, but I soon realized the limitations of that approach.

I think in a couple of decades, maybe earlier, that kind of thing will be commonplace. People training their own stuff from scratch, on cheap hardware. It will unleash an even more rewarding learning experience for those willing to go the extra mile.

I think you're missing that perspective. That's fine, by the way. You're totally cool and probably helping lots of people with your work. I support it; it helps people better understand where LLMs can currently help and where they cannot.



There aren't many tasks these days for which training or fine-tuning a model seems necessary to me.

One of the reasons I am so excited about the "skills" concept from Anthropic is that it helps emphasize how the latest generation of LLMs really can pick up new capabilities if you get them to read a single, carefully constructed markdown file.


I'm trying to simplify the live-bootstrap project by either removing dependencies, reducing build time or making it more unattended (by automating the image creation steps, for example).

https://github.com/fosslinux/live-bootstrap/

Other efforts around the same problem are trying to make it more architecture independent or improve regenerations (re-building things like automake during the process).

It's free and open source, you're welcome to fork it and try your best with the aid of Claude. All you need is an x86 or x86-64 machine or qemu.

The project and other related repositories are already full of documentation in the markdown format and high quality commented code.

Here's a friendly primer on the problem:

https://www.youtube.com/watch?v=Fu3laL5VYdM

If you decide to help, please ask the maintainers if AI use is allowed beforehand. I'm OK with it, they might not be.



