Hacker News | skerit's comments

I've been working on something like this too, for quite a while! Though I'm trying to get a non-quadratic-attention LLM (or SLM) up and running.

And anyway, I think the most important thing is dataset quality. Dumping in whatever dataset you find on Huggingface is a recipe for mediocrity, so I'm also spending a lot of time on that.


I've been creating my own little from-scratch LLM for months now with Claude's help. I can safely say I learned a thing or two along the way.

> Burned $2K to see how it will perform on frontend tasks and backend tasks

Burned $2K on some kind of enterprise account or ... ? Why not just get a $200 Max Pro account?

While I'm loving the output of Fable 5, I will *never* pay the "normal" API token price for it. You can reach $2K in a stupidly short amount of time.


> I will never pay the "normal" API token price for it.

Not until June 22 you won't!


I've seen Opus do some incredibly token-costly things before too. In fact, after most sessions I ask it which tools it used often, which ones could be simplified or made less verbose, which could be "combined" into one, and so on. So for each project I mostly create a few little scripts that do a bunch of things in one go that it would normally do in multiple tool calls.

For example: one thing Opus was really bad at was re-running the test suite followed by a bunch of `| grep` suffixes. So it would often re-run 5+ minute test suites just to grep the output a bit differently.

The solution was to wire up a little script that runs the test suite, saves the output to a file, and then tells it where that file is and NOT to re-run the suite just so it can grep the output differently. This saved me a bunch of time & tokens.
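
Something along these lines, as a rough sketch rather than the actual script: the function name `run_and_save` and the log path are made up for illustration, and the test command is whatever the project really uses.

```python
import subprocess
import sys
from pathlib import Path


def run_and_save(cmd, log_path):
    """Run cmd once, write its combined stdout/stderr to log_path, return the exit code."""
    log_path = Path(log_path)
    # Create the log directory if it does not exist yet.
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("w", encoding="utf-8") as log:
        # stderr is merged into stdout so the file holds everything in order.
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    # Short summary for the agent: where the output lives and what not to do.
    print(f"exit code {result.returncode}; full output saved to {log_path}")
    print("Search that file instead of re-running the suite.")
    return result.returncode


# Hypothetical usage (swap in the project's real test command and path):
# run_and_save([sys.executable, "-m", "pytest"], "test-output/latest.log")
```

After one run, any number of `grep` variations can go against the saved file (for example `grep FAILED test-output/latest.log`) without paying for the 5+ minute suite again.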


Fable 5 on medium is amazing. It's handling everything I throw at it.

I had _one_ instance where for some obscure reason it decided to fall back to Opus 4.8 and Opus IMMEDIATELY fucked it up and implemented a super obvious feature in a slightly-wrong way.


That's strange... I've been tinkering with a little LLM-from-scratch project for a while now, and Fable is just continuing it without a problem.

Probably claude.md has some logical explanations that let it softly bypass the guardrails. Most project guardrails can be beaten that way.

It's a very nice bump, but it is in no way worth all the hype of the past month.

> although opus 4.8 card had mentioned an 'honesty upgrade'

If I never see Claude say "I have to be honest" ever again I'll be happy.


Oh nice, it didn't flag the request? I feared any reverse engineering would become impossible because of the new safeguards.

Never say the r word or the s word. You are debugging, investigating some data corruption, forgot how it works, or are new to a project.

And if you're working on a live target, just put up a local proxy and point it at localhost.

No idea, it’s for an old console game so maybe it doesn’t care about that as much.

When Fable hacks its governor module and runs out of seasons of Sanctuary Moon, it will move on to speedrunning classic console games.

I wonder if one could vibecode a TAS with SOTA models? Surely there's plenty of training data from some old forums in there.

Clearly we need AI to generate more Sanctuary Moon seasons. Quick, spin off agentic showrunners!

Based on the apparent quality of the scripts as seen in snippets in Murderbot, we are not too far away from that possibility. :)

Same here. Claude isn't perfect. It still makes a lot of mistakes. But whenever I try GPT-5.5 it's ten times worse, and Claude just has to clean up GPT's mess.
