Not only is this incredibly cool, but it bridges my knowledge of x-ray (CR/XR, in this case, DR) even further. I work for a medical imaging company and we actually sell and promote our digital radiography machine that will do this, but for humans (at a much, much slower rate, but enough to be useful in medical diagnostics).
I had never considered an application of use outside of medicine.
So, my day is now accounted for. In addition to "overdue training" on things I already know, I will be youtubing all the x-ray, MRI, ultrasound fun things that you would never see otherwise.
Man, I'm so far away from being concerned about this... we still need to fix duplicate icons that mean different things, phrases like "Are you sure to delete?", misaligned text/inputs EVERYWHERE...
If this is your biggest concern at a company (regarding the UI at least), you are miles ahead of us. Congrats
> Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting.
Mentions 120b is runnable on 8GB VRAM too: "Note that even with just 8GB of VRAM, we can adjust the CPU layers so that we can run the large 120B model too"
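For a rough sense of why the CPU offload matters here, a back-of-the-envelope sketch. Assumptions, none of them from the article: the parameter counts are the nominal 20e9 and 120e9 from the model names, every weight is treated as quantized (a simplification), MXFP4 costs about 4.25 bits per weight (32 four-bit values sharing one 8-bit scale), and Q8_0 costs about 8.5 bits per weight. KV cache and activations are ignored, and the function names are made up for illustration.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


def fraction_on_gpu(vram_gb: float, footprint_gb: float) -> float:
    """Upper bound on the share of the weights that could sit in VRAM."""
    return min(1.0, vram_gb / footprint_gb)


# Nominal sizes and bits-per-weight figures; both are assumptions (see above).
PARAMS = {"20b": 20e9, "120b": 120e9}
BITS = {"mxfp4": 4.25, "q8_0": 8.5}

for name, n_params in PARAMS.items():
    for fmt, bits in BITS.items():
        size = weight_footprint_gb(n_params, bits)
        share = fraction_on_gpu(8.0, size)
        print(f"{name} {fmt}: ~{size:.1f} GB of weights, "
              f"at most {share:.0%} fits in 8 GB of VRAM")
```

Under these assumptions the 120b weights are far larger than 8 GB in either format, so most of the model has to live in system RAM and only a small slice runs from the GPU.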
I have in many cases had better results with the 20b model, over the 120b model.
Mostly because it is faster and I can iterate on prompts more quickly to coerce it to follow instructions.
> had better results with the 20b model, over the 120b model
The quality and accuracy of the responses differ vastly between the two though, if tok/s isn't your biggest priority, especially when using reasoning_effort "high". 20B works great for small-ish text summarization and title generation, but for even moderately difficult programming tasks, 20B fails repeatedly while 120B gets it right on the first try.
But the 120b model has just as bad if not worse formatting issues, compared to the 20b one. For simple refactorings, or chatting about possible solutions, I actually feel the 20b hallucinates less than the 120b, even if it is less competent. Might also be because of 120b not liking being in q8, or not being properly deployed.
> But the 120b model has just as bad if not worse formatting issues, compared to the 20b one
What runtime/tools are you using? That hasn't been my experience at all, but I've also mostly used it via llama.cpp and my own "coding agent". It was slightly tricky to get the Harmony parsing working correctly, but once that's in place, I haven't seen any formatting issues at all.
The 20B is definitely worse than 120B for me in every case and scenario, but it is a lot faster. Are you running the "native" MXFP4 weights or something else? That would have a drastic impact on the quality of responses you get.
Edit:
> Might also be because of 120b not liking being in q8
Yeah, that's definitely the issue, I wouldn't use either without letting them be MXFP4.
Hmmm...now that you say that, it might have been the 20b model.
And like a dumbass I accidentally deleted the directory and didn't have a backup or have it under version control.
Either way, I do know for a fact that the gpt-oss-XXb model beat chatgpt by 1 answer and it was 46/50 at 6 minutes and 47/50 at 1+ hour. I remember because I was blown away that I could get that type of result running locally and I had texted a friend about it.
I was really impressed but disappointed at the huge disparity in time between the two.
Thank you OP!