Hacker News

> Q: What makes a good custom interface for reviewing LLM outputs? Great interfaces make human review fast, clear, and motivating. We recommend building your own annotation tool customized to your domain ...

Ah! This is horrible advice. Why recommend reinventing the wheel when there is already great open source software available? Just use https://github.com/HumanSignal/label-studio/ or any other open source annotation tool to get started. These tools already cover pretty much all the possible use cases, and where they don't, you can build on top of them instead of building from zero.
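(For a sense of how little setup that takes: Label Studio interfaces are declared in a small XML config. Here's a rough Python sketch that generates one for grading LLM outputs; the field name "llm_output" and the label set are just illustrative, not from any real project.)

```python
# Sketch: build a minimal Label Studio labeling config for grading
# LLM outputs. <View>/<Text>/<Choices>/<Choice> are standard Label
# Studio config tags; the names and labels below are made up.
def grading_config(labels):
    choices = "\n    ".join(f'<Choice value="{l}"/>' for l in labels)
    return f"""<View>
  <Text name="llm_output" value="$llm_output"/>
  <Choices name="quality" toName="llm_output" choice="single">
    {choices}
  </Choices>
</View>"""

config = grading_config(["Good", "Needs work", "Bad"])
```

Paste something like that into a project's labeling setup and you have a working review UI without writing any frontend code.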



I think the truth is somewhere in between. I find Label Studio to be lacking a lot of niceties and generally built for the average text or image labeling use case; for anything else (like a multi-step agent workflow or some sort of multi-modal task-specific problem) it's not quite right, and you do end up spending some time building your own custom interface.

So, imho you should try Label Studio but timebox it: decide for yourself within a day whether it's going to work for you, and if not, go vibecode a different view and try it out, or build labeling into a copy of a front end you're already using for your task if that's quick.

What I think we really need here is a "Lovable meets Label Studio" that starts with simple defaults and lets anyone use natural language, sketches, or screenshots to create custom interfaces and modify them quickly.


The SaaS version of Label Studio does have a natural language interface to create custom interfaces: https://docs.humansignal.com/guide/ask_ai

I'm ostensibly an expert in the product and I probably use that 90%+ of the time (unless I'm testing something specific) -- using a sketch as input is a cool idea though!

Disclaimer: I'm the VP of Product at HumanSignal, the company behind Label Studio.


Label Studio is fine if it covers your needs, but in many cases the core opportunity in an eval interface is fitting into the SME's workflow or current tech stack.

If label studio looks like what they can use, it’s fine. If not, a day of vibecoding is worth the effort to make your partners with special knowledge comfortable.


Label Studio is great, but by trying to cover so many use cases, it becomes pretty complex.

I've found it's often easier to just whip up something for my specific needs, when I need it.
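(As a sketch of how small that "something" can be: a stdlib-only review loop that pairs each LLM output with a label and dumps JSONL. The judge callable and field names are my own invention; in an interactive session you'd pass a lambda wrapping input().)

```python
import json
from pathlib import Path

def review(outputs, judge):
    # judge(text) -> label string, e.g. "good"/"bad". For a terminal
    # session, pass something like: lambda t: input(f"{t}\n[g/b]? ")
    return [{"output": o, "label": judge(o)} for o in outputs]

def save_jsonl(records, path):
    # One JSON object per line: easy to diff, grep, and re-import.
    Path(path).write_text("\n".join(json.dumps(r) for r in records) + "\n")
```

Twenty lines more gets you a Flask page with two buttons, and it's exactly shaped to your task.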


This awful advice can’t be blanket applied and misses the point: starting from zero is extremely easy now with LLMs; the last 10% is the hardest part. Not only that, if you don’t start from zero you aren’t able to build from whatever you think the new first principles are. SpaceX would not exist if it had tried to extend the old paradigm of rocketry.

There’s nothing wrong with starting from scratch or rebuilding an existing tool from the ground up. There’s no reason to blindly build from the status quo.


I'd have agreed with you if the principles were different. But what was shown in the content is EXACTLY what those tools are doing today. Actually those tools are way more powerful, considering and covering way more scenarios.

> There’s nothing wrong with starting from scratch or rebuilding an existing tool from the ground up. There’s no reason to blindly build from the status quo.

Generally speaking all the options are OK, but not if you want to have something up as fast as you can or if your team is piloting something. I think the time you'd spend vibe coding it is greater than the time to set any of those tools up.

And BTW, you shouldn't vibe code something that handles proprietary data. At the very least, work with a copilot where you review the code.



