
> thank you for the clarification! I appreciate you sharing that domain knowledge about the document-signature relationship

> …

> Your expertise about the system's constraints helps provide important context that static analysis tools can't capture.

So much fawning bullshit bloating the message and the token count. I think this might be the thing with LLMs I dislike most.

Suggestion for prompt writers: “Don’t waste tokens. Keep messages succinct and direct.”


What promises were made, by whom? Graphics APIs have never been about ease of use as a first order goal. They've been about getting code and data into GPUs as fast as reasonably possible. DevEx will always play second fiddle to that.

I think WebGPU is a decent wrapper for exposing compute and render in the browser. Not perfect by any means - I've had a few paper cuts working with the API so far - but a lot more discoverable and intuitive than I ever found WebGL and OpenGL.


> They've been about getting code and data into GPUs as fast as reasonably possible. DevEx will always play second fiddle to that.

That's a tiny bit revisionist history. Each new major D3D version (at least before D3D12) also fixed usability warts compared to the previous one, with D3D11 probably being the most convenient 3D API to use, while also giving excellent performance.

Metal also definitely has a healthy balance between convenience and low overhead - and more recent Metal versions are an excellent example that a high performance modern 3D API doesn't have to be hard to use, nor require thousands of lines of boilerplate to get a triangle on screen.

OTOH, OpenGL has been on a steady downward usability trend since the end of the 1990s, and Vulkan has unfortunately continued this trend (but may steer in the right direction in the future):

https://www.youtube.com/watch?v=NM-SzTHAKGo


I hear you but I also don't see a ton of disagreement here either. Like, the fact that D3D12 includes _some_ usability fixes suggests that DevEx really does take a back seat to the primary goal.

I'm not arguing that DevEx doesn't exist in graphics programming. Just that it's second to dots on screen. I also find webgpu to be a lot nicer in terms of DevEx than WebGL.

Wdyt? Still revisionist, or maybe just a slightly different framing of the same pov?


> I also find webgpu to be a lot nicer in terms of DevEx than WebGL.

Amen.

IMHO a new major and breaking D3D version is long overdue. There must be plenty of learnings about which areas were actually worth sacrificing ease of use for performance and which weren't.

Or maybe do something completely radical/ridiculous and make HLSL the new "D3D API" (with some parts of HLSL code running on the CPU, just enough to prepare CPU-side data for upload to the GPU).


I think that is where they are going with mesh shaders, amplification shaders and bringing more C++ into HLSL, but still as part of DirectX 12.

I don't imagine them pushing for a DirectX 13, only available on Windows 12 onwards kind of thing, as they have done in the past.

Either way, I can see us going back to software rendering, even though it is actually hardware accelerated.


> Metal also definitely has a healthy balance between convenience and low overhead - and more recent Metal versions are an excellent example that a high performance modern 3D API doesn't have to be hard to use, nor require thousands of lines of boilerplate to get a triangle on screen.

Metal 4 has moved a lot in the other direction, and now copies a lot of concepts from Vulkan.

https://developer.apple.com/documentation/metal/understandin...

https://developer.apple.com/documentation/metal/resource-syn...


If only the Vulkan SDK were half as good as the Metal development experience, including IDE integration, proper support for managed languages, and the graphical debugging and profiling experience.

That has been the main pain point of Khronos APIs; it isn't only the extension spaghetti, the first step is always to go fishing for all the puzzle pieces needed for a proper development experience.

At least now there is the LunarG SDK, but who knows for how long they will keep sponsoring it, and it isn't applicable to Android, where Google does the minimum: a GitHub repo dump with samples and good luck.

Compare that with Apple Metal frameworks.


"What promises were made, by whom?"

Technically true, but practically tone deaf.

WebGPU is both years too late, and just a bit early. Whereas WebGL was OpenGL circa 2005, WebGPU is native graphics circa 2015. It shouldn't need to be said that the bleeding edge new standard for web graphics shouldn't be both 10 years out of date and awful.

Vendors are finally starting to deprecate the old binding model as the byzantine machinery that it is. Bindless resources are an absolute necessity for the modern style of rendering with nanite and raytracing.

Rust's WGPU on native supports some of this, but WebGPU itself doesn't.

It's only intuitive if you don't realize just how huge the gap is between dispatching a vertex shader to render some triangles, and actually producing a lit, shaded and occlusioned image with PBR, indirect lighting, antialiasing and postfx. Would you like to render high quality lines or points? Sorry, it's not been a priority to make that simple. Better go study up on SDFs and beziers.

Which, tbh, is the impression I get from webgpu efforts. Everyone forgets the drivers have been playing pretend for decades, and very few have actually done the homework. Of those that have, most are too enamored with being a l33t gfx coder to realize how terrible the dev exp is.


I'm not sure I disagree with you really - and I ack that webgpu feels like 2015 tech to someone who knows their stuff. I don't have a take on "l33t gfx coder"; I'm a hobbyist not a professional, and I've enjoyed getting up to speed with WebGPU over and above my experiences with WebGL. Happy to be schooled.

I've never implemented PBR or raytracing because my interests haven't gone that way. I don't find SDFs to be a particularly difficult concept to "study up on" either though. It's about as close to math-as-drawing as I've seen and doesn't require much more than a couple triangles and a fragment shader. By contrast I've been learning about SVT for a couple months and still haven't quite pieced together a working impl in WebGPU... though I understand there are extensions specifically in support of virtual tiling that WebGPU could pursue in a future version.
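A minimal sketch of the "math-as-drawing" idea, written as plain C rather than shader code and with made-up names (not anything from the WebGPU spec): the whole trick is a signed distance function that a fragment shader would evaluate per pixel over a quad and shade by sign/distance.

    #include <math.h>

    /* Signed distance from point (px, py) to a circle of radius r centred at
       (cx, cy): negative inside, zero on the edge, positive outside. A fragment
       shader would run the same math per pixel and, e.g., smoothstep the result
       to get an antialiased edge. */
    float sd_circle(float px, float py, float cx, float cy, float r)
    {
        float dx = px - cx;
        float dy = py - cy;
        return sqrtf(dx * dx + dy * dy) - r;
    }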

Agreed DevEx broadly isn't great when working on graphics. But WebGPU feels like a considerable improvement rather than a step backward.


I can give a bit more context as someone that got on WebGL, then WebGPU, and is now picking up Vulkan for the first time.

The problem is that GPU hardware is rapidly changing to enable easier development while still having low level control. With ReBAR for example you can just take a pointer into gigabytes of GPU memory and pump data into it as if it was plain old RAM with barely any performance loss. 100 lines of bullshit suddenly turn into a one line memcpy.
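Roughly what that looks like in Vulkan terms (my own sketch, not code from any SDK sample; it assumes memoryTypeIndex refers to a DEVICE_LOCAL | HOST_VISIBLE | HOST_COHERENT memory type, which is what ReBAR exposes, and that size already satisfies the buffer's memory requirements):

    #include <string.h>
    #include <vulkan/vulkan.h>

    /* Hypothetical helper: allocate memory from a ReBAR-style memory type, bind
       it to an existing buffer, and map it once. After this, uploading data
       really is just a memcpy into the returned pointer. Error handling omitted. */
    void *map_rebar_buffer(VkDevice device, VkBuffer buffer,
                           uint32_t memoryTypeIndex, VkDeviceSize size)
    {
        VkMemoryAllocateInfo alloc = {
            .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
            .allocationSize = size,
            .memoryTypeIndex = memoryTypeIndex,
        };
        VkDeviceMemory memory;
        vkAllocateMemory(device, &alloc, NULL, &memory);
        vkBindBufferMemory(device, buffer, memory, 0);

        void *mapped = NULL;
        vkMapMemory(device, memory, 0, VK_WHOLE_SIZE, 0, &mapped); /* persistent map */
        return mapped;
    }

    /* Later, whenever the data changes: the "one line memcpy".
       memcpy(mapped, cpuData, size); */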

Vulkan is changing to support all this stuff, but the Vulkan API was (a) designed when it didn't exist and is (b) fucking awful. I know that might be a hot take, and I'm still going to use it for serious projects because there's nothing better right now, but the same extensibility that makes it possible for Vulkan to just pivot huge parts of the API to support new stuff also makes it dogshit to use day to day: the code patterns are terrible, and it feels like you're constantly compromising on readability at every turn because there are simply no good options for how to format your code.

WebGPU doesn't have those problems, I quite liked it as an API. But it's based on a snapshot of these other APIs right at the moment before all this work has been done to simplify graphics programming as a whole. And trying to bolt new stuff onto WebGPU in the same way Vulkan is doing is going to end up turning WebGPU into a bloated pile of crap right alongside it.

If you're coming from WebGL, WebGPU is going to feel like an upgrade (or at least it did for me). But now that I've seen a taste of the future I'm pretty sure WebGPU is dead on arrival, it just had horrendous timing, took too long to develop, and now it's backed into a corner. And in the same vein, I don't think extending Vulkan is the way forward, it feels like a pretty big shift is happening right now and IMO that really should involve overhauls at the software/library level too. I don't have experience with DX12 or Metal but I wouldn't be surprised if all 3 go bye bye soon and get replaced with something new that is way simpler to develop with and reflects the current state of hardware and driver capabilities.


That is why game studios always went with engines, and never had the drama with APIs that FOSS developers seem to complain about all the time.

You get to design a good developer experience, while the plugin system takes care of the optimal API and configuration for each platform.


Historically, Microsoft didn't have a problem making breaking changes in new D3D APIs, so I think they'll be one of the first to make a clean API to leverage the new hardware features.


Console vendors and 8 and 16-bit computers did it first; even if in many cases it was bare metal programming, that is still an API of sorts.


If it weren't for the brand new shading language it might have been a step forward. But instead it's further fragmentation. Vulkan runs happily with GLSL, Proton runs HLSL on Linux, SPIR-V isn't bad.

And the new shading language is so annoying to write it basically has to be generated. Weird shader compilation stuff was already one of the biggest headaches in graphics. Feels like it'll be decades before it'll all be stable.


While I am also not happy with WGSL, note that GLSL has reached a dead end; Khronos officially isn't developing it any further beyond extensions, see the Vulkanised 2024 talks/panel.

Hence NVidia's Slang offer was welcomed with open arms.


Vulkan does not run GLSL. There are tools that convert GLSL to SPIR-V. That's not the same thing. So, if you want the exact same experience, you grab a tool, say, Slang, and have it output WGSL. Now you've got the same experience. An API that doesn't take a language you want to write in and a tool that converts from some language you do into the other.


Only if you don't take tooling into consideration; after a decade of WebGL, there still isn't anything other than SpectorJS, and no vendor sees it as a priority to provide anything beyond pixel debugging.


>It's only intuitive if you don't realize just how huge the gap is between dispatching a vertex shader to render some triangles, and actually producing a lit, shaded and occlusioned image with PBR, indirect lighting, antialiasing and postfx. Would you like to render high quality lines or points? Sorry, it's not been a priority to make that simple. Better go study up on SDFs and beziers.

I think this is a tad unfair. You're basically describing a semi-robust renderer at that point. IMO to make implementing such a renderer truly "intuitive" (I don't know what this word means to you, so I'm taking it to mean offloading these features to the API itself) would require railroading the developer some, which appears to go against the design of modern graphics APIs.

I think Unity/Unreal/Godot/Bevy make more sense if you're trying to quickly iterate such features. But even then, you may have to hand write the shader code yourself.


As a former l33t gfx coder, my love for Khronos APIs ended with the Longs Peak failure, the endless ways to load extensions, and the realisation of how much better the experience with proprietary APIs happens to be when it is thought out end to end, with a proper SDK, IDE tooling and graphical debugging.


From Steve Wittens, a well-respected graphics hacker and maker of the excellent Use.GPU: https://acko.net/tv/usegpu/ . I'm mostly posting to expand context and sprinkle in a couple of light opinions.

> Bindless resources are an absolute necessity for the modern style of rendering with nanite and raytracing.

Yeah, for real. Looking at the November 2024 post "What's next for WebGPU" and HN comments, bindless is pretty high up there! There's a high level field survey & very basic proposal (in the hackmd link), and wgpu seems to be filling in the many gaps and seemingly quite far along in implementation. Not seeing any signs yet that the broader WebGPU implementors/spec folks are involved or following along, but at least wgpu is very cross platform & well regarded.

https://developer.chrome.com/blog/next-for-webgpu
https://news.ycombinator.com/item?id=42209272
https://hackmd.io/PCwnjLyVSqmLfTRSqH0viA
https://hackmd.io/@cwfitzgerald/wgpu-bindless
https://github.com/gfx-rs/wgpu/issues/3637
https://github.com/gpuweb/gpuweb/issues/380

> Would you like to render high quality lines or points? Sorry, it's not been a priority to make that simple. Better go study up on SDFs and beziers.

I realize lines and font rendering are insanely complex fields, and that OpenGL offering at least lines, and Vulkan not, sure feels like a slap in the face. The work being done by groups like https://linebender.org/ is intense. Overall though, that intensity makes me question the logic of trying to include it, and wonder whether fighting to specify something that we clearly don't have full mastery over makes sense: even the very best folks are still improving the craft. We could specify an API without specifying an exact implementation, without conformance tests, perhaps, but that feels like a different risk. Maybe having to reach for a library that does the work reflects where we are, and causes the iteration & development we sort of need?

> actually producing a lit, shaded and occlusioned image with PBR, indirect lighting, antialiasing and postfx

I admit to envying the ambition to make this simple, to have such a great deep knowledge as Steve and to think such hard things possible.

I really really am so thankful and hope funding can continue for the incredibly hard work of developing the WebGPU specs & implementations, and wgpu. As @animats chimes in in the HN submission, bindless in particular is quite a crisis, which will either enable the web to go forward, or remain a lasting real barrier to the web's growth. Really seems to be the tension of Steve's opening position:

> WebGPU is both years too late, and just a bit early. Whereas WebGL was OpenGL circa 2005, WebGPU is native graphics circa 2015.


> OpenGL offering at least lines...

WebGPU does have line (and point) primitives since they are a direct GPU feature.

It just doesn't bother to 'emulate' lines or points that are wider than 1 pixel, since this is not commonly supported in modern native 3D APIs. Drawing thick lines and points is better done by a high-level vector drawing API.
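For illustration, a rough sketch (mine, nothing WebGPU-specific) of what that higher-level approach boils down to: a wide line segment is just expanded into a quad along the segment's normal, on the CPU or in a vertex/compute shader, and drawn as two ordinary triangles.

    #include <math.h>

    typedef struct { float x, y; } Vec2;

    /* Expand the 2D segment a -> b into a quad of the given width. The four
       corners are offset by half the width along the segment's unit normal;
       draw them as a triangle strip (or two indexed triangles). */
    void thick_line_quad(Vec2 a, Vec2 b, float width, Vec2 out[4])
    {
        float dx = b.x - a.x, dy = b.y - a.y;
        float len = sqrtf(dx * dx + dy * dy);
        float nx = -dy / len * (width * 0.5f);
        float ny =  dx / len * (width * 0.5f);
        out[0] = (Vec2){ a.x + nx, a.y + ny };
        out[1] = (Vec2){ a.x - nx, a.y - ny };
        out[2] = (Vec2){ b.x + nx, b.y + ny };
        out[3] = (Vec2){ b.x - nx, b.y - ny };
    }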


How about only sending submissions to humans if they include a reproducible test case? Actual compilable source code + payload that reproduces an attack. Would this be too easily gamed by security researchers as well?


I want to see the mouse cursor moving and clicking, and keys typed by the AI appearing onscreen in realtime like you see in software product tutorials.

The jumpiness of pages switching and things changing when an AI is driving is extremely disorienting. I find it hard to follow a thread of continuity in the page flashes and ui changes as the bot acts.

Right now it’s like watching a screen recording with no hint as to what I’m “supposed” to be focusing on.

Regardless - I have use cases for this in the MCP/browser-automation vein another user mentioned, so I'm super interested to see where this goes.


This is very useful feedback, thank you!

We will look into adding cursor movements; key typing should already appear like a human would (but we can probably slow it down a bit).


What you really want is a caretaker AI


I’m unclear what it is you’re describing. I’m describing UI affordances.


I think sociolinguists will have to do better than just chatting with one bot[1] before waxing poetic about the invisible power dynamics that shaped the interaction. This is critical theory: thought-provoking cynical fart-sniffing. Autoethnography “research”.

[1] no clear sign of model chosen either. Who knows what software version or system prompt they were blabbing to.


This is the answer.

Otherwise-comfortable people searching for anything in the “matrix of oppression” to cling to. Because for the past decade and a half that’s been the easiest way to insert yourself into the attention economy.


Wow, a lot of the stories people are writing here are super depressing. If a junior developer is delivering you a pile of code that doesn't work, hasn't been manually tested and verified by them, hasn't been carefully pared down to its essential parts, and doesn't communicate anything about itself either through code style, comments or docs... then you are already working with an LLM; it just so happens to be hosted in - or parsed thru - a wetware interface. Critical thinking and taking responsibility for the outcome is the real job and always has been.

And, cynically, I bet a software LLM will be more responsive to your feedback than the over-educated and overpaid junior “engineer” will be. Actually I take it back, I don’t think this take is cynical at all.


People think juniors submitting LLM-generated code to seniors to review is a sign of how bad LLMs are.

I see it as a sign of how bad juniors are, and of the need for seniors to interact with LLMs directly, without the middlemen.


The main problem in this environment, IMO, is: how does a junior become a senior, or even a bad junior become a good junior? People aren't learning fundamentals anymore beyond what's taught, and all the rest of 'trade knowledge' is now never experienced; people just trust that the LLM has absorbed it sufficiently. Engineering is all about trade-offs. Understanding why, out of 10 possible ways of achieving something, 4 are valid contenders and possibly 1-2 are best in the current scenario, and even knowing the questions to ask to get to that answer, is what makes a senior.


I think the solution becomes clearer - juniors need to worry less about knowing how to program in a void, since the LLM can handle most of that, but care more about how to produce code that doesn't break things, that doesn't have unintended 2nd order effects, that doesn't add unneeded complexity, etc.

In my experience I see juniors come out of college who can code in isolation as well as me or better. But the difference between jr/sr is much more about integration, accuracy and simplicity than raw code production. If LLMs remove a lot of the hassle of code production I think that will BENEFIT the other elements, since those things will be much more visible.

Personally, I think juniors are going to start emerging with more of a senior mindset. If you don't have to sweat uploading tons of programming errata to your brain, you can produce more code and more quickly need to focus on larger structural challenges. That's a good thing! Yes, they will break large codebases, but they have been doing that forever, if given the chance. The difference now is they will start doing that much sooner.


The LLM is the coding tool, not the arbiter of outcome.

A human’s ability to assess, interrogate, compare, research, and develop intuition are all skills that are entirely independent of the coding tool. Those skills are developed through project work, delivering meaningful stuff to someone who cares enough to use it and give feedback (eg customers), making things go whoosh in production, etc etc.

This is an XY problem, and the real Y is galaxy brains submitting unvalidated and shoddy work that makes good outcomes harder rather than easier to reach.


LLMs are so easy to use though, it's addictive. Even as a senior I find myself asking LLMs stuff I know I should be looking up online instead.


I use LLMs to code. I think they’re great tools and learning the new ropes has been fun as hell. Juniors should use them too. But any claim that the LLM is responsible for garbage code being pushed into PRs is misreading the actual state of play imo.


Why look it up online when good results are buried under ads and the websites themselves are choked with astroturfed content? The exception is when libraries have good documentation.


We're in an environment where management is demanding the staff use these tools. The junior staff is going to listen to the CEO.


Why should a Jr dev NOT use an LLM? It's the skill of the future; it's even an underlying plank in your argument!

Jr Devs are responding to incentives to learn how to LLM, which we are saying all coders need to.

So now we have to torture the argument to create a carve-out for junior devs - THEY need to learn critical thinking and to take responsibility.

Using an LLM directly reduces your understanding of whatever you used it to write, so you can't have both - learning how to code, and making sure your skills are future-proof.


Nothing I wrote is in counterpoint to this.

There’s no carve out. Anyone pushing thoughtless junk in a PR for someone else to review is eschewing responsibility.


Neat, I plan to check this out.

I really want an AI to jam with on a canvas rather than to just have it generate the final results.

I have been hoping someone would pick up on the time series forecasting innovations in the LLM space, combine them with data from e.g. the Google quick draw dataset, and turn that into a real-time “painting partner” experience, kind of like chatting with an LLM through brush strokes.


Using the kontext models in Fal.ai shows you a nice slider of the before and after edits and has a button that lets you set the edited image as the new source so you can continue to make changes.

Now that BFL has released a dev model, I'd love to see a Kontext plugin for Krita given that it already has one for Stable Diffusion though!

https://github.com/Acly/krita-ai-diffusion


The Krita plugin is a bridge to ComfyUI which can already run Flux and presumably will have native support for Kontext (dev) within a week or so, and the plugin already has limited support for using Flux, so Kontext in the existing plugin (rather than requiring a new one) seems a fairly reasonable expectation.


> ComfyUI which can already run Flux and presumably will have native support for Kontext (dev) within a week or so

This was pessimistic: native support landed today, with a workflow and a pointer to an alternate fp8 model download for people who can't run the full fp16 checkpoint.

https://comfyanonymous.github.io/ComfyUI_examples/flux/#flux...


I did this a month ago and don't regret it one bit. I had a long laundry list of ML "stuff" I wanted to play with or questions to answer. There's no world in which I'm paying by the request, or token, or whatever, for hacking on fun projects. Keeping an eye on the meter is the opposite of having fun and I have absolutely nowhere I can put a loud, hot GPU (that probably has "gamer" lighting no less) in my fam's small apartment.


Right on. I also have a laundry list of ML things I want to do starting with fine tuning models.

I don't mind paying for models to do things like code. I like to move really fast when I'm coding. But for other things, I just didn't want to spend a week or two coming up to speed on the hardware needed to build a GPU system. You can just order a big GPU box, but it's going to cost you astronomically right now. Building a system with 4-5 PCIe 5.0 x16 slots, enough power, enough PCIe lanes... It's a lot to learn. You can't go on PCPartPicker and just hunt for a motherboard with 6 double-width slots.

This is a machine to let me do some things with local models. My first goal is to run some quantized version of the new V3 model and try to use it for coding tasks.

I expect it will be slow for sure, but I just want to know what it's capable of.


Close your eyes. Imagine a black tunnel that you are floating through. Every time a thought pops up, paint it black in that tunnel and then float on past it. That includes the thoughts "this tunnel is black" or anything related to your current experience. Just paint it over and float on again.

I self-taught this method at a young age and have picked up a few other "quiet mind" techniques over the years that do similar-ish things. The principle, from my pov, is to basically sit with it and proactively teach your brain to stfu, one thought at a time.


I have a very similar technique that I made up:

Close your eyes. For every thought you have, imagine it to be a soap bubble, floating upwards. After a few seconds of floating upwards, it pops and is gone.

