> Which you could argue is not a problem if it won’t be read by humans anyway in the near future.
It's a problem right now for code that isn't being read by humans.
LLM-backed agents start by writing slightly bad code: a little too verbose, too careful in error handling, too much fallback code, among other common minor LLM-ish flaws. Then the next turn of the crank sees all that, both as an example and as code it must maintain, and comes out slightly worse in all those ways.
This is why vibing ends up so badly. It keeps producing code that does what you asked for a fairly long time, so you can get a long way vibing. By the time you hit a brick wall it will have been writing very bad code for a long while, and it's not clear that fixing it is easier than starting over and trying not to accept any amount of slop.
> too careful in error handling, writes too much fallback code
Is it possible that your code goes a little cowboy when it comes to error handling? I don't think I've ever seen code that was too careful when it came to error handling -- but I wrote GPU drivers, so perhaps the expectations were different in that context.
I’ve definitely seen agents add null checks to a computed value in a function without changing the return type to be non-null. Later, they add a null check at each call site, each with a different error message and/or behavior, but all unreachable.
For bonus points, it implements a redundant version of the same API, and that version can return null, so now the dozen redundant checks are sorta unreachable.
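To make that concrete, here is a hypothetical Python sketch of the pattern (the names are made up, not from any real codebase): the function is annotated as returning an optional value, a guard on a freshly computed value is added inside it, and call sites keep re-checking anyway.

```python
from typing import Optional


class Config:
    def __init__(self, path: str) -> None:
        self.path = path


def load_config(path: str) -> Optional[Config]:
    """Annotated Optional, but after the added guard it can never return None."""
    config = Config(path)
    if config is None:  # null check on a freshly constructed value: unreachable
        return None
    return config


# Call site 1: redundant, unreachable check with its own error message.
cfg = load_config("app.toml")
if cfg is None:
    raise RuntimeError("config missing")

# Call site 2: another redundant check, with different behavior this time.
cfg2 = load_config("app.toml")
if cfg2 is None:
    cfg2 = Config("defaults.toml")  # silent fallback that can never trigger
```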
When I'm writing web services I think I handle almost every error and I don't have this complaint there.
When I'm writing video games there's lots of code where missing assets or components simply mean the game is misconfigured and won't work, and I want it to fail loudly and immediately. I often like just crashing there. Sometimes there are better options too, like making a lot of noise but allowing continuation, but LLMs seem to be bad at using those as well.
Actually, to go back to web services, I do still hate the way I've had LLMs handle errors there too: too often they handle them silently or, worse, provide some fallback behavior that masks the error. They just don't write code that looks like it was written by someone with 1) some assumptions about how the code is going to be used, 2) some idea of how likely those assumptions are to be wrong, and 3) some opinion about how they'd like to find out if the assumptions do turn out to be wrong.
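As a rough sketch of that contrast (hypothetical asset-loading code, not from any real engine): fail loudly where a missing asset means the game is simply misconfigured, instead of the silent fallback that tends to get written.

```python
from pathlib import Path


def load_texture_fail_fast(path: str) -> bytes:
    """A missing asset means the build is misconfigured: fail loudly and immediately."""
    p = Path(path)
    if not p.exists():
        raise FileNotFoundError(f"Required texture missing: {path}")
    return p.read_bytes()


def load_texture_silent_fallback(path: str) -> bytes:
    """The complained-about pattern: swallow the error and return a fallback that masks it."""
    try:
        return Path(path).read_bytes()
    except OSError:
        return b""  # hides a misconfiguration the developer needed to see
```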
Grant reviews are blind reviews - so you don’t know.
Also - and even worse - there is no rebuttal process. It gets rejected without you having a chance to clarify / convince reviewers.
Instead you’d need to resubmit and start the entire process from scratch. What a waste of resources …
It was the final nail that made me quit pursuing a scientific career path, despite having good pubs and a PhD with honours.
That's unfortunate. My personal sense is that while agentic LLMs are not going to get us close to AGI, a few relatively modest architectural changes to the underlying models might actually do that, and I do think mimicry of our own self-referential attention is a very important component of that.
While the current AI boom is a bubble, I actually think that AGI nut could get cracked quietly by a company with even modest resources if they get lucky on the right fundamental architectural changes.
I agree - and I think an interdisciplinary approach is going to increase the odds here. There is a ton of useful knowledge in related disciplines - often just named differently - that turns out to be investigating the same problem from a different angle.
We’re assembling a small but mighty team to build and own the data platform that’s going to be the backbone of the business' strategic shift towards automation and data-driven products in the procure-to-pay industry. You’ll get the opportunity to work on the largest technology initiative since the company's founding about 10y ago. An opportunity with massive impact, visibility and full buy-in from the senior leadership and executives.
Our mission is to enable our gang of 40+ product engineers to deliver data-driven products and features.
Profile (roughly):
- great personality
- solid engineering skills with Python and pyspark / deltalake
- data architecture and data modelling
- CI/CD, testing and quality tests for data pipelines
- experience with databricks and its recent products highly desired (deltalake, autoloader, unity catalog, structured streaming, DLT, DAB, ...)
Some highlights:
- 4-day work week (every Friday off)
- Well funded as we just raised 50m in series C
- Excellent senior leadership with a human-centric culture
- Great business momentum and growth trajectory
Put "hackernews" into the last textbox and I'll make sure it gets in front of our recruiter and hiring manager.
Sounds like LLMs are having their SQL-injection moment.
I’d also say the phenomenon described here isn’t new - except for the context it’s applied in: it’s essentially disinformation - a well-known technique used by militaries for decades. Except now we hack LLM agents instead of real people’s minds.
In theory they should be useful when you know that the underlying process should be monotone. I think in the past I found them more sensitive to noise, and wondered whether monotone approximation might be better than monotone interpolation for that reason.
I added class comments to each class which explain the high level implementation details. Clamping is supported with natural cubic splines, and this is done by taking the slopes at each endpoint.
Monotonicity is currently not supported (for cubic splines).
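For anyone curious about the overshoot issue raised above, here is a small illustration using scipy (not this library): on monotone, step-like data an unconstrained natural cubic spline rings and dips outside the data range, while a monotone interpolant such as PCHIP stays within it.

```python
import numpy as np
from scipy.interpolate import CubicSpline, PchipInterpolator

# Monotone (non-decreasing) step-like data: the classic case where an
# unconstrained cubic spline rings, while a monotone interpolant does not.
x = np.arange(9.0)
y = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])

cubic = CubicSpline(x, y, bc_type="natural")  # interpolates, but can overshoot
pchip = PchipInterpolator(x, y)               # shape-preserving, never overshoots

xs = np.linspace(0.0, 8.0, 400)
print("cubic range:", cubic(xs).min(), cubic(xs).max())  # dips below 0 / above 1
print("pchip range:", pchip(xs).min(), pchip(xs).max())  # stays within [0, 1]
```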
Well, I’m not complaining, I’m suggesting an automated reply to notify people that their stuff was submitted. Which is how things usually work when you submit through a recruiting portal.
FYI: your https website is not loading.
Plus, neither your post nor the non-https version of your website provides any information on what you're actually looking/recruiting for.
Probably not the best way to attract people.
It’s not that it is wrong or anything - it’s just unnecessarily verbose.
Which you could argue is not a problem if it won’t be read by humans anyway in the near future.