hi_hi's comments | Hacker News

It's LLMs all the way down :-)

This can't be scaled to more generalised tasks. If you solve that then you've solved the hallucination issue.


Regulated industries (amongst many others) need to be deterministic. Imagine your bank being non-deterministic.


>Imagine your bank being non-deterministic.

That's already the case. Payments are not deterministic. It can take multiple days for things to settle. The real world is messy.

When I make a payment I have no clue if the money is actually going to make it to a merchant or if some fraud system will block it.


The bank can very much determine if the payment has been made or not (although not immediately, as you mentioned). As a rule, banks like to keep track of money.


Yes, it settles deterministically. With AI, it claims to be settled and moves on, and it's up to you to figure out how deterministic the whole transaction actually was.


Is it the main issue? Payments suffer from race conditions, but the processes themselves are deterministic, auditable and may be rolled back. Not sure how many of these important attributes would remain with a neural network at the helm.
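
For what it's worth, here's a toy sketch of what those properties look like in code (illustrative table and account names, not any real bank's system): the transfer either applies in full or not at all, and every movement of money lands in a ledger you can audit or compensate later.

    # Illustrative only: a toy double-entry transfer showing the properties
    # mentioned above (deterministic, auditable, reversible).
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("CREATE TABLE ledger (tx_id TEXT, account TEXT, delta INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [("alice", 100), ("bob", 0)])

    def transfer(tx_id, src, dst, amount):
        try:
            with conn:  # one atomic transaction: commits on success, rolls back on error
                cur = conn.execute(
                    "UPDATE accounts SET balance = balance - ? WHERE id = ? AND balance >= ?",
                    (amount, src, amount))
                if cur.rowcount == 0:
                    raise ValueError("insufficient funds")
                conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                             (amount, dst))
                # Audit trail: every movement of money is recorded, so it can be
                # reconstructed or compensated later.
                conn.executemany("INSERT INTO ledger VALUES (?, ?, ?)",
                                 [(tx_id, src, -amount), (tx_id, dst, amount)])
            return "settled"
        except ValueError:
            return "rejected"

    print(transfer("tx-1", "alice", "bob", 60))  # settled
    print(transfer("tx-2", "alice", "bob", 60))  # rejected, nothing half-applied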


The article seems well researched, has some good data, and is generally interesting. It's completely irrelevant to the reality of the situation we are currently in with LLMs.

It's falling into the trap of assuming we're going to get to the science fiction abilities of AI with the current software architectures, and within a few years, as long as enough money is thrown at the problem.

All I can say for certain is that all the previous financial instruments that have been jumped on to drive economic growth have eventually crashed. The dot com bubble, credit instruments leading to the global financial crisis, the crypto boom, the current housing markets.

The current investments around AI that we're all agog at are just another large scale instrument for wealth generation. It's not about the technology. Just like VR and BioTech wasn't about the technology.

That isn't to say the technology outcomes aren't useful and amazing; they are just independent of the money. Yes, there are Trillions (a number so large I can't quite comprehend it to be honest) being focused into AI. No, that doesn't mean we will get incomprehensible advancements out the other end.

AGI isn't happening this round folks. Can hallucinations even be solved this round? Trillions of dollars to stop computers lying to us. Most people where I work don't even realise hallucinations are a thing. How about a Trillion dollars so Karen or John stop dismissing different viewpoints because a chat bot says something contradictory, and actually listen? Now that would be worth a Trillion dollars.

Imagine a world where people could listen to others outside of their bubble. Instead, they're being given tools that reinforce the bubble.


Indeed, this could be AI's fusion energy era, or AI's VR era, or even AI's FTL travel era.


I'd be interested in seeing React Native in this comparison.

I'm not overly familiar with it, but we use it at work. I've no idea if I should expect it to be quicker or slower than something like Next.


What do you hope to see from the result of that comparison?


To gauge where RN sits on the spectrum of fast to slow.


My PS2 Fat still works perfectly. The only difficulty I had was getting it running on an HDMI TV, but that was fixed with a fairly well known HDMI adapter and inputting a specific controller combo to get it into the correct display mode.

I was kinda shocked to see the state of some of those PS2 consoles.


I'm waiting for the Tesla FSD playbook to be rolled out for Grok. That is, launch something with a name like Grok AGI 1, wait for it to become obvious it isn't in fact AGI, create a narrative redefining AGI, promise new AGI is 1 year away, and repeat for many years.


Bonus points if you manage to kill a few poor deluded saps with your unsafe product along the way.


> create a narrative redefining AGI

Hasn't OpenAI redefined AGI already as "any AI that can [supposedly] create a hecto-unicorn's worth of economic value"?


If you're going to be this ridiculous, at least make it look cool.

Re-invent the Penny Farthing perhaps. An electric version of that, with some funky way of lifting the rider up, would be amazing.

This is like the cronut of bikes, all the downsides of a scooter, and none of the upsides of a motorbike.


> This is like the cronut of bikes, all the downsides of a scooter, and none of the upsides of a motorbike.

Is it possible you didn't look at the article and are thinking of something like a Vespa rather than the type of thing you stand upright on while holding onto a handle?

These things have plenty of advantages. They're incredibly portable and easy to charge. When the weather is nice my friend rides his the ~20 miles from his place to my place (which is mostly covered by rail trail bike paths). You can bring one into the office and charge it at your desk. Many public transit systems allow you to bring them on the subway or bus, which at rush hour might not be possible with a normal bicycle. If bike lockups aren't available at the nearest station/stop, this lets people who live within a mile or two still get the advantage of public transit.

> Re-invent the Penny Farthing perhaps. An electric version of that, with some funky way of lifting the rider up, would be amazing.

The penny farthing bike is certainly ridiculous, but I can't imagine a way you could possibly make one look cool. Is it even legal to ride one if you don't have some sort of ironic facial hair, a bowler hat, or a vest?


24 kW on a Penny Farthing-like bike?

At 100 MPH you hit the brakes, and instead of stopping, the whole thing just rolls over and slams you into the ground at the same velocity? x)


Penny Farthing or E-Scooter, at these speeds you're going to slam into the ground and become a mess regardless. At least on the PF, you look cool...


It only takes basic high school physics to see how much worse a penny farthing is. And I'm no fan of scooters.
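
Rough back-of-envelope, with guessed geometry rather than measured numbers: under front braking a bike pitches over once deceleration exceeds roughly g times the horizontal distance from the centre of mass to the front contact patch, divided by the centre-of-mass height. On a penny farthing the rider sits almost directly above the front axle, so that ratio is tiny.

    # Pitch-over limit under front braking: a_max ~ g * x / h,
    # where x is the horizontal distance from the combined centre of mass
    # to the front wheel's contact patch and h is the centre-of-mass height.
    # The geometry figures below are rough guesses for illustration only.
    g = 9.81  # m/s^2

    def pitch_over_decel(x_m, h_m):
        return g * x_m / h_m

    scooter = pitch_over_decel(x_m=0.35, h_m=1.10)         # rider standing between small wheels
    penny_farthing = pitch_over_decel(x_m=0.05, h_m=1.60)  # rider nearly above the front axle

    print(f"e-scooter:      ~{scooter:.1f} m/s^2 before pitching over")
    print(f"penny farthing: ~{penny_farthing:.1f} m/s^2 before pitching over")
    # The penny farthing goes over its front wheel at a small fraction of the
    # braking the scooter can tolerate, which is the point being made above.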


I dunno man. That big wheel is going to ride over those pot holes like they're nothing.


We were specifically talking about braking.


Listen I don't know the physics, I'm the ideas man, the practical details are for you eggheads to figure out.


At last, someone who gets it. What's the point of moving at 100mph, on the edge of death, and looking like a dork?


I may not have fully grasped this, but on the surface, it looks like they want me to have AI agents inserted directly into my git workflow...like right there with all my wonderful juicy code? Is that correct?

Isn't this a recipe for disaster, or is all the FUD around agents wreaking havoc getting to me? I love Claude Code, but it can be somewhat bonkers and is at least at arm's length from doing any real damage to my code (assuming I'm following good dev practices, and don't let it loose on my wider filesystem).


What’s wrong with receiving code/security/MR review comments from AI?


I've always wondered, who was the first person to milk a cow, and then...drink it?


This reminds me of a Calvin and Hobbes comic: https://www.reddit.com/r/calvinandhobbes/comments/1gc8af/why...


I'm sure there was an element of relating it to how humans do the same thing, but we can make this thing always produce milk and we can't with humans.


Doesn't it all come down to "what is the ideal interface for humans to deal with digital information"?

We're getting more and more information thrown at us each day, and the AIs are adding to that, not reducing it. The ability to summarise dense and specialist information (I'm thinking error logs, but could be anything really) just means more ways for people to access and view that information who previously wouldn't.

How do we, as individuals, best deal with all this information efficiently? Currently we have a variety of interfaces: websites, dashboards, emails, chat. Are all these necessary anymore? They might be now, but what about the next 10 years? Do I even need to visit a company's website if I can get the same information from some single chat interface?

The fact we have AIs building us websites, apps, and web UIs just seems so...redundant.


Websites were a way to get authoritative information about a company, from that company (or another trusted source like Wikipedia). That trust is powerful, which is why we collectively spent so much time trying to educate users about the "line of death" in browsers, drawing padlock icons, chasing down impersonator sites, mitigating homoglyph attacks, etc. This all rested on the assumption that certain sites were authoritative sources of information worth seeking out.

I'm not really sure what trust means in a world where everyone relies uncritically on LLM output. Even if the information from the LLM is usually accurate, can I rely on that in some particularly important instance?


You raise a good point, and one I rarely see discussed.

I still believe it fundamentally comes down to an interface issue, but how trust gets decoupled from the interface (as you said, the padlock shown in the browser and certs to validate a website source), that's an interesting one to think about :-)


I imagine there will be the same problems as with Facebook and other large websites, that used their power to promote genocide. If you're in the mood for some horror stories:

https://erinkissane.com/meta-in-myanmar-full-series

When LLMs are suddenly everywhere, who's making sure that they are not causing harm? I got the above link from Dan Luu (https://danluu.com/diseconomies-scale/) and if his text there is anything to go by, the large companies producing LLMs will have very little interest in making sure their products are not causing harm.


In some cases, like the Air Canada one where the courts made them uphold a deal offered by their chatbot, it'll be "accurate" information whether the company wants it to be or not!

Not everything an LLM tells you is going to be worth going to court over if it's wrong, though.


The designers of 6th gen fighter jets are confronting the same challenge. The cockpit, which is an interface between the pilot and the airframe, will be optionally manned. If the cockpit is manned, the pilot will take on a reduced set of roles focused on higher-level decision making.

By the 7th generation it's hard to see how humans will still be value-add, unless it's for international law reasons to keep a human in the loop before executing the kill chain, or to reduce Skynet-like tail risks in line with Paul Christiano's arms race doom scenario.

Perhaps interfaces in every domain will evolve this way. The interface will shrink in complexity, until it's only humans describing what they want to the system, at higher and higher levels of abstraction. That doesn't necessarily have to be an English-language interface if precision in specification is required.


> keep a human in the loop before executing the kill chain, or to reduce Skynet-like tail risks in line with Paul Christiano's arms race doom scenario.

It is a little-known secret that plenty of defense systems are already set up to dispense with the human-in-the-loop protocol before a fire action. For defense primarily, but also for attack once a target has been designated. I worked on protocols in the 90s, and this decision was already accepted.

It happens to be so effective that the military won't budge on this.

Also, it is not much worse to have a decision system act autonomously for a kill system, if you consider that the alternative is a dumb system such as a landmine.

Btw: while there is always a "stop button" in these systems, don't be fooled. Those are meant to provide a semblance of comfort and compliance to the designers of those systems, but are hardly effective in practice.


We will get to the dream of Homer Simpson gorging on donuts and "operating" a nuclear power plant.


Is this just what you think might happen, or are you directly involved in these decisions and exposing a challenge first-hand?


Computers / the web are a fast route to information, but they are also a storehouse of information: a ledger. This is the old "information wants to be free, and also very expensive." I don't want all the info on my PC, or the bank database, to be 'alive'; I want it to be frozen in carbonite, so it's right where I left it when I come back.

I think we're slowly allowing AI access to the interface layer, but not to the information layer, and hopefully we'll figure out how to keep it that way.


I like the smartphone. It’s honestly perfect and underutilized.


Every human is different, don't generalize the interface. Dynamically customize it on the fly.


Yep, I think this is the fundamental question as well; everything else is intermediate.

