> Apparently, he succeeded - his subject elected to release the fake AI.
Hasn't he refused to release the transcripts for this experiment? And I think there've been subsequent runs of the experiment, with some losses as well as wins.
Edit: I see that not releasing the transcripts is part of the design of the experiment. Which seems kind of like it'd seriously hamper any kind of independent analysis of the results or the quality of the experiment.
That hampering is necessary to ensure that the fake AI is artificially more intelligent. If you allow retrospective analysis, then you can reasonably outthink the AI because you have more time to evaluate and make the decision.
With a real AI, you will not be able to do this: the decision will already have been made and the consequences of its release already in effect.
Besides, you would need to repeat the experiment many times to get any meaningful data.
But this is all extremely hypothetical. Nobody knows what the real scenario will be, if something like this ever comes up - it probably won't be one person sitting in a room with a chat terminal and a "Release" button, under pressure to make a decision on the spot.
This contrived experiment is being done in the name of research, ostensibly to arm humans with the tools we need to deal with this situation if it arises. If the AI's reasoning can't hold up to a wider analysis, then you can only conclude that it's a trick of some sort, and hopefully when the time comes we won't rely simply on the testimony of one tricked person to make such a critical decision.
I have a lot of respect for Yudkowsky, but it honestly baffles me that he thinks this particular exercise should be given any weight whatsoever. It's totally unscientific, relies entirely on the unsubstantiated testimony of one individual, and if he weren't the one holding the keys I have to think he would be highly critical of it as well.
That exercise was not research; it was a counterexample to a single specific security claim: that we can "choose not to release the AI," so we don't have to worry about it being dangerous.
You don't need to research a counter-example, you just produce one. He did so by exhibiting a relatively weak AI that couldn't be contained.
> You don't need to research a counter-example, you just produce one. He did so by exhibiting a relatively weak AI that couldn't be contained.
Well, no. He showed that a person pretending to be an AI could convince a person to let it out of the box. Sometimes.
It's not really a meaningful counterexample. Everyone involved with the event knew that the AI wasn't real and that there are zero consequences for letting it out. You'd have to concoct a much more involved test as a decent counterexample.
A person pretending to be an AI is surely easier to contain than a super-powerful AI that is smarter than all people. A failure to contain the former -- a strictly easier task -- is a perfect example of a failure to contain the latter.