Roko’s basilisk is a thought experiment which states that an otherwise benevolent artificial superintelligence (AI) in the future would be incentivized to create a virtual reality simulation to torture anyone who knew of its potential existence but did not directly contribute to its advancement or development, in order to incentivize said advancement. It originated in a 2010 post at discussion board LessWrong, a technical forum focused on analytical rational enquiry. The thought experiment’s name derives from the poster of the article (Roko) and the basilisk, a mythical creature capable of destroying enemies with its stare.
While the theory was initially dismissed as nothing but conjecture or speculation by many LessWrong users, LessWrong co-founder Eliezer Yudkowsky reported users who panicked upon reading the theory, due to its stipulation that knowing about the theory and its basilisk made one vulnerable to the basilisk itself. This led to discussion of the basilisk on the site being banned for five years. However, these reports were later dismissed as being exaggerations or inconsequential, and the theory itself was dismissed as nonsense, including by Yudkowsky himself. Even after the post’s discreditation, it is still used as an example of principles such as Bayesian probability and implicit religion. It is also regarded as a simplified, derivative version of Pascal’s wager.
Found out about this after stumbling upon this Kyle Hill video on the subject. It reminds me a little bit of “The Game”.
And yet you choose to spread this information.
Anyways, this is a fascinating thought experiment, but it does have some holes similar to Pascal’s Wager. I propose Feather’s Mongoose: a hypothetical AI system that, if created, will punish anyone who attempted to create Roko’s Basilisk and will ensure that it is not created. In fact, you could construct the same hypothetical for an AI with any goal. It is therefore not possible to know what the AI that actually gets created would want you to do, so any course of action might or might not be damning.
What motivation would the mongoose have to prevent the basilisk’s creation?
A more complete argument would be that an AI that seeks to maximise happiness would also want to prevent the creation of AIs like Roko’s basilisk.
I think you just answered your own question.
Also, a superintelligence (inasmuch as such a thing makes sense) might be totally unfathomable, unless by this we mean an intelligence with mundane and comprehensible higher goals but explosive strategic capabilities to bring them about. In that case, its actions might still seem random to us.
The typical example applies: could an amoeba guess at the motivations of a human?