When German journalist Martin Bernklau typed his name and location into Microsoft’s Copilot to see how his articles would be picked up by the chatbot, the answers horrified him. Copilot’s results asserted that Bernklau was an escapee from a psychiatric institution, a convicted child abuser, and a conman preying on widowers. For years, Bernklau had served as a courts reporter, and the AI chatbot had falsely blamed him for the crimes whose trials he had covered.

The accusations against Bernklau weren’t true, of course, and are examples of generative AI’s “hallucinations.” These are inaccurate or nonsensical responses to a prompt provided by the user, and they’re alarmingly common. Anyone attempting to use AI should always proceed with great caution, because information from such systems needs validation and verification by humans before it can be trusted.

But why did Copilot hallucinate these terrible and false accusations?

  • rsuri@lemmy.world
    2 months ago

    “Hallucinations” is the wrong word. To the LLM there’s no difference between reality and “hallucinations”, because it has no concept of reality or of what’s true and false. All it knows is what word should probably come next. The “hallucination” only exists in the mind of the reader. The LLM did exactly what it was supposed to.
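
The "what word should come next" point can be illustrated with a toy sketch. This is not how Copilot or any real LLM is implemented; it is a minimal bigram model over a made-up three-sentence corpus, and every name in it (`train_bigrams`, `next_word`, `generate`, `CORPUS`) is hypothetical. It only shows that a sampler driven by word co-occurrence has no check for truth, so it can join "the reporter" to "was convicted of fraud" even though no sentence in the corpus says that.

```python
import random
from collections import Counter, defaultdict

# Made-up corpus (an assumption for illustration): the reporter covers a
# trial, and a different person, the defendant, is convicted.
CORPUS = (
    "the reporter covered the trial . "
    "the defendant was convicted of fraud . "
    "the reporter was praised ."
)


def train_bigrams(text):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word(counts, prev, rng):
    """Sample a follower of `prev` in proportion to its observed count."""
    options = counts.get(prev)
    if not options:
        return None  # no observed follower for this word
    words = list(options)
    weights = [options[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def generate(counts, start, length, seed=0):
    """Generate up to `length` words, one sampled word at a time."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        word = next_word(counts, out[-1], rng)
        if word is None:
            break
        out.append(word)
    return " ".join(out)


counts = train_bigrams(CORPUS)
# Nothing below checks whether the generated sentence is true.
print(generate(counts, "the", 6))
```

Because "reporter" is followed by "was" once in the corpus and "was" is followed by "convicted" once, the chain "the reporter was convicted of fraud" is a valid path through the counts, and some seeds will produce it.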

    • Hobo@lemmy.world
      2 months ago

      They’re bugs. Major ones. Fundamental flaws in the program. People with a vested interest in “AI” rebranded them as hallucinations in order to downplay the fact that they have a major bug in their software and they have no fucking clue how to fix it.

      • SkunkWorkz@lemmy.world
        2 months ago

        It’s not a bug. Just a negative side effect of the algorithm. This is what happens when the LLM doesn’t have enough data points to answer the prompt correctly.

        It can’t be programmed out like a bug; rather, a human needs to intervene and flag the answer as false, or the LLM needs more data to train on. Those dozens of articles this guy wrote aren’t enough for the LLM to get that he’s just a reporter. The LLM needs data that explicitly says that this guy is a reporter who reported on those trials. And since no reporter starts their articles with “Hi I’m John Smith the reporter and today I’m reporting on…” that data is missing. LLMs can’t draw conclusions from the context.

  • Broken@lemmy.ml
    2 months ago

    This sounds like a great movie.

    AI sends police after him because of things he wrote. Writer is on the run, trying to clear his name the entire time. Somehow gets to broadcast the source of the articles to the world to clear his name. Plot twist ending is that he was indeed the perpetrator behind all the crimes.

  • gcheliotis@lemmy.world
    2 months ago

    The AI did not “decide” anything. It has no will. And no understanding of the consequences of any particular “decision”. But I guess “probabilistic model produces erroneous output” wouldn’t get as many views. The same point could still be made about not placing too much trust in the output of such models. Let’s stop supporting this weird anthropomorphizing of LLMs. In fact we should probably become much more discerning in using the term “AI”, because it alludes to a general intelligence akin to human intelligence with all the paraphernalia of humanity: consciousness, will, emotions, morality, sociality, duplicity, etc.

    • Hello Hotel@lemmy.world
      2 months ago

      The AI “decided” in the same way the dice “decided” to land on 6 and 4 and screw me over. The system made a result using logic and entropy. With AI, some people are just using this informal way of speaking (subconsciously anthropomorphising) while others look at it and genuinely believe, or want to pretend, it’s alive. You can never really know without asking them directly.

      Yes, if the intent is confusion, it is pretty manipulative.

      • gcheliotis@lemmy.world
        2 months ago

        Granted, our tendency towards anthropomorphism is near ubiquitous. But it would be disingenuous to claim that it does not play out in very specific and very important ways in how we speak and think about LLMs, given that they are capable of producing very convincing imitations of human behavior. And as such also produce a very convincing impression of agency. As if they actually do decide things. Very much unlike dice.

        • Hello Hotel@lemmy.world
          2 months ago

          A doll is also designed to be anthropomorphised, to have life projected onto it. Unlike with dolls, when someone talks about LLMs as alive, most people have no clue if they are pretending or not. (And marketers take advantage of it!) We are fed a culture that accidentally says “chatGPT + Boston Dynamics robot = Robocop”, assuming the only fictional part is that we don’t have the ability to make it, not that the thing we create wouldn’t be human (or even need to be human).

  • n0m4n@lemmy.world
    2 months ago

    If this were some fiction plot, Copilot reasoned out the plot twist and ran with it. Instead of the butler, the writer did it. To the computer, these are about the same.

  • erenkoylu@lemmy.ml
    2 months ago

    The problem is not the AI. The problem is the huge number of morons who deploy AI without proper verification and control.

    • stingpie@lemmy.world
      2 months ago

      No, you’re thinking of the first scene of the movie where a fly falls into the teletype machine and causes it to type ‘tuttle’ instead of ‘buttle’.