• Rogers@lemmy.ml
    English
    arrow-up
    12
    arrow-down
    9
    ·
    24 days ago

    The latest LLMs get a perfect score on the South Korean SAT and can pass the bar exam. That is more than pure marketing, if you ask me. That said, 90% of the businesses that claim AI may well be nothing more than marketing, or pretty much just a front end for GPT APIs. LLMs like Claude even check their work for hallucinations. Even if we limited all AI to LLMs, they would still be groundbreaking.

    • clutchtwopointzero@lemmy.world
      English
      arrow-up
      8
      ·
      24 days ago

      The Korean SAT is highly standardized, in multiple-choice form, and there is an immense library of past exams that both test takers and examiners use. I would be more impressed if the LLMs could also show their step-by-step working of the problems…

      • Rogers@lemmy.ml
        English
        arrow-up
        2
        arrow-down
        5
        ·
        24 days ago

        Claude 3.5 and o1 might be able to do that; if not, they are close. Still better than 99.99% of earthly humans.