• Rogers@lemmy.ml
      English
      arrow-up
      14
      arrow-down
      13
      ·
      1 month ago

      The latest LLMs get a perfect score on the South Korean SAT and can pass the bar exam. That's more than pure marketing, if you ask me. That doesn't mean 90% of the businesses claiming AI aren't pure marketing, or pretty much just a front end for GPT APIs. LLMs like Claude even check their work for hallucinations. Even if we limited all AI to LLMs, they would still be groundbreaking.

      • clutchtwopointzero@lemmy.world
        English
        arrow-up
        13
        ·
        1 month ago

        The Korean SAT is highly standardized and multiple choice, and there is an immense library of past exams that both test takers and examiners use. I would be more impressed if the LLMs could also show their step-by-step working for each problem…

        • Rogers@lemmy.ml
          English
          arrow-up
          2
          arrow-down
          8
          ·
          1 month ago

          Claude 3.5 and o1 might be able to do that; if not, they are close. That's still better than 99.99% of humans on Earth.