• rah@hilariouschaos.com · 1 day ago

    LLM limitations like “they only predict the next token” and other things that have already been falsified

    What do LLMs do beyond predicting the next token?

  • kromem@lemmy.worldOP · 23 hours ago

      A few months back it was found that, when writing rhyming couplets, the model had already selected the second rhyming word while it was still predicting the first word of the second line. In other words, the model was planning the final rhyme tokens at least one full line ahead, not just picking the rhyme once it arrived at that token.

      It’s probably wise to consider this finding in concert with the streetlight effect.

    • rah@hilariouschaos.com · 10 hours ago

        selected

        What do you mean by that? What does it mean to “select” something in the context of a neural net with input nodes and output nodes?

        the model was planning

        How have you come to that conclusion?

      • rah@hilariouschaos.com · edited · 7 hours ago

            Are you able to explain succinctly what you mean by “selected” so that we can communicate? That page is pretty dense and opaque.