OpenAI now tries to hide that ChatGPT was trained on copyrighted books, including J.K. Rowling’s Harry Potter series::A new research paper laid out ways in which AI developers should try and avoid showing LLMs have been trained on copyrighted material.

  • TropicalDingdong@lemmy.world
    English
    arrow-up
    145
    arrow-down
    24
    ·
    1 year ago

    It’s a bit pedantic, but I’m not really sure I support this kind of extremist view of copyright and the scale of what’s being interpreted as ‘possessed’ under the idea of copyright. Once an idea is communicated, it becomes a part of the collective consciousness. Different people interpret and build upon that idea in various ways, making it a dynamic entity that evolves beyond the original creator’s intention. It’s like the issues with sampling beats or records in the early days of hip-hop. The very principle of an idea goes against this vision: once you put something out into the commons, it’s irretrievable. It’s not really yours any more once it’s been communicated. I think if you want to keep an idea truly yours, then you should keep it to yourself. Otherwise you are participating in a shared vision of the idea. You don’t control how the idea is interpreted, so it’s not really yours any more.

    Whether that’s ChatGPT or Public Enemy is neither here nor there to me. The idea that a work like Peter Pan is still possessed is a very real but very silly and obvious malady of this weirdly accepted but very extreme view of the ability to possess an idea.

    • Bogasse@lemmy.world
      English
      arrow-up
      13
      arrow-down
      2
      ·
      1 year ago

      Well, I’d consider agreeing if the LLMs were considered as a generic knowledge database. However, I had the impression that the whole response from OpenAI & co. to this copyright issue is “they build original content”, both for LLMs and stable diffusion models. Now that they have started this line of defence, I think that they are stuck with proving that their “original content” is not derived from copyrighted content 🤷

      • TropicalDingdong@lemmy.world
        English
        arrow-up
        2
        ·
        1 year ago

        Well, I’d consider agreeing if the LLMs were considered as a generic knowledge database. However, I had the impression that the whole response from OpenAI & co. to this copyright issue is “they build original content”, both for LLMs and stable diffusion models. Now that they have started this line of defence, I think that they are stuck with proving that their “original content” is not derived from copyrighted content 🤷

        Yeah I suppose that’s on them.

    • Toasteh@lemmy.world
      English
      arrow-up
      6
      ·
      1 year ago

      Copyright definitely needs to be stripped back severely. Artists need time to use their own work, but after a certain time everything needs to enter the public space for the sake of creativity.

    • AgentOrange@lemm.ee
      English
      arrow-up
      4
      arrow-down
      3
      ·
      1 year ago

      To add to that, Harry Potter is the worst example to use here. There is no extra billion that JK Rowling needs to allow her to spend time writing more books.

      Copyright was meant to encourage authors to invest in their work in the same way that patents do. If you were going to argue about the issue of lifting content from books, you should be using books that need the protection of copyright, not ones that don’t.

      • TropicalDingdong@lemmy.world
        English
        arrow-up
        5
        arrow-down
        1
        ·
        1 year ago

        Copyright was meant

        I just don’t know that I agree that this line of reasoning is useful. Who cares what it was meant for? What is it now, currently and functionally, doing?

      • TropicalDingdong@lemmy.world
        English
        arrow-up
        39
        arrow-down
        12
        ·
        1 year ago

        If you sample someone else’s music and turn around and try to sell it, without first asking permission from the original artist, that’s copyright infringement.

        I think you completely and thoroughly do not understand what I’m saying or why I’m saying it. Nowhere did I suggest that I do not understand modern copyright. I’m saying I’m questioning my belief in this extreme interpretation of copyright, which is represented by exactly what you just parroted. This interpretation is not only functionally and materially unworkable, but also antithetical to a reasonable understanding of how ideas and communication work.

          • kmkz_ninja@lemmy.world
            English
            arrow-up
            5
            arrow-down
            5
            ·
            1 year ago

            Yeah, this is definitely leaning a little too “People shouldn’t pump their own gas because gas attendants need to eat, feed their kids, pay rent” for me.

      • NOT_RICK@lemmy.world
        English
        arrow-up
        5
        arrow-down
        1
        ·
        edit-2
        1 year ago

        A sample is a fundamental part of a song’s output, not just its input. If LLMs are changing the input work to a high enough degree, is the result not protected as a transformative work?