OpenAI now tries to hide that ChatGPT was trained on copyrighted books, including J.K. Rowling’s Harry Potter series

A new research paper laid out ways in which AI developers should try to avoid showing that LLMs have been trained on copyrighted material.

  • TropicalDingdong@lemmy.world · 1 year ago

    It’s a bit pedantic, but I’m not really sure I support this kind of extremist view of copyright and the scale of what’s being interpreted as ‘possessed’ under the idea of copyright. Once an idea is communicated, it becomes a part of the collective consciousness. Different people interpret and build upon that idea in various ways, making it a dynamic entity that evolves beyond the original creator’s intention. It’s like the issues with sampling beats or records in the early days of hip-hop. It’s like the very principle of an idea goes against this vision; more than that, once you put something out into the commons, it’s irretrievable. It’s not really yours anymore once it’s been communicated. I think if you want to keep an idea truly yours, then you should keep it to yourself. Otherwise you are participating in a shared vision of the idea. You don’t control how the idea is interpreted, so it’s not really yours anymore.

    Whether that’s ChatGPT or Public Enemy is neither here nor there to me. The idea that a work like Peter Pan is still ‘possessed’ is a very real but very silly malady of this weirdly accepted yet extreme view of the ability to possess an idea.

    • Bogasse@lemmy.world · 1 year ago

      Well, I’d consider agreeing if LLMs were treated as a generic knowledge database. However, I had the impression that the whole response from OpenAI & co. to this copyright issue is “they build original content”, both for LLMs and for Stable Diffusion models. Now that they’ve started this line of defence, I think they’re stuck with proving that their “original content” is not derived from copyrighted content 🤷

      • TropicalDingdong@lemmy.world · 1 year ago

        Well, I’d consider agreeing if LLMs were treated as a generic knowledge database. However, I had the impression that the whole response from OpenAI & co. to this copyright issue is “they build original content”, both for LLMs and for Stable Diffusion models. Now that they’ve started this line of defence, I think they’re stuck with proving that their “original content” is not derived from copyrighted content 🤷

        Yeah I suppose that’s on them.

    • Toasteh@lemmy.world · 1 year ago

      Copyright definitely needs to be stripped back severely. Artists need time to use their own work, but after a certain time everything needs to enter the public domain for the sake of creativity.

    • AgentOrange@lemm.ee · 1 year ago

      To add to that, Harry Potter is the worst example to use here. There is no extra billion that JK Rowling needs to allow her to spend time writing more books.

      Copyright was meant to encourage authors to invest in their work in the same way that patents do. If you were going to argue about the issue of lifting content from books, you should be using books that need the protection of copyright, not ones that don’t.

      • TropicalDingdong@lemmy.world · 1 year ago

        Copyright was meant

        I just don’t know that this line of reasoning is useful. Who cares what it was meant for? What is it doing now, currently and functionally?

      • TropicalDingdong@lemmy.world · 1 year ago

        If you sample someone else’s music and turn around and try to sell it, without first asking permission from the original artist, that’s copyright infringement.

        I think you completely and thoroughly do not understand what I’m saying or why I’m saying it. Nowhere did I suggest that I do not understand modern copyright. I’m saying I’m questioning my belief in this extreme interpretation of copyright, which is represented by exactly what you just parroted. This interpretation is not only functionally and materially unworkable, but also antithetical to a reasonable understanding of how ideas and communication work.

          • kmkz_ninja@lemmy.world · 1 year ago

            Yeah, this is definitely leaning a little too “People shouldn’t pump their own gas because gas attendants need to eat, feed their kids, pay rent” for me.

      • NOT_RICK@lemmy.world · 1 year ago

        A sample is a fundamental part of a song’s output, not just its input. If LLMs change the input work to a high enough degree, is the result not protected as a transformative work?

  • fubo@lemmy.world · 1 year ago

    If I memorize the text of Harry Potter, my brain does not thereby become a copyright infringement.

    A copyright infringement only occurs if I then reproduce that text, e.g. by writing it down or reciting it in a public performance.

    Training an LLM from a corpus that includes a piece of copyrighted material does not necessarily produce a work that is legally a derivative work of that copyrighted material. The copyright status of that LLM’s “brain” has not yet been adjudicated by any court anywhere.

    If the developers have taken steps to ensure that the LLM cannot recite copyrighted material, that should count in their favor, not against them. Calling it “hiding” is backwards.
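
    As a concrete illustration of what such a step could look like (a hypothetical sketch, not OpenAI’s actual implementation), a deployed system could compare generated text against a set of protected works and refuse to return near-verbatim recitations:

    ```python
    def longest_shared_run(candidate: str, reference: str) -> int:
        """Length, in words, of the longest word-for-word run shared by the two texts."""
        cand, ref = candidate.lower().split(), reference.lower().split()
        # Classic dynamic-programming longest-common-substring, applied to words.
        best = 0
        prev = [0] * (len(ref) + 1)
        for i in range(1, len(cand) + 1):
            curr = [0] * (len(ref) + 1)
            for j in range(1, len(ref) + 1):
                if cand[i - 1] == ref[j - 1]:
                    curr[j] = prev[j - 1] + 1
                    best = max(best, curr[j])
            prev = curr
        return best

    def filter_output(generated: str, protected_texts: list[str], max_run: int = 25) -> str:
        """Refuse to return text that recites any protected work nearly verbatim."""
        for text in protected_texts:
            if longest_shared_run(generated, text) > max_run:
                return "Sorry, I can't reproduce that passage."
        return generated
    ```

    Where exactly `max_run` should sit is, of course, the same line-drawing problem courts already wrestle with.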

    • cantstopthesignal@sh.itjust.works · 1 year ago

      You are a human, you are allowed to create derivative works under the law. Copyright law as it relates to machines regurgitating what humans have created is fundamentally different. Future legislation will have to address a lot of the nuance of this issue.

    • UnculturedSwine@lemmy.world · 1 year ago

      Another sensationalist title. The article makes it clear that the problem is users reconstructing large portions of a copyrighted work word for word. OpenAI is trying to implement a solution that prevents ChatGPT from regurgitating entire copyrighted works using “maliciously designed” prompts. OpenAI doesn’t hide the fact that these tools were trained using copyrighted works and legally it probably isn’t an issue.

  • Blapoo@lemmy.ml · 1 year ago

    We have to distinguish between LLMs:

    • Trained on copyrighted material and
    • Outputting copyrighted material

    They are not one and the same.

    • Even_Adder@lemmy.dbzer0.com · 1 year ago

      Yeah, this headline is trying to make it seem like training on copyrighted material is or should be wrong.

      • TropicalDingdong@lemmy.world · 1 year ago

        I think this brings up broader questions about the currently quite extreme interpretation of copyright. Personally, I don’t think it’s wrong to sample from or create derivative works from something that is accessible. If it’s not behind lock and key, it’s free to use. If you have a problem with that, then put it behind lock and key. No one is forcing you to share your art with the world.

        • Railcar8095@lemm.ee · 1 year ago

          Following that logic, if a sailor of the seas were to put a copy of a protected book on the internet and ChatGPT was trained on it, how would that argument go? The copyright owner didn’t place it there, so it’s not “their decision”. And savvy people can make sure it’s accessible if they want to.

          Just for context: my belief is that if they can use all non-locked data for free, then the model should be shared for free too, and its outputs shouldn’t be subject to copyright.

        • Bogasse@lemmy.world · 1 year ago

          Aren’t most books actually locked behind paywalls and not free to use? Or maybe I don’t understand what you meant.

    • TwilightVulpine@lemmy.world · 1 year ago

      Should we distinguish them, though? Why shouldn’t (and didn’t) artists have a say in whether their art is used to train LLMs? Just as publicly displayed art doesn’t grant permission to copy it and use it for other, unspecified purposes, it would be reasonable for the same to apply to AI training.

      • Blapoo@lemmy.ml · 1 year ago

        Ah, but that’s the thing. Training isn’t copying. It’s pattern recognition. If you train a model on “The dog says woof” and then ask the model “What does the dog say?”, it’s not guaranteed to say “woof”.

        Similarly, just because a model was trained on Harry Potter, all that means is it has a good corpus of how the sentences in that book go.

        Thus the distinction. Can I train on a comment section discussing the book?
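
        To make that “not guaranteed” part concrete, here is a toy sketch of how sampled decoding works; the words and probabilities are invented for illustration, not taken from any real model:

        ```python
        import random

        # Invented next-token distribution a model might have learned for the
        # prompt "What does the dog say?" -- purely illustrative numbers.
        next_token_probs = {
            "woof": 0.55,
            "bark": 0.25,
            "nothing": 0.12,
            "meow": 0.08,
        }

        def sample_next_token(probs: dict[str, float]) -> str:
            """Sample one token according to its probability, roughly as a sampling decoder does."""
            tokens, weights = zip(*probs.items())
            return random.choices(tokens, weights=weights, k=1)[0]

        # Run it a few times: the answer is usually "woof", but not guaranteed.
        for _ in range(5):
            print(sample_next_token(next_token_probs))
        ```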

  • Skanky@lemmy.world · 1 year ago

    Vanilla Ice had it right all along. Nobody gives a shit about copyright until big money is involved.

  • rosenjcb@lemmy.world · 1 year ago

    The powers that be have done a great job convincing the layperson that copyright is about protecting artists and not publishers. That’s historically inaccurate: copyright law was pushed by publishers who did not want authors keeping secondhand manuscripts of works they had sold to publishing companies.

    Additional reading: https://en.m.wikipedia.org/wiki/Statute_of_Anne

    • stewsters@lemmy.world · 1 year ago

      Yeah, it refuses to give you the first sentence from Harry Potter now.

      Which is kinda lame; you can find that on thousands of webpages, many of which the system indexed.

      If someone was looking to pirate the book there are way easier ways than issuing thousands of queries to ChatGPT. Type “Harry Potter torrent” into Google and you will have them all in 30 seconds.

      • BURN@lemmy.world · 1 year ago

        ChatGPT has a ton of extra query qualifiers added behind the scenes to ensure that specific outputs can’t happen.
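
        Nobody outside OpenAI knows exactly what those qualifiers look like, but the general shape is something like the following sketch (names and wording are hypothetical, not OpenAI’s actual system):

        ```python
        # Hypothetical illustration of "extra query qualifiers": the deployed system
        # can prepend its own hidden instructions to every user prompt before the
        # model sees it. `call_model` is a stand-in for whatever inference API is used.

        GUARDRAIL_INSTRUCTIONS = (
            "Do not reproduce long verbatim passages from copyrighted works. "
            "Summarize or paraphrase instead, and refuse requests for full chapters."
        )

        def build_messages(user_prompt: str) -> list[dict]:
            """Wrap the raw user prompt with hidden system-level qualifiers."""
            return [
                {"role": "system", "content": GUARDRAIL_INSTRUCTIONS},
                {"role": "user", "content": user_prompt},
            ]

        def answer(user_prompt: str, call_model) -> str:
            # The user never sees the system message, but it constrains the output.
            return call_model(build_messages(user_prompt))
        ```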

  • paraphrand@lemmy.world · 1 year ago

    Why are people defending a massive corporation that admits it is attempting to create something that will give them unparalleled power if they are successful?

    • bamboo@lemm.ee · 1 year ago

      Mostly because fuck corporations trying to milk their copyright. I have no particular love for OpenAI (though I do like their product), but I do have great disdain for already-successful corporations that would hold back the progress of humanity because they didn’t get paid (again).

      • assassin_aragorn@lemmy.world · 1 year ago

        There’s a massive difference though between corporations milking copyright and authors/musicians/artists wanting their copyright respected. All I see here is a corporation milking copyrighted works by creative individuals.

    • Whimsical@lemmy.world · 1 year ago

      The dream would be that they manage to make their own glorious free and open-source version, so that after a brief spike in corporate profit as they fire all their writers and artists, suddenly nobody needs those corps anymore, because EVERYONE gets access to the same tools. If everyone has the ability to churn out massive amounts of content without hiring anyone, that theoretically favors those who never had the capital to hire people to begin with, far more than those who did the hiring.

      Of course, this stance doesn’t really have an answer for any of the other problems involved in the tech, not least of which is that there are bigger issues at play than just “content”.

      • otherbastard@lemm.ee · 1 year ago

        An LLM is not a person, it is a product. It doesn’t matter that it “learns” like a human - at the end of the day, it is a product created by a corporation that used other people’s work, with the capacity to disrupt the market that those folks’ work competes in.

        • Touching_Grass@lemmy.world · 1 year ago

          And it should be able to freely use anything that’s available to it. These massive corporations and entities have exploited all the free spaces to advertise and sell us their own products, and now they’re sour.

          If they had their way, they would lock up much more of the net behind paywalls. Everybody should be siding with the LLMs on this.

          • otherbastard@lemm.ee · 1 year ago

            You are somehow conflating “massive corporation” with “independent creator,” while also not recognizing that successful LLM implementations are and will be run by massive corporations, and eventually plagued with ads and paywalls.

            People who make things should be paid for their time and the value they provide their customers.

            • Touching_Grass@lemmy.world · 1 year ago

              People are paid. But they’re greedy and expect far more compensation than they deserve. In this case they should not be compensated for having an LLM ingest their work if that work was legally owned or obtained.

          • assassin_aragorn@lemmy.world · 1 year ago

            Except the massive corporations and entities are the ones getting rich on this. They’re seeking to exploit the work of authors and musicians and artists.

            Respecting the intellectual property of creative workers is the anti corporate position here.

            • uis@lemmy.world · 1 year ago

              Except corporations have infinitely more resources (money, lawyers) than the people who create. Take Jarek Duda (a mathematician from Poland) and Microsoft as an example. He created a new compression algorithm, and Microsoft came along a few years later and patented it in Britain, AFAIK. To contest the patent and file prior art he needs £100k.

              • assassin_aragorn@lemmy.world · 1 year ago

                I think there’s an important distinction to make here between patents and copyright. Patents are the issue with corporations, and I couldn’t care less if AI consumed all that.

                • uis@lemmy.world · 1 year ago

                  And for copyright there is no possible way to contest it. Also, when copyright expires there is no guarantee the work will still be accessible to humanity. Patents are bad; copyright is even worse.

            • uis@lemmy.world · 1 year ago

              There is nothing anti-corporate about it if the result can be alienated.

          • Cosmic Cleric@lemmy.world · 1 year ago

            If they had their way, they would lock up much more of the net behind paywalls.

            This!

            When the Internet was first a thing corpos tried to put everything behind paywalls, and we pushed back and won.

            Now, the next generation is advocating to put everything behind a paywall again?

          • scarabic@lemmy.world · 1 year ago

            First, we don’t have to make AI.

            Second, it’s not about it being unable to learn; it’s about the fact that they aren’t paying the people who are teaching it.

              • AncientMariner@lemmy.world · 1 year ago

                Humans can judge information, make decisions based on it, and adapt it. AI mostly just looks at what is statistically most likely based on training data. If only one piece of data exists, it will copy, not paraphrase. An example was from Copilot, I think, where it just printed out the code and comments from an old game verbatim (Quake 2, I think). It isn’t intelligence, it is statistical copying.

                • uis@lemmy.world · 1 year ago

                  Well, mathematics cannot be copyrighted. In most countries at least.

    • SCB@lemmy.world · 1 year ago

      Leftists hating on AI while dreaming of post-scarcity will never not be funny

  • ClamDrinker@lemmy.world · 1 year ago

    This is just OpenAI covering their ass by attempting to block the most egregious and obvious outputs in legal gray areas, something they’ve been doing for a while, which is why their AI models are known to be heavily censored. I wouldn’t call that ‘hiding’. It’s kind of hard to hide that it was trained on copyrighted material, since that’s common knowledge, really.

  • Technoguyfication@lemmy.ml · 1 year ago

    People are acting like ChatGPT is storing the entire Harry Potter series in its neural net somewhere. It’s not storing or reproducing text in a 1:1 manner from the original material. Certain material, like very popular books, has likely been ingested tens of thousands of times because of how often it was reposted online (and therefore how many times it appeared in the training data).

    Just because it can recite certain passages almost perfectly doesn’t mean it’s redistributing copyrighted books. How many quotes do you know perfectly from books you’ve read before? I would guess quite a few. LLMs are doing the same thing, but on mega steroids with a nearly limitless capacity for information retention.
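
    A toy illustration of that idea, using a simple word-level Markov chain instead of a real transformer: when one passage is duplicated many times in the training text, the learned statistics reproduce it almost verbatim, even though the text isn’t stored as a file anywhere.

    ```python
    import random
    from collections import defaultdict

    # Toy corpus: one sentence is heavily duplicated (like a famous line reposted
    # all over the web); the others appear once.
    corpus = ["the boy who lived had a scar"] * 50 + [
        "the boy who cried wolf was ignored",
        "a scar is just a memory of a wound",
    ]

    # "Train": for each word, count which words tend to follow it.
    follow_counts = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        words = sentence.split()
        for a, b in zip(words, words[1:]):
            follow_counts[a][b] += 1

    def generate(start: str, length: int = 6) -> str:
        """Sample a continuation word by word from the learned statistics."""
        out = [start]
        for _ in range(length):
            options = follow_counts.get(out[-1])
            if not options:
                break
            words, weights = zip(*options.items())
            out.append(random.choices(words, weights=weights, k=1)[0])
        return " ".join(out)

    print(generate("the"))  # Very often reproduces the duplicated sentence.
    ```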

    • Hup!@lemmy.world · 1 year ago

      Nope, people are just acting like ChatGPT is making commercial use of the content. Knowing a quote from a book isn’t copyright infringement; selling that quote is. Also, the content doesn’t need to be stored 1:1 somewhere for it to be infringement; that misses the point. If you’re making money off a synopsis you wrote based on imperfect memory and in your own words, it’s still copyright infringement unless you sign a licensing agreement with JK. Even transforming what you read into a different medium, like a painting or poetry, can infringe the original author’s copyright.

      Now mull that over and tell us what you think about modern copyright laws.

      • Ronath@lemmy.world · 1 year ago

        Just adding that, outside of Rowling, who I believe has a different contract than most authors due to the expanded Wizarding World and Pottermore, most authors cannot quote their own novels online, because that would be publishing part of the novel digitally, and that’s a right they’ve sold to their publisher. The publisher usually ignores this as it creates hype for the work, but authors are careful not to abuse it.

    • abbotsbury@lemmy.world · 1 year ago

      but on mega steroids with a nearly limitless capacity for information retention.

      That sounds like redistributing copyrighted books

  • RadialMonster@lemmy.world · 1 year ago

    What if they scraped a whole lot of the internet, and those excerpts were in random blogs and posts and quotes and memes all over the place? They didn’t ingest the material directly, or knowingly.

    • beetus@sh.itjust.works · 1 year ago

      Not knowing something is a crime doesn’t stop you from being prosecuted for committing it.

      It doesn’t matter if someone else is sharing copyrighted works and you don’t know it; if you use them in ways that infringe on that copyright, it’s still infringement.

      “I didn’t know that was copyrighted” is not a valid defence.

      • stewsters@lemmy.world · 1 year ago

        Is reading a passage from a book actually a crime though?

        Sure, you could try to regenerate the full text from quotes you read online, much like you could open a lot of video reviews and recreate larger portions of the original text, but you would not blame the video editing program for that; you would blame the one who did it and decided to post it online.

    • chemical_cutthroat@lemmy.world · 1 year ago

      That’s why this whole argument is worthless, and why I think that, at its core, it is disingenuous. I would be willing to bet a steak dinner that a lot of these lawsuits are just fishing for money, and the rest are set up by competition trying to slow the market down because they are lagging behind. AI is an arms race, and it’s growing so fast that if you got in too late, you are just out of luck. So companies that want in are trying to slow down the leaders at best, and at worst they are trying to make them publish their training material so they can just copy it. AI training models should be considered IP and should be protected as such. It’s like trying to get the Colonel’s secret recipe by saying that all the spices that were used have been used in other recipes before, so it should be fair game.

      • Kujo@lemm.ee · 1 year ago

        If training models are considered IP, then shouldn’t we allow other training models to view and learn from the competition? If learning from other IP that is copyrighted is okay, why should the training models be treated differently?

        • chemical_cutthroat@lemmy.world · 1 year ago

          They are allegedly learning from copyrighted material; there is no actual proof that they have been trained on the actual material rather than just snippets that have been published online. And it would be illegal for them to be trained on the full copyrighted materials, because those are protected by laws that prevent that.

  • scarabic@lemmy.world · 1 year ago

    One of the first things I ever did with ChatGPT was ask it to write some Harry Potter fan fiction. It wrote a short story about Ron and Harry getting into trouble. I never said the word McGonagall, and yet she appeared in the story.

    So yeah, case closed. They are full of shit.

    • PraiseTheSoup@lemm.ee · 1 year ago

      There is enough non-copyrighted Harry Potter fan fiction out there that it would not need to be trained on the actual books to know all the characters. While I agree they are full of shit, your anecdote proves nothing.

      • Cosmic Cleric@lemmy.world · 1 year ago

        While I agree they are full of shit, your anecdote proves nothing.

        Why? Because you say so?

        He brings up a valid point; it seems transformative.

        • LittleLordLimerick@lemm.ee · 1 year ago

          The anecdote proves nothing because the model could potentially have known of the McGonagall character without ever being trained on the books, since that character appears in a lot of fan fiction. So their point is invalid and their anecdote proves nothing.

          • Cosmic Cleric@lemmy.world · 1 year ago

            I was questioning how much non-copyrightable material was available to train an AI on.

            It’s not a brain-dead question just because you may disagree with it.

            • GroggyGuava@lemmy.world · 1 year ago

              Which he literally answers in the comment you questioned him on. You asked him something after he had already explained it.

              That’s brain-dead, and not because I “disagree” with your question, whatever that means.

              • Cosmic Cleric@lemmy.world · 1 year ago

                I wasn’t agreeing with him; I was asking him to back up what he said. But you carry on, Internet Warrior.

  • Jat620DH27@lemmy.world · 1 year ago

    I thought everyone knew that OpenAI has the same access to books and knowledge that human beings have.

    • benni@lemmy.world · 1 year ago

      Yeah, but if you wanna act out the contents of the book and sell it as a movie, you need to buy the rights.

      • nednobbins@lemmy.world · 1 year ago

        Yes but there’s a threshold of how much you need to copy before it’s an IP violation.

        Copying a single word is usually only enough if it’s a neologism.
        Two matching words in a row usually isn’t enough either.
        At some point it is enough though and it’s not clear what that point is.

        On the other hand, it can still be considered an IP violation if there are no exact word matches but the result seems sufficiently similar.

        Until now we’ve basically asked courts to step in and decide where the line should be on a case-by-case basis.

        We never set the level of allowable copying to 0, we set it to “reasonable”. In theory it’s supposed to be at a level that’s sufficient to, “promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.” (US Constitution, Article I, Section 8, Clause 8).

        Why is it that with AI we take the extreme position of thinking that an AI that makes use of any information from humans should automatically be considered to be in violation of IP law?
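
        For what it’s worth, “how much was copied” can at least be measured mechanically, even if the legal threshold can’t. A rough sketch (the 5-word window is an arbitrary illustration, not a legal standard):

        ```python
        def shared_ngrams(a: str, b: str, n: int = 5) -> float:
            """Fraction of n-word runs in `a` that also appear word-for-word in `b`."""
            def ngrams(text: str) -> set:
                words = text.lower().split()
                return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
            a_grams = ngrams(a)
            return len(a_grams & ngrams(b)) / len(a_grams) if a_grams else 0.0

        # Two matching words in a row score nothing; a lifted sentence scores high.
        print(shared_ngrams("the boy who lived came to town",
                            "the boy who lived"))  # 0.0 (no shared 5-word run)
        print(shared_ngrams("mr and mrs dursley of number four privet drive",
                            "mr and mrs dursley of number four privet drive were proud"))  # 1.0
        ```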

        • assassin_aragorn@lemmy.world · 1 year ago

          Making use of the information is not a violation; making use of that information to turn a profit is. AI software that is completely free for the masses without any paid upgrades can look at whatever it wants. As soon as a corporation is making money on it, though, it’s in violation and needs to pay up.

          • Corkyskog@sh.itjust.works · 1 year ago

            It’s also just not fair, unless you’re going to rule that nothing an AI produces can be copyrighted. Otherwise some billionaire could just flood the office with copyright requests and copyright everything… hell, if they really did that they would probably convince the government to let them hire outside contractors for free to speed up the process…

          • nednobbins@lemmy.world · 1 year ago

            Is that intended as a legal or moral position?

            As far as I know, the law doesn’t care much if you make money off of IP violations. There are many cases of individuals getting hefty fines for both the personal use and free distribution of IP. I think if there is commercial use of IP the profits are forfeit to the IP holder. I’m not a lawyer though, so don’t bank on that.

            There’s still the initial question too. At present, we let the courts decide if the usage, whether profitable or not, meets the standard of IP violation. Artists routinely take inspiration from one another and sometimes they take it too far. Why should we assume that AI automatically takes it too far and always meets the standard of IP violation?

          • GroggyGuava@lemmy.world · 1 year ago

            Idk, that feels like saying that as soon as you sell the skills you learned on YouTube, you should have to start paying the people you learned from, since you’re “using” their copyrighted material to turn a profit.

            I don’t agree whatsoever that copyright extends to inspiration of other artists/data models. Unless they recreate what you’ve made in a sufficiently similar manner, they haven’t copied you.

        • Cosmic Cleric@lemmy.world · 1 year ago

          Why is it that with AI we take the extreme position of thinking that an AI that makes use of any information from humans should automatically be considered to be in violation of IP law?

          Luddites throwing their sabots into the machinery.

      • LordShrek@lemmy.world · 1 year ago

        Yes, but that’s a different situation. With the LLM, the issue is that the text from copyrighted books is influencing the way it speaks. This is the same with humans.

    • Touching_Grass@lemmy.world · 1 year ago

      Mods, remove this comment, as this instance no longer tolerates discussions of piracy. We went through this last week.