Quite frequently I come across scanned books that are viewable for free online. For example, a publisher may have put them there (such as preview chapters), or a library (old books from its collection that are in the public domain), etc. Since I like hoarding data, and the online viewers used to present the books might not be very practical, I frequently try to download the books one way or another. This requires toying with the “inspect element” tool and various other methods of getting the images/PDF. Now, all that I access is what is, well, accessible; I don’t hack into the servers or anything. But the stuff is meant to be hidden from the normal user. Does that act of hiding the material, no matter how primitive and easily circumvented, mean that I’m not allowed to access it at all?

I suppose ripping a public domain book is no big deal, but would books under copyright fare differently?

Mainly I’m asking out of curiosity, I don’t expect the police to come visit me for ripping a 16th century dictionary.

Note: I live in the EU, but I’d be curious to hear how this is treated elsewhere too.

Edit: I also remembered a funny trick I noticed on one site: it allows viewing PDFs on the website, but not downloading them, unless you pay for the PDF. But when you load the page, even without paying, the PDF is already downloaded onto your computer and can be found in the browser cache. Is it legal to simply save the file that is already on your computer?

  • simple@lemm.ee · ↑37 · 3 months ago

    AFAIK web scraping (the act of grabbing and downloading any data you see available on the internet) isn’t illegal, and I would assume downloading PDFs provided to you online would fall under that. Since it is copyrighted it would probably be illegal to share it, though.

    • nvermind@lemm.ee · ↑16 ↓1 · edited · 3 months ago

      This. In a case involving LinkedIn, US courts ruled that it’s legal to scrape publicly available data. The company doing the scraping was selling that data to corporate customers, but ultimately use might depend on the information you’re accessing and under what permissions. (Not a lawyer)

      • papalonian@lemmy.world · ↑1 · 3 months ago

        If you scraped a pirate site and stored a bunch of links to copyrighted content, you’d probably be fine; actually using those links to download or share copyrighted content is what’s illegal. It’d be like buying the stuff to make a bomb or drugs, but then not making any bombs or drugs.

        That being said, while not necessarily illegal, I wouldn’t want authorities to find my bomb and drug ingredients, or my scraped piracy links, as I’d probably have some 'splainin to do.

        (Not a lawyer)

          • papalonian@lemmy.world · ↑1 · 3 months ago

            > Who said that you can’t scrape content from the other place?

            > If you scraped a pirate site and stored a bunch of links to copyrighted content you’d probably be fine,

            If you’re referring to the last line, I say I wouldn’t want authorities to find it because I don’t want to have to explain it. I’m 99% sure someone would not just store links to a bunch of pirated content for fun; they have probably accessed said pirated content, and now they have to explain to the authorities why they have links to pirated content without implicating themselves in copyright infringement.

            Like I said, probably fine, I just wouldn’t want the hassle if I somehow got caught.

              • papalonian@lemmy.world · ↑1 · 3 months ago

                Sorry man, I’m not exactly sure what you’re asking.

                If you are able to load the content on your computer without infringing copyright laws, you’re allowed to circumvent whatever the website has in place to store whatever data you would like from whatever website you would like, regardless of the nature of the site, so long as the content is legal (is not CP) and again not being presented in a way that infringes aforementioned copyright laws.

                If you’re asking why the copyright laws exist, I can’t really help you with that one.

  • Vipsu@lemmy.world · ↑37 ↓3 · 3 months ago

    According to big tech, it’s OK if you’re training a large language model with it.

    • lugal@lemmy.world · ↑8 · 3 months ago

      You’re confusing the law that applies to the ruling class with the one that applies to common people.

      • Mango@lemmy.world · ↑2 · 3 months ago

        There’s a law for the ruling class? I always figured they gotta just cut their political buddies in.

      • SlothMama@lemmy.world · ↑3 · 3 months ago

        Unironically yes, you would not know who Spiderman was without viewing a copyrighted work demonstrating what he looks like, and now you understand why generative AI fundamentally has to ingest copyrighted works.

  • slazer2au@lemmy.world · ↑19 · 3 months ago

    As with everything with the law, it depends.

    In Australia, distribution is the illegal part, seeding/sharing is where they get you. Not the actual download itself.

  • ulterno@lemmy.kde.social · ↑18 ↓1 · edited · 3 months ago

    > viewable for free online

    If you are viewing it on your computer, you have already downloaded it.
    Don’t let anyone tell you otherwise.

    > already downloaded onto your computer and can be found in the browser cache

    Exactly.

  • The_v@lemmy.world · ↑7 · 3 months ago

    Not an expert, but in the U.S. making a copy of a broadcast for personal use is legal under fair-use. Anything that loads up on your computer screen you can make a copy and save it for personal use. So screen captures are by definition legal.

    How exactly you copy the material on your screen gets tricky under the DMCA clusterfuck. Breaking encryption to copy the material is illegal unless there is a valid exception for fair use. What exactly those valid exceptions are is above my paygrade.

  • Etterra@lemmy.world · ↑6 · 3 months ago

    It might be illegal to post it without permission, but you can download it all you damn well please and they can’t stop you. Unless it’s like government top secret something or other. In that case you probably don’t want it anywhere near your computer and should probably tell somebody where you found it.

      • steeznson@lemmy.world · ↑2 · 3 months ago

        Astonishing listening to the news coverage of that story where the anchors were reading some terminally online nonsense from the teleprompter about Discord “Thug Shakers”

    • ulterno@lemmy.kde.social · ↑5 · 3 months ago

      > should probably tell somebody where you found it

      Somebody, as in your lawyer. Who can then inform the correct authorities, while making sure you don’t become their scapegoat.

    • accideath@lemmy.world · ↑2 · 3 months ago

      That sadly isn’t true everywhere. Here in Germany (and I suspect large parts of the EU) downloading/streaming copyrighted content without a license used to be a grey area but has been completely illegal for a few years now.

      Of course, VPNs are perfectly legal.

  • orcrist@lemm.ee · ↑2 · 3 months ago

    If something is in the public domain, there is no copyright covering it, so you should make as many copies as you feel like. Many public domain books are posted on the Internet Archive, where you can easily download them in various formats. Then you won’t have to work hard to get the data. Public domain artwork, likewise, is often available on Wikimedia Commons.

  • oxjox@lemmy.ml · ↑1 · 3 months ago

    Digital tools such as those you’ve described could be used by the service to manage access to content. A book’s author or publisher may object to the book being available for free. There may be limits on the amount of time you can read a book. Some content may be public domain, but there may be versions of that content which the publisher has altered in some way, making some portions of the book not public domain.

    Knowingly possessing something that was not freely provided to you or the public by the licensed owner, or otherwise known to be unprotected by copyright, is not legal. Just because a file is cached on your device does not mean you are the legal owner of that content forever.

    There’s a number of reasons you may be charged to download a pdf. It could be a means of legally granting ownership and sharing revenue with the content owner. It could be because the authorized provider of the content is simply charging to maintain the service you’ve acquired the content from. It could be both or it could be a sketchy website just trying to get your CC info.

    This is coming from the perspective of someone in the US. I’m not sure about the rest of the world, but I imagine basic copyright laws are similar around the world.

    • antonim@lemmy.dbzer0.com (OP) · ↑1 · 3 months ago

      Honestly much of your reply is confusing me and doesn’t seem to be relevant to my questions. This is what I think is crucial:

      > Just because a file is cached on your device does not mean you are the legal owner of that content forever.

      What does being “the legal owner forever” actually entail, either with regards to a physical book or its scan? And what does that mean regarding what I can legally do with the cached file on my computer?

      • oxjox@lemmy.ml · ↑2 · 3 months ago

        If you have legally obtained something, you have agreed to the terms of ownership with the provider / owner / creator of the content. Whether you find a document on your computer or you have paid for it, it does not explicitly give you full ownership of that data forever.

        For example: if you buy a DVD from a store, you’re actually purchasing a license to watch the content of that DVD. If you were to give or sell that disc to someone else, you are transferring your permission to watch that disc to them. So, if you were to rip that movie to your computer, legally, you only have permission to watch that for as long as you are in possession of that physical media.
        Conversely, if you were to “buy” a movie from an online platform, they may revoke your right to watch that movie if the publisher of that content (or a government agency) no longer permits them to stream that content to you. If you were to download that movie, that does not change the agreement you made with the service to watch it. This is why it’s not possible to save an iTunes video purchase to your computer in a non-encrypted format.

        In other words, you’ve got to read the terms and conditions. Even then, they may change the terms and conditions of the agreement.

        • The_v@lemmy.world · ↑2 · 3 months ago

          Terms and conditions are NOT copyright law. They are a separate agreement that is the company’s “wishlist” of things they want the consumer to agree to. It’s common for them to spell out terms in direct conflict with copyright law.

          The reason that an iTunes video purchase is encrypted is because it is illegal to break the encryption in order to make a copy (DMCA). However, capturing the playback and transforming it to another medium for personal use is fair use.

          There is also no time limit on how long a person can keep the copy, as long as they had legal access to the content at the time of making it. For example, say I recorded a football game from a streaming service. I can save that copy for personal use for the rest of my life even though I purchased a one-time-only stream.

          • oxjox@lemmy.ml · ↑1 · 3 months ago

            Sure. Regardless, their terms and conditions should give you some idea of how they’re using technology to permit and/or restrict access.

            > The reason that an iTunes video purchase is ~~encrypted~~ illegal to copy is because it is illegal to break the encryption in order to make a copy

            FTFY

            I don’t think content providers are encrypting things because it’s illegal to decrypt things. They’re encrypting things because the content producers (movie studios) want to ensure that (1) they’re getting paid for the content, (1B) it’s not given away for free and, (2) they’re in business to make money.

            To my knowledge, there are no laws about making copies. Breaking encryption is illegal because the encryption itself is protected under law. Selling copies is illegal. Playing copies of something for which you are not permitted or do not legally own a license to watch is illegal. So, if you make a copy of a cassette tape, legal; profiting from that copy, illegal.

            Copyright law is not contract law.

            Some items have time limits - such as renting a movie from iTunes or Amazon or borrowing a book from a physical or digital library. You are entering a contract with the provider where they grant you temporary access to something. If you were to make a copy of something you were given temporary access to, you are breaking the contract.

            I don’t know what the agreement is for football organizations or your content provider. If you’re breaking broadcast or HDMI encryption to record a stream, that’s illegal. If you’re somehow bypassing encryption, that is probably legal. I do know that it’s illegal to re-broadcast the content in public and to resell that program. There are also some fair use rules (in the US) which permit limited use for commentary and education purposes.