• JohnWorks@sh.itjust.works
    arrow-up
    59
    ·
    2 days ago

    I don’t really understand the grounds for a suit here, mostly because:

    1. It’s publicly available information. They may “own” it, but anyone can access it for free, right?

    2. Didn’t AI training companies already get away with using copyrighted IP to train their models? Ex: movies, TV shows, etc.

  • RedditAdminsSuckIt@lemmy.world
    arrow-up
    32
    ·
    2 days ago

    In their TOS, they own anything you post there. Didn’t they sell or scrape data of all its users in the recent past when they changed their TOS?

    They’re guilty of the same shit

    • wizardbeard@lemmy.dbzer0.com
      English
      arrow-up
      24
      ·
      2 days ago

      That’s the basis for this lawsuit though. Reddit adjusted its ToS to forbid anyone but its explicitly approved business partners from scraping Reddit data.

      I believe Google is the only company legally allowed to scrape Reddit data for AI training usage. Anthropic isn’t.

      • FaceDeer@fedia.io
        arrow-up
        20
        ·
        2 days ago

        Did Anthropic accept the ToS? Reddit’s publishing their information on a public website that anyone can visit and read without agreeing to any terms. If they didn’t accept the ToS then the only thing regulating what you can do with that public information is the usual copyright. AI training has yet to be shown to be a violation of copyright.

  • ikidd@lemmy.world
    English
    arrow-up
    15
    ·
    2 days ago

    They just want to make sure the users that make up their content are properly compensated, right?

    Right, motherfuckers?!?

    • blargle@sh.itjust.works
      English
      arrow-up
      3
      ·
      2 days ago

      They want to make sure the other companies they already sold out their users to for AI content scraping are properly compensated.

  • _Momo_@discuss.tchncs.de
    English
    arrow-up
    9
    arrow-down
    1
    ·
    2 days ago

    For people leaving Reddit: you need to scrub your posts before you delete your account. Replacing them with random nonsense is even better.

    If you delete the account first, your posts stay; only your username is changed to “deleted”.

    Scrub, then delete. There is software for this also, but I am not familiar enough to suggest something.

    For anyone familiar with “pillage, and then burn” same idea here.
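The ordering described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeAccount`, `random_nonsense`, and `scrub_then_delete` are hypothetical names, and the account is an in-memory stand-in rather than Reddit's actual API.

```python
import random
import string


def random_nonsense(rng, words=8):
    """Return a short string of random lowercase 'words'."""
    return " ".join(
        "".join(rng.choice(string.ascii_lowercase) for _ in range(rng.randint(3, 9)))
        for _ in range(words)
    )


class FakeAccount:
    """Hypothetical in-memory stand-in for an account; not a real API client."""

    def __init__(self, comments):
        self.comments = dict(comments)  # comment id -> text
        self.deleted = False

    def edit_comment(self, comment_id, new_text):
        # Once the account is deleted, its comments can no longer be edited.
        if self.deleted:
            raise RuntimeError("account is gone; comments can no longer be edited")
        self.comments[comment_id] = new_text

    def delete_account(self):
        self.deleted = True


def scrub_then_delete(account, rng):
    """Overwrite every comment first, and only then delete the account."""
    for comment_id in list(account.comments):
        account.edit_comment(comment_id, random_nonsense(rng))
    account.delete_account()


account = FakeAccount({"c1": "my original comment", "c2": "another one"})
scrub_then_delete(account, random.Random(0))
```

Doing the two steps in the opposite order would raise an error in this sketch, which mirrors the point that posts can no longer be edited once the account is gone.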

    • dubyakay@lemmy.ca
      arrow-up
      6
      ·
      2 days ago

      The biggest problem is that the API and post history only go back 1000 comments. If you have ever made more than 1000 comments, the only way you are going to scramble the older ones into nonsense is if you manage to find a permalink to each.

      • Tangent5280@lemmy.world
        arrow-up
        5
        ·
        2 days ago

        Wait, really? That’s a massive issue, and I didn’t see any comments about this back in 2024 when everyone was migrating here and scrubbing their Reddit accounts.

        • phx@lemmy.ca
          arrow-up
          3
          ·
          1 day ago

          Back then, the apps doing so might have had access to it via the APIs they later shuttered.

      • topherclay@lemmy.world
        arrow-up
        3
        ·
        1 day ago

        Can you just use a web driver with Selenium or something to get the permalinks the way a human would and then scrub them that way? It’s not efficient but it’s only really a one-time use tool anyway so if it works then it works.

  • doug@lemmy.today
    English
    arrow-up
    10
    ·
    edit-2
    2 days ago

    I’m a little surprised they let me delete my account after they permabanned me, and that deleting my account deleted all of my 18 years’ worth of posts and comments (and not just my username/profile). All that data they want to train their bots on, gone (at least publicly, anyway).

    • BassTurd@lemmy.world
      arrow-up
      9
      ·
      2 days ago

      I would imagine that your records are flagged as deleted in a DB, but they are still being used to train models for those that are paying.

      It does at least stop bots from scraping, but at that point, I’d almost rather have bots scrape just to create more overhead for Reddit and to lessen the value for the people paying for premium access.

    • RedditAdminsSuckIt@lemmy.world
      arrow-up
      6
      ·
      2 days ago

      Same here. Oldest account was 12 years and I had 2 alts.

      Got permabanned on main but was still able to delete the account. Deleted my other 2 as well.

      I can still browse subs if I want to; I just can’t interact. Browsing now doesn’t mean I’ve agreed to anything.

      Oh well, these are headaches I’ll never have to worry about

    • Krudler@lemmy.world
      English
      arrow-up
      5
      arrow-down
      1
      ·
      2 days ago

      You’re not deleted and nothing is deleted.

      Ever been to an old post and it says posted by [deleted]? That’s what Reddit does.

      The only route is non-participation on the platform.

  • untakenusername@sh.itjust.works
    arrow-up
    8
    ·
    2 days ago

    AI companies should be in favor of expanding the fediverse, because we don’t have the resources to fight legal wars against them taking our posts and comments and training their stuff on it.

    just a random thought

    • ramble81@lemm.ee
      arrow-up
      7
      ·
      2 days ago

      You imply they’re not already here. All it takes is setting up a server that’s federated with common endpoints and then sucking everything in via ActivityPub. No need for scraping.
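To illustrate why no scraping is needed, here is a rough sketch of pulling text out of activities a federated server might receive. The object shapes follow the ActivityStreams vocabulary (`Create`, `Note`, `Page`), but the sample data, the `example.social` URLs, and the `extract_texts` helper are all made up for illustration, and which types a given platform actually sends is an assumption.

```python
import json

# Made-up sample of activities, shaped like ActivityStreams JSON.
SAMPLE_ACTIVITIES = json.loads("""
[
  {"type": "Create",
   "actor": "https://example.social/u/alice",
   "object": {"type": "Note", "content": "First comment"}},
  {"type": "Like",
   "actor": "https://example.social/u/bob",
   "object": "https://example.social/comment/1"},
  {"type": "Create",
   "actor": "https://example.social/u/bob",
   "object": {"type": "Page", "name": "A post title", "content": "Post body"}}
]
""")

# Object types treated as text-bearing in this sketch (an assumption).
TEXT_TYPES = ("Note", "Page", "Article")


def extract_texts(activities):
    """Collect (actor, content) pairs from Create activities carrying text."""
    texts = []
    for activity in activities:
        if activity.get("type") != "Create":
            continue  # skip likes, follows, and so on
        obj = activity.get("object")
        if isinstance(obj, dict) and obj.get("type") in TEXT_TYPES:
            texts.append((activity.get("actor"), obj.get("content", "")))
    return texts


for actor, content in extract_texts(SAMPLE_ACTIVITIES):
    print(actor, "->", content)
```

The content arrives already structured as JSON, so there is no HTML to parse and nothing that looks like a crawler hitting the site.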