Some context about this here: https://arstechnica.com/information-technology/2023/08/openai-details-how-to-keep-chatgpt-from-gobbling-up-website-data/
The robots.txt would be updated with this entry:
User-agent: GPTBot
Disallow: /
Obviously this is meaningless against non-OpenAI scrapers or anyone who just doesn’t give a shit.
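To illustrate the point: robots.txt is purely advisory. A compliant crawler checks it before fetching; a non-compliant one never reads it at all. A minimal sketch with Python's standard-library parser (the URLs and the "RudeScraper" agent name are made up for illustration):

```python
from urllib.robotparser import RobotFileParser

# The exact entry from the article above.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# A well-behaved GPTBot honors the rule and skips the site...
print(rp.can_fetch("GPTBot", "https://example.com/post/123"))       # False

# ...but any other (or dishonest) user agent is unaffected,
# because nothing on the server actually enforces the file.
print(rp.can_fetch("RudeScraper", "https://example.com/post/123"))  # True
```

The enforcement, if any, happens entirely in the scraper's own code, which is exactly why the entry only works against crawlers that choose to cooperate.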
If they’ll pay us when they scrape our content, sure.
… Is that like a non-argument? How do you suppose they would pay sites, let alone site users, to scrape their content?
Yes, that’s the point.
I think this is a general question and problem for the whole fediverse. It can easily lead to the question of whether, or even when, the fediverse is going to embrace closed, private, or invite-only spaces in order to secure some “human interaction only” social media.
That won’t stop OpenAI. We need actual blocking, on the server side. Problem is, with federation and all, it will be really, really difficult to do. And expensive.
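Server-side blocking usually means checking the User-Agent header and refusing the request. A toy sketch of that idea, assuming a hand-maintained block list (the agent names beyond GPTBot and the handler shape are illustrative, not any particular server's API), which also shows why it's fragile, since the header is trivially spoofed:

```python
# Assumed block list -- real deployments maintain and update their own.
BLOCKED_AGENTS = ("GPTBot", "CCBot", "Google-Extended")

def is_blocked(user_agent: str) -> bool:
    """True if the User-Agent matches a known AI-crawler name."""
    ua = user_agent.lower()
    return any(bot.lower() in ua for bot in BLOCKED_AGENTS)

def handle_request(headers: dict) -> int:
    """Toy handler: return HTTP 403 for blocked crawlers, 200 otherwise."""
    if is_blocked(headers.get("User-Agent", "")):
        return 403
    return 200

print(handle_request({"User-Agent": "Mozilla/5.0 GPTBot/1.0"}))  # 403
print(handle_request({"User-Agent": "Mozilla/5.0"}))             # 200
```

A scraper that simply lies about its user agent sails through, so serious blocking ends up needing published IP ranges or behavioral detection, and every federated instance would have to do it independently. Hence difficult and expensive.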
I can understand privacy concerns, but I feel like it’s inevitable that LLMs will be used to make lots of decisions, some possibly important, so wouldn’t you want some content included in its training? For instance, would you want an LLM to be ignorant of FOSS because all the FOSS sites blocked it, and then a child asks an LLM for advice on software and gets recommended Microsoft and Apple products only?
… It’s probably going to recommend paid and non-FOSS apps and programs just on the basis that those companies will probably pay to be the top suggestions, just like Google ads. So no, I don’t think that’s a good enough reason. They can still scrape wikis if they need info on FOSS, imo. Those shouldn’t (?) block AIs and other aggregators.
No