This new data poisoning tool lets artists fight back against generative AI

ElectroVagrant@lemmy.world · 2 years ago

This new data poisoning tool lets artists fight back against generative AI

Margot Robbie@lemmy.world · 2 years ago

It’s made by Ben Zhao? You mean the “anti AI plagiarism” UChicago professor who illegally stole GPLv3 code from an open source program called DiffusionBee for his proprietary Glaze software (reddit link), and when pressed, only released the code for the “front end” while still being in violation of GPL?

The Glaze tool that promised to be invisible to the naked eye, but contained obvious AI-generated artifacts? The same Glaze that reddit defeated in like a day after release?

Don’t take anything this grifter says seriously, I’m surprised he hasn’t been suspended for academic integrity violation yet.

ElectroVagrant@lemmy.world · 2 years ago

Thanks for added background! I haven’t been monitoring this area very closely so wasn’t aware, but I’d have thought a publication that has been would then be more skeptical and at least mention some of this, particularly highlighting disputes over the efficacy of the Glaze software. Not to mention the others they talked to for the article.

Figures that in a space rife with grifters you’d have ones for each side.

Zeth0s@lemmy.world · edit-2 2 years ago

Don’t worry, it is normal.

People don’t understand AI. Probably all articles I have read on it by mainstream media were somehow wrong. It often feels like reading a political journalist discussing quantum mechanics.

My rule of thumb is: always assume that the articles on AI are wrong. I know it isn’t nice, but that’s the sad reality. Society is not ready for AI because too few people understand AI. Even AI creators don’t fully understand AI (this is why you often hear about “emergent abilities” of models, it means “we really didn’t expect it and we don’t understand how this happened”)

ElectroVagrant@lemmy.world · edit-2 2 years ago

Probably all articles I have read on it by mainstream media were somehow wrong. It often feels like reading a political journalist discussing quantum mechanics.

Yeah, I view science/tech articles from sources without a tech background this way too. I expected more from this source given that it’s literally MIT Tech Review, much as I’d expect more from other tech/science-focused sources, albeit I’m aware those require scrutiny just as well (e.g. Popular Science, Nature, etc. have spotty records from what I gather).

Also regarding your last point, I’m increasingly convinced AI creators (or at least their business execs/spokespeople) are trying to have their cake and eat it too in terms of how much they claim to not know/understand how their creations work while also promoting how effective they are. On one hand, they genuinely don’t understand some of the results, but on the other, they do know enough of how it works to have an idea of how/why those results came about. However, it’s to their advantage to pretend they don’t, insofar as it may mitigate their liability/responsibility should the results lead to collateral damage/legal issues.

joel_feila@lemmy.world · 2 years ago

By that logic humanity isn’t ready for personal computers since few understand how they work.

Zeth0s@lemmy.world · edit-2 2 years ago

Kind of true. Check the law proposals on encryption around the world…

Technology is difficult, most people don’t understand it, result is awful laws. AI is even more difficult, because even creators don’t fully understand it (see emergent behaviors, i.e. capabilities that no one expected).

Computers luckily are much easier. A random teenager knows how to build one, and what it can do. But you are right, many are not yet ready even for computers

joel_feila@lemmy.world · 2 years ago

I read an article the other day about managers complaining about zoomers not even knowing how to type on a keyboard.

P03 Locke@lemmy.dbzer0.com · 2 years ago

who illegally stole GPLv3 code from an open source program called DiffusionBee for his proprietary Glaze software (reddit link), and when pressed, only released the code for the “front end” while still being in violation of GPL?

Oh, how I wish the FSF had more of their act together nowadays and were more like the EFF or ACLU.

Margot Robbie@lemmy.world · 2 years ago

You should check out the decompilation they did on Glaze too; apparently it’s hard-coded to throw out a fake error upon detecting that it’s being run on an A100, as some sort of anti-adversarial training measure.

Dadifer@lemmy.world · 2 years ago

Thank you, Margot Robbie! I’m a big fan!

Margot Robbie@lemmy.world · 2 years ago

You’re welcome. Bet you didn’t know that I’m pretty good at tech too.

Also, that’s Academy Award nominated character actress Margot Robbie to you!

Blaster M@lemmy.world · 2 years ago

Oh no, another complicated way to jpeg an image that an ai training program will be able to just detect and discard in a week’s time.

egeres@lemmy.world · 2 years ago

Here’s the paper: https://arxiv.org/pdf/2302.04222.pdf

I find it very interesting that someone went in this direction to try to find a way to mitigate plagiarism. This is very akin to adversarial attacks in neural networks (you can read more in this short review https://arxiv.org/pdf/2303.06032.pdf)

I saw some comments saying that you could just build an AI that detects poisoned images, but that wouldn’t be feasible with a simple NN classifier or feature-based approaches. This technique changes the artist’s style itself to something the AI would see differently in the latent space, yet visually it’s perceived as the same image. So if you’re changing to a different style the AI has learned, it’s fair to assume it will be realistic and coherent. Although maaaaaaaybe you could detect poisoned images with some dark magic tho: get the targeted AI, then analyze the latent space to see if the image has been tampered with

On the other hand, I think if you build more robust features and just scale the data, these problems might go away with more regularization in the network. Plus, it assumes you have the target of one AI generation tool. There are a dozen of these, and if someone trains with a few more images in a cluster, that’s it, you shifted the features and the poisoned images are invalid
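
To make the “different in latent space, same to the eye” idea concrete, here is a minimal toy sketch. It is not the method from the paper: the linear “encoder” `W`, the function names, and the `eps`/`lr`/`steps` values are all made up for illustration. It just runs projected gradient descent so the features drift toward a target while every pixel stays within `eps` of the original.

```python
def features(x, W):
    """Toy linear 'encoder' (an assumption): feature vector W @ x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def cloak(x, target_feat, W, eps=0.05, lr=0.01, steps=200):
    """Nudge x so features(x) approaches target_feat, with each pixel
    kept within eps of its original value (projected gradient descent)."""
    adv = list(x)
    for _ in range(steps):
        resid = [f - t for f, t in zip(features(adv, W), target_feat)]
        # Gradient of ||W adv - t||^2 with respect to adv is 2 * W^T * resid.
        grad = [2 * sum(W[i][j] * resid[i] for i in range(len(W)))
                for j in range(len(adv))]
        adv = [a - lr * g for a, g in zip(adv, grad)]
        # Project back into the eps-box around the original pixels.
        adv = [min(max(a, xo - eps), xo + eps) for a, xo in zip(adv, x)]
    return adv


# Hypothetical 4-pixel "image" and 2-feature encoder.
W = [[1.0, -1.0, 0.0, 0.0], [0.0, 0.0, 1.0, -1.0]]
x = [0.5, 0.5, 0.5, 0.5]
target = [1.0, 1.0]
adv = cloak(x, target, W)
```

Note that the perturbation is computed against one specific `W`. A model with a different feature extractor would see a different shift, which is the same caveat as above about targeting a single generation tool.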

nandeEbisu@lemmy.world · 2 years ago

Haven’t read the paper so not sure about the specifics, but if it relies on subtle changes, would rounding color values or downsampling the image blur that noise away?
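
For reference, the two operations in question look roughly like this. It is a pure-Python sketch on a grayscale image stored as nested lists; the function names, the `step` of 8, and the `factor` of 2 are arbitrary choices for illustration. It wipes out the tiny checkerboard noise in this toy case, but that says nothing about whether a perturbation optimized to survive such preprocessing would also be removed.

```python
def quantize(pixels, step=8):
    """Round each 0-255 value to the nearest multiple of `step`."""
    return [[min(255, (v + step // 2) // step * step) for v in row]
            for row in pixels]


def downsample(pixels, factor=2):
    """Average non-overlapping factor x factor blocks (a simple box filter)."""
    h = len(pixels) - len(pixels) % factor
    w = len(pixels[0]) - len(pixels[0]) % factor
    out = []
    for y in range(0, h, factor):
        row = []
        for x in range(0, w, factor):
            block = [pixels[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


# Toy data: a flat gray patch plus +/-2 checkerboard "noise".
clean = [[96] * 4 for _ in range(4)]
noisy = [[96 + (2 if (x + y) % 2 == 0 else -2) for x in range(4)]
         for y in range(4)]

same_after_rounding = quantize(noisy) == clean
small = downsample(noisy)
```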

RubberElectrons@lemmy.world · edit-2 2 years ago

Removed by mod

MamboGator@lemmy.world · edit-2 1 year ago

deleted by creator

ElectroVagrant@lemmy.world · 2 years ago

Until the law catches up with the technology, people need ways of protecting themselves.

I agree, and I wonder if the law might be kicked into catching up quicker as more companies try to adopt these tools and inadvertently infringe on other companies’ copyrighted material. 😅

regbin_@lemmy.world · edit-2 2 years ago

Disagree. It’s only unethical if you use it to generate the artist’s existing pieces and claim it as yours.

MamboGator@lemmy.world · edit-2 1 year ago

deleted by creator

Vodik_VDK@lemmy.world · 2 years ago

New CAPTCHA just dropped.

wizardbeard@lemmy.dbzer0.com · 2 years ago

This is already a concept in the AI world and is often used while a model is being trained specifically to make it better. I believe it’s called adversarial training or something like that.

Mango@lemmy.world · 2 years ago

No, that’s something else entirely. Adversarial training is where you put an ai against a detector AI as a kind of competition for results.

afraid_of_zombies@lemmy.world · 2 years ago

I am waiting for the day that some obsessed person starts finding ways to do like code injection in pictures.

𝕽𝖔𝖔𝖙𝖎𝖊𝖘𝖙@lemmy.world · 2 years ago

Here ya go

gregorum@lemm.ee · 2 years ago

Ooo, this is fascinating. It reminds me of that weird face paint that bugs out facial-recognition in CCTV cameras.

seaQueue@lemmy.world · edit-2 2 years ago

Or the patterned vinyl wraps they use on test cars that interfere with camera autofocus.

penix@sh.itjust.works · edit-2 2 years ago

Removed by mod

Meowoem@sh.itjust.works · 2 years ago

It doesn’t even need a work around, it’s not going to affect anything when training a model.

It might make style transfer harder using them as reference images on some models but even that’s fairly doubtful, it’s just noise on an image and everything is already full of all sorts of different types of noise.

hh93@lemm.ee · 2 years ago

The problem is identifying it. If it’s necessary to preprocess every image used for training instead of just feeding it to a model, that already makes it much more resource-intensive

RVMWSN@lemmy.ml · edit-2 2 years ago

deleted by creator

Ataraxia@sh.itjust.works · 2 years ago

As an artist I agree. People are being so irrational with this.

ElectroVagrant@lemmy.world · 2 years ago

I generally don’t believe in intellectual property, I think it creates artificial scarcity and limits creativity. Of course the real tragedies in this field have to do with medicine and other serious business.

But still, artists claiming ownership of their style of painting is fundamentally no different. Why can’t I paint in your style? Do you really own it? Are you suggesting you didn’t base your idea mostly on the work of others, and no one in turn can take your idea, be inspired by it and do with it as they please? Do my means have to be a pencil, why can’t my means be a computer, why not an algorithm?

Limitations, limitations, limitations. We need to reform our system and make the public domain the standard for ideas (in all their forms). Society doesn’t treat artists properly, I am well aware of that. Generally creative minds are often troubled because they fall outside norms. There are many tragic examples. Also money-wise many artists don’t get enough credit for their contributions to society, but making every idea a restricted area is not the solution.

People should support the artists they like on a voluntary basis. Pirate the album but go to concerts, pirate the artwork but donate to the artist. And if that doesn’t make you enough money, that’s very unfortunate. But make no mistake: that’s how almost all artists live. Only the top 0.something% actually make enough money by selling their work, and that’s usually the percentile that’s best at marketing their art, in other words: it’s usually the industry. The others already depend upon donations or other sources of income.

We can surely keep art alive while still removing all these artificial limitations. Copying is not, was not, and never will be in any way similar to stealing. Let freedom rule. Join your local pirate party.

Reformatted for easier readability.

aesthelete@lemmy.world · edit-2 2 years ago

deleted by creator

TropicalDingdong@lemmy.world · 2 years ago

The AI can have some NaN, as a treat…

Smoogs@lemmy.world · 2 years ago

As a topping on some Pi

zwaetschgeraeuber@lemmy.world · 2 years ago

this is so dumb and it’s clear it won’t work at all. that’s not in the slightest how ai trains on images.

you would be able to get around this tool by just doing the nft thing and screenshotting the image, and boom, the code in the picture is erased.