I fucked with the title a bit. What I linked to was actually a Mastodon post linking to the actual thing. But in my defense, I found it because Cory Doctorow boosted it, so, in a way, I am providing the original source here.
Please argue. Please do not remove.
I think we should have a rule that says if an LLM company invokes fair use for its training inputs, then the outputs are public domain.
The outputs are not copyrightable.
But something not being copyrightable doesn’t necessarily mean it is openly distributed.
It does mean OpenAI can’t really restrict or go after other companies that train on GPT-4 outputs, though, and that kind of training is already happening broadly.
Not just the outputs, but the models as well.