• 1 Post
  • 29 Comments
Joined 1 year ago
Cake day: June 15th, 2023


  • Using memory efficiently can give you a 10-100x win.

    Yes, it can. But why is this exclusive to assembly? What are you planning to do with your memory use in assembly that is not achievable in C++ or other languages? Memory optimizations are largely about data structures and access patterns. This is available to you in C++.
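
    To make this concrete, here is a minimal sketch of the kind of optimization I mean: summing one hot field of a large record, first with an array-of-structs layout and then with a struct-of-arrays layout. Plain C++, no assembly; the struct layout, names, and sizes are all invented for illustration.

    ```cpp
    // Array-of-structs vs struct-of-arrays: same data, very different
    // cache behavior when you only need one field. (Illustrative only;
    // the layout and counts are made up for this sketch.)
    #include <cstdint>
    #include <iostream>
    #include <vector>

    struct Particle {            // array-of-structs element: 64 bytes,
        float x;                 // but the loop below only reads these 4
        float y, z, vx, vy, vz;
        std::uint8_t misc[40];   // stand-in for rarely used state
    };

    struct Particles {           // struct-of-arrays: the hot field is contiguous
        std::vector<float> x;
    };

    int main() {
        const std::size_t n = 1'000'000;
        std::vector<Particle> aos(n, Particle{1.0f});
        Particles soa;
        soa.x.assign(n, 1.0f);

        float sum_aos = 0.0f, sum_soa = 0.0f;
        for (const Particle& p : aos) sum_aos += p.x;  // drags ~64 bytes per 4 used
        for (float v : soa.x) sum_soa += v;            // uses every byte it fetches
        std::cout << sum_aos << ' ' << sum_soa << '\n';
    }
    ```

    The second loop pulls only the bytes it actually uses through the cache, which is where wins of that magnitude typically come from, and nothing about it requires dropping below C++.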

    Also, if you don’t want 90% of the craziness of C++, then why not just code in C++ without 90% of the craziness? As far as I know, that’s what a lot of performance-critical projects do: they operate with a feature whitelist/blacklist. Don’t tell me you have the discipline to work entirely in assembly, and the knowledge to beat the compiler at the low-level stuff that is not available to you in C++, but you can’t manage to avoid the costly abstractions.

    I think it speaks volumes how rarely you hear about programs being written in assembly. It’s always this one game, and never any meaningful way to prove that it actually gains performance over being written in C++ and compiled with a modern compiler.



  • For anyone stumbling onto this who actually wants to be educated: the science is practically unanimous that climate change is mainly caused by human activity. No expert is unaware of the natural cycles that temporarily affect the climate; they are well studied, well modeled, and found to pale in comparison to human-made climate change. You can find comparisons between human and natural drivers, with sources from expert organizations and scientific studies, here and here. Funnily enough, NOAA, which this commenter used as a source for El Niño and La Niña below, also hosts this article, which literally starts by linking to a page pointing out that climate change is mostly caused by humans.



  • No, the intent and the consequences of an action are generally taken into consideration in discussions of ethics and in legislation. Additionally, this is not just a matter of ToS. What OpenAI does is create and distribute illegitimate derivative works. They are relying on the argument that what they do is transformative use, which is not really congruent with what “transformative use” has meant historically. We will see in time what the courts have to say about this. But in any case, it will not be judged the same way as a person using a tool just to skip ads. And Revanced is different from both of the above because it is non-commercial.



  • Humans are not generally allowed to do what AI is doing! You talk about copying someone else’s “style” because you know that “style” is not protected by copyright, but that is a false equivalence. An AI does not copy “style”; it copies every discernible pattern of its input. It is just as likely to copy Walt Disney’s drawing style as it is to copy the design of Mickey Mouse. We’ve seen countless examples of AIs copying characters, verbatim passages of text, and snippets of code.

    Imagine if a person copied Mickey Mouse’s character design, got sued for copyright infringement, and went to court with the defense that they downloaded copies of the original works without permission and studied them for the sole purpose of imitating them. They would be admitting that every perceived similarity is intentional. Do you think they would not be found guilty of copyright infringement?

    AI is this example taken to the extreme. It’s not just creating something similar; it is by design trying to maximize the similarity of its output to its training data. It is as uncreative as is mathematically possible. The AI’s only trick is that it threw so much stuff into its mixer of training data that you can’t generally trace the output back to a specific input. But the math is clear. And while it’s obvious that no sane person will use a copy of Mickey Mouse just because an AI produced it, the same cannot be said for characters from lesser-known works, passages from obscure books, and code snippets from small free-software projects.

    In addition to the above, we allow humans to engage in potentially harmful behavior for various reasons that do not apply to AIs.

    • “Innocent until proven guilty” is fundamental to our justice systems. The same does not apply to inanimate objects: e.g. a firearm is restricted because of the danger it poses, even if it has never been used to shoot someone. A person, by contrast, is only liable for the damage they have caused, never for their potential to cause it.
    • We care about people’s well-being. We would not ban people from enjoying art just because they might copy it; that would be sacrificing too much. However, no harm is done to an AI when it is prevented from being trained, because an AI is not a person with feelings.
    • Human behavior is complex and hard to control. A person might unintentionally copy protected elements of works that influenced them, and that’s hard to tell in most cases. An AI has the sole purpose of copying patterns, with no other input.

    For all of the above reasons, we choose to err on the side of caution when restricting human behavior, but we have no reason to do the same for AIs, or anything inanimate.

    In summary, we do not allow humans to do what AIs are doing now, and even if we did, that would not be a good argument against AI regulation.



  • I have my own backup of the git repo and I downloaded this to compare and make sure it’s not some modified (potentially malicious) copy. The most recent commit on my copy of master was dc94882c9062ab88d3d5de35dcb8731111baaea2 (4 commits behind OP’s copy). I can verify:

    • that the history up to that commit is identical in both copies
    • after that commit, OP’s copy only has changes to translation files, which are functionally insignificant

    So this does look to be a legitimate copy of the source code as it appeared on github!
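
    For anyone who wants to repeat the check against their own backup, the key fact is that a commit hash pins the entire history behind it, so it is enough to confirm the last common commit is in OP’s master and then diff from there. A rough sketch (POSIX only; the path is a placeholder, only the commit hash comes from this comment):

    ```cpp
    // Rough sketch of the verification (POSIX only, needs git on PATH).
    // "path/to/ops-copy" is a placeholder; the hash is from this comment.
    #include <cstdio>
    #include <cstdlib>
    #include <iostream>
    #include <string>

    // Run a command and return its stdout.
    std::string run(const std::string& cmd) {
        std::string out;
        if (FILE* p = popen(cmd.c_str(), "r")) {
            char buf[4096];
            while (fgets(buf, sizeof buf, p)) out += buf;
            pclose(p);
        }
        return out;
    }

    int main() {
        const std::string theirs = "path/to/ops-copy";
        const std::string last = "dc94882c9062ab88d3d5de35dcb8731111baaea2";

        // A commit hash pins all history behind it, so if OP's master
        // contains this commit, both histories are identical up to it.
        int rc = std::system(("git -C " + theirs +
                              " merge-base --is-ancestor " + last + " master").c_str());
        std::cout << (rc == 0 ? "history matches up to the last common commit\n"
                              : "MISMATCH: commit not found in OP's master\n");

        // Everything changed after it; should only be translation files.
        std::cout << run("git -C " + theirs + " diff --name-only " + last + "..master");
    }
    ```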

    Clarifications:

    • This was just a random check; I do not have any reason to be suspicious of OP personally
    • I did not check branches other than master (yet?)
    • I did not (and cannot) check the validity of anything beyond the git repo
    • You don’t have a reason to trust me more than you trust OP… It would be nice if more people independently checked and verified against their own copies.

    I will be seeding this for the foreseeable future.



  • I’m not 100% comfortable with AI gfs and the direction society could potentially be heading. I don’t like that some people have given up on human interaction and the struggle for companionship, and feel the need to resort to a poor artificial substitute for genuine connection.

    That’s not even the scary part. What we really should be uncomfortable with is this very closed technology having so much power over people. There’s going to be a handful of gargantuan, immoral companies controlling a service that the most emotionally vulnerable people will become addicted to.



  • LLMs can do far more

    What does this mean? I don’t care what you claim your model “could” do, or what LLMs in general could do. What we’ve got are services trained on images that make images, services trained on code that write code, etc. If AI companies want me to judge the AI as if it were the product, then let them give us all equal and unrestricted access to it. Then maybe I would entertain the “transformative use” argument. But what we actually get are very narrow services, where the AI just happens to be a tool used in the backend, not part of the end product the user receives.

    Can it write stories in the style of GRRM?

    Talking about “style” is misleading, because “style” cannot be copyrighted. It’s probably impractical to even define “style” in a legal context. But an LLM doesn’t copy styles; it copies patterns, whatever they happen to be. Some patterns are copyrightable, e.g. a character’s name and description. And it’s not obvious what is OK to copy and what isn’t. Is a character’s action copyrightable? It depends: is the action opening a door, or is it throwing a magical ring into a volcano? If you tell a human to write something in the style of GRRM, they will try to match the medieval fantasy setting and the mood, but they will know to make up their own characters and story arcs. The LLM will parrot anything, with no distinction.

    Any writer claiming to be so unique that they aren’t borrowing from other writers is full of shit.

    This is a false equivalence between how an LLM works and how a person works. The core ideas expressed here are that we should treat products and humans equivalently, and that how an LLM functions is basically how humans think. Both of these are objectively wrong.

    For one, humans are living beings with feelings. The entire point of our legal system is to protect our rights. When we restrict human behavior, it is justified because it protects others; at least that’s the formal reasoning. We (mostly) judge people based on what they’ve done, not on what we know they could do. This is not how we treat products, and that makes sense. We regulate weapons because they could kill someone, but we only punish a person after they have committed a crime. Similarly, a technology designed to copy can be regulated, whereas a person copying someone else’s works can be (and often is) punished, but only after it is proven that they did it. Even if you think that products and humans should be treated equally, it is a fact that our justice system doesn’t work that way.

    People also have many more functions and goals than an LLM. At this point it is important to remember that an LLM does literally one thing: for every word it writes, it chooses the one that would “most likely” appear next based on its training data. I put “most likely” in quotes because it sounds like a form of prediction, but it is actually based only on the occurrences of words in the training data. It has nothing else to incorporate into its output, and it has no other need. It doesn’t have ideas or a need to express them. An LLM can’t build upon or meaningfully transform the works it copies; its only trick is mixing together enough data to make it hard for you to determine the sources. That can sometimes make it look original, but the math is clear: it is always trying to maximize the similarity to the training data, if you consider choosing the “most likely” word at every step to be a metric of similarity. Humans are generally not trying to maximize their works’ similarity to other people’s works, so when a creator is inspired by another creator’s work, we don’t automatically treat that as an infringement.
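
    To illustrate how bare that mechanism is, here is a toy sketch of “choose the ‘most likely’ next word” using a plain bigram table. A real LLM conditions on a much longer context through a neural network rather than a lookup table, but the maximize-similarity-to-the-training-data framing is the same; the corpus and every name here are made up.

    ```cpp
    // Toy "choose the most likely next word" generator backed by a bigram
    // table. The corpus is invented; a real LLM uses a far longer context
    // and a neural net, but the objective is analogous.
    #include <iostream>
    #include <map>
    #include <sstream>
    #include <string>

    int main() {
        // "Training data": count which word follows which.
        const std::string corpus =
            "winter is coming the night is dark and full of terrors "
            "the north remembers winter is here";
        std::map<std::string, std::map<std::string, int>> follows;
        std::istringstream in(corpus);
        std::string prev, word;
        in >> prev;
        while (in >> word) { ++follows[prev][word]; prev = word; }

        // Generation: from each word, emit its most frequent successor.
        std::string cur = "winter";
        std::cout << cur;
        for (int i = 0; i < 6 && follows.count(cur); ++i) {
            const auto& cands = follows[cur];
            auto best = cands.begin();
            for (auto it = cands.begin(); it != cands.end(); ++it)
                if (it->second > best->second) best = it;
            cur = best->first;           // the "most likely" next word
            std::cout << ' ' << cur;
        }
        std::cout << '\n';  // prints: winter is coming the night is coming
    }
    ```

    Every word it emits literally follows the previous one somewhere in the training text; with a corpus this small it parrots “winter is coming” immediately. Scale hides that, it doesn’t change it.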

    But even though comparing human behavior to LLM behavior is wrong, I’ll give you an example to consider. Imagine that you write a story “in the style of GRRM”. GRRM reads this and thinks that some of the similarities are a violation of his copyright, so he sues you. So far it hasn’t been determined that you’ve done something wrong. But you go to court and say the following:

    • You pirated the entirety of GRRM’s works.
    • You studied them only to gain the ability to replicate patterns in your own work. You have no other use for them, not even personal satisfaction gained from reading them.
    • You clarify that replicating the patterns is achieved by literally choosing your every word to be the one that you determined GRRM would most likely use next.
    • And just to be clear, you don’t know who GRRM is or what he sounds like. Your understanding of what word he would most likely use next is based solely on the pirated works.
    • You had no original input of your own.

    How do you think the courts would view any similarities between your work and GRRM’s? You have basically confessed that anything that looks like a copy is definitely a copy. Are these characters with similar names and descriptions to GRRM’s characters just a coincidence? Of course not; you just explained that you chose those names specifically because they appear in GRRM’s works.