I get that website admins are desperate for a solution, but Anubis is fundamentally flawed.
It is hostile to the user, because it is very slow on older hardware and forces you to use JavaScript.
It is bad for the environment, because it wastes energy on useless computations, similar to mining crypto. If more websites start using it, that really adds up.
But most importantly, it won’t work in the end. These scraping tech companies have much deeper pockets and can use specialized hardware that is much more efficient at solving these challenges than a normal web browser.
Sounds like the developer of Anubis is aware of these shortcomings and is working on them.
Still, IMO these are minor, short-term issues compared to the scope of the AI problem it’s addressing.
To be clear, I am not minimizing the problems of scrapers. I am merely pointing out that this strategy of proof-of-work has nasty side effects and we need something better.
These issues are not short term. PoW means entering an arms race against an adversary with bottomless pockets, using a scheme that inherently requires a ton of useless computation in the browser.
As for moving towards something based on heuristics, which is what the developer was talking about there: that is much better. But it is basically what many others are already doing (like the “I am not a robot” checkmark), and it is fundamentally different from the PoW that I argue against.
Go do heuristics, not PoW.
You’re more than welcome to try and implement something better.
“You criticize society yet you participate in it. Curious.”
You can’t freely download and edit society. You can here, because this is FOSS. You could download it now and change it, or improve it however you’d like. But you won’t, because you’re just pretending to be concerned about issues that, from what I can read, only you have encountered.
That last paragraph is nothing but defeatism.
On the contrary, I’m hoping for a solution that is better than this.
Do you disagree with any part of my assessment? How do you think Anubis will work long term?
Long term, Anubis actually costs them millions, if not billions, more in energy to run a browser and more code. Either way, they have to add shit to their bots, which costs all these companies money.
It takes like half a second on my Fairphone 3, and the CPU in this thing is absolute dogshit. I also doubt that the power consumption is particularly significant compared to the overhead of parsing, executing and JIT-compiling the 14MiB of JavaScript frameworks on the actual website.
It depends on the website’s setting. I have the same phone and there was one website where it took more than 20 seconds.
The power consumption is significant, because it needs to be; that is the entire point of this design. If a challenge doesn’t take a significant number of CPU cycles, scrapers will just power through it. This may not matter much for an individual user, but it does add up once this reaches widespread adoption and everyone’s devices have to solve these challenges.
The phone’s CPU usually draws around 1 W, but it can jump to 5-6 W when boosting to solve a nasty challenge. At 20 s per challenge, that’s about 0.03 watt-hours. You would need to see a thousand of these challenges to use 0.03 kWh.
My last power bill was around 300 kWh, which is 10,000 times what your phone would use on those thousand challenges, and roughly ten million times what a single 20-second challenge would use.
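For anyone who wants to check the arithmetic, here is a quick back-of-the-envelope calculation in TypeScript using the numbers from this thread; the 5-6 W boost draw, 20 s challenge, thousand challenges, and 300 kWh bill are the assumed inputs, not measurements of Anubis itself.

```typescript
// Back-of-the-envelope energy math using the figures quoted in this thread.
// These inputs are assumptions from the discussion, not measurements of Anubis.
const boostPowerWatts = 5.5;   // phone CPU draw while boosting (quoted as 5-6 W)
const challengeSeconds = 20;   // a worst-case challenge duration
const challengesSeen = 1000;   // assumed number of challenges one user hits
const monthlyBillKWh = 300;    // the household power bill mentioned above

const whPerChallenge = (boostPowerWatts * challengeSeconds) / 3600; // ~0.03 Wh
const kwhPerThousand = (whPerChallenge * challengesSeen) / 1000;    // ~0.03 kWh

console.log(`Per challenge:            ${whPerChallenge.toFixed(3)} Wh`);
console.log(`Per 1000 challenges:      ${kwhPerThousand.toFixed(3)} kWh`);
console.log(`Bill vs. 1000 challenges: ${(monthlyBillKWh / kwhPerThousand).toFixed(0)}x`);
console.log(`Bill vs. one challenge:   ${((monthlyBillKWh * 1000) / whPerChallenge).toExponential(1)}x`);
```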
It is basically instantaneous on my 12-year-old Kepler-GPU Linux box. It is substantially less impactful on the environment than AI tar pits and other deterrents. The cryptography involved is something almost all browsers from the last 10 years can do natively, while scrapers have to be individually programmed to do it, making it several orders of magnitude beyond impractical for every single corporate bot to be repurposed for, only to then be rendered moot because it’s an open-source project where someone will just update the cryptographic algorithm. These posts contain links to articles; if you read them, you might answer some of your own questions and have more to contribute to the conversation.
It depends on what the website admin sets, but I’ve had checks take more than 20 seconds on my reasonably modern phone. And as scrapers get more ruthless, that difficulty setting will have to go up.
At best, these browsers are going to have some efficient CPU implementation. Scrapers can send these challenges off to dedicated GPU farms or even FPGAs, which are an order of magnitude faster and more efficient. This is also not complex; a team of engineers could set it up in a few days.
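To give a sense of scale, here is a rough sketch of a single-core hash-rate check in TypeScript on Node. The seed string is made up and this says nothing about Anubis’s actual challenge format; it only illustrates how cheap the raw hashing is for anyone running outside a phone browser, before GPUs or FPGAs even enter the picture.

```typescript
import { createHash } from "node:crypto";

// Rough single-core SHA-256 hash-rate check. The seed string is arbitrary and
// this is not Anubis's challenge format; it only shows how cheap the raw
// hashing is on ordinary server hardware.
const iterations = 1_000_000;
const start = process.hrtime.bigint();
for (let nonce = 0; nonce < iterations; nonce++) {
  createHash("sha256").update(`challenge-seed-${nonce}`).digest();
}
const elapsedSec = Number(process.hrtime.bigint() - start) / 1e9;
console.log(`~${(iterations / elapsedSec / 1e6).toFixed(2)} million SHA-256 hashes/s on one core`);
```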
There might be something in changing to a better, GPU-resistant algorithm like Argon2, but browsers don’t support those natively, so you would rely on an even less efficient implementation in JS or WASM. Quickly changing details of the algorithm in a game of whack-a-mole could work to an extent, but that would turn this into an arms race, and the scrapers can afford far more development time than the maintainers of Anubis.
That last bit is very condescending. I would prefer it if you would just engage with my arguments.
Let’s assume, for the sake of argument, that an AI scraper company actually attempted this offloading to GPU farms or FPGAs. They don’t, but let’s assume it anyway.
The next Anubis release could then include, for example, SHA-256 instead of SHA-1. That would be a simple and basically transparent update for admins and end users. The AI company that invested in offloading the PoW to somewhere more efficient now has to spend significantly more resources changing its implementation than it took the devs and users of Anubis.
Yes, it technically remains a game of “cat and mouse”, but one heavily stacked against the cat. One step for Anubis is 2,000 steps for a company reimplementing its client on more efficient hardware. Most of the Anubis changes can even be made without impacting end users at all. That’s a game AI companies aren’t willing to play, because they’ve basically already lost. It doesn’t really matter how “efficient” their implementation is if it can be rendered unusable by a small Anubis update.
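To make the “one step for Anubis” point concrete, here is a hedged sketch of what a browser-side solver can look like with the native Web Crypto API. The challenge scheme (find a nonce so that the digest of seed + nonce has enough leading zero bits) is an assumption for illustration, not Anubis’s actual protocol; the point is that the digest algorithm is a single string parameter.

```typescript
// Browser-side PoW solver sketch using the native Web Crypto API.
// The leading-zero-bits scheme is assumed for illustration only.
async function solveInBrowser(
  seed: string,
  difficulty: number,
  algorithm: "SHA-1" | "SHA-256" | "SHA-384" | "SHA-512" = "SHA-256",
): Promise<number> {
  const encoder = new TextEncoder();
  for (let nonce = 0; ; nonce++) {
    const digest = new Uint8Array(
      await crypto.subtle.digest(algorithm, encoder.encode(seed + nonce)),
    );
    if (leadingZeroBits(digest) >= difficulty) return nonce;
  }
}

function leadingZeroBits(bytes: Uint8Array): number {
  let bits = 0;
  for (const byte of bytes) {
    if (byte === 0) { bits += 8; continue; }
    bits += Math.clz32(byte) - 24; // clz32 counts from bit 31; a byte occupies bits 0-7
    break;
  }
  return bits;
}

// Bumping the default from "SHA-1" to "SHA-256" is a one-word change for the
// site and its visitors, while a GPU/FPGA pipeline hard-wired for SHA-1 has to
// be re-engineered.
solveInBrowser("example-seed", 16).then((nonce) => console.log("nonce:", nonce));
```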
How will Anubis handle it if browsers start acting like the manual scrapers AI companies use to collect information?
OpenAI is planning to release an AI-powered browser; what happens if that ends up being used as another way to collect information?
I don’t think blocking all Chromium browsers is a good idea.
That means absolutely nothing in the context of what I said, or of any information contained in this article. It does not relate to anything I originally replied to.
That’s not what’s happening here. Be serious.
I did; your arguments are bad and you’re being intellectually disingenuous.
Yeah, that’s the point. Very astute.
If you’re deliberately belittling me I won’t engage. Goodbye.
A JavaScript-less check was released recently; I just read about it. It uses a meta refresh HTML tag and a delay. It’s not the default though, since it’s new.
The source, I assume: challenges/metarefresh.
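For anyone curious how a JavaScript-free check based on a meta refresh and a delay can work in principle, here is a rough server-side sketch in TypeScript on Node. The token format, delay, query parameter, and port are all assumptions for illustration and are not taken from the actual challenges/metarefresh implementation.

```typescript
import { createHmac } from "node:crypto";
import { createServer } from "node:http";

const SECRET = "server-side-secret"; // assumption: any server-held secret
const DELAY_SECONDS = 3;             // assumption: the enforced wait time

// Serve a page that reloads itself after a delay via <meta http-equiv="refresh">,
// carrying a signed timestamp. No JavaScript is required on the client.
function challengePage(path: string): string {
  const issued = Date.now();
  const mac = createHmac("sha256", SECRET).update(String(issued)).digest("hex");
  const target = `${path}?challenge=${issued}.${mac}`;
  return `<!doctype html>
<meta http-equiv="refresh" content="${DELAY_SECONDS};url=${target}">
<p>Checking your browser, please wait ${DELAY_SECONDS} seconds...</p>`;
}

// Let the request through only if the token verifies and enough time has passed,
// i.e. the client actually parsed the HTML and honoured the delay.
function tokenIsValid(token: string): boolean {
  const [issued = "", mac = ""] = token.split(".");
  const expected = createHmac("sha256", SECRET).update(issued).digest("hex");
  return mac === expected && Date.now() - Number(issued) >= DELAY_SECONDS * 1000;
}

createServer((req, res) => {
  const url = new URL(req.url ?? "/", "http://localhost");
  const token = url.searchParams.get("challenge");
  if (token && tokenIsValid(token)) {
    res.end("Welcome through.");
  } else {
    res.writeHead(403, { "Content-Type": "text/html" });
    res.end(challengePage(url.pathname));
  }
}).listen(8080);
```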