Recently I was wondering whether there is someone or some group preserving, collecting, organizing and publishing all the knowledge mankind has ever created throughout its existence, so that if mankind ever faces a 6th mass extinction we don’t have to reinvent the wheel and can kick-start our new post-apocalyptic civilization.
You son of a bitch
Such an insightful commentary on the importance of the social contract and the irreplaceability of the individual. The only way forward is to share our personal experiences and strive for understanding. Once we know each other’s value, we will never surrender our common bonds, disappoint one another, go behind each other’s backs, nor do each other harm.
You’re using it right now.
All of Wikipedia is <256 GB.
All of Wikipedia in English is <64 GB.
Then archive.org for multimedia, ~10 petabytes. Yipes.
According to my ex, her.
Can confirm
Haha you can have her. Good luck!
Please take her back. I’ll even pay you.
I think the internet as a whole is going to be the closest we’ll ever come. Capitalism will make sure it’s never even close to complete so it always has something to monetize.
You can read pretty much any book written by humans since 2500 BCE (except lost media, such as works lost in library burnings, film destruction and wars), watch any movie ever released, and hear any music made since recording was invented. For example, the Rig Veda, the first Veda of Hinduism, was composed even before 2500 BCE and is today said to be 98% in its original state, thanks to the Indian gurus and saints who passed it on orally; it was made into a book only after the 8th century.
Of course, there is a catch: these media are not freely and publicly available, and most of them you have to pirate in order to consume. So we need a centralised database of these things, kept safe somewhere, so that we don’t have to reinvent the wheel in case of a catastrophic event.
I wouldn’t say “complete” can even be sufficiently defined in this case. Every functional definition I can think of has a limiting factor.
Let’s try to define knowledge. What kind of information qualifies? We usually think of important, useful info like physics and medicine. But what about other data, like sports game stats, atmospheric sensor readings, or even something more esoteric, like the location data of every object on earth?
And even if we could have the information of every single thing at any particular time, what about when things change in the next second? And the one afterwards?
Essentially, nothing will ever be “complete”. Thanks for listening to my rant on semantics.
That was a lovely rant on semantics. I thoroughly enjoyed reading it!
I’m surprised no one mentioned projects like libgen and scihub. They are much better than Wikipedia imo.
Imo zlib is much better, but they keep changing their domain… Also, Sci-Hub is only for research papers, which most people can understand.
It’s never going to be all knowledge, since a lot of stuff is just lost or was never recorded. A ton of stuff (like this thread) is probably low on the priority list for recording as well. But the closest you’d probably get to a full catalog of human knowledge (at least text-based) is the huge data sets of nearly all text data on the internet used for training LLMs. I wouldn’t be surprised if there are ones soon that include video and pictures as well, since newer AI models are starting to be able to interpret those too.
I believe this is one of those data sets: https://github.com/yaodongC/awesome-instruction-dataset
Edit: here’s a big data set used for a lot of GPT-3’s training: https://commoncrawl.org/
Thanks