[HN Gopher] Loss of nearly a full decade of information from ear...
___________________________________________________________________
Loss of nearly a full decade of information from early days of
Chinese internet
Author : cubefox
Score : 161 points
Date : 2024-06-01 16:13 UTC (6 hours ago)
(HTM) web link (chinamediaproject.org)
(TXT) w3m dump (chinamediaproject.org)
| Ajay-p wrote:
| _Written by He Jiayan, an internet influencer
| active since 2018, the essay concluded, based on a wide range of
| searches of various entertainment and cultural figures from the
| late 1990s through the mid-2000s, that nearly 100 percent of
| content from major internet portals and private websites from the
| first decade of China's internet has now been obliterated._
| lelandfe wrote:
| > _Within the selected date range of "May 22, 1998 to May 22,
| 2005" on Baidu, there is just one positive result for "Jack Ma"
| (dated May 22, 2024). [..] Click on the result and you'll find it
| is an article posted in 2021_
|
| US Google: About 2,580,000 results
|
| A pretty remarkable scrubbing of history.
| Arnavion wrote:
| https://www.google.com/search?q=jack+ma&tbs=cdr%3A1%2Ccd_min...
|
| First result for me is
| https://www.scb.co.th/en/personal-banking/stories/business-m...
| which Google thinks is from 2003-03-15, except it mentions
| COVID-19 so it obviously isn't.
|
| Second result is https://www.instagram.com/jack_overpower/feed/
| which Google thinks is from 2001-01-02, except Instagram didn't
| exist at that time. It might have pictures from 2001 though.
|
| Third result is
| http://pacificpower.foreignpolicy.com/15-jack-ma/ which Google
| thinks is from 1999-02-15, except it mentions Alibaba's 2014 IPO
| so it obviously isn't.
|
| Fourth result is
| https://www.facebook.com/story.php/?story_fbid=5041357966634...
| which Google doesn't show a date for, but it's a Facebook post
| from 2018.
|
| ...
|
| I don't doubt that some of those results are from 1998 to 2005,
| but the millions of results number specifically is meaningless.
| prophesi wrote:
| Yep; there may be a lack of incentive to preserve old sites,
| but what's worse are the ranking algorithms that prevent
| their discoverability in the first place.
| ccgreg wrote:
| Both the Internet Archive and Common Crawl have tools that
| reveal actual crawl dates. Search engines are not really
| intended to be archives, so it's no surprise that they
| aren't very good archives.
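| As a minimal sketch of reading a crawl date from an archive, the
| snippet below parses a Wayback Machine snapshot URL, assuming the
| common https://web.archive.org/web/<14-digit timestamp>/<URL>
| layout. The function names and the example URL are illustrative
| assumptions, and the timestamp is treated as UTC.

```python
import re
from datetime import datetime, timezone

# Assumption: snapshot URLs look like
#   https://web.archive.org/web/<YYYYMMDDhhmmss>[modifier]/<original URL>
_SNAPSHOT_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})[a-z_]*/(.+)$")


def parse_snapshot_url(snapshot_url):
    """Return (crawl_datetime, original_url), or None if the URL does not match."""
    match = _SNAPSHOT_RE.match(snapshot_url)
    if match is None:
        return None
    timestamp, original_url = match.groups()
    # The 14-digit timestamp encodes the capture time; treated as UTC here.
    crawled_at = datetime.strptime(timestamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return crawled_at, original_url


def in_range(crawled_at, start_year, end_year):
    """True if the crawl year falls inside [start_year, end_year]."""
    return start_year <= crawled_at.year <= end_year
```

| Note that a crawl date only bounds when the page existed; it says
| nothing about when the content was first written.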
| bbarnett wrote:
| Not really prevented; the _huge_ factor is HTTP sites being
| down-ranked heavily by Google.
|
| But they are still there. Do a specific enough search and
| they'll be at the top of the search results.
| rasz wrote:
| Google has perfect vision of the past (didn't the latest leak
| confirm they keep everything crawled indefinitely and have
| extensive historical records for all domains?) but zero
| incentive for redirecting you to old websites with no
| advertisements.
| MichaelZuo wrote:
| This is false; many old forums are only sporadically indexed
| by Google even if you do verbatim text searches using the
| site:... modifier.
| boomboomsubban wrote:
| The "custom range" feature simultaneously feels broken, gamed
| by spammers, and intentionally being scrubbed. I'm surprised
| they haven't completely removed it yet.
| Ylpertnodi wrote:
| >except it mentions COVID-19 so it obviously isn't.
|
| Perhaps it was just updated?
|
| I generally ignore or get annoyed by articles that don't have a
| published or updated date in the byline.
| mycall wrote:
| Sometimes you can find the date embedded inside the source
| asset files.
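| One way this can work, as a rough sketch: some sites store
| uploads under /YYYY/MM/ paths, so asset URLs in the page source
| can hint at a date. That path convention is an assumption, not a
| guarantee, and the class and function names below are
| illustrative.

```python
import re
from html.parser import HTMLParser

# Assumption: date-like /YYYY/MM/ segments in asset URLs reflect upload time.
_DATE_PATH_RE = re.compile(r"/((?:19|20)\d{2})/(0[1-9]|1[0-2])/")


class AssetDateHints(HTMLParser):
    """Collect (year, month, url) hints from src/href attributes."""

    def __init__(self):
        super().__init__()
        self.hints = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                match = _DATE_PATH_RE.search(value)
                if match:
                    self.hints.append((int(match.group(1)), int(match.group(2)), value))


def asset_date_hints(html):
    """Return date hints found in the HTML, sorted oldest first."""
    parser = AssetDateHints()
    parser.feed(html)
    parser.close()
    return sorted(parser.hints)
```

| The earliest hint is only a lower bound for the page's age, since
| old assets can be reused on newer pages.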
| asdasdsddd wrote:
| There's pretty much nothing in that time range on Baidu, I
| looked up Mao, George Washington, Yue Fei (a popular Chinese
| folk hero), Garlic bread, etc.
|
| But without the time filter, there are millions of search results.
| tw04 wrote:
| It's probably easier to just blanket scrub everything beyond
| a small set of allowed information (like positive articles
| about the party) than to selectively delete. Why do they care
| if valuable information is lost?
| mensetmanusman wrote:
| Par for the course.
|
| China as we (the world) know it is only about 60 years old. This
| is more true as they go about systematically destroying their own
| history and forcing village traditions to be stamped out and
| guided towards the city life.
|
| Losing a blip of internet history during the regime of mass
| censorship is probably a blessing in disguise.
| wumeow wrote:
| > Posted on Wednesday, May 22, He's post had been removed from
| WeChat by the following day, yielding a 404 message that read:
| "This content violates regulations and cannot be viewed."
| actionfromafar wrote:
| He will be educated.
| alephnerd wrote:
| Is the original MIT BBS still archived? I haven't used it for
| some time.
| Cheer2171 wrote:
| There are a few commenters in this thread drawing blatant false
| equivalences with the Western internet. This post is about how on
| major search engines in China, you now set the years to 1998-2005
| and search for a non-controversial celebrity and you get zero
| search results from content actually published in that era.
|
| The loss of the early web due to web hosters not maintaining
| their own hosting and moving to walled gardens is painful and
| tragic, but it is not in any way similar (or functionally
| equivalent) to this blatant censorship.
| anonzzzies wrote:
| Yep... Only archive.org has it sometimes and then you need to
| search there because you won't find it via others.
| Cheer2171 wrote:
| But for the Western internet, it disappears because the
| people hosting those websites gave up, so all we have is
| archive.org. With this case, there appears to be a
| government-level purge.
| ccgreg wrote:
| The western Internet has a bunch of government archives, in
| addition to the Internet Archive and Common Crawl.
|
| Many of the government archives are not public for
| copyright reasons.
| wumeow wrote:
| Yes, this is like if nothing turned up for Bill Gates when you
| did a search for pre-2006 material.
| lostemptations5 wrote:
| Intentionally or not this is -- exactly -- what 1984 is all
| about: changing our perception of history by rewriting or erasing
| previous writings.
|
| Unfortunately a lot of what the article describes seems typical:
| blogs going offline as bloggers move to new technologies, social
| media companies going defunct or just not keeping old content.
|
| A lot of these things can happen in the West. Remember these books
| you could read? "The Feynman Letters", etc. I'm paraphrasing--
| but it's impossible now.
|
| Think of this: emails? A person dies and their laptop dies or is
| disposed of -- they're all gone. In the past the physicality of
| the letters would persist. Not so now. All this correspondence
| vanishes.
|
| Facebook, are you kidding me? If someone famous thought to export
| their data -- and it can be found on a laptop still working (and
| you have the login password), then maybe. See above. This repeats
| and repeats for each system we interact with for communication.
|
| Aside from the laptop scenario-- all this is lost. We now live in
| a black hole of historical detail, soon to be replaced, perhaps,
| by a fabricated history hallucinated by LLMs.
|
| Those that love historical understanding should be very worried.
| Cheer2171 wrote:
| Another false equivalence. "Intentionally or not" actually
| really matters here. It took work to maintain archives in the
| pre-digital era, and it takes work to maintain archives in the
| digital era. So many of those physical letters were lost,
| rotted, burned, etc.
|
| This is a purge, not a failure to maintain archives. This is
| like when during the Cultural Revolution, they literally burned
| archives and letters by intellectuals.
| bloomingeek wrote:
| I love your reply; your answer is a near-perfect summing
| up of the issue! My view is some here in America are starting
| to get too lenient towards Russia and other authoritarian
| states. Do we not understand that these states want complete
| control and don't care how they get it? Information and
| educational purges are two of many ways this is done. After
| that, it gets dirty.
|
| Rule of thumb, if the Constitution says it stinks, it does.
| If we don't like something in it, work for a change. In China
| and Russia they don't have that right.
| jimbob45 wrote:
| Should the rewritten history still be preserved as history
| then?
| demosthanos wrote:
| > Posted on Wednesday, May 22, He's post had been removed from
| WeChat by the following day, yielding a 404 message that read:
| "This content violates regulations and cannot be viewed."
|
| You don't get your comments censored by commenting about
| natural entropy on the internet. You do get your comments
| censored by drawing attention to the censors.
|
| I get very tired of people drawing false equivalences between
| organic human behaviors in the West and intentional abuse by
| central authorities in China. We can and should do more to
| preserve our history in the West, but we are already preserving
| orders of magnitude more data per person than any of our
| ancestors could have dreamed of. There's no comparison between
| emails getting lost when someone dies and centralized censors
| actively purging old content to make it easier to change the
| party's narrative.
| pessimizer wrote:
| > I get very tired of people drawing false equivalences
| between organic human behaviors in the West
|
| I get tired of people referring to intelligence agencies as
| organic human behaviors. The farming of anti-whatever the
| DoD, administration, or intelligence agencies dislike is
| anything but organic, and has had millions to billions of
| dollars of the budget assigned to it since WWII.
|
| Forums like this are completely helpless against it. All
| anyone has to do is farm a few accounts to flag the mildest
| mention of this out of existence, and to upvote the most
| obtuse, simplistic anti-enemy animus to the top.
|
| Very few actual people are fooled by this. The US is
| _jealous_ of the control China has over the discussions its
| citizens have, and is closing the gap quickly and
| dishonestly. I'm jealous of the fact that Chinese people
| speaking out of the government-range just get deleted, rather
| than patronized.
| trealira wrote:
| I don't see how your comment relates to the parent's
| comment; however, here's a reply.
|
| > All anyone has to do is farm a few accounts to flag the
| mildest mention of this out of existence, and to upvote the
| most obtuse, simplistic anti-enemy animus to the top.
|
| Have you considered that the negative sentiment against
| Russia and China is genuine? I know of no evidence that the
| DoD has shills or bots upvoting pro-US-government comments
| and downvoting other ones. People probably just read the
| news and form their opinions that way, and there's a
| variety of different news sources with many different
| perspectives, which don't get censored.
|
| > I'm jealous of the fact that Chinese people speaking out
| of the government-range just get deleted, rather than
| patronized.
|
| It's strange to be jealous of them not having protection
| from government censorship.
| WalterBright wrote:
| I'd love to have a single letter from some of my ancestors.
| Natsu wrote:
| I have one, actually, from my grandpa's generation. He told
| another family member about his time growing up in the
| early 1900s, riding trolleys and eating Walnettos (a
| strange Walnut-based candy bar). Then the Spanish Flu came
| around and the eldest sister just died at the breakfast
| table one day. Later, the family rallied together to care
| for each other after his father lost his job due to
| automation. He moved on to doing odd jobs, then later fell
| off the roof and broke his back, ending up as an invalid
| for the rest of his days. They talked about the cherry
| trees they used to feed themselves, which explains
| grandma's fondness for the cherry soup I hated so much, and
| how my grandma and grandpa got married and took care of
| great grandpa while he was invalid.
|
| They also talked about how Wonder Bread (the original
| sliced bread and origin of the phrase "best thing since
| sliced bread") came into town and the eldest son went to
| work for them to support the family after the local baker
| he had worked for folded, lost a finger to the machinery.
| At some point, he had some kind of heated dispute at work
| due to this, was beaten by security, and as I'm told, died
| from injuries sustained during that beating some time
| afterwards.
|
| It was a weird little window into bits of family history
| that would have otherwise been erased.
| yorwba wrote:
| The original post was about natural entropy on the internet.
| Websites from 2005 that have disappeared or been redesigned
| so that you can't find their old content anymore, and the
| uselessness of search engines, domestic or foreign, for date
| range queries reaching that far back into the past. Even on
| the Internet Archive, the earliest working snapshot of Baidu
| Tieba is from 2006.
|
| You may think that it's impossible for an innocuous post to
| get censored unless it has inadvertently unmasked a
| conspiracy to bury the past, but censorship decisions also
| get made to prevent unwanted _reactions_. If a post about
| disappearing content inspires people to complain about
| censorship, that's enough to suppress it.
|
| If the disappearance of old websites were entirely
| deliberate, you'd also need to explain why the West is in on
| it.
| demosthanos wrote:
| > The original post was about natural entropy on the
| internet.
|
| The post by He Jiayan was, but that post was taken down for
| violating regulations. TFA is largely about the censorship
| angle which He Jiayan specifically avoided talking about
| (not that it helped him).
|
| > If the disappearance of old websites were entirely
| deliberate, you'd also need to explain why the West is in
| on it.
|
| Name one figure who was prominent in between 1995-2005 who
| you can't find any content about from that era when using
| Google's date filters. A single figure.
|
| Some sites go down organically. It happens. _Every_ site
| that references a figure who was once favored and is now
| out of favor? That doesn't happen in the Western internet.
| yterdy wrote:
| Recently: Google refuses to turn up old pages. I was recently
| searching for a person who used to have a notable web presence
| before passing away about a decade ago. I had to dig to find a
| few links, through DDG and Yandex.
| flir wrote:
| Yandex is getting more and more of my web queries lately.
| There's a definite irony there.
| netsharc wrote:
| Google and Bing (so DuckDuckGo as well) seem to like
| searching for synonyms of search terms and returning the
| most popular results, thinking popular means relevant. I
| remember looking for something where I remembered the exact
| terms and not getting anywhere with them, but on Yandex it
| was the first hit.
| jncfhnb wrote:
| I would guess that 99.9% of letters are destroyed
| akira2501 wrote:
| > In the past the physicality of the letters would persist
|
| I'm willing to bet that these physical letters have
| historically fared about as well as our digital letters do;
| otherwise, our world would be absolutely filled with the
| written detritus of the past.
|
| > Those that love historical understanding should be very
| worried.
|
| As humans we've always disposed of more than we've kept. It's
| just not worth the energy cost to operate any other way.
| Thankfully history is recorded as several overlapping
| collections and not as a series of single data points.
| abecedarius wrote:
| Tangential, but what is "The Feynman Letters" here? I know of a
| book of some of his letters, but not about censorship/loss
| thereof.
| ck2 wrote:
| China is too easy of an example of rewriting history by political
| will.
|
| In North Korea it is illegal to mention famine or hunger.
|
| In Florida it is illegal to mention climate change in any state
| document.
| tromp wrote:
| > In Florida it is illegal to mention climate change in any
| state document.
|
| citation needed. Oh, found one:
| https://www.miamiherald.com/news/state/florida/article129837...
| ( https://archive.is/P9k4m )
|
| > DEP officials have been ordered not to use the term "climate
| change" or "global warming" in any official communications,
| emails, or reports
|
| I'm not sure that amounts to illegal, but they did at least
| make it career impairing. Would be interesting to see someone
| sue for wrongful termination on that basis...
___________________________________________________________________
(page generated 2024-06-01 23:01 UTC)