[HN Gopher] Most downloads of the entire Wikipedia per country
___________________________________________________________________
Most downloads of the entire Wikipedia per country
Author : WithinReason
Score : 32 points
Date : 2022-03-22 13:29 UTC (9 hours ago)
(HTM) web link (stats.kiwix.org)
(TXT) w3m dump (stats.kiwix.org)
| pmfgpmfg wrote:
| marginalia_nu wrote:
| I'm doing my part o7
|
| It's seriously a very interesting and useful dataset that you can
| do a lot of fun stuff with. If you grab one of the ZIMs without
| pictures, it's a very manageable size too, just a few dozen
| gigabytes compressed, and there is reasonably good library
| support in many languages.
|
| That last point doesn't go for Java. The only one I could find
| for that was this <https://github.com/openzim/libzim>; it's
| antique, extremely poorly optimized, and lacks support for newer
| compression schemes. I have fixed the performance and added
| support for zstd compression, but haven't published the code, as
| it's far from finished and major features in the original
| codebase are very broken. I'll get around to sharing the code
| some day, but right now it's basically permanently mid-surgery,
| as I've only patched it far enough to extract all or specific
| files. If anyone wants a copy of this code regardless of state,
| give me a holler.
| fragmede wrote:
| Interesting that Russia is at almost 2x the next country (USA).
| nisegami wrote:
| Curiously, that's the relationship between the first and second
| highest frequencies in a Zipfian distribution (frequency
| proportional to 1/rank). However, third place and beyond are
| much smaller than they should be under that distribution.
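
To make the Zipfian comparison above concrete, here is a minimal sketch of what an ideal Zipf distribution predicts for the top few ranks. It assumes the classic exponent s = 1 (frequency proportional to 1/rank). The function names are hypothetical and no real download counts are used.

```python
def zipf_relative_frequencies(n_ranks, s=1.0):
    """Return normalized Zipf frequencies for ranks 1..n_ranks.

    Assumes frequency(rank) is proportional to 1 / rank**s; s = 1.0 is
    the classic Zipf exponent and is an assumption, not a fitted value.
    """
    weights = [1.0 / (rank ** s) for rank in range(1, n_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def ratios_to_first(frequencies):
    """Return how many times larger the top rank is than each rank."""
    first = frequencies[0]
    return [first / f for f in frequencies]


freqs = zipf_relative_frequencies(5)
ratios = ratios_to_first(freqs)

for rank, ratio in enumerate(ratios, start=1):
    # Under s = 1, rank r is expected to be 1/r of the top rank.
    print(f"rank {rank}: top rank is {ratio:.1f}x larger")
```

Under this assumption the top rank is 2x the second and only 3x the third, so a list where third place and beyond fall off much faster than that does not follow an s = 1 Zipf curve.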
| davidgerard wrote:
| Russia has already written to the Wikimedia Foundation
| demanding that they take down Russian Wikipedia's well-sourced
| and factual article on the _cough_ special operation. Wikimedia
| said "lol no," of course.
| WithinReason wrote:
| There have been worries that Russia might soon ban Wikipedia,
| so people have been downloading it.
| nperez wrote:
| I'd do exactly this if I were worried about losing connectivity.
___________________________________________________________________
(page generated 2022-03-22 23:02 UTC)