[HN Gopher] How to circumvent Sci-Hub ISP block
___________________________________________________________________
How to circumvent Sci-Hub ISP block
Author : tmkadamcz
Score : 230 points
Date : 2021-06-09 19:12 UTC (3 hours ago)
(HTM) web link (fragile-credences.github.io)
| beermonster wrote:
| Just set up a VPN on some cheap cloud provider.
|
| There are lots of sites UK ISPs block even though the sites
| themselves are not illegal and don't host illegal content, e.g.
| torrent indexing services (the content itself _may_ be illegal
| but purely providing a search across that content is basically
| doing what Google do).
|
| The UK internet is heavily filtered/censored and so doing this is
| useful anyway.
|
| Business ISP connections don't seem to be restricted. And neither
| do most cloud providers I've tried.
|
| Might be better than using temporary proxies.
| Reason077 wrote:
| My UK ISP (one of the major UK mobile networks) does not block
| sci-hub. They do block torrent sites such as Pirate Bay, etc.
| Quarrel wrote:
| I travelled last week, and was horrified by how much is blocked
| by the mainstream ISPs in the UK.
|
| Afaik, my (London) ISP does not block anything. No idea why, as
| all the others quote high court orders.
| Reason077 wrote:
| Many UK ISPs have "adult content" filters, which tend to be
| wide-reaching and block a lot more than just porn sites. But
| these are optional and can be turned off very easily.
|
| There's a smaller set of non-optional blocks (pirate/torrent
| sites) which you need a VPN to get around.
| sabjut wrote:
| This seems to be just another small step towards a future
| where only pre-approved websites are accessible via the
| method most people will use. It will not be called banning,
| this is just a measure to "ensure that the content we are
| serving to our users is held against our high quality
| standards" or the classic "it's to protect the children" or
| to "condemn terrorism".
|
| Porn is not really illegal, just unwanted, which is
| apparently reason enough to block it. Does this mean any
| content which is "unwanted" can be blocked just like that?
| regularfry wrote:
| I have a theory that the ISPs over-block with their adult
| content filters, so you've got plausible deniability and
| don't have to ring up and say "I want porn please." Because
| the alternative is that they lose a customer to an ISP who
| doesn't embarrass them.
| koheripbal wrote:
| Are there good recommendations for privacy centric cloud
| providers?
| dredmorbius wrote:
| A few years back, frustrated with increased DNS blocking of
| Sci-Hub, I wrote a quick DNSMasq hack (haq?) to return Sci-Hub IPs
| for any "sci-hub.<domain>" possible. The shins-n-grits factor of
| surfing "scihub.elsevier.com" was palpable.
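|
| A minimal sketch of that idea in Python (not the original hack):
| generate dnsmasq "address=/<domain>/<ip>" lines for a list of
| suffixes. The IP is a documentation-range placeholder and the
| suffix list is made up for illustration.

```python
def dnsmasq_address_lines(ip, suffixes, label="sci-hub"):
    """Return dnsmasq 'address=/<domain>/<ip>' lines, one per suffix.

    dnsmasq answers queries for each listed domain (and its
    subdomains) with the given IP instead of asking upstream.
    """
    lines = []
    for suffix in suffixes:
        domain = f"{label}.{suffix.strip('.').lower()}"
        lines.append(f"address=/{domain}/{ip}")
    return lines


if __name__ == "__main__":
    # 192.0.2.1 is a placeholder from the documentation range,
    # not a real Sci-Hub address.
    for line in dnsmasq_address_lines("192.0.2.1", ["se", "st", "elsevier.com"]):
        print(line)
```

| The output would go in a dnsmasq config file; the real IP has to
| come from somewhere you trust.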
|
| https://old.reddit.com/r/Scholar/comments/7m3uin/meta_if_you...
|
| As others have mentioned, Sci-Hub also maintains a Tor presence,
| and you can access the Onion link using the Tor browser (provided
| you can install that on your desktop or device).
|
| https://scihub22266oqcxt.onion
| tpmx wrote:
| What is forcing these UK ISPs to block these IP ranges?
| dijit wrote:
| The UK Government has passed several pieces of legislation to
| this effect, and has been doing so since around 2015.
|
| Also: all data is required to be logged, and those logs are
| searchable by civil servants without a warrant.
| beermonster wrote:
| The government have passed quite a few bills this year aimed at
| locking down on this kind of thing whilst people were
| preoccupied with COVID.
| Quarrel wrote:
| I don't know. Virgin Media quote High Court orders saying they
| have to block several I tried, but my home ISP does not block
| any of them. It seems weird to me that the court orders
| would cover only a few specific ISPs, but I haven't looked into
| it further.
| tpmx wrote:
| Are those High Court orders secret? Is there any journalism
| happening?
| gruez wrote:
| You're probably better off using a paid VPN provider than a paid
| proxy provider. A VPN can be used in more places, and some VPNs
| provide HTTP proxy access (the kind used in the tutorial) in
| addition to OpenVPN/WireGuard. If they have a browser
| extension, chances are they support HTTP proxy.
| xvector wrote:
| This is the reason Tor exists.
| leephillips wrote:
| If you can't get to sci-hub and you need a (free) copy of a
| paper, there are several other ways to get it:
| https://lee-phillips.org/articleAccess/
| StavrosK wrote:
| Too bad the Handshake domain donation to SciHub didn't pan out.
| igbk wrote:
| How do the alternative domains fare in the UK? With
| https://sci-hub.st/ I can circumvent the ISP block in Sweden
| successfully.
| GordonS wrote:
| Works for me on Plusnet in the UK, thanks!
| hkt wrote:
| Works fine here, on EE network in the UK.
| Gormisdomai wrote:
| I'm UK based and can't hit it on home wifi
| tmkadamcz wrote:
| https://sci-hub.st does not work in the UK [edit: on Sky].
| londons_explore wrote:
| Works on BT broadband
| weavie wrote:
| Virgin says no.
|
| Interestingly I get this:
|
| Secure Connection Failed
|
| An error occurred during a connection to sci-hub.st. SSL
| received a record that exceeded the maximum permissible
| length.
|
| Error code: SSL_ERROR_RX_RECORD_TOO_LONG
| AndrewDucker wrote:
| Yeah, that's how Virgin implement their blocking
| AshamedCaptain wrote:
| I know from memory that this is what I get when an HTTP server
| responds to an HTTPS/SSL request :)
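|
| A rough illustration of why a plaintext HTTP reply can surface as
| "record too long": a TLS client reads the first five bytes as a
| record header (type, version, length). Fed "HTTP/", the length
| field is taken from the bytes "P/", well above the record size
| limit. This sketches the arithmetic only, not what any browser's
| TLS library does internally; the limit used is the TLS 1.2
| ciphertext bound.

```python
import struct

# Upper bound on a TLS 1.2 ciphertext record (RFC 5246, section 6.2.3).
MAX_TLS_RECORD_LEN = 2**14 + 2048


def read_record_header(first_bytes):
    """Interpret the first 5 bytes as a TLS record header.

    Returns (content_type, version, length) as integers:
    1 byte type, 2 bytes version, 2 bytes length, big-endian.
    """
    return struct.unpack("!BHH", first_bytes[:5])


def looks_like_plain_http(first_bytes):
    """Heuristic: HTTP responses start with the ASCII text 'HTTP/'."""
    return first_bytes.startswith(b"HTTP/")


if __name__ == "__main__":
    reply = b"HTTP/1.1 200 OK\r\n"
    content_type, version, length = read_record_header(reply)
    print("plain HTTP:", looks_like_plain_http(reply))
    print("claimed record length:", length)
    print("over the limit:", length > MAX_TLS_RECORD_LEN)
```

| The bytes "P/" are 0x50 0x2F, i.e. a claimed length of 20527,
| against a limit of 18432.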
| dijit wrote:
| Seems like this is a really good service to add some legitimacy
| to Tor browsing.
|
| Having SciHub as a hidden service would bring a lot of people to
| Tor.
|
| EDIT: apparently I'm a bit stupid; it exists:
| https://scihub22266oqcxt.onion
|
| But it would be cool to promote the hell out of the onion address
| and tor browser, rather than trying to bypass ISP restrictions.
| GordonS wrote:
| I use SciHub a lot, and just this week have been having
| problems accessing it on the internet (in the UK) - I don't use
| Tor, but I also never even thought of it. Totally agree it
| would be a good idea to promote the Onion address!
| livueta wrote:
| Possibly also a path to greater decentralization. Relying on
| jurisdictional stuff isn't going to cut it forever (see the
| current pause on new uploads), but it's also hard to ask people
| to host data that'll get them sued without offering
| mitigations. Private torrent trackers do that through, well,
| being private, but I'm sure as hell not serving
| springer_catalogue_2020.tar.xz to the whole internet from an
| address linked to me in any way. Maybe an index of
| independently operated hidden services, each serving a
| (redundant) shard of the collection?
| hkt wrote:
| I'm glad this document exists but I tend to favour using
| TorBrowser, both on Android and (in my case) Manjaro.
| brumar wrote:
| Tor works well for me so far
| joelthelion wrote:
| There's a telegram bot that sends you the papers you ask for.
| It's by far the most convenient way to use scihub.
| paufernandez wrote:
| Agreed, that's the one I use, so fast!
| kickout wrote:
| Do you have a link or example for, uh, science? Lol
| vbezhenar wrote:
| Googling "sci hub telegram bot" returns @scihubbot as the
| first result for me. You might want to try it out.
| burundi_coffee wrote:
| Actually, it's @scihubot. You can send it links to the
| paper from the journal webpage or the doi link, for
| example: https://doi.org/10.1063/1.432563
| [deleted]
| londons_explore wrote:
| IPFS would be a perfect fit for this...
|
| It sadly isn't censorship resistant, but it should do for a few
| years, and as soon as censorship on IPFS starts to become an
| issue, hopefully the IPFS developers will evolve the design.
| tingletech wrote:
| once upon a time I wrote a script that took a MARC file from the
| library catalog as input, and output a PAC file for the group
| that ran the campus proxy server.
| sneak wrote:
| Another option is to simply configure your workstation to use
| DoH. Then your ISP can't fuck with your address resolution.
|
| I recommend using NextDNS, and then setting up a provisioning
| profile at https://apple.nextdns.io to set it as your resolver on
| your Macs and iOS devices. The ad-blocking features are a nice
| bonus, too.
|
| NextDNS also has a cool free software CLI local DoH proxy
| resolver which works a charm.
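|
| For the curious, a sketch of what a DoH lookup looks like on the
| wire, following the GET form in RFC 8484: the DNS query is built in
| ordinary wire format, base64url-encoded without padding, and put in
| a "dns" query parameter. Nothing is sent here; the resolver URL is
| a made-up placeholder, and a real client would also set
| "Accept: application/dns-message".

```python
import base64
import struct


def build_dns_query(hostname, qtype=1):
    """Build a minimal DNS query in wire format (qtype 1 = A record)."""
    # Header: ID 0, flags 0x0100 (recursion desired), 1 question,
    # no answer/authority/additional records.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.strip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # class 1 = IN
    return header + question


def doh_get_url(resolver_url, hostname):
    """Return a GET URL carrying the query in the ?dns= parameter."""
    wire = build_dns_query(hostname)
    encoded = base64.urlsafe_b64encode(wire).rstrip(b"=").decode("ascii")
    return f"{resolver_url}?dns={encoded}"


if __name__ == "__main__":
    # "doh.example" is a hypothetical resolver, not a real service.
    print(doh_get_url("https://doh.example/dns-query", "sci-hub.se"))
```

| Because the query travels inside HTTPS, the ISP sees a connection
| to the resolver but not the name being looked up.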
| emsal wrote:
| I once attempted to make something like ProxySwitchy for DNS[1],
| but I didn't work on it long enough to get it off the ground. This
| article made me think about it again. Is there actually a use
| case for that kind of thing?
|
| [1]: https://github.com/emsal0/resolvplox
| AlexandrB wrote:
| The way SciHub is being treated by governments is pretty
| infuriating. There's a tiny minority of people who have an
| interest in keeping SciHub off the internet, and they're
| generally neither the researchers who write the papers, nor those
| who want to read them. Despite this, the power of the state has
| been used repeatedly to keep SciHub inaccessible and limit their
| ability to get funding.
| tomrod wrote:
| Yes, due to allocation of property rights.
|
| Cease to supply this system with the fruits of your research
| labors.
| pbhjpbhj wrote:
| In my personal opinion the international community could do
| something like modify the Berne Convention/TRIPS
| (international copyright agreements signed by almost every
| country/WTO members respectively) to exclude copyright of
| academic papers.
|
| The property rights in question are not natural rights, nor
| material rights. Sufficient political will seems like it
| could do it.
|
| Finding politicians in power who will support human progress
| before profits might be hard! [/understatement]
| zwaps wrote:
| Give me tenure and I'm on board.
| Mediterraneo10 wrote:
| > Cease to supply this system with the fruits of your
| research labors.
|
| Historically academics have felt forced to support this
| system, because for-profit journals are the high-prestige
| ones they must publish in in order to get tenure. This has
| changed for certain fields, but it isn't as simple as just
| suggesting that one publish elsewhere.
| divbzero wrote:
| It's up to not only academics who publish articles, but
| also organizations that issue grants and tenure. Public
| policies to adjust their definitions of "prestige" or
| "quality" would help.
| underwires wrote:
| which fields has this changed for?
| Tokelin wrote:
| What are some of the fields where this is changing?
| tmkadamcz wrote:
| By the way, Sci-Hub has stopped adding new articles to the
| database for a few months now (background:
| https://www.reddit.com/r/scihub/comments/mk46x4/scihub_v_els...).
|
| It would be great to develop a truly decentralised solution.
| Having a database of individual torrent links for each paper
| might be a start.
| Azrael3000 wrote:
| Thanks for the background link. I did not know about that and
| it's a good incentive to donate them some money for the legal
| battle.
|
| TL;DR of the link: No more uploads to support a court case in
| India which SciHub might win and thus establish a legal basis
| for operation in the biggest democracy.
| andyxor wrote:
| when donating bitcoin make sure to get the address from the
| official scihub mirrors, which are currently sci-hub.do,
| sci-hub.st or sci-hub.se.
|
| there are some unaffiliated "mirrors" that only redirect to
| scihub but list their own bitcoin address for donations, so
| beware.
|
| /r/scihub on reddit keeps track of this
| https://www.reddit.com/r/scihub/wiki/index
| lowkey wrote:
| Sci-hub is such a great example of a clear and compelling
| use-case for Bitcoin. Bitcoin is censorship-resistant money
| that doesn't rely on countries, laws, central bankers or
| politicians. The US dollar cannot be used for purposes not
| aligned with the US government. Sometimes ideas that the US
| Government doesn't agree with can be useful (e.g. Wikileaks,
| Sci-hub.)
|
| When I hear complaints that Bitcoin has no use except for
| speculation, I think of Sci-hub, Wikileaks and other
| organizations that may be bad for the interests of the US
| government but may be good for mankind.
| hyperbovine wrote:
| It's a censorship resistant technology that also indelibly
| records, publicly, every transaction you ever participated
| in. Talk about a double-edged sword...
| HideousKojima wrote:
| Torrents are great and all but it's dependent on people seeding
| them, sci-hub/libgen is great because you don't have to worry
| about a download suddenly breaking because no one is seeding
| kortilla wrote:
| But they could just always be a seeder. Doesn't that have the
| upsides of the existing solution plus resiliency?
| whimsicalism wrote:
| Isn't scihub/libgen already backed by torrents? I'm
| confused.
| [deleted]
| jandrese wrote:
| Millions of individual torrents is not a great solution.
| Keeping them all seeded is basically impossible unless they run
| a seed for each one, at which point they might as well just
| host the files. Plus you'll never get the economy of scale that
| makes BitTorrent really shine.
|
| When you have a whole lot of tiny files that people will
| generally only want one or two of there isn't much better than
| a plain old website.
|
| A torrent that hosts all of the papers could be useful for
| people who want to make sure the data can't be lost by a single
| police raid.
| tmkadamcz wrote:
| There are already torrents of the archives. But supposing
| scihub was taken down it's pretty non trivial to get from the
| archive back to a working site with search functionality. For
| one thing, none of Sci-Hub's code is available.
| contravariant wrote:
| There was that project some guy posted a while back that used
| a combination of sqlite and partial downloads to enable
| searches on a database before it was downloaded all the way.
| If you can fit PDFs somewhere into that you'd be golden.
|
| Or just use IPFS I suppose.
| o8r3oFTZPE wrote:
| "There was that project some guy posted a while back that
| used a combination of sqlite and partial downloads to
| enable searches on a database before it was downloaded all
| the way."
|
| https://github.com/bittorrent/sqltorrent
| jagged-chisel wrote:
| this is the one:
| https://phiresky.github.io/blog/2021/hosting-sqlite-database...
|
| HN submission:
| https://news.ycombinator.com/item?id=27016630
| hkt wrote:
| Isn't that essentially mapreduce? Either way, interesting
| and I'd love to see the link.
| vorticalbox wrote:
| I believe this is the project mentioned
|
| https://github.com/lmatteis/torrent-net
| tmkadamcz wrote:
| This looks like it could be a good approach.
| contravariant wrote:
| That one looks familiar. Though apparently the same thing
| has been tried in several different ways going by the
| replies I got.
| divbzero wrote:
| IPFS would face a similar challenge as the "keep torrents
| seeded" problem mentioned by GP. Wouldn't there be risk to
| peers who host the PDFs?
| Natsu wrote:
| I sort of feel like there should be some way to use some
| kind of construct to get people to seed things so that
| others seed things for them, but I haven't seen that
| invented yet.
| miloignis wrote:
| Been a while since I've looked at them, but IPFS with
| FileCoin and Ethereum Swarm had that kind of goal.
|
| It might be beneficial to create something like what you
| describe without any cryptocurrency association though,
| and I've been mulling over possibilities for distributed
| systems that are inherently currency-less to avoid all of
| the scams that cryptocurrency attracts.
| zolland wrote:
| I think seed ratios and seed time (mostly used by private
| trackers) attempt to solve this problem.
| zolland wrote:
| What kind of risk?
| dredmorbius wrote:
| What documents (books, scientific articles) benefit from
| specifically is a number of highly consistent, highly
| accurate identifiers: DOI (scientific articles), ISBN
| (published books), and others (OCLC identifier, Library of
| Congress Catalogue Number, etc.)
|
| With the addition of hashsums (even MD5 and SHA1, though
| longer and more robust hashsums are preferred), a pretty
| reliable archive of content can be made. It's a curious case
| where increased legibility seems to be breaking rather than
| creating a gatekeeper monopoly.
|
| I've been interested in the notion of more reliable
| content-based identifiers or fingerprints themselves, though I've
| found little reliable reference on this. Ngram tuples of 4-5
| words are often sufficient to identify a work, particularly
| if a selection of several are made. Agreeing on _which_
| tuples to use, how many, and how to account for potential
| noise / variations (special characters, whitespace variance,
| OCR inaccuracy) is also a stumbling point.
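|
| A small sketch of both kinds of identifier, under assumed choices
| that nobody has agreed on: SHA-256 for exact copies, and 5-word
| tuples over lowercased ASCII alphanumeric words for content
| matching. The normalization here is deliberately crude and would
| not cope with OCR errors or non-ASCII text.

```python
import hashlib
import re


def file_digest(data):
    """SHA-256 hex digest of the raw bytes, for exact-copy matching."""
    return hashlib.sha256(data).hexdigest()


def normalize_words(text):
    """Lowercase and keep only ASCII alphanumeric runs.

    This drops punctuation and collapses whitespace differences.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def shingles(text, n=5):
    """Return the set of n-word tuples found in the text."""
    words = normalize_words(text)
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shingle_overlap(a, b, n=5):
    """Jaccard similarity of the two texts' tuple sets (0.0 to 1.0)."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    original = "The quick brown fox jumps over the lazy dog."
    rescanned = "the  quick brown fox, jumps over the lazy dog"
    print(file_digest(original.encode()) == file_digest(rescanned.encode()))
    print(shingle_overlap(original, rescanned))
```

| The two strings hash differently but share every 5-word tuple,
| which is the gap a content-based fingerprint is meant to close.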
| posterboy wrote:
| a plain old website or a publishing house with distribution
| services and syndication attached, but for a sane price.
|
| "a whole lot of tiny files" severely underestimates the scale
| at work. Libgen's coverage is relatively shallow, and pdf
| books tend to be huge, at least for older material. Scihub
| piggy backs on the publishers, so that's your reference.
|
| _syndication_ , _syndicate_ , quite apt don't you think?
| Libraries that colluded with the publishers and accepted the
| pricing must have been a huge part of the problem, at least
| historically. Now you know there's only one way out of a
| mafia.
| jandrese wrote:
| In Internet scale it's not a lot of data. Most people who
| think they have big data don't.
|
| Estimates I've seen put the total Scihub cache at 85
| million articles totaling 77TB. That's a single 2U server
| with room to spare. The hardest part is indexing and
| search, but it's a pretty small search space by Internet
| standards.
| HWR_14 wrote:
| It still amazes me that 77TB is considered "small". Isn't
| that still in the $500-$1,000 range of non-redundant
| storage? Or if hosted on AWS, isn't that almost $1,900 a
| month if no one accesses it?
|
| I know it's not Big Data(tm) big data, but it is a lot of
| data for something that can generate no revenue.
| pbhjpbhj wrote:
| I'm prepared to accept " _does_ generate no revenue " but
| " _can_ generate no revenue " ...?
|
| Perhaps some sort of MTurk or captcha-like tasks per
| access? Patr[e]ons? Donation drives? Micro-payments?
| Something else??
| HWR_14 wrote:
| Oh, it _could_ generate revenue if it was legal. But it
| is not, so it seems difficult.
| smichel17 wrote:
| > Isn't that still in the $500-$1,000 range of
| non-redundant storage?
|
| Sure. Let's add redundancy and bump by an order of
| magnitude to give some headroom -- $5-10k is a totally
| reasonable amount to fundraise for this sort of
| application. If it were legal, I'm sure any number of
| universities would happily shoulder that cost. It's
| minuscule compared to what they're paying Elsevier each
| year.
| HWR_14 wrote:
| Sorry. My point was it was a lot of money precisely
| because it cannot legally exist. If it could collect
| donations via a commercial payment processor, it could
| raise that much money from end users easily. Or grants
| from institutions. But in this case it seems like it has
| to be self-funded.
| dredmorbius wrote:
| For an institution, it's a rounding error.
|
| AWS is not the cheapest bulk-storage hosting possible.
| dredmorbius wrote:
| The entire Library of Congress books collection is on the
| order of 40 million items.
|
| At 5 MB per book, this works out to about 200 TB of disk
| storage.
|
| At about $12/TB, hosting the entire LoC collection would
| cost roughly $2,400 presently, with prices halving about
| every three years.
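|
| The arithmetic above, written out as a sketch. All inputs are the
| figures quoted in this comment (40 million items, 5 MB each,
| $12/TB, halving every three years); decimal units are assumed,
| and redundancy, hardware and bandwidth are not counted.

```python
ITEMS = 40_000_000     # items in the collection, as quoted
MB_PER_ITEM = 5        # assumed average size per book, as quoted
USD_PER_TB = 12        # raw disk price, as quoted
HALVING_YEARS = 3      # price halving period, as quoted


def total_tb(items=ITEMS, mb_per_item=MB_PER_ITEM):
    """Total size in decimal terabytes (1 TB = 1,000,000 MB)."""
    return items * mb_per_item / 1_000_000


def disk_cost_usd(tb, years_from_now=0.0):
    """Raw disk cost, with the price halving every HALVING_YEARS."""
    return tb * USD_PER_TB * 0.5 ** (years_from_now / HALVING_YEARS)


if __name__ == "__main__":
    size = total_tb()
    print(f"{size:.0f} TB, ${disk_cost_usd(size):,.0f} today")
    print(f"${disk_cost_usd(size, 3):,.0f} in three years")
    print(f"${disk_cost_usd(77):,.0f} for a 77 TB archive")
```

| At the same price, the 77 TB figure mentioned upthread comes to
| $924 of raw disk.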
| andyxor wrote:
| The entire archive actually fits in a small desktop NAS
| (e.g. QNAP or Synology) with a few 14-18TB drives, you
| don't even need a server rack.
|
| There is an existing index in SQL format distributed by libgen:
| https://www.reddit.com/r/scihub/comments/nh5dbu/a_brief_intr...
| (around 30GB uncompressed).
|
| Those 851 torrents uncompressed would probably take half
| a petabyte of storage, but I guess for serving pdfs you
| could extract individual files on demand from zip archive
| and (optionally) cache. So the scihub "mirror" could run
| on a workstation or even laptop with 32-64GB memory
| connected to 100TB NAS over 1GBE, serving pdfs over VPN
| and using unlimited traffic plan. The whole setup
| including workstation, NAS and drives would cost $5-7K.
|
| it's not a very difficult project and can be done DIY
| style, if you exclude the proxy part (which downloads
| papers using donated credentials). Of course it would
| still be as risky as running Scihub itself which has $15M
| lawsuit pending against it.
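|
| A sketch of the "extract on demand and cache" part using only the
| Python standard library. The archive path and member name are
| hypothetical; how the real archives name their members is not
| something this assumes. Only the requested member is decompressed,
| and an in-memory LRU cache serves repeat requests.

```python
import functools
import zipfile


@functools.lru_cache(maxsize=128)
def read_pdf(archive_path, member_name):
    """Return the bytes of one member of a zip archive.

    The zip central directory lets us seek straight to the member,
    so the rest of the archive is never decompressed. Results are
    cached, keyed on (archive_path, member_name).
    """
    with zipfile.ZipFile(archive_path) as archive:
        return archive.read(member_name)
```

| With maxsize=128 the cache holds at most 128 PDFs; a disk-backed
| cache would be the obvious next step for anything long-running.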
| whimsicalism wrote:
| Libgen's coverage is definitely more shallow than scihub,
| but it is still pretty good.
| einpoklum wrote:
| If the sane price is an optional "Donate to keep this site
| going" link, then ok. But only free access, without
| authentication or payment, to scientific papers, is sane.
| IMHO.
| munk-a wrote:
| Might this be a case where the best resolution would be
| to have the government (which is at least partially
| funding nearly all of these papers) step in and add a
| ledger of papers as a proof of investment?
|
| The cost of maintaining a free and open DB of scientific
| advances and publications would be so incredibly
| insignificant compared to both the value and the
| continued investment in those advancements.
| einpoklum wrote:
| Well, some research venues (and publication venues) are
| not government-funded, and even if they are indirectly
| government funded, it's more of a sophistry than
| something which would make publishers hand over copies of
| the papers.
|
| Also, a per-government ledger would not be
| super-practicable. But if, say, the US, the EU and China would
| agree on something like this, and implement it, and have
| a common ledger, then it would not be such a big leap to
| make it properly international. Maybe even UN-based.
|
| That's a pretty big "if" though.
| andyxor wrote:
| IPFS seems like a perfect fit for this and some of the scihub
| torrents are already in IPFS, but it's not an anonymous
| network.
|
| IPFS via the DHT tells the network your whole network
| topology, including any internal addresses you may have, and VPN
| endpoints too. It's all public by design; per one of their
| developers, they don't want IPFS associated with piracy.
|
| this thread has some discussions on the alternatives
| https://www.reddit.com/r/DataHoarder/comments/nc27fv/rescue_...
| vvll wrote:
| Can files be taken down off ipfs? There was a fairly widely
| circulated link that had all the IEC and ANSI standards on
| there that has since been taken down.
| whimsicalism wrote:
| Isn't scihub already on IPFS?
| andyxor wrote:
| some torrent files are archived there, but i don't think
| scihub is serving the pdfs from IPFS, they likely use
| private storage network.
|
| I believe libgen.fun which is a new (official) libgen
| mirror is running fully on ipfs, and it serves some
| scientific papers, but I wasn't able to search by DOI or
| title there, looks like it redirects to scihub, also there
| is no index of the papers on IPFS.
|
| Edit: this doc talks about scihub+ipfs (it was created by
| the leader of Scihub rescue effort on Reddit, /u/shrine):
| https://freeread.org/ipfs/
| kodablah wrote:
| You could use libp2p's DHT over Tor (I did a poc of this long
| ago, and the situation's only improved). Combined with other
| libp2p/IPFS components, you can essentially have a private
| IPFS over onion services (not to be confused with accessing
| the existing IPFS network via Tor exit nodes).
| Gormisdomai wrote:
| This is a useful note on using PACs to set up proxies for just
| one site:
|
| > Incidentally, you do not need to be running a web server to use
| the .pac file. You can access it via a file:// type URL. For
| example (note the 3 slashes):
| file:///Users/username/Library/proxy.pac
|
| http://hints.macworld.com/article.php?story=2004010109555326...
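|
| A sketch of a single-site PAC file, generated from Python. A PAC
| file is a JavaScript function FindProxyForURL(url, host) that
| returns "PROXY host:port" or "DIRECT". The proxy address below is
| a placeholder, and the domain list is just an example taken from
| this thread.

```python
import json

# Doubled braces are literal braces in str.format templates.
PAC_TEMPLATE = """function FindProxyForURL(url, host) {{
  var proxied = {domains};
  for (var i = 0; i < proxied.length; i++) {{
    if (host === proxied[i] || dnsDomainIs(host, "." + proxied[i])) {{
      return "PROXY {proxy}";
    }}
  }}
  return "DIRECT";
}}
"""


def make_pac(domains, proxy):
    """Return PAC text that proxies only the given domains.

    json.dumps produces a valid JavaScript array literal of strings.
    """
    return PAC_TEMPLATE.format(domains=json.dumps(sorted(domains)), proxy=proxy)


if __name__ == "__main__":
    # "proxy.example:8080" is a hypothetical proxy, not a real one.
    print(make_pac(["sci-hub.se"], "proxy.example:8080"))
```

| Everything not on the list goes DIRECT, so the proxy only ever
| sees traffic for the listed site.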
| Deathmax wrote:
| Note that if you are using Chromium, it will refuse to load a
| PAC file from the file:// scheme. Here's the bug tracker issue
| for the change:
| https://bugs.chromium.org/p/chromium/issues/detail?id=839566.
| londons_explore wrote:
| And you should... The proxy.pac file is in some cases reloaded
| for every single http request.
| Jcowell wrote:
| Can anyone in Europe with iOS 15 see if Private Relay is able to
| bypass this? We have a similar situation over here in the States
| with Verizon and some piracy sites, and it's able to bypass them.
| deadalus wrote:
| GreenTunnel is another alternative to evade ISP blocking without
| using a VPN:
|
| https://github.com/SadeghHayeri/GreenTunnel
| [deleted]
| lapinot wrote:
| Obviously another solution on linux is to install a local
| recursive DNS resolver and be done with it... I'm quite happy
| with knot-resolver (kresd).
| beermonster wrote:
| This only works if your ISP is using/abusing/hijacking DNS to
| censor your connections.
|
| If they're doing that you'd be better off using D-o-T or D-o-H,
| to protect your DNS from interference.
| lapinot wrote:
| ISPs rarely do anything other than DNS censoring (censoring by
| ip blackholing is for really grave stuff). Also i don't
| understand why you'd be "better off" using encrypted
| connection to a 3rd party DNS which can still lie to you.
| Just run a local resolver, it's so lightweight there's no
| real reason not to. (and honestly, the hypothetical delay
| isn't noticeable)
| NilsIRL wrote:
| Sorry, am I missing something? I'm pretty sure the whole point
| of the article is that ISPs do block more than just DNS.
| kortilla wrote:
| A 3rd party is better because it can be hosted in some
| other country not subject to local fascism du jour you have
| to deal with from your ISP.
___________________________________________________________________
(page generated 2021-06-09 23:00 UTC)