[HN Gopher] Archive.is owner on "continuity of his project"
___________________________________________________________________
Archive.is owner on "continuity of his project"
Author : spzx
Score : 170 points
Date : 2021-09-08 14:35 UTC (8 hours ago)
(HTM) web link (blog.archive.today)
(TXT) w3m dump (blog.archive.today)
| jeffalo wrote:
| Does anyone know what's up with all the different domains for
| archive.is?
|
| The blog is at blog.archive.today, it calls itself the
| "archive.is blog", but when I visit archive.is or archive.today,
| I'm brought to archive.vn. When I click the "archive.today" logo
| in the header, I'm taken to archive.ph
| benjaminikuta wrote:
| He's mentioned this before. He has multiple domains because
| they can go down sometimes.
| Seattle3503 wrote:
| Didn't realize people used it for anything other than getting
| around paywalls.
| john-doe wrote:
| That's my main use and I sometimes feel bad about it, since I
| just want to see the content and not archive some mediocre
| article for "posterity".
| indigodaddy wrote:
| Perhaps he can somehow join forces with archive.org? Maybe they
| take over when/if he is no longer able to do it?
| indigodaddy wrote:
| On a related note, I guess Tumblr is still around to use as a
| blog?
| goodrubyist wrote:
| Yes it has been around, but it's quite jarring to see anyone
| actually use it.
| rosetremiere wrote:
| About reliable email addresses: I'm using my university alumni
| "email forwarding for life", but this loses me some emails due to
| DMARC and friends. What are the alternatives?
| autarch wrote:
| Register your own domain and use that for email. Then you can
| move that domain's email service to different providers.
|
| I use Gmail right now, but I could move my domain's email to
| Fastmail or another provider without _too_ much work.
|
| Of course, it's also a good idea to back up your email archives
| too, since that _can_ go away with the loss of a provider's
| service.
| thom wrote:
| I've recently migrated four different accounts, three of them
| Gmail with their own domains, to Fastmail and I was
| astonished by how easy the process (that I'd put off for
| _years_) was. Huge weight off my mind and I've been very
| happy with the service since then.
| raxi wrote:
| Domain "ownership" is too fragile. Well, you do not lose the
| email archive, but a new domain "owner" could steal your
| identity.
| glanard_frugner wrote:
| even if we archive everything, hundreds of years from now all of
| "the world's information" could very well be unusable and
| unreadable for a variety of reasons (no one remembers how to deal
| with the file formats, EMP, bit rot). books however will continue
| to work just fine as they have for thousands of years
| derivagral wrote:
| If we're talking that long of a timescale, how long does your
| typical book these days actually last? I'm no expert, but it
| makes me wonder how long consumer paper actually lasts.
| Reasonable(?) search result below.
|
| https://www.quora.com/How-long-does-it-take-for-paper-to-dec...
| glanard_frugner wrote:
| adding paper to a compost pile will give different results
| than keeping a book stored in the proper conditions
| Shank wrote:
| Archive.is still doesn't work if you use Cloudflare DNS due to a
| spat with Cloudflare and the operator. So to me, the continuity
| and reliability is already a big question. Not only is it a
| question of sustainability economically, but also ideologically:
| what happens if another similar decision is made to lock out a
| portion of users?
| throwawaysea wrote:
| I lost faith in Cloudflare when they switched from being a
| neutral infrastructure service to yet another
| politically-motivated big tech company. In their blog post around
| the ban of 8chan
| (https://blog.cloudflare.com/terminating-service-for-8chan/),
| they acknowledged that they didn't know if 8chan
| broke any laws but they nevertheless decided to pass personal
| judgment based on a vague notion that 8chan "inspired" a
| shooting. That's quite an unprincipled way to operate a
| fundamental network utility that backs 10% of the Fortune 1000
| and 20% of the top 10000 websites.
| oh_sigh wrote:
| For reference: the spat is that Cloudflare DNS does not leak
| geographic information of the queryer through EDNS, and the
| archive.is fellow is requiring geographic information to
| provide valid DNS lookups. So he intentionally sends back bad
| results when it is 1.1.1.1 querying his nameservers.
|
| I love the site, but his stance on this doesn't really make
| sense to me, and it's a shame that millions and millions of
| people use 1.1.1.1 daily and archive.is is the one website that
| doesn't work for those people.
| slim wrote:
| Millions use 1.1.1.1? Any source for that?
| sp332 wrote:
| The Android app has 418,000 reviews and over 50 million
| installs, and the iPhone app has 230,000 reviews. The
| number of people who use it without an app is probably a
| lot higher.
| spzx wrote:
| HN discussion from 2019 on this topic:
| https://news.ycombinator.com/item?id=19828317
| [deleted]
| hansel_der wrote:
| > I love the site, but his stance on this doesn't really make
| sense to me
|
| would not be surprised if he has some personal axe to grind
| with cf (they are no sheep either).
|
| also i would be wary of overestimating market penetration of
| any 3rd-party dns provider; iirc google has total dominance
| of this segment and is still below 10%.
| silisili wrote:
| I'm not sure the term 'leak' applies. It's an anti-cdn play.
| Refusing to use EDNS correctly makes the web slower for a lot
| of people. And it adds little to nothing to privacy since the
| answer IP is going to know your IP at the next step
| anyways...
|
| As for why archive.is cares so much...that I don't know.
| Perhaps they rely on such data to give a fast experience, and
| are tired of this charade...but that's just speculation.
| magila wrote:
| Cloudflare's edge network is sufficiently dense that ECS
| data is unnecessary in almost all cases. The requesting
| data center will be close enough to the client that doing
| geoip on the source IP will have the same results as using
| ECS.
|
| There's nothing incorrect about what Cloudflare is doing,
| EDNS does not require ECS data to be included in requests,
| but for whatever reason the maintainer of Archive.is
| decided to block 1.1.1.1 over it.
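
The fallback described above (use the ECS client subnet when the resolver sends one, otherwise geolocate the resolver's own address) can be sketched in a few lines. This is only an illustration under stated assumptions, not how archive.is or Cloudflare actually implement anything: the prefix table, the region names, and the function names are hypothetical, and the addresses come from the reserved documentation ranges.

```python
import ipaddress
from typing import Optional

# Hypothetical prefix-to-region table. A real operator would use a GeoIP
# database; these prefixes are reserved documentation ranges.
REGION_BY_PREFIX = {
    ipaddress.ip_network("198.51.100.0/24"): "eu",
    ipaddress.ip_network("203.0.113.0/24"): "us",
}
DEFAULT_REGION = "us"  # assumed fallback when nothing matches


def region_for(address: str) -> str:
    """Return the region of the first table prefix containing the address."""
    ip = ipaddress.ip_address(address)
    for prefix, region in REGION_BY_PREFIX.items():
        if ip in prefix:
            return region
    return DEFAULT_REGION


def choose_region(resolver_ip: str, ecs_subnet: Optional[str] = None) -> str:
    """Prefer the ECS client subnet if present, else use the resolver's IP."""
    if ecs_subnet is not None:
        network = ipaddress.ip_network(ecs_subnet, strict=False)
        return region_for(str(network.network_address))
    return region_for(resolver_ip)
```

If the resolver sits close to the client, both branches return the same region, which is the point being made about a dense edge network; the two only diverge when the resolver is far from the user it is querying for.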
| nextaccountic wrote:
| But not sending ECS data harms non-cloudflare CDNs,
| right?
|
| edit: from here
| https://news.ycombinator.com/item?id=19828702 I gather
| that this indeed harms CDNs outside the ones that
| Cloudflare has a business relationship with.
|
| > EDNS IP subsets can be used to better geolocate
| responses for services that use DNS-based load balancing.
| However, 1.1.1.1 is delivered across Cloudflare's entire
| network that today spans 180 cities. We publish the
| geolocation information of the IPs that we query from.
| That allows any network with less density than we have to
| properly return DNS-targeted results. For a relatively
| small operator like archive.is, there would be no loss in
| geo load balancing fidelity relying on the location of
| the Cloudflare PoP in lieu of EDNS IP subnets.
|
| > We are working with the small number of networks with a
| higher network/ISP density than Cloudflare (e.g.,
| Netflix, Facebook, Google/YouTube) to come up with an
| EDNS IP Subnet alternative that gets them the information
| they need for geolocation targeting without risking user
| privacy and security. Those conversations have been
| productive and are ongoing. If archive.is has suggestions
| along these lines, we'd be happy to consider them.
| forgotmypw17 wrote:
| Cloudflare also often blocks no-JS users.
|
| All-around a huge accessibility impediment.
| deadalus wrote:
| Cloudflare is not neutral. They blocked 8chan after political
| pressure from MSM.
| MrStonedOne wrote:
| This is kinda irrelevant in a discussion about archive.is
| blocking cloudflare _DNS_ users and cloudflare _DNS_
| servers/ips.
| syshum wrote:
| CloudFlare really is an enemy of the Free (Libre) open
| internet.
|
| They are the next Google in terms of "evil companies"
| Thorentis wrote:
| What is a good alternative to Cloudflare DNS? I've been
| using 1.1.1.1 since it was launched, but now this makes me
| want to switch to something better.
| forgotmypw17 wrote:
| Why not your ISP?
| kook_throwaway wrote:
| Was MitMing half the web the giveaway?
| spzx wrote:
| Also some interesting numbers in this post:
|
| >How much does hosting cost you per month at the moment?
|
| >about ~$2600/mo of pure expenses on servers/domains, not
| counting "work time", "buying laptop/furniture", etc.
| ($100...300/mo covered by donations + $300...500 by ads)
|
| https://blog.archive.today/post/659383959382294528/you-said-...
| john_moscow wrote:
| As someone who managed to get more or less donation campaigns
| in the past, here are my 2 cents.
|
| There's a huge difference between "please donate to our
| project" and "it costs us $X/month to run it, we have Y users
| and managed to collect $Z so far, please donate".
|
| The first one will get you about $1 per 10K-1M users. The
| second one will get your goal fulfilled, as long as you are
| reasonable, and have enough users. All it takes is a noticeable
| message and a way to update it automatically based on the money
| received.
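
A message of the second kind is easy to generate from three numbers. The sketch below is one possible way to do it, not anything the site actually uses; the function name and wording are made up, and the user count in the usage note is a placeholder (the thread only gives the roughly $2600/mo cost and the $100...300/mo donation range).

```python
def donation_banner(monthly_cost: float, users: int, raised: float) -> str:
    """Build a banner stating the cost, the audience, and the progress so far."""
    remaining = max(monthly_cost - raised, 0.0)
    # Cap at 100% so an over-funded month does not read oddly.
    percent = min(raised / monthly_cost * 100, 100.0) if monthly_cost > 0 else 100.0
    return (
        f"It costs us ${monthly_cost:,.0f}/month to run this site "
        f"for {users:,} users. We have collected ${raised:,.0f} so far "
        f"({percent:.0f}% of the goal); ${remaining:,.0f} to go. Please donate."
    )
```

Calling `donation_banner(2600, 100000, 400)` produces a sentence that names the $2,600 monthly cost, the $400 collected, and the $2,200 still missing. Updating it automatically only requires feeding in the current total from whatever payment processor is in use.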
| dmix wrote:
| I see the answer to this question more around backups and helping
| future people overcome technical limitations (knowledge transfer
| + data archiving).
|
| All things are ephemeral after a certain point but archiving
| typically lasts much much longer than the human operators.
| Likewise documenting the process and barriers to overcome will
| help people in the future solve the problem (and a broader amount
| of people).
|
| This doesn't _have_ to be public, just needs a way to become
| public.
| growt wrote:
| He had better also have built up some savings for legal defense,
| because he could easily be sued into oblivion, especially in Europe.
| ignoramous wrote:
| archive.is is my browser to text-heavy websites like blogs, news,
| twitter, and documentation (outline.com is another one, reader
| mode, yet another). It completely debloats a webpage as it
| archives (unlike web.archive.org, say). Suits my purposes just
| fine. I must note, though, that archive.is (from what I recall)
| forwards the IP address of whoever initiated an archive request
| to the origin.
|
| archive.is was also a great mirror to instagram and linkedin for
| public profiles, but it doesn't archive instagram anymore.
| spzx wrote:
| Here's his quote on instagram archiving problems:
|
| "There is no Instagram content which don't need to login.
|
| If you can access the page without login, it is sort of "promo
| preview", after few pages accessed this way, they add your IP
| into "promo is over" list and will redirect to /login on every
| future request.
|
| I just have not enough fresh IPs to abuse this mechanism."
|
| https://blog.archive.today/post/659927354404192256/instagram...
| kixiQu wrote:
| This is a very fair answer, and all "permalinks" are lies. At the
| same time, I wonder if it might not be possible to have snapshots
| up on, I dunno, IPFS or torrent sites or something such that when
| the unavoidable happens, not all is lost.
| dannyw wrote:
| nothing is permanent in life. heaps of torrents i want have 0
| seeders.
|
| ipfs's economically incentivised hosting layer (filecoin)
| offers storage in monthly increments, not perpetual.
| hansel_der wrote:
| > nothing is permanent in life
|
| practically yes, but
|
| https://github.com/philipl/pifs
| kordlessagain wrote:
| An image of a Web asset is useful for archival purposes and may
| be indexed using text recognition ML.
| wizzwizz4 wrote:
| Images take up _way_ more space than HTML documents...
| actually, what with how bloated most "websites" are these
| days, that's probably not true any more.
| legrande wrote:
| > This is a very fair answer, and all "permalinks" are lies
|
| Cool URIs don't change:
| https://www.w3.org/Provider/Style/URI.html
|
| You can believe that cool URIs don't change or you could go the
| IPFS route. Similar to the way torrents have a 'health' score
| based on the number of seeders, IPFS resources could live as
| long as people want that resource to exist (not sure if that
| situation is baked into IPFS though).
|
| Also: have you looked into Filecoin?[0]
|
| [0] https://filecoin.io/
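
The "lives as long as someone wants it" idea can be illustrated with a toy model. This is a deliberately simplified sketch and not IPFS itself: real IPFS uses CIDs rather than a bare SHA-256 hex digest, and the class, method, and peer names here are invented for the example. It only shows the two properties under discussion: content is addressed by its hash, and it stays retrievable while at least one peer keeps a copy.

```python
import hashlib
from typing import Dict, Optional, Set


def content_address(data: bytes) -> str:
    """Derive the address from the content, so identical bytes share one address."""
    return hashlib.sha256(data).hexdigest()


class ToyNetwork:
    """Toy model: a blob survives only while some peer still pins it."""

    def __init__(self) -> None:
        self._pins: Dict[str, Set[str]] = {}
        self._blobs: Dict[str, bytes] = {}

    def pin(self, peer: str, data: bytes) -> str:
        address = content_address(data)
        self._pins.setdefault(address, set()).add(peer)
        self._blobs[address] = data
        return address

    def unpin(self, peer: str, address: str) -> None:
        peers = self._pins.get(address, set())
        peers.discard(peer)
        if not peers:
            # Nobody holds the content any more, so it is gone.
            self._pins.pop(address, None)
            self._blobs.pop(address, None)

    def health(self, address: str) -> int:
        """Number of peers holding the content, like a torrent's seeder count."""
        return len(self._pins.get(address, set()))

    def fetch(self, address: str) -> Optional[bytes]:
        return self._blobs.get(address)
```

In this model the address never changes, which is the "cool URI" property, but a health of zero means the same thing as a torrent with no seeders.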
| golemotron wrote:
| Yes, the site owner's response seems excessively didactic when
| a system of mirrors would solve the problem.
| ragona wrote:
| Mirrors don't solve the problem, they move it.
| dewey wrote:
| How is a mirror not solving the problem of data loss if a
| single instance goes away?
| gambler wrote:
| The problem isn't a single instance going away. The
| problem is what happens when for whatever reason the
| owner stops maintaining the project. This is a common
| problem and despite all the bluster and buzzwords, the IT
| community hasn't really found a solution. Torrents are
| the only kind-of-solution, but they are not ideal for
| something that needs to be constantly updated.
| wolrah wrote:
| > a system of mirrors would solve the problem.
|
| A system of mirrors prevents a single node going down from
| taking the whole system (in theory at least, we've all seen
| plenty of times where failover goes poorly), but it doesn't
| do anything to ensure the long term survival of the system as
| people lose interest, lose the ability to participate, and
| sometimes die.
|
| If you were around certain internet forums in the late '00s
| you might have run in to an image hosting platform called
| WaffleImages which was created in response to yet another
| popular free image hosting service locking down their
| embedding and ruining thousands of old posts. The goal was to
| distribute image hosting among community-operated mirrors,
| and it worked great for a few years. Over time though people
| lost interest while the rate of new mirrors getting added
| dropped to basically zero and eventually it fell apart.
| toomuchtodo wrote:
| Depending on the format of the archives (hopefully WARC), the
| owner could hand the entire archive over to the Internet
| Archive or Archive Team for ingestion by Wayback Machine.
|
| If the concern is perpetual access to archived content but
| under your terms, that is where the cost comes in. Somebody
| somewhere is paying for power, cooling, connectivity, and
| disks. The Internet Archive estimates it costs them $2/GB to
| host data uploaded in perpetuity. Please consider donating if
| you're uploading content for permanent archival and/or deriving
| value from hosted content.
| benjaminikuta wrote:
| The Internet Archive isn't perfect either. They can be
| pressured to remove pages.
| SllX wrote:
| Pretty much. If you really want to maintain access to old
| stuff in perpetuity, you have to pay for it yourself by
| either storing it on your own equipment or paying someone to
| store it on theirs.
| kixiQu wrote:
| Sure -- but at least with something p2p hooked up to it I
| can let someone else benefit from my desire to assure my
| own access.
| fouc wrote:
| Archive.is is an incredible service, but the fact he's paying out
| $3k-4k/mo out of pocket in expenses & time doesn't strike me as
| sustainable for the long term.
|
| I'm reminded of some non-profit organization that was forced to
| shut down their websites because they ran out of money. In
| retrospect they could've set up a trust fund early on, stuck all
| their money in there, and then had a perpetual annual income from
| that for operating costs, instead of spending down. All
| fundraising could've gone into the trust fund in order to boost
| the annual budget, etc.
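
For a sense of scale, the size of such a fund follows from simple arithmetic: annual operating cost divided by the fraction of the fund that can be withdrawn each year. The sketch below assumes a 4% annual withdrawal rate, which is purely an illustrative assumption and not a figure from the thread or financial advice; the $2600/mo input is the server cost quoted from the blog post.

```python
def required_endowment(monthly_cost: float, annual_withdrawal_rate: float) -> float:
    """Principal needed so yearly withdrawals cover twelve months of costs."""
    if annual_withdrawal_rate <= 0:
        raise ValueError("annual_withdrawal_rate must be positive")
    return monthly_cost * 12 / annual_withdrawal_rate


# Assumed 4% withdrawal rate; $2600/mo is the figure quoted in the thread.
print(f"${required_endowment(2600, 0.04):,.0f}")  # prints $780,000
```

Under that assumption, covering servers and domains alone would take a fund of roughly $780,000, before counting any of the operator's time.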
| glofish wrote:
| There is an interesting conundrum here, when we post to the
| internet do we also consent to having that information saved for
| all eternity?
|
| Archiving everything is still a novel and not fully understood
| concept; it is not clear that it is useful or beneficial over
| the long term.
|
| Forgetting can be a bliss.
| 019a wrote:
| This is kind of an antithesis to his message in the post; that
| nothing is actually permanent, and while many people are
| concerned about continuity of service, ultimately perfect
| continuity is impossible, whether that's due to the
| organization running out of money, bad backup practices and a
| fire, global warming wiping Ashburn Virginia off the map, or
| the end of the human civilization by some other means.
|
| I've helped with a couple risk registers at tech companies. Two
| things I've never seen appear in a risk register: The company
| runs out of money. Human society is wiped out. I've been
| laughed at once for bringing up variations on these. They're
| out of scope; risks stop being a threat when there's no one
| left to care about them.
|
| I think the goal of keeping an internet-scale level of data
| accessible and searchable, for longer than one lifetime, is an
| impossible task. Maybe Archive.org/Archive.is can pull it off;
| I doubt it. It's an insane amount of data. Most of it is totally
| pointless, but it's really difficult to pick apart what's useful
| and what's useless, so you have to keep as much as possible
| without bias. All of that is on hard disks which violently spin
| around at 8 meters per second, accessed by software which we
| all know breaks every day but are too afraid to admit it, over
| a network of other computers with all the same flaws,
| distributed globally, yet can be significantly disrupted by one
| roadside construction worker and a jackhammer.
|
| The internet didn't increase the lifetime of data; it decreased
| it. Sure, we have far more of it at our fingertips than any
| other point in history, but that's not lifetime; that's just
| volume. And that volume has desensitized us; it's fundamentally
| impacting our innate biological memory capacity, and the social
| structures we form around memory. We know the Library of
| Alexandria existed because people wrote about it; the pages
| lay for thousands of years; its memory passed verbally from
| person to person.
|
| If all computers stopped functioning tomorrow, not even
| disappear, they're still there, they just don't work: Would the
| memory of Stranger Things still be known in two thousand years?
| I doubt it, but: if the only thing which offers us a satisfying
| "Yes" is "we keep the computers running, accessible, indexable,
| searchable"; that seems, at the very least, given the extreme
| challenges we as a species will be facing over the next
| century, beyond the scope of human possibility
| decasteve wrote:
| The Sun's ability to function like an Earth-wide Eprom eraser
| might cause some catastrophic disruption given our reliance
| on the Internet and computing devices. A large enough
| geomagnetic/solar storm is not unprecedented.
|
| See: https://en.wikipedia.org/wiki/Carrington_Event
| [deleted]
| rm_-rf_slash wrote:
| What blissful times we lived in before the daily drumbeat of
| "Accomplished young professional discovered to have said
| offensive things on the internet when they were a dumb
| teenager, reputation tarred and feathered for the rest of their
| life."
| SamoyedFurFluff wrote:
| we don't know if it'll be a lifelong thing though. I suspect
| with all this pushback and emotional exhaustion (and I really
| do believe it's emotionally exhausting to constantly be
| hounding over people's morality on the internet) people will
| just stop giving a shit about what someone said 5 years ago
| pretty soon.
| rm_-rf_slash wrote:
| It's a numbers game. Billions of people won't care, but a
| few dozen can make enough of a stink that a habitually risk
| averse institution would rather let someone be canceled
| than risk the controversy snowballing into something
| bigger.
| thingification wrote:
| Only if we let them.
|
| There are all kinds of feedback cycles going on. Neither
| of these two are rational:
|
| 1. There are no ways to correct the problem you describe
|
| 2. The problem will correct itself without individuals
| and institutions making it happen
| kart23 wrote:
| Yes. I think people need to understand that anything on the
| internet is, by default, there forever. Post something privately
| or behind a password if you don't want everyone seeing it.
| glofish wrote:
| A lot of people think that the current status-quo is the only
| way forward. Why? There is no reason for it.
|
| There is no practical reason why a forum could not have an
| optin-optout-delete API/protocol/etc that all search
| engines/archivers should follow.
|
| When I delete something all should follow the orders and
| those that do not need to be responsible for it.
|
| Of course there are plenty of business reasons why engines
| don't want to do it.
| ergl wrote:
| >There is no reason for it.
|
| The reason is that bits are not physical. Anyone can copy
| them and re-upload them, without any cost.
| raxi wrote:
| This is the difference between "speech" and "text".
|
| When you write a brief or a book, it's a performative act;
| you cannot pretend nothing was said.
|
| The Internet is textual media, just like books, and unlike
| television or talking in the park.
|
| If you don't like it, you can only burn books or drive
| modern technology in that direction
| hansel_der wrote:
| there are plenty of human reasons as well, as indicated by
| your usage of the word "should"
| Y_Y wrote:
| > all should follow the orders and those that do not need
| to be responsible for it
|
| Furthermore the tide will be ordered back, the contents
| returned to Pandora's box, and universal entropy decreased.
| Failure to comply will result in a fine.
| legrande wrote:
| > There is an interesting conundrum here, when we post to the
| internet do we also consent to having that information saved
| for all eternity?
|
| I'm not sure about consent, but presume it will be 'stuck' and
| un-removeable from the net once it's out there. (So be careful
| what you disseminate). Some people even go out of their way to
| make sure certain content will never be forgotten from the web.
| SkyMarshal wrote:
| I think it's one of those things modern parents are going to
| have to understand about their brave new world, and teach their
| modern children. Like look both ways before crossing the
| street, remember that everything you write on the Internet is
| permanent, so think before you write, or if you don't want to
| do that then at least write it under a pseudonym.
| glofish wrote:
| why does it have to be that way though?
|
| you are stating the current status quo as unavoidable - why
| is that?
|
| Archiving everything has a massive cost and if it were
| illegal to archive unless people consent it would be much
| harder to get away with it.
|
| When I post to a public forum I gave that forum rights to
| publish what I said, I did not give everyone else the right
| to store what I said
| SkyMarshal wrote:
| I am not a lawyer but I think it would be legally difficult
| to make it illegal to record what people say and do in
| public spaces. That was a right that the entire Western
| media depended on long before the Internet was invented.
|
| The crux may be whether websites that require you to log in
| to use them, like Facebook, are considered public or not.
| But anything you say or do in Facebook that you don't
| restrict to being viewable only by 1st degree friends is
| probably considered public.
| Lammy wrote:
| > why does it have to be that way though?
|
| That'll be -5 points for questioning the social credit
| system, Citizen.
| [deleted]
___________________________________________________________________
(page generated 2021-09-08 23:02 UTC)