[HN Gopher] Internet Archive breached again through stolen acces...
___________________________________________________________________
Internet Archive breached again through stolen access tokens
Author : vladyslavfox
Score : 293 points
Date : 2024-10-20 15:00 UTC (8 hours ago)
(HTM) web link (www.bleepingcomputer.com)
(TXT) w3m dump (www.bleepingcomputer.com)
| wkat4242 wrote:
| Ouch. Once can happen, twice in a row...
| fallingknife wrote:
| Once makes the second time more likely. Shows you are a soft
| target.
| TheFreim wrote:
| > "It's dispiriting to see that even after being made aware of
| the breach weeks ago, IA has still not done the due diligence of
| rotating many of the API keys that were exposed in their gitlab
| secrets," reads an email from the threat actor.
|
| This is quite embarrassing. One of the first things you do when
| breached at this level is to rotate your keys. I seriously hope
| that they make some systemic changes, it seems that there were a
| variety of different bad security practices.
| galleywest200 wrote:
| >"It's dispiriting to see that even after being made aware of
| the breach weeks ago..."
|
| These people are not dispirited whatsoever, if anything they
| are half-cocked that these script kiddies found an easy target.
| chrisrhoden wrote:
| The words came from a message written by the people you are
| calling script kiddies, rather than being editorializing by
| bleepingcomputer, as you seem to believe.
| compootr wrote:
| script kiddie or blackhat hacker is irrelevant. IA has shit
| security practices, and that's a fact regardless of who
| figures that out
| EasyMark wrote:
| I highly doubt they are script kiddies. More than likely they
| are state actors or mercenaries of state actors attempting to
| bring down the free transmittal of information between
| regular folks. IA evidently has not so good security and
| wikipedia must be doing pretty well I guess? I can't recall
| the last time one of these attacks worked on Wiki.
| luckylion wrote:
| Why would they publicly call them out and lay open the way
| they breached them if they were "attempting to bring down
| the free transmittal of information between regular folks"?
|
| They could have done much worse but they chose not to and
| instead made it public. Which state actor does that?
| ghostly_s wrote:
| IA is in bad need of a leadership change. The content of the
| archive is immensely valuable (largely thanks to volunteers)
| but the decisions and priorities of the org have been far off
| base for years.
| echelon wrote:
| I support archival of films, books, and music, but those
| items need to be write-only until copyright expires. The
| purpose of the Internet Archive is to achieve a wide-
| reaching, comprehensive archival, not provide easy and free
| read access to commercial works.
|
| Website caches can be handled differently, but bulk
| collection of commercial works can't have this same public
| access treatment. It's crazy to think this wouldn't be a huge
| liability.
|
| Battling for copyright changes is valiant, but orthogonal.
| And the IA by trying to do both puts its main charter--
| archival--at risk.
|
| The IA should let some other entity fight for copyright
| changes.
|
| I say this as an IA proponent and donor.
| withinboredom wrote:
| I'd agree with you if you live in a country where you can
| walk into your local library and read these for "free." For
| people who live where there may not even be a library, your
| argument makes no sense except to make the publishers
| richer. They typically price some of these books at
| "library prices" so normal people won't be able to afford
| them, but libraries will.
| giantrobot wrote:
| > I support archival of films, books, and music, but those
| items need to be write-only until copyright expires.
|
| Which means no one alive today would ever be able to see
| them out of copyright. It also requires an unfounded belief
| that major copyright owning companies won't extend
| copyright lengths beyond current lengths which are
| effectively "forever".
| fngjdflmdflg wrote:
| Do you have any examples?
| soygem wrote:
| Deleting archives of kiwifarms.
| fngjdflmdflg wrote:
| I don't believe IA itself takes down pages that kiwifarms
| archives/links to. Rather they get a request to take it
| down and comply with it (correct me if I'm wrong here). I
| think IA is actually in a tough spot on this issue
| because they might be able to be sued eg. for defamation
| if they don't take down pages with personal info after a
| request to do so is made. Lastly, I doubt any new
| leadership would be less harsh on kiwifarms.
| dazhengca wrote:
| There was no illegal content on kiwi farms. Even then,
| I'd say taking down a single page by request is
| understandable. However, they surrendered to the mob and
| chose to stop archiving the entire site. This was to
| censor any criticism of the people involved, but as a
| result, we lost all of the other information on the rest
| of the site as well. It's clear this organization cannot
| handle pressure, and is relying on people treating it
| kindly.
| shkkmo wrote:
| They chose to stop serving archives of a site that had
| started explicitly using them as a distribution mechanism
| to get around a much broader attempt to censor them.
|
| I'm curious what other information on that site you think
| was valuable to have available to the general public?
| Nothing has been lost in terms of historical data, it's
| only the immediate dissemination that has been slowed.
|
| I'm really trying to understand why I should disagree
| with the IA's choice here. The IA is an archival service,
| not a distribution platform and it is not their job to
| help you distribute content that other people find
| objectionable. Their job is to make and keep an archive
| of internet content so that we don't lose the historical
| record. Blocking unrestricted public access to some of
| that content doesn't harm that mission and can even
| support it.
| wkat4242 wrote:
| That's something I completely support. There's a limit
| and that site crosses it.
| tylerchilds wrote:
| the funny thing about the internet archive is that anyone
| else on this planet could do exactly what they are doing,
| but they consistently choose not to.
|
| kiwifarms could spin up their own infrastructure, serve
| their own content for the world, but it turns out
| technology is a social problem more than a technical
| problem.
|
| anyone that wants to stand up and be the digital backbone
| of "kiwi farms" can, but only the internet archive gets
| flak for not volunteering to be the literal kiwi farm.
|
| for example, the pirate bay goes offline all the time,
| but it turns out the people that use it, care enough to
| keep it online themselves.
| wkat4242 wrote:
| Putting the organisation at risk by playing chicken with
| large publishing corporations. Trying to stretch fair use a
| little too far so they had to go to court.
| superkuh wrote:
| It's the least worst option. Remember when that happened with
| Mozilla? Now they're an ad company. Take the bad (some bad
| mis-steps re:multiple lending during the pandemic, not
| rotating keys immediately after a hack) with the good
| (staying true to the human centric mission and not the money
| flows).
| ranger_danger wrote:
| The content of the archive is 90% mass piracy and Jason Scott
| is demonstrably complicit in encouraging users to upload
| copyrighted content without permission.
|
| Edit: Downvoting doesn't change the truth.
| tgsovlerkhgsel wrote:
| There are many "first things" you need to do if breached, and
| good luck identifying and doing them all in a timely fashion if
| you're a small organization, likely heavily relying on
| volunteers and without a formal security response team...
| trompetenaccoun wrote:
| We need archives built on decentralized storage. Don't get me
| wrong, I really like and support the work Internet Archive is
| doing, but preserving history is too important to entrust it
| solely to singular entities, which means singular points of
| failure.
| oytis wrote:
| We'll need to find even more people willing to expose
| themselves to legal threats and cyberattacks then.
| trompetenaccoun wrote:
| The legal side is a big issue, true. The simplest and best
| workaround that I'm aware of is how the Arweave network
| handles it. They leave it up to the individual what parts of
| the data they want to host, but they're financially
| incentivized to take on rare data that others aren't hosting,
| because the rarer it is the more they get rewarded. Since
| it's decentralized and globally distributed, if something is
| risky to host in one jurisdiction, people in another can take
| that job and vice versa. The data also can not be altered
| after it's uploaded, and that's verifiable through hashes and
| sampling. Main downside in its current form is that
| decentralized storage isn't as fast as having central
| servers. And the experience can vary of course, depending on
| the host you connect to.
|
| As for technical attacks, I'm not an expert but I'd assume
| it's more difficult for bad actors to bring down
| decentralized networks. Has the BitTorrent network ever gone
| offline because it was hacked for example? That seems like it
| would be extremely hard to do, not even the movie industry
| managed to take them down.
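|
| The "verifiable through hashes and sampling" idea can be sketched
| roughly as below. This is only an illustration, not Arweave's
| actual mechanism: the chunk size, the manifest layout, and the
| names chunk_hashes and spot_check are all assumptions made up for
| this example.

```python
import hashlib
import random

CHUNK_SIZE = 4  # hypothetical, tiny on purpose so the example is easy to follow


def chunk_hashes(data: bytes) -> list[str]:
    """Return the SHA-256 hex digest of every fixed-size chunk of data."""
    return [
        hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest()
        for i in range(0, len(data), CHUNK_SIZE)
    ]


def spot_check(stored: bytes, manifest: list[str], samples: int,
               rng: random.Random) -> bool:
    """Hash a random sample of chunks and compare them with the manifest."""
    indexes = rng.sample(range(len(manifest)), min(samples, len(manifest)))
    for index in indexes:
        chunk = stored[index * CHUNK_SIZE:(index + 1) * CHUNK_SIZE]
        if hashlib.sha256(chunk).hexdigest() != manifest[index]:
            return False  # this chunk was altered or is missing
    return True


original = b"archived page contents, never to be altered"
manifest = chunk_hashes(original)  # published once, at upload time
tampered = original.replace(b"never", b"NEVER")  # same length, different bytes
```

| Checking every chunk always catches a change; checking a small
| random sample only catches it with some probability, which is the
| trade-off sampling makes for speed.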
| Aachen wrote:
| > decentralized storage isn't as fast as having central
| servers.
|
| With the 30-second "time to first byte" speed we all know
| and love from IA, I'm pretty sure it'd only get faster when
| you're the only person accessing an obscure document on a
| random person's shoebox in Korea as compared to trying to
| fetch it from a centralised server that has a few thousand
| other clients to attend to simultaneously
| Aachen wrote:
| I collect, archive, and host data. Haven't gotten any threats
| or attacks. Not one. The average r/selfhosted user hiding
| their personal OwnCloud behind the DDoS mafia seems more
| afraid than one needs to be even for hosting all sorts of
| things publicly
| MattPalmer1086 wrote:
| Lots of Copies Keeps Stuff Safe
|
| https://www.lockss.org/
|
| This is a brilliant system relying on a randomised consensus
| protocol. I wanted to do my info sec dissertation on it, but
| its security model is extremely well thought out. There wasn't
| anything I felt I could add to it.
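|
| For a rough feel of the "lots of copies plus polling" idea, here
| is a deliberately simplified sketch. It is not the LOCKSS
| protocol itself: it only samples a few other peers at random,
| takes the majority digest, and repairs the local copy if it
| disagrees. The peer names and the functions poll and
| audit_and_repair are hypothetical.

```python
import hashlib
import random
from collections import Counter


def digest(content: bytes) -> str:
    """SHA-256 hex digest that a peer reports as its vote."""
    return hashlib.sha256(content).hexdigest()


def poll(peers: dict[str, bytes], asker: str, sample_size: int,
         rng: random.Random) -> str:
    """Return the digest reported by most of a random sample of other peers."""
    others = sorted(name for name in peers if name != asker)
    sampled = rng.sample(others, min(sample_size, len(others)))
    votes = Counter(digest(peers[name]) for name in sampled)
    return votes.most_common(1)[0][0]


def audit_and_repair(peers: dict[str, bytes], asker: str, sample_size: int,
                     rng: random.Random) -> bool:
    """Repair the asker's copy if it loses the poll; return True if repaired."""
    winner = poll(peers, asker, sample_size, rng)
    if digest(peers[asker]) == winner:
        return False
    # Copy the content from any peer that holds the winning version.
    donor = next(name for name in peers if digest(peers[name]) == winner)
    peers[asker] = peers[donor]
    return True


good = b"journal issue 42"
peers = {
    "lib_a": b"journal issue 4X",  # damaged copy
    "lib_b": good,
    "lib_c": good,
    "lib_d": good,
    "lib_e": good,
}
```

| Calling audit_and_repair(peers, "lib_a", 3, random.Random(1))
| restores the damaged copy from the majority; a second call finds
| nothing left to fix.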
| TZubiri wrote:
| High Costs Makes Lots of Copies Unfeasible
| MattPalmer1086 wrote:
| That was actually one of the key constraints in the LOCKSS
| system, since it was designed to be run by libraries that
| don't have big budgets.
|
| The design is really very good.
| ChadNauseam wrote:
| I wish IPFS wasn't so wasteful with respect to storage. I
| tried pinning a 200mb PDF on IPFS and doing so ended up
| taking almost a gigabyte of disk space altogether. It's also
| relatively slow. However its implementation of global
| deduplication is super cool - it means that I can host 5
| pages and you can host 50, and any overlap between them means
| we can both help one another keep them available even if we
| don't know about one another beforehand.
|
| For a large-scale archival project, it might not be ideal.
| Maybe something based on erasure coding would be better. Do
| you know how LOCKSS compares?
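|
| The deduplication point can be illustrated with a toy
| content-addressed block store. This is a sketch under simplifying
| assumptions (fixed 8-byte blocks, plain SHA-256 keys), not how
| IPFS actually chunks data or builds its identifiers; Host and
| fetch are hypothetical names.

```python
import hashlib

BLOCK_SIZE = 8  # hypothetical, tiny so the overlap is easy to see


class Host:
    """A node that stores blocks keyed by the hash of their content."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def add(self, data: bytes) -> list[str]:
        """Store data as blocks and return the ordered list of block keys."""
        manifest = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            key = hashlib.sha256(block).hexdigest()
            self.blocks[key] = block  # identical blocks collapse to one entry
            manifest.append(key)
        return manifest


def fetch(manifest: list[str], hosts: list[Host]) -> bytes:
    """Rebuild data by taking each block from whichever host has it."""
    parts = []
    for key in manifest:
        # Raises StopIteration if no host holds the block.
        parts.append(next(h.blocks[key] for h in hosts if key in h.blocks))
    return b"".join(parts)


page_a = b"HEADER__" + b"page-one"
page_b = b"HEADER__" + b"page-two"

alice, bob = Host(), Host()
manifest_a = alice.add(page_a)
manifest_b = bob.add(page_b)

# The shared header block has the same key on both hosts.
shared = set(alice.blocks) & set(bob.blocks)

# Alice loses the header block, but Bob's copy can still serve it.
del alice.blocks[manifest_a[0]]
recovered = fetch(manifest_a, [alice, bob])
```

| Neither host knew about the other's page, yet the overlapping
| block is interchangeable because its key depends only on its
| bytes.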
| diggan wrote:
| > I tried pinning a 200mb PDF on IPFS and doing so ended up
| taking almost a gigabyte of disk space altogether
|
| Was that any file in particular? I just tried it myself
| with a 257mb PDF (as reported by `ls -lrth`) and doesn't
| seem to add that much overhead:
|
|     $ du -sh ~/.ipfs
|     84K     /home/user/.ipfs
|     $ ipfs add ~/Downloads/large\ PDF\ File.pdf
|     added QmSvbEgCuRNZpkKyQm6nA5vz5RTHW1nxb6MJdR4cZUrnDj large PDF File.pdf
|     256.58 MiB / 256.58 MiB [============] 100.00%
|     $ du -sh ~/.ipfs
|     264M    /home/user/.ipfs
| Kinrany wrote:
| Is there a high level explanation of the model?
| jdiff wrote:
| This seems to get brought up at least once in the comments for
| every one of these articles that pops up.
|
| The IA has tried distributing their stores, but nowhere near
| enough people actually put their storage where their mouths
| are.
| immibis wrote:
| Keep in mind the IA archives a lot of garbage. If it could be
| more focused it would be more likely to work.
| db48x wrote:
| The attempts have actually been focused on specific types
| of content, such as historical videos.
| Blackthorn wrote:
| The IA only works because it archives everything. You don't
| know what you need until you need it.
| Spooky23 wrote:
| Archives generally purposefully don't have a strong
| editorial streak. My trash is your treasure.
| unleaded wrote:
| personally I love all the random crap on IA!
| WarOnPrivacy wrote:
| > nowhere near enough people actually put their storage where
| their mouths are.
|
| Typically because most people who have the upload, don't know
| that they can. And if they come to the notion on their own,
| they won't know how.
|
| If they put the notion to a search engine, the keywords they
| come up with probably don't return the needed ELI5 page.
|
| As in: _How do I [?] for the Internet Archive?_ , most folks
| won't know what [?] needs to be.
| TZubiri wrote:
| This is literally torrents. Just give up
| briandear wrote:
| The problem with torrents is they have a bad reputation
| since people use it to steal and redistribute other
| people's content without their consent.
| card_zero wrote:
| Is there any form of torrent where you can do a full text
| search? That, to me, is the more important problem with
| torrents.
| TZubiri wrote:
| But internet archive doesn't do this? It's a key based
| search (url keys)
| AlienRobot wrote:
| Give it a good reputation then.
|
| What are some legal torrent trackers?
| unleaded wrote:
| archive.org to name one
| boomboomsubban wrote:
| That's debatable. Most of their torrents are for things
| under copyright, though any other decentralized archive
| would have the same problem.
| tourmalinetaco wrote:
| That's a copyright problem. 99% of things made in the
| last 100 years fall under copyright.
| ranger_danger wrote:
| What is your definition of a legal torrent tracker? I was
| not aware there were even any illegal ones.
| AlienRobot wrote:
| A tracker that only tracks legal torrents, e.g. free
| software, OCRemix content, etc.
| TZubiri wrote:
| How would you keep the definition of legality without a
| centralizing authority?
| boomboomsubban wrote:
| https://linuxtracker.org/
| http://www.publicdomaintorrents.info/
| https://ocremix.org/torrents
| mikae1 wrote:
| _> I was not aware there were even any illegal ones._
|
| Depends on the jurisdiction. Remember what happened in the
| Pirate Bay trial?
| seam_carver wrote:
| Humble Bundle. Various Linux iso
| ranger_danger wrote:
| To me this is like saying you shouldn't use a knife
| because they are also used by criminals.
| John_Cena wrote:
| This kind of talk is simply modern politik-speak. I can't
| stand it and the people who fall for their deception.
| Stretch the truth to disarm the constituents
| thwarted wrote:
| The problem with file transfer is they have a bad
| reputation since people use it to [insert illegal or
| immoral activity here].
|
| Then rename it from "torrent" to something else.
| TZubiri wrote:
| I'm not sure what the argumentative line is here. But
| file uploading and downloading needs to have
| accountability for hosting, which p2p obscures.
|
| The bad reputation is inherent to the tech, not a random
| quirk.
| tourmalinetaco wrote:
| Torrents have a bad reputation due to malicious
| executables, I have never met someone who genuinely saw
| piracy as stealing, only as dangerous. In fact, stealing
| as a definition cannot cover digital piracy, as stealing
| is to take something away, and to take is to possess
| something _physically_. The correct term is copying,
| because you are duplicating files. And that's not even
| getting into the cultural protection piracy affords in
| today's DRM and license-filled world.
| WarOnPrivacy wrote:
| > This is literally torrents. Just give up
|
| Most casual visitors to IA don't know that. Which is the
| point.
|
| Giving up is for others.
| creer wrote:
| And it's guaranteed not to happen if the efforts don't
| continue.
| acdha wrote:
| You could say the same thing about perpetual motion. Being
| realistic about why past efforts have failed is key to
| doing better in the future: for example, people won't
| mirror content which could get them in trouble and most
| people want to feel some kind of benefit or thanks. People
| should be thinking about how to change dynamics like those
| rather than burning out volunteers trying more ideas which
| don't change the underlying game.
| zelphirkalt wrote:
| Perhaps one idea is to let people choose what they want to
| protect. This way people wanting to support it can have their
| mission.
| card_zero wrote:
| I want it to protect all sorts of random obscure documents,
| mostly kind of crappy, that I can't predict in advance, so
| I can pursue my hobby of answering random obscure
| questions. For instance:
|
| * What is a "bird famine", and did one happen in 1880?
|
| * Did any astrologer ever claim that the constellations
| "remember" the areas of the sky, and hence zodiac signs,
| that they belonged to in ancient times before precession
| shifted them around?
|
| * Who first said "psychology is pulling habits out of
| rats", and in what context? (That one's on Wikiquote now,
| but only because I put it there after research on IA.)
|
| Or consider the recently rediscovered Bram Stoker short
| story. That was found in an actual library, but only
| because the library kept copies of old Irish newspapers
| instead of lining cupboards with them.
|
| The necessary documents to answer highly specific questions
| are very boring, and nobody has any reason to like them.
| dawnerd wrote:
| You already can, they have torrents for everything.
| diggan wrote:
| > they have torrents for everything
|
| Including the index itself? That would be awesome.
| tourmalinetaco wrote:
| Their torrents suck and IME don't update to changes in
| the archive.
| sksxihve wrote:
| There's no real financial incentive for people to archive the
| data as a singular entity so even less for a distributed
| collection. Also it's probably easier to fund a single entity
| sufficiently so they can have security/code audits than a bunch
| of entities all trying to work together.
| riiii wrote:
| Some people are motivated by more than just financial
| incentive.
| sksxihve wrote:
| That's true, but something like archiving the internet is
| very costly, IA has an annual budget in the tens of
| millions.
| trompetenaccoun wrote:
| Yes, it's a good point. Though they could take that money
| and reward people for hosting the data as well, couldn't
| they? They don't have to be in charge of hosting.
| sksxihve wrote:
| Yes, they could, that's not much different than a single
| company distributing the archive to multiple storage
| centers though. My original comment was about it being
| more cost effective for a single company to do that than
| coordinating with a bunch of disjoint entities.
| trompetenaccoun wrote:
| Our digital memory shouldn't be in the hands of a small
| number of organizations in my view. You're right about
| cost effectiveness. There are pros and cons to both but
| it's not just external threats that have to be
| considered.
|
| History has always gotten rewritten throughout time. If
| you have a giant library it's easier for bad actors to
| gain influence and alter certain books, or remove them.
| This isn't just theoretical, under external pressure IA
| has already removed sites from its archive for copyright
| and political reasons.
|
| There are also threats that are generally not even
| considered because they happen with rare frequency, but
| when they happen they're devastating. The library of
| Alexandria was burned by Julius Caesar during a war.
| Likewise, if all your servers are in one country, that is a
| geographic risk: they can get destroyed in the event of a
| war or such. No one expects this to happen today in the
| US, but archives should be robust long term, for decades,
| ideally even centuries.
| delfinom wrote:
| >Our digital memory shouldn't be in the hands of a small
| number of organizations in my view.
|
| I would wager at least 95% of "digital memory" archived
| is just absolute garbage from SEO spam to just some small
| websites holding no actual value.
|
| The true digital memory of the world is almost entirely
| behind the walls of reddit, twitter, facebook, and very
| few other sites. The internet landscape has changed
| massively from the 90s and 2000s.
| BlueTemplar wrote:
| So, about $0.01 per person per year ?
|
| We _are_ talking about an (almost) worldwide archive
| after all.
| __MatrixMan__ wrote:
| To make the web distributed-archive-friendly I think we need to
| start referencing things by hash and not by a path which some
| server has implied it will serve consistently but which
| actually shows you different data at different times for a
| million different reasons.
|
| If different data always gets a different reference, it's easy
| to know if you have enough backups of it. If the same name gets
| you a pile of snapshots taken under different conditions, it's
| hard to be sure which of those are the thing that we'd want to
| back up for that particular name.
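|
| A small sketch of why hash references make backup counting
| easier than name references. The URL, the snapshot bodies, and
| the helper names content_ref and under_replicated are invented
| for illustration only.

```python
import hashlib
from collections import Counter


def content_ref(data: bytes) -> str:
    """Reference derived from the bytes themselves, not from where they live."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Backups of one URL captured at different times: same name, changing bytes.
VERSION_1 = b"<h1>Version 1</h1>"
VERSION_2 = b"<h1>Version 2</h1>"
snapshots = [
    ("https://example.org/page", VERSION_1),
    ("https://example.org/page", VERSION_2),
    ("https://example.org/page", VERSION_1),
]

# Counting by name lumps different content together.
copies_by_name = Counter(url for url, _body in snapshots)

# Counting by hash shows how many backups each distinct version has.
copies_by_hash = Counter(content_ref(body) for _url, body in snapshots)


def under_replicated(counts: Counter, minimum: int) -> list[str]:
    """Return the references that have fewer than `minimum` copies."""
    return sorted(ref for ref, n in counts.items() if n < minimum)
```

| By name, the page looks safely backed up three times; by hash,
| it becomes visible that the second version exists in only one
| copy.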
| Cheer2171 wrote:
| Done. It is called IPFS. The IA already supports it.
|
| https://github.com/internetarchive/dweb-archive/blob/master/...
| __MatrixMan__ wrote:
| Right, what I'm saying is that now we need to get the rest
| of the web (or at least the parts we want to keep) on
| board.
| majorchord wrote:
| IPFS has shown that the protocol is fundamentally broken at
| the level of growth they want to achieve and it is already
| extremely slow as it is. It often takes several minutes to
| locate a single file.
| diggan wrote:
| The beauty is that IA could offer their own distribution
| of IPFS that uses their own DHT for example, and they
| could allow only public read access to it. This would
| solve the slow part of finding a file, for IA
| specifically. Then the actual transfers tend to be pretty
| quick with IPFS.
|
| What's the point of using IPFS then? Others can still
| spread the file elsewhere and verify it's the correct
| one, by using the exact same ID of the file, although on
| two different networks. The beauty of content-addressing
| I guess.
| acdha wrote:
| That isn't solving the problem, it's just giving them
| more of it to work on. IA has enough material that I'd be
| surprised if they didn't hit IPFS's design limits on
| their own, and they'd likely need to change the design in
| ways which would be hard to get upstream.
| BlueTemplar wrote:
| Several minutes sounds more than fine for this purpose ?
|
| Especially if it's about having an Internet Archive
| backup.
| Aachen wrote:
| I think the point is that it's already slow at the
| current amount of data, let alone when you stuff dozens
| more PB into it
| Groxx wrote:
| Which has a rather lengthy section explaining why it's
| currently a failed experiment:
| https://github.com/internetarchive/dweb-archive/blob/master/...
|
| (this doc is 5-6 years old though, and I'm not sure what
| may have changed since then)
|
| In my own (toy-scale) IPFS experiments a couple years ago
| it has been rather usable, but also the software has been
| utterly insane for operators and users, and if I were IA I
| would only consider it if I budgeted for a from-scratch
| rewrite (of the stuff in use). Nearly uncontrollable and
| unintrospectable and high resource use for no apparent
| reason.
| TechSquidTV wrote:
| This has really shown that to be true. I am stuck in a
| situation right now where I have some lost media I want to
| upload but they have been down for over a week. I plan to
| create a torrent in the meantime but that means relying on my
| personal network connection for the vast majority of downloads
| up front. I looked into CloudFlare R2, not terrible but not
| free either.
|
| I was looking into using R2 as a web seed for the torrent but I
| don't _really_ want to spend much to upload content that is
| going to get "stolen" and reuploaded by content farms anyway
| you know?
| tourmalinetaco wrote:
| Why not subscribe to a seedbox? They're about $5/2TB/mo. It
| protects your IP, you can buy for only the month, and since
| seedboxes are hosted in DMCA-resistant data centers you can
| download riskier torrents lightning fast, meaning you're not
| _just_ spending money for others, you can get something out
| of it too.
| Cheer2171 wrote:
| You say this as if the IA is not already deeply invested in the
| DWeb movement. If you go to a DWeb event in the Bay Area, there
| is a good chance it will be held at the IA.
| sschueller wrote:
| Yes, I was quite shocked when I found out that all their DCs
| are within driving distance.
| delfinom wrote:
| Yea so, who pays for the decentralized storage long term? What
| happens when someone storing decentralized data decides to
| exit? Will data be copied to multiple places, who is going to
| pay for doubling, tripling or more the storage costs for
| backups?
|
| Centralized entities emerge to absorb costs because nobody else
| can do it as efficiently alone.
| NelsonMinar wrote:
| Is anyone using ArchiveBox regularly? It's a self-hosted
| archiving solution. Not the ambitious decentralized system I
| think this comment is thinking of but a practical way for
| someone to run an archive for themselves.
| https://archivebox.io/
| _fat_santa wrote:
| I don't know what their funding model looks like but if they have
| some cash I'd say hiring a security team would be on top of the
| list of things to invest in.
| brendoelfrendo wrote:
| I believe that, at this point in time at least, IA's funding
| model consists of sweating profusely while awaiting a colossal
| legal judgement.
| udev4096 wrote:
| Is it the same email spoofing attack vector of zendesk which was
| disclosed last week?
| steffanA wrote:
| Article says API token was stolen in original breach.
| myself248 wrote:
| I'd like to imagine a world where every lawyer, when their case
| is helped by a Wayback Machine snapshot of something, flips a few
| bucks to IA. They could afford a world-class admin team in no
| time flat.
| thaumasiotes wrote:
| That's a terrible solution. The Wayback Machine takes down
| their snapshots at the request of whoever controls the domain.
| That's not archival.
|
| If the state of a webpage in the past matters to you, you need
| a record that won't cease to exist when your opposition asks it
| to. This is the concept behind perma.cc.
| myself248 wrote:
| Ooo, excellent. Yes, hiding items is imperfect, but I
| understood that it was legally required or something. (IANAL
| and IDFK, TBH) I wonder how perma.cc gets around that.
| immibis wrote:
| Most likely by breaking the law.
| berdario wrote:
| I'm afraid that it just hasn't been tested in court yet.
|
| I haven't read this paper yet, but...
|
| https://www.tesble.com/10.1080/0270319x.2021.1886785
|
| from the abstract:
|
| > The article concludes that Perma.cc's archival use is
| neither firmly grounded in existing fair use nor library
| exemptions; that Perma.cc, its "registrar" library,
| institutional affiliates, and its contributors have some
| (at least theoretical) exposure to risk
|
| It seems that the article is about copyright, but of course
| there are several other reasons that might justify takedown
| of content stored on perma.cc:
|
| - Right to be forgotten... perma.cc might be able to ignore
| it, but could this lead to perma.cc being blocked by
| european ISPs
|
| - ITAR stuff
|
| - content published by entities recognized by $GOVERNMENT
| as terrorist organizations
|
| - revenge porn
|
| - CSAM
| db48x wrote:
| No, they don't delete the archived content. When the domain's
| robots.txt file bans spidering, then the Wayback Machine
| _hides_ the content archived at that domain. It is still
| stored and maintained, but it isn't distributed via the
| website. The content will be unhidden if the robots.txt file
| stops banning spiders, or if an appropriate request is made.
| speerer wrote:
| In some cases they do appear to delete, on request.
|
| edit: "Other types of removal requests may also be sent to
| info@archive.org. Please provide as clear an explanation as
| possible as to what you are requesting be removed for us to
| better understand your reason for making the request.",
| https://help.archive.org/help/how-do-i-request-to-remove-som...
| db48x wrote:
| Nope. Nothing is deleted, just hidden.
| rascul wrote:
| How do you know?
| db48x wrote:
| I worked there for a short while.
| bombcar wrote:
| So if the Internet Archive accidentally archived child
| porn, they wouldn't delete it?
|
| I suspect they DO delete some things.
| Raed667 wrote:
| They do delete entire domains from the archive upon request
| & proof of ownership.
| db48x wrote:
| Again, no they don't. They just hide them.
| speerer wrote:
| That's correct, but only for present evidence - what about
| the past evidence, that you didn't know you needed until it
| was too late? IA is broad enough to cover the past five times
| out of ten.
| badlibrarian wrote:
| Restating my love for Internet Archive and my plea to put a
| grownup in charge of the thing.
|
| Washington Post: The organization has "industry standard"
| security systems, Kahle said, but he added that, until this year,
| the group had largely stayed out of the crosshairs of
| cybercriminals. Kahle said he'd opted not to prioritize
| additional investments in cybersecurity out of the Internet
| Archive's limited budget of around $20 million to $30 million a
| year.
|
| https://archive.ph/XzmN2
| semicolon_storm wrote:
| In security, industry standard seems to be about the same as
| military grade: the cheapest possible option that still checks
| all the boxes for SOC.
| incahoots wrote:
| Basically, whatever the liability insurance wants for you to
| be in compliance, then that's the standard.
| Spivak wrote:
| Hot take, this is the way it should be. If you want better
| security then you update the requirements to get your
| certification.
|
| Security by its very nature has a problem of knowing when to
| stop. There's always better security for an ever increasing
| amount of money and companies don't sign off on budgets of
| infinity dollars and projects of indefinite length. If you
| want security _at all_ you have to bound the cost and have
| well-defined stopping points.
|
| And since 5 security experts in a room will have 10 different
| opinions on what those stopping points should be-- what
| constitutes "good-enough" they only become meaningful when
| there's industry wide agreement on them.
| db48x wrote:
| Yep. And worse, no matter how much you pay for security it
| is still possible for someone to make a mistake and publish
| a credential somewhere public.
| gjsman-1000 wrote:
| This ^
|
| We can't all have the latest EPYC processors with the
| latest bug fixes using Secure Enclaves and homomorphic
| encryption for processing user data while using remote
| attestation of code running within multiple layers of
| virtualization. With, of course, that code also being
| written in Rust, running on a certified microkernel, and
| only updatable when at least 4 of 6 programmers, 1 from
| each continent, unite their signing keys stored on HSMs to
| sign the next release. All of that code is open source, by
| the way, and has a ratio of 10 auditors per programmer with
| 100% code coverage and 0 external dependencies.
|
| Then watch as a kid fakes a subpoena using a hacked police
| account and your lawyers, who receive dozens every day,
| fall for it.
| gjsman-1000 wrote:
| Hilariously, I've been downvoted to -2 by butthurt
| security experts without a counter-argument.
| evilduck wrote:
| No, it's your demeanor that is unbecoming and not worth
| engaging with. Villainizing your poor behavior not
| successfully baiting people into replying as you want is
| childish too. Take a breather.
| abadpoli wrote:
| There never will be an adequate industry-wide
| certification. There is no universal "good enough" or "when
| to stop" for security. What constitutes "good enough" is
| entirely dependent on what you are protecting and who you
| are protecting it from, which changes from system to system
| and changes from day to day.
|
| The budget that it takes to protect against a script kiddy
| is a tiny fraction of the budget it takes to protect from a
| professional hacker group, which is a fraction of what it
| takes to protect from nation state-funded trolls. You can
| correctly decide that your security is "good enough" one
| day, but all it takes is a single random news story or
| internet comment to put a target on your back from someone
| more powerful, and suddenly that "good enough" isn't good
| enough anymore.
|
| The Internet Archive might have been making the correct
| decision all this time to invest in things that further its
| mission rather than burning extra money on security, and it
| seems their security for a long time was "good enough"...
| until it wasn't.
| goodpoint wrote:
| > since 5 security experts in a room will have 10 different
| opinions
|
| If that happens you need to seriously rethink your hiring
| process.
| EasyMark wrote:
| Military grade has different meanings. I've worked in the
| electronics industry a long time and will say with confidence
| that the pcbs and chips we sent to the military were our
| best. Higher temperature ranges, much more thorough
| environmental testing, many more thermal and humidity cycles,
| lots more vibration testing. However we also sell them for
| 5-10x our regular prices but in much lower quantities. It's a
| failed meme in many instances as the internet uses it though.
| pessimizer wrote:
| The Internet Archive has a management problem. They seem to be
| more comfortable _disrupting libraries_ than managing an online,
| publicly accessible database of disputed, disorganized material.
|
| Despite all of the positive self-talk, I don't know if they
| realize how important they are, or how easy it would be for them
| to find good help and advice if their management were transparent
| and everything was debated in public. That may have protected it
| to some extent; as a counterexample, Wikipedia has been extremely
| fragile due to its transparency and accessibility to everyone.
| With IA being driven by its creator's ideology, maybe that
| ideology should be formalized and set in stone as bylaws, and the
| torch passed to people openly debating how IA should be run, its
| operations, and what it should be taking on.
|
| I don't mean they should be run by the random set of Confucian-
| style libertarian aphorisms that is running the credibility of
| Wikipedia into the ground, but Debian is a good model to follow.
| Or maybe do better than both?
| avazhi wrote:
| https://www.wired.com/story/internet-archive-memory-wayback-...
|
| I appreciate their ethos and I've used the site many times (and
| donated!), but clearly it's at the point where Kahle et al just
| aren't equipped either personally (as a matter of technical
| expertise) or collectively (they are just a handful of people)
| to be dealing with what are probably in many cases nation-state
| attacks. Kahle's attitude towards (and misunderstanding of)
| copyright law is IMO proof that he shouldn't be running things,
| because his legal gambles (gambles that a first year law
| student could have predicted would fail spectacularly) have put
| IA at long term risk (see: Napster). And this information
| coming out over the past few weeks about their technical
| incompetence is arguably worse, because the tech side of things
| is what he and his team are actually supposed to be good at.
|
| It's true that Google and Microsoft and others should be
| propping up the IA financially but that isn't going to solve
| the IA's lack of technical expertise or its delusional hippie
| ethos.
| badlibrarian wrote:
| Don't forget the time Brewster tried to run a bank -- Internet
| Archive Federal Credit Union. Or that the physical archives are
| stored on an active fault line and unlikely to receive prompt
| support during an emergency. Or that, when someone told him
| that archives are often stored in salt mines he replied, "cool,
| where can I buy one?"
| mrweasel wrote:
| > Debian is a good model to follow.
|
| While I have no idea how Debian is actually funded I'd agree.
| One issue might be that The Internet Archive actually needs to
| have people on staff, not sure if Debian has that requirement.
| You're not going to get people to man scanners or VHS players 8
| hours a day without pay, at least not at this scale.
|
| The Internet Archive needs a better funding strategy than
| asking for money on their own site. People aren't visiting them
| frequently enough for that to work. They need a fundraising
| team, and a good one.
|
| Finding managers is probably even worse. They can't get a
| normal CEO type person, because they aren't a company and the
| type of people who apply to or are attracted to running non-
| profit, serve-the-community, don't-be-evil organisations are
| frequently bat-shit crazy.
| kmeisthax wrote:
| > Confucian-style libertarian aphorisms that is running the
| credibility of Wikipedia
|
| Can you elaborate? I'm aware of Wikipedia having very
| particular rules and lots of very territorial editors, but I'm
| not sure how this runs their credibility into the ground aside
| from pissing off the far right when they come in with an agenda
| to push.
| notmysql_ wrote:
| I sent them a resume almost a year ago, and got nothing back in
| response until yesterday. Looks like they are going through their
| backlog right now to find more hands.
| TZubiri wrote:
| Interesting, for a security position?
| notmysql_ wrote:
| It was a while ago, I think it was for their general position
| option, though I did talk about sec experience in it
| sirolimus wrote:
| It's incredibly sad to see threat actors attack something as
| altruistic as an internet library. Truly demoralizing to see such
| degeneracy.
| codezero wrote:
| There are many state actors that attack targets of opportunity
| just to cause chaos and asymmetric financial costs.
| croes wrote:
| Seems like the actor did it only for the street credit and the
| second breach is only a reminder that IA didn't properly fix
| it after the first breach.
|
| Could be worse.
| userbinator wrote:
| When there are plenty of people who are steeped in the dogma of
| Imaginary Property, and whose lives depend on it, it's not too
| surprising.
| xyst wrote:
| Blame bad leadership.
| callc wrote:
| Is there a reason to blame the victim, rather than the
| attackers?
|
| I'm asking seriously - did IA do shitty things that make them
| a worthy cause for politically/ideologically motivated
| hacking?
| lolinder wrote:
| I imagine they're referring to the fact that the leadership
| showed extremely bad judgement in deciding to pick a battle
| with the major publishing companies that _everyone_ knew
| they would lose before it even began [0].
|
| I don't think that justifies blaming the victim here, and
| from what I can see the attacker doesn't seem to be
| motivated by anything other than funsies, but I absolutely
| lost a lot of faith in their leadership when they pulled
| the NEL nonsense. The IA is too valuable for them to act
| like a young activist org--there's too much for us to lose
| at this point. They need to hold the ground they've won and
| leave the activism to other organizations.
|
| [0] https://www.wired.com/story/internet-archive-loses-
| hachette-...
| jampekka wrote:
| > there's too much for us to lose at this point
|
| Feeling entitled?
| luckylion wrote:
| A different framing is: be grateful that it's these types of
| people breaching IA and being vocal about it & asking IA to fix
| their systems. Others might just nuke them, or subtly alter
| content, or do whatever else bad thing you can think of.
|
| They're providing a public service by pointing out that a
| massive organization controlling a lot of PII doesn't care
| about security at all.
| A4ET8a8uTh0 wrote:
| Not defending attacker, because I see IA as common good. That
| said one of the messages from this particular instance reads
| almost as if they were trying to help by pointing out issues
| that IA clearly missed:
|
| "Whether you were trying to ask a general question, or
| requesting the removal of your site from the Wayback Machine
| your data is now in the hands of some random guy. If not me,
| it'd be someone else."
|
| I am starting to wonder if the chorus of 'maybe one org should
| not be responsible for all this; it is genuinely too important'
| has a point.
| sim7c00 wrote:
| anything with tons of traffic going to it is a target. it has
| nothing to do with what the entity does, more with what
| potential reach it has. criminal behaviour is what it is.
| people pulling loads of visitors need to properly secure their
| shit, to prevent their customers becoming their victims.
| gweinberg wrote:
| Does anyone know who is targeting the Internet Archive, and why?
| I get the impression the attacks are too sophisticated for it to
| just be vandal punks.
| xyst wrote:
| Is it sophisticated if IA leaves the door wide open? I blame
| shit leadership.
| lolinder wrote:
| > I get the impression the attacks are too sophisticated for it
| to just be vandal punks.
|
| What gives that impression? Everything I've seen about the
| attacker's messaging says "vandal punk(s)" to me, and nothing
| in what I've seen of the IA's systems screams Fort Knox. It
| wouldn't surprise me if they actually had a pretty lax approach
| to security on the assumption that there's very little reason
| to target them.
| jrm4 wrote:
| It strikes me as reasonable to _assume_ (or at least strongly
| bet on) -- I'm not sure of the right phrase for it -- but like
| a mercenary type operation on behalf of some larger old media
| company?
|
| There's just too much "means, motive and opportunity" there.
| dokyun wrote:
| The group that claimed to be responsible for the first hack was
| said to be Russian-based, anti-U.S., pro-Palestine, and their
| reasoning for the attack was because of IA's violation of
| copyright....
|
| I think you should draw your own more informed conclusions, but
| it smells a lot like feds to me.
| polytely wrote:
| With the amount of comments calling for a leadership change my
| tinfoilhat theory is that this is a concerted effort to get a
| leadership change.
| alexey-salmin wrote:
| A genuine question to commenters asking to "put a grownup in
| charge of the thing" and saying that "Kahle shouldn't be running
| things": he built the thing, why exactly can't he run it the way
| he sees fit?
| et-al wrote:
| He is. But at the cost of the greater good.
|
| Most of us care mainly about the Wayback Machine and archiving
| webpages; not borrowing books still under copyright and
| fighting publishers.
| TZubiri wrote:
| Speak for yourself, the internet archive successfully
| increased its scope and made creative contributions to case
| law (although it lost at the appeals court)
| pvg wrote:
| A good place to direct that question might be in a reply to the
| person who made that comment.
| anthk wrote:
| The Internet Archive had legal gems such as the Jamendo Album
| Collection, a huge CC haven. Yes, most of it under NC licenses,
| but for non-commercial streaming radio with podcasts, these have
| been invaluable.
|
| Do you know Nanowar? They began there.
|
| Also, as commercial music has been deliberately dumbed down for
| the masses (in paper, not by cheap talking), discovering Jamendo
| and Magnatune in late 00's has been like crossing a parallel
| universe.
| 999900000999 wrote:
| Do any organizations have a mirror of this?
|
| Even if it's not publicly available...
| butz wrote:
| Is there any way IA could be mirrored in read-only mode, while
| security concerns are addressed?
| kleiba wrote:
| People with solid info sec knowledge: this is a good opportunity
| to offer your expertise pro-bono for a good cause!
| kyleyeats wrote:
| They're buried in these offers right now.
| op00to wrote:
| I wonder how many offers are legitimate.
| TZubiri wrote:
| An org amidst an attack might not be the most open to
| giving credentials and access to strangers.
| RcouF1uZ4gsC wrote:
| The Library of Congress should be archiving the Internet and it
| should have the budget required to do so.
|
| This is in line with its mission as the "Library of Congress".
| Being able to have an accurate record of what was on the Internet
| at a specific point in time would be helpful when discussing
| legislation or potential regulation involving the internet.
| awkwardpotato wrote:
| The Library of Congress does currently archive limited
| collections of the internet[0]. They have a blog post[1]
| breaking down the effort, currently it's 8 full time staff with
| a team of part time members. According to Wikipedia[2], it's
| built on Heritrix and Wayback which are both developed by the
| Internet Archive (blog post also mentions "Wayback software").
| Current archives are available at: http://webarchive.loc.gov/
|
| [0] https://www.loc.gov/programs/web-archiving/about-this-
| progra...
|
| [1] https://blogs.loc.gov/thesignal/2023/08/the-web-archiving-
| te...
|
| [2]
| https://en.m.wikipedia.org/wiki/List_of_Web_archiving_initia...
___________________________________________________________________
(page generated 2024-10-20 23:00 UTC)