[HN Gopher] AWS Tape Gateway
       ___________________________________________________________________
        
       AWS Tape Gateway
        
       Author : brudgers
       Score  : 121 points
        Date   : 2023-02-02 15:36 UTC (1 day ago)
        
 (HTM) web link (aws.amazon.com)
 (TXT) w3m dump (aws.amazon.com)
        
       | zetazzed wrote:
       | I think I'll point to this as one of the examples of how AWS has
       | been scrappy and successful. (Yes, I'll get like 10 flames on
       | this comment.) They made this nice simple API for S3 that has
       | become a de facto standard. They added lots of features and
       | security controls, and I can easily imagine an architect saying
       | "everyone should just adopt S3, it's so easy!" But huge, paying
       | customers were used to tape libraries so they said, damnit, fine
       | we'll pretend to be a tape library even though that is super
       | weird on some level... Meeting users where they are can be a
       | superpower.
        
         | chickenpotpie wrote:
         | > damnit, fine we'll pretend to be a tape library
         | 
         | IIRC, they may not be pretending. It's not public knowledge
         | what storage medium AWS glacier is built on, but many speculate
         | it is actually tape.
        
           | adamsb6 wrote:
           | My knowledge is a decade old, so they very well may be using
           | tape now, but were not originally.
           | 
           | At announcement time Glacier was using super dense racks of
           | hard disks but only powering a subset of them. This led to
           | some tape-like behaviors, like your restore request might
           | take a long time to process because your data was likely
           | stored on disks that weren't powered on and you'd need to
           | wait for them to come online.
        
             | dastbe wrote:
             | i'd (probably apocryphally) heard at announcement it was
             | just s3 under the hood to support customers while they
             | built the dataplane out :)
        
           | capableweb wrote:
           | Well, they do call it "virtual tapes", would be weird (maybe
           | even illegal? False/misleading advertisement?) if they ended
           | up storing it on physical tapes anyways.
           | 
           | Edit: found in another thread here, the inventor linked the
           | patent (https://image-ppubs.uspto.gov/dirsearch-
           | public/print/downloa...) which confirms it's virtual tapes,
           | not actual tapes.
        
             | theamk wrote:
             | It would be "virtual tape" over S3 API over whatever
             | technology AWS uses for glacier over physical tape.
             | 
              | The physical tapes, if any, would be many layers down.
        
             | dragonwriter wrote:
             | > Well, they do call it "virtual tapes", would be weird
             | (maybe even illegal? False/misleading advertisement?) if
             | they ended up storing it on physical tapes anyways.
             | 
             | No, just like it is not weird and definitely not illegal
             | for the things they call "virtual machines" to end up being
             | run on actual machines.
        
               | capableweb wrote:
               | If you purchase a service describing it as "isolated
               | virtual machines" and it ends up just being processes on
               | a computer without any virtualization, wouldn't you call
               | that at least misleading?
        
               | dragonwriter wrote:
               | Well, yes, because the "isolated" is false, not because
               | the "virtual machine" abstraction is realized on a real
               | machine.
        
               | capableweb wrote:
               | Right. If it says "Virtual machines" and it isn't, you
               | wouldn't call that at least misleading?
        
               | iamtedd wrote:
               | I think you're misunderstanding the scenario. What if
                | each 'virtual' machine is actually bare metal, i.e. every
               | EC2 instance is actually a separate computer?
               | 
               | There's still isolation, there's just no virtualisation
               | abstraction.
        
               | adrianmonk wrote:
               | Also virtual memory backed by physical memory, virtual
               | credit card numbers for a real credit card, and virtual
               | desktops displayed on real desktops.
        
             | [deleted]
        
         | orwin wrote:
          | Funny, I had this conversation 5 hours ago with my tech lead
          | and my engineering manager.
          | 
          | We talked about how we want to change our design to copy AWS,
          | because it's a standard and our users (internal users) know how
          | to use it.
        
         | fnordpiglet wrote:
          | It's not that weird. Tape libraries are well established and
          | have well-vetted, highly reliable semantics that are deeply
          | ingrained in many companies' ways of working. What's weird is
          | companies trying to sell some completely new way of achieving
          | similar results and being surprised when the universe doesn't
          | reengineer everything as homage to their brilliance.
        
           | derefr wrote:
           | > What's weird is companies trying to sell some completely
           | new way of achieving similar results and being surprised when
           | the universe doesn't reengineer everything as homage to their
           | brilliance
           | 
           | Some abstractions can be proven to be objectively-better
           | solutions, along some axes you care about, for your use-case,
           | such that the existing abstraction you were relying upon is
           | then provably sub-optimal. Switching to the more-optimal
           | abstraction is not then an "homage to the brilliance" of the
           | company that created the abstraction; it's a simple "pay
           | CapEx to lower OpEx in a way that will pay itself back after
           | N years" decision. If you can budget it, you do it.
           | 
           | I'm not saying that every abstraction that the IaaS vendors
           | invent is this kind of better mouse-trap, mind you. You can
           | usually distinguish the ones that are, because they get
           | independently reimplemented and used by people who aren't
           | just aiming to build "cloud-native" software, but instead
           | just really want/need the semantics of that particular
           | abstraction, independent of where it runs.
           | 
           | Object storage is a great example of an abstraction where
           | that _is_ the case. There are many things that do object
           | storage now -- both cloud services, and regular deployable
           | software. In fact, even the more-traditional backup vendors,
           | like Backblaze, now also offer object-storage APIs.
        
         | 0xbadcafebee wrote:
         | When someone is waving money in your face and saying "please
         | just give me the thing I want and I will give you all this
         | money", it's ridiculous not to listen. So many competitors just
         | don't listen.
        
           | capableweb wrote:
           | > So many competitors just don't listen
           | 
           | Sometimes people build things/products/companies/whatever not
           | because of the chase of money, but because they want to put
           | their idea in the world. Maybe it's just a case of someone
           | believing they can build something grander :)
        
         | CSMastermind wrote:
         | I use AWS at the companies I work for because of their amazing
         | customer service.
         | 
         | I'm fairly critical of Amazon's engineering culture and
         | practices but as a customer I have to admit they are head and
         | shoulders above their competition and frankly most SaaS
         | providers as a whole.
         | 
         | They've secured millions of dollars in revenue from me alone
         | just by being so customer friendly.
        
           | oneplane wrote:
           | Yep, that's been my experience as well. AWS (and Amazon as a
           | whole) are bad for the people that work there, but as for
           | their service offerings, it would take GCP and Azure combined
           | to get even close.
        
             | MisterPea wrote:
             | I think if you took a poll of AWS workers you'd find that
             | most enjoy it. The culture leads to a vocal minority who
             | REALLY hate it but my time there was some of the most
             | enjoyable I've ever had.
             | 
             | I've worked with some of the smartest people in the tech
             | field, building products from scratch for billion dollar
             | customers and learning every day.
             | 
             | I've always told people that it's probably the best place
             | you can start out of college.
             | 
              | EDIT: This was back in 2016; I'm not sure how much has
              | changed without Bezos and Jassy at the helm of AWS
        
               | posguy wrote:
               | Amazon is a turn and burn employer at its heart. I've
               | watched dozens of friends and acquaintances work there,
               | rarely making it past 2 years, only to leave for less
                | caustic organizations thereafter.
               | 
                | Unless you're willing to put up with a huge ration of
                | shit on a regular basis, or you make it into management,
                | your time is limited by the up-and-out mentality of the
                | organization as a whole.
               | 
               | Amazon has created the perfect scenario to spend
               | shitloads of money getting talent up to speed, only to
               | part ways with them shortly thereafter.
        
               | dastbe wrote:
               | I can't really speak to amazon retail, but in my 7 odd
               | years in aws I found most people left because
               | 
               | * working in infrastructure means doing a lot not quite
               | so exciting things. most people aren't actually excited
                | by ops, and a lot of your dev work will be on ops. it's
               | also the case that ops was historically not as well
                | rewarded, though that's been fixed
               | 
               | * every service needs to be held to the highest tier of
               | operational excellence, and so you can't just chabuduo
               | when things aren't working as well as they should. in the
                | worst case, this means throwing bodies into the ops
               | meatgrinder.
               | 
                | * delivery speed is slowwwww. some of this is endemic to
               | the space, but aws at the end of 2021 (when i left :) )
                | was a lot slower than when i joined due to so much
               | risk aversion and bureaucracy
               | 
                | * comp wasn't as good compared to many tech peers, and comp
               | could very easily not keep pace if you weren't tippy top
               | every year. this is imo the most self-defeating part of
               | amazon culture as a whole because it incentivizes people
               | rated between ~p10-p20 and p50 to leave.
        
               | amzn-throw wrote:
                | Sorry dude, you don't have credibility based on "Unless
                | you're willing to put up with a huge ration of shit on a
                | regular basis, or you make it into management,"
               | 
               | Management at AWS arguably puts up with much more shit on
               | a regular basis.
        
               | simplotek wrote:
               | > I think if you took a poll of AWS workers you'd find
               | that most enjoy it.
               | 
               | Isn't the average tenure at Amazon less than 3 years?
               | "Most" of the people who you'd ask would be fresh from
               | the interview rounds and with only one or two performance
               | review rounds under their belt.
        
               | glenngillen wrote:
               | Same here (2017-2019). The main reason I left was how
               | much travel I had to do. With a young family at the time
               | I wasn't going to get those years back.
        
             | jorblumesea wrote:
              | That's Amazon in general: it's an effective company from
              | the outside because of how it treats the people inside. The
             | customer is everything, and other things are secondary.
             | Including work life balance, employee happiness, burn
             | out...
        
       | glasss wrote:
       | My first and only experience with tape backups was in 2014
       | working with the BBB of Chicago. They were using tape backups,
       | and there I learned that almost no company that small should be
       | using tapes. Unless you have one or two people dedicated to
       | handling that plus whatever other backup solutions you have in
       | place, it won't end well.
        
         | robohoe wrote:
         | I hear you. I had to manage a Quantum LTO4 tape library with
         | Bacula back in early 2010s at a small company. Yeah that was a
         | fun experience (not) and I'm glad I don't have to deal with
         | them. Good riddance. Just getting the labeling right and making
         | sure that the tapes were writable was a pain.
        
           | guenthert wrote:
            | But that is Bacula's fault, isn't it? I used it myself @home and
           | couldn't help thinking that there must be a better way. At
           | work Commvault was used iirc, but others were in charge of
           | that. I evaluated Tivoli SM, but that's more than 15 years
           | ago and I barely remember. TSM and Bacula share some
            | concepts, but the latter appears quite hackish. Not that
            | confidence-inspiring.
        
         | jedberg wrote:
         | Interesting. I wonder if tapes got more complicated or if
         | everything else just got easier.
         | 
         | The last time I dealt with tapes was in the 1990s, and it was
         | dead simple. We used tape backups at a company with only 150
         | people. I was the most junior admin so I had to be the tape
         | monkey, but it was easy. Just run backups regularly, change the
         | tape every couple of days and label it, ship a stack to the
         | offsite storage and request the old stack back, and then do a
         | test restore each time a stack came back.
         | 
         | The whole thing took maybe two hours a week.
        
         | ghshephard wrote:
          | Agreed, you need a dedicated half to full headcount, minimum,
          | to run a tape backup system - particularly for the first six
          | months while you are getting all your routines in place (it can
          | drop down to a minimum 8-10 hour/week time commitment once
          | things are running smoothly, but obviously it can scale up to
          | many, many headcount if you have a lot of data and a lot of
          | systems being backed up). You purchase a tape library with
          | sufficient drives / slots to support your data volume, you
          | decide on tape software, you set up a pickup schedule with Iron
          | Mountain or whoever will pick up (and return) your tapes, and
          | you set up your schedule of fulls / diffs rotation. There are
          | three tricky and time-consuming elements that require a lot of
          | knowledge and dedicated time. #1: making sure you have a
          | consistent snapshot, a backup of the snapshots, and a _RESTORE_
          | process for all your systems. #2: someone needs to run and
          | verify test restores roughly quarterly (very time consuming and
          | annoying, but critical). And #3, the one thing that very few
          | people do but is also important: doing a "Black Start", in
          | which your primary backup site (with the robot, and software,
          | and systems) is lost and you have to bootstrap _everything_
          | from scratch.
          | 
          | So, yeah - a virtual tape library, while obviously having
          | downsides, is massively useful for a small enterprise.
        
       | jasoneckert wrote:
       | It's been a long time since I've heard the word "tape" in my
       | circles (I can't get John Cleese's Institute for Backup Trauma
       | out of my head: https://www.youtube.com/watch?v=q_9wIupr9Hc).
       | 
       | That being said, kudos to Amazon for having a solution for those
       | who aren't in my circles and still use them.
        
       | jmclnx wrote:
        | Having had to restore data from mag tapes over decades, with
        | CRC errors maybe 20% of the time, this cannot possibly be worse.
        
         | Tepix wrote:
         | But vastly more expensive, that's for sure.
        
           | Nexxxeh wrote:
           | Depends on the cost of a potential 20%+ loss of data.
        
         | shiftpgdn wrote:
         | You needed to run a cleaner through that tape device.
        
           | jmclnx wrote:
           | No kidding :) But the company that had the issue (I left a
           | bit ago) believed in "cheaper the better". Cleaners ? That
           | place would expect you to use soap from the restroom :)
        
       | johnklos wrote:
       | "Tape Gateway supports all leading backup applications"
       | 
       | Only huge companies like Amazon can be this dumb. They don't even
       | mention tar or pax, the two most common tape backup applications.
       | 
       | Also, how will this magic Amazon "Tape Gateway" back up petabytes
       | over slow links? There are many data heavy businesses that don't
       | necessarily have tons of Internet bandwidth. Fully saturating a
       | 100 Mbps outgoing connection will only get you 1 TB a day, so
       | what happens when you have a tape's worth of new data daily and
       | an already heavily used Internet connection?
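The parent's throughput figure checks out. A quick sanity check (assuming a fully saturated link, decimal units, and ignoring protocol overhead):

```python
# How much data does a saturated 100 Mbps uplink move per day?
mbps = 100
bytes_per_second = mbps * 1e6 / 8         # 12.5 MB/s
bytes_per_day = bytes_per_second * 86400  # seconds in a day
tb_per_day = bytes_per_day / 1e12         # ~1.08 TB/day
```

So roughly 1 TB/day, as the comment says.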
        
         | felixgallo wrote:
          | Click on the 'AWS Snowball with Tape Gateway' tab.
        
         | jaywalk wrote:
         | > Also, how will this magic Amazon "Tape Gateway" back up
         | petabytes over slow links?
         | 
         | Easy: it won't. If you have more data than bandwidth, you won't
         | use this service. Not sure why you seem to think otherwise.
        
           | brudgers wrote:
           | Amazon has a service to send a semi-trailer to the loading
           | dock of your data warehouse, so you can forklift in pallets
           | of physical media.
           | 
           | Then hauls it all away for ingestion into its cloud.
           | 
           | Or, if it will fit in your station wagon, you can haul your
           | data down the highway to one of Amazon's clouds.
           | 
           | Or put it in a box and ship it Fedex.
        
         | sithadmin wrote:
         | You upgrade your WAN circuits to meet your needs, or you suffer
         | the consequences. Simple as that. If you can't upgrade the WAN
         | circuits, you stick to local tape libraries.
        
         | unethical_ban wrote:
         | No organization considering this use case will have a 100Mbps
         | line for it.
        
       | bob1029 wrote:
       | AWS doing tapes like this feels weird to me. The physicality of
       | the thing is kind of the point in my mind. Virtualizing the last
       | line of defense seems wrong, but perhaps I could be convinced
       | otherwise.
       | 
       | Why doesn't Iron Mountain or some other competitor offer a
       | service like this?
        
         | sithadmin wrote:
         | Iron Mountain does have a service that will pick up physical
         | tape and offload to other storage, but nothing like AWS's
         | virtual tape capabilities.
         | 
         | FWIW, Amazon isn't 'doing tapes' with this product. It's just a
         | compatibility shim laid on top of object storage so that legacy
         | tape backup/archival systems can eliminate physical tape
         | libraries.
        
       | cynicalsecurity wrote:
       | This is getting ridiculous.
        
       | jurassic wrote:
       | > Tape Gateway stores virtual tapes in Amazon S3, Amazon S3
       | Glacier Flexible Retrieval, and Amazon S3 Glacier Deep Archive,
       | protected by 99.999999999% of durability.
       | 
       | That's a lot of 9s.
        
         | harshaw wrote:
         | Think about it in terms of erasure encoding where you can have
         | enough shards on enough hosts. That's how you get the
         | durability.
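As a toy illustration of the parent's point (the shard counts and failure rate below are made up, not AWS's real parameters, and real systems also repair failed shards continuously): data is lost only when more shards fail than the code can tolerate, and that binomial tail shrinks very fast.

```python
from math import comb

def loss_probability(n: int, k: int, p: float) -> float:
    """Toy erasure-coding model: n shards, any k of which recover the
    data; each shard independently fails with probability p per period.
    Data is lost only if more than n - k shards fail."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(n - k + 1, n + 1))

# With illustrative numbers (17 shards, any 9 recover, 1% per-shard
# failure rate), the loss probability already falls far below 1e-11.
tiny = loss_probability(17, 9, 0.01)
```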
        
           | ff317 wrote:
           | Yeah but that's also incredibly naive. Inevitably there will
           | be some un-calculated-for shortcuts that make all those
           | copies not as failure-independent as they seem. If nothing
           | else, what are the odds some random Amazon engineer makes a
           | software rollout or hardware upgrade mistake and takes out a
           | whole lot of copies in the process?
        
         | vbezhenar wrote:
         | It means that 1 bit out of 12.5 GB is not durable. Not a lot if
         | you ask me.
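The arithmetic behind that reading (note that AWS states durability per object per year, not per bit, so this is only a magnitude illustration):

```python
# Eleven nines of durability, read loosely as a per-bit loss rate:
durability = 0.99999999999         # eleven nines
annual_loss = 1 - durability       # ~1e-11
bits = 1 / annual_loss             # ~1e11 bits per lost bit
gigabytes = bits / 8 / 1e9         # ~12.5 GB, matching the parent
```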
        
         | kcb wrote:
          | I hate it. Once the number gets so small it's beyond
          | comprehension, it's hard to believe its validity. It reminds me
          | of Feynman's statements during the Space Shuttle Challenger
          | disaster.
         | 
         | > Feynman was disturbed by two aspects of this practice. First,
         | NASA management assigned a probability of failure to each
         | individual bolt, sometimes claiming a probability of 1 in 10^8,
         | i.e. one in one hundred million. Feynman pointed out that it is
         | impossible to calculate such a remote possibility with any
         | scientific rigor.
         | 
         | https://en.wikipedia.org/wiki/Rogers_Commission_Report
         | 
         | Unless S3 can survive nuclear war or an asteroid hitting the
         | earth, I don't buy it.
        
           | chaxor wrote:
           | One bit change in a sea of 100TB is quite a small percentage.
           | Could work that way.
        
           | daxfohl wrote:
           | Yeah, the most depressing part of my job is when I have to
           | calculate the SLO metrics for our directors. The first pass
           | of stuff we control is always way within SLO. Then we have to
           | subtract stuff where some third party messes up, or like the
           | request doesn't even get to our service and it always puts us
           | in the red.
        
           | lopkeny12ko wrote:
           | There _is_ rigor behind the claim. Check out this talk from a
           | few years ago: https://www.youtube.com/watch?v=DzRyrvUF-C0
        
             | dgacmu wrote:
             | There's rigor within the assumptions. The problem is when
             | the assumptions are wrong - most notably around failure
             | independence. Data center fire? Accounted for. Simultaneous
             | terrorist attack / asteroid (as per GP), / nuclear war on
             | all data centers? Maybe not. Where does insider threat fit
             | in on that spectrum? We don't know.
             | 
             | I'm personally fine with the claim as I understand the
             | methodology and assumptions but it has to be interpreted
             | carefully.
        
         | aftbit wrote:
          | Does that mean that you should expect to lose 1 byte for every
         | 100 GB that you store?
        
           | primax wrote:
            | For AWS, the statement is typically made at the object level.
        
       | mjlee wrote:
       | I think it's worth pointing out that this service was first
       | released in 2012.
        
         | s_dev wrote:
         | So why is it relevant to the curious mind on HN in 2023? Was
         | there a significant change made to the system?
        
       | chrisshroba wrote:
       | Tangent: What's the cheapest way to backup a few TB of personal
       | data these days, pricing based on the premise that I probably
       | will never need to retrieve it (due to local backups as well),
       | but I don't want to pay thousands if I do have to (hundreds would
       | be okay). Glacier Deep Archive?
        
         | howeyc wrote:
         | Scaleway Glacier
        
         | BonoboIO wrote:
          | Hetzner Storage Boxes. I host nearly everything there. Just an
          | awesome company.
         | 
         | 1TB = 3.2EUR
         | 
         | 5TB = 10.9EUR
         | 
         | 10TB = 20.8EUR
         | 
         | ...
         | 
         | https://www.hetzner.com/storage/storage-box
         | 
         | No transfer fees.
        
         | CharlesW wrote:
          | > _What's the cheapest way to backup a few TB of personal data
         | these days..._
         | 
         | I'm using CrashPlan for Small Business, so I can backup select
         | NAS shares as well. $10/month.
        
         | unethical_ban wrote:
         | Yes.
         | 
         | You can use rclone to backup to an S3 bucket, and then have the
         | S3 bucket set to "instantly" move files to deep archive.
         | 
         | To retrieve them, you will need to run an extra command to
         | "unarchive" them for some period of time.
         | 
         | I find this easier than trying to interact directly with any
         | Deep Archive API.
         | 
         | It is very cheap, and AWS's durability guarantees are
         | impressive.
         | 
          | rclone's main S3 page, linked to a note about Deep Archive:
         | https://rclone.org/s3/#glacier-and-glacier-deep-archive
         | 
         | The only thing I am not sure on is how to quickly restore a
         | large number of files in a bucket from GDA to S3. It obviously
         | can be scripted, but I don't have that handy. I only keep a
         | small number of large, previously encrypted files in it, so I
         | manually restore from the GUI.
         | 
         | (By the way, rclone can transparently encrypt files and
         | filenames client-side!)
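The bulk restore the parent says "can be scripted" can be sketched with boto3. This is an untested sketch, not a verified tool: the bucket name is whatever you use, and the Days/Tier values are illustrative.

```python
def restore_request(days: int = 7, tier: str = "Bulk") -> dict:
    """Payload for s3.restore_object. Tier can be "Bulk", "Standard",
    or "Expedited"; Bulk is the cheapest for Deep Archive."""
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}

def restore_bucket(bucket: str, days: int = 7, tier: str = "Bulk") -> None:
    """Issue a restore request for every Deep Archive object in the
    bucket. Restores are asynchronous: objects become readable hours
    later and stay readable for `days` days."""
    import boto3  # third-party; needs AWS credentials configured
    s3 = boto3.client("s3")
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket):
        for obj in page.get("Contents", []):
            if obj.get("StorageClass") == "DEEP_ARCHIVE":
                s3.restore_object(Bucket=bucket, Key=obj["Key"],
                                  RestoreRequest=restore_request(days, tier))
```

After the restore window expires the objects drop back to Deep Archive automatically, so there is no cleanup step.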
        
         | ghaff wrote:
         | Probably. Looks like about $1/TB/month. Although Backblaze
         | personal backup isn't that much more for a few TB--about
         | $6/month "unlimited" paid annually.
        
         | floren wrote:
         | Backblaze B2 would cost you about $10/mo to store 2TB. I've
         | been using them for years for a few tens of gigabytes worth of
         | documents and photos.
         | 
         | https://www.backblaze.com/b2/cloud-storage-pricing.html
         | 
         | edit: it integrates with TrueNAS (formerly FreeNAS) so it's
         | been pretty much set-and-forget, but I can check in and see my
         | stuff through their management interface. I _do_ encrypt my
         | docs before uploading, which TrueNAS also supports.
        
           | thomaslord wrote:
           | Have you had any luck with picking and choosing what gets
           | backed up? I have a bunch of smaller personal files I'd love
           | to sync to B2 (photos, backups, etc.) but I also have 7TB+ of
           | linux ISOs that don't really need to be backed up because I
           | can easily download them again. I tried to exclude the
           | folders I don't want to back up, but I haven't had any luck.
        
             | artificialLimbs wrote:
              | I use rclone with Backblaze. I put everything I want sent to
             | BB in /backupfolder/Archive, and everything else straight
             | in /backupfolder. I have 2 rclone remotes, one pointing to
             | another machine at my house, and 1 pointed to BB.
        
             | phatfish wrote:
              | rclone should do this. It's been a while since I set it
              | up, but it's a CLI to manage cloud storage providers, and
              | Backblaze is supported.
        
             | msh wrote:
             | I use cyberduck and just manually upload.
        
             | floren wrote:
             | It's been a long time since I set it up, but I believe each
             | cloud sync task will sync one directory. So I've got two
             | tasks: one that syncs /mnt/main/media/Photos in PUSH/COPY
             | mode, and one that syncs /mnt/main/Documents in PUSH/SYNC
             | mode.
             | 
             | You should be able to accomplish the same, but you may need
             | to either re-organize your stuff or set up multiple tasks.
             | 
             | edit: for clarification, "/mnt/main/media" contains various
             | other subdirectories of Linux ISOs etc., which is why I
             | push /mnt/main/media/Photos specifically.
        
           | Tijdreiziger wrote:
           | There are also (cheaper) S3-compatible providers in Europe,
            | which might be preferable for some:
            | https://european-alternatives.eu/category/object-storage-pro...
        
         | fulafel wrote:
         | Cheapest is probably a used HDD.
        
           | ghaff wrote:
           | As the parent said, they have local backup but they _also_
           | want a cloud backup which makes perfect sense.
        
         | kiririn wrote:
          | Emphasis on cheapest: even after their recent price hike, you
          | won't find cheaper than 1fichier.
        
         | user3939382 wrote:
          | Wasabi has a full S3 interface but is significantly cheaper:
          | https://wasabi.com/cloud-storage-pricing/ $6/TB/month with free
          | egress, so for 3 TB it's $18/mo.
        
         | seized wrote:
         | AWS Glacier Deep Archive. It's about $1/TB/month. Retrieve is
         | more expensive, about $90/TB all in.
         | 
         | You can use rclone as one simple tool.
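Plugging the quoted rates into the original question's "few TB" scenario (these are the parent's approximate figures, not official pricing):

```python
# Cost sketch at ~$1/TB/month storage and ~$90/TB all-in retrieval,
# for a 3 TB archive.
tb = 3
monthly_storage = tb * 1.00              # $3/month
yearly_storage = monthly_storage * 12    # $36/year
one_time_restore = tb * 90.00            # $270 - "hundreds", as asked
```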
        
         | hondo77 wrote:
         | I have over 3TB backed up to Backblaze (in addition to a local
         | backup) for $70/yr. That's for Mac or PC. If you're on linux, I
         | believe they offer something for that, too.
        
           | ghaff wrote:
           | I'm a long-term satisfied Backblaze user on Mac.
           | 
            | The pro is that it has a client and, once set up, it just
            | works.
           | 
           | The con is that it's a client running on a single system and
           | the personal pricing plan doesn't support a NAS I believe. I
           | periodically copy my files over from a RAID NAS to a local
           | USB.
        
         | jquaint wrote:
          | I recently set up this:
          | https://github.com/vsespb/mt-aws-glacier
         | 
         | Found it a lot easier than rolling my own AWS Glacier solution.
        
           | jwineinger wrote:
           | That's quite an old (perl) codebase. Did you have any issues?
        
       | SCdF wrote:
       | Dumb question, and I'm guessing this is a "if you don't know it's
       | not for you" situation, but what is the point of a virtual tape?
       | Isn't the point of a tape that it's not virtual? Or is this more
       | replicating tape software apis (WINE/proton style) so you can get
       | rid of physical tapes (because you no longer care about their
       | physicality) without having to change your backup strategy?
        
         | thedougd wrote:
         | It's the latter. In a large enterprise, the backup
         | configuration can be wildly complicated: a matrix of systems,
         | schedules, SLAs, etc. Just reconfigure your backup software to
         | use this new virtual tape device and you're on your way.
        
           | logifail wrote:
           | > Just reconfigure your backup software to use this new
           | virtual tape device and you're on your way
           | 
           | Isn't the whole point of tape that it is a physical thing
           | that may be taken offsite in a truck, stored in a vault, and
           | all that jazz?
           | 
           | "Just reconfiguring" your backup software sounds like a
           | business might just bypass all that without necessarily
           | realising the consequences.
           | 
           | Then "we got hacked and our local backups are gone" ->
           | "restore from offsite tape" -> "oh, actually there are no
           | tapes, it's all the cloud now" -> ... ??
        
             | kube-system wrote:
             | AWS's hard disks are a physical thing that is offsite
        
               | logifail wrote:
               | > AWS's hard disks are a physical thing that is offsite
               | 
               | Taking tapes to a secure offsite location means they're
               | air-gapped from the source data, so even in a situation
               | where an entire network is compromised, remotely-stored
               | tapes can't be wiped/encrypted.
               | 
               | I'm sure AWS has thought about this issue while designing
               | Tape Gateway, but it's not clear to me how it could know
               | whether a request to retrieve and overwrite a virtual
               | tape was legitimate or not.
        
             | orf wrote:
             | Then "we got hacked and our local backups are gone" ->
             | "restore from offsite tape" -> "oh, actually all the tapes
             | were stored at the wrong temperature or humidity and now
             | they are fucked" -> ... ??
        
         | sithadmin wrote:
         | The gateway appliance is fairly compatible with legacy backup
         | solutions, which makes it a great drop-in replacement for
         | physical tape library systems. There are certainly better
         | backup methods available these days (though it's hard to beat
         | the durability and cost of LTO tape for long-term archival),
         | but I've seen AWS's virtual tape used as a good stopgap while
         | other backup/recovery solutions are still a ways out on an
         | org's infrastructure roadmap.
        
           | tablespoon wrote:
           | > though it's hard to beat the durability and cost of LTO
           | tape for long-term archival
           | 
           | I'm not entirely comfortable with the trend towards ever more
           | esoteric and seemingly "all or nothing" technologies.
           | 
           | Tape (and other physical things) may have their downsides,
           | but I think there's something to be said for something that
           | could (theoretically) be forgotten in a closet and read 50
           | years later.
           | 
           | Likewise, AM radio may not be the best quality, but it seems
           | like you can cover more area with a single installation than
           | any other communication technology, which might come in handy
           | after a serious disaster like a nuclear war (e.g. crank up
           | the nighttime power of some undamaged countryside station
           | running on generators to tell survivors where it's still
           | safe).
        
             | vbezhenar wrote:
             | It's very unlikely that you'll be able to read a modern
             | tape out of a closet in 50 years. Tape requires precise
             | temperature and humidity control in storage; otherwise all
             | bets are off.
        
         | shrike wrote:
         | The VTL was invented because, at the time (2012ish), none of
         | the largest enterprise backup solutions had a good S3 interface
         | and none supported Glacier. After talking with the backup
         | software vendors (Spectrum, Tivoli, Symantec, Commvault, etc)
         | it became clear that adding another backup target wasn't
         | something we (AWS) could get them to prioritize, for perfectly
         | reasonable reasons. We could (and did) apply pressure via our
         | shared customers, even then they estimated it would take years.
         | 
         | The fastest way to enable large enterprise access to S3 and
         | Glacier for backups was to meet them where they were. We did
         | this by virtualizing a tape library.
         | 
         | Background: I'm one of the original inventors - https://image-
         | ppubs.uspto.gov/dirsearch-public/print/downloa.... I am no
         | longer with AWS.
        
           | sshagent wrote:
           | You did such a great job they still try to steer people down
           | this route. Multiple times I've mentioned "you know we don't
           | need to emulate tape drives any more".
        
           | [deleted]
        
           | free-ideas wrote:
           | [dead]
        
         | brudgers wrote:
         | Even for a new strategy, using a tape abstraction might
         | simplify the design and/or implementation because backup tools
         | and culture have their roots in tape.
         | 
         | There's a sense in which tape is water in which backups swim.
        
         | [deleted]
        
       | api wrote:
       | Anyone considering putting giant amounts of data into AWS or most
       | other big clouds should be sure to do the numbers on retrieval.
       | All the big clouds have "roach motel pricing" -- data is free or
       | very cheap to put in, and expensive to get out. Outbound
       | bandwidth costs from AWS are astronomical, so make sure you are
       | not going to need to stream down all that data or move it any
       | time soon or that you have compared that cost to your in-house
       | solution.
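       | 
       | To make "do the numbers" concrete, here's a toy calculation.
       | The $0.09/GB figure is the commonly cited first-tier internet
       | egress rate and is an assumption; actual rates step down with
       | volume and vary by region:

```python
# Rough upper-bound sketch of AWS internet egress cost for a full
# restore. The flat $0.09/GB rate is an assumption for illustration;
# real pricing is tiered and region-dependent.
EGRESS_PER_GB = 0.09  # USD

def egress_cost(tb):
    """Approximate cost to download `tb` terabytes over the internet."""
    return tb * 1024 * EGRESS_PER_GB

print(round(egress_cost(100), 2))  # pulling 100 TB back out: 9216.0 USD
```

       | At that rate, a single large restore can cost more than a year
       | of storing the same data.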
        
         | cube00 wrote:
         | This kind of unbalanced pricing also acts as a disincentive to
         | regularly test your disaster recovery procedures.
        
       | OliverJones wrote:
       | HEY! Before you adopt this check the bandwidth egress costs!
       | Seriously.
       | 
       | Your disaster recovery plan will need to budget for extra egress
       | costs if you ever need to restore something big from one of those
       | virtual tapes. AWS charges for egress -- downbound -- bandwidth
       | but they don't charge for ingress -- upbound -- bandwidth.
       | 
       | (And I think it's funny their icons and images look like old
       | school DECTapes.)
        
         | klabb3 wrote:
         | > Before you adopt this check the bandwidth egress costs!
         | 
         |  _Closed, works as intended_ : This is the industry-standard
         | price model for all ransomware.
         | 
         | /light sarcasm
        
         | jeffrallen wrote:
         | The roach motel of data: your data goes in, but it doesn't go
         | out (for free).
        
       | dzdt wrote:
       | As always the problem seems to be the 'Hotel California' issue:
       | you can check out but you can never leave. Once you have massive
       | data in AWS there is no efficient and affordable solution to move
        | that data back out; you are locked in forever, subject to
        | whatever future terms Amazon may choose to impose.
        
         | robocat wrote:
         | "Transfer up to 100 PB per Snowmobile, a 45-foot-long
         | ruggedized shipping container pulled by a semi-trailer truck."
         | - I'm guessing usually used to migrate corporate information
         | into AWS, but can also be used for data export. They advertise
         | Exabytes (using more than one rig I guess).
         | https://aws.amazon.com/snowmobile/
        
           | ufmace wrote:
           | I wonder if they haven't updated that in a while; it sounds
           | low. You can get 22TB 3.5" hard drives these days. If I take
           | that storage and figure full racks with rackmount NASes, I
           | get 30 feet for one row of full-size racks. 45 feet is
           | longer, and they probably have room for at least 2 rows.
           | Even accounting for probably needing some controller
           | computers and routers and network interfaces and such, I
           | would think they could at least double that these days.
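           | 
           | A quick sanity check of that estimate (every figure here is
           | an assumption for illustration: drive size, shelf density,
           | and rack counts, not AWS specs):

```python
# Back-of-envelope raw capacity for a 45-foot container of disk racks.
# All parameters are illustrative guesses, not AWS specifications.
DRIVE_TB = 22             # current large 3.5" drives
DRIVES_PER_SHELF = 90     # dense top-loading 4U JBOD shelf
SHELVES_PER_RACK = 10     # ~40U of a 42U rack
RACKS_PER_ROW = 15        # ~2 ft of aisle length per rack
ROWS = 2                  # two rows in the container

total_tb = DRIVE_TB * DRIVES_PER_SHELF * SHELVES_PER_RACK * RACKS_PER_ROW * ROWS
print(total_tb)  # 594000 TB, i.e. roughly 0.6 EB raw
```

           | Which supports the point: 100 PB per trailer looks
           | conservative with today's drive densities.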
        
           | kube-system wrote:
           | https://aws.amazon.com/snowmobile/faqs/
           | 
           | > Snowmobile does not support data export.
        
             | robocat wrote:
             | Hmmmm - the subheading of the page says "Migrate or
             | transport exabyte-scale datasets into and out of AWS", and
             | the diagram repeats it. Either AWS is being deceptive, or
             | the FAQ is wrong. Someone with a corporate AWS account who
             | cares should check, or get the documentation corrected.
             | More interesting to know how AWS responds...
             | Q: Can I export data from AWS with Snowmobile?
             | A: Snowmobile does not support data export. It is designed
             | to let you quickly, easily, and more securely migrate
             | exabytes of data to AWS. When you need to export data from
             | AWS, you can use AWS Snowball Edge to quickly export up to
             | 100TB per appliance and run multiple export jobs in
             | parallel as necessary.
             | 
             | Interesting, since there is obviously no technical reason
             | the export can't be done from AWS. Technically you could
             | export exabytes using tens of thousands of 100TB Snowball
             | Edge devices!?
             | 
             | The Snowmobile "device" is pretty neat:
             | 
             | * 45-foot long High Cube shipping container
             | 
             | * [each snowmobile container] comes with a removable
             | connector rack with up to two kilometers of networking
             | cable that can directly connect to the network backbone in
             | your data center.
             | 
             | * AWS can provide the auxiliary chiller if needed based on
             | the site survey findings [if temperature exceeds 85F/29.4C]
             | 
             | * A fully powered Snowmobile requires ~350KW [sic];
             | generators can be provided by AWS if needed
             | 
             | * Snowmobile pricing is based on the amount of data stored
             | on the truck per month. Provisioned Snowmobile Capacity:
             | $0.005/GB per month
        
         | jimt1234 wrote:
         | AWS will do _anything_ to get your data into their cloud. I
         | realized this a few years ago when Snowmobile was released:
         | https://aws.amazon.com/snowmobile
        
         | thedougd wrote:
         | It is for backups, of which you have another copy. To switch
         | providers, you would start shipping new backups to the new
         | provider. Once you're confident the new provider has all the
         | backups needed to meet your retention policy, you abandon AWS.
        
         | WrtCdEvrydy wrote:
         | > Once you have massive data in AWS there is no efficient and
         | affordable solution to move that data back out; you are locked
         | in forever subject to whatever future terms Amazon may chose to
         | impose.
         | 
         | You can have your shit packed into a snowball and shipped to
         | you.
        
           | pmw wrote:
           | But you pay the same egress rate.
        
           | htrp wrote:
           | Can you?
           | 
           | https://aws.amazon.com/snowmobile/faqs/ > Snowmobile does not
           | support data export.
        
             | electroly wrote:
             | Snowball and Snowmobile are different. Snowball does
             | support it.
             | 
             | https://docs.aws.amazon.com/snowball/latest/developer-
             | guide/...
        
             | vbezhenar wrote:
             | It's confusing.
             | 
             | https://aws.amazon.com/snowmobile/
             | 
             | Quickly retrieve data from the cloud whenever you need it.
             | 
             | What does it mean?
        
               | jml7c5 wrote:
               | Perhaps "once we have imported your data, you can quickly
               | retrieve data from the cloud using the regular S3 APIs
               | and fee structure". Or in other words, "there is no
               | special restriction on data added through Snowball".
               | 
               | Feels a bit like intentional misdirection, but it's more
               | likely a case of sloppy writing.
        
       ___________________________________________________________________
       (page generated 2023-02-03 23:01 UTC)