[HN Gopher] MangaDex infrastructure overview
       ___________________________________________________________________
        
       MangaDex infrastructure overview
        
       Author : m45t3r
       Score  : 521 points
       Date   : 2021-09-07 03:50 UTC (19 hours ago)
        
 (HTM) web link (mangadex.dev)
 (TXT) w3m dump (mangadex.dev)
        
       | probotect0r wrote:
       | How is the Wireguard VPN set up? Has anyone used Wireguard to set
       | up a VPN into AWS VPCs?
        
       | germanjoey wrote:
       | Many manga fans have a love/hate relationship with mangadex. On
       | one hand, it's provided hosting for countless hours of
       | entertainment over the years. Their "v3" version of the site was
       | basically perfect from a usability point of view, to the point
       | that the entire community chose to unite itself under its flag.
       | 
       | On the other hand, directly because of the above, their hasty
       | self-inflicted take down earlier this year nearly killed the
       | entire hobby. Many series essentially stopped updating for the ~5
       | months the site was down, and many more are likely never coming
       | back again.
       | 
       | The decision to suddenly take the site down for a full site
        | rewrite feels completely inexplicable from the outside.
        | (Writeups like the one above or the previous one[1], both of
        | which read like they were written by a Google Product
        | Manager, especially don't help, as they conspicuously avoid
        | any comment on the one question on everyone's mind: "leaving
        | aside the supposed security issues with the backend, why on
        | earth also rewrite and redesign the entire front end from
        | scratch at the same time?")
       | 
       | [1] https://mangadex.dev/why-rebuild/
        
         | Jcowell wrote:
          | I never got the sense that the manga community hated
          | Mangadex, and I've been following their dev of v5 and the
          | rise of other sites to use in their absence.
          | 
          | It seems weird to attribute Mangadex taking their site down
          | for valid security concerns to the end of scanlation of
          | certain series. That seems entirely like a scan team
          | problem if they decide not to upload via Cubari like other
          | teams have done. And it doesn't even matter, since a series
          | can get sniped at any time.
          | 
          | It makes complete sense that if you're going to rewrite the
          | backend and API from scratch, you might as well do the
          | front end too, since it was a goal from the beginning.
        
         | nvarsj wrote:
         | I didn't get that take at all from the why-rebuild link. It
         | seems reasonable to me - legacy code base, hard to maintain,
         | with security problems that led to the massive data leak a
         | while back. They also don't owe anything to anyone, and as a
         | hobbyist project, they wanted to try something new. I'm
         | impressed as they seem to have managed it - and the new site
         | feels a lot more responsive than the old one.
        
         | creshal wrote:
          | Yeah. This whole mess pushed me to move everything I had
          | (or could remember, anyway) to Tachiyomi [1], so I can hop
          | between hosting websites freely without losing progress or
          | access to old chapters (as long as I don't run out of local
          | storage).
         | 
         | And while it works fine for reading, it kills any interaction
         | with the hosting sites. No chance for monetization,
         | socialization or anything else that can help sites survive
         | long-term.
         | 
         | [1] https://tachiyomi.org/
        
           | m45t3r wrote:
            | Before MangaDex we had Batoto (the old Batoto, before
            | some sketchy company bought the name), which was kind of
            | the same: serving high quality manga from most scanlators
            | that wanted it (and also avoiding hosting pirated
            | chapters from official sources, so much like MangaDex
            | nowadays). As far as I remember, Batoto closed because of
            | pressure from companies and also because of the high
            | costs of running the site.
            | 
            | So yeah, considering how fragile maintaining a site like
            | this is, it is always a good idea to sync your progress
            | with a third party so it is easier to migrate if
            | something goes wrong.
           | 
           | > And while it works fine for reading, it kills any
           | interaction with the hosting sites. No chance for
           | monetization, socialization or anything else that can help
           | sites survive long-term.
           | 
            | BTW, MangaDex doesn't have monetization because it is
            | strictly a hobby and also because it is a gray area to
            | monetize this kind of work [1]. Also, their Tachiyomi
            | client is official (the MangaDex v5 API was tested
            | primarily via their Tachiyomi client before they finished
            | the web interface).
            | 
            | [1]: both for the companies (that hold the copyright on
            | the works hosted on those sites) and the scanlators (the
            | fans that do the actual work of translating those
            | chapters). Sites that host those chapters and monetize
            | them are pretty much monetizing other people's work.
        
           | majora2007 wrote:
            | Totally a self-plug, but if you're looking to take it a
            | step further, Kavita is a great program for hosting your
            | own Plex-like manga server.
           | 
           | https://kavitareader.com
        
             | creshal wrote:
             | That actually looks really interesting, thanks!
        
         | vymague wrote:
         | > their hasty self-inflicted take down earlier this year nearly
         | killed the entire hobby
         | 
         | It won't kill the hobby. Because these scanlators are making
         | mad money from ads, patreon, crypto mining. I'll never get why
         | they don't get more aggressive take down notices from
         | Chinese/Japanese/Korean publishers.
        
           | level3 wrote:
           | They get plenty of takedown notices, but they mostly get to
           | hide behind services like Cloudflare who won't take action
           | regarding these notices anyway. From the publishers/creators
           | side, there is simply no effective way to take scanlators
           | down.
        
           | kmeisthax wrote:
           | Copyright enforcement is actually quite expensive, both for
           | the litigant and the defendant. The only way for it to be
           | actually profitable to sue someone who is stealing your work
           | is if they immediately settle, which is how copyright trolls
           | operate. Everything else is a massive money pit for everyone
           | involved, even the lawyers. Since this is an international
           | enforcement action, the costs go up _more_ , because now you
           | need _multiple_ legal teams on the bar in each jurisdiction,
           | translators qualified for interpreting laws in foreign
           | languages, knowledge of local copyright quirks, and a lot
           | more coordination than just asking your local counsel to send
           | a takedown notice locally.
           | 
           | (Just as an example of a local copyright quirk that will
           | probably confuse a lot of people in the audience from Europe:
           | copyright registration. America really, _really_ wants you to
           | register your copyright, even though they signed onto Berne
           | /WTO/TRIPS which was supposed to abolish that regime
           | entirely. As a result, America did the bare minimum of
           | compliance. You don't lose your copyright if you don't
           | register, but you can't sue until you do, and if you register
           | after your work was infringed, you don't get statutory
           | damages... which means your costs go way up.)
           | 
           | Furthermore, every enforcement action you take risks PR
           | backlash. The whole fandom surrounding import Japanese comic
           | books basically grew out of a piracy scene. Originally, there
           | were no English translations, and the scene was basically
           | reusing what we'd now call "orphan works". There used to be
           | an unspoken rule among most fansubbers of not translating
           | material that was licensed in the US. All that's changed;
           | most everything gets licensed and many fan translators
           | absolutely are stepping on the toes of licensees. However,
           | every time a licensee or licensor actually takes an
           | enforcement action, they get huge amounts of blowback from
           | their own fans.
        
           | chaorace wrote:
           | I suspect it's because the international market for print
           | manga (the primary cash cow) is rather anemic, particularly
           | compared to anime.
           | 
           | Publishers see the loss as minimal and creators see piracy as
           | free advertising to drum up enthusiasm for anime adaptations,
           | which actually _do_ drum up decent profits internationally (
           | _the committee keeps the streaming licensing fees, not the
           | animation studio_ ).
        
             | level3 wrote:
             | Publishers definitely don't see it that way; that's mostly
             | an extension of a myth in order to justify the piracy.
             | 
             | Most manga publishers will see relatively little revenue
             | from international anime releases. Even for domestic anime
             | releases of the vast majority of titles, the manga
             | publisher is only a small part of the anime production
             | committee, and the hope is mostly that popularity of the
             | anime can lead to increased sales of the manga,
             | merchandise, or other events. So when the anime is released
             | internationally, they get an even smaller cut of that
             | because the international licensee also has to take their
             | profit.
             | 
             | But other than mega-hit titles where an international anime
             | release may also lead to significant international manga
             | sales, the popularity of an anime adaptation overseas is
             | practically irrelevant to the original manga publisher.
        
       | FpUser wrote:
        | I wrote a business backend server that calculates various
        | things and returns the results as JSON. It serves on average
        | up to 5,000 requests/s for about $220 CDN / month.
        | Architecture: a single executable written in C++ running on
        | a rented dedicated server from OVH with 16 cores, 128GB RAM
        | and a couple of SSDs.
        | 
        | It can do much more, requests-per-second wise, on simple
        | requests, but the most common requests are actually heavy
        | iterative calculations, hence the average of 5,000
        | requests/s.
        
       | cyberpsybin wrote:
        | At some point in the new re-design, they started to load
        | full-size images for thumbnails. The whole site feels slower
        | because of that. It needs an automatic re-scaler service.
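        | 
        | Something like this would do it (a minimal sketch using
        | Pillow; the size names and output paths are made up, not
        | anything MangaDex actually uses):
        | 
        |     # Generate fixed-size thumbnails next to the original.
        |     from PIL import Image
        | 
        |     THUMB_SIZES = {"small": (256, 256), "medium": (512, 512)}
        | 
        |     def make_thumbnails(src_path: str) -> None:
        |         for name, size in THUMB_SIZES.items():
        |             with Image.open(src_path) as img:
        |                 img = img.convert("RGB")
        |                 # thumbnail() resizes in place, keeping
        |                 # the aspect ratio within the box.
        |                 img.thumbnail(size)
        |                 img.save(f"{src_path}.{name}.jpg",
        |                          "JPEG", quality=85)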
        
         | tristan9 wrote:
          | Not correct, we generate 2 thumbnail sizes for every cover
          | -- if the site loads the full size anywhere by default
          | (rather than when you expand it), it's definitely a bug!
        
       | maxk42 wrote:
       | I run an Alexa top-2000 website. (Mangadex is presently at about
       | 6000.) I spend less than $250 a month.
       | 
       | I have loads and loads of thoughts about what they could be doing
       | differently to reduce their costs but I'll just say that the
       | number one thing Mangadex could be doing right now from a cursory
       | glance is to reduce the number of requests. A fresh load of the
       | home page generates over 100 requests. (Mostly images, then
       | javascript fragments.) Mangadex apparently gets over 100 million
       | requests per day. My site - despite ostensibly having more
       | traffic - gets fewer than half that many in a month. (And yes,
       | it's image-heavy.)
       | 
       | A couple easy wins would be to reduce the number of images loaded
       | on the front page. (Does the "Seasonal" slider really need 30
       | images, or would 5 and a link to the "seasonal" page be enough?
       | Same thing with "Recently Added" and the numbers of images on
       | pages in general.) The biggest win would probably be reducing the
       | number of javascript requests. Somehow people seem to think
       | there's some merit to loading javascript which subsequently loads
       | additional javascript. This adds a tremendous amount of latency
       | to your page load and generates needless network traffic. Each
       | request has a tremendous amount of overhead - particularly for
       | dynamically-generated javascript. It's much better to load all of
       | the javascript you need in a single request or a small handful of
       | requests. Unfortunately, this is probably a huge lift for a site
       | already designed in this way, but the improved loading time would
       | be a big UX win.
       | 
       | Anyway - best of luck to MangaDex! They've clearly put a lot of
       | thought into this.
        
         | tristan9 wrote:
         | Hi, we're trying to lower the requests:pageview ratio in
         | general, but for what it's worth this article essentially:
         | 
         | - ignores the vast majority of "image serving" (most is handled
         | by DDG and our custom CDN)
         | 
         | - the JS fragments thankfully should load only on first visit
         | and then get aggressively cached by DDG/your browser
         | 
          | One of the pain points is that there are a lot of settings
          | for users to decide what they should or shouldn't see
          | (content rating, original language, search tags, etc.), and
          | some are already specifically denormalized (when querying
          | chapter entities, the ES indices for those contain some
          | manga-level properties, to avoid needing to dereference the
          | manga first too) -- however, this also makes caching
          | substantially less efficient in many places, alas.
         | 
         | Thanks!
        
           | [deleted]
        
           | the8472 wrote:
           | One issue I see is that flipping back and forth between
           | chapters reloads images from different URLs which means
           | they're uncachable. I guess that's somehow related to the
           | mangadex@home thing, but if the URLs were generated in a more
           | deterministic manner (keyed on some client ID + the chapter
           | being loaded) then the browser could avoid redundant traffic.
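            | 
            | Something along these lines, say - a minimal sketch, not
            | how MD@H actually derives its URLs; the host and key
            | scheme here are made up:
            | 
            |     # Derive a stable, cacheable image URL from a client
            |     # id and chapter id, so repeat views can be served
            |     # from the browser cache.
            |     import hashlib
            | 
            |     def page_url(client_id: str, chapter_id: str,
            |                  page: str) -> str:
            |         digest = hashlib.sha256(
            |             f"{client_id}:{chapter_id}".encode()
            |         ).hexdigest()[:16]
            |         return (f"https://cache.example/{digest}/"
            |                 f"{chapter_id}/{page}")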
        
             | tristan9 wrote:
             | That's very close to how MD@H works, but it also has a time
             | component and tokens are not generated by our main
             | backends, so it'd require a separate internal http call per
             | chapter
        
               | the8472 wrote:
               | Another thing. For each page that's being loaded there's
               | a report being sent. Instead this could be aggregated
               | (e.g. once a second) and then processed as a batch on the
               | server side which should be faster.
               | 
               | And if your JS assets are hashed then you can add cache-
               | control: immutable so that a browser doesn't have to
               | reload them when the user F5s.
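                | 
                | A server-side batcher for those reports could be as
                | small as this (a rough sketch with a hypothetical
                | write_batch callback, not MangaDex's actual code):
                | 
                |     # Queue incoming page-view reports and flush
                |     # them once a second as a single batch write.
                |     import queue
                |     import threading
                |     import time
                | 
                |     reports: queue.Queue = queue.Queue()
                | 
                |     def flush_loop(write_batch):
                |         while True:
                |             time.sleep(1)
                |             batch = []
                |             while not reports.empty():
                |                 batch.append(reports.get_nowait())
                |             if batch:
                |                 # One backend call per second
                |                 # instead of one per report.
                |                 write_batch(batch)
                | 
                |     t = threading.Thread(target=flush_loop,
                |                          args=(print,), daemon=True)
                |     t.start()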
        
           | jiggawatts wrote:
           | Hi, I'm a performance tuning expert, and this thread piqued
           | my interest.
           | 
            | The _first thing_ that I noticed is that even with caching
            | enabled, you're loading "too much data". After loading the
           | main page and then clicking one of the tiles, there are
           | several JSON API calls.
           | 
           | Here's an example, 195 kB transferred (528 kB size): https://
           | api.mangadex.org/manga/bbaa17c4-0f36-4bbb-9861-34fc8...
           | 
           | Oof. Half a _megabyte_ of JSON! Ignore the network traffic
           | for a moment, because GZIP does wonders. The real problem is
           | that generating that much JSON is very  "heavy" on servers.
           | Lots and lots of small object allocations, which gives the
           | garbage collector a ton of work to do. It's also expensive to
           | decode on the browser for similar reasons.
           | 
           | On my computer, this took a whopping 455ms to transfer,
           | nearly half a second. That results in a noticeable latency
           | hit to the site.
           | 
           | In my consulting gig I always give developers the same
           | advice: "Displaying 1 kilobyte of data should take roughly 1
           | kilobyte of traffic".
           | 
            | In other words, there _isn't 500 KB of text anywhere on
           | that page!_ A quick cut & paste shows about 8 KB of user-
           | visible text in the final HTML rendering. That's a 1:60 ratio
           | of content-to-data, which is very poor. I bet that behind the
           | scenes, this took a heck of a lot more back-end network
           | traffic and in-memory processing to generate. Probably tens
           | to hundreds of megabytes of internal traffic, all up.
           | 
           | This is one of the core reasons most sites have difficulty
           | scaling, because for every kilobyte of content output to the
           | screen, they're powering through megabytes or even gigabytes
           | of data behind the scenes.
           | 
           | Can this API query be cut down to match what's displayed on
           | the screen? Can it be cached for all users? Can it be cached
           | _precompressed_?
           | 
           | Etc...
        
             | tristan9 wrote:
             | > The real problem is that generating that much JSON is
             | very "heavy" on servers. Lots and lots of small object
             | allocations, which gives the garbage collector a ton of
             | work to do. It's also expensive to decode on the browser
             | for similar reasons.
             | 
             | For what it's worth, this isn't generated live but a mix of
             | existing entity documents
             | 
             | Most of it is page filenames which indeed could be made
             | optional and fetched only by the reader, but that'd be us
             | actively nulling them out in the returned entity, since
             | they are there in the ES documents for the chapters (a
             | manga feed like this being a list of chapters)
        
               | jiggawatts wrote:
               | You're basically dumping down a database to the web
               | browser, including all of the internal metadata that's
               | likely irrelevant to rendering the HTML.
               | 
                | For example, user role memberships:
                | 
                |     {
                |         "id": "c80b68c5-09ae-4a50-a447-df7c5a4a6d01",
                |         "type": "user",
                |         "attributes": {
                |             "username": "kinshiki",
                |             "roles": [
                |                 "ROLE_MEMBER",
                |                 "ROLE_GROUP_MEMBER",
                |                 "ROLE_POWER_UPLOADER"
                |             ],
                |             "version": 1
                |         }
                |     }
                | 
                | Also record timestamp dates like created/changed,
                | along with contact details that may be revealing
                | sensitive info:
                | 
                |     "attributes": {
                |         "name": "SENPAI TEAM",
                |         "locked": true,
                |         "website": "https:\/\/discord.gg\/84e3j9b",
                |         "ircServer": null,
                |         "ircChannel": null,
                |         "discord": "84e3j9b",
                |         "contactEmail": "senpai.info@gmail.com",
                |         "description": null,
                |         "official": false,
                |         "verified": false,
                |         "createdAt": "2021-04-19T21:45:59+00:00",
                |         "updatedAt": "2021-04-19T21:45:59+00:00",
                |         "version": 1
                |     }
               | 
               | But let's just go back to your response:
               | 
               | > Most of it is page filenames which indeed could be made
               | optional
               | 
                | Do that! If you strip them out, the 529 kB document
                | shrinks to 280 kB, which hardly seems worth the
                | hassle, but when gzipped, this is a minuscule 13 kB!
                | This is because those strings are _hashes_, which
                | _significantly_ reduces their compressibility
                | compared to general JSON, which usually compresses
                | very well.
               | 
               | It's basic stuff like this that can make a website
               | absolutely fly.
               | 
               | Avoid giving computers unnecessary, mandatory work:
               | https://blog.jooq.org/many-sql-performance-problems-stem-
               | fro...
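                | 
                | The compressibility gap is easy to check for yourself
                | (a rough sketch; exact numbers will vary):
                | 
                |     # Random hex "hashes" barely compress, while
                |     # repetitive JSON-ish text shrinks dramatically.
                |     import gzip
                |     import secrets
                | 
                |     hashes = "".join(
                |         secrets.token_hex(32) for _ in range(4000)
                |     )
                |     jsonish = (
                |         '{"official": false, "verified": false}'
                |         * 4000
                |     )
                | 
                |     for name, text in [("hashes", hashes),
                |                        ("jsonish", jsonish)]:
                |         raw = text.encode()
                |         out = gzip.compress(raw)
                |         print(name, len(raw), "->", len(out))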
        
               | tristan9 wrote:
               | As I said, it's not so much that we ask that data to be
               | fetched -- it is there in the first place, and pulled
               | from Elasticsearch, not a SQL database
               | 
                | Because of this model, we also make sure that
                | Elasticsearch merely works as a search cache, not as
                | an authoritative content database (hence everything
                | we add in there is considered public, on purpose, and
                | what isn't meant to be public is just not indexed in
                | ES).
               | 
               | However the gzip efficiency improvements would be really
               | neat for sure
               | 
                | FWIW I also don't work on the backend, and there
                | might be good reasons to not expressly filter out
                | data (yet, anyway; perhaps it will end up as a
                | separate entity behind an include parameter).
        
               | clambordan wrote:
                | You can query Elastic for specific fields only:
                | https://www.elastic.co/guide/en/elasticsearch/reference/curr...
               | 
               | Edit: As you said, there may be reasons on the backend
               | not to filter things out of the query. Though it seems
               | likely that the web response could be trimmed down.
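                | 
                | For example, something like this (a sketch assuming
                | the elasticsearch-py client; the index and field
                | names are hypothetical, not MangaDex's schema):
                | 
                |     # Return only the fields the page needs instead
                |     # of the whole chapter documents.
                |     from elasticsearch import Elasticsearch
                | 
                |     es = Elasticsearch("http://localhost:9200")
                |     manga_id = "some-manga-uuid"  # hypothetical
                |     resp = es.search(
                |         index="chapters",
                |         body={
                |             "_source": ["id", "title", "chapter"],
                |             "query": {"term": {"mangaId": manga_id}},
                |         },
                |     )
                |     for hit in resp["hits"]["hits"]:
                |         print(hit["_source"])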
        
               | BizarroLand wrote:
               | I have to say I'm glad this is being talked about in a
               | public forum. Outsiders rarely get to see brainstorming,
               | troubleshooting & group discussion of technological
               | issues like this.
               | 
               | Someone who is focused on the performance aspect &
               | someone who is focused on stack stability discussing the
               | real world input & output of a business system and
               | showing why performance & UX are not the only metrics
               | that matter is a good thing for us to see.
        
               | kmeisthax wrote:
               | This seems less like a performance problem and more of a
               | security issue. Especially considering that this is a
               | website that hosts unlicensed translations. How much of
               | this information is actually intended to be made public?
        
             | krick wrote:
             | > Displaying 1 kilobyte of data should take roughly 1
             | kilobyte of traffic
             | 
             | Is this to be taken literally? I don't consider myself a
              | performance-tuning expert, but I'm not sure how I can
              | make something useful out of this advice. Of course,
              | "the less
             | you transfer, the better" is an obvious thing to say (a bit
             | too obvious to be useful, in fact), but does it really mean
             | I should aspire to transfer only what I'm actually going to
             | display right now? For example, there is a city
             | autocomplete form on the page (well, a couple of thousand
             | relatively short entries). In that case I would probably
             | consider making 1 request to fetch all these cities (on
             | input focus, most likely), instead of making a request to
             | the server on every couple of characters you type. Is it
             | actually a wrong way of thinking?
        
             | baybal2 wrote:
             | > This is one of the core reasons most sites have
             | difficulty scaling, because for every kilobyte of content
             | output to the screen, they're powering through megabytes or
             | even gigabytes of data behind the scenes.
             | 
             | > Can this API query be cut down to match what's displayed
             | on the screen? Can it be cached for all users? Can it be
             | cached precompressed?
             | 
              | This is why you want to bypass the JS realm (or
              | whatever language does the serdes) and send clients
              | JSON or XML directly from the database, so the client
              | is only getting the data at rest.
        
           | maxk42 wrote:
           | > the JS fragments thankfully should load only on first visit
           | and then get aggressively cached by DDG/your browser
           | 
           | According to Alexa you have a 46.4% bounce rate. [1]
           | 
            | When 46% of your users aren't coming back, how do 31
            | round-trips to your server for 100% of first-page
            | visitors save anyone time or bandwidth? Your pageviews
            | per visitor figure is 6.8,
           | meaning the 53.6% that stick around view an average of 11.8
           | pages each. Even if there are zero subsequent js requests on
           | other pages (clicking a random page I see 8) you would be
           | generating 31 requests up-front to save 10.8 subsequent
           | requests for about half of your users. (And again - in any
           | scenario where the number of js fragments transferred on
           | subsequent requests >= 1 even this benefit goes out the
           | window.) How does that save you or your users bandwidth,
           | server load, or other overhead?
           | 
           | The scale is not quite linear, but generally speaking, if you
           | get your number of requests down from > 100 to < 5, you'll be
           | able to handle around 20x the traffic with the same number of
           | web-facing servers. Or alternatively the same amount of
           | traffic with around 1 / 20th the servers.
           | 
           | Would that have a material effect on your costs?
           | 
           | [1] https://www.alexa.com/siteinfo/mangadex.org
        
             | tristan9 wrote:
             | Definitely needs optimising for user experience indeed!
             | 
             | However the serving of this JS has nearly no cost to us (as
             | they are cached at the edge by DDoS-Guard and the frontend
             | is otherwise entirely static on our end)
        
         | tyingq wrote:
         | >A fresh load of the home page generates over 100 requests.
         | 
         | I see 17 requests, all over either h2 or h3. 4 of them JS, and
         | 2 images.
        
           | maxk42 wrote:
           | Then you're not doing a fresh load of the page. There are
           | over 30 images visible on the front page, so your measure
           | doesn't pass the smell test, does it?
        
             | tyingq wrote:
             | >Then you're not doing a fresh load of the page
             | 
             | Nope. Different problem.
             | 
             | The article was linked to a page under the domain
             | "mangadex.dev".
             | 
             | Without any other context, I had assumed "home page" meant
             | http://mangadex.dev , or what I got when clicking "Home" on
             | the linked article.
             | 
             | Apparently not.
        
         | radicalbyte wrote:
         | Do you manage to get as many buzz-words and OSS products into
         | your system as they do? :)
         | 
          | In general, the fewer moving parts you have in a system,
          | the more reliable, secure, efficient and cheap the system
          | becomes.
         | 
         | In their case they run a site that is probably under constant
         | attack by the "hired goons", so they're going to need to have
         | more moving parts than others. Plus they will want to optimise
         | for minimal development time (it's a hobby) so just adding
         | another tried and trusted system into the stack to do something
         | you need makes sense.
        
           | starfallg wrote:
           | >In their case they run a site that is probably under
           | constant attack by the "hired goons", so they're going to
           | need to have more moving parts than others.
           | 
            | That's taken care of by the DDoS-Guard system they placed
            | in front of their infrastructure. The design of their
            | system has to take this into account, but that is mainly
            | at the IP and DNS level. The design of their stack behind
            | the load balancer is mainly driven by their functional
            | and non-functional requirements, rather than by the need
            | to prevent DDoS attacks.
        
             | radicalbyte wrote:
             | The layering - defence in depth - is very much a security
             | consideration. Especially if you're building a pure
             | request/response/sync system you need that. Or you decouple
             | with a queue for mutations and avoid a lot of issues.
        
               | starfallg wrote:
                | That may be true in terms of managing general
                | security, especially with regards to the attack
                | surface of the solution, but here we are talking
                | about DDoS, which is mostly a separate topic and is
                | handled at the network level (for volumetric
                | attacks), the load-balancer level (for non-volumetric
                | attacks), or a combination of both.
        
           | maxk42 wrote:
           | lol
           | 
           | > In general the less moving parts you have in a system the
           | more reliable, secure, efficient and cheaper the system
           | becomes.
           | 
            | 100% agreed. This is not my first high-traffic site, nor
            | even the highest. (I built the analytics system for an
            | Alexa top-10 site in 2010, reaching some 30 billion
            | writes / day off of a mere 14 small EC2 instances.) I've
            | never seen a k8s implementation in production that was
            | necessary.
            | 
            | I will note that my Alexa-2k site is also a personal site
            | (no revenue) and under constant attack. In fact we
            | frequently suffer DDoSes that we don't even notice until
            | reviewing the logs later, because the site doesn't suffer
            | any latency under pressure.
        
             | radicalbyte wrote:
             | Interesting, wouldn't mind having a chat outside of HN if
             | you're interested (see my profile for mail).
             | 
             | I've spent much of my career working on systems with active
             | users from the hundreds to low thousands, but which process
             | a huge number (50k/sec scale) jobs/tasks.
             | 
             | It's a totally different kettle of fish, and if I'm totally
             | honest I'm shocked at how badly "web" scales and how common
             | these naive and super inefficient implementations are
             | (hint: my bare-metal server from 2005 was faster than
             | expensive cloud VMs).
             | 
              | Recently I've worked on two high-usage systems (one of
              | which was "handling" 30k requests/second for the first
              | couple of weeks).
        
               | golergka wrote:
               | > I've spent much of my career working on systems with
               | active users from the hundreds to low thousands, but
               | which process a huge number (50k/sec scale) jobs/tasks.
               | 
               | MMO games, by any chance?
        
               | [deleted]
        
             | Folcon wrote:
             | Would you mind outlining your approach?
             | 
             | Really interested to see how you think about this sort of
             | thing =)...
        
               | maxk42 wrote:
               | My approach to what?
        
               | Folcon wrote:
               | To architecting a high traffic site =)...
        
               | Cipater wrote:
               | He posted a reply to his own comment.
               | 
               | https://news.ycombinator.com/item?id=28443113
        
               | maxk42 wrote:
               | Actually, my reply was to Folcon. HN simply doesn't allow
               | you to reply to comments beyond a certain depth
               | sometimes.
               | 
               | Perhaps mods have the ability to extend this for active
               | discussions and that's why I can reply now?
        
               | detaro wrote:
               | it's timing based. you can always reply by going to the
               | permalink of the comment you want to reply to.
        
               | maxk42 wrote:
               | Couldn't reply to this comment - but sure enough, the
               | permalink gives me the option. Thank you for the info!
        
               | detaro wrote:
               | Yeah, it's a somewhat well-meaning feature (supposed to
               | slow down flamewars) that is extremely unintuitive
        
               | maxk42 wrote:
               | (1) Simple beats complex.
               | 
               | (2) You can spend weeks building complex infrastructure
               | or caching systems only to find out that some fixed C in
               | your equation was larger than your overhead savings. In
               | other words: Measure everything. In other other words:
               | Premature optimization is the root of all evil.
               | 
               | (3) Fewer moving parts equals less overhead. (Again:
               | Simple beats complex.) It also makes things simpler to
               | reason about. If you can get by without the fancy
               | frameworks, VMs, containers, ORM, message queues, etc.
               | you'll probably have a more performant system. You need
               | to understand what each of those things does and how and
               | why you're using them. Which brings me to:
               | 
                | (4) Learn your tools. You can push an incredible
                | amount of performance out of MySQL, for instance: if
                | you learn to adjust its settings, benchmark different
                | DB engines for your application, test different
                | approaches to building your schemas, test different
                | queries, and make use of tools like the EXPLAIN
                | statement and so on (see the sketch after this list),
                | you'll probably never need to do something silly like
                | make half a dozen round-trips to the database in a
                | single page load.
               | 
               | (5) Understand your data. Reason about the data you will
               | need before you build your application. If you're working
               | with an existing application, make sure you are very
               | familiar with your application's database schema. Reason
               | ahead of time about what requirements you have or will
               | have, and which data will be needed simultaneously for
               | different operations. Design your database tables in such
               | a way as to minimize the number of round-trips you will
               | need to make to the database. (My rule of thumb: Try to
               | do everything in a single request per page, if possible.
               | Two is acceptable. Three is the maximum. If I need to
               | make more than three round-trips to the database in a
               | single page request, I'm either doing something too
               | complex or I seriously need to rethink my schema.)
               | 
               | (6) Networking is slow. Minimize network traversal. Avoid
               | relying on third-party APIs where possible when
               | performance counts. Prefer running small databases local
               | to the web server to large databases that require network
               | traversal to reach. This is how I handled 30 billion
               | writes / day: 12 web servers with separate MySQL
               | instances local to each sharded on primary key IDs. The
               | servers continuously exported data to an "aggregation"
               | server, which was subsequently copied to another server
               | for additional processing. Having the web server and
               | database local to the same VM meant they didn't need to
               | wait for any network traversal to record their data. I
               | could've easily needed several times as many servers if I
               | had gone with a traditional cluster due to the additional
               | latency. When you need to process 25,000 events in a
               | second, every millisecond counts.
               | 
               | (7) Static files beat the hell out of databases for read-
               | only performance. (Generally.)
               | 
               | (8) Sometimes you can get things moving even faster by
               | storing it in memory instead of on disk.
               | 
               | (9) Reiterating what's in (3): Most web frameworks are
               | garbage when it comes to performance. If your framework
               | isn't in the top half of the Techempower benchmarks, (or
               | higher for performance-critical applications) it's
               | probably going to be better for performance to write your
               | own code if you understand what you're doing. Link for
               | reference: https://www.techempower.com/benchmarks/ Note
               | that the Techempower benchmarks themselves can be
               | misleading. Many of the best performers are only there
               | because of some built-in caching, obscure language hack,
               | or standards-breaking corner-cutting. But for the
               | frameworks that aren't doing those things, the benchmark
               | is solid. Again, make sure you know your tools and _why_
                | the benchmark rating is what it is. Note also that
                | some entire languages don't really show up in the top
                | half of the Techempower benchmarks. Take that into
                | consideration if
               | performance is critical to your application.
               | 
               | (10) Most applications don't need great performance.
               | Remember that a million hits a day is really just 12 hits
               | per second. Of course the reality is that the traffic
               | doesn't come in evenly across every second of the day,
               | but the point remains: Most applications just don't need
               | that much optimization. Just stick with (1) and (2) if
               | you're not serving a hundred million hits per day and
               | you'll be fine.
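                | 
                | A tiny example of (4) (a sketch with PyMySQL and a
                | hypothetical table; adapt the query to your schema):
                | 
                |     # Inspect the query plan before shipping a query;
                |     # the key/rows/Extra columns show whether an
                |     # index is actually being used.
                |     import pymysql
                | 
                |     conn = pymysql.connect(host="localhost",
                |                            user="app",
                |                            password="secret",
                |                            database="app")
                |     with conn.cursor() as cur:
                |         cur.execute(
                |             "EXPLAIN SELECT id, title FROM chapters "
                |             "WHERE manga_id = %s "
                |             "ORDER BY published_at DESC",
                |             (42,),
                |         )
                |         for row in cur.fetchall():
                |             print(row)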
        
               | Folcon wrote:
               | Thanks, this is a good list in general of things to think
               | about =)...
               | 
               | I've not really ever applied 9 myself, I've run
               | comparative benchmarks a couple of times, but not thought
               | about using that as a basis for whether to roll my own on
               | critical performance parts.
        
               | probotect0r wrote:
               | Can you share how you do logging/monitoring/alerting for
               | your site?
        
               | maxk42 wrote:
               | Bash scripts and cron. Automatic alerts go out to devs
               | via OpsGenie when resource availability drops so we can
               | get out ahead of it. 0 seconds of downtime in the past 12
               | months.
        
               | arthur_sav wrote:
               | > Simple beats complex. > Fewer moving parts equals less
               | overhead.
               | 
               | Took me almost a decade to really comprehend this.
               | 
               | I used to include all sorts of libraries, try out all the
               | fancy patterns/architectures etc...
               | 
                | After countless hours debugging production issues...
                | the best code I've ever written is the code with the
                | fewest moving parts. It's easier to debug, and the
                | issues are predictable.
        
               | kiba wrote:
               | "The best part is no part." is an engineering quote I
               | heard.
        
               | arethuza wrote:
               | I'm sure I've heard something like "engineering is
               | solving problems while doing as little new as possible".
        
               | vymague wrote:
               | > But for the frameworks that aren't doing those things,
               | the benchmark is solid.
               | 
               | Any example of such frameworks?
        
               | manigandham wrote:
               | (ASP).NET is solid. Extremely fast, very reliable, and
               | highly productive.
               | 
               | https://dotnet.microsoft.com/apps/aspnet
        
               | arethuza wrote:
               | "Simple beats complex."
               | 
               | In the very first lecture of the Computer Science degree
               | I did in the 1980s the lecturer emphasised KISS, and said
               | that while we almost certainly wouldn't believe it at
               | first eventually we'd realise that this is the most
               | important design principle of all. Probably took me ~15
               | years... ;-)
        
               | politelemon wrote:
               | Sadly I think this is a lesson that we as an industry
               | consistently keep unlearning.
        
         | polote wrote:
         | I don't know about you, but they have 42 average Page-views per
         | visit (HN has 3) so Alexa rank is going to be biased
        
       | sxhunga wrote:
       | Interesting to know more about news.ycombinator.com !
        
       | jiggawatts wrote:
       | What kills me is that this was a rather pedestrian outcome on a
       | much cheaper 2-core virtual machine back in 2007 or so.
       | 
       | I easily got 3K requests / sec out of my _laptop_ at the same
       | time, and it was not a trivial app!
       | 
       | People's expectations have shifted so much it's absurd. If you
       | look at the TechEmpower benchmarks, ordinary VMs can easily push
       | 100K requests per second, no sweat, even with managed languages.
       | 
       | Trivial stuff like static content being treated _as static
       | content_ (files on the disk!) not as a distributed cache in front
       | of a database can do wonders.
       | 
       | Am I just old and jaded?
        
         | [deleted]
        
         | z3t4 wrote:
          | 2k sockets on a test bed vs 2k real user requests in
          | production is very different. I doubt you ran a top-1000
          | Alexa site on your laptop. Today we also need to deal with
          | SSL, which eats into the performance budget.
        
           | jiggawatts wrote:
           | > SSL which eats from the performance budget.
           | 
           | That was a short-lived thing, and has now become a myth
           | perpetuated by companies like Citrix and F5 that sell "SSL
           | offload" appliances for $$$.
           | 
           | Have you benchmarked the overhead of TLS?
           | 
            | In my experience, a _single CPU core_ can easily put out
            | multiple gigabytes per second of AES-256 (tens of
            | gigabits). This benchmark shows 3 GB/s (24 Gbps) for
            | recent AMD CPUs, and nearly 40 Gbps per core for an Intel
            | CPU: https://calomel.org/aesni_ssl_performance.html
            | 
            | A multi-core server is very unlikely to have more than a
            | 1-5% overhead due to TLS. Even connection setup is a
            | minor overhead with elliptic curve certificates.
           | 
           | This is thanks to the AES offload instructions, which are
           | present in all server CPUs made any time in the last 5-7
           | years or so. As long as the modern Galois Counter Mode (GCM)
           | is used with AES, performance should be great.
           | 
            | Meanwhile, Citrix ADC v13 with a hardware "SSL offload
            | card" _actually slows down_ connections! I had a very
            | hard time getting more than 600 Mbps through one. It
            | seems to come down to how the ASIC offload chip is
            | architected: it apparently uses a large number of slow
            | cores, a bit like a GPU. This means that any one TLS
            | stream will have its bandwidth capped!
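            | 
            | You can sanity-check this on your own hardware; a rough
            | sketch using Python's `cryptography` package (which calls
            | into OpenSSL), so the result reflects AES-NI, not Python:
            | 
            |     # Crude AES-256-GCM throughput check. Nonce reuse is
            |     # fine here only because it's a benchmark, never in
            |     # real encryption.
            |     import time
            |     from cryptography.hazmat.primitives.ciphers.aead import AESGCM
            | 
            |     key = AESGCM.generate_key(bit_length=256)
            |     aead = AESGCM(key)
            |     nonce = b"\x00" * 12
            |     data = b"\x00" * (16 * 1024 * 1024)  # 16 MiB
            | 
            |     start = time.perf_counter()
            |     for _ in range(32):
            |         aead.encrypt(nonce, data, None)
            |     elapsed = time.perf_counter() - start
            |     print(f"{32 * 16 / 1024 / elapsed:.2f} GiB/s")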
        
             | tpetry wrote:
              | The problem with these benchmarks is that they measure
              | the bandwidth you can push through an established TLS
              | connection. Try to build 2,000 new TLS connections a
              | second (yes, many are still active and don't need to be
              | restarted); that is the really slow part, not sending
              | data over already established channels.
        
         | adreamingsoul wrote:
         | No. I think you have a healthy perspective and we should all be
         | questioning if current trends are beneficial/sustainable.
         | 
          | I haven't read the article, but the headline alone seems
          | alarming to me: $1,500 a month is a lot of money for only
          | 2k rps.
        
           | reilly3000 wrote:
            | Maybe it's a lot of money just for the web servers, but
            | for the entire infrastructure stack it's pretty
            | reasonable IMHO.
        
           | Vosporos wrote:
           | Then you should read the article ;)
        
           | rk06 wrote:
            | There is more to it than HTTP request/response. MangaDex
            | also needs to store a lot of images and distribute them.
        
             | hnlmorg wrote:
             | CDNs have already solved this problem and are much cheaper
             | than $1500/month.
             | 
              | I've run far more complex sites with much higher
              | traffic for less.
        
               | rk06 wrote:
               | Mangadex can't use cloudflare because of privacy reasons.
               | They may be facing similar issues with other popular
               | CDNs.
               | 
                | I am sure they must be using some kind of CDN;
                | however, those options are unlikely to be free.
        
               | hnlmorg wrote:
               | Privacy reasons? It's all static content that is publicly
               | accessible. I don't understand what the privacy reasons
               | could be under this context.
               | 
               | Are they worried about CDNs logging the images their
               | visitors access? Seems like an absurd edge case to worry
               | about in my opinion.
               | 
               | > _however, those options are unlikely to be free_
               | 
               | I wasn't even talking about free CDNs :)
        
               | vymague wrote:
                | I think privacy as in the MangaDex team don't want to
                | get sued. So they avoid popular services that are
                | more willing to share their identity.
        
               | nrabulinski wrote:
               | They're basically hosting illegal content, or at least a
               | good chunk of it is copyright-infringing so they cannot
               | use cloudflare or any of the other off the shelf
               | offerings
        
               | hnlmorg wrote:
               | I see. That does complicate things somewhat.
               | 
               | I wonder if there's merit in them approaching studios
               | with a proper business plan?
        
               | Jcowell wrote:
                | Not with the way the manga, manhwa, and webtoon
                | industry is tied up. But I believe that is their end
                | goal eventually.
        
               | level3 wrote:
               | Hm? There are lots of copyright-infringing sites using
               | Cloudflare, and Cloudflare seems pretty content to
               | generally ignore infringement notices.
        
               | ricktdotorg wrote:
               | Not disputing your statement, just made me laugh a bit
               | because almost every single site I visit recently that
               | offers links to copyrighted content [stored in file
               | lockers] sits behind Cloudflare
        
         | willvarfar wrote:
         | Excellent post, good technical content, amazing feat.
         | 
         | That said, I echo that the amazing feat is that they can fit
         | modern inefficient tool choices with poor mechanical sympathy
         | into that budget. The last decade of web-dev tooling has been
         | pushing the TCO of systems through the roof and this post is
         | all about how to struggle against that whilst using those
         | tools.
         | 
         | If they went old-school they'd get another order of magnitude
         | savings. Many veterans know of systems doing 10x that in 10x
         | less cost. Remember C10K was in 1999.
        
           | vymague wrote:
           | > If they went old-school they'd get another order of
           | magnitude savings. Many veterans know of systems doing 10x
           | that in 10x less cost. Remember C10K was in 1999.
           | 
           | How to learn more about the old-school way without getting a
           | job related to it? Like, topic or book recommendations.
        
             | thoughtFrame wrote:
             | I'm also curious about this, since in many areas of
             | computing (not only webdev) the old-school guys take some
             | stuff as so obvious that they don't even bother writing
             | about it or explaining it beyond "This is obviously
             | possible, dude". They know how to achieve this level of
             | performance, but for everyone else, we have to cobble
             | together fragmented insights. So if anyone out there reads
             | this and thinks like the GP, please do write about it ;)
        
               | hsn915 wrote:
               | What topics specifically are you interested in? And where
               | do all the people like you hang out?
               | 
               | I'm not an old-school guy by any means .. but I might
               | have something to contribute.
        
               | vymague wrote:
               | > What topics specifically are you interested in?
               | 
               | Well, for example, what's the old-school alternative to
               | mangadex's solution?
               | 
               | > And where do all the people like you hang out?
               | 
               | We are here on HN.
        
         | colesantiago wrote:
         | Not just you, but if it works for them, that's completely fine.
         | 
          | But there are many ways to achieve 20K RPS without this
          | type of architecture, and especially without k8s, for less
          | than $1,500.
        
           | bob1029 wrote:
           | >20k RPS.
           | 
            | If this metric is what you are chasing, there are ways to
            | reliably break 1 million RPS using a single box if you
            | _don't_ play the shiny BS tech game. The moment you
            | involve multiple computers and containers, you are
            | typically removed from this level of performance. Going
            | from 2,000 to 2,000,000 RPS (serialized throughput)
            | requires many ideological sacrifices.
           | 
           | Mechanical sympathy (ring buffers, batching, minimizing
           | latency) can save you unbelievable amounts of margin and time
           | when properly utilized.
        
             | nine_k wrote:
             | I frankly don't see where containers could lower the
             | performance.
             | 
             | Basically a container is a glorified chroot. It has the
             | same networking unless you asked for isolation, then
             | packets have to follow a local (inside the host) route. It
             | has exactly no CPU or kernel interface penalty.
             | 
              | Maybe you meant container orchestration like k8s, with
              | its custom network fabric, etc.
        
               | krageon wrote:
               | > I frankly don't see where containers could lower the
               | performance.
               | 
               | Have you seen most k8s deployments? It's not the
               | containers, it's the thoughtspace that comes with them.
               | Even just using bare containers invites a level of
               | abstraction and generally comes with a type of developer
               | that just isn't desirable.
        
               | bob1029 wrote:
               | Even loopback is significantly slower than a direct
               | method invocation.
        
         | genewitch wrote:
          | In 2011 a company I contracted for was testing some new
          | Dell 1U servers with around 1-2TB of RAM. There was a
          | Postgres database doing 4000qps that could fit into tmpfs,
          | so I restricted Postgres to 640Kb of memory and we got
          | replication working; it took about 6 hours of babysitting.
         | 
          | We threw the switch and watched as Postgres, with 640Kb of
          | RAM and a tmpfs-backed store, proceeded to handle all of
          | the query traffic. There were some stored procedures or
          | something that were long-querying or whatever - I'm not a
          | DB person at all - so we switched back to the regular
          | production server about 8 minutes later.
         | 
         | Yes, we did it in production.
        
           | winrid wrote:
           | Postgres handles low memory situations well. It'll kill
           | memory intensive queries before it crashes. I wonder if your
           | application was getting a lot of errors back instead of
           | successful queries :)
        
             | genewitch wrote:
              | The application performed fine, even though we made the
              | switch around 15:00 PST. The DBA was concerned because
              | of the few long queries.
              | 
              | Obviously the tmpfs was doing the heavy lifting there -
              | and if I had to do a postmortem, I'd wager that filling
              | the OS caches was the main reason the long queries took
              | so long. We didn't do any sort of performance tracing.
             | 
              | The main purpose was to show that these $35k servers
              | could essentially replace the older machines if need
              | be, even though the old ones had FusionIO. I just
              | removed the middleman of the PCIe bus between the
              | application and the memory. It was a near-constant
              | argument on the floor whether or not we could feasibly
              | switch to SSDs in some configuration over spinning rust
              | or even FusionIO; I wanted a third option.
             | 
             | Basically, serve out of registered, ECC memory in front,
             | replicate to the fusionIO and let those handle the spindled
             | backups, which iirc was a pain point.
        
           | bagels wrote:
           | For real, 640 kilobits?
        
             | sigstoat wrote:
             | K isn't the abbreviation for kilo, so if you're going to
             | rag on the fellow for the 'b', then you should at least be
             | asking what a Kelvin*bit is.
        
               | iandinwoodie wrote:
               | "The binary meaning of the kilobyte for 1024 bytes
               | typically uses the symbol KB, with an uppercase letter
               | K." [0]
               | 
               | 0. https://en.m.wikipedia.org/wiki/Kilobyte
        
               | jmiserez wrote:
               | 640KiB is very little and I'm wondering if it's a typo,
               | given that the servers had 1-2TiB available. Postgres 9.0
               | released in 2010 already had 32MiB as the default for
               | shared_buffers (with a minimum of 128KiB):
               | https://www.postgresql.org/docs/9.0/runtime-config-
               | resource.... and 8.1 released in 2005 used 8MB
               | (1000*8KiB): https://www.postgresql.org/docs/8.1/runtime-
               | config-resource....
        
               | sigstoat wrote:
               | i interpreted it as "we wanted to turn the shared buffers
               | ~off, but in a hilarious way that would suggest to
               | someone reading the configuration file that something was
               | going on" (bill gates, mumble mumble)
               | 
               | but, wtf do i know, i'm the crazy guy who tries to
               | interpret comments generously.
        
               | genewitch wrote:
               | Yes, it was a direct reference to Bill Gates "640
               | kilobytes is enough for anyone" and i typed the comment
               | right before i fell asleep.
        
               | bagels wrote:
               | The question was more about the kilo part, even though I
               | didn't clarify. Seems orders of magnitude too small?
        
         | [deleted]
        
         | huijzer wrote:
         | To add to that, in 2015, a Cloudflare engineer showed that you
         | can receive 1 million packets per second
         | (https://blog.cloudflare.com/how-to-receive-a-million-
         | packets...). Without processing them, though.
        
         | kragen wrote:
         | I think even a distributed cache in front of a database
         | shouldn't have any trouble handling 2000 requests per second.
         | 
         | The issue is not really the number of requests per second,
         | probably, but the number of bytes, which they don't talk about
         | at all in the article; reading manga with no ads is a pretty
         | static kind of application, which could be satisfied amply with
         | a web browser or even a much simpler program loading images
         | from a filesystem directory.
         | 
         | Valgrind claims httpdito runs a few thousand instructions per
         | request, but that's not really accurate; what happens is that
         | the kernel is doing all the work. httpdito on Linux can handle
         | about 4000 requests per second per core, nearly a million clock
         | cycles per request, almost all of which is in the kernel. Of
         | course it doesn't ship its logs off to Grafana. In fact, it
         | doesn't have logs at all. But it would work fine for reading
         | manga.
        
           | antupis wrote:
           | Also, those 2000 requests per second have to happen 24/7,
           | not just in a quick demo session.
        
             | kragen wrote:
             | httpdito is nothing if not consistent in its performance.
             | It doesn't have enough state to have many ways to perform
             | well at first and then more poorly later, or for that
             | matter vice versa. (Not saying it couldn't happen, but it's
             | not that likely.) Linux is pretty good about consistent
             | performance, too, though it has more state.
        
           | buzer wrote:
           | > The issue is not really the number of requests per second,
           | probably, but the number of bytes, which they don't talk
           | about at all in the article; reading manga with no ads is a
           | pretty static kind of application, which could be satisfied
           | amply with a web browser or even a much simpler program
           | loading images from a filesystem directory.
           | 
           | I assume they are talking about their more dynamic content
           | serving in this post (for things like search, tracking which
           | chapters are read, new chapter listing based on what user
           | follows etc.).
           | 
           | They have a custom CDN that is hosted by volunteers to serve
           | the images for the manga pages. They provide some metrics for
           | that at https://mangadex.network, there are also some older
           | screenshots where they hit 3.2GB/s.
        
             | kragen wrote:
             | Interesting! Thanks! It still doesn't sound like the kind
             | of thing that would require load balancing, but maybe it
             | was easier to write it in Python or PHP or something, and
             | that made it so heavy that it did.
        
           | hsn915 wrote:
           | > distributed cache in front of a database
           | 
           | Already overkill.
           | 
           | Think smaller. Think simpler.
           | 
           | A single machine serving files directly from the file system
           | (yes, from the SSD attached to the machine) will be able to
           | handle a _LOT_ more.
        
             | kragen wrote:
             | Well, that's what httpdito does: it serves files directly
             | from the filesystem. That's why I mentioned it. But, for
             | some applications, such as the website you're using right
             | now, it's useful to display pages that haven't been
             | previously stored in the filesystem.
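             | 
             | For reference, the "serve files straight off the
             | filesystem" approach is a handful of lines in roughly any
             | language; a Go sketch (the directory path is made up):
             | 
             |     package main
             | 
             |     import (
             |       "log"
             |       "net/http"
             |     )
             | 
             |     func main() {
             |       // Serve pre-rendered pages/images directly from disk and
             |       // let the kernel page cache do most of the work.
             |       fs := http.FileServer(http.Dir("/srv/manga"))
             |       log.Fatal(http.ListenAndServe(":8080", fs))
             |     }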
        
       | snypher wrote:
       | >more than 10 million unique monthly visitors
       | 
       | >our ~$1500/month budget
       | 
       | I understand not wanting to show ads, but is there no way for the
       | users to contribute to hosting costs?
        
         | [deleted]
        
         | htns wrote:
         | They had a bit under $80k in crypto in their list of BTC and
         | ETH addresses "leaked" along with the source code when the site
         | was hacked earlier this year.
        
         | whateveracct wrote:
         | A $5/mo premium plan would break even so quickly
        
           | tommica wrote:
           | Premium plan on content that can be considered as dubious in
           | copyright context? Seems like a quick way to get shut down.
        
             | hrnn wrote:
              | A premium plan on a virtual badge, NFT or whatever crap
              | you want. Content would still be freely available for
              | everybody, but the infra costs would be a bit lower.
        
           | novok wrote:
           | Or even a patreon style 'you get nothing but a supporter
           | badge' with that kind of traffic levels.
        
         | m45t3r wrote:
         | There is MangaDex@Home, where users can contribute part of
         | their disk space/bandwidth to help serve (mainly old) manga
         | chapters. It does need to be something that is running 24/7
         | (e.g. not a PC that is shut down frequently), so something
         | like a VPS or a similar always-on service is recommended.
        
           | AviKav wrote:
           | Virtually every chapter is served via MD@H now. The client
           | doesn't really need much availability, as long as it can do
           | a graceful shutdown. Even in the event of a sudden shutdown,
           | the trust penalties are much lower than H@H's, and in
           | practice they go away after a trickle of traffic raises your
           | score.
        
             | m45t3r wrote:
             | Nice, didn't know about this (there isn't much information
             | about MD@H after the rewrite).
             | 
              | BTW, how can I register my VPS on MD@H? Before, we had a
              | dedicated form on the page to register interest, but
              | after the rewrite I couldn't find it. Is it only via
              | something like Discord now?
        
               | KatKafka wrote:
               | We had a dedicated page for signing up on the v3 version
               | of the site. Currently, yes, it's via our Discord
               | server's MD@H channels.
        
         | franciscop wrote:
         | From a quick glance it seems to host obviously copyrighted
         | content for free. In some jurisdictions (like Spain) the
         | companies would have a hard time in court against the website
         | creators, since it's a not-for-profit* website sharing culture.
         | 
         | Now show an ad or offer premium accounts, and it becomes a
         | for-profit endeavour, which is straight jail time. I'm unsure
         | about donations.
         | 
         | (Based on previous rulings I followed ~10 years ago, laws might
         | have changed IANAL yada yada)
         | 
         | *not for profit != non-profit
        
           | reilly3000 wrote:
           | They might have a play at being affiliates for sellers of the
           | original material. I suppose a link is an ad, but it's also
           | somehow a less dubious way to monetize, in my mind.
        
       | deathanatos wrote:
       | It's a nice article, I guess, but the site is down (the one
       | discussed in the article, not the blog post itself) for me.
        
         | Semaphor wrote:
         | It works fine for me: https://mangadex.org/
        
         | bytearray64 wrote:
         | You probably have Verizon - they've started null routing
         | traffic to sites "like this".
         | 
         | https://old.reddit.com/r/mangadex/comments/nvj7qf/is_verizon...
        
           | ApolloFortyNine wrote:
           | Ah shucks, I thought we had mostly avoided that stuff in the
           | US.
           | 
           | I'm guessing they're using some old spam IP blocklist,
           | though; there are a lot more obvious piracy sites than a
           | manga site. For instance, I can access all the major torrent
           | sites.
        
             | bytearray64 wrote:
             | They also block nyaa, an Anime/Manga focused tracker. It's
             | not a very aggressive list though, as you're right that
             | major torrent sites are still accessible.
        
           | deathanatos wrote:
           | Huh, right you are, on both counts, it seems.
           | 
           | That's disappointing. If only I had some choice of ISPs, then
           | I could express my disappointment by voting with my wallet...
        
             | bebna wrote:
              | Just call their customer service often enough and tell
              | them the internet isn't working.
        
       | urlgrey wrote:
       | I'm amazed that their architecture doesn't include a CDN. These
       | days I expect nearly all high traffic websites to make use of a
       | CDN for all kinds of content, even content that's not cached.
       | 
       | They cited Cloudflare not being used due to privacy concerns.
       | It'd be interesting to hear more about that, as well as why other
       | CDNs weren't worth evaluating too.
        
         | uyt wrote:
         | What they are doing is unfortunately not legal. There are
         | precedents of Cloudflare ratting out manga site operators,
         | which have led to arrests [1] (the person who ran Mangamura
         | got a 3-year sentence and a $650k fine [2]). And at some point
         | they were going after MangaDex the same way too [3].
         | 
         | A lot of their infrastructure design choices should be viewed
         | with OPSEC constraints in mind.
         | 
         | [1] https://torrentfreak.com/japan-pirate-site-traffic-
         | collapsed...
         | 
         | [2] https://torrentfreak.com/mangamura-operator-handed-three-
         | yea...
         | 
         | [3] https://torrentfreak.com/mangadex-targeted-by-dmca-
         | subpoena-...
        
         | Sebguer wrote:
         | I think the usual argument re: Cloudflare on the privacy front
         | is the fact that they pretty aggressively fingerprint users,
         | and will downgrade or block traffic originating from VPNs or
         | some countries. This is a natural side effect of those things
         | often being tied to abusive traffic, and a lot of it is likely
         | configurable (at least on their paid plans) but it often comes
         | up around this.
        
         | ev1 wrote:
         | It's effectively a warez site. There's a reason why they host
         | in the places they do and can't be too picky about providers.
         | 
         | CF will also pass through things like DMCAs easily.
         | 
         | Based on their sidebar, it's probably hosted at Ecatel or
         | whatever they are called now (cybercrime host) via Epik as a
         | reseller, the provider famous for hosting far-right stuff.
        
           | rovr138 wrote:
           | What's the reason behind where they host and their issues
           | with providers? I haven't heard this before.
           | 
           | Regarding DMCAs, as an entity doing business where they're
           | legal, what should they do as a middleman?
        
             | ev1 wrote:
             | > Regarding DMCA's, as an entity doing business where
             | they're legal, what should they do as a middle man?
             | 
             | Don't use them and instead have your middleman be in a
             | country that ignores intellectual property rights and
             | copyright?
             | 
             | I'm not saying CF is wrong to pass them through. I'm just
             | saying CF is not the right choice for a warez site for
             | longevity.
        
         | baybal2 wrote:
         | Properly tuned NGINX on a physical server can handle an
         | incomparably larger static-content load than some of the
         | "cloud" storage options around.
         | 
         | The "trick" has really been known for a decade, or more. Have
         | as many things static as possible, and only use backend logic
         | for the barest minimum.
        
           | sofixa wrote:
           | That's the _raison d'être_ of nginx, so it is performant for
           | this kind of thing. However, the advantage of a CDN is that
           | they have points of presence around the world, so your user
           | in Singapore doesn't have to do a trip around the world to
           | get to your nginx on a physical box in Lisbon.
        
         | bawolff wrote:
         | What's the benefit of a cdn if nothing is cacheable? Slightly
         | lower latency on the tcp/tls handshake? That seems pretty
         | insignificant.
        
           | sofixa wrote:
           | In their case (manga), seems like the vast majority of the
           | content is cacheable.
        
           | manigandham wrote:
           | Latency makes a bigger impact on UX than throughput for
           | general browsing. A TLS handshake can be multiple roundtrips
           | that greatly benefit from lower latency, especially mobile
           | devices.
           | 
           | Modern CDNs also provide lots of functionality from security
           | (firewall, DDOS) to application delivery (image optimization,
           | partial requests).
        
           | ev1 wrote:
           | The CDN part is kind of pointless because they can't really
           | have nodes in large parts of the western world since.. it's a
           | warez site. The CDN providers will get takedowns, requests to
           | reveal the backing origin, etc. You can't use a commodity CDN
           | provider for this.
        
         | gaudat wrote:
         | They do have a crowdsourced CDN called Mangadex@Home. I
         | participated in it from last year until the site was hacked.
         | The aggregate egress speed was around 10 Gbps.
         | 
         | The NSFW counterpart of MD also has a CDN appropriately named
         | Hentai@Home run by volunteers.
         | 
         | These 2 sites are the only ones I know of that roll their own
         | CDN for free.
        
       | mdoms wrote:
       | I loaded the front page of Mangadex and it made 114 web requests
       | including 10 first-party XHR requests, 30(!!!!) Javascript
       | resource requests and somehow 4 font requests, without me
       | interacting with the page. Clicking one of the titles on the
       | front page resulted in nearly 40 additional requests.
       | 
       | Perhaps if you are limited by requests per second you could
       | consider how many requests a single user is making per
       | interaction, and if this is a reasonable number.
       | 
       | The website is impressively fast though, I'll give you that.
        
         | watermelon0 wrote:
         | The frontend framework they use (Nuxt) uses code splitting
         | [1], which means that:
         | 
         | - first request is fast, because you only need to download
         | chunks required for a single page/controller (and you prefetch
         | others in the background)
         | 
         | - changing some parts of the codebase requires re-downloading
         | only the affected chunks, instead of the whole bundle
         | 
         | [1] https://www.telerik.com/blogs/what-you-should-know-code-
         | spli...
        
         | m45t3r wrote:
         | They're probably more limited by bandwidth than requests per
         | second, but any way you look at it, the number of requests is
         | still impressive considering the budget.
         | 
         | BTW, the site is not just fast: they serve images at high
         | quality (same as the original [1], which can be multiple MBs
         | per page [2]) at a pretty impressive speed too.
         | 
         | [1]: before someone asks why they don't optimize the images,
         | this is by design since they want to serve high quality images.
         | There is an optional toggle to reduce the image size, but this
         | is disabled by default.
         | 
         | [2]: for those not familiar, the average number of pages in a
         | manga chapter is something like ~20, and a chapter can be read
         | in ~5 minutes depending on the density of the text. So you can
         | easily consume 50MB+ per chapter.
        
           | [deleted]
        
       | [deleted]
        
       | jhgg wrote:
       | They mention a $1,500 budget per month but then omit things
       | critical to understanding how they achieve that cost point.
       | 
       | What is actually more interesting is to understand what portion
       | is spent on servers versus bandwidth - and what hardware
       | configuration they use to host the site. For example, is
       | $1,500/mo just paying for colo costs + bandwidth, with already-
       | owned recycled hardware (think last-gen hardware that you can
       | get at steep discounts from eBay / used hardware resellers...)?
       | 
       | That would have been way more interesting to know given the blog
       | title than the choice of infrastructure software they use.
        
       | chime wrote:
       | Not familiar with the project but it is great to see a
       | counterpart to over-provisioned enterprise infrastructure. $10 in
       | 2021 can do what $100 in 2011 did, what $1000 in 2001 did, and
       | that is not solely due to hardware. Well-designed deployments of
       | K8s, KVM/LXC, Ceph, LBs like this project can handle so much more
       | traffic than poorly configured Wordpress storefronts.
       | 
       | They're using battle-tested tech from Redis and RabbitMQ to
       | Ansible and Grafana. Nothing super fancy, nothing used just for
       | the sake of being modern. Not sure how long it took them to end
       | up with this architecture but it doesn't look like a new dev
       | would have a hard time getting familiar with how everything
       | works.
       | 
       | Would definitely like to hear more about their dev environment,
       | how it is different from prod, and how they handle the
       | differences.
        
         | novok wrote:
         | I think enterprises optimize more for business flexibility
         | and the ability to A/B test very rapidly than for a finely
         | crafted piece of efficiency, for better or worse. The people
         | behind this
         | probably do this for their day job, or are teens that are about
         | to do it for their day job.
        
           | chime wrote:
           | I agree with you. I mostly work in enterprise and understand
           | that it has different needs and ROI requirements. However, my
           | personal mindset is that computers and networks are really
           | really fast now and it's a tragedy that most of these gains
           | are nullified due to unoptimized layers of abstraction or
           | over-architecting. So it's a welcome sight to read about
           | well-designed infrastructure like this.
        
             | novok wrote:
              | It happens because businesses are optimizing for
              | resources that are ultimately more expensive or slower to
              | acquire: staffing levels and the ability to respond to
              | the market so the business can grow or survive longer.
              | Inefficient
             | computing architecture as a side effect is a worthwhile
             | tradeoff in light of that to them.
             | 
             | But as a craftsman, it is definitely nice :)
        
         | tristan9 wrote:
         | > Would definitely like to hear more about their dev
         | environment, how it is different from prod, and how they handle
         | the differences.
         | 
         | It's honestly quite boringly similar (hence it's only vaguely
         | alluded to in the article).
         | 
         | Take out DDoS-Guard/external LBs (no need for public exposure
         | in dev), pick a cheap-o cloud provider to get niceties like
         | quick rebuilding with Terraform etc., slap on a VPC-like thing
         | to make it a similar private network (do use a different
         | subnet so copy-pasted typos across dev and prod are
         | impossible) and scale down everything (an ES node has 8 CPUs
         | and 24GB RAM in prod? It will have to do with 2 vCPUs and 2GB
         | RAM in dev).
         | 
         | One of the annoying things is you do want to test the
         | replicated/distributed nature of things, so you can't just
         | throw everything on a single-instance-single-host because it's
         | dev, otherwise you miss out on a lot of the configuration being
         | properly tested, which ends up a bit costlier than necessary
        
       | Shadonototro wrote:
       | How can this be legal?
       | 
       | It's basically pirating content
        
       | latch wrote:
       | I've done things at scale (5-10K req/s) on a budget ($1000 USD)
       | and I've done things at much smaller scales that required a much
       | larger budget.
       | 
       | _How_ you hit scale on a budget is one part of the equation. The
       | other part is: what you're doing.
       | 
       | Off the top of my head, the "how" will often involve the
       | following (just to list a few):
       | 
       | 1 - Baremetal
       | 
       | 2 - Cache
       | 
       | 3 - Denormalize
       | 
       | 4 - Append-only
       | 
       | 5 - Shard
       | 
       | 6 - Performance focused clients/api
       | 
       | 7 - Async / background everything
       | 
       | These strategies work _really_ well for catalog-type systems:
       | amazon.com, wiki, shopify, spotify, stackoverflow. The list is
       | virtually endless.
       | 
       | But it doesn't take much more complexity for it to become more
       | difficult/expensive.
       | 
       | Twitter's a good example. Forget twitter-scale, just imagine
       | you've outgrown what 1 single DB server can do, how do you scale?
       | You can't shard on the `author_id` because the hot path isn't
       | "get all my tweets", the hot path is "get all the tweets of the
       | people I follow". If you shard on `author_id`, you now need to
       | visit N shards. To optimize the hot path, you need to duplicate
       | tweets into each "recipient" shard so that you can do: "select
       | tweet from tweets where recipient_id = $1 order by created desc
       | limit 50". But this duplication is never going to be cheap (to
       | compute or store).
       | 
       | (At twitter's scale, even though it's a simple graph, you have
       | the case of people with millions of followers which probably need
       | special handling. I assume this involves a server-side merge of
       | "tweets from normal people" & RAM["tweets from the popular
       | people"].)
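       | 
       | To make that duplication concrete, a fan-out-on-write sketch in
       | Go (the follower lookup and per-shard insert are hypothetical
       | stand-ins):
       | 
       |     package main
       | 
       |     import (
       |       "fmt"
       |       "hash/fnv"
       |     )
       | 
       |     // shardFor maps a user ID onto one of numShards databases.
       |     func shardFor(userID string, numShards int) int {
       |       h := fnv.New32a()
       |       h.Write([]byte(userID))
       |       return int(h.Sum32()) % numShards
       |     }
       | 
       |     // fanOutTweet duplicates a new tweet into the shard of every
       |     // follower, so the hot path ("tweets of the people I follow")
       |     // becomes a single-shard read.
       |     func fanOutTweet(authorID, tweet string,
       |       followersOf func(string) []string,
       |       insertIntoShard func(shard int, recipientID, tweet string)) {
       |       for _, followerID := range followersOf(authorID) {
       |         insertIntoShard(shardFor(followerID, 8), followerID, tweet)
       |       }
       |     }
       | 
       |     func main() {
       |       followers := func(string) []string { return []string{"alice", "bob"} }
       |       insert := func(shard int, recipient, tweet string) {
       |         fmt.Printf("shard %d <- (%s, %q)\n", shard, recipient, tweet)
       |       }
       |       fanOutTweet("carol", "hello world", followers, insert)
       |     }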
        
         | winrid wrote:
         | I've heard in a few talks how Twitter engineers have
         | accidentally run into OOM problems by loading too big a
         | follower graph into memory in application code. I think it's
         | a nice reminder that at scale even big companies make the
         | easy mistakes, and you have to architect for them.
        
         | ignoramous wrote:
         | > _Twitter's a good example._
         | 
         | Mike Cvet's talk about Twitter's fan-in/fan-out problem and its
         | solution makes for a fascinating watch: https://www.youtube-
         | nocookie.com/embed/WEgCjwyXvwc
        
           | Cipater wrote:
           | I appreciate the no-cookie embed.
           | 
           | Learned something new today.
        
         | trampi wrote:
         | Reads like a small excerpt out of "Designing Data-Intensive
         | Applications" :)
        
           | chairmanwow1 wrote:
           | This is an amazing book that improved my effectiveness as an
           | engineer by an undefinable amount. Instead of just randomly
           | picking components for a cloud application, I learned that I
           | could pick the right tools for the job. This book does a
           | really good job communicating the trade-offs between
           | different designs and tools.
        
             | ignoramous wrote:
             | I have always wondered "what next" after having read data-
             | intensive. Some suggested looking at research publications
             | by Google, Facebook, and Microsoft. What do others
             | interested in the field read?
        
         | kalev wrote:
         | The 1-7 list you mention definitely deserves its own blog
         | post on how to implement these. I'm currently not using any
         | of these except 1, and probably don't need the rest for a
         | while, but I do want to know what I should do when I need
         | them. For example: what and how should things be cached? When
         | and how to denormalize, and why is it needed? Why append-only
         | and how? Never
         | 'sharded' before, no idea how that works. Heard some things of
         | everything async/in the background, but how would that work
         | practically?
        
           | BrentOzar wrote:
           | > what and how should things be cached?
           | 
           | If something is read much more frequently than it changes,
           | store it client-side, or store it temporarily in an in-
           | memory-only, not-persisted-to-disk "persistence" layer like
           | Redis.
           | 
           | For example, if you're running an online store, your product
           | list doesn't change all that often, but it's queried
           | constantly. The single source of truth lives in a relational
           | database, but when your app needs to fetch the list of
           | products, it should first check the caching layer to see if
           | it's available there. If not, fetch it from the database, but
           | then write it into the cache so that it's available more
           | quickly the next time you need it.
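           | 
           | A minimal read-through cache looks roughly like this Go
           | sketch (an in-memory map with a TTL standing in for Redis;
           | the loader is whatever hits your source of truth):
           | 
           |     package cache
           | 
           |     import (
           |       "sync"
           |       "time"
           |     )
           | 
           |     type entry struct {
           |       value     []byte
           |       expiresAt time.Time
           |     }
           | 
           |     type Cache struct {
           |       mu   sync.Mutex
           |       data map[string]entry
           |       ttl  time.Duration
           |     }
           | 
           |     func New(ttl time.Duration) *Cache {
           |       return &Cache{data: make(map[string]entry), ttl: ttl}
           |     }
           | 
           |     // Get returns the cached value, falling back to load (e.g. a
           |     // SQL query) on a miss or after expiry, then repopulates.
           |     func (c *Cache) Get(key string, load func(string) ([]byte, error)) ([]byte, error) {
           |       c.mu.Lock()
           |       if e, ok := c.data[key]; ok && time.Now().Before(e.expiresAt) {
           |         c.mu.Unlock()
           |         return e.value, nil // cache hit: no database round trip
           |       }
           |       c.mu.Unlock()
           | 
           |       v, err := load(key) // hit the source of truth only on a miss
           |       if err != nil {
           |         return nil, err
           |       }
           | 
           |       c.mu.Lock()
           |       c.data[key] = entry{value: v, expiresAt: time.Now().Add(c.ttl)}
           |       c.mu.Unlock()
           |       return v, nil
           |     }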
           | 
           | > When and how to denormalize, why is it needed?
           | 
           | When you need to join several tables together in order to
           | retrieve a result set, and especially when you need to do
           | grouping to get the result set, and the retrieval & grouping
           | is presenting a performance problem, then pre-bake that data
           | on a regular basis, flattening it out into a table optimized
           | for read performance.
           | 
           | Again with the online store example, let's say you want to
           | show the 10 most popular products, with the average review
           | score for each product. As your store grows and you have
           | millions of reviews, you don't really want to calculate that
           | data every time the web page renders. You would build a
           | simpler table that just has the top 10 products, names, IDs,
           | average rating, etc. Rendering the page becomes much
           | simpler because you can just fetch that list from the table.
           | If the average review counts are slightly out of date by a
           | day or two, it doesn't really matter.
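           | 
           | The pre-baking itself can be a dumb timer loop; a Go sketch
           | with illustrative table/column names and whichever database
           | driver you actually use:
           | 
           |     package main
           | 
           |     import (
           |       "database/sql"
           |       "log"
           |       "time"
           | 
           |       _ "github.com/lib/pq" // driver choice is illustrative
           |     )
           | 
           |     func main() {
           |       db, err := sql.Open("postgres", "postgres://localhost/shop?sslmode=disable")
           |       if err != nil {
           |         log.Fatal(err)
           |       }
           | 
           |       // Rebuild the flattened summary table every 10 minutes
           |       // instead of aggregating millions of reviews per page view.
           |       for range time.Tick(10 * time.Minute) {
           |         tx, err := db.Begin()
           |         if err != nil {
           |           log.Println(err)
           |           continue
           |         }
           |         if _, err := tx.Exec(`DELETE FROM top_products`); err != nil {
           |           tx.Rollback()
           |           log.Println(err)
           |           continue
           |         }
           |         if _, err := tx.Exec(`
           |           INSERT INTO top_products (product_id, name, avg_rating)
           |           SELECT p.id, p.name, AVG(r.rating)
           |           FROM products p JOIN reviews r ON r.product_id = p.id
           |           GROUP BY p.id, p.name
           |           ORDER BY COUNT(r.id) DESC
           |           LIMIT 10`); err != nil {
           |           tx.Rollback()
           |           log.Println(err)
           |           continue
           |         }
           |         if err := tx.Commit(); err != nil {
           |           log.Println(err)
           |         }
           |       }
           |     }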
           | 
           | > Why append-only and how?
           | 
           | If you have a lot of users fighting over the same row, trying
           | to update it, you can run into blocking problems. Consider
           | just storing new versions of rows.
           | 
           | But now we're starting to get into the much more challenging
           | things that require big application code changes - that's why
           | the grandparent post listed 'em in this order. If you do the
           | first two things I cover above there, you can go a long,
           | long, long way.
        
           | toast0 wrote:
           | > Never 'sharded' before, no idea how that works.
           | 
           | Sharding sucks, but if your database can't fit on a single
           | machine anymore, you do what you've got to do. The basic idea
           | is that instead of everything in one database on one
           | machine (or, well, a redundant group of machines anyway),
           | you have some
           | method to decide for a given key what database machine will
           | have the data. Managing the split of data across different
           | machines is, of course, tricky in practice; especially if you
           | need to change the distribution in the future.
           | 
           | OTOH, Supermicro sells dual processor servers that go up to 8
           | TB of ram now; you can fit a lot of database in 8 TB of ram,
           | and if you don't keep the whole thing in ram, you can index a
           | ton of data with 8 TB of ram, which means sharding can wait.
           | In contrast, eBay had to shard because a Sun e10k, where they
           | ran Oracle, could only go to 64 GB of ram, and they had no
           | choice but to break up into multiple databases.
        
             | bigiain wrote:
             | > you have some method to decide for a given key what
             | database machine will have the data
             | 
              | Super simple example: splitting the phone book into two
              | volumes, A-K and L-Z. (Hmmmm, is a "phonebook" a thing
              | that typical HN readers remember?)
             | 
             | > you can fit a lot of database in 8 TB of ram, and if you
             | don't keep the whole thing in ram, you can index a ton of
             | data with 8 TB of ram, which means sharding can wait.
             | 
              | For almost everyone, sharding can wait until after the
              | business doesn't need it any more. FAANG need to shard.
             | Maybe a few thousand other companies need to shard. I
             | suspect way way more businesses start sharding when
             | realistically spending more on suitable hardware would
             | easily cover the next two orders of magnitude of growth.
             | 
             | One of these boxes maxed out will give you a few TB of ram,
             | 24 cpu cores, and 24x16TB NVMe drives which gives you
             | 380-ish TB of fairly fast database - for around $135k, and
             | you'd want two for redundancy. So maybe 12 months worth of
             | a senior engineer's time.
             | 
             | https://www.broadberry.com/performance-storage-
             | servers/cyber...
        
               | Zababa wrote:
               | > So maybe 12 months worth of a senior engineer's time.
               | 
                | In America. Where salaries are 2-3 times lower, people
                | spend more time to use less hardware.
        
               | toast0 wrote:
               | Sharding does take more time, but it doesn't save that
               | much in hardware costs. Maybe you can save money with two
               | 4TB ram servers vs one 8TB ram server, because the
               | highest density ram tends to cost more per byte, but you
               | also had to buy a whole second system. And that second
               | system has follow on costs, now you're using more power,
               | and twice the switch ports, etc.
               | 
               | There's also a price breakpoint for single socket vs dual
               | socket. Or four vs two, if you really want to spend
               | money. My feeling is currently, single socket Epyc looks
               | nice if you don't use a ton of ram, but dual socket is
                | still decently affordable if you need more cores or
                | more RAM, and probably for Intel servers; quad socket
                | adds a lot of expense and probably isn't worth it.
               | 
               | Of course, if time is cheap and hardware isn't, you can
               | spend more time on reducing data size, profiling to find
               | optimizations, etc.
        
               | Zababa wrote:
               | Fair points, I'm just trying to push back a bit against
               | "optimizing anything is useless since the main cost is
               | engineering and not hardware", since this situation
                | depends on the local salaries, and in low-income countries
               | the opposite can be true.
        
           | pedrosorio wrote:
           | As a sibling comment mentioned, read DDIA:
           | https://dataintensive.net/
        
           | latch wrote:
           | It's hard to answer this in general. Most out-of-the-box
           | scaling solutions have to be generic, so they lean on
           | distribution/clustering (e.g., more than one + coordination)
           | so they're expensive.
           | 
           | Consider something like an amazon product page. It's mostly
           | static. You can cache the "product", and calculate most of
           | the "dynamic" parts in the background periodically (e.g.,
           | recommendation, suggestions) and serve it up as static
           | content. For the truly dynamic/personalized parts (e.g.,
           | previous purchases) you can load this separately (either as
           | a separate call from the client or let the server piece all
           | the parts together for the client). This personalized stuff
           | is user specific, so [very naively]:
           | 
           |     conn = connections[hash(user_id) % number_of_db_servers]
           |     conn.row("select last_bought from user_purchases
           |               where user_id = $1 and product_id = $2",
           |              user_id, product_id)
           | 
           | Note that this is also a denormalization compared to:
           | 
           |     select max(o.purchase_date)
           |     from order o
           |     join order_items oi on o.id = oi.order_id
           |     where o.user_id = $1 and oi.product_id = $2
           | 
           | Anyways, I'd start with #7. I'd add RabbitMQ to your stack
           | and start using it as a job queue (e.g. sending forgot-
           | password emails). Then I'd expand it to track changes in
           | your data:
           | write to "v1.user.create" with the user object in the payload
           | (or just user id, both approaches are popular) when a user is
           | created. It should let you decouple some of the logic you
           | might have that's being executed sequentially on the http
           | request, making it easier to test, change and expand. Though
           | it does add a lot of operational complexity and stuff that
           | can go wrong, so I wouldn't do it unless you need it or want
           | to play with it. If nothing else, you'll get more comfortable
           | with at-least-once, idempotency and poison messages, which
           | are pretty important concepts. (to make the write to the DB
           | transactionally safe with the write to the queue, lookup
           | "transactional outbox pattern").
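           | 
           | For the job-queue part, the shape is roughly this (Go
           | sketch; a buffered channel stands in for RabbitMQ, and the
           | job ID doubles as the idempotency key since delivery is
           | at-least-once):
           | 
           |     package main
           | 
           |     import (
           |       "fmt"
           |       "sync"
           |     )
           | 
           |     type PasswordResetJob struct {
           |       JobID  string // idempotency key
           |       UserID string
           |     }
           | 
           |     var (
           |       seen   = make(map[string]bool) // jobs already processed
           |       seenMu sync.Mutex
           |     )
           | 
           |     // worker consumes jobs; because delivery is at-least-once it
           |     // must tolerate seeing the same JobID twice.
           |     func worker(jobs <-chan PasswordResetJob, done *sync.WaitGroup) {
           |       defer done.Done()
           |       for job := range jobs {
           |         seenMu.Lock()
           |         dup := seen[job.JobID]
           |         seen[job.JobID] = true
           |         seenMu.Unlock()
           |         if dup {
           |           continue // idempotent: duplicate delivery is a no-op
           |         }
           |         fmt.Println("sending reset email for user", job.UserID)
           |       }
           |     }
           | 
           |     func main() {
           |       jobs := make(chan PasswordResetJob, 16)
           |       var done sync.WaitGroup
           |       done.Add(1)
           |       go worker(jobs, &done)
           | 
           |       // The HTTP handler just enqueues and returns immediately.
           |       jobs <- PasswordResetJob{JobID: "job-1", UserID: "42"}
           |       jobs <- PasswordResetJob{JobID: "job-1", UserID: "42"} // duplicate
           |       close(jobs)
           |       done.Wait()
           |     }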
        
         | 29athrowaway wrote:
         | Try to convert as much content as you can into static content,
         | and serve it via CDN. Then, use your servers only for dynamic
         | stuff.
         | 
         | Also, put the browser to work for you, caching via Cache-
         | Control, ETag, etc. Only then, optimize your server...
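         | 
         | In practice that's just a couple of headers; e.g. a
         | hypothetical Go handler:
         | 
         |     package main
         | 
         |     import (
         |       "crypto/sha1"
         |       "fmt"
         |       "net/http"
         |     )
         | 
         |     // chapterPage would come from disk or a database in reality.
         |     var chapterPage = []byte("<html>...chapter listing...</html>")
         | 
         |     func handler(w http.ResponseWriter, r *http.Request) {
         |       etag := fmt.Sprintf(`"%x"`, sha1.Sum(chapterPage))
         | 
         |       // Let browsers/caches reuse the response for a minute and
         |       // revalidate cheaply with If-None-Match after that.
         |       w.Header().Set("Cache-Control", "public, max-age=60")
         |       w.Header().Set("ETag", etag)
         | 
         |       if r.Header.Get("If-None-Match") == etag {
         |         w.WriteHeader(http.StatusNotModified)
         |         return
         |       }
         |       w.Write(chapterPage)
         |     }
         | 
         |     func main() {
         |       http.HandleFunc("/", handler)
         |       http.ListenAndServe(":8080", nil)
         |     }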
        
         | Ginden wrote:
         | I would like to note that many of these techniques can incur
         | a significant cost in developer or sysadmin time.
        
       | jprupp wrote:
       | This is complexity for complexity's sake. Pay no attention to the
       | disclaimer at the start of the article. They threw every
       | buzzword-heavy bit of tech they could find at it, creating a
       | Frankenstein monster.
        
         | sofixa wrote:
         | Completely disagree. How would you do it in a simpler way,
         | while keeping the features like redundancy ( including
         | storage), logs, metrics, etc?
        
           | pahae wrote:
           | Looking at their diagrams it seems that the k8s cluster
           | exists solely to handle their monitoring and logging needs
           | which would be extreme overkill, especially since 18k
           | metrics/samples and 7k logs per second are nothing. Plus you
           | now suddenly need an S3-compatible storage backend for all
           | your logs and metrics. Good thing Ceph comes 'free' with
           | Proxmox, I guess.
           | 
           | Deploying an instance of Prometheus with every host is also
           | unusual, to say the least, and I don't quite understand
           | their comment on that. If you don't like a pull-based
           | architecture (which is a valid point), why use one at all!?
           | There are many more push-based setups out there that are
           | simpler to set up and less complex.
        
       | hsn915 wrote:
       | I don't understand. Why is 2k requests/sec supposed to be
       | massive?
       | 
       | Try this yourself: write a simple web server in Go and host it
       | on a cheap VPS provider, let's say the option that costs
       | $20/mo. Your website will be able to handle more than 1k
       | requests/s with hardly any resource usage.
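       | 
       | The "simple web server" here is literally something like the
       | following toy metadata endpoint (names made up), plus a load
       | generator such as wrk pointed at it:
       | 
       |     package main
       | 
       |     import (
       |       "encoding/json"
       |       "net/http"
       |     )
       | 
       |     type Manga struct {
       |       ID    string `json:"id"`
       |       Title string `json:"title"`
       |     }
       | 
       |     // A mostly-static JSON index of the kind a cheap VPS can serve
       |     // at well over 1k req/s.
       |     var index = []Manga{{ID: "1", Title: "Example"}}
       | 
       |     func main() {
       |       http.HandleFunc("/manga", func(w http.ResponseWriter, r *http.Request) {
       |         w.Header().Set("Content-Type", "application/json")
       |         json.NewEncoder(w).Encode(index)
       |       })
       |       http.ListenAndServe(":8080", nil)
       |     }
       | 
       | Then measure with something like "wrk -t4 -c100 -d30s
       | http://<vps>:8080/manga" and watch the resource usage.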
       | 
       | ok, let's assume you're doing some complicated things.
       | 
       | So what? You can scale vertically, upgrade to the $120/mo server.
       | Your website now should be able to comfortably handle 5k req/s
       | 
       | Looking at the website itself, mangadex.org, it doesn't even host
       | the manga itself. The whole website is just an index that links
       | to manga on external websites. All you are doing is storing
       | metadata and displaying it as a webpage. The easiest problem on
       | the web.
       | 
       | So, I really don't understand the premise behind the whole post.
       | 
       | The problem statement is:
       | 
       | > In practice, we currently see peaks of above 2000 requests
       | every single second during prime time.
       | 
       | This is great in terms of success as a website, but it's
       | underwhelming in terms of describing a technical problem.
        
         | true_religion wrote:
         | They also host the manga. It's not just a link farm. Because
         | they host... that's why they use Ceph.
         | 
         | Their goal is for scanlators to have a place to post their new
         | translated manga, rather than always linking it off from some
         | Wordpress instance.
        
         | AviKav wrote:
         | The only releases that link to external websites are the ones
         | from sites such as MangaPlus and BiliBili (And delayed releases
         | if you count those)
        
         | tristan9 wrote:
         | > This is great in terms of success as a website, but it's
         | underwhelming in terms of describing a technical problem.
         | 
         | A bit of an intro punchline, even though I agree it admittedly
         | doesn't say much by itself :)
         | 
         | FWIW, most of the work comes from the fact that there's little
         | "static" traffic going on -- images and cacheable responses
         | are not very CPU intensive to serve -- but what isn't static
         | (which is a good chunk of it) is more problematic. More to
         | come on that.
        
         | Tenoke wrote:
         | >Looking at the website itself, mangadex.org, it doesn't even
         | host the manga itself.
         | 
         | They do seem to. Clicking on a random manga on there the images
         | are hosted on their server[0]. Also I guess some of those are
         | much bigger images which is less trivial to serve at that rate
         | than a 10kb static page.
         | 
         | 0.
         | blob:https://mangadex.org/e78bd61a-e761-4a73-a27c-5f58394e7ea4
        
           | akx wrote:
           | Blob links are scoped to your browser tab, they're not real
           | internet URLs.
        
         | slightwinder wrote:
         | You ignore the weight of requests and the general situation of
         | this project. This is not your average mommy-blog that doesn't
         | care much about downtime. This is a website with illegal
         | content, under constant attack, with some pretty dynamic
         | content on top, and likely the main goal of satisfying their
         | community. So most of their budget will go to security and
         | redundancy, to protect themselves and allow high uptime.
         | 
         | Where you can use 1 server, they will need to have something
         | around 20 servers. Where you can use a cheap VPS provider,
         | they must use an expensive shady provider who will take the
         | heat of legal attacks. And so on and on... because of their
         | situation they have a bunch more budget-eating requirements
         | than your average website, leading to a rather heavy, complex
         | and thus expensive architecture.
         | 
         | Surely there is still room for optimization, but it seems this
         | is a rather new redesign from scratch(?), so the details will
         | need time.
        
         | ctvo wrote:
         | > Try this yourself: write a simple web server in Go, host it
         | on a cheap VPS provider, let's say at the option that costs
         | $20/mo. Your website will be able to handle more than 1k/s
         | requests with hardly any resource usage.
         | 
         | These people have never heard of Go, obviously. The likely
         | scenario is not that you haven't fully understood their
         | constraints or requirements, it's that you're just smarter than
         | they are.
         | 
         | > So what? You can scale vertically, upgrade to the $120/mo
         | server. Your website now should be able to comfortably handle
         | 5k req/s
         | 
         | > Looking at the website itself, mangadex.org, it doesn't even
         | host the manga itself. The whole website is just an index that
         | links to manga on external websites. All you are doing is
         | storing metadata and displaying it as a webpage. The easiest
         | problem on the web.
         | 
         | Take that order of magnitude cheaper, single VPS server
         | solution you're proposing and build something with it. Sounds
         | like you'd make a lot of money. There has to be a business idea
         | around "storing metadata and displaying it as a webpage"
         | somewhere? Easiest problem on the web.
         | 
         | The peanut gallery at HN is out of control. People who don't do
         | / build explaining to the people who do how easy, simple,
         | better their solutions would be.
        
           | krageon wrote:
           | > People who don't do / build
           | 
           | I can and do frequently advise on certain topics in comments
           | specifically because I do build and can in fact speak of such
           | topics authoritatively. Isn't that what this website is for?
           | 
           | That said, the post you are replying to is perhaps overly
           | dismissive of the criteria that this website operates under.
           | Other comment chains have some really good advice though.
        
           | manigandham wrote:
           | There are plenty of people who build here on HN (more than
           | most other sites) and the requirements are pretty clearly
           | described in the article.
           | 
           | While it's not as simple as a Go program on a VPS, there is
           | certainly a lot of unnecessary overhead here. I think you
           | underestimate just how much poor and wasteful engineering
           | there is out there.
        
             | ctvo wrote:
             | > While it's not as simple as a Go program on a VPS, there
             | is certainly a lot of unnecessary overhead here. I think
             | you underestimate just how much poor and wasteful
             | engineering there is out there.
             | 
             | I don't under estimate poor and wasteful engineering at
             | all, but that's not what I saw in the article.
             | 
             | Serving traffic is a single element of their design. They
             | also designed for security, redundancy, and observability.
             | All with their own solutions because using a service or a
             | cloud provider would be too costly. With that in mind, it's
             | not a charitable view to think they didn't explore low
             | hanging fruits like "make the server in Go". If you think
             | you can do better, detail in depth how and solve all of
             | their requirements vs. the single piece you're familiar
             | with.
             | 
             | And if you can do the above holistically, for an order of
             | magnitude below their costs, it sounds like I need to get
             | in touch to throw money at you.
        
               | hsn915 wrote:
               | If you have a lot of extra money to throw I'd be happy to
               | oblige.
        
               | manigandham wrote:
               | My background is in adtech, which is a unique mix of
               | massive scale, dynamic logic, strict latency
               | requirements, and geographical distribution. I've built
               | complete ad platforms by myself for 3 different companies
               | now so I can confidently say that this is not a difficult
                | scenario. It's a read-heavy content site with very
               | little interactivity or complexity to each page and can
               | be made much simpler, faster and cheaper.
               | 
               | > " _detail in depth how_ "
               | 
               | This thing seems to be little more than a very complex
               | API and SPA sitting on top of Elasticsearch. These
               | frontend/backend sites are almost always a poor choice
               | compared to a simple server-side framework that just
               | generates pages. ES itself is probably unnecessary
               | depending on the requirements of their search (it doesn't
               | seem to be actual full text indexing of the content but
                | just the metadata). The security and observability
                | burden also tends to be a problem of their own making
                | and a symptom of too much complexity.
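                | 
                | For contrast, "a simple server-side framework that just
                | generates pages" can be as small as this kind of Go
                | sketch (html/template, with hard-coded data for
                | illustration):
                | 
                |     package main
                | 
                |     import (
                |       "html/template"
                |       "net/http"
                |     )
                | 
                |     // One template render per request, no SPA bundle and no
                |     // cascade of XHRs just to show a title list.
                |     var page = template.Must(template.New("list").Parse(
                |       `<ul>{{range .}}<li>{{.}}</li>{{end}}</ul>`))
                | 
                |     func main() {
                |       titles := []string{"Title A", "Title B"}
                |       http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
                |         page.Execute(w, titles)
                |       })
                |       http.ListenAndServe(":8080", nil)
                |     }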
        
               | ctvo wrote:
               | > _My background is in adtech, which is a unique mix of
               | massive scale, dynamic logic, strict latency
                | requirements, and geographical distribution. I've built
               | complete ad platforms by myself for 3 different companies
               | now so I can confidently say that this is not a difficult
               | scenario. It's a ready-heavy content site with very
               | little interactivity or complexity to each page and can
               | be made much simpler, faster and cheaper._
               | 
               | I don't dispute this or your credentials. You've built
               | critical systems in a space where it was a _core_ of the
               | business. If given time, and resources, I have no doubt
               | you could build a custom solution to their problem that
               | was more efficient.
               | 
               | Unstated in this is the type of business MangaDex is,
               | which I have the following assumptions about. I don't
               | think it's unfair to assume that we're mostly on the same
               | page here:
               | 
               | - Small to mid size, at most
               | 
               | - Small engineering team. Need to develop, deploy,
               | support, and maintain solutions.
               | 
               | - Lacks deep systems expertise, or is unable to attract
               | talent that has that expertise ($)
               | 
               | These characteristics are very common in our space. To
               | solve their technical problems, most of the time, they
               | reach for an open source solution (after examining the
               | alternatives like a service).
               | 
               | Now the question is given those constraints, and their
               | other business requirements, how do they best optimize
               | for dimensions they care about? Everything is a trade-
               | off. Everyone who builds knows this. It's unkind to
               | pretend this is a purely technical exercise. And after
               | reading their article, it's obvious they know _some_ of
               | trade-offs they 're making, so it's unkind to suggest a
               | naive solution that does nothing but make you feel
               | smarter. I'm not saying you did the above, but some of
               | these comments are outrageous.
        
           | hsn915 wrote:
           | > The likely scenario is not that you haven't fully
           | understood their constraints or requirements, it's that
           | you're just smarter than they are.
           | 
           | I never claimed to be smarter. I just understand some things
           | that I noticed a lot of people in the industry don't
           | understand.
           | 
           | My understanding is not even that great.
           | 
           | But still, this is just one example that I keep running into
           | over and over and over:
           | 
           | People opting for a complicated infrastructure setup because
           | that's what they think you should do.
           | 
           | No one showed them how to make a stable, reliable website
           | that just runs on a single machine and handles thousands of
           | concurrent connections.
           | 
           | It's not hard. It's just that they've never seen it and
           | assume it's basically impossible.
           | 
           | There are areas about computing that I feel the same way
           | about. For example, before Casey Muratori demoed his refterm
           | implementation, I had no idea that it was possible to render
           | a terminal at thousands of frames per second. I just assumed
           | such a feat was technically impossible. Partly because no one
           | has done it. But then he did it, and I was blown away.
           | 
           | > Take that order of magnitude cheaper, single VPS server
           | solution you're proposing and build something with it. Sounds
           | like you'd make a lot of money.
           | 
           | Building something and making money out of it are not the
           | same thing. But thanks for the advice. I'm in the process of
           | trying. I know for sure I can build the thing, but I don't
           | know if it will make any money. We will see.
           | 
           | > People who don't do / build explaining to the people who do
           | how easy, simple, better their solutions would be.
           | 
           | I do and have done.
           | 
           | This kind of advice is exactly the kind of thing I know how
           | to do because I have done it in the past using my trivial
           | setup of a single process running on a cheap VPS. And I have
           | also seen other teams struggle to get some feature nearly
           | half-working on a complicated infrastructure setup with AWS
           | and all the other buzzwords: Kibana, Elasticsearch,
           | DynamoDB, Ansible, Terraform, Kubernetes... what else? I
           | can't even keep track of all this stuff that everyone keeps
           | talking about even though hardly anyone needs it at all.
           | 
           | I've seen 4 or 5 companies try to build their service using
           | this kind of setup, with the proposed advantage of
           | "horizontal" and "auto" scaling. And you know what? They ALL
           | struggled with poor performance, _ALL_ _THE_ _TIME_. It's
           | really sad.
        
         | colesantiago wrote:
         | I agree, just had to read the article again, and took it as a
         | fancy way of wasting money really.
        
       | [deleted]
        
       | robertwt7 wrote:
       | I have nothing but respect for the whole team. Dedicating their
       | time to build everything from scratch, not to mention that they
       | maintain everything for free... It's a cool project; not sure
       | if there's a way for anyone to contribute.
       | 
       | I'll join the Discord after work to see if they need an extra
       | hand.
       | 
       | Gee, how do these people find other people online to work on all
       | of the cool projects. I would love to join rather than playing
       | games after WFH on the same pc over and over again lol
        
         | codewithcheese wrote:
         | Find cool project. Contribute. :)
        
           | robertwt7 wrote:
           | I do, on some open source projects on GitHub. Sorry, what I
           | meant is not just some open source projects but working
           | products like this, driven by volunteers / teams like theirs.
        
         | IncRnd wrote:
         | Okay, but isn't most of their content stolen? Why would you
         | want to contribute to that?
        
           | hwers wrote:
            | Just curious if anyone reading this knows the answer: would
            | it be illegal to contribute man-hours to, e.g., implementing
            | features or fixing bugs on a project like this, or does that
            | only apply to whoever actually hosts the content?
        
             | Hamuko wrote:
             | MPA tried to get the source code for Nyaa.si removed from
             | GitHub because the "Repository hosts and offers for
             | download the Project, which, when downloaded, provides the
             | downloader everything necessary to launch and host a
             | "clone" infringing website identical to Nyaa.si (and, thus,
             | engage in massive infringement of copyrighted motion
             | pictures and television shows)".
             | 
              | It was a completely retarded play on MPA's part, and they
              | only managed to get the repo down for days until GitHub
              | restored it, even without hearing from the repo owners. So
              | really they only brought about a minor nuisance, alongside
              | a bunch of headlines advertising Nyaa.si to the rest of
              | the world.
             | 
             | https://torrentfreak.com/mpa-takes-down-nyaa-github-
             | reposito...
             | 
             | https://torrentfreak.com/github-restores-nyaa-repository-
             | as-...
        
             | kmeisthax wrote:
             | A good lawyer would probably say something like "it
             | depends".
             | 
              | It's entirely possible for a copyright owner to construe
              | some kind of secondary liability based on your conduct,
              | even if the underlying software is legal. This is how they
              | ultimately got Grokster, for example - even if the
              | software was legal, advertising its use for copyright
              | infringement makes you liable for the infringement. I
              | could also see someone alleging contributory liability
              | for, say, implementing features of the software that have
              | no non-infringing uses. Even if that ultimately turned out
              | not to be illegal, it would come at the end of a long,
              | expensive, and fruitless legal defense that would drain
              | your finances.
             | 
             | In other words, "chilling effects dominate".
        
           | KingOfCoders wrote:
            | Yes, of course it is stolen. And the people claiming
            | otherwise are the same people who come here and ask "What
            | can I do, some Chinese company ripped off my website?!?!?!"
        
             | cyborgx7 wrote:
              | > And the people claiming otherwise are the same people
              | who come here and ask "What can I do, some Chinese company
              | ripped off my website?!?!?!"
             | 
             | Something you made up in your head with literally not a
             | single shred of evidence.
        
           | dinobones wrote:
            | Maybe because they enjoy the interesting domain and the
            | challenges of the area; look at a project like Dolphin, for
            | example.
            | 
            | Also, some people hold the view that things like
            | information, media, and code cannot be "stolen" in the
            | traditional sense, which further reduces any qualms about
            | associating themselves with it.
        
           | cyborgx7 wrote:
            | No, intellectual "property" cannot be stolen. You are
            | thinking of copyright infringement.
        
             | [deleted]
        
           | neonbones wrote:
            | The world of scanlations is always on edge. Usually, when
            | publishers announce an official translation of a manga
            | title, fans drop their translations of that title. It's not
            | rare for publishers to hire the fans who had been
            | translating a title for free as the official team.
            | 
            | To be more precise, the real reason such sites stay alive is
            | that they delete titles that get licensed in Europe and the
            | USA. Publishers, in turn, can measure the popularity of
            | titles and buy the legal rights to publish the ones that are
            | popular enough. It's also harder to find manga "raws" than
            | translated versions.
            | 
            | Because of that, they're not 100% "illegal" for the western
            | world, and Asian companies are not so interested in fighting
            | scanlations because they need to combat piracy in their own
            | part of the world.
        
             | lifthrasiir wrote:
              | Heck no. As per the Berne Convention they are 100% illegal
              | even in the western world, and they survive only due to
              | neglect or a lack of legal resources - I have seen
              | multiple cases where artists were well aware of
              | scanlations but couldn't fight them for exactly that
              | reason. A legal way to do scanlation would always be
              | welcome (and there have been varying degrees of success in
              | other areas), but it is just wrong to claim that they are
              | somehow legitimate at all.
        
           | Hamuko wrote:
            | Depends on what you consider "stolen". In most cases, the
            | manga that is available has been translated and edited by
            | fans to make it accessible to English speakers when the IP
            | owners do not see a reason to do it themselves. The number
            | of manga that actually get official English releases is
            | tiny, and western licensing companies have little incentive
            | to start picking up obscure manga that no one without the
            | ability to read Japanese has heard of. They're much better
            | off going after manga that have already been made popular by
            | fan translations, or that have some other property that has
            | caught traction (for example, manga with an anime adaptation
            | that has official or unofficial subtitles).
        
             | IncRnd wrote:
             | It's what most of the world considers stolen.
             | 
             |  _Scanlations are often viewed by fans as the only way to
             | read comics that have not been licensed for release in
             | their area. However, according to international copyright
             | law, such as the Berne Convention, scanlations are
             | illegal._ [1]
             | 
             | This is a snippet about the Berne Convention:
             | 
             |  _The Berne Convention for the Protection of Literary and
             | Artistic Works, usually known as the Berne Convention, is
             | an international agreement governing copyright, which was
             | first accepted in Berne, Switzerland, in 1886. The Berne
             | Convention has 179 contracting parties, most of which are
             | parties to the Paris Act of 1971.
             | 
             | The Berne Convention formally mandated several aspects of
             | modern copyright law; it introduced the concept that a
             | copyright exists the moment a work is "fixed", rather than
             | requiring registration. It also enforces a requirement that
             | countries recognize copyrights held by the citizens of all
              | other parties to the convention._ [2]
              | 
              | [1] https://en.wikipedia.org/wiki/Scanlation#Legal_action
              | 
              | [2] https://en.wikipedia.org/wiki/Berne_Convention
        
       | angarg12 wrote:
       | > The only missing bit would be the ability to replicate
       | production traffic, as some bugs only happen under very high
       | traffic by a large number of concurrent users. This is however at
       | best difficult or nearly impossible to do.
       | 
        | Not sure if I'm missing something here. Surely you could sample
        | some prod traffic and then replay it with one of the many load
        | test tools out there. You might lose the geographical
        | distribution, but load testing a web server at 2k TPS sounds
        | fairly trivial.
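        | 
        | For illustration, a minimal sketch of that approach, assuming
        | combined-format access logs and picking vegeta as the replay
        | tool (the staging hostname and file names are placeholders):
        | 
        |     # pull request paths out of an access log ($7 is the path
        |     # in the combined log format) and sample a subset of them
        |     awk '{print "GET https://staging.example.com" $7}' access.log > targets.txt
        |     shuf -n 100000 targets.txt > sampled.txt
        |     # replay at roughly production rate and summarize latencies
        |     vegeta attack -targets=sampled.txt -rate=2000 -duration=60s > results.bin
        |     vegeta report results.bin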
        
       | holoduke wrote:
        | I am running an app with 10,000 incoming req/s on average. It's
        | running on eight 8-core Hetzner VMs. Most requests are static
        | data calls like images, JSON and text. About 5% are MySQL and
        | other IO operations. I pay about 300 euros a month for this
        | setup. Quite happy with it.
        
       | TekMol wrote:
        | My cheap $20/month VPS serves tens of thousands of users per day
        | without breaking much of a sweat, using a good old LAMP stack
        | (Linux, Apache, MariaDB, PHP).
       | 
       | I don't know how many requests per second it can handle.
       | 
       | Trying a guess via curl:
       | 
        |     time curl --insecure --header 'Host: www.mysite.com' \
        |         https://127.0.0.1 > test
       | 
       | This gives me 0.03s
       | 
       | So it could handle about 30 requests per second? Or 30x the
       | number of CPUs? What do you guys think?
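        | 
        | As a rough sketch, curl's -w timing variables give a cleaner
        | single-request number than wrapping the call in time (same host
        | header and options as above):
        | 
        |     curl --insecure -s -o /dev/null \
        |         --header 'Host: www.mysite.com' \
        |         -w 'total: %{time_total}s  ttfb: %{time_starttransfer}s\n' \
        |         https://127.0.0.1
        | 
        | Even then, a single-request time only bounds the throughput of
        | one worker; measuring real capacity needs concurrent load.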
        
         | dharmab wrote:
          | Does it serve 20-40 high-resolution images and uploads per
          | user?
        
           | TekMol wrote:
           | I wanted to start a discussion about how to estimate the
           | number of requests a given server can handle per second. So
           | when I read "x requests/s" I can put that into perspective.
           | 
           | But it seems you think I wanted to start a dick measuring
           | contest?
           | 
           | If your question is genuine: I would serve images via a CDN.
           | The above timing is for assembling a page by doing an auth
           | check, a bunch of database queries and templating the result.
        
           | lostmsu wrote:
            | I am not sure how to interpret this paragraph:
           | 
           | > In practice, we currently see peaks of above 2000 requests
           | every single second during prime time. That is multiple
           | billions of requests per month, or more than 10 million
           | unique monthly visitors. And all of this before actually
           | serving images.
           | 
            | If I am reading that correctly, 2000 req/s does not include
            | images, which makes it unclear whether the $1500/month does.
        
             | cinntaile wrote:
              | I'm pretty sure that includes images; that's why people
             | visit the site. Prime time happens when a very popular
             | manga gets released at around the same time every week.
        
           | Hamuko wrote:
           | Hosting static files isn't really that hard. I used to host a
           | website that at its best served around 1000 GB of video
           | content in 24 hours. Of course, it wasn't the fastest without
           | a CDN but it was just 25 EUR/month.
        
         | blntechie wrote:
          | I guess you basically run a load test against a randomized or
          | usage-weighted list of API endpoints with an increasing number
          | of synthetic users and see when things start breaking. Many
          | free tools can run these tests from even your laptop, as in
          | the sketch below.
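          | 
          | A minimal sketch of that ramp-up, assuming ab and a single
          | representative endpoint (the URL is a placeholder):
          | 
          |     # ramp concurrency against a representative endpoint and
          |     # watch for failed requests or a latency spike
          |     for c in 10 50 100 500 1000; do
          |         ab -n 10000 -c "$c" 'https://staging.example.com/popular-page'
          |     done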
        
         | [deleted]
        
         | rhines wrote:
          | You need to do load testing to determine this. A request's
          | time includes many delays that are not related to the work the
          | server does, so it's not as simple as 1/0.03; it's possible
          | that only 0.0001 seconds of that time is actually server time,
          | or 0.025. You also have to consider whether multiple cores are
          | working, or non-linear algorithms are running, or who knows
          | what else.
         | 
         | Best way to figure it out is to use an application like Apache
         | Bench from a powerful computer with a good internet connection,
         | throw a lot of concurrent connections at the site, and see what
         | happens.
        
           | TekMol wrote:
            | I think it makes sense to test from the server itself,
            | because otherwise I would be testing the network
            | infrastructure. While that is interesting too, I am trying
            | to figure out what the server (VM) can handle first.
           | 
           | I just tried Apache Bench:
           | 
            | ab -n 1000 -c 100 'https://www.mysite.com'
            | 
            |     Concurrency Level:      100
            |     Time taken for tests:   1.447 seconds
            |     Complete requests:      1000
            |     Failed requests:        0
            |     Requests per second:    691.19 [#/sec] (mean)
            |     Time per request:       144.679 [ms] (mean)
            |     Time per request:       1.447 [ms] (mean, across all concurrent requests)
           | 
           | Wow, that is fast. Around 700 requests per second!
           | 
            | Upping it 10x to 10k requests ...
            | 
            |     Requests per second:    844.99 [#/sec] (mean)
           | 
           | Even faster!
        
         | lamnk wrote:
          | A day is 16 * 60 * 60 = 57,600 seconds (night time
          | subtracted). So tens of thousands of users per day is on the
          | order of 1-2 req/s, maybe 50 at peak time.
          | 
          | What matters more is what kind of requests your server has to
          | serve. Nginx can easily serve 50-80k req/s of static content,
          | and into the 100k range if tuned properly.
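          | 
          | As a back-of-the-envelope check (the visitor and per-visitor
          | request counts are made-up round numbers):
          | 
          |     # 30,000 visitors/day, 5 dynamic requests each, spread
          |     # over a 16-hour day
          |     echo $(( 30000 * 5 / (16 * 60 * 60) ))   # prints 2, i.e. ~2 req/s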
        
       ___________________________________________________________________
       (page generated 2021-09-07 23:01 UTC)