[HN Gopher] Notes from the Architect (2016)
       ___________________________________________________________________
        
       Notes from the Architect (2016)
        
       Author : genericlemon24
       Score  : 82 points
       Date   : 2021-08-02 11:06 UTC (2 days ago)
        
 (HTM) web link (varnish-cache.org)
 (TXT) w3m dump (varnish-cache.org)
        
       | skywhopper wrote:
       | I find it mildly amusing that the author spends the first half of
       | the article mocking squid for handling disk cache manually when
       | varnish just hands the work to the OS instead, and then the
       | second half of the article explaining how and why varnish does
       | all its own memory management for processing requests instead of
       | just letting the OS handle it.
        
         | jerf wrote:
         | No, the first half explains why varnish lets the OS manage
         | virtual memory, then explains why varnish doesn't rely on the
         | user space memory allocator that comes with the language. Two
         | completely different layers and different things.
        
           | aporetics wrote:
           | This
        
       | throw0101a wrote:
       | (2006)
       | 
       | Previously:
       | 
       | * https://news.ycombinator.com/item?id=27086239
        
         | phkamp wrote:
          | This post does seem to pop up here every other year, doesn't it?
        
       | nextaccountic wrote:
       | > Well, today computers really only have one kind of storage, and
       | it is usually some sort of disk, the operating system and the
       | virtual memory management hardware has converted the RAM to a
       | cache for the disk storage.
       | 
       | > So what happens with squids elaborate memory management is that
       | it gets into fights with the kernels elaborate memory management,
       | and like any civil war, that never gets anything done.
       | 
       | ctrl+f mmap on this article doesn't return anything so.. is it
       | saying it's better to mmap files and have the files store
       | in-memory data structures, like Cap'n Proto does?
       | 
       | (if it's entirely in-memory with no file on disk, then it's
       | surely comparing apples to oranges? if a program needs the
       | persistent memory, it needs the disk)
       | 
       | Also: is mmaping a file and using it directly from memory what
       | databases should do today to avoid fighting the OS disk cache?
        
         | atonse wrote:
         | I thought the fact that mongodb data files were essentially
         | mmapped was used as a critique against its durability as a
         | primary database. Is that not true?
         | 
         | Or was that not a relevant critique?
        
           | nextaccountic wrote:
           | Welp, I don't know!
           | 
           | From the link of the other comment here, Sqlite disables mmap
           | by default because Linux provides a poor API regarding I/O
           | errors (they are sent as a signal instead)
        
           | jd_mongodb wrote:
           | MongoDB released the WiredTiger storage engine in 2015 which
           | replaced the MMAP storage engine. So whatever you read about
           | MMAP is obsolete.
           | 
           | https://www.mongodb.com/presentations/a-technical-
           | introducti...
        
             | atonse wrote:
             | Yep I know that. I'm saying mmap used to be a critique.
             | 
             | I left the place I introduced mongo to, a couple months
             | before they bought WiredTiger
        
           | jfindley wrote:
           | A database like mongodb is a very different animal from an
           | HTTP cache. The mongodb authors likely read this article
           | (it's from 2006) when they were designing the first version
           | of their database, and due to their almost complete
           | unfamiliarity with what they were doing didn't understand
           | that these two things are not alike.
           | 
           | Databases have different classes of objects (think index vs
           | data, though there are more than just these two), different
           | types of access (e.g. full table scan vs single record
           | select) and a whole lot more complexity than I can cover
           | here. You don't really ever want to page out your indices,
           | but paging out data is probably fine. You likely want to
           | avoid replacing your entire cache for a full table scan of a
           | big table. You probably want to have some concept of ensuring
           | a single user doesn't starve out all your other 1000+ users.
           | Etc etc.
           | 
           | In other words, you can't get away with just opting out of
           | memory management for a database - you need to treat
           | different things differently. I understand that a few years
           | in, the mongo devs eventually caught up to 1990s-era MySQL
           | and realised this, leading to a new DB engine that was a bit
           | less bad.
        
           | aidenn0 wrote:
           | It's a valid critique, but a cache has no durability
           | requirements.
        
         | jasonwatkinspdx wrote:
         | Mmaping directly readable data is certainly a way of avoiding
         | the problems PHK is talking about, and can be reasonably
         | efficient. Sendfile can be even better if you can structure
         | things to have a metadata index that's in memory or mmaped, and
         | never need to read the actual sent data.
         | 
         | But mmap is no panacea. Its big flaw is that the kernel
         | scheduler often doesn't know as much as your app about what
         | you're doing. So when you touch an out of core page and it page
         | faults, your thread just gets locked up. Worse, the kernel's
         | read ahead logic may make the wrong guess about how many pages
         | you're going to read, or when you want data written back out.
         | You can try to mitigate this using madvise and msync, but in
         | practice it's kinda brittle.
         | 
         | Some systems use blocking io with a dedicated thread pool to
         | wait on the block. This way the main threads can move on to
         | other useful work rather than sitting stalled on a page fault.
         | Golang's runtime does something like this for you
         | automatically.
         | 
         | Databases for the most part do their own scheduling, and may
         | even use async io. This is because they know far better what
         | they're actually doing than the kernel can guess. Another
         | reason is the database's buffer cache is usually tightly
         | coupled with the consistency control algorithm, so again, you
         | don't want the kernel guessing, you want total control.
         | Databases often use direct io and take on the burden of
         | scheduling everything themselves precisely for this reason.
         | 
         | LMDB uses mmap, but should not really be considered a database.
         | It's just an embedded CoW btree, though a fine one.
         | 
         | Andy Pavlo's openly published course materials are a great
         | resource for learning about database internals. He used to be
         | emphatically against mmap for databases, but in the last couple
         | years has moderated that position somewhat.
         | 
         | io_uring is looking like it will be the best overall solution
         | moving forward, at least for linux only.
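
The "dedicated thread pool for blocking io" pattern described above can be sketched roughly as follows. This is only an illustration in Python, not how any particular database or the Go runtime implements it; `read_block` and `read_blocks` are hypothetical names, and a real system would add cancellation, error handling, and probably positional reads.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_block(path, offset, length):
    """Blocking read of `length` bytes at `offset`; runs on a pool thread."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def read_blocks(path, requests, workers=4):
    """Issue all (offset, length) requests to a small thread pool.

    The calling thread only blocks when it collects results, so it could
    do other work between submitting the reads and gathering them.
    Results come back in the same order as `requests`.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(read_block, path, off, n) for off, n in requests]
        return [fut.result() for fut in futures]
```

The design point is that a stall (a slow disk read, or a page fault if the workers touched mapped memory instead) is confined to a worker thread, so the threads doing request processing keep running.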
        
           | nextaccountic wrote:
           | Thanks for the explanation.
           | 
           | > io_uring is looking like it will be the best overall
           | solution moving forward, at least for linux only.
           | 
           | But io_uring means duplicating memory, right? There will be
           | both my in-memory representation and the kernel cache for the
           | file and they will be fighting each other (or at least,
           | consuming more memory than necessary).
        
         | genericlemon24 wrote:
         | Yes, I understood that to mean mmap().
         | 
         | > is mmaping a file and using it directly from memory what
         | databases should do today
         | 
         | Some do; for example, SQLite: https://www.sqlite.org/mmap.html
         | (the page has pros and cons, and explains how SQLite does
         | memory mapped I/O).
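
For readers who want to try the SQLite option mentioned above, memory-mapped I/O is requested per connection with `PRAGMA mmap_size`, which the linked page documents. The sketch below uses Python's standard `sqlite3` module; `open_with_mmap` is a hypothetical helper name, and the value SQLite actually grants depends on how the library was compiled (it may be capped, or 0 if mmap support is disabled).

```python
import os
import sqlite3
import tempfile


def open_with_mmap(path, mmap_bytes):
    """Open a SQLite database and ask for up to `mmap_bytes` of mmap I/O.

    Returns the connection and the limit SQLite reports back, which may be
    lower than requested (or 0) depending on the build.
    """
    conn = sqlite3.connect(path)
    row = conn.execute(f"PRAGMA mmap_size={int(mmap_bytes)}").fetchone()
    granted = row[0] if row else 0
    return conn, granted
```

Queries are written exactly as before; the pragma only changes how SQLite reads pages from the database file.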
        
       | haasted wrote:
       | Would be nice if the headline included the year this was written.
       | (2005? 2006?)
        
       | Ecco wrote:
       | Does anyone else think the MP3 player joke is kind of weird?
        
         | jfindley wrote:
         | Yes - this was written in 2006 and that joke really hasn't aged
         | well at all.
         | 
         | Dang, maybe worth sticking a (2006) on the end of this?
        
         | capableweb wrote:
         | Eh. The article itself is good enough for me to ignore one
         | silly penile joke that almost doesn't even make sense.
        
       | tyingq wrote:
       | He also has a post on why Varnish doesn't support SSL/TLS:
       | 
       | https://varnish-cache.org/docs/4.0/phk/ssl.html
       | 
       | Which is somewhat interesting now, as Varnish has lost a lot of
       | ground to other caching proxies that did choose to implement
       | SSL/TLS.
        
         | myWindoonn wrote:
         | How are you measuring this lost ground? Varnish is still
         | extremely popular, and designs which separate TLS from caching
         | are also still popular.
        
           | tyingq wrote:
           | Mostly anecdotal, since there's no real definitive way to
           | tell.
           | 
           | There's two spaces I see.
           | 
           | - Caching within the webserver, as the caching there has
           | improved over time. The caches in Nginx, Apache, LiteSpeed,
           | etc, perform much better than they did in the past.
           | 
           | - Caching at a load balancer tier. Nginx again (though in a
           | LB context), Haproxy, etc.
           | 
           | Both spaces seem to have less talk about Varnish than they
           | used to, and more about other platforms.
           | 
           | You can see a plateau, then decline (though slight) that
           | starts for Varnish around mid-2018, here:
           | 
           | https://trends.builtwith.com/Web-Server/Varnish
           | 
           | Compare to:
           | 
           | LiteSpeed: https://trends.builtwith.com/Web-Server/LiteSpeed
           | 
           | Nginx: https://trends.builtwith.com/Web-Server/nginx
           | 
           | (Though these kinds of surveys are tricky, as they depend on
           | outward facing headers that don't always exist, or don't
           | always tell you enough...like Nginx doesn't always imply the
           | cache is in use)
           | 
           | I also wonder what percentage hit Varnish might take if
           | Fastly moves away. I'm sure they regret not Varnish
           | specifically, but exposing Varnish VCL directly to end
           | users.
           | 
           | Edit: Yes, I agree that sites like "builtwith" are flawed,
           | and mentioned some of the reasons above. And, I didn't mean
           | for this comment to sound like a criticism...just an
           | observation. I noticed builtwith's chart has a similar
           | plateau + slight decline for Apache, starting also near
           | mid-2018.
        
             | acdha wrote:
             | > Though these kinds of surveys are tricky, as they depend
             | on outward facing headers that don't always exist
             | 
             | It's not just "don't always exist": those headers are
             | actively recommended against by various security guidelines
             | so many large sites heavily use things you can only infer
             | from other characteristics.
             | 
             | This is also the kind of environment where I see some
             | movement against Varnish: internal TLS requirements
             | increase the cost of managing two services instead of one,
             | and if you're increasingly using something like an external
             | CDN the level of benefit from Varnish's cache declines
             | somewhat even though the powerful request routing and
             | manipulation features are still appealing.
             | 
             | I've been generally wondering what it would take to be able
             | to flip the model to something like Cloudflare's Argo
             | Tunnel feature where you could secure internal
             | communications by having your various web services make an
             | _outbound_ connection to the Varnish box which all of the
             | requests will be tunneled over so you only need to manage
             | one certificate there rather than one for every
             | service/container in a complex application.
        
             | phkamp wrote:
             | Varnish has a pretty big market-share as "the intelligent
             | HTTP-router" which can be used to sort traffic to piles of
             | legacy webservers etc, and also be the central clearing-
             | house for detecting and fixing trouble.
             | 
             | Surveys such as builtwith, despite their hard work, can
             | often not "see through the sandwich" and spot if there is a
             | Varnish in it.
             | 
             | Also, I don't know about you, but from a "I want the world
             | to keep working" reliability point of view, I do not like
             | it when any single piece of software, FOSS or non-FOSS,
             | becomes too dominant.
             | 
             | See for instance how dysfunctional the GCC-monopoly was
             | until LLVM gave them competition.
             | 
             | So taking builtwith at their numbers, I'm actually fine
             | with Varnish having "stagnated" at a market share of one
             | fifth of the world's top 10K websites: That keeps me humble
             | about my code quality, but does not keep me awake at night.
        
         | zapt02 wrote:
         | Varnish does support SSL officially in their enterprise
         | offering: https://docs.varnish-software.com/varnish-cache-
         | plus/feature...
        
         | phkamp wrote:
         | On the other hand: Count the CVEs.
         | 
         | An increasingly popular high-rel setup has two different
         | SSL/TLS handlers in front of Varnish, each using a different
         | SSL/TLS implementation.
         | 
         | That way an "ohh shit CVE" against either of those two
         | implementations allows you to turn that one off, and keep your site
         | running.
         | 
         | If we bolted any particular TLS/SSL implementation into
         | Varnish, you'd be down when that one got hit.
        
           | tyingq wrote:
           | Yes, I'm not arguing that he's (edit: you're) wrong. Just
           | that the decision seems to be causing people to choose other
           | solutions, I suspect because managing one thing at least
           | seems easier than two.
        
             | phkamp wrote:
             | "he" in this case being me :-)
             | 
             | I think the threshold question will always be "Does
             | software X make my life better?" and if it does not, you
             | should ditch it if you can.
             | 
             | There is always a huge bias in reporting: People are eager
             | to tell you why they started using your software, but they
             | always forget the "exit-interview" when they drop it again.
             | 
             | The reason I hear most often for people dropping Varnish is
             | that they have cleaned up the mess of legacy web-services,
             | or at least transitioned it all to the New Fantastic
             | Platform.
             | 
             | Other people drop Varnish for other versions of "this is
             | now surplus to requirements" and I am totally fine with
             | that: I don't want people to run Varnish if it doesn't make
             | their life better.
        
       | andrewmcwatters wrote:
       | So I guess a bigger question is, how do you reconcile old
       | thinking with new?
       | 
       | You can't really program everything to use big blocks of virtual
       | memory that are file backed. And you can't program everything to
       | be cache oblivious.
       | 
       | So is the only real solution to implement naive solutions until
       | they're slow and test them until they're not?
       | 
       | That would be sort of a sad way to write software, but perhaps it
       | is the only true way, since it's the most widely applied and
       | pragmatic practice.
       | 
       | Further, if all of this really is true, which I'm sure it is,
       | then APIs and languages have not caught up, not developers.
       | 
       | Because we're all still writing software with what's provided to
       | us, and last time I checked, we didn't have access to black box
       | caching implementation details, or direct access to L1, L2, L3,
       | etc.
        
         | phkamp wrote:
         | Ohh, one of the hard questions :-)
         | 
         | Personally, I think the cache oblivious thing was oversold, all
         | the algorithms I have seen are horribly complex and thus only
         | an improvement if your N is so big as to overshadow the
         | constant terms in O(...) we usually ignore. Patenting and
         | expecting to become stinking filthy rich on the licensing fees,
         | was another good way to prevent them from being used.
         | 
         | I think the best we can do is probably still making sure we
         | know what we are trying to do, and what we are not going to do,
         | and then architect accordingly.
         | 
         | That is of course really just a fancy way of saying "it
         | depends", but I'm OK with that: Magic silver bullets are not a
         | thing.
        
       | sudhirj wrote:
       | > Varnish allocate some virtual memory, it tells the operating
       | system to back this memory with space from a disk file.
       | 
       | What's the special instruction to do this? Is this a low level C
       | / syscall thing or do languages like Go have a disk-backed map
       | implementation already?
        
         | ruste wrote:
         | This sounds like a backwards description of mmap. This is
         | probably what they're using on the backend. I'm not sure if
         | Windows has this as a feature, but any unixy system will.
        
           | Arnavion wrote:
           | mmap on Windows is CreateFileMapping + MapViewOfFile
        
         | KarlKode wrote:
         | I guess they are referring to mmap
         | (https://man7.org/linux/man-pages/man2/mmap.2.html). In Go you
         | can use the mmap syscall directly (low level, not supported on
         | all targets/OSs). I believe there are a few libraries that
         | offer wrappers around the syscall, or emulate it where it is
         | not available.
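
As a concrete illustration of what "backing virtual memory with a disk file" looks like from a higher-level language, here is a minimal sketch using Python's standard `mmap` module, which wraps mmap() on Unix and the file-mapping API on Windows. It is not Varnish's code; `write_through_mapping` is a hypothetical helper, and the file size and payload are arbitrary.

```python
import mmap
import os
import tempfile


def write_through_mapping(path, size, payload):
    """Map `size` bytes of the file at `path` into memory, store `payload`
    through the mapping, and return what an ordinary read() sees afterwards.
    """
    # The file must already be as large as the region we want to map.
    with open(path, "wb") as f:
        f.truncate(size)

    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), size) as m:
            # Plain memory writes; the OS decides when pages reach the disk.
            m[:len(payload)] = payload
            # flush() asks for dirty pages to be written back (like msync).
            m.flush()

    with open(path, "rb") as f:
        return f.read(len(payload))
```

The program never calls write() for the payload: it stores into memory, and the kernel's virtual memory system is responsible for moving pages between RAM and the file.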
        
       | capableweb wrote:
       | See also: a reply from antirez (of Redis fame) -
       | http://oldblog.antirez.com/post/what-is-wrong-with-2006-prog...
        
         | 0x000000001 wrote:
         | Don't overlook PHK's replies in the comments there
        
           | throw0101a wrote:
           | He's on HN as well:
           | 
           | * https://news.ycombinator.com/user?id=phkamp
        
           | jasonwatkinspdx wrote:
           | This. Note that PHK's comments have proven entirely correct,
           | and redis had to implement at least a limited form of
           | multithreading.
           | 
           | I would be very careful about Antirez's older advice on this
           | topic area. At one point he was designing his own "VM"
           | algorithm that would work at 256 byte granularity, an idea
           | that was never viable.
        
             | andrewmcwatters wrote:
             | I've noticed this sort of thing from Antirez before and it
             | makes his comments seem dubious unless they come from his
             | specific specialization.
             | 
             | He made some claims about Lua here some time ago, and if
             | you read his code, it was clear he didn't know how to use
             | its C API and was criticizing it over something he could
             | have resolved by just reading the manual.
             | 
             | Software is hard, though, and there's always someone who is
             | more right, so, oh well.
             | 
             | Maybe a good strategy is to openly ask others if there are
             | better solutions the reader knows of and to share them.
        
               | jasonwatkinspdx wrote:
               | Yeah, I don't want to be overly negative, but Not
               | Invented Here syndrome is a recurring theme with his
               | work. Unfortunately that's quite prevalent among software
               | developers as a whole. It's somewhat understandable: it's
               | easier to just run with your imagination and start
               | hacking something together than dig through research
               | papers, textbooks, and blog posts.
        
       ___________________________________________________________________
       (page generated 2021-08-04 23:01 UTC)