[HN Gopher] Notes from the Architect (2016)
___________________________________________________________________
Notes from the Architect (2016)
Author : genericlemon24
Score : 82 points
Date : 2021-08-02 11:06 UTC (2 days ago)
(HTM) web link (varnish-cache.org)
(TXT) w3m dump (varnish-cache.org)
| skywhopper wrote:
| I find it mildly amusing that the author spends the first half of
| the article mocking squid for handling disk cache manually when
| varnish just hands the work to the OS instead, and then the
| second half of the article explaining how and why varnish does
| all its own memory management for processing requests instead of
| just letting the OS handle it.
| jerf wrote:
| No, the first half explains why varnish lets the OS manage
| virtual memory, then explains why varnish doesn't rely on the
| user space memory allocator that comes with the language. Two
| completely different layers and different things.
| aporetics wrote:
| This
| throw0101a wrote:
| (2006)
|
| Previously:
|
| * https://news.ycombinator.com/item?id=27086239
| phkamp wrote:
| This post does seem to pop up here every other year, doesn't it?
| nextaccountic wrote:
| > Well, today computers really only have one kind of storage, and
| it is usually some sort of disk, the operating system and the
| virtual memory management hardware has converted the RAM to a
| cache for the disk storage.
|
| > So what happens with squids elaborate memory management is that
| it gets into fights with the kernels elaborate memory management,
| and like any civil war, that never gets anything done.
|
| ctrl+f mmap on this article doesn't return anything so.. is it
| saying it's better to mmap files and have the files store
| in-memory data structures like cap'n'proto does?
|
| (if it's entirely in-memory with no file in disk, then it's
| surely comparing apples to oranges? if a program needs the
| persistent memory, it needs the disk)
|
| Also: is mmaping a file and using it directly from memory what
| databases should do today to avoid fighting the OS disk cache?
| atonse wrote:
| I thought the fact that mongodb data files were essentially
| mmapped was used as a critique against its durability as a
| primary database. Is that not true?
|
| Or was that not a relevant critique?
| nextaccountic wrote:
| Welp, I don't know!
|
| From the link of the other comment here, Sqlite disables mmap
| by default because Linux provides a poor API regarding I/O
| errors (they are sent as a signal instead)
| jd_mongodb wrote:
| MongoDB released the Wired Tiger storage engine in 2015 which
| replaced the MMAP storage engine. So whatever you read about
| MMAP is obsolete.
|
| https://www.mongodb.com/presentations/a-technical-
| introducti...
| atonse wrote:
| Yep I know that. I'm saying mmap used to be a critique.
|
| I left the place I introduced mongo to, a couple months
| before they bought WiredTiger
| jfindley wrote:
| A database like mongodb is a very different animal from a
| http cache. The mongodb authors likely read this article
| (it's from 2006) when they were designing the first version
| of their database, and due to their almost complete
| unfamiliarity with what they were doing didn't understand
| that these two things are not alike.
|
| Databases have different classes of objects (think index vs
| data, though there are more than just these two), different
| types of access (e.g. full table scan vs single record
| select) and a whole lot more complexity than I can cover
| here. You don't really ever want to page out your indices,
| but paging out data is probably fine. You likely want to
| avoid replacing your entire cache for a full table scan of a
| big table. You probably want to have some concept of ensuring
| a single user doesn't starve out all your other 1000+ users.
| Etc etc.
|
| In other words, you can't get away with just opting out of
| memory management for a database - you need to treat
| different things differently. I understand that a few years
| in, the mongo devs eventually caught up to 1990s era mysql
| and realised this, leading to a new DB engine that was a bit
| less bad.
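
The point about a full table scan replacing the whole cache can be shown with a toy cache. This is only a sketch under stated assumptions: `LRUCache`, its `cacheable` flag, and `hot_pages_left` are hypothetical names invented for illustration, and real database engines use more elaborate policies than this.

```python
from collections import OrderedDict


class LRUCache:
    """Toy least-recently-used page cache (hypothetical, for illustration)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # oldest entry first

    def get(self, key, load, cacheable=True):
        if key in self.pages:
            self.pages.move_to_end(key)  # mark as most recently used
            return self.pages[key]
        value = load(key)
        if cacheable:
            self.pages[key] = value
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)  # evict least recently used
        return value


def hot_pages_left(scan_cacheable):
    """Warm four hot pages, run a 100-page scan, count surviving hot pages."""
    cache = LRUCache(capacity=4)
    load = lambda key: f"page-{key}"
    for key in range(4):
        cache.get(key, load)
    for key in range(100, 200):
        cache.get(key, load, cacheable=scan_cacheable)
    return sum(1 for key in range(4) if key in cache.pages)


if __name__ == "__main__":
    print("scan goes through the cache:", hot_pages_left(True))
    print("scan bypasses the cache:   ", hot_pages_left(False))
```

When the scan is allowed to populate the cache it pushes out every hot page; when scan reads bypass the cache the hot set survives. A cache that treats all accesses alike cannot make that distinction.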
| aidenn0 wrote:
| It's a valid critique, but a cache has no durability
| requirements.
| jasonwatkinspdx wrote:
| Mmaping directly readable data is certainly a way of avoiding
| the problems PHK is talking about, and can be reasonably
| efficient. Sendfile can be even better if you can structure
| things to have a metadata index that's in memory or mmaped, and
| never need to read the actual sent data.
|
| But mmap is no panacea. Its big flaw is that the kernel
| scheduler often doesn't know as much as your app about what
| you're doing. So when you touch an out of core page and it page
| faults, your thread just gets locked up. Worse, the kernel's
| read ahead logic may make the wrong guess about how many pages
| you're going to read, or when you want data written back out.
| You can try to mitigate this using madvise and msync, but in
| practice it's kinda brittle.
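
A minimal sketch of the madvise/msync mitigation mentioned above, using Python's standard `mmap` module. The function names are hypothetical; `madvise` needs Python 3.8+ and an OS that provides it, so the call is guarded. The hint is advisory only, and the kernel is free to ignore it.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def checksum_mapped(path):
    """Sum every byte of a non-empty file through a read-only mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Tell the kernel we will read front to back so it can read ahead.
            if hasattr(mm, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                mm.madvise(mmap.MADV_SEQUENTIAL)
            total = 0
            for offset in range(0, len(mm), PAGE):
                # If this page is not resident, the thread blocks right here
                # in a page fault until the kernel has read it in.
                total += sum(mm[offset:offset + PAGE])
            return total


def overwrite_mapped(path, payload):
    """Write payload at the start of the file through a shared mapping."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mm:
            mm[:len(payload)] = payload
            mm.flush()  # on Unix this issues msync for the mapping


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.write(fd, b"\x01" * (4 * PAGE))
    os.close(fd)
    print(checksum_mapped(path))
    overwrite_mapped(path, b"\x05\x05")
    print(checksum_mapped(path))
    os.remove(path)
```

Nothing in the code controls when a page fault happens or how long it takes, which is the brittleness being described.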
|
| Some systems use blocking io with a dedicated thread pool to
| wait on the block. This way the main threads can move on to
| other useful work rather than sitting stalled on a page fault.
| Golang's runtime does something like this for you
| automatically.
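
The thread-pool pattern can be sketched as follows. This illustrates the general idea in Python and does not show how the Go runtime implements it; `read_block` and `read_blocks` are hypothetical names, and `os.pread` is only available on Unix-like systems.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_block(path, offset, length):
    """Blocking positional read; intended to run on a pool thread."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)


def read_blocks(path, requests, workers=4):
    """Run (offset, length) reads on a pool and return results in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(read_block, path, off, n) for off, n in requests]
        # The submitting thread is not stalled by any single slow read; it
        # could do other work here and only waits when it calls result().
        return [f.result() for f in futures]
```

The cost is an explicit copy into a buffer per read, in exchange for the stall landing on a worker thread instead of on whichever thread happened to touch the page.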
|
| Databases for the most part do their own scheduling, and may
| even use async io. This is because they know far better what
| they're actually doing than the kernel can guess. Another
| reason is the database's buffer cache is usually tightly
| coupled with the consistency control algorithm, so again, you
| don't want the kernel guessing, you want total control.
| Databases often use direct io and take on the burden of
| scheduling everything themselves precisely for this reason.
|
| LMDB uses mmap, but should not really be considered a database.
| It's just an embedded CoW btree, though a fine one.
|
| Andy Pavlo's openly published course materials are a great
| resource for learning about database internals. He used to be
| emphatically against mmap for databases, but in the last couple
| years has moderated that position somewhat.
|
| io_uring is looking like it will be the best overall solution
| moving forward, at least for linux only.
| nextaccountic wrote:
| Thanks for the explanation.
|
| > io_uring is looking like it will be the best overall
| solution moving forward, at least for linux only.
|
| But io_uring means duplicating memory, right? There will be
| both my in-memory representation and the kernel cache for the
| file and they will be fighting each other (or at least,
| consuming more memory than necessary).
| genericlemon24 wrote:
| Yes, I understood that to mean mmap().
|
| > is mmaping a file and using it directly from memory what
| databases should do today
|
| Some do; for example, SQLite: https://www.sqlite.org/mmap.html
| (the page has pros and cons, and explains how SQLite does
| memory mapped I/O).
| haasted wrote:
| Would be nice if the headline included the year this was written.
| (2005? 2006?)
| Ecco wrote:
| Does anyone else think the MP3 player joke is kind of weird?
| jfindley wrote:
| Yes - this was written in 2006 and that joke really hasn't aged
| well at all.
|
| Dang maybe worth sticking a (2006) on the end of this?
| capableweb wrote:
| Eh. The article itself is good enough for me to ignore one
| silly penile joke that almost doesn't even make sense.
| tyingq wrote:
| He also has a post on why Varnish doesn't support SSL/TLS:
|
| https://varnish-cache.org/docs/4.0/phk/ssl.html
|
| Which is somewhat interesting now, as Varnish has lost a lot of
| ground to other caching proxies that did choose to implement
| SSL/TLS.
| myWindoonn wrote:
| How are you measuring this lost ground? Varnish is still
| extremely popular, and designs which separate TLS from caching
| are also still popular.
| tyingq wrote:
| Mostly anecdotal, since there's no real definitive way to
| tell.
|
| There are two spaces I see.
|
| - Caching within the webserver, as the caching there has
| improved over time. The caches in Nginx, Apache, LiteSpeed,
| etc, perform much better than they did in the past.
|
| - Caching at a load balancer tier. Nginx again (though in a
| LB context), Haproxy, etc.
|
| Both spaces seem to have less talk about Varnish than they
| used to, and more about other platforms.
|
| You can see a plateau, then decline (though slight) that
| starts for Varnish around mid-2018, here:
|
| https://trends.builtwith.com/Web-Server/Varnish
|
| Compare to:
|
| LiteSpeed: https://trends.builtwith.com/Web-Server/LiteSpeed
|
| Nginx: https://trends.builtwith.com/Web-Server/nginx
|
| (Though these kinds of surveys are tricky, as they depend on
| outward facing headers that don't always exist, or don't
| always tell you enough...like Nginx doesn't always imply the
| cache is in use)
|
| I also wonder what percentage hit Varnish might take if
| Fastly moves away. I'm sure what they regret is not Varnish
| specifically, but exposing the Varnish VCL directly to end
| users.
|
| Edit: Yes, I agree that sites like "builtwith" are flawed,
| and mentioned some of the reasons above. And, I didn't mean
| for this comment to sound like a criticism...just an
| observation. I noticed builtwith's chart has a similar
| plateau + slight decline for Apache, starting also near
| mid-2018.
| acdha wrote:
| > Though these kinds of surveys are tricky, as they depend
| on outward facing headers that don't always exist
|
| It's not just "don't always exist": those headers are
| actively recommended against by various security guidelines
| so many large sites heavily use things you can only infer
| from other characteristics.
|
| This is also the kind of environment where I see some
| movement against Varnish: internal TLS requirements
| increase the cost of managing two services instead of one,
| and if you're increasingly using something like an external
| CDN the level of benefit from Varnish's cache declines
| somewhat even though the powerful request routing and
| manipulation features are still appealing.
|
| I've been generally wondering what it would take to be able
| to flip the model to something like Cloudflare's Argo
| Tunnel feature where you could secure internal
| communications by having your various web services make an
| _outbound_ connection to the Varnish box which all of the
| requests will be tunneled over so you only need to manage
| one certificate there rather than one for every
| service/container in a complex application.
| phkamp wrote:
| Varnish has a pretty big market-share as "the intelligent
| HTTP-router" which can be used to sort traffic to piles of
| legacy webservers etc, and also be the central clearing-
| house for detecting and fixing trouble.
|
| Surveys such as builtwith, despite their hard work, can
| often not "see through the sandwich" and spot if there is a
| Varnish in it.
|
| Also, I don't know about you, but from a "I want the world
| to keep working" reliability point of view, I do not like
| it when any single piece of software, FOSS or non-FOSS,
| becomes too dominant.
|
| See for instance how dysfunctional the GCC-monopoly was
| until LLVM gave them competition.
|
| So taking builtwith at their numbers, I'm actually fine
| with Varnish having "stagnated" at a market share of one
| fifth of the world's top 10K websites: That keeps me humble
| about my code quality, but does not keep me awake at night.
| zapt02 wrote:
| Varnish does support SSL officially in their enterprise
| offering: https://docs.varnish-software.com/varnish-cache-
| plus/feature...
| phkamp wrote:
| On the other hand: Count the CVEs.
|
| An increasingly popular high-rel setup has two different
| SSL/TLS handlers in front of Varnish, each using a different
| SSL/TLS implementation.
|
| That way an "ohh shit CVE" against either of those two
| implementations allows you to turn that one off, and keep your
| site running.
|
| If we bolted any particular TLS/SSL implementation into
| Varnish, you'd be down when that one got hit.
| tyingq wrote:
| Yes, I'm not arguing that he's (edit: you're) wrong. Just
| that the decision seems to be causing people to choose other
| solutions, I suspect because managing one thing at least
| seems easier than two.
| phkamp wrote:
| "he" in this case being me :-)
|
| I think the threshold question will always be "Does
| software X make my life better?" and if it does not, you
| should ditch it if you can.
|
| There is always a huge bias in reporting: People are eager
| to tell you why they started using your software, but they
| always forget the "exit-interview" when they drop it again.
|
| The reason I hear most often for people dropping Varnish is
| that they have cleaned up the mess of legacy web-services,
| or at least transitioned it all to the New Fantastic
| Platform.
|
| Other people drop Varnish for other versions of "this is
| now surplus to requirements" and I am totally fine with
| that: I don't want people to run Varnish if it doesn't make
| their life better.
| andrewmcwatters wrote:
| So I guess a bigger question is, how do you reconcile old
| thinking with new?
|
| You can't really program everything to use big blocks of virtual
| memory that are file backed. And you can't program everything to
| be cache oblivious.
|
| So is the only real solution to implement naive solutions until
| they're slow and test them until they're not?
|
| That would be sort of a sad way to write software, but perhaps it
| is the only true way, since it's the most widely applied and
| pragmatic practice.
|
| Further, if all of this really is true, which I'm sure it is,
| then APIs and languages have not caught up, not developers.
|
| Because we're all still writing software with what's provided to
| us, and last time I checked, we didn't have access to black box
| caching implementation details, or direct access to L1, L2, L3,
| etc.
| phkamp wrote:
| Ohh, one of the hard questions :-)
|
| Personally, I think the cache oblivious thing was oversold, all
| the algorithms I have seen are horribly complex and thus only
| an improvement if your N is so big as to overshadow the
| constant terms in O(...) we usually ignore. Patenting and
| expecting to become stinking filthy rich on the licensing fees,
| was another good way to prevent them from being used.
|
| I think the best we can do is probably still making sure we
| know what we are trying to do, and what we are not going to do,
| and then architect accordingly.
|
| That is of course really just a fancy way of saying "it
| depends", but I'm OK with that: Magic silver bullets are not a
| thing.
| sudhirj wrote:
| > Varnish allocate some virtual memory, it tells the operating
| system to back this memory with space from a disk file.
|
| What's the special instruction to do this? Is this a low level C
| / syscall thing or do languages like Go have a disk-backed map
| implementation already?
| ruste wrote:
| This sounds like a backwards description of mmap. This is
| probably what they're using on the backend. I'm not sure if
| Windows has this as a feature, but any unixy system will.
| Arnavion wrote:
| mmap on Windows is CreateFileMapping + MapViewOfFile
| KarlKode wrote:
| I guess they are referring to mmap (https://man7.org/linux/man-
| pages/man2/mmap.2.html). In Go you can use the mmap syscall
| directly (low level, not supported on all targets/OSs). I
| believe there are a few libraries that offer wrappers around
| the syscall/emulate the syscall if not available.
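
For readers wondering what "back this memory with space from a disk file" looks like in practice, here is a minimal sketch using Python's standard `mmap` module, which wraps mmap on Unix and the file-mapping calls on Windows. It only illustrates the mechanism and is not Varnish's actual code; `create_backed_region` is a hypothetical helper name.

```python
import mmap
import os
import tempfile


def create_backed_region(path, size):
    """Create a file of `size` bytes and map it read/write into memory."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)      # backing file must cover the whole mapping
        return mmap.mmap(fd, size)  # shared and writable by default
    finally:
        os.close(fd)                # the mapping stays valid after this


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "storage.bin")
    region = create_backed_region(path, 4096)
    region[0:5] = b"hello"          # an ordinary memory write
    region.flush()                  # ask the OS to write dirty pages out
    region.close()
    with open(path, "rb") as f:
        print(f.read(5))
```

The program only ever reads and writes memory; deciding which pages live in RAM and when they reach the disk is left to the operating system.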
| capableweb wrote:
| See also: a reply from antirez (of Redis fame) -
| http://oldblog.antirez.com/post/what-is-wrong-with-2006-prog...
| 0x000000001 wrote:
| Don't overlook PHK's replies in the comments there
| throw0101a wrote:
| He's on HN as well:
|
| * https://news.ycombinator.com/user?id=phkamp
| jasonwatkinspdx wrote:
| This. Note that PHK's comments have proven entirely correct,
| and redis had to implement at least a limited form of
| multithreading.
|
| I would be very careful about Antirez's older advice on this
| topic area. At one point he was designing his own "VM"
| algorithm that would work at 256 byte granularity, an idea
| that was never viable.
| andrewmcwatters wrote:
| I've noticed this sort of thing from Antirez before and it
| makes his comments seem dubious unless they come from his
| specific specialization.
|
| He made some claims about Lua here some time ago, and if
| you read his code, it was clear he didn't know how to use
| its C API and was criticizing it over something he could
| have just looked up in the manual.
|
| Software is hard, though, and there's always someone who is
| more right, so, oh well.
|
| Maybe a good strategy is to openly ask others if there are
| better solutions the reader knows of and to share them.
| jasonwatkinspdx wrote:
| Yeah, I don't want to be overly negative, but Not
| Invented Here syndrome is a recurring theme with his
| work. Unfortunately that's quite prevalent among software
| developers as a whole. It's somewhat understandable: it's
| easier to just run with your imagination and start
| hacking something together than dig through research
| papers, textbooks, and blog posts.
___________________________________________________________________
(page generated 2021-08-04 23:01 UTC)