[HN Gopher] Chromium cleans up its act and daily DNS root server...
       ___________________________________________________________________
        
       Chromium cleans up its act and daily DNS root server queries drop
       by 60B
        
       Author : bdcravens
       Score  : 172 points
       Date   : 2021-02-05 14:35 UTC (8 hours ago)
        
 (HTM) web link (www.theregister.com)
 (TXT) w3m dump (www.theregister.com)
        
       | CivBase wrote:
       | I'm surprised 60 billion DNS queries is only 41% of the daily
       | norm. With a billion people between the US and Europe alone - the
       | vast majority of which use the world wide web directly multiple
       | times a day (nevermind other online services) - and the amount of
       | domains a typical website loads resources from, I figured daily
       | DNS queries would have broken into the trillions by now.
       | 
       | I suppose that's probably thanks to caching, dedicated apps for
       | many websites, and most users sticking to a relatively small
       | selection of websites.
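       | 
       | As a rough check on the scale being discussed, here is the
       | arithmetic implied by reading "60 billion is 41% of the daily
       | norm" literally. Both inputs come from the comment above; the
       | variable names are only illustrative.

```python
# Back-of-the-envelope: if 60 billion queries/day is 41% of daily root
# traffic, what total does that imply? Inputs are the figures quoted above.
DROPPED_PER_DAY = 60e9
SHARE_OF_TOTAL = 0.41

implied_total = DROPPED_PER_DAY / SHARE_OF_TOTAL
remaining = implied_total - DROPPED_PER_DAY

print(f"implied total: {implied_total / 1e9:.0f} billion/day")
print(f"remaining:     {remaining / 1e9:.0f} billion/day")
```

       | Under that reading the root servers would have been seeing on
       | the order of 150 billion queries a day, which is still far below
       | "trillions".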
        
         | ElFitz wrote:
         | Also, people usually hit their ISPs', Google's or Cloudflare's
         | DNS servers, not the root ones
         | 
         | It's, I believe, those servers that hit the root DNS servers
         | when they don't have the data.
        
         | jasonhansel wrote:
         | These are requests to the root servers, not to the ordinary DNS
         | servers most people use.
        
           | sllabres wrote:
           | Correct, and IIRC the TTL is 5 days.
        
         | csunbird wrote:
         | There is a lot of caching.
         | 
         | https://xkcd.com/908/
        
         | jlgaddis wrote:
         | Well, probably (at least) half of those billion people are
         | using one of the large public DNS servers (Google, Cloudflare,
         | et al.).
         | 
         | Plus, the answers returned by the root servers more often than
         | not include resource records with very high TTL values (e.g.,
         | NS RRs with a TTL of two days). These then get cached for that
         | long by the recursive resolvers that are used (directly) by end
         | users.
         | 
         | The root servers aren't responding to requests for the A RR for
         | google.com from Joe Schmoe or the MX RR for gmail.com from
         | Outlook running on his desktop -- both of which (without
         | checking) likely have TTLs measured in a two- or (at the most)
         | three-digit number of seconds.
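         | 
         | To make the TTL point concrete, here is a minimal sketch of
         | the kind of cache a recursive resolver keeps. It is not any
         | real resolver's code: the class names, the 300-second A
         | record TTL and the 192.0.2.1 address are assumptions for
         | illustration; the two-day NS TTL is the figure from above.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache keyed by (name, record type), honoring TTLs."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def put(self, name: str, rtype: str, value: str, ttl_seconds: float) -> None:
        # Store the value together with its absolute expiry time.
        self._store[(name.lower(), rtype)] = (self._clock() + ttl_seconds, value)

    def get(self, name: str, rtype: str) -> Optional[str]:
        key = (name.lower(), rtype)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop it so the caller has to ask upstream again.
            del self._store[key]
            return None
        return value


class FakeClock:
    """Manually advanced clock so the demo does not need to sleep."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


clock = FakeClock()
cache = TTLCache(clock)
# Referral learned from a root server: long TTL (two days).
cache.put("com.", "NS", "a.gtld-servers.net.", ttl_seconds=2 * 24 * 3600)
# Answer learned from an authoritative server: short TTL (assumed 300 s).
cache.put("google.com.", "A", "192.0.2.1", ttl_seconds=300)

clock.now += 3600  # one hour later
ns_after_hour = cache.get("com.", "NS")
a_after_hour = cache.get("google.com.", "A")
print(ns_after_hour, a_after_hour)
```

         | After an hour the short-lived A record has to be fetched
         | again, but the NS referral is still cached, so the resolver
         | has no reason to go back to the root for it.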
        
       | fitblipper wrote:
       | At first blush, the fact that all these searches were sent over
       | DNS (which is plain text by default) was a gross privacy
       | violation. Congress has passed bills allowing ISPs to use your
       | DNS and browsing history and sell it to 3rd parties [1]. With all
       | these searches being included in DNS queries, Chrome was
       | basically telling everyone who wanted that data that you were
       | worried about that mole or that you have that weird fetish.
       | 
       | [1] https://www.nbcnews.com/news/us-news/senate-votes-let-
       | isps-s...
       | 
       | (edit: explaining better what I felt was a privacy violation)
        
         | tptacek wrote:
         | Your link does not say what you say it does, though I do
         | believe ISPs monetize DNS queries, which is part of the reason
         | for DoH.
        
       | clan wrote:
       | This is a news story which "reports" on a blog entry. At least
       | they linked the source:
       | 
       | https://blog.apnic.net/2021/02/04/how-chromium-reduces-root-...
        
         | lstamour wrote:
         | Or https://blog.verisign.com/domain-names/chromiums-
         | reduction-o...
        
       | cashewchoo wrote:
       | I recently completed a "software pilgrimage", as I decided to
       | call it in explaining to friends why I did it, of writing my own
       | DNS recursive resolver. I decided to implement it purely from RFC
       | 1034/1035, with no library code aside from TCP and UDP sockets
       | and a JSON config parser, those kinds of things. No library help
       | for anything relating to the RFC. It's part of a homelab-focused
       | DNS solution that I plan to release under the AGPLv3 once I
       | dogfood it for a while. Basically "DNS for impatient
       | homelabbers/SMB's who don't want to have to learn how to write
       | zonefiles and don't want some crazy enterprise omnibus thing
       | either".
       | 
       | Anyway, in the process of doing this and putting it in my DHCP
       | config and seeing the traffic from two smart TVs of different
       | brands, my RIPE atlas probe, various phones and laptops, smart
       | thermostat, etc etc, I've noticed that there is a TON of
       | "garbage" dns requests. Like, the chart [the article] shows -
       | where about 70% of queries result in a name error - that totally
       | meshes with what I see on a much smaller scale. Right now,
       | prometheus tells me that since last restart (which was about a
       | week ago at this point?), I've answered 80437 queries with
       | NO_ERROR, 52014 with NAME_ERROR, and 242 with SERVER_FAILURE
       | (funnily enough, when spot checking these, 8.8.8.8 also
       | SERVER_FAILUREs these same requests - usually devices with
       | presumably-buggy dns libraries not correctly specifying lengths
       | of variable-length fields).
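       | 
       | The shares implied by those three counters can be checked with
       | a few lines of Python. This is only arithmetic on the numbers
       | quoted above; the dictionary keys mirror the response codes as
       | written in this comment.

```python
# Counters quoted above, keyed by response code.
counts = {"NO_ERROR": 80437, "NAME_ERROR": 52014, "SERVER_FAILURE": 242}

total = sum(counts.values())
# Percentage of all answered queries per response code.
share = {rcode: 100 * n / total for rcode, n in counts.items()}

for rcode, pct in share.items():
    print(f"{rcode:<15} {counts[rcode]:>6}  {pct:5.1f}%")
```

       | By these counters NAME_ERROR is a bit under 40% of answers:
       | lower than the roughly 70% in the chart, but still a very large
       | fraction.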
       | 
       | It really surprised me that so many DNS requests come back with
       | what I was originally considering to be an error condition. But
       | I guess sometimes the absence of a DNS record is just as
       | meaningful as the presence.
       | 
       | Incidentally, I also noticed these chromium dns requests, and
       | they had me worried for a bit because I wondered if they were
       | malware trying to exploit some kind of vuln in dns servers. Took
       | a bit of googling to figure it out. I do think they make up a
       | decent % of the name errors I see, though I hadn't gotten around
       | to having prometheus split them out to measure.
        
         | fouronnes3 wrote:
         | That's a very sweet idea. What are some other "software
         | pilgrimage" projects for ambitious hackers with a finite amount
         | of free time?
         | 
         | Buying something in bitcoin using only the protocol spec and
         | man pages? Tweeting from a Linux from scratch install? Writing
         | a correct tar command on the first try? Writing a quine on your
         | favorite language without google?
        
           | uncledave wrote:
           | My personal pilgrimage seems to always end up designing a
           | stack virtual machine and writing a compiler for
           | it for a toy programming language. I tend to pick this up
           | every time I learn a new language to get a feel for it.
           | 
           | I usually get it to run a simple loop to print hello world
           | and do the inevitable Fibonacci generator and get bored. No
           | fancy things like functions!
           | 
           | One of these things was crudely repurposed into a domain
           | specific language for a job as well which was handy.
        
           | 3np wrote:
           | There's so much.
           | 
           | If I can dream, an LDAP implementation with a similar target
           | as GP would be fantastic.
           | 
           | That's not a short one though (:
        
           | bitxbitxbitcoin wrote:
           | Follow up using bitcoin using only the protocol spec and man
           | pages with setting up a multisig wallet on testnet!
        
           | wdfx wrote:
           | I'm writing a software audio synthesiser.
           | 
           | Not quite the same as I'm not following or reimplementing any
           | established standard, but it's interesting and fun to
           | greenfield a C++ project using minimal dependencies to do
           | things like:
           | 
           | - invent and parse a DSL
           | - modularise components and connections between them
           | - possibly cyclic directed graph traversal
           | - efficient near-real-time audio processing (working in
           |   chunks within a time budget)
           | - multithreading / parallel module processing
           | - spit the outputs into an audio device and/or wav files
           | - eventually put a GUI on top
        
           | trillic wrote:
           | I built (against my will) a regex engine in college and it's
           | the toughest, most CS'y "software pilgrimage" I could think
           | of.
           | 
           | In the same vein, I think building or designing some of the
           | components in an OS or Systems book[1] could have a similar
           | positive result.
           | 
           | [1] https://mitpress.mit.edu/books/elements-computing-
           | systems-se...
        
           | cheschire wrote:
           | LFS was my first tech pilgrimage as a teen. I would love to
           | see more projects with that level of optional hand holding!
        
           | cashewchoo wrote:
           | Heh. I was inspired by a movement I saw a while ago to read
           | old CS academic research, since e.g. quicksort is still very
           | relevant nowadays but is also still approachable for someone
           | with a B.Sc. in CS. (Whereas more recent papers all require
           | fairly deep knowledge of the research niche).
           | 
           | My take on it was that lots of this old software was
           | implemented by candlelight with a magnetized needle, a 512KiB
           | HDD platter and a steady hand; so I should be able to
           | reimplement it in a modern language with modern tooling in a
           | lot less time, while learning a lot about the system. Like,
           | dig command output means a lot more to me now than it did at
           | the start, and I now appreciate what articles like the OP mean.
        
           | nishanth_v wrote:
           | One of my friends is building a collection of such exercises
           | aimed at experienced & intermediate programmers, where you
           | build Redis / Docker / Git / React from scratch with
           | guidance.
           | 
           | https://codecrafters.io/challenges
        
             | maxwelljoslyn wrote:
             | See also: Robert Heaton's series of project prompts
             | "Programming Projects for Advanced Beginners."
             | 
             | https://robertheaton.com/2018/12/08/programming-projects-
             | for...
             | 
             | The project goals are not as lofty as those proposed by
             | nishanth_v's friend, but Mr. Heaton goes the extra mile to
             | turn each project idea into a step by step mini-curriculum
             | with lots of extension points.
             | 
             | Then, he goes _another_ extra mile by allowing readers to
             | email in their buggy projects, and running a companion
             | series where he teaches people how to debug / fix /
             | improve their code by refactoring readers' attempts at the
             | "...Advanced Beginners" projects.
             | 
             | Recommended.
        
           | the_imp wrote:
           | I wrote a fully spec-compliant YAML library.
        
             | zingplex wrote:
             | I didn't think that was possible
        
       | basilgohar wrote:
       | Can someone explain why this couldn't have been implemented
       | purely client side and validating if what was entered was first
       | even a valid hostname? Or were the DNS queries only run after
       | they passed an initial check?
       | 
       | I think this is one of the key problems that emerged by merging
       | the search functionality into the location bar. I still now
       | always enable the separate search bar for Firefox and avoid
       | running searches in the so-called Awesome bar. I still consider a
       | search and host lookup very different operations in my mind.
        
         | Deathmax wrote:
         | I believe it was mainly to detect when a user is trying to
         | navigate to an intranet site with a single-word name (for
         | example http://fileserver). To do that, Chrome needs to make a
         | DNS request for the word and check if it's valid.
         | 
         | However, you can have misbehaving ISPs that replace all
         | NXDOMAIN responses with something (producing false positives
         | for what constitutes "valid" hostnames), or a guest portal that
         | is hijacking DNS responses, so Chrome will make requests for
         | random DNS hostnames that are unlikely to exist to detect if
         | DNS hijacking is occurring and disable the intranet detection
         | if it is.
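         | 
         | A minimal sketch of the probing idea described above, not
         | Chromium's actual code. The function names, the probe count
         | of three and the 7 to 15 letter label length are assumptions
         | for illustration; the resolver is injectable so the logic can
         | be exercised without touching the network.

```python
import random
import socket
import string
from typing import Callable, Optional


def random_label(rng: random.Random, min_len: int = 7, max_len: int = 15) -> str:
    """Return a random lowercase single-label hostname unlikely to exist."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def system_resolves(hostname: str) -> bool:
    """True if the system resolver returns any address for hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def dns_looks_hijacked(
    resolves: Callable[[str], bool] = system_resolves,
    probes: int = 3,
    rng: Optional[random.Random] = None,
) -> bool:
    """If names that should not exist resolve, NXDOMAIN is being rewritten."""
    rng = rng or random.Random()
    return any(resolves(random_label(rng)) for _ in range(probes))


def is_intranet_host(word: str, resolves: Callable[[str], bool], hijacked: bool) -> bool:
    """Only trust a successful lookup of a bare word when DNS is not hijacked."""
    return (not hijacked) and resolves(word)
```

         | For example, is_intranet_host("fileserver", system_resolves,
         | dns_looks_hijacked()) would only report an intranet host on a
         | network where random names correctly fail to resolve. Each
         | call to dns_looks_hijacked() sends lookups for names with no
         | valid TLD, which is the kind of query that ends up at the
         | root servers.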
        
           | somehnguy wrote:
           | If that was the goal, it does a pretty poor job. The amount
           | of times I've accidentally Googled things like
           | 'internalserver', 'fileserver:80', or 'dockerhost:8000' over
           | the years is way too high.
           | 
           | And when you do eventually reach any of them via manually
           | prepending 'http://', they make the 'http://' not visible in
           | the bar. What confusing signaling!
           | 
           | Hint for Chrome: if I'm appending a port and the DNS name
           | exists I probably don't want to Google it.
        
             | eric-lee wrote:
             | I've searched for internal servers so many times I've just
             | gotten into the habit of suffixing with '/' so
             | 'internalserver/' goes to http://internalserver/
        
             | rfoo wrote:
             | > If that was the goal, it does a pretty poor job.
             | 
             | Their goal is to treat the query as search, _and then_ try
             | to detect if a host named "internalserver" exists. If so,
             | Chrome displays a banner saying "do you want to visit
             | internalserver/". So they don't have to delay the search
             | until the probe finishes, but it never automatically works
             | without another click.
             | 
             | Personally I don't like it either, so I usually type
             | "internalserver/" (with a trailing slash) to skip the
             | search.
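             | 
             | The rules people describe in this subthread can be
             | written down as a small classifier. This is a
             | hypothetical heuristic, not Chrome's omnibox logic: the
             | function name, the return labels and the "anything with
             | a dot is a hostname" rule are all assumptions.

```python
import re

# "scheme://" prefix such as http:// or https://
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")
# "host:port" optionally followed by a path, e.g. dockerhost:8000/status
_HOST_PORT = re.compile(r"^[A-Za-z0-9.-]+:\d{1,5}(/.*)?$")


def classify(text: str) -> str:
    """Guess what a user typing into a merged address/search bar wants."""
    text = text.strip()
    if not text or " " in text:
        return "search"
    if _SCHEME.match(text):
        return "navigate"
    # Trailing slash or explicit port: the user is naming a host.
    if text.endswith("/") or _HOST_PORT.match(text):
        return "navigate"
    # Assumption: dotted names are treated as hostnames.
    if "." in text:
        return "navigate"
    # Bare single word: search now, probe DNS in the background.
    return "search_and_probe"


for sample in ["internalserver", "internalserver/", "fileserver:80", "ml.net"]:
    print(f"{sample!r:20} -> {classify(sample)}")
```

             | Any fixed rule set like this will guess wrong for
             | someone: here a bare "internalserver" still triggers a
             | search, and a dotted name can never be searched without
             | an explicit override.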
        
             | NicoJuicy wrote:
             | Counterpoint, I do want to Google ml.net
        
               | Moru wrote:
               | Then you type !g ml.net
        
               | jbverschoor wrote:
               | or ?ml.net
        
               | JJMcJ wrote:
               | Or switch to Firefox, which has an address bar and a
               | separate search-only box.
        
               | trulyme wrote:
               | This. I understand why some people don't know (or care
               | about) the difference between the search box and the
               | address bar, but why engineers accept the merged input is
               | beyond me.
        
               | TylerE wrote:
               | Because it saves, on the aggregate, keystrokes.
               | 
               | For something you do dozens of times a day even a tiny
               | gain adds up over time.
        
               | ptx wrote:
               | How does it save keystrokes? I press ctrl+L to enter a
               | URL or ctrl+K to enter a search, the same number of
               | keystrokes regardless of which box I want to focus.
        
               | somehnguy wrote:
               | I was going to argue that it doesn't save keystrokes when
               | I frequently have to go back and fix the bad guess. But
               | then I realized that you're right - the time saved over
               | the years when I actually _do_ want to Google a specific
               | term likely *far* exceeds the time I've spent fixing the
               | bad guesses.
               | 
               | I guess I'll just accept the flaws that come with it.
        
               | atq2119 wrote:
               | How does it save key strokes? To use the location/awesome
               | bar you press Ctrl+L and start typing, to search you
               | press Ctrl+K and start typing...
        
               | granzymes wrote:
               | The vast majority of users won't use this keyboard
               | shortcut.
        
             | Already__Taken wrote:
             | tip: always end those names with the forward slash and it
             | will try the name not Google the word. saves typing the
             | protocol.
             | 
             | I always assumed they didn't make this easier so my mum
             | doesn't go to almostBank:8000
             | 
             | if you go to host ports a lot, e.g. localhost, you can spend
             | a quick search to the sqlite DB for your preferences to make
             | l<tab>:<port> expand out.
             | 
             | you have to edit the SQL. they "patch" the UI so it doesn't
             | accept just arbitrary string patterns to tab complete but
             | only URLs.
        
               | OkGoDoIt wrote:
               | Can you provide more details about this? That sounds
               | intriguing, but I think I'm missing something to fully
               | understand what you're talking about
        
               | eric-lee wrote:
               | I took a look at the sqlite db but I didn't see anything
               | that stood out but I might be looking at the wrong file.
               | Is it databases.db? You can get that tab-complete
               | behavior with custom search engines though
               | 
               | Keyword: l, URL: localhost:%s
               | 
               | Then it works how you're describing with l<tab> and input
               | the port. I personally define all the different versions
               | of my servers that way so v1<enter> goes to something
               | like internalserver:9001, v2<enter> goes to
               | internalserver:9002, etc.
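               | 
               | For anyone unsure what the keyword entries do, here is
               | a rough model of the expansion. It is only a sketch:
               | the table and the expand() helper are hypothetical, and
               | it does plain substitution of %s, whereas a real
               | browser may also escape the typed text.

```python
from typing import Dict, Optional

# Keyword -> URL template, mirroring the entries described above.
KEYWORDS: Dict[str, str] = {
    "l": "localhost:%s",
    "v1": "internalserver:9001",
    "v2": "internalserver:9002",
}


def expand(keyword: str, query: str = "") -> Optional[str]:
    """Return the URL for a keyword, filling %s with whatever follows it."""
    template = KEYWORDS.get(keyword)
    if template is None:
        return None  # not a known keyword; fall back to normal handling
    return template.replace("%s", query)


print(expand("l", "8000"))
print(expand("v1"))
```

               | Typing l<tab>8000 corresponds to expand("l", "8000"),
               | and v1<enter> to expand("v1").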
        
           | lkbm wrote:
           | The APNIC post[0] backs this up, saying it's the "Intranet
           | Redirect Detector". [1] has more details on how that system
           | works (which is basically what you said, but slightly more
           | in-depth).
           | 
           | [0] https://blog.apnic.net/2021/02/04/how-chromium-reduces-
           | root-...
           | 
           | [1] https://www.archyde.com/chromium-function-against-dns-
           | hijack...
        
         | shakna wrote:
         | Worth noting that this doesn't actually affect the behaviour of
         | a single user at this point in time.
         | 
         | > Thanks, Matt, that's great to see. Note that that's the
         | effect of the change in comment 37, which only affects Android.
         | The desktop change is blocked on a planned experiment in M88. A
         | wide rollout probably won't happen until late February or early
         | March, at which point we expect an additional reduction in root
         | traffic. [0]
         | 
         | They were doing the DNS interception check on Android, but a
         | different path of code was actually doing the checks (in a
         | different way). Android was basically sending out useless
         | requests.
         | 
         | [0]
         | https://bugs.chromium.org/p/chromium/issues/detail?id=109098...
        
           | treesknees wrote:
           | Which amazes me. Chrome does this check every time network
           | settings change, including a new IP. So your Android phone
           | connecting to various networks just spammed this stuff for no
           | reason. The fact that this fix dropped root server requests
           | by nearly 50% is astounding.
        
         | rasengan wrote:
         | Handshake [1] handles validation on the client side before
         | making a query.
         | 
         | [1] https://handshake.org
        
           | judge2020 wrote:
           | Can you point to the specific piece of client side code that
           | does this? It doesn't sound like it inherently fixes the
           | problem of choosing between going to "myserver" (as in,
           | http://myserver/) and "myserver" (searching the web).
        
         | dublinben wrote:
         | Normal users don't enter hostnames directly any longer. Even
         | for the most common sites like YouTube.com, Facebook.com and
         | Amazon.com they will largely search the name and then click on
         | the top search result.
         | 
         | This behaviour actually suits Google well, because their
         | incredibly lucrative "promoted results" are shown before the
         | real destination site. You can't charge an advertising tax on
         | users who browse directly to their destination.
        
           | oconnor663 wrote:
           | Normal users don't type "fa" in the browser bar and get
           | autocompleted to "facebook.com"?
        
       | dpcx wrote:
       | So is this functionality just "gone"? I remember one of the
       | original reasons for this functionality was to know if you were
       | on an ISP that performed any kind of DNS hijacking (I don't fully
       | understand the why of it, but that's separate), and I'm curious
       | if that's still available and if so, where are those queries
       | actually going now?
        
         | pas wrote:
         | Probably the long term solution is DNSSEC + DoH/DoT + trusted
         | resolvers.
        
       | tyingq wrote:
       | Timely, I suppose, since there's talk of partitioning DNS cache
       | in the browser...which will drive it back up.
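       | 
       | A toy model of why partitioning raises the number of lookups
       | that leave the browser. The visit list and function name are
       | made up for illustration, and TTL expiry is ignored.

```python
from typing import Hashable, Iterable, Set, Tuple

# (top-level site being visited, hostname that page needs resolved)
Visit = Tuple[str, str]


def upstream_lookups(visits: Iterable[Visit], partitioned: bool) -> int:
    """Count cache misses, i.e. lookups the browser must send upstream."""
    seen: Set[Hashable] = set()
    misses = 0
    for site, host in visits:
        # A partitioned cache keys on (site, host); a shared one on host only.
        key: Hashable = (site, host) if partitioned else host
        if key not in seen:
            seen.add(key)
            misses += 1
    return misses


VISITS = [
    ("news.example", "cdn.example"),
    ("shop.example", "cdn.example"),
    ("blog.example", "cdn.example"),
    ("news.example", "cdn.example"),
]

shared = upstream_lookups(VISITS, partitioned=False)
split = upstream_lookups(VISITS, partitioned=True)
print(f"shared cache: {shared} lookups, partitioned cache: {split} lookups")
```

       | Here the shared cache needs one lookup and the partitioned one
       | needs three. Whether any of the extra lookups reach the root
       | servers depends on what the recursive resolver already has
       | cached.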
        
       ___________________________________________________________________
       (page generated 2021-02-05 23:02 UTC)