[HN Gopher] How MDN's Autocomplete Search Works
       ___________________________________________________________________
        
       How MDN's Autocomplete Search Works
        
       Author : oedmarap
       Score  : 140 points
       Date   : 2021-08-03 17:04 UTC (5 hours ago)
        
 (HTM) web link (hacks.mozilla.org)
 (TXT) w3m dump (hacks.mozilla.org)
        
       | muxator wrote:
       | Now I am curious whether, on the real MDN production site, loading
       | search-index.json is triggered by the execution of
       | /static/js/autocomplete.js, when both downloads should really be
       | started in parallel by the shim.
       | 
       | Many websites leave a lot of performance on the table because of
       | such behaviors.
       | 
       | My hypothesis is that, since this is easier for the developer
       | and works well enough, not many people really care. But these
       | things add up, and the web becomes slower and slower.
        
         | peterbe wrote:
         | MDN has 2 search things: 1. client-side only which downloads a
         | complete list of all titles. 2. full-text search on everything
         | with Elasticsearch.
        
       | sa3dany wrote:
       | I remember doing something similar a few years ago. I needed
       | autocomplete for a shipping ports field, but the data was too big,
       | so I ended up using a CSV file in an AWS Lambda function that
       | filters based on the selected country and returns a much smaller
       | subset. It lazy loaded after the user selected the country. To
       | keep response times low I had to do a binary search on the raw
       | CSV bytes. It felt like I was reinventing databases, but I liked
       | the idea of it being self-contained in a function.
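
The comment does not show its code, so what follows is only a minimal Python sketch of the idea it describes: a binary search over the raw bytes of a CSV file. It assumes the file is sorted by its first column (a country code here), that rows end in `\n`, and that no field contains an embedded newline. The sample data and the names `PORTS_CSV`, `line_start`, `line_key`, and `rows_for_country` are hypothetical.

```python
# Hypothetical sample data, sorted by the first column (country code).
PORTS_CSV = (
    b"DE,Bremerhaven\n"
    b"DE,Hamburg\n"
    b"NL,Rotterdam\n"
    b"SG,Singapore\n"
    b"US,Long Beach\n"
    b"US,Los Angeles\n"
)


def line_start(data: bytes, pos: int) -> int:
    """Return the offset of the first byte of the line containing pos."""
    return data.rfind(b"\n", 0, pos) + 1


def line_key(data: bytes, start: int) -> bytes:
    """Return the first comma-separated field of the line beginning at start."""
    end = data.find(b"\n", start)
    if end == -1:
        end = len(data)
    return data[start:end].split(b",", 1)[0]


def rows_for_country(data: bytes, country: bytes) -> list[bytes]:
    """Return every row whose first field equals country, without parsing the whole file."""
    lo, hi = 0, len(data)
    # Binary search for the first line whose key is >= country.
    # lo and hi always sit on line boundaries.
    while lo < hi:
        mid = line_start(data, (lo + hi) // 2)
        if line_key(data, mid) < country:
            nxt = data.find(b"\n", mid)
            lo = len(data) if nxt == -1 else nxt + 1  # skip past this line
        else:
            hi = mid
    # Scan forward while the key still matches.
    rows = []
    pos = lo
    while pos < len(data):
        end = data.find(b"\n", pos)
        if end == -1:
            end = len(data)
        line = data[pos:end]
        if line.split(b",", 1)[0] != country:
            break
        rows.append(line)
        pos = end + 1
    return rows
```

Only the lines touched by the search and the matching rows are ever split, which is presumably what keeps response times low on a large file.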
        
       | ShrigmaMale wrote:
       | I like mosra's search, implemented in m.css for magnum. He wrote
       | a blog post on it here:
       | https://blog.magnum.graphics/meta/improved-doxygen-documenta...
       | and you can try it on the magnum docs site:
       | https://doc.magnum.graphics/magnum/#search
       | 
       | Fast and can be served from a static site.
        
       | vdm wrote:
       | https://react-spectrum.adobe.com/blog/building-a-combobox.ht...
        
       | thinkloop wrote:
       | I thought this was going to be about advanced usage of
       | <datalist>:
       | https://developer.mozilla.org/en-US/docs/Web/HTML/Element/da...
        
         | peterbe wrote:
         | <datalist> is awesome! But I find it works better for short
         | options. See
         | https://www.peterbe.com/plog/datalist-looks-great-on-mobile-...
        
         | theandrewbailey wrote:
         | Typos and fuzzy searches cause <datalist> to break.
        
       | winrid wrote:
       | We took a similar approach for our documentation search. [0]
       | 
       | You can see the "inverted index" is rendered inline in the page,
       | since everything is generated at build time.
       | 
       | When you type something that matches a key in the index, we fetch
       | that index key and add it to the results. [1] [2]
       | 
       | Obviously we could do a lot better in terms of relevancy, but
       | it's simple and fast.
       | 
       | [0] https://docs.fastcomments.com/
       | 
       | [1] https://docs.fastcomments.com/index-ublJLBnXgz88.json
       | 
       | [2] https://github.com/FastComments/fastcomments-docs/blob/main/...
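
As a rough illustration of the build-time inverted index described above, here is a minimal Python sketch. It is not the FastComments implementation: the page paths, the tokenizer, and the names `build_index` and `search` are assumptions made for the example.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages: dict[str, str]) -> dict[str, list[str]]:
    """Build step: map each word to the sorted list of pages containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return {word: sorted(urls) for word, urls in index.items()}


def search(index: dict[str, list[str]], query: str) -> list[str]:
    """Lookup step: collect pages for every query word that is a key in the index."""
    results: list[str] = []
    for word in tokenize(query):
        for url in index.get(word, []):
            if url not in results:
                results.append(url)
    return results


# Hypothetical pages, standing in for documentation generated at build time.
PAGES = {
    "/guide-installation": "Installing the comment widget",
    "/guide-moderation": "Moderating comment threads",
}
INDEX = build_index(PAGES)
```

Because a lookup is a plain dictionary access, the index can be serialized to JSON at build time and queried in the browser with no server involved. Results come back in index order, with no relevancy ranking.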
        
         | peterbe wrote:
         | Relevancy is the huge game-changer. MDN uses pageview
         | analytics to determine what a "popular" page is.
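
One way popularity could feed into title autocomplete is sketched below in Python. This is an assumption about the general technique, not MDN's code: the `Entry` fields, the substring matching, the `suggest` function, and the popularity numbers are all made up for illustration.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    title: str
    url: str
    popularity: float  # hypothetical score derived from pageviews


def suggest(entries: list[Entry], query: str, limit: int = 5) -> list[Entry]:
    """Return entries whose title contains the query, most popular first."""
    q = query.lower()
    matches = [e for e in entries if q in e.title.lower()]
    matches.sort(key=lambda e: e.popularity, reverse=True)
    return matches[:limit]


# Hypothetical entries with invented popularity values.
ENTRIES = [
    Entry("Array.prototype.flat()", "/docs/flat", 0.3),
    Entry("Map.prototype.forEach()", "/docs/map-foreach", 0.5),
    Entry("Array.prototype.forEach()", "/docs/array-foreach", 0.9),
]
```

With these numbers, a query of `foreach` lists the `Array` method ahead of the `Map` one even though both titles match equally well.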
        
           | winrid wrote:
           | Indeed, that's a great idea.
        
       | ushakov wrote:
       | i'm wondering how many KB it loads before it's ready to search?
       | 
       | update: 144KB for JSON file
       | 
       | a little bit worrying, given their scale and potential bandwidth
       | requirements
        
         | ourcat wrote:
         | For content like this, it's much easier to download the entire
         | search-index.json and run the auto-complete against that.
         | 
         | Rather that than hitting a search endpoint (after typing a
         | certain amount of characters).
        
           | simonw wrote:
           | Sadly in 2021 adding 140KB to a page isn't a big deal (given
           | how heavy the rest of the page probably is) - but it really
           | should be.
           | 
           | A large chunk of the world's population still pays a locally-
           | expensive rate for mobile bandwidth, and we're increasingly
           | leaving them behind - or worse, pushing them into zero-rating
           | internet plans which mean they can only use Facebook and
           | WhatsApp while avoiding the rest of the web:
           | https://en.wikipedia.org/wiki/Zero-rating
        
             | city41 wrote:
             | It's only added if the user shows an intent to search. And
             | if you want to search, 144kb is a decent price to pay for
             | instant search once it's downloaded
        
               | simonw wrote:
               | Oh I'd missed that - yeah loading it on-demand the first
               | time they attempt to search is a much better strategy.
        
         | hbcondo714 wrote:
         | Yeah I would think this file size will increase well over time.
         | Maybe a part 2 of the article can go over how updates to the
         | file are made when new content is published and possible
         | scaling solutions.
        
       | flerovium wrote:
       | I can't wait until FlexSearch reaches 1.0.0. Reading the source
       | code is like reading great literature.
        
         | peterbe wrote:
         | (author here) We're still on FlexSearch 0.6 and the new 0.7 is
         | a big refactor. I hope we can upgrade some time.
        
       | earthboundkid wrote:
       | I miss the old search that let me narrow things down by category.
        
         | peterbe wrote:
         | What do you miss about it? Can you not find what you're looking
         | for?
        
       | bityard wrote:
       | They hi-jacked the browser's `/` key to focus the field, which is
       | something I hate. As a user, I want `/` to bring up Firefox's
       | quick search bar, especially when reading documentation.
       | 
       | They should have just had the search field focused automatically
       | but that would have done away with their "clever" hack to lazy-
       | load the DB containing every page name.
       | 
       | Also, I'm confused, I thought https://mdn.dev/ was the new thing
       | because Mozilla was stepping back from MDN. Is it a fork? They
       | both carry Mozilla logos, so what's going on there?
        
         | thrdbndndn wrote:
         | I knew about "/", but never figured out why I should use it
         | instead of Ctrl+F. What's the difference (other than having
         | fewer features)?
        
           | jannes wrote:
           | The only difference I know of is that "/" focuses links. So
           | when you press return, it loads the link instead of jumping
           | to the next result.
           | 
           | It's quite nice for keyboard-only web navigation.
        
             | polar wrote:
             | > "/" focuses links.
             | 
             | It's ' to trigger quick find in links only mode.
        
               | kxrm wrote:
               | I had the same confusion with his comment but I think
               | what he meant was that when you highlight a result in a
               | link, pressing enter causes you to follow that link
               | (which is true). You are correct that ' focuses on only
               | searching within links though.
               | 
               | Enter never goes to the next result though, so I am not
               | sure if that is just something different between his
               | setup and mine. I have to use F3 to go to the next
               | result.
        
             | [deleted]
        
           | kxrm wrote:
           | This seems to be a good introduction to Quick Find.
           | 
           | https://www.tenforums.com/tutorials/120679-enable-disable-qu...
        
             | thrdbndndn wrote:
             | Ok, so the difference is:
             | 
             | 1. It disappears after a few seconds.
             | 
             | 2. It has no "next/previous/highlight all" etc. buttons (it
             | still has these features, just no clickable buttons)
             | 
             | It still makes no sense to me.
             | 
             | I guess maybe a small portion of people would find the
             | auto-disappearing thing useful, even though in normal
             | Ctrl+F all you need to do is press Esc.
             | 
             | But the second "feature" totally baffles me. It's not like
             | Ctrl+F is some expensive GUI to launch, why would I want to
             | _not_ have these buttons? Even if you don't need them at
             | all (I don't), you can simply not click them, there is no
             | downside to having them.
        
               | lol768 wrote:
               | Does the usual Ctrl+F GUI support filtering down to links
               | only?
        
               | mejutoco wrote:
               | You can use the single quote character to search only
               | links
        
         | est31 wrote:
         | Yeah, Discourse does the same. Sometimes I want to search
         | _within_ a post for some keyword. But ctrl+f redirects you to
         | the global search... that global search only helps if you want
         | to find interesting posts, but it does not support searching
         | inside one, nor does it allow limited search within a thread.
         | So I started using / in Discourse discussions. Then that one
         | was being overridden as well. I've heard the recommendation
         | that you turn JS off, which gives you a saner experience.
        
           | mikepurvis wrote:
           | I hate this behaviour in discourse as well, but it hadn't
           | occurred to me to try using it sans JS altogether, since it
           | seemed to be pretty dependent on it. Will give that a shot
           | for sure.
        
           | polar wrote:
           | > But ctrl+f redirects you to the global search
           | 
           | Press ctrl-f twice.
        
             | est31 wrote:
             | Oh thanks for that trick. It violates the principle of
             | least surprise so much but it does what I want. Thanks
             | again!
        
         | jraph wrote:
         | GitHub and GitLab do this too. Is there a way to prevent web
         | pages from hijacking this key? I almost never want to use their
         | search engine and when I do, I'm fine with clicking on the
         | input box.
        
         | Santosh83 wrote:
         | > Also, I'm confused, I thought https://mdn.dev/ was the new
         | thing because Mozilla was stepping back from MDN. Is it a fork?
         | They both carry Mozilla logos, so what's going on there?
         | 
         | It seems to me that mdn.dev is intended to be the future home
         | of MDN web docs since it is collaborative now, and no longer
         | exclusively managed by Mozilla. But they haven't actually made
         | the transition yet, as any link on mdn.dev points back to the
         | old (current) site at developer.mozilla.org
        
         | jxcl wrote:
         | Firefox lets you disable keyboard overrides on a per-site
         | basis, if that's something you're interested in
         | 
         | Page Info -> Permissions -> Override Keyboard Shortcuts
        
         | peterbe wrote:
         | > They hi-jacked the browser's `/` key to focus the field,
         | which is something I hate.
         | 
         | You're not the first one to point it out. Please join
         | github.com/mdn/yari to raise your voice. It's an Open Source
         | project after all.
         | 
         | > They should have just had the search field focused
         | automatically
         | 
         | Why? There's a lot of JS to load to make that work. If you
         | never need to do a search (e.g. from a Google search) it would
         | be a potential waste.
         | 
         | > Also, I'm confused, I thought https://mdn.dev/ was the new
         | thing because Mozilla was stepping back from MDN. Is it a fork?
         | 
         | That domain is just an alias we don't currently use. It's still
         | the old MDN from Mozilla. No fork.
        
           | daleharvey wrote:
           | > Why? There's a lot of JS to load to make that work. If you
           | never need to do a search (e.g. from a Google search) it
           | would be a potential waste.
           | 
           | Confused by what this comment is meant to say exactly, but
           | just in case it's not known already, it seems this situation
           | is what the autofocus attribute is for:
           | https://developer.mozilla.org/en-US/docs/Web/HTML/Global_att...
           | No JS needed.
        
         | ushakov wrote:
         | imho, they should've opted for CMD/CTRL + K, which Algolia's
         | Doc search uses
         | 
         | > They should have just had the search field focused
         | automatically but that would have done away with their "clever"
         | hack to lazy-load the DB containing every page name.
         | 
         | this would steal away the focus and is not good for
         | accessibility (unless you're building a search engine)
        
       | namanyayg wrote:
       | In the code snippet they show the `startAutocomplete()` function
       | checks for the "started" variable being true; but never actually
       | sets it to true.
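
For reference, a guard of the kind this comment describes only works if the flag is flipped on the first call. The sketch below shows that pattern in Python rather than the article's JavaScript pseudo code. The class name `Autocomplete`, the `fetch_index` callback, and the return values are hypothetical.

```python
from typing import Any, Callable


class Autocomplete:
    """Load the search index at most once, on the first sign of search intent."""

    def __init__(self, fetch_index: Callable[[], Any]) -> None:
        self._fetch_index = fetch_index  # hypothetical loader for the index
        self._started = False
        self.index: Any = None

    def start(self) -> bool:
        """Return True if this call triggered the load, False if it was a no-op."""
        if self._started:
            return False
        self._started = True  # the assignment the comment says is missing
        self.index = self._fetch_index()
        return True
```

Calling `start()` from both a focus handler and a hover handler would then fetch the index only once.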
        
         | peterbe wrote:
         | It's pseudo code. The real code is TypeScript React and looks
         | very different and it wouldn't serve the article to take
         | snippets from that code to explain how it works.
        
       | encryptluks2 wrote:
       | I think adding search to the HTML standard makes more sense
       | overall. The thing I hate about search like this is that they
       | don't work with JS turned off (e.g. terminal browser). Why not
       | just add a JSON search component to HTML itself?
        
       | mg wrote:
       | My favorite autocomplete library is an ancient version of
       | bootstrap-typeahead.js by Twitter. A single file with fewer than
       | 400 lines of JavaScript. They don't make these anymore :)
       | 
       | I use it everywhere I need autocompletion. For example on
       | the Music-Map:
       | 
       | https://www.music-map.com
       | 
       | I made a fork of the code which is available here:
       | 
       | https://www.gibney.org/0g-typeahead
        
         | ourcat wrote:
         | I did my first autocomplete search UI with that library.
         | 
         | These days, due to the rest of the project, I've been using
         | Angular and Material's Autocomplete component, which I've found
         | very easy to customise for in-memory indexes or hits to a
         | remote ElasticSearch 'suggester' proxy endpoint.
        
         | peterbe wrote:
         | Getting accessibility right is hard. We very much care about
         | that. It's one of the strong reasons we're using Downshift.
        
       ___________________________________________________________________
       (page generated 2021-08-03 23:00 UTC)