[HN Gopher] Why does searching Google for random hex lead to car...
       ___________________________________________________________________
        
       Why does searching Google for random hex lead to car dealers?
       [video]
        
       Author : bonyt
       Score  : 101 points
       Date   : 2024-05-31 14:44 UTC (8 hours ago)
        
 (HTM) web link (tmp.tonybox.net)
 (TXT) w3m dump (tmp.tonybox.net)
        
       | Terr_ wrote:
       | Weird... Maybe Google thinks that the closest inexact match is a
       | VIN number?
        
       | lambdaxyzw wrote:
       | Weird premise. I search for random hex literally all the time
       | (checking hashes and guessing algorithms as a part of my reverse
       | engineering work) and I don't remember car dealers coming up
       | especially often. I suspect it's just the author who - because of
       | their location or the previous search history - gets more
       | targeted car dealership ads.
        
         | joe_the_user wrote:
         | But the results here aren't ads - at least they appear to be
         | regular search results.
        
       | dtagames wrote:
       | Most likely some part of the string matches the VIN number.
       | Dealers are legally required to post the VIN of an actual vehicle
       | in any advertisements that have a price, as a way of preventing
       | bait-and-switch.
        
         | dawnerd wrote:
         | And yet they still bait and switch. Most recently-ish with
         | added markups not in their online price.
        
           | cratermoon wrote:
           | Or just claiming the vehicle is currently unavailable or not
           | yet for sale because it's in the shop/in use as a loaner/the
           | manager has a hold on it or some BS, but here's a very
           | similar vehicle that we'd love to unload on you!
           | 
           | It's very technically legal because they _do_ have the
           | vehicle in their inventory, and you _can_ test drive and buy
           | it, but just not right then.
        
         | OptionOfT wrote:
         | Funny, in Europe that's absolutely not the case.
         | 
         | I watched some government sale and they posted a PDF of
         | forfeited vehicles for sale.
         | 
         | The VINs were there but parts of them were blacked out.
         | 
         | It was a PDF. I copy-pasted the text behind the black box and
         | got the full VIN.
        
           | dylan604 wrote:
           | > It was a PDF. I copy-pasted the text behind the black box
           | and got the full VIN.
           | 
           | You're such a hacker. As the world turns now, I'd expect some
           | legislation that says if you copy the text from a badly
           | created PDF, then you are the one to blame and not the one
           | that made the bad document. You're clearly circumventing the
           | intent. You you...criminal.
        
           | londons_explore wrote:
           | In Europe VINs of cars are treated a little like SSNs are
           | treated in the US. Some governments assume that just because
           | you know the VIN of a vehicle, you must be its owner,
           | despite many vehicles having the VIN written on every bit of
           | glass and visible without even unlocking the car...
        
       | CommieBobDole wrote:
       | Looking at this very briefly, the results seem to always be
       | inventory pages for the dealerships, which use long strings of
       | hex or just random numbers as identifiers for the vehicles they
       | have for sale.
       | 
       | For example, a search for "ca7112b7167c15e621412c0fbc0a6c97"
       | brings up the URL "https://www.premierclearancecenterofstbernard.
       | com/inventory/...", which has a gallery of vehicles at the bottom
       | whose image names are of the format
       | "9b362510c100095f02cf3cad9e365ea6.jpg".
       | 
       | I assume something inside the Google black box is saying "well,
       | there's no exact match but this site has a bunch of strings with
       | most of the same characters, so here you go".
       | 
       | Edit: And to add to this, I'd surmise that the reason you see a
       | lot of car dealerships in these results is that they sell a lot
       | of one-offs - instead of having a list of SKUs in inventory, they
       | sell a unique vehicle just once, so the inventory systems need to
       | account for that by using long strings as item IDs and the like.
       | Also there's probably a limited number of inventory systems out
       | there, so a bunch of random dealerships are probably all using
       | the same one.
        
         | libria wrote:
         | > no exact match but this site has a bunch of strings with most
         | of the same characters
         | 
         | I suspect it's something similar, but more like partial string
         | match which may score as "close enough to display". I get
         | consistent results with the same hex string - dealerships - but
         | if I quote it (exact match), I get no matches.
        
           | rerdavies wrote:
           | I suspect there's a single word embedding for WTF_IS_THAT.
        
         | cedws wrote:
         | Back when Google search was good this query would have returned
         | no results. As it should do. Now it desperately tries to dig up
         | anything it can find just so the number of results is not zero.
         | Somebody at Google wanted to increase the search 'hit rate' KPI
         | and this is the result.
        
           | refulgentis wrote:
           | That doesn't sound right to me: Google used to suppress
           | results with string matches?
           | 
           | Why?
           | 
           | If so, would that be a good thing?
           | 
           | Why shouldn't I be able to find the vehicle via its ID?
        
             | risenshinetech wrote:
             | These aren't string matches. Check again.
        
               | refulgentis wrote:
               | Ah, doh, thank you.
        
           | duxup wrote:
           | Is there any way we can find out whether that is true?
           | 
           | I could have sworn google always was happy to return some odd
           | url matches, typically when the given results weren't great.
        
             | cedws wrote:
             | Unless you have a time machine there's only anecdotal
             | evidence, but there's plenty of it on HN. Seen many
             | comments here reporting the same thing.
        
             | beardyw wrote:
             | There was at one time a kind of game where you tried to
             | find a search term that would return only say 3 results. It
             | was hard, but some did get found.
             | 
             | Having said that I have recently had some kind of "nothing
             | found" result on several occasions. So it still happens.
             | 
             | --edit--
             | 
             | In fact I just tried "ca7112b7167c15e621412c0fbc0a6c9"
             | (omitting the last digit to avoid HN) and got:
             | 
             | Your search - "ca7112b7167c15e621412c0fbc0a6c9" - did not
             | match any documents.
             | 
             | Suggestions:
             | 
             | Make sure that all words are spelled correctly. Try
             | different keywords. Try more general keywords.
        
               | doublerabbit wrote:
               | Google-Whack as I knew it.
               | 
               | Where you tried to find a search with only one result.
               | 
               | https://en.m.wikipedia.org/wiki/Googlewhack
        
             | dylan604 wrote:
             | I've seen it come back with something along the lines of
             | "it looks like there's not a lot of matches" with some useless
             | cartoon graphic.
             | 
             | I see this a lot when searching for phone numbers. I've
             | also seen the opposite like the forced "find something no
             | matter how terrible of a match to avoid no results" as
             | being described. You search for a number and no exact
             | matches, but it returns things with different area codes
             | same prefix different numbers. Or same area code, different
             | prefix, same numbers. Or some such randomness that I can't
             | even venture a guess as to why it thought results where not
             | one number matches would be interesting to me. Unless you're
             | brave, I'd suggest not searching for random phone numbers
             | with Safe Search off as you'll find some very interesting
             | pages displayed that have absolutely nothing to do with the
             | number being searched.
        
             | rurp wrote:
             | I remember when Googlewhacks[0] used to be a thing. Zero
             | result search queries weren't interesting enough because
             | they were too easy to find.
             | 
             | [0]https://en.m.wikipedia.org/wiki/Googlewhack
        
           | noqc wrote:
           | Garbage in garbage out is fine here, no? I hate google quite
           | as much as the next person here, but this seems like a non-
           | issue. If I type in a random string, it should be assumed
           | that I'm searching for _something_.
        
             | ryanianian wrote:
             | Sometimes you really do want exactly that "random" string.
             | This is common with error messages, model numbers, build
             | hashes, etc. If I'm searching for B9GDSIGH as the model
             | number for my refrigerator, I really don't want to see
             | B9GDSIGY.
        
               | kimixa wrote:
               | But if it links to the B9GDSIG series refrigerator, which
               | has the 240v H and 120v Y subtypes, then it would be
               | correct in suggesting that?
               | 
               | Same with error messages - they often have timestamps, or
               | local object IDs/memory addresses, which you also want to
               | be fuzzy-matched.
               | 
               | I think the issue is the de-emphasis of "power" modifiers
               | for google - it's less obvious how to say "This part of
               | the string needs exact match, this can be fuzzy"
        
               | dylan604 wrote:
               | In that case, click the "must contain" link and it
               | resubmits with the query wrapped in quotes. Or, just
               | quote the query yourself on the first go if you know it
               | _must_ match
        
               | ryanianian wrote:
               | Google no longer respects quotes (and hasn't in a while).
               | It's very hard to get Google to actually say there aren't
               | any results even when in fact there are no matching
               | results.
        
               | dylan604 wrote:
               | They respect it when they submit it then, as every time
               | I've used that function to see them update the query with
               | quotes it comes back with different results. I've never
               | cared to look at the search query in the URL, so maybe
               | they also add an additional parameter that tells the
               | back end specifically to obey the quotes on this
               | resubmitting???? So at some point, the quotes aren't
               | ignored
        
               | fragmede wrote:
               | that's not my experience.
               | 
               | https://www.google.com/search?q=%22kgirbudidndijrjjr%22
               | gives me "Your search - "kgirbudidndijrjjr" - did not
               | match any documents.", at least it will until they index
               | this comment and find kgirbudidndijrjjr
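
The quoted query in the URL above can be rebuilt programmatically. A
minimal Python sketch, assuming only the `q` parameter visible in that
URL; the helper name `exact_match_url` is made up for illustration, and
whether Google honors the quotes is exactly what this thread is debating:

```python
from urllib.parse import urlencode


def exact_match_url(term: str) -> str:
    """Build a Google search URL with the term wrapped in double quotes."""
    # The double quotes are the "exact match" operator discussed above;
    # urlencode percent-encodes each quote character as %22.
    return "https://www.google.com/search?" + urlencode({"q": f'"{term}"'})


print(exact_match_url("kgirbudidndijrjjr"))
```

The printed URL should be the same one linked in the comment above.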
        
               | thfuran wrote:
               | Quotes are more like guidelines these days.
        
               | aldousd666 wrote:
               | on the advanced search, there's still the option to
               | specify that it 'must contain' something, but I'm not
               | sure if it's just a suggestion like quotes or not.
        
               | dylan604 wrote:
               | I "love" how we've reached a point where we so distrust
               | this company specifically, and dark-pattern UIs in general,
               | that we almost anticipate placebo-like buttons.
        
             | bravetraveler wrote:
             | One man's trash is another man's treasure. Search is
             | ambiguous enough by nature IMO. No liberty zone!
             | 
             | Agree with the peer - specificity matters. Model numbers
             | are a good example. I feel like I've developed a weak form
             | of dyslexia because I can't trust Google like I once did.
             | 
             | Things I want fuzzy searches for... will be presented
             | fuzzy. Not as an opaque string of usually-quoted
             | characters, but wrapped in keywords
             | 
             | A reply makes a good point - double quotes don't seem as
             | effective any more.
        
           | SquareWheel wrote:
           | If you put quotes around the string (the "exact match"
           | operator), the only results are this very thread. So it seems
           | to be working as intended.
           | 
           | Basically, you did a fuzzy search and got a fuzzy result.
           | Usually that's what people want. Quotes will let you fine-
           | tune results. Or if you want all results to be strict by
           | default, use verbatim mode. I tested that with the above
           | string and again, only this thread showed up.
        
             | underwater wrote:
             | But it's clearly not what people want. Ask any person if a
             | search for a hex encoded ID should be a fuzzy match for a
             | different ID and the answer will be no.
             | 
             | As technical people, it's easy to infer what's happening
             | under the hood and make excuses for the weirdness. But good
             | product design is about having strong opinions about what
             | should happen, and ignoring our biases around the
             | limitations of the tech or the status quo.
             | 
             | In an age where I can have an entire conversation with a
             | computer or generate a video from text the world's greatest
             | search engine still doesn't understand that you can't fuzzy
             | match an ID? It increasingly feels like Google search is
             | stuck in the past.
        
               | fragmede wrote:
               | Who is this "any person" who's searching for random hex,
               | and how much do you think they care if Google shows them
               | a car instead of whatever thing they're not even actually
               | looking for?
               | 
               | the idea that this mythical "any person" even cares about
               | the difference between a useless car result and a page
               | that says no results and then they just move on with
               | their lives is projecting a lot of your own biases onto a
               | hypothetical.
        
             | xp84 wrote:
             | I get that that's the default now, but can't help but hate
             | it. When you search for like `dog house` to have a bunch of
             | results for just house (marked "Missing: ~dog~" ) it's so
             | dumb. Why would I have typed dog unless that was important
             | to me??
        
           | shadowgovt wrote:
           | I can't tell you the number of times I've searched for random
           | serial numbers and gotten the exact product I seek. I'm glad
           | Google indexes this random crap.
        
         | confused_boner wrote:
         | Bing search results for that are interesting
        
         | shadowgovt wrote:
         | Additionally, the user is doing the search in a non-Incognito
         | session, so the system will bias based on assumption of user
         | preferences. "Hm, I see this random hex identifier in three
         | pages... Oh, but this user likes cars. Let's give 'em the car
         | result first."
        
       | joe_the_user wrote:
       | I think this is notable just because it's a result of Google now
       | having every single search result set be trying to sell you
       | something. That's different from simply having targeted ads and
       | rather disturbing.
        
         | cratermoon wrote:
         | Google is now a glorified Yellow Pages, assuming that every
         | search is a search for a business.
        
       | libria wrote:
       | Repro'd in an incognito window so it's not a history thing. 1st 3
       | of OP's strings if anyone else is experimenting (remove spaces):
       | 3344cfb4 78ead204a49b88 1da6079adf8a
       | e2c75c64 eef8087f6f36df 57
       | eb944335 73626fe9b73550 b02a651620d8
       | 
       | --
       | 
       | Shoot, depending on crawling, this may end up causing this page
       | to match. I'm injecting spaces above to deter this, but maybe
       | it'll also prove out the partial string match theory...
        
         | 1970-01-01 wrote:
         | I'm only getting back 2 results: Citi.com and FDIC.gov
         | 
         | Clicking on the 3 dots gives me this info:
         | Your search & this result
         | This result seems relevant even though this search term may
         | not appear: 3344cfb478ead204a49b881da6079adf8a
        
       | qnleigh wrote:
       | Good guesses in the comments so far: VIN number partial matches
       | and targeted search. Anyone going to test what's correct?
       | 
       | Ideas:
       | 
       | 1. VINs are 17 characters and don't contain I, O or Q, to
       | prevent confusion with the digits 1 and 0. If you throw in lots
       | of these always spaced by less than 17 characters, do you get
       | fewer hits?
       | 
       | 2. Does a VPN and/or private browsing affect the results?
       | 
       | A third possibility is that Google has a cheaper ad category for
       | search queries that they can't categorize. This doesn't explain
       | the diversity of dealerships though.
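
The first idea above can be sketched in Python. This checks only the
rule stated in the comment (17 characters, none of them I, O or Q); it
is not a full VIN validation, and the names `VIN_SHAPE` and
`vin_shaped_windows` are made up for illustration:

```python
import re

# 17 characters drawn from digits and capital letters except I, O and Q.
VIN_SHAPE = re.compile(r"[0-9A-HJ-NPR-Z]{17}")


def vin_shaped_windows(text: str) -> list[str]:
    """Return every 17-character window of text that fits the VIN alphabet."""
    upper = text.upper()
    return [
        upper[i:i + 17]
        for i in range(len(upper) - 16)
        if VIN_SHAPE.fullmatch(upper[i:i + 17])
    ]


# Hex uses only 0-9 and a-f, so every window of a hex string passes.
print(len(vin_shaped_windows("ca7112b7167c15e621412c0fbc0a6c97")))
```

Since hex digits never include I, O or Q, pure hex cannot be told apart
from a VIN by this rule alone, which is why the test proposed above
needs those letters mixed in.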
        
         | joe_the_user wrote:
         | Sure, it's matching VINs. But in the vast expanse of the net,
         | surely there are many strings of random hex out there. Why this
         | source of random digits?
        
           | bn-l wrote:
           | Probably the most authoritative sites with weird strings
        
             | cratermoon wrote:
             | "most authoritative", for financially preferred values of
             | authority.
        
       | ww520 wrote:
       | The word embeddings computed from the hex values and the car
       | dealership's inventory IDs probably have close similarity in
       | Google's vector db.
        
         | rerdavies wrote:
         | I like that theory, but with one slight modification.
         | 
         | There's a single word embedding for DARNED_IF_I_KNOW, and,
         | statistically, automobile listings outnumber other pages with
         | the DARNED_IF_I_KNOW token.
        
       | omoikane wrote:
       | I see that digits is between 10 and 19:
       | DIGITS=$((10 + $RANDOM % 10))
       | 
       | If it was always an even number, I would have expected some
       | checksum files to be matched (16 for md5sum, 20 for sha1sum,
       | etc).
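
A rough Python equivalent of the quoted shell line, for anyone wanting
to generate similar queries. It assumes the count is in bytes (two hex
characters each), which is a guess based on the strings quoted elsewhere
in this thread being 24 to 34 hex characters long; the function name
`random_hex_query` is made up for illustration:

```python
import secrets


def random_hex_query() -> tuple[int, str]:
    """Return a random byte count from 10 to 19 and that many bytes as hex."""
    # Mirrors DIGITS=$((10 + $RANDOM % 10)): a count between 10 and 19.
    digits = 10 + secrets.randbelow(10)
    # Assumption: the count is in bytes, so the string has 2 * digits chars.
    return digits, secrets.token_hex(digits)


digits, query = random_hex_query()
print(digits, len(query), query)
```

Under that assumption an MD5 digest (16 bytes, 32 hex characters) falls
inside the range, while a SHA-1 digest (20 bytes, 40 hex characters)
falls just outside it.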
        
       | jimbobthrowawy wrote:
       | I tend to get variations on cryptocurrency block explorer
       | websites mostly.
       | 
       | It's annoying when I want to search for a btih or something
       | exact.
        
       | cratermoon wrote:
       | I'm going to guess that google makes more money from car dealer
       | ads than it does from programmers searching for hex codes. Also
       | probably just because Google's search is more and more giving
       | irrelevant results.
        
       | hi-v-rocknroll wrote:
       | Perhaps Google trolls anyone in security or torrenting, and would
       | instead prefer to show CPM/CPC ads to charge instead of nothing
       | because money. /s
        
       | worik wrote:
       | Search bubble?
        
       ___________________________________________________________________
       (page generated 2024-05-31 23:00 UTC)