[HN Gopher] Why does searching Google for random hex lead to car...
___________________________________________________________________
Why does searching Google for random hex lead to car dealers?
[video]
Author : bonyt
Score : 101 points
Date : 2024-05-31 14:44 UTC (8 hours ago)
(HTM) web link (tmp.tonybox.net)
(TXT) w3m dump (tmp.tonybox.net)
| Terr_ wrote:
| Weird... Maybe Google thinks that the closest inexact match is a
| VIN number?
| lambdaxyzw wrote:
| Weird premise. I search for random hex literally all the time
| (checking hashes and guessing algorithms as a part of my reverse
| engineering work) and I don't remember car dealers coming up
| especially often. I suspect it's just the author who - because of
| their location or the previous search history - gets more
| targeted car dealership ads.
| joe_the_user wrote:
| But the results here aren't ads - at least they appear to be
| regular search results.
| dtagames wrote:
| Most likely some part of the string matches the VIN number.
| Dealers are legally required to post the VIN of an actual vehicle
| in any advertisements that have a price, as a way of preventing
| bait-and-switch.
| dawnerd wrote:
| And yet they still bait and switch. Most recently-ish with
| added markups not in their online price.
| cratermoon wrote:
| Or just claiming the vehicle is currently unavailable or not
| yet for sale because it's in the shop/in use as a loaner/the
| manager has a hold on it or some BS, but here's a very
| similar vehicle that we'd love to unload on you!
|
| It's very technically legal because they _do_ have the
| vehicle in their inventory, and you _can_ test drive and buy
| it, but just not right then.
| OptionOfT wrote:
| Funny, in Europe that's absolutely not the case.
|
| I watched a government sale where they posted a PDF of forfeited
| vehicles for sale.
|
| The VINs were there, but parts of them were blacked out.
|
| It was a PDF. I copy-pasted the text behind the black box and
| got the full VIN.
| dylan604 wrote:
| > It was a PDF. I copy-pasted the text behind the black box
| and got the full VIN.
|
| You're such a hacker. As the world turns now, I'd expect some
| legislation that says if you copy the text from a badly
| created PDF, then you are the one to blame and not the one
| that made the bad document. You're clearly circumventing the
| intent. You you...criminal.
| londons_explore wrote:
| In Europe VINs of cars are treated a little like SSNs are
| treated in the US. Some governments assume that just because
| you know the VIN of a vehicle, you must be its owner,
| despite many vehicles having the VIN written on every bit of
| glass and visible without even unlocking the car...
| CommieBobDole wrote:
| Looking at this very briefly, the results seem to always be
| inventory pages for the dealerships, which use long strings of
| hex or just random numbers as identifiers for the vehicles they
| have for sale.
|
| For example, a search for "ca7112b7167c15e621412c0fbc0a6c97"
| brings up the URL "https://www.premierclearancecenterofstbernard.
| com/inventory/...", which has a gallery of vehicles at the bottom
| whose image names are of the format
| "9b362510c100095f02cf3cad9e365ea6.jpg".
|
| I assume something inside the Google black box is saying "well,
| there's no exact match but this site has a bunch of strings with
| most of the same characters, so here you go".
|
| Edit: And to add to this, I'd surmise that the reason you see a
| lot of car dealerships in these results is that they sell a lot
| of one-offs - instead of having a list of SKUs in inventory, they
| sell a unique vehicle just once, so the inventory systems need to
| account for that by using long strings as item IDs and the like.
| Also there's probably a limited number of inventory systems out
| there, so a bunch of random dealerships are probably all using
| the same one.
| libria wrote:
| > no exact match but this site has a bunch of strings with most
| of the same characters
|
| I suspect it's something similar, but more like partial string
| match which may score as "close enough to display". I get
| consistent results with the same hex string - dealerships - but
| if I quote it (exact match), I get no matches.
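|
| A minimal sketch of the difference between the two behaviours,
| assuming a scorer based on shared character n-grams. This is an
| illustration only; nothing in the thread confirms how Google
| actually scores partial matches, and the helper names are made up.

```python
def char_ngrams(text: str, n: int = 3) -> set[str]:
    """Return the set of overlapping character n-grams in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_overlap(query: str, candidate: str, n: int = 3) -> float:
    """Jaccard similarity of the two strings' character n-gram sets."""
    a, b = char_ngrams(query, n), char_ngrams(candidate, n)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


query = "ca7112b7167c15e621412c0fbc0a6c97"     # search string quoted in the thread
image_id = "9b362510c100095f02cf3cad9e365ea6"  # image name quoted in the thread

exact = query == image_id                      # what a quoted search demands
partial = ngram_overlap(query, image_id, n=2)  # what a fuzzy scorer might see
print(f"exact={exact} bigram_overlap={partial:.2f}")
```

| The exact comparison fails outright, while the overlap score is
| some value below 1 that a ranker could still treat as "close
| enough to display" if its threshold were low.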
| rerdavies wrote:
| I suspect there's a single word embedding for WTF_IS_THAT.
| cedws wrote:
| Back when Google search was good this query would have returned
| no results. As it should do. Now it desperately tries to dig up
| anything it can find just so the number of results is not zero.
| Somebody at Google wanted to increase the search 'hit rate' KPI
| and this is the result.
| refulgentis wrote:
| That doesn't sound right to me: Google used to suppress
| results with string matches?
|
| Why?
|
| If so, would that be a good thing?
|
| Why shouldn't I be able to find the vehicle via its ID?
| risenshinetech wrote:
| These aren't string matches. Check again.
| refulgentis wrote:
| Ah, doh, thank you.
| duxup wrote:
| Is there any way we can find out whether that is true?
|
| I could have sworn google always was happy to return some odd
| url matches, typically when the given results weren't great.
| cedws wrote:
| Unless you have a time machine there's only anecdotal
| evidence, but there's plenty of it on HN. Seen many
| comments here reporting the same thing.
| beardyw wrote:
| There was at one time a kind of game where you tried to
| find a search term that would return only say 3 results. It
| was hard, but some did get found.
|
| Having said that I have recently had some kind of "nothing
| found" result on several occasions. So it still happens.
|
| --edit--
|
| In fact I just tried "ca7112b7167c15e621412c0fbc0a6c9"
| (omitting the last digit to avoid HN) and got:
|
| Your search - "ca7112b7167c15e621412c0fbc0a6c9" - did not
| match any documents.
|
| Suggestions:
|
| Make sure that all words are spelled correctly. Try
| different keywords. Try more general keywords.
| doublerabbit wrote:
| Google-Whack as I knew it.
|
| Where you tried to find a search with only one result.
|
| https://en.m.wikipedia.org/wiki/Googlewhack
| dylan604 wrote:
| I've seen it come back with something along the lines of
| "it looks like there's not a lot of matches" with some useless
| cartoon graphic.
|
| I see this a lot when searching for phone numbers. I've
| also seen the opposite: the forced "find something, no
| matter how terrible the match, to avoid no results" behavior
| being described here. You search for a number and get no exact
| matches, but it returns things with a different area code,
| same prefix, different numbers. Or same area code, different
| prefix, same numbers. Or some such randomness where I can't
| even venture a guess as to why it thought a result in which
| not one number matches would be interesting to me. Unless you're
| brave, I'd suggest not searching for random phone numbers
| with Safe Search off as you'll find some very interesting
| pages displayed that have absolutely nothing to do with the
| number being searched.
| rurp wrote:
| I remember when Googlewhacks[0] used to be a thing. Zero
| result search queries weren't interesting enough because
| they were too easy to find.
|
| [0]https://en.m.wikipedia.org/wiki/Googlewhack
| noqc wrote:
| Garbage in garbage out is fine here, no? I hate google quite
| as much as the next person here, but this seems like a non-
| issue. If I type in a random string, it should be assumed
| that I'm searching for _something_.
| ryanianian wrote:
| Sometimes you really do want exactly that "random" string.
| This is common with error messages, model numbers, build
| hashes, etc. If I'm searching for B9GDSIGH as the model
| number for my refrigerator, I really don't want to see
| B9GDSIGY.
| kimixa wrote:
| But if it links to the B9GDSIG series refrigerator, which
| has the 240v H and 120v Y subtypes, then it would be
| correct in suggesting that?
|
| Same with error messages - they often have timestamps, or
| local object IDs/memory addresses, which you also want to
| be fuzzy-matched.
|
| I think the issue is the de-emphasis of "power" modifiers
| for google - it's less obvious how to say "This part of
| the string needs exact match, this can be fuzzy"
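|
| One way to picture "this part exact, this part fuzzy" is a
| pattern with a fixed prefix and a free suffix. The sketch below
| assumes the model numbers from the example above and a single
| trailing variant letter; both are illustrative assumptions.

```python
import re

# Hypothetical series prefix, taken from the refrigerator example above.
SERIES = "B9GDSIG"

# Prefix must match exactly; the final variant letter is allowed to differ.
SERIES_PATTERN = re.compile(re.escape(SERIES) + r"[A-Z]")


def same_series(model: str) -> bool:
    """True if model is SERIES followed by exactly one variant letter."""
    return SERIES_PATTERN.fullmatch(model) is not None


def exact(model: str, wanted: str) -> bool:
    """True only for a character-for-character match."""
    return model == wanted


print(same_series("B9GDSIGH"), same_series("B9GDSIGY"))  # both in the series
print(exact("B9GDSIGY", "B9GDSIGH"))                     # not the same model
```

| A search UI would still need some way for the user to say which
| part is the fixed prefix, which is the point being made above.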
| dylan604 wrote:
| In that case, click the "must contain" link and it
| resubmits with the query wrapped in quotes. Or, just
| quote the query yourself on the first go if you know it
| _must_ match
| ryanianian wrote:
| Google no longer respects quotes (and hasn't for a while).
| It's very hard to get Google to actually say there aren't
| any results even when in fact there are no matching
| results.
| dylan604 wrote:
| They respect them when they add them, then: every time
| I've used that function and watched them update the query with
| quotes, it comes back with different results. I've never
| cared to look at the search query in the URL, so maybe
| they also add an additional parameter that tells the
| back end specifically to obey the quotes on this
| resubmission? So at some point, the quotes aren't
| ignored.
| fragmede wrote:
| that's not my experience.
|
| https://www.google.com/search?q=%22kgirbudidndijrjjr%22
| gives me "Your search - "kgirbudidndijrjjr" - did not
| match any documents.", at least it will until they index
| this comment and find kgirbudidndijrjjr
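|
| The %22 in that URL is just a percent-encoded double quote. A
| small sketch of building such an exact-match URL with Python's
| standard library (the helper name is made up for illustration):

```python
from urllib.parse import urlencode


def exact_match_url(term: str) -> str:
    """Build a Google search URL with the term wrapped in double quotes."""
    return "https://www.google.com/search?" + urlencode({"q": f'"{term}"'})


url = exact_match_url("kgirbudidndijrjjr")
print(url)  # https://www.google.com/search?q=%22kgirbudidndijrjjr%22
```

| Whether the back end then honours the quotes is a separate
| question, and the one being argued about here.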
| thfuran wrote:
| Quotes are more like guidelines these days.
| aldousd666 wrote:
| on the advanced search, there's still the option to
| specify that it 'must contain' something, but I'm not
| sure if it's just a suggestion like quotes or not.
| dylan604 wrote:
| I "love" how we've reached a point where we so distrust
| this company specifically, and dark-pattern UIs in general,
| that we almost anticipate placebo-like buttons.
| bravetraveler wrote:
| One man's trash is another man's treasure. Search is
| ambiguous enough by nature IMO. No liberty zone!
|
| Agree with the peer - specificity matters. Model numbers
| are a good example. I feel like I've developed a weak form
| of dyslexia because I can't trust Google like I once did.
|
| Things I want fuzzy searches for... will be presented
| fuzzy. Not as an opaque string of usually-quoted
| characters, but wrapped in keywords
|
| A reply makes a good point - double quotes don't seem as
| effective any more.
| SquareWheel wrote:
| If you put quotes around the string (the "exact match"
| operator), the only results are this very thread. So it seems
| to be working as intended.
|
| Basically, you did a fuzzy search and got a fuzzy result.
| Usually that's what people want. Quotes will let you fine-
| tune results. Or if you want all results to be strict by
| default, use verbatim mode. I tested that with the above
| string and again, only this thread showed up.
| underwater wrote:
| But it's clearly not what people want. Ask any person if a
| search for a hex encoded ID should be a fuzzy match for a
| different ID and the answer will be no.
|
| As technical people, it's easy to infer what's happening
| under the hood and make excuses for the weirdness. But good
| product design is about having strong opinions about what
| should happen, and ignoring our biases around the
| limitations of the tech or the status quo.
|
| In an age where I can have an entire conversation with a
| computer or generate a video from text the world's greatest
| search engine still doesn't understand that you can't fuzzy
| match an ID? It increasingly feels like Google search is
| stuck in the past.
| fragmede wrote:
| Who is this "any person" who's searching for random hex,
| and how much do you think they care if Google shows them
| a car instead of whatever thing they're not even actually
| looking for?
|
| the idea that this mythical "any person" even cares about
| the difference between a useless car result and a page
| that says no results and then they just move on with
| their lives is projecting a lot of your own biases onto a
| hypothetical.
| xp84 wrote:
| I get that that's the default now, but can't help but hate
| it. When you search for like `dog house` to have a bunch of
| results for just house (marked "Missing: ~dog~" ) it's so
| dumb. Why would I have typed dog unless that was important
| to me??
| shadowgovt wrote:
| I can't tell you the number of times I've searched for random
| serial numbers and gotten the exact product I seek. I'm glad
| Google indexes this random crap.
| confused_boner wrote:
| Bing search results for that are interesting
| shadowgovt wrote:
| Additionally, the user is doing the search in a non-Incognito
| session, so the system will bias based on assumption of user
| preferences. "Hm, I see this random hex identifier in three
| pages... Oh, but this user likes cars. Let's give 'em the car
| result first."
| joe_the_user wrote:
| I think this is notable just because it's a result of Google now
| having every single search result set be trying to sell you
| something. That's different from simply having targeted ads and
| rather disturbing.
| cratermoon wrote:
| Google is now a glorified Yellow Pages, assuming that every
| search is a search for a business.
| libria wrote:
| Repro'd in an incognito window so it's not a history thing. 1st 3
| of OP's strings if anyone else is experimenting (remove spaces):
| 3344cfb4 78ead204a49b88 1da6079adf8a e2c75c64
| eef8087f6f36df 57 eb944335 73626fe9b73550 b02a651620d8
|
| --
|
| Shoot, depending on crawling, this may end up causing this page
| to match. I'm injecting spaces above to deter this, but maybe
| it'll also prove out the partial string match theory...
| 1970-01-01 wrote:
| I'm only getting back 2 results: Citi.com and FDIC.gov
|
| Clicking on the 3 dots gives me this info:
| Your search & this result
| This result seems relevant even though this search term may
| not appear: 3344cfb478ead204a49b881da6079adf8a
| qnleigh wrote:
| Good guesses in the comments so far: VIN number partial matches
| and targeted search. Anyone going to test what's correct?
|
| Ideas: 1. VINs are 17 characters and don't contain I, O or
| Q, to prevent confusion with other letters. If you throw in lots
| of these always spaced by less than 17 characters, do you get
| fewer hits?
|
| 2. Does a VPN and/or private browsing affect the results?
|
| A third possibility is that Google has a cheaper ad category for
| search queries that they can't categorize. This doesn't explain
| the diversity of dealerships though.
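|
| Idea 1 can be prepared offline before touching a search engine.
| The sketch below assumes only the VIN rules stated above (17
| characters, no I, O or Q); the helper names and the choice of
| "Q" as a filler are hypothetical. Note that a pure hex string
| never contains I, O or Q, so any 17-character stretch of it
| already fits the VIN alphabet.

```python
import string

# VIN alphabet as described above: digits and capital letters except I, O, Q.
VIN_CHARS = set(string.digits + string.ascii_uppercase) - set("IOQ")
VIN_LENGTH = 17


def has_vin_like_window(text: str) -> bool:
    """True if any 17 consecutive characters all fall in the VIN alphabet."""
    run = 0
    for ch in text.upper():
        run = run + 1 if ch in VIN_CHARS else 0
        if run >= VIN_LENGTH:
            return True
    return False


def break_vin_windows(text: str, filler: str = "Q") -> str:
    """Insert a non-VIN letter after every 16 characters (hypothetical helper)."""
    step = VIN_LENGTH - 1
    chunks = [text[i:i + step] for i in range(0, len(text), step)]
    return filler.join(chunks)


sample = "ca7112b7167c15e621412c0fbc0a6c97"  # hex string quoted in the thread
print(has_vin_like_window(sample))                     # True
print(has_vin_like_window(break_vin_windows(sample)))  # False
```

| Searching for the "broken" variant and comparing the hit count
| with the original would be one way to run the experiment.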
| joe_the_user wrote:
| Sure, it's matching VINs. But in the vast expanse of the net,
| surely there are many strings of random hex out there. Why this
| source of random digits?
| bn-l wrote:
| Probably the most authoritative sites with weird strings
| cratermoon wrote:
| "most authoritative", for financially preferred values of
| authority.
| ww520 wrote:
| The word embeddings computed from the hex values and the car
| dealership's inventory IDs probably have close similarity in
| Google's vector db.
| rerdavies wrote:
| I like that theory, but with one slight modification.
|
| There's a single word embedding for DARNED_IF_I_KNOW, and,
| statistically, automobile listings outnumber other pages with
| the DARNED_IF_I_KNOW token.
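|
| A toy sketch of that theory, assuming a word-level vocabulary in
| which anything unrecognised collapses to one "<unk>" id. The
| vocabulary is invented for illustration; real tokenizers usually
| split unknown strings into subword pieces, and Google's are not
| public.

```python
# Toy vocabulary (hypothetical); id 0 plays the role of DARNED_IF_I_KNOW.
VOCAB = {"<unk>": 0, "car": 1, "dealer": 2, "inventory": 3}


def token_ids(text: str) -> list[int]:
    """Map whitespace-separated words to ids, sending unknown words to <unk>."""
    return [VOCAB.get(word.lower(), VOCAB["<unk>"]) for word in text.split()]


a = token_ids("ca7112b7167c15e621412c0fbc0a6c97")
b = token_ids("9b362510c100095f02cf3cad9e365ea6")
print(a, b, a == b)  # [0] [0] True
```

| Under this assumption two unrelated hex strings look identical
| to the model, so whichever pages contain the most unknown
| tokens would tend to win.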
| omoikane wrote:
| I see that digits is between 10 and 19:
| DIGITS=$((10 + $RANDOM % 10))
|
| If it was always an even number, I would have expected some
| checksum files to be matched (16 for md5sum, 20 for sha1sum,
| etc).
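|
| A sketch of the length arithmetic, assuming DIGITS counts bytes
| and each byte is printed as two hex characters. The thread does
| not show how the original script uses DIGITS, so that reading is
| an assumption, and the Python below is a stand-in for the shell
| script rather than a copy of it.

```python
import hashlib
import secrets


def random_hex(num_bytes: int) -> str:
    """Return num_bytes of randomness as 2 * num_bytes hex characters."""
    return secrets.token_hex(num_bytes)


# Mirrors DIGITS=$((10 + $RANDOM % 10)): an integer from 10 to 19 inclusive.
digits = 10 + secrets.randbelow(10)
sample = random_hex(digits)

md5_bytes = hashlib.md5().digest_size    # 16 bytes, 32 hex characters
sha1_bytes = hashlib.sha1().digest_size  # 20 bytes, 40 hex characters
print(digits, len(sample), md5_bytes, sha1_bytes)
```

| Under that reading, an MD5-length string (16 bytes) falls inside
| the 10 to 19 range, while a SHA-1-length string (20 bytes) falls
| just outside it.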
| jimbobthrowawy wrote:
| I tend to get variations on cryptocurrency block explorer
| websites mostly.
|
| It's annoying when I want to search for a btih or something
| exact.
| cratermoon wrote:
| I'm going to guess that Google makes more money from car dealer
| ads than it does from programmers searching for hex codes. Also
| probably just because Google's search is increasingly giving
| irrelevant results.
| hi-v-rocknroll wrote:
| Perhaps Google trolls anyone in security or torrenting, and would
| instead prefer to show CPM/CPC ads to charge instead of nothing
| because money. /s
| worik wrote:
| Search bubble?
___________________________________________________________________
(page generated 2024-05-31 23:00 UTC)