[HN Gopher] Reverse Engineering Vercel's BotID
___________________________________________________________________
Reverse Engineering Vercel's BotID
Author : hazebooth
Score : 84 points
Date : 2025-06-30 12:19 UTC (10 hours ago)
(HTM) web link (www.nullpt.rs)
| codedokode wrote:
| Note that the bot detection script uses WebGL to obtain GPU name.
| I assume this (fingerprinting) is the most popular use of WebGL.
| Sad that independent browsers like Firefox do not supply fake
| values.
| nullpt_rs wrote:
| Sadly, spoofing GPU vendor & renderer can be an even larger
| flag since they can hash the resulting image of the canvas to
| compare it with a database of collected fingerprints[0]
|
| [0]: https://research.google/pubs/picasso-lightweight-device-
| clas...
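A rough sketch of the consistency check described in the comment above. It is an illustration only, not how BotID or the linked paper actually works: the `KNOWN_RENDER_HASHES` table, the helper names, and the sample pixel bytes are all hypothetical. The idea is that a canvas render hashes to a value previously collected from a particular GPU family, so a spoofed renderer string that disagrees with the hash stands out.

```python
import hashlib


def render_hash(pixels: bytes) -> str:
    """Hash the raw pixel bytes read back from a rendered canvas challenge."""
    return hashlib.sha256(pixels).hexdigest()


# Made-up pixel buffers standing in for renders collected from real devices.
APPLE_PIXELS = bytes([10, 20, 30, 255] * 4)
NVIDIA_PIXELS = bytes([10, 21, 30, 255] * 4)

# Hypothetical database: render hash -> GPU family seen producing that render.
KNOWN_RENDER_HASHES = {
    render_hash(APPLE_PIXELS): "apple",
    render_hash(NVIDIA_PIXELS): "nvidia",
}


def check_renderer(claimed_renderer: str, pixels: bytes) -> str:
    """Compare the renderer string the client reports with what its render implies."""
    family = KNOWN_RENDER_HASHES.get(render_hash(pixels))
    if family is None:
        return "unknown"  # never-seen render, no conclusion either way
    if family in claimed_renderer.lower():
        return "consistent"
    return "mismatch"  # reported GPU does not match the observed render
```

Under these assumptions, a client that reports an NVIDIA renderer but produces the render recorded for Apple GPUs is classified as "mismatch", which is why spoofing only the string can make a client more conspicuous.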
| reaperducer wrote:
| Until a major player gets on board. Then it works.
|
| Apple does this by sending an imposter user agent from Safari
| on iPads.
|
| If only that was expanded to iPhones, too. And then send
| rotating, or randomized user agents.
| nerdsniper wrote:
| Apple does it because they don't have a vested financial
| interest in internet-wide tracking.
|
| Google does.
|
| And while Mozilla does too because the vast majority of
| their funding comes from Google, it's more pertinent that
| they don't have the market share to pull this off. Firefox
| would just stop working on major websites if they did this.
| andrewmcwatters wrote:
| It's funny that trying to click on the Google Scholar link
| there falsely identifies me as a bot.
| grishka wrote:
| IMO the use of <canvas> needs to be behind a permission prompt,
| the same as e.g. geolocation or WebRTC. Few websites _actually
| need_ canvas/WebGL for legitimate purposes.
| ATechGuy wrote:
| > At the moment, it seems Basic mode is so basic that it allows
| everything to pass as human. That'll likely change as they gather
| more telemetry to better identify what a bot signal looks like.
|
| So they are basically collecting telemetry in the name of a "free
| basic anti-bot" solution.
| cchance wrote:
| free basic anti-bot solution that literally NEVER BLOCKS A BOT,
| like what the actual fuck
| b0a04gl wrote:
| why is bot detection even happening at render time instead of
| request time? why can't they tell you're a bot from your headers,
| UA, IP, TLS fingerprint? imo this makes it surveillance: 'you're a
| bot, ok, not just "go away", let's fingerprint your GPU and assign
| you a behavioral risk score anyway'
| n2d4 wrote:
| It's really hard to detect it at request time. It's practically
| trivial for an attacker to fake headers to resemble a real
| browser.
| indrora wrote:
| Anubis does it pretty decently.
| baby_souffle wrote:
| You absolutely have options at request time. Arguably, some
| of the things you can only do at request time are part of a
| full and complete mitigation strategy.
|
| You can fingerprint the originating TCP stack with some
| degree of confidence. If the request looks like it came from
| a Linux server but the user agent says Windows, that's a
| signal.
|
| Likewise, the IP address making the request has geographic
| information associated with it. If my IP address says I'm in
| Romania but my browser is asking for the English language
| version of the page... That's a signal.
|
| Similar to basic IP/Geo, you can do DNS and STUN based
| profiling, too. This helps you catch people that are behind
| proxies or VPNs.
|
| To blur the line, you can use JavaScript to measure request
| timing. Proxies that are going to tamper with the request to
| hide its origins or change its fingerprint will add a
| measurable latency.
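Two of the request-time checks from the comment above, sketched under stated assumptions. The `tcp_os_guess` value is assumed to come from some passive TCP fingerprinting step and `ip_country` from a GeoIP lookup; neither is implemented here. The `COUNTRY_LANGUAGES` table and the signal names are hypothetical.

```python
# Hypothetical table: country code -> languages commonly expected from that country.
COUNTRY_LANGUAGES = {"RO": {"ro"}, "US": {"en"}, "DE": {"de"}}


def os_from_user_agent(user_agent: str) -> str:
    """Very rough OS guess from a User-Agent string."""
    ua = user_agent.lower()
    if "windows" in ua:
        return "windows"
    if "iphone" in ua or "ipad" in ua:  # iOS UAs also contain "Mac OS X"
        return "ios"
    if "android" in ua:  # Android UAs also contain "Linux"
        return "android"
    if "mac os x" in ua or "macintosh" in ua:
        return "macos"
    if "linux" in ua:
        return "linux"
    return "unknown"


def request_signals(user_agent: str, tcp_os_guess: str,
                    ip_country: str, accept_language: str) -> list[str]:
    """Return the names of the signals raised by one request."""
    signals = []

    # Signal 1: the TCP stack looks like one OS, the User-Agent claims another.
    ua_os = os_from_user_agent(user_agent)
    if "unknown" not in (ua_os, tcp_os_guess) and ua_os != tcp_os_guess:
        signals.append("os_mismatch")

    # Signal 2: the first language requested is unusual for the IP's country.
    first_tag = accept_language.split(",")[0].split(";")[0].strip().lower()
    primary = first_tag.split("-")[0]
    expected = COUNTRY_LANGUAGES.get(ip_country)
    if expected and primary and primary not in expected:
        signals.append("language_mismatch")

    return signals
```

Each entry is only a hint: a Windows User-Agent arriving over a Linux-looking TCP stack may be a proxy, and an English speaker in Romania raises `language_mismatch` without being a bot.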
| n2d4 wrote:
| None of these are conclusive by any means. The IP address
| check you mentioned would mark anyone using a VPN, or
| English speakers living abroad. Modern bot detection
| combines lots of heuristics like these together, and being
| able to run JavaScript in the browser (at render-time) adds
| a lot more data that can be used to make a better
| prediction.
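One way to picture "combining lots of heuristics" is a weighted score squashed into a probability. This is a minimal sketch, not any vendor's model: the signal names, weights, and bias below are invented for illustration, and a real system would presumably learn them from data.

```python
import math

# Hypothetical weights; larger values push harder toward "bot".
SIGNAL_WEIGHTS = {
    "os_mismatch": 2.0,        # request-time: TCP stack vs User-Agent
    "language_mismatch": 0.3,  # request-time: weak, many false positives
    "datacenter_ip": 1.5,      # request-time: IP belongs to a hosting provider
    "webdriver_flag": 3.0,     # render-time: reported by in-browser JavaScript
}
BIAS = -3.0  # with no signals raised, the estimate stays low


def bot_probability(signals: set[str]) -> float:
    """Logistic combination of whichever signals were raised."""
    score = BIAS + sum(SIGNAL_WEIGHTS.get(name, 0.0) for name in signals)
    return 1.0 / (1.0 + math.exp(-score))
```

With these made-up numbers, a lone `language_mismatch` barely moves the estimate, while several strong signals together push it close to 1. That matches the point above: no single check is conclusive, and render-time data simply adds more inputs.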
| cAtte_ wrote:
| > If my IP address says I'm in Romania but my browser is
| asking for the English language version of the page...
| That's a signal.
|
| jesus christ don't give them ideas. it's annoying enough to
| have my country's language forced on me (i prefer english)
| when there's a perfectly good http header for that. now
| _blocking_ me based on this?!
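The header referred to above is `Accept-Language`. A simplified sketch of honoring it instead of guessing from the IP address follows; the function names are hypothetical, and the parsing handles only plain tags and `q=` values, not every detail of the HTTP specification.

```python
def parse_accept_language(header: str) -> list[str]:
    """Return language tags ordered by descending q-value (q defaults to 1.0)."""
    entries = []
    for position, part in enumerate(header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        params = params.strip()
        quality = 1.0
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat an unparseable q-value as "not acceptable"
        if quality > 0:
            # Sort by quality first, then by original position to keep ties stable.
            entries.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(entries)]


def pick_language(header: str, available: list[str], default: str = "en") -> str:
    """Pick the first requested language the site can serve, else the default."""
    for tag in parse_accept_language(header):
        base = tag.split("-")[0]  # fall back from "en-us" to "en"
        for candidate in (tag, base):
            if candidate in available:
                return candidate
    return default
```

For example, `pick_language("en-US,en;q=0.9", ["ro", "en"])` selects `"en"` regardless of where the request's IP address is located.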
___________________________________________________________________
(page generated 2025-06-30 23:01 UTC)