[HN Gopher] OpenStreetMap overwhelmed by bots scraping data
___________________________________________________________________
OpenStreetMap overwhelmed by bots scraping data
Author : molly_radstowe
Score : 24 points
Date : 2026-01-28 01:23 UTC (5 hours ago)
(HTM) web link (twitter.com)
(TXT) w3m dump (twitter.com)
| molly_radstowe wrote:
| #OpenStreetMap hammered by scrapers hiding behind residential
| proxy/embedded-SDK networks.
| direwolf20 wrote:
| More like hammered by Google and Apple so you'll use their apps
| instead.
| Bender wrote:
| Looks like it is hosted in Equinix in NL? Or just part of it
| maybe? Is it behind a load balancer, maybe something like
| HAProxy? If so, were stick tables set up to limit rates by
| cookie, require people to be logged in on unique accounts, and
| limit anonymous access after so many requests? I know limiting
| anonymous access is not great, but it is something that could
| be enabled under high load so that, instead of the site
| going offline for everyone, it would just be limited for the
| anonymous users. _Degradation vs critical outage_
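
The "degradation vs critical outage" idea above can be sketched in a few lines. This is not HAProxy configuration and not anything OpenStreetMap is known to run; it is a minimal Python illustration of a per-client sliding-window counter that only throttles anonymous clients while a high-load flag is set. The class name, the limit of 60 requests, and the 60-second window are all assumptions made for the example.

```python
import time
from collections import defaultdict, deque


class DegradingLimiter:
    """Hypothetical sliding-window counter that caps anonymous clients only under load."""

    def __init__(self, anon_limit=60, window_s=60.0, clock=time.monotonic):
        self.anon_limit = anon_limit      # max anonymous requests per window (assumed value)
        self.window_s = window_s          # window length in seconds (assumed value)
        self.clock = clock                # injectable clock, which makes testing easy
        self.hits = defaultdict(deque)    # client key -> timestamps of recent requests

    def allow(self, client_key, logged_in, high_load):
        now = self.clock()
        recent = self.hits[client_key]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window_s:
            recent.popleft()
        recent.append(now)
        # Normal operation: nobody is limited.
        if not high_load:
            return True
        # Under load: logged-in users keep access, anonymous users are capped.
        if logged_in:
            return True
        return len(recent) <= self.anon_limit
```

The client key could be a cookie value, as the comment suggests, or anything else that identifies a client; how the high-load flag gets set is left open here.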
|
| On a separate note, have tcpdump captures been done on these
| excessive connections? Minus the IP, what do their SYN packets
| look like? Minus the IP, what do the corresponding log entries
| look like in the web server? Are they using HTTP/1.1 or
| HTTP/2? Are they missing any headers expected from a real
| person's browser, such as cors, no-cors, navigate, accept_language?
| tcpdump -p --dont-verify-checksums -i any -NNnnvvv -B32768 -c32
| -s0 port 443 and 'tcp[13] == 2'
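
For readers unfamiliar with the `tcp[13] == 2` filter in the command above: byte 13 of the TCP header holds the flag bits, and the value 2 (0x02) means only SYN is set, so the capture keeps connection-opening packets and skips SYN+ACK replies. A small Python sketch of the same test follows; the function names and the synthetic header builder are made up for illustration.

```python
import struct

TCP_SYN = 0x02  # SYN is bit 1 of the flags byte


def is_initial_syn(tcp_header: bytes) -> bool:
    """Mirror of the filter 'tcp[13] == 2': the flags byte is exactly SYN."""
    if len(tcp_header) < 14:
        raise ValueError("need at least 14 bytes of TCP header")
    return tcp_header[13] == TCP_SYN


def build_header(flags: int, src_port: int = 40000, dst_port: int = 443) -> bytes:
    """Build a minimal 20-byte TCP header (hypothetical helper, for demonstration only)."""
    data_offset = 5 << 4  # header length of 5 32-bit words, stored in the high nibble
    return struct.pack(
        "!HHIIBBHHH",
        src_port, dst_port,   # ports
        0, 0,                 # sequence and acknowledgement numbers
        data_offset, flags,   # bytes 12 and 13 of the header
        65535, 0, 0,          # window, checksum, urgent pointer
    )
```

Here `is_initial_syn(build_header(0x02))` is true, while a SYN+ACK header (`0x12`) is rejected, which is the behaviour the capture filter relies on.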
|
| Is there someone at OpenStreetMap that can answer these
| questions?
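
One way to read the header question above is that "cors", "no-cors" and "navigate" are values of the `Sec-Fetch-Mode` request header that current browsers send, alongside `Accept-Language`. That reading is an assumption, as is everything in the sketch below: the function name is hypothetical, and a missing header is only a weak signal, since legitimate API clients and older browsers omit these too.

```python
# Values defined for Sec-Fetch-Mode by the Fetch Metadata specification.
EXPECTED_FETCH_MODES = {"cors", "no-cors", "navigate", "same-origin", "websocket"}


def missing_browser_signals(headers):
    """Return the names of browser-typical headers that are absent or unusable.

    `headers` is a plain dict of request header names to string values.
    """
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    missing = []
    if not lowered.get("accept-language"):
        missing.append("accept-language")
    mode = lowered.get("sec-fetch-mode", "").lower()
    if mode not in EXPECTED_FETCH_MODES:
        missing.append("sec-fetch-mode")
    return missing
```

A result like this would more plausibly feed a score or a log field than an outright block, given how easily scrapers can copy a browser's headers.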
| KomoD wrote:
| I think it could be worth trying to block them with TLS
| fingerprinting, or since they think it's residential proxies
| they are being hammered by, https://spur.us could be worth a
| try.
| phillipseamore wrote:
| The number of idiotic vibe coded repos I've seen on GH lately
| that are doing things like crawling OSM for POI data is
| mindboggling!
| CqtGLRGcukpy wrote:
| https://xcancel.com/openstreetmap/status/2016320492420878531
|
| https://nitter.poast.org/openstreetmap/status/20163204924208...
| CqtGLRGcukpy wrote:
| They also posted about this on Mastodon / Fedi:
| https://en.osm.town/@osm_tech/115968544599864782
| dzhiurgis wrote:
| I'll ask a dumb question: if they are "open source", why are
| they bothered by it? Is it the scraping itself? Is their data
| not freely available for download?
| wodenokoto wrote:
| Someone has to pay for bandwidth. And that someone would like
| the bandwidth to go to human users.
| zeeZ wrote:
| Their data is freely available to download. There are weekly
| dumps of the entire planet and several sources for partial
| data. There's no need for most legitimate use cases to scrape
| their API.
| solaris2007 wrote:
| Make the data available through BitTorrent and IPFS. Redirect
| IPs that make excessive requests to a response only kilobytes
| in size: "use the torrents and IPFS".
|
| As an SRE, the only legitimate concern here could be the
| bandwidth costs. But QoS tuning should solve that too.
|
| Supposedly technical people crying out for a journalist to help
| them is super lame. Everything about this looks super lame.
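
The "tiny response for heavy requesters" suggestion in this comment could look roughly like the sketch below. It is only an illustration: the comment says "redirect", while this example answers with HTTP 429 and a `Retry-After` header, which is a substitution on my part, and the function name and one-hour default are assumptions. The URL is the planet download page linked elsewhere in this thread.

```python
BULK_URL = "https://planet.openstreetmap.org/"  # bulk download page linked in the thread


def over_quota_response(retry_after_s=3600):
    """Build a small plain-text reply for clients that exceeded a request budget.

    Returns (status, headers, body). Status 429 and the one-hour default are
    choices made for this example, not anything OpenStreetMap has announced.
    """
    body = (
        "Too many requests. Bulk data, including torrents, is available at "
        f"{BULK_URL}\n"
    ).encode("utf-8")
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        "Content-Length": str(len(body)),
        "Retry-After": str(retry_after_s),
    }
    return 429, headers, body
```

The whole reply is well under a kilobyte, so serving it costs far less bandwidth than the API response it replaces.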
| zeeZ wrote:
| That data is already available. Including torrents.
|
| https://planet.openstreetmap.org/
___________________________________________________________________
(page generated 2026-01-28 07:01 UTC)