[HN Gopher] OpenStreetMap overwhelmed by bots scraping data
       ___________________________________________________________________
        
       OpenStreetMap overwhelmed by bots scraping data
        
       Author : molly_radstowe
       Score  : 24 points
       Date   : 2026-01-28 01:23 UTC (5 hours ago)
        
 (HTM) web link (twitter.com)
 (TXT) w3m dump (twitter.com)
        
       | molly_radstowe wrote:
       | #OpenStreetMap hammered by scrapers hiding behind residential
       | proxy/embedded-SDK networks.
        
         | direwolf20 wrote:
         | More like hammered by Google and Apple so you'll use their apps
         | instead.
        
         | Bender wrote:
         | Looks like it is hosted in Equinix in NL? Or just part of it
         | maybe? Is it behind a load balancer, maybe something like
         | HAProxy? If so, were stick tables set up to limit rates by
         | cookie, require people to be logged in on unique accounts, and
         | limit anonymous access after so many requests? I know limiting
         | anonymous access is not great, but it could be enabled only
         | under high load, so that instead of the site going offline for
         | everyone it would just be limited for anonymous users.
         | _Degradation vs critical outage_
         | 
         | On a separate note, have tcpdump captures been done on these
         | excessive connections? Minus the IP, what do their SYN packets
         | look like? Minus the IP, what do the corresponding log entries
         | look like in the web server? Are they using HTTP/1.1 or
         | HTTP/2? Are they missing any headers expected from a real
         | person's browser, such as Sec-Fetch-Mode (cors, no-cors,
         | navigate) or Accept-Language?
         | 
         |   tcpdump -p --dont-verify-checksums -i any -NNnnvvv -B32768 \
         |     -c32 -s0 port 443 and 'tcp[13] == 2'
         | 
         | Is there someone at OpenStreetMap that can answer these
         | questions?
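The header questions in the comment above lend themselves to a quick log-side check. Below is a minimal sketch, assuming the `cors`, `no-cors` and `navigate` values refer to the `Sec-Fetch-Mode` request header (an interpretation, not something the comment states) and using a hypothetical `suspicious_signals` helper. A missing header is only a hint: older browsers and legitimate API clients omit some of these too.

```python
# Headers a current desktop browser normally sends (an assumed, non-exhaustive list).
EXPECTED_HEADERS = ("accept-language", "sec-fetch-mode", "user-agent")

# Values defined for Sec-Fetch-Mode by the Fetch Metadata specification.
KNOWN_FETCH_MODES = {"cors", "no-cors", "navigate", "same-origin", "websocket"}


def suspicious_signals(headers):
    """Return a list of reasons a request looks unlike a browser request."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    signals = [f"missing {name}" for name in EXPECTED_HEADERS if not lowered.get(name)]
    mode = lowered.get("sec-fetch-mode")
    if mode and mode.lower() not in KNOWN_FETCH_MODES:
        signals.append(f"unknown sec-fetch-mode {mode!r}")
    return signals
```

An empty list means none of these particular checks fired, not that the client is human.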
        
           | KomoD wrote:
           | I think it could be worth trying to block them with TLS
           | fingerprinting or, since they think they are being hammered
           | by residential proxies, https://spur.us could be worth a
           | try.
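The comment does not name a specific fingerprinting scheme. One widely used one is JA3, which hashes fields of the TLS ClientHello. The sketch below assumes the ClientHello has already been parsed into integer lists by something else (the parsing is not shown), and the function names are illustrative.

```python
import hashlib


def _is_grease(value):
    # GREASE values (RFC 8701) look like 0x0A0A, 0x1A1A, ..., 0xFAFA.
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 input: five comma-separated fields, values joined by '-'."""
    def join(values):
        # JA3 ignores GREASE values so that they do not change the fingerprint.
        return "-".join(str(v) for v in values if not _is_grease(v))

    return ",".join(
        [str(version), join(ciphers), join(extensions), join(curves), join(point_formats)]
    )


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """Return the JA3 fingerprint: the MD5 hex digest of the JA3 string."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()
```

A server-side filter would then compare `ja3_hash(...)` against a set of fingerprints seen on abusive traffic. Clients can randomise their ClientHello, so this is a signal to combine with others rather than a reliable block on its own.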
        
       | phillipseamore wrote:
       | The number of idiotic vibe coded repos I've seen on GH lately
       | that are doing things like crawling OSM for POI data is
       | mindboggling!
        
       | CqtGLRGcukpy wrote:
       | https://xcancel.com/openstreetmap/status/2016320492420878531
       | 
       | https://nitter.poast.org/openstreetmap/status/20163204924208...
        
       | CqtGLRGcukpy wrote:
       | They also posted about this on Mastodon / Fedi:
       | https://en.osm.town/@osm_tech/115968544599864782
        
       | dzhiurgis wrote:
       | I'll ask a dumb question: if they are "open source", then why
       | are they bothered by it? Is it the scraping itself? Is their
       | data not freely available for download?
        
         | wodenokoto wrote:
         | Someone has to pay for bandwidth. And that someone would like
         | the bandwidth to go to human users.
        
         | zeeZ wrote:
         | Their data is freely available to download. There are weekly
         | dumps of the entire planet and several sources for partial
         | data. There's no need for most legitimate use cases to scrape
         | their API.
        
       | solaris2007 wrote:
       | Make the data available through BitTorrent and IPFS. Redirect
       | IPs that make excessive requests to a response only kilobytes
       | in size that says "use the torrents and IPFS".
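The small "go use the bulk downloads" response suggested above might look like the following. This is a rough sketch: the `respond_to_heavy_client` helper and the threshold are hypothetical, and it returns HTTP 429 with a short text body rather than a literal redirect, which is a substitution made here. The only URL used is the planet site linked elsewhere in this thread.

```python
PLANET_URL = "https://planet.openstreetmap.org/"


def respond_to_heavy_client(requests_in_window, threshold=1000):
    """Return (status, headers, body) for over-limit clients, else None."""
    if requests_in_window <= threshold:
        return None  # under the (hypothetical) threshold: serve normally
    body = (
        "Too many requests. Bulk data, including torrents, is at "
        f"{PLANET_URL}\n"
    ).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
        ("Retry-After", "3600"),  # assumed back-off of one hour
    ]
    return "429 Too Many Requests", headers, body
```

The body is well under a kilobyte, so even a client that ignores it costs very little bandwidth per request.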
       | 
       | As an SRE, the only legitimate concern here could be the
       | bandwidth costs. But QoS tuning should solve that too.
       | 
       | Supposedly technical people crying out for a journalist to help
       | them is super lame. Everything about this looks super lame.
        
         | zeeZ wrote:
         | That data is already available. Including torrents.
         | 
         | https://planet.openstreetmap.org/
        
       ___________________________________________________________________
       (page generated 2026-01-28 07:01 UTC)