[HN Gopher] Ask HN: How did the internet discover my subdomain?
___________________________________________________________________
Ask HN: How did the internet discover my subdomain?
I have a domain that is not live. As expected, loading the domain
returns: Error 1016. However...I have a subdomain with a not
obvious name, like: userfileupload.sampledomain.com This subdomain
IS LIVE but has NOT been publicized/posted anywhere. It's a custom
URL for authenticated users to upload media with presigned url to
my Cloudflare r2 bucket. I am using CloudFlare for my DNS. How
did the internet find my subdomain? Some sample user agents are:
"Expanse, a Palo Alto Networks company, searches across the global
IPv4 space multiple times per day to identify customers'
presences on the Internet. If you would like to be excluded from
our scans, please send IP addresses/domains to:
scaninfo@paloaltonetworks.com", "Mozilla/5.0 (Macintosh; U; Intel
Mac OS X 10_7; en-us) AppleWebKit/534.20.8 (KHTML, like Gecko)
Version/5.1 Safari/534.20.8", "Mozilla/5.0 (Linux; Android 9; Redmi
Note 5 Pro) AppleWebKit/537.36 (KHTML, like Gecko)
Chrome/76.0.3809.89 Mobile Safari/537.36". The bot requests are
GETs, which are failing as designed, but I'm wondering how the
bots even knew the subdomain existed?!
Author : govideo
Score : 220 points
Date : 2025-03-06 22:34 UTC (1 day ago)
| codingdave wrote:
| If it is on DNS, it is discoverable. Even if it were not, the
| message you pasted says outright that they scan the entire IP
| space, so they could be hitting your server's IP without having a
| clue there is a subdomain serving your stuff from it.
| govideo wrote:
| Ahh yeah, my internet network knowledge was never super strong,
| and now is rusty to boot. Thanks for your note.
| paulnpace wrote:
| Shouldn't the web server only respond to a configured domain,
| else 404?
| precommunicator wrote:
| Depends on whether it's configured like that; by default, usually not
| EQYV wrote:
| Question: How does a subdomain get discovered by a member of
| the public if there are no references to it anywhere online?
|
| The only thing I can think of that would let you do that would
| be a DNS zone transfer request, but those are almost always
| disallowed from most origin IPs.
|
| https://en.m.wikipedia.org/wiki/DNS_zone_transfer
| arccy wrote:
| you also have zone walking with DNS NSEC
|
| https://www.domaintools.com/resources/blog/zone-walking-
| zone...
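To make the zone-walking idea concrete, here's a toy sketch in Python. The names are hypothetical, and a plain dict stands in for real NSEC records, each of which points at the next owner name in the zone:

```python
def walk_nsec(next_name, apex):
    """Enumerate a zone by following the NSEC 'next owner' links.

    Each NSEC record names the next owner in canonical order, and the
    last one wraps back to the apex, so following the chain yields
    every name in the zone -- including "secret" subdomains.
    """
    names, current = [], apex
    while True:
        names.append(current)
        current = next_name[current]
        if current == apex:  # chain wrapped around: zone fully walked
            return names

# Hypothetical NSEC chain for the zone sampledomain.com:
chain = {
    "sampledomain.com": "mail.sampledomain.com",
    "mail.sampledomain.com": "userfileupload.sampledomain.com",
    "userfileupload.sampledomain.com": "www.sampledomain.com",
    "www.sampledomain.com": "sampledomain.com",
}
print(walk_nsec(chain, "sampledomain.com"))
```

NSEC3 hashes the owner names to make this harder, though dictionary attacks against the hashes remain possible.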
| dnsfax wrote:
| Certificate transparency logs.
| yatralalala wrote:
| See my comment above
| https://news.ycombinator.com/item?id=43289743 there are many
| techniques!
| dnsfax wrote:
| If you know what to query, sure. You can't just say "give me
| all subdomains"; it doesn't work that way. The subdomain was
| discovered via certificate transparency logs.
| alexjplant wrote:
| > If it is on DNS, it is discoverable.
|
| In the context of what OP is asking this is not true. DNS zones
| aren't enumerable - the only way to reliably get the complete
| contents of the zone is to have the SOA server approve a zone
| transfer and send the zone file to you. You can ask if a record
| in that zone exists but as a random user you can't say "hand
| over all records in this zone". I'd imagine that tools like
| Cloudflare that need this kind of functionality perform a
| dictionary search since they get 90% of records when importing
| a domain but always seem to miss inconspicuously-named ones.
|
| > Even if it were not, the message you pasted says outright
| that they scan the entire IP space, so they could be hitting
| your server's IP without having a clue there is a subdomain
| serving your stuff from it.
|
| This is likely what's happening. If the bot isn't using SNI or
| sending a host header then they probably found the server by
| IP. The fact that there's a heretofore unknown DNS record
| pointing to it is of no consequence. *EDIT: Or the Cert
| Transparency log as others have mentioned, though this isn't
| DNS per se. I learn something new every day :o)
| fulafel wrote:
| In practice it's not so far-fetched: a zone transfer is just
| another DNS query at the protocol level; I suppose you can
| conceptually view it as sending a file if you consider the
| DNS response a file. Something like "host -t axfr my.domain
| ns1.my.domain" will show the zone depending on how a domain's
| name server is configured (eg in bind, allow-transfer
| directive can be used to make it public, require ip acl to
| match the query source, etc).
| elric wrote:
| No sensible DNS provider has zone transfers enabled by
| default. OP mentioned using CloudFlare, and they certainly
| don't.
| alexjplant wrote:
| > in bind, allow-transfer directive
|
| Configuring BIND as an authoritative server for a corporate
| domain when I was a wee lad is how I learned DNS. It was
| and still is bad practice to allow zone transfers without
| auth. If memory serves I locked it down between servers via
| key pairs.
| walrus01 wrote:
| > In the context of what OP is asking this is not true. DNS
| zones aren't enumerable - the only way to reliably get the
| complete contents of the zone is to have the SOA server
| approve a zone transfer and send the zone file to you.
|
| This is _generally_ true, but if you watch authoritative-
| only DNS server logs for text strings matching ACL
| rejections, there's plenty of things out there which are
| fully automated crawlers _attempting_ to do entire zone
| transfers.
|
| There are a non-zero number of improperly configured
| authoritative dns servers out there on the internet which
| will happily give away a zone transfer to anyone who asks for
| it, at least, apparently enough to be useful that somebody
| wrote crawlers for it. I would guess it's only a few percent
| of servers that host zonefiles but given the total size of
| the public Internet, that's still a lot.
| majke wrote:
| In the context of DNSSEC, DNS zones are very much
| enumerable. Cloudflare does amazing tricks to avoid this:
| https://blog.cloudflare.com/black-lies/
| eqvinox wrote:
| Cloudflare themselves gives more information here:
|
| > NSEC3 was a "close but no cigar" solution to the
| problem. While it's true that it made zone walking
| harder, it did not make it impossible. Zone walking with
| NSEC3 is still possible with a dictionary attack.
|
| So, hardening it against enumerability is a question of
| inserting non-dictionary names.
| yatralalala wrote:
| Zone transfers are a super interesting topic. Thanks for
| mentioning that.
|
| They're basically the way to get all the DNS records a DNS
| server has. Interestingly, in some countries this is illegal
| and in others it's considered best practice.
|
| Generally, enabled zone transfers are considered a
| misconfiguration and should be disabled.
|
| We did research on that a few months back and found that 8%
| of all global name servers have it enabled.[0]
|
| [0] - https://reconwave.com/blog/post/alarming-prevalence-of-
| zone-...
| stwrzn wrote:
| That's concerning. I thought everyone knew that zone
| transfers should generally be disallowed, especially when
| coming from random hosts.
| Kikawala wrote:
| Is it available under HTTPS? Then it's probably in a Certificate
| Transparency log.
| govideo wrote:
| Yes, https via cloudflare's automatic https. Thanks for the
| info.
| thisisgvrt wrote:
| Automated agents can tail the certificate log to discover new
| domains as the certs are issued. But if you want to explore
| subdomains manually, https://crt.sh/ is a nice tool.
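As a sketch of what such tooling does with the data: crt.sh has a JSON endpoint (https://crt.sh/?q=%25.sampledomain.com&output=json), where each entry's name_value field holds one or more newline-separated certificate names. The entries below are made-up sample data in that shape, not real log output:

```python
def extract_names(ct_entries):
    """Collect unique DNS names from crt.sh-style JSON entries.

    Each entry's "name_value" field may hold several newline-separated
    names (the cert's SANs), including wildcards like *.example.com.
    """
    names = set()
    for entry in ct_entries:
        for name in entry.get("name_value", "").splitlines():
            names.add(name.strip().lower())
    return sorted(names)

# Parsed form of hypothetical crt.sh JSON output:
sample = [
    {"name_value": "sampledomain.com\nwww.sampledomain.com"},
    {"name_value": "userfileupload.sampledomain.com"},
]
print(extract_names(sample))
```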
| snailmailman wrote:
| Yeah, this is a surprisingly little-known fact: all certs
| being logged means all subdomain names get logged.
|
| Wildcard certs can hide the subdomains, but then your cert
| works on _all_ subdomains. This could be an issue if the
| certs get compromised.
|
| Usually there isn't sensitive information in subdomain names,
| but I suspect it often accidentally leaks information about
| infrastructure setups. "vaultwarden.example.com" existing
| tells you someone is probably running a vaultwarden instance,
| even if it's not publicly accessible.
|
| The same kind of info can leak via dns records too, I think?
| tialaramex wrote:
| > The same kind of info can leak via dns records too, I
| think?
|
| That's correct. "Passive DNS" is sold by many large public
| DNS providers. They tell you (for a fee) what questions
| were asked and answered that meet your chosen criteria. So
| e.g. maybe you're interested, what questions and answers
| matched A? something.internal.bigcorp.example in February
| 2025.
|
| They won't tell you who asked (IP address, etc.) but
| they're great for discovering that even though it says 404
| for you, bigcorp.famous-brand-hr.example is checked
| regularly by _somebody_, probably BigCorp employees who
| aren't on their VPN - suggesting very strongly that
| although BigCorp told Famous Brand HR not to list them as a
| client that is in fact the HR system used by BigCorp.
| Arrowmaster wrote:
| I had coworkers at a previous employer go change settings
| in CloudFlare trying to troubleshoot instead of reaching
| out to me. They changed the option that caused CF proxy to
| issue a cert for every subdomain instead of using the
| wildcard. They didn't understand why I was pissed that they
| had now written every subdomain we had in use to the public
| record in addition to doing it without an approved change
| request.
| yatralalala wrote:
| If you're using infra in a way [cloudflare -> your VM] I'd
| recommend setting firewall on the VM in a way that it can be
| accessed only from Cloudflare.
|
| This way, you will force everyone to go through Cloudflare
| and utilize all those fancy bot blocking features they have.
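A minimal sketch of that origin-side check in Python, using two example ranges; the authoritative list lives at https://www.cloudflare.com/ips/, and in practice you'd enforce this in the firewall itself rather than in application code:

```python
from ipaddress import ip_address, ip_network

# Example ranges only -- fetch the current list from
# https://www.cloudflare.com/ips/ before relying on this.
CLOUDFLARE_RANGES = [ip_network(n) for n in ("173.245.48.0/20", "104.16.0.0/13")]

def from_cloudflare(ip: str) -> bool:
    """True if the client IP falls inside one of the allowed ranges."""
    addr = ip_address(ip)
    return any(addr in net for net in CLOUDFLARE_RANGES)

print(from_cloudflare("104.16.1.1"))   # inside 104.16.0.0/13
print(from_cloudflare("203.0.113.9"))  # TEST-NET address, not allowed
```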
| system2 wrote:
| Do you know how to access these logs?
| Kikawala wrote:
| Answered below, but https://crt.sh/ is what I use.
| daneel_w wrote:
| https://crt.sh/ is one example, if you sign using e.g. Let's
| Encrypt.
| Eikon wrote:
| https://www.merklemap.com/
| daggersandscars wrote:
| DNS query type AXFR allows for subdomain querying. There are
| security restrictions around who can do it on what DNS servers.
| Given the number of places online one can run a subdomain query,
| I suspect it's mostly a matter of paying the right fees to the
| right DNS provider.
| artursapek wrote:
| presumably it has a DNS record
| vince14 wrote:
| I'm having the same issue.
|
| https://securitytrails.com/ also had my "secret" staging
| subdomain.
|
| I made a catch-all certificate, so the subdomain didn't show up
| in CT logs.
|
| It's still a secret to me how my subdomain ended up in their
| database.
| selcuka wrote:
| They could be purchasing DNS query logs from ISPs.
| arccy wrote:
| maybe your server responded to a plain ip addressed request
| with the real name...
| averageRoyalty wrote:
| Host header is a request header, not a response one, isn't
| it?
| fc417fc802 wrote:
| He said he used a wildcard cert though. So what part of the
| response would contain the subdomain in that case?
| johnklos wrote:
| Serious question: Do you really think that Cloudflare is trying
| to keep these kinds of thing private? If so, I'd suggest that's
| not a reasonable expectation.
| fc417fc802 wrote:
| Related question (not rhetorical). If you do DNS for
| subdomains yourself (and just use Cloudflare to point
| dns.example.com at your box) will the subdomain queries leak
| and show up in aggregate datasets? What I'm asking is if
| query recursion is always handled locally or if any of the
| reasonably common software stacks resolve it remotely.
| immibis wrote:
| As well as assuming Cloudflare sells DNS lists, it's
| probably safe to assume the operators of public resolvers
| like 8.8.8.8, 9.9.9.9 and 1.1.1.1 (that is Google, Quad9
| and Cloudflare again) are looking at their logs and either
| selling them or using them internally.
| parliament32 wrote:
| Certificate Transparency logs, or they don't actually know the
| domain name: just port-scanning[1] then making requests to open
| web ports.
|
| [1] Turns out you can port-scan the entire internet in under 5
| minutes: https://github.com/robertdavidgraham/masscan
| andix wrote:
| Port scanning usually can't discover subdomains. Most servers
| don't expose the names of the domains they serve content for. In
| the case of HTTP, they usually only serve the subdomain's content
| if the Host: request header includes it.
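The behavior described above boils down to a lookup with a fallback. A toy sketch (the vhost names are hypothetical) showing why a bare-IP scan hits the default site rather than the named one:

```python
def pick_vhost(host_header, vhosts, default="default-site"):
    """Name-based virtual hosting: choose a site by the Host: header,
    falling back to the server's default/catch-all vhost."""
    return vhosts.get((host_header or "").strip().lower(), default)

vhosts = {"userfileupload.sampledomain.com": "upload-app"}

print(pick_vhost("userfileupload.sampledomain.com", vhosts))  # upload-app
print(pick_vhost(None, vhosts))  # bare-IP scan, no Host header: default-site
```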
| hombre_fatal wrote:
| Most servers just listen on :80 and respond to all requests.
| Almost nobody checks the Host header intentionally; it's just
| a happy mistake if they use a reverse proxy.
|
| You can often decloak servers behind Cloudflare because of
| this.
|
| But OP's post already answered their question: someone
| scanned ipv4 space. And what they mean is that a server they
| point to via DNS is receiving requests, but DNS is a red
| herring.
| andix wrote:
| This really depends on the setup. Most web servers host
| multiple virtual hosts. IP addresses are expensive.
|
| If you're deploying a service behind a reverse proxy, it
| either must be only accessible from the reverse proxy via
| an internal network, or check the IP address of the reverse
| proxy. It absolutely must not trust X-Forwarded-For:
| headers from random IPs.
| hombre_fatal wrote:
| I just don't see how any of this matters. OP's server is
| reachable via ipv4 and someone sent an http request to
| it. Their post even says that this is the case.
| andix wrote:
| I'm guessing they meant it discovered a virtual host
| behind a subdomain.
| benfortuna wrote:
| I could be wrong, but the Palo Alto scanner says it's using
| global ipv4 space, so not using DNS at all. So actually the
| subdomain has not been discovered at all.
| reactordev wrote:
| This is exactly what's happening based on the log snippet
| posted. Has nothing to do with subdomains, has everything
| to do with it being on the internet.
| parliament32 wrote:
| How deep in the domain hierarchy you are doesn't matter from
| a network layer: a bare tld (yes this exists), a normal
| domain, a subdomain, a sub-subdomain, etc can all be assigned
| different IPs and go different places. You can issue a GET
| against / for any IP you want (like we see in the logs OP
| posted). The only time this would actually matter is if a
| host at an address is serving content for multiple hostnames
| and depends on the Host header to figure out which one to
| serve -- but even those will almost always have a default.
| andix wrote:
| You can discover IP addresses, sure. Just enumerate them.
| But this doesn't give you the domain, as long as there is
| no reverse dns record.
|
| I'm quite sure OP meant a virtual host only reachable with
| the correct Host: header.
| cryptonector wrote:
| And in the case of HTTPS they need to insist on SNI (and
| TLS 1.3 effectively requires it).
| giancarlostoro wrote:
| Last few times I tried to do this my ISP cut off my internet
| every time. Assholes. It comes back, but they're still assholes
| for it.
| icehawk wrote:
| This.
|
| I have a DNS client that feeds into my passive DNS database by
| reading CT logs and then trying to resolve them.
| toomuchtodo wrote:
| What do you use it for?
| fsckboy wrote:
| LPT, this is an object lesson in the weakness of security through
| obscurity
| bangaladore wrote:
| I mean you could argue that this is more of a multi-factor
| authentication lesson.
|
| Just knowing one "secret" (a subdomain in this case) shouldn't
| get you somewhere you shouldn't be.
|
| In general you should always assume that any password has been
| (or could be) compromised. So in this case, more factors should
| be involved such as IP restricting for access, an additional
| login page, certificate validation, something...
| andix wrote:
| Security by obscurity can be a great additional measure for an
| already secure system. It can reduce attack surface, make it
| less likely to get attacked in the first place. In some cases
| (like this one) it can also be much easier to break than
| expected.
| OuterVale wrote:
| https://www.merklemap.com pops to mind.
| govideo wrote:
| Interesting! Just checked them out.
|
| "MerkleMap gathers its information by continuously monitoring
| and live tailing Certificate Transparency (CT) logs, which are
| operated by organizations like Google, Cloudflare, and Let's
| Encrypt. "
| Eikon wrote:
| I made this, thank you!
| 8bitchemistry wrote:
| Did you ever email the URL to somebody? We had the same issue
| years ago where Google seemed to be crawling/indexing new
| subdomains it found in emails.
| govideo wrote:
| Nope, never emailed or posted to anyone. Just me (it's my solo
| project at the moment).
| andix wrote:
| I'm surprised nobody mentioned subfinder yet:
| https://github.com/projectdiscovery/subfinder
|
| Subfinder uses different public and private sources to discover
| subdomains. Certificate Transparency logs are a great source, but
| it also has some other options.
| spl757 wrote:
| Does the IP address for that subdomain have a DNS PTR record set?
| If it does, someone can discover the subdomain by querying the
| PTR record for the IP.
| govideo wrote:
| If it does, I did not set it up; it would have been
| automatically done by CloudFlare when I told it to use my
| custom subdomain for the upload urls.
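For reference, a PTR lookup for an IPv4 address queries a reversed-octet name under in-addr.arpa. A tiny sketch of how that owner name is built:

```python
def ptr_name(ipv4: str) -> str:
    """Build the in-addr.arpa owner name used for a reverse (PTR)
    lookup: the four octets in reverse order, under in-addr.arpa."""
    return ".".join(reversed(ipv4.split("."))) + ".in-addr.arpa"

print(ptr_name("203.0.113.9"))  # 9.113.0.203.in-addr.arpa
```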
| andix wrote:
| If a HTTPS service should be hard to discover, an easy way is to
| hide it behind a subdirectory. Something like
| https://subdomain.domain.example/hard_to_find_secret_string.
|
| Another option is wildcard certificates.
|
| This obviously can't be the only protection. But if an attacker
| doesn't know about a service, or misses it during discovery, they
| can't attack it.
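If you do lean on a secret path like that, compare it in constant time so the secret can't be guessed byte-by-byte via timing. A minimal sketch (the path is a made-up placeholder):

```python
import hmac

SECRET_PATH = "/hard_to_find_secret_string"  # hypothetical placeholder

def path_allowed(request_path: str) -> bool:
    # hmac.compare_digest compares in constant time, unlike ==
    return hmac.compare_digest(request_path, SECRET_PATH)

print(path_allowed("/hard_to_find_secret_string"))  # True
print(path_allowed("/admin"))                       # False
```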
| LinuxBender wrote:
| As others have said, likely cert transparency logs. Use a
| wildcard cert to avoid this. They are free using LetsEncrypt and
| possibly a couple other ACME providers. I have loads of wildcard
| certs. Bots will try guessing names but like you I do not use
| easily guessable names and the bots never find them. _I log all
| DNS answers._ I assume cloudflare supports strict-SNI but no idea
| if they have their own automation around wildcard certs.
| Sometimes I renew wildcard certs I am not even using just to give
| the bots something to do.
| govideo wrote:
| I have been just relying on CloudFlare's automatic https. But I
| will look into my own certs, though will likely just use
| CloudFlare's. I don't mind the internet knowing the subdomain I
| posted about; was curious how the bots found it!
| pabs3 wrote:
| ArchiveTeam has some docs about this:
|
| https://wiki.archiveteam.org/index.php/Finding_subdomains
| govideo wrote:
| I'm so often amazed (but no longer surprised) at the depth of
| niche (relatively) info and tools out there.
| ciaovietnam wrote:
| There is a chance that your subdomain is the first/default
| virtual host in your web server setup (or the subdomain's access
| log is the default log file), so any request to the server's IP
| address gets logged to this virtual host. That means they didn't
| access your subdomain; they accessed your server's IP address
| but got logged in your subdomain's access log.
| BrandoElFollito wrote:
| And this is the correct answer, thank you.
|
| Transparency logs are fine except if you have a wildcard cert
| (or no https, obviously).
|
| IP scans are just that: scans for live ports. If you do not
| provide a Host header in your call, you get whatever default
| response was set up. This can be a default site, a 404, or
| anything else.
| alberth wrote:
| This site will find any subdomain, for any domain, as long as it
| has previously had a certificate (SSL/TLS):
|
| https://crt.sh/
| averageRoyalty wrote:
| This is incorrect (or at least only technically correct). It
| is only true for subdomains whose certificates were signed by a
| public, trusted CA after Certificate Transparency came into
| existence, and only for subdomains with a specific, non-wildcard
| certificate.
| govideo wrote:
| Thanks for mentioning. I checked it out, and am learning lots
| of new stuff (ie, realize how much I do not know).
| nvarsj wrote:
| Doesn't find any of my semi secret subdomains.
| socrateslee wrote:
| https://crt.sh can find your subdomain only when it doesn't
| have a wildcard certificate (*.somedomain.com).
| thedougd wrote:
| Some CAs (Amazon) allow not publishing to the Certificate
| Transparency Log. But if you do this, browsers will block the
| connection by default. Chromium browsers have a policy option to
| skip this check for selected URLs. See:
| CertificateTransparencyEnforcementDisabledForURLs.
|
| Some may find this more desirable than wildcard certificates and
| their drawbacks.
| klntsky wrote:
| > Some may find this more desirable
|
| Why?
| thedougd wrote:
| A CISA article on wildcard security risks. Some of this stems
| from common misimplementations (e.g. reusing private keys
| across servers), but not all of it.
|
| https://www.cisa.gov/news-events/alerts/2021/10/08/nsa-
| relea... Direct: https://media.defense.gov/2021/Oct/07/200286
| 9955/-1/-1/0/CSI...
| navigate8310 wrote:
| To avoid subdomain discovery, I usually acquire a certificate
| at the domain level and add a wildcard SAN.
| snailmailman wrote:
| Firefox is currently rolling out the same thing. They will
| treat any non-publicly-logged certificate as insecure.
|
| I'm surprised amazon offers the option to not log certificates.
| The whole idea is that every issued cert should get logged.
| That way, fraudulently-issued certs are either well documented
| in public logs- or at least not trusted by the browser.
| fc417fc802 wrote:
| It doesn't seem like the choice has any impact on that. It
| just protects user privacy if that's what they want to
| prioritize.
|
| Depending on the issuer logging all certs would never work.
| You can't rely on the untrusted entity to out themselves for
| you.
|
| The security comes from the browser querying the log and
| warning you if the entry is missing. In that sense declining
| to log a cert is similar to self signing one. The browser
| will warn and users will need to accept. As long as the vast
| majority of sites don't do that then we maintain a sort of
| herd immunity because the warnings are unexpected by the end
| user.
| thedougd wrote:
| I should have mentioned in my post that this technique only
| makes sense in the context of private or internal
| endpoints.
| rempargo wrote:
| I assume you host this with a https certificate, so you can look
| your subdomains at:
|
| https://crt.sh/?q=sampledomain.com
| melson wrote:
| Someone might have used an open-source tool like Sublist3r
| rawbytes wrote:
| oh yes that for sure
| ackbar03 wrote:
| yea was gonna mention this as well lol
| paxys wrote:
| Not sure why everyone is going on about certificate transparency
| logs when the answer is right there in the user agent. The
| company is scanning the ipv4 space and came upon your IP and
| port.
| pkulak wrote:
| Okay. But how did they get the proper host header?
| peeters wrote:
| There are a couple easy possibilities depending on server
| config.
|
| 1. Not using SNI, and all https requests just respond with
| the same cert. (Example, go to https://209.216.230.207/ and
| you'll get a certificate error. Go to the cert details and
| you'll see the common name is news.ycombinator.com).
|
| 2. http upgrades to https with a redirect to the hostname,
| not IP address. (Example, go to http://209.216.230.207/ and
| you get a 301 redirect to https://news.ycombinator.com)
| jimnotgym wrote:
| I don't think op said that they had the correct host header?
| INTPenis wrote:
| Could be a number of ways for example a default TLS cert, or
| a default vhost redirect.
|
| I actually had a job once a few years ago where I was asked
| to hide a web service from crawlers and so I did some of
| these things to ensure no info leaked about the real vhost.
| paxys wrote:
| Who says they did?
| peeters wrote:
| It's rather hilarious that nobody mentioned this in 7 hours.
| What am I missing?
|
| ~5 billion scans in a few hours is nothing for a company with
| decent resources. OP: in case you didn't follow, they're
| literally trying every possible IPv4 address and seeing if
| something exists on standard ports at that address.
|
| I believe it would be harder to find out your domain that way
| if you were using SNI and only forwarded/served requests that
| used the correct host. But if you aren't using SNI, your server
| is probably just responding to any TLS connect request with
| your subdomain's cert, which will reveal your hostname.
| Dylan16807 wrote:
| > What am I missing?
|
| That it was in fact mentioned many hours earlier, in more
| than one top level comment.
| peeters wrote:
| I was referring more to the fact that the user agent
| explicitly contained the answer, rather than suggestions
| that it was IP scanning. But you're right I do see one
| comment that mentions that. And many more likely assumed
| the OP already figured that part out.
| Dylan16807 wrote:
| The user agent contains a partial answer. IP scanning
| doesn't give you the actual subdomain, so the question is
| slightly wrong or there are missing pieces.
| diggan wrote:
| Judging by the logs (user agents really) right now in the
| submission, it's hard to tell if the requests were
| actually for the domain (since the request headers aren't
| included) or just for the IP.
| Dylan16807 wrote:
| Yes, that's the "question being wrong" option I listed.
| globular-toast wrote:
| > What am I missing?
|
| It's very common for people to read only up to the point they
| feel they can comment, then skip straight to the comments. So,
| basically, no one read it.
| flemhans wrote:
| Funny, that'd be so unthinkable for me to do! But you're
| probably right.
| fragmede wrote:
| Just the default hostname. It won't reveal all of them or any
| of the IP addresses of that box. secret-freedom-fighter.ice-
| cream-shop.example.com could have the same IP as example.com
| and you'd only know example.com
| A1kmm wrote:
| If you've got one cert with a subject alt name for each
| host, they'd see them all. If you use SNI and they have
| different certificates, the domains might still be in
| Certificate Transparency logs. If a wildcard cert is used,
| that could help to conceal the exact subdomain.
| ozim wrote:
| That perfectly fits midwit meme. Lots of people are smart
| enough to know transparency logs - but not smart enough to read
| OP post and understand the details.
| seba_dos1 wrote:
| The details aren't there, so it's "assume" rather than
| "understand".
|
| The only proper response to OP's question is to ask for
| clarification: is the subdomain pointing to a separate IP?
| Are the logs vhost-specific or not?
|
| If you don't get the answers, all you can do is to assume,
| and both assumptions may end up being right or wrong (with
| varying probability, perhaps).
| 4ndrewl wrote:
| Also it's Palo Alto. They're not some kiddie scripters.
| https://en.m.wikipedia.org/wiki/Palo_Alto_Networks
| chinathrow wrote:
| Hm?
|
| They sell you security but provide you with CVEs en masse.
|
| https://www.cybersecuritydive.com/news/palo-alto-networks--
| h...
| ThatMedicIsASpy wrote:
| Am I Google when I come with the user agent 'google here, no
| evil'?
| bildung wrote:
| Looking at _how_ they earned their 100s of CVEs, script
| kiddie almost looks like a compliment
| p0w3n3d wrote:
| Finding the IP does not mean finding the domain. When making an
| HTTP request to an IP, you specify the domain you want to
| connect to. For example, you can configure your /etc/hosts to
| point xxxnakedhamsters.google.com at 8.8.8.8 and make the
| HTTP request, which will cause Google to receive the domain
| request (i.e. the header Host: xxxnakedhamsters.google.com) and
| refuse it or try to redirect. Of course, this only works for
| plain HTTP, because HTTPS will require a certificate. That's
| why they're speaking about certificates.
| ghusto wrote:
| First thing I'd do for an IP that answers is a reverse
| lookup, so I expect that's at least in the list of things
| they'd try.
| lewiscollard wrote:
| Depending on the web server's configuration, you very much
| _can_ find the domain which is configured on an IP address,
| by attempting to connect to that IP address via HTTPS and
| seeing what certificate gets served. Here's an example:
|
| https://138.68.161.203/
|
| > Web sites prove their identity via certificates. Firefox
| does not trust this site because it uses a certificate that
| is not valid for 138.68.161.203. The certificate is only
| valid for the following names: exhaust.lewiscollard.com,
| www.exhaust.lewiscollard.com
| jchw wrote:
| I don't think that does you any good for Cloudflare,
| though. They will definitely be using SNI.
| kelnos wrote:
| That doesn't really matter, though. While OP is using
| Cloudflare, the actual server behind it is still a
| publicly-accessible IP address that an IPv4 space scanner
| can easily stumble upon.
| jchw wrote:
| I misunderstood, I thought the subdomain _was_ an R2
| bucket. If it 's just normal Cloudflare proxying to some
| backend this is probably the most likely answer.
|
| That said, while I think it's not the case here, using
| Cloudflare doesn't mean the underlying host is
| accessible, as even on the free tier you can use
| Cloudflare Tunnels, which I often do.
| melevittfl wrote:
| But there's no evidence in the OP's post that they have, in
| fact, discovered the domain. The only thing posted is that
| there is a GET request to a listening web server.
|
| The OP and all the people talking about certificates are
| making the same assumption: namely, that the scanning company
| discovered the DNS name for the server and tried to connect,
| when, in fact, they simply iterate through IP address blocks
| and make GET requests to any listening web servers they find.
| p0w3n3d wrote:
| OP states that the domain was discovered
| crazygringo wrote:
| No they didn't. They said "How did the internet find my
| subdomain?" They're _assuming_ the internet found their
| subdomain. They don 't provide any evidence that
| happened, just that they found their IP address.
| paxys wrote:
| > When doing HTTP request to IP you specify the domain you
| want to connect to
|
| No, you make HTTP requests to an IP, not a domain. You
| convert the domain name to an IP in an earlier step (via a
| DNS query). You can connect to servers using their raw IPs
| and open ports all day if you like, which is what's happening
| here. Yes servers will (likely) reject the requests by
| looking at the host header, but they will still _receive_ the
| request.
| DeborahMatthews wrote:
| Your subdomain may have been discovered through certificate
| transparency logs, search engine crawling, passive DNS,
| leaked links, or third-party analytics tools.
| arkfil wrote:
| Palo Alto (network devices like firewalls, etc.) is able to
| scan the sites that users behind their devices want to visit.
| These are very popular devices in many companies. Users can
| also have agents installed on their computers that likewise
| have access to the sites they visit.
| opello wrote:
| This is what I was thinking it must be, along the lines of
| Cisco NAC. Could monitor via browser plugin for full URLs or
| DNS server for domains.
|
| I imagine the certificate transparency log is the avenue, but
| local monitoring and reporting up as a new URL or domain to
| scan for malware seems similarly plausible.
| govideo wrote:
| Thanks for everyone's perspectives. Very educational and
| admittedly lots outside the boundaries of my current knowledge. I
| have thus far relied on CloudFlare's automatic https and simple
| instant subdomain setup for their worker microservice I'm using.
|
| There are evidently technical/footprint implications of that
| convenience. Fortunately, I'm not really concerned with the
| subdomain being publicly known; was more curious how it became
| publicly known.
| groestl wrote:
| I had to scroll pretty far down to see the first comment
| referring to the second most likely leak (after certificate
| transparency lists): some ISP sold their DNS query log, and
| yours was in it.
|
| People buying such records do so for various reasons, for
| example to seed some crawler they've built.
| bashwizard wrote:
| Like people have said already: Certificate Transparency logs.
|
| There are countless tools to use for subdomain enumeration. I
| personally use subfinder or amass when doing recon on bug bounty
| targets.
| 3oil3 wrote:
| What happens if you google your subdomain? Maybe the bots have
| some sort of dictionary files and they just run them, and when
| there is a match, then they append it with some .html extension,
| or maybe they prepend it to the match as a subdomain of it?
| f4c39012 wrote:
| CSP headers can leak urls, but I assume that isn't the cause here
| if the subdomain is an entirely separate project
| ThePowerOfFuet wrote:
| Others are saying CT logs but my own subdomains are on wildcard
| certificates, in which case I suspect they are discovered by DPI
| analysis of DNS traffic and resold, such as by Team Cymru.
| BLKNSLVR wrote:
| There are a number of companies, not just Palo Alto Networks,
| that perform various different scales of scans of the entire IPv4
| space, some of them perform these scans multiple times per day.
|
| I set up a set of scripts to log all "uninvited activity" to a
| couple of my systems, from which I discovered a whole bunch of
| these scanner "security" companies. Personally, I treat them all
| as malicious.
|
| There are also services that track Newly Registered Domains
| (NRDs).
|
| Tangentially:
|
| NRD lists are useful for DNS block lists since a large number of
| NRDs are used for short term scam sites.
|
| My little, very amateur, project to block them can be found here:
| https://github.com/UninvitedActivity/UninvitedActivity
|
| Edited to add: Direct link to the list of scanner IP addresses
| (although hasn't been updated in 8 months - crikey, I've been
| busy longer than I thought):
| https://github.com/UninvitedActivity/UninvitedActivity/blob/...
| mr_mitm wrote:
| Getting the domain name from the IP address is not trivial,
| though. In fact, it should be impossible, if the name really
| hasn't been published (barring guessing attempts), so OP's
| question stands.
| venj wrote:
| I had this issue with internal domains indexed by Google. The
| domains were not published anywhere by my company. They were
| scanned by leakix.net, which apparently scans the whole web
| for vulnerabilities and publishes web pages containing the
| domain names associated with each IP address. I guess they
| read them from the certificates.
| jhart99 wrote:
| There is another source: certificates showing up on a server
| or load balancer during the TLS handshake. When a client
| connects without indicating a server name via SNI, some
| servers will reply with a default certificate or a list
| of valid server names.
| melevittfl wrote:
| The OP is misunderstanding what's happened, based on what's
| been posted. The OP has a server with an IP address. They're
| seeing GET requests in the server's logs and assuming people
| have found the server's DNS name.
|
| In fact, the scanners are simply walking the IP address
| space and sending GET requests to any IP address they
| find. No DNS discovery needed.
| alfiedotwtf wrote:
| Are you sure that's the case? IP addresses != domains, so
| I'm guessing the bots are including the obfuscated domain
| in the Host header of their requests.
|
| My guess is OP is using a public DNS server that sells
| aggregated user requests. All it takes is one request from
| their machine to a public machine on the internet, and it's
| now public knowledge.
| lxgr wrote:
| That entirely depends on whether the GET requests were
| providing the (supposed to be hidden) hostname in the
| `Host` header (and potentially SNI TLS extension).
| okasaki wrote:
| $ host 209.216.230.207
| 207.230.216.209.in-addr.arpa domain name pointer news.ycombinator.com.
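The `host` output above comes from a PTR query against the reversed in-addr.arpa name. A small sketch of how that name is constructed (the PTR lookup itself, e.g. via `socket.gethostbyaddr`, only succeeds when the address owner published a record):

```python
import socket  # only needed for the optional live lookup below

def reverse_name(ipv4: str) -> str:
    """Build the in-addr.arpa name that a PTR query for this
    IPv4 address uses: octets reversed, suffix appended."""
    return ".".join(reversed(ipv4.split("."))) + ".in-addr.arpa"

# Live lookup (network-dependent, works only if a PTR exists):
# socket.gethostbyaddr("209.216.230.207")  # news.ycombinator.com
```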
| mr_mitm wrote:
| Not sure what you are trying to tell me. This isn't
| guaranteed to work. If you define a reverse lookup record
| for your domain, then that counts as published in my book.
| drpossum wrote:
| This is correct.
| dspillett wrote:
| That works when there is an explicit PTR record. For instance,
| one of my assigned addresses can be named that way due to:
| 74.231.187.81.in-addr.arpa. 3600 IN PTR
| ns2.nogoodnamesareleft.com.
|
| in the zone file for that IPv4. But unless they've explicitly
| configured this, or are using a hosting service that does it
| without asking, it won't be what is happening.
|
| It isn't practical to do a reverse lookup from "normal"
| name-to-address records like
| ns2.nogoodnamesareleft.com. IN A 81.187.231.74
|
| (it is _possible_ to build a partial reverse mapping by
| collecting a huge number of DNS query results, but not
| really practical unless you are someone like Google or
| Cloudflare running a popular resolution service)
| DonHopkins wrote:
| I love how the ARPANET still lives on through reverse DNS
| PTRs.
|
| https://www.youtube.com/watch?v=V78GUSOS-EM
| yabones wrote:
| I do something similar. Any hits on the default nginx vhost get
| logged, logs get parsed out and "repeat offenders" get put on
| the shitlist. I use ipset/iptables but this can also be done
| with fail2ban quite simply.
|
| https://nbailey.ca/post/block-scanners/
| immibis wrote:
| This is security theater.
| Sohcahtoa82 wrote:
| Only kinda.
|
| Doing something like this can prevent you from showing up
| on Shodan.io which is used by many users/bots to find
| servers without running massive scans themselves.
| drpossum wrote:
| How does an ip scan help with general DNS resolution at all?
| lockhead wrote:
| Most likely passive DNS data: if you use your subdomain, you do
| DNS queries for it. If the DNS server you use to resolve your
| domains shares this data, it can be picked up by others.
| nusl wrote:
| It's pretty common to bruteforce subdomains of a domain you might
| be interested in, especially by attackers.
| xg15 wrote:
| TIL (from this thread) : You can abuse TLS handshakes to
| effectively reverse-DNS an IP address without ever talking to a
| DNS server! Is this built into dig yet? :)
|
| (Alright, _some_ IP addresses, not all of them)
|
| I also wonder if this is a potential footgun for eSNI
| deployments: If you add eSNI support to a server, you must
| remember to also make regular SNI mandatory - otherwise, an
| eavesdropper can just ask your server nicely for the domain that
| the eSNI encryption was trying to hide from it.
| yatralalala wrote:
| Lifehack - it's especially awesome in cases where the server
| operator is using self-signed certs / private cert authorities.
| Because you will not find these in public cert logs.
| _trampeltier wrote:
| Did you send a link over email, WhatsApp, or something like that?
| ralferoo wrote:
| If you're using HTTPS, then you're probably using letsencrypt and
| so your subdomain will appear on the CT logs that are publicly
| accessible.
|
| One thing you could do is use a wildcard certificate, and then
| use a non-obvious subdomain from that. I actually have something
| similar - in my set up, all my web-traffic goes to haproxy
| frontends which forward traffic to the appropriate backend, and I
| was sick of setting up multiple new certificates for each new
| subdomain, so I just replaced them all with a single wildcard
| cert instead. This means that I'm not advertising each new
| subdomain on the CT list. They all look nominally the same when
| visited - same holding page on index and same /api handling -
| but one of the subdomains decodes an additional URL path that
| provides access to status monitoring.
|
| Separately, that Palo Alto Networks company is a real pain. They
| connect to absolutely everything in their attempts to spam the
| internet. Frankly, I'm sick of even my mail servers being
| bombarded with HTTP requests on port 25 and the resultant log
| spam.
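The wildcard-plus-routing setup ralferoo describes can be sketched as a haproxy config fragment. All names, paths, and the obscure status subdomain here are hypothetical, not taken from the post:

```
# One wildcard cert serves every subdomain, so no per-subdomain
# CT log entry is ever created.
frontend https-in
    bind :443 ssl crt /etc/haproxy/wildcard.example.com.pem
    # Route one non-obvious name to a different backend...
    use_backend status if { hdr(host) -i status-x7k.example.com }
    # ...everything else gets the same holding page.
    default_backend holding-page

backend holding-page
    server web1 127.0.0.1:8080

backend status
    server mon1 127.0.0.1:9090
```

The frontend terminates TLS once with the wildcard cert; only the Host-header routing distinguishes the subdomains.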
| clvx wrote:
| Put it behind IPv6 and it likely won't happen again. The address
| space is massive.
| supermatt wrote:
| 1) Are you sure that they are using the subdomain? They could be
| connecting via IP or an alternate host address.
|
| 2) Are you using TLS? Unless you are using a wildcard cert, then
| the FQDN will have been published as part of the certificate
| transparency logs.
| mightybyte wrote:
| If you've made any kind of DNS entries involving this subdomain,
| then congratulations, you've notified the world of its existence.
| There are tools out there that leverage this information and let
| you get all the subdomains for a domain. Here's the first one I
| found in a quick search:
|
| https://pentest-tools.com/information-gathering/find-subdoma...
| yatralalala wrote:
| Hi, our company does this basically "as-a-service".
|
| The ways to find it are basically limitless. The best source
| is probably the Certificate Transparency project, as others
| suggested. But it does not end there; other things we do
| include internet crawls, domain bruteforcing on wildcard DNS,
| dangling vhost identification, default certs on servers (connect
| to an IP on 443 and get the default cert), and many others.
|
| Security by obscurity does not work. You cannot rely on "people
| won't find it". Once it's online, everyone can find it, no matter
| how you hide it.
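The Certificate Transparency source mentioned throughout this thread can be sketched against crt.sh's public JSON endpoint. The domain below is the hypothetical one from the post, and the response parsing is shown offline on a sample payload rather than a live query:

```python
import json
import urllib.parse

CRT_SH = "https://crt.sh/?output=json&q="

def ct_query_url(domain: str) -> str:
    """URL asking crt.sh for every logged cert naming *.domain
    (crt.sh uses '%' as its wildcard)."""
    return CRT_SH + urllib.parse.quote("%." + domain)

def names_from_ct(json_text: str) -> set:
    """Collect the distinct DNS names out of a crt.sh JSON
    response; each entry's name_value holds newline-separated
    names from one certificate."""
    names = set()
    for entry in json.loads(json_text):
        for name in entry.get("name_value", "").splitlines():
            names.add(name.lower())
    return names
```

Fetching `ct_query_url("sampledomain.com")` would reveal every non-wildcard subdomain that ever got a publicly logged certificate.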
| TZubiri wrote:
| "Security by obscurity does not work"
|
| This is one of those false voyeur OSS internet tenets designed
| to get people to publish their stuff.
|
| Obscurity is a fine strategy, if you don't post your source
| that's good. If you post your source, that's a risk.
|
| The fact that you can't rely on that security measure is just a
| basic security tenet that applies to everything: don't rely on
| a single security measure; use redundant barriers.
|
| Truth is, we don't know how the subdomain got leaked. Subdomains
| can be passwords, and a well-crafted subdomain should not leak;
| if it leaks, there is a reason.
| zevlag wrote:
| > Subdomains can be passwords and a well crafted subdomain
| should not leak,
|
| I disagree. A subdomain is not secret in any way. There are
| many ways in which it is transmitted unencrypted. A couple:
|
| - DNS resolution (multiple resolvers and authoritative servers)
| - TLS SNI
| - HTTP Host header
|
| There are many middle boxes that could perform safety checks
| on behalf of the client, and drop it into a list to be
| rescanned.
|
| - Virus scanners
| - Firewalls
| - Proxies
| dharmab wrote:
| I once worked for a company which was using a subdomain of
| an internal development domain to do some completely
| internal security research on our own products. The entire
| domain got flagged in Safe Browsing despite never being
| exposed to the outside world. We think Chrome's telemetry
| flagged it, and since it was technically routable as a
| public IP (all public traffic on that IP was blackholed),
| Chrome thought it was a public website.
| mkl95 wrote:
| I saw a similar thing happen with a QA team's domains.
| Google flagged them as malicious and the company never
| managed to get them unflagged.
| dharmab wrote:
| Our lawyers knew their lawyers so there was a friendly
| chat and we got added to an internal whitelist within
| Google.
| TZubiri wrote:
| > It's not encrypted in transit
|
| Agree.
|
| But who said that all passwords or shibboleths should be
| encrypted in transit?
|
| It can serve as a canary for someone snooping your traffic.
| Even if you encrypt it, you don't want people snooping.
|
| To date, for the subdomains I never publish, I haven't
| had anyone attempt to connect to them.
|
| It's one of those redundant measures.
|
| And it's also one of those risks that you take. You can
| maximize security by staying at home all day, but going out
| to take the trash is a calculated risk that you must take,
| or you risk overfocusing on security.
|
| It's similar to port knocking. If you are encrypting it,
| it's counterproductive, it's a low effort finishing touch,
| like a nice knot.
| lolinder wrote:
| Truth is we don't know _that_ the subdomain got leaked. The
| example user agent they give says that the methodology they're
| using is to scan the IPv4 space, which is a great example
| of why security through obscurity doesn't work here: The IPv4
| space is tiny and trivial to scan. If your server has an IPv4
| address it's not obscure, you should assume it's publicly
| reachable and plan accordingly.
|
| > Subdomains can be passwords and a well crafted subdomain
| should not leak, if it leaks there is a reason.
|
| The problem with this theory is that DNS was never designed
| to be secret and private and even after DNS over HTTPS it's
| _still_ not designed to be private for the servers. This
| means that getting to "well crafted" is an incredibly
| difficult task with hundreds of possible failure modes which
| need constant maintenance and attention--not only is it
| complicated to get right the first time, you have to
| reconfigure away the failure modes on every device or even on
| every use of the "password".
|
| Here are just a few failure modes I can think of off the top
| of my head. Yes, these have mitigations, but it's a game of
| whack-a-mole and you really don't want to try it:
|
| * Certificate transparency logs, as mentioned.
|
| * A user of your "password" forgets that they didn't
| configure DNS over HTTPS on a new device and leaves a trail
| of logs through a dozen recursive DNS servers and ISPs.
|
| * A user has DNS over HTTPS but doesn't point it at a server
| within your control. One foreign server having the password
| is better than dozens and their ISPs, but you don't have any
| control over that default DNS server nor how many different
| servers your clients will attempt to use.
|
| * Browser history.
|
| Just don't. Work with the grain, assume the subdomain is
| public and secure your site accordingly.
| immibis wrote:
| > The IPv4 space is tiny and trivial to scan
|
| Something many people don't expect is that the IPv6 space
| is also tiny and trivial to scan, if you follow certain
| patterns.
|
| For example, many server hosts give you a /48 or /64
| subnet, and your server is at your prefix::1 by default. If
| they have a /24 and they give you a /48, someone only has
| to scan 2^24 addresses at that host to find all the ones
| using prefix::1.
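immibis's arithmetic can be sketched directly, using the prefix lengths from the example above (one probe per delegated prefix, assuming every server sits at prefix::1):

```python
def probes_needed(provider_prefix_len: int, delegated_prefix_len: int) -> int:
    """Number of probes needed to hit prefix::1 inside every
    prefix a provider delegates out of its own allocation:
    2 ** (bits of delegated prefix beyond the provider's)."""
    return 2 ** (delegated_prefix_len - provider_prefix_len)

# A provider /24 delegated as customer /48s: 2**24 probes,
# versus 2**32 probes for the entire IPv4 internet.
```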
| Sayrus wrote:
| Assuming everyone is using /48 and binding to prefix::1,
| that's a 2^16 difference with scanning the IPv4 address
| space. Assuming a specific host with only one IPv6 /24
| block and delegating /64, this is a 2^12 difference.
| Scanning for /64 on the entire IPv6 space is definitely
| not as tiny.
|
| AWS only allows routing /80 to EC2 instances making a
| huge difference.
|
| It doesn't mean that we should rely on obscurity, but the
| entire space is not as tiny as IPv4's was.
| TZubiri wrote:
| Interesting, so you may see the IPv6 space as a tree, and
| go just for the first addresses of each block.
|
| But if you just choose a random address you would enjoy a
| bit more immunity from brute force scanners here.
| AStonesThrow wrote:
| IPv6 address space may be trivial from this perspective,
| but imagine trying to establish two-way contact with a
| user on a smartphone on a mobile network. Or a user whose
| Interface ID (64 bits) is regenerated randomly every few
| hours.
|
| Just try leaving a User Talk page message on Wikipedia,
| and good luck if the editor even notices, or anyone finds
| that talk page again, before the MediaWiki privacy
| measures are implemented.
| lyu07282 wrote:
| > Obscurity is a fine strategy
|
| > Subdomains can be passwords and a well crafted subdomain
| should not leak
|
| Your comment is really odd to read; I'm not sure I understand
| you, but I'm sure you don't mean it like that. Just to
| reiterate the important points:
|
| 1. Do not rely on subdomains for security, subdomains can
| easily leak in innumerable ways including in ways outside of
| your control.
|
| 2. Security by obscurity must never be relied on for security
| but can be part of a larger defense in depth strategy.
|
| ---
|
| https://cwe.mitre.org/data/definitions/656.html
|
| > This reliance on "security through obscurity" can produce
| resultant weaknesses if an attacker is able to reverse
| engineer the inner workings of the mechanism. Note that
| obscurity can be one small part of defense in depth, since it
| can create more work for an attacker; however, it is a
| significant risk if used as the primary means of protection.
| TZubiri wrote:
| It's a pretty weak CWE category.
|
| "The product uses a protection mechanism whose strength
| depends heavily on its obscurity, such that knowledge of
| its algorithms or key data is sufficient to defeat the
| mechanism."
|
| Defeating the mechanism is not very impactful if it's one
| stage of a multi-stage mechanism, especially if breaching
| that perimeter alerts the admin!
|
| Lots of uncreative blue teamers here
| yapyap wrote:
| > This is one of those false voyeur OSS internet tenets
| designed to get people to publish their stuff.
|
| No it isn't, it's a push to get people to login protect
| whatever they want to keep to themselves.
|
| It's silly to say informing people that security through
| obscurity is a weak concept is trying to convince them to
| publish their stuff.
| HeatrayEnjoyer wrote:
| If security through obscurity didn't provide any benefit
| then governments wouldn't have built entire frameworks for
| protecting classified information.
| ehutch79 wrote:
| So the only thing protecting classified docs is the
| public not knowing where they are? That's what security
| through obscurity is.
| yatralalala wrote:
| So many thoughts on that, but from my perspective: obscurity
| is OK, but you cannot depend on it at all.
|
| Great example is port knocking - it hides your open port from
| random nmap, but would you leave it as the only mechanism
| preventing people getting to your server? No. So does it make
| sense to have it? Well maybe, it's a layer.
|
| Kerckhoffs' principle comes to my mind as well here.
|
| So while I agree with you that obscurity is a fine strategy,
| you can never depend on it alone.
| marcosdumay wrote:
| As long as you don't go into "nah, I have another
| protection barrier, I don't need the best possible security
| for my main barrier" mode...
|
| Or in other words, if you place absolutely zero trust in
| it, consider it as good as broken by every single script
| kid, and publicly known, then yeah, it's fine.
|
| But then, why are you investing time into it? Almost
| everybody that makes low-security barriers is relying on
| it.
| sim7c00 wrote:
| Making things obscure and hard to find is indeed a sound
| choice, as long as it's not the single measure taken. I think
| people tout this sentence because it's popular to say,
| without thinking further.
|
| You don't put an unauthenticated thing on a difficult-to-find
| subdomain and call it secure. But your nicely secured page is
| more secure if it's also very tedious to find; it's less
| low-hanging fruit.
|
| As you state, there is always a leak needed. But the DNS
| system is quite leaky, and often sources won't fix it or
| won't admit it's even broken by design.
|
| Strong passwords are also insecure if they leak, so you
| obscure them from prying eyes, securing them by obscurity.
| TZubiri wrote:
| A lot of the pushback I'm seeing is that people are
| assuming that you always want to make things more secure.
| That security is a number that needs to go up, like income
| or profit, as opposed to numbers that need to go down, like
| cost and taxes.
|
| The possibility that I'm adding this feature to something
| that would otherwise have been published on a public domain
| does not cross people's mind, so it is not thought of an
| additional security measure, but a removal of a security
| feature.
|
| Similarly, it is assumed that there's an authentication
| mechanism (or lack of one) behind the subdomain. There may be
| a simple idempotent server running, such that there is no
| concern for abuse, but it may still be desirable to reduce the
| code executed by random scanners that only have an IP.
|
| This brings me again to the competitive economic take on
| the subject: people believe that this wisdom nugget they
| hold about "security by obscurity" is a valuable tenet,
| and they bet on it and desperately try to find someone to
| use it on. You can tell when a meme is overvalued because
| people try to use it on you even when it doesn't fit; it
| means they are dying to actually apply it.
|
| My bet is that "Security through obscurity" is undervalued,
| not as a rule or law, or a definite thing, but as a basic
| correlation: keep a low profile, and you'll be safer. If
| you want to get more sales, you will need to be a bit more
| open and transparent and that will expose you to more risk,
| same if you want transparency for ethical or regulation
| reasons. You will be less obscure and you will need to
| compensate with additional security mechanisms.
|
| But it seems evident to me that if you don't publish your
| shit, you are going to have much less risk, and need to
| implement fewer security mechanisms for the same risks, as
| compared to advertising your infrastructure and your
| business, duh.
| 1970-01-01 wrote:
| It's become an anti-cliche. Security via obscure technique is
| a valid security layer in the exact same way a physical lock
| tumbler will not unlock when any random key is inserted and
| twisted. It's not great but it's not terrible and it does a
| fine job until someone picks or breaks it open.
| gitgud wrote:
| I don't think that analogy works well, a subdomain that is
| not published is more like hiding the key to the front door
| in the garden somewhere... does a fine job of keeping the
| house secure until someone finds it...
| TZubiri wrote:
| Terrible analogy.
|
| Why not use letters and packages which is the literal
| metaphor these services were built on?
|
| It's like relying on public header information to
| determine whether an incoming letter or package is
| legitimate.
|
| If it says: To "Name LastName" or "Company", then it's
| probably legitimate. Of course it's no guarantee, but it
| filters the bulk of Nigerian Prince spam.
|
| It gets you past the junk box, but you don't have to
| trust it with your life.
|
| Nuance.
| lxgr wrote:
| Keeping a key secret is not security by obscurity, but
| keeping the existence of a lock secret is.
| legitster wrote:
| > "Security by obscurity does not work"
|
| Depends on the context and exposure. Sometimes a key under a
| rock is perfectly fine.
|
| I used to work for a security company that REALLY oversold
| security risks to sell products.
|
| The idea that someone was going to wardrive through your
| suburban neighborhood with a networked cluster of GPUs to
| crack your AES keys and run a MITM attack for web traffic is
| honestly pretty far fetched unless they are a nation-state
| actor.
| natebc wrote:
| Realistically we get into $3 wrench territory pretty
| quickly too.
| throwway120385 wrote:
| They could also just cut and tip both ends of the
| Ethernet cable I have running between my house and my
| outbuilding too. I probably wouldn't notice if I'm
| asleep.
| TZubiri wrote:
| Metaphor forgotten, but this is a very standard attack
| surface. You don't need to imagine such a close tap; just
| imagine that at any point in the multi-node internet an
| attacker has a node and snoops the traffic in its role as
| a relaying router.
| ninju wrote:
| With inflation it looks like it's now a $5 wrench :-)
|
| https://xkcd.com/538/
| natebc wrote:
| AmazonBasics is good enough in this case! ;)
| bob1029 wrote:
| Obscurity can be fantastic.
|
| One of my favorite patterns for sending large files around is
| to drop them in a public blob storage bucket with a type 4
| guid as the name. No consumer needs to authenticate or sign
| in. They just need to know the resource name. After a period
| of time the files can be automatically expired to minimize
| the impact of URL sharing/stealing.
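bob1029's pattern can be sketched in a couple of lines. The bucket base URL is hypothetical; the point is that a version 4 UUID carries 122 random bits, so the name itself is the access control:

```python
import uuid

def unguessable_object_url(bucket_base: str) -> str:
    """Name a public blob with a random (version 4) UUID.
    With 122 random bits, enumeration is hopeless even though
    the bucket requires no authentication."""
    return f"{bucket_base}/{uuid.uuid4()}"
```

Pair this with an object-lifecycle rule that expires files after a few days to limit the damage if a URL is shared or stolen.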
| unethical_ban wrote:
| No, it's a very sensible slogan to keep people from doing a
| common, bad thing.
|
| Obscurity helps cut down on noise and low effort attacks and
| scans. It only helps as a security mechanism in that the
| remaining access/error logs are both fewer and more
| interesting.
| TZubiri wrote:
| I definitely see its value as a very naive recommendation
| to avoid someone literally relying on an algorithmic or
| low-entropy secret - literally something you may learn in
| your first class on security.
|
| However on more advanced levels, a more common error is to
| ignore the risks of open source and being public. If you
| don't publish your source code, you are massively safer,
| period.
|
| I guess your view on the subject depends on whether you
| think you are ahead of the curve by taking the naive
| interpretation. It's like investing in the stock market
| based on your knowledge of supply and demand.
| batch12 wrote:
| Obscurity as a single control does not work. That's what the
| phrase hints at. In combination with other controls, it could
| be part of an effective defense. Context matters though.
| 0hijinks wrote:
| Depending on one's threat model, any technique can be a
| secure strategy.
|
| Is my threat model a network of dumb nodes doing automatic
| port scanning? Tucking a system on an obscure IPv6 address
| and never sharing the address may work OK. Running some
| bespoke, unauthenticated SSH-over-Carrier-Pigeon (SoCP)
| tunnel may be fine. The adversaries in the model are pretty
| dumb, so intrusion detection is also easy.
|
| But if the threat model includes any well-motivated,
| intelligent adversary (disgruntled peer, NSA, evil ex-
| boyfriend), it will probably just annoy them. And as a bonus,
| for my trouble, it will be harder to maintain going forward.
| TZubiri wrote:
| It's a bit more complex than that as well. You might have
| attackers of both types and different datapoints that have
| different security requirements. And these are not
| necessarily scalars, you may need integrity for one,
| privacy for the other.
|
| Even when considering hi sophistication attackers, and
| perhaps especially with regards to them, you may want to
| leave some breadcrumbs for them to access your info.
|
| If the deep state wants my company's info, they can safely
| get it by subpoenaing my provider's info, I don't need to
| worry about them as an attacker for privacy, as they have
| the access to the information if needed.
|
| If your approach to security is to add cryptography
| everywhere, make everything as secure as possible, and
| imagine that you are up against a nation-state adversary
| (or conversely, to add security until you satisfy a
| requirement commensurate with your adversary), then you are
| literally reducing one of the most important design
| requirements of your system to a single scalar that you
| attempt to maximize while not compromising other tradeoffs.
|
| A straightforward lack of nuance. It's like having a tax
| strategy consisting of number go down, or pricing strategy
| of price go up, or cost strategy of cost go down, or risk
| strategy of no risk for me, etc...
| lxgr wrote:
| The only thing you're definitely complicating with security
| by obscurity is getting a clear picture of your own security
| posture.
| wolrah wrote:
| > "Security by obscurity does not work"
|
| The saying is "security by obscurity is not security" which
| is absolutely true.
|
| If your security relies on the attacker not finding it or not
| knowing how it works, it's not actually secure.
|
| Obscurity has its own value of course, I strongly recommend
| running any service that's likely to be scanned for regularly
| on non-standard ports wherever practical simply to reduce the
| number of connection logs you need to sort through. Obscurity
| works for what it actually offers. That has nothing to do
| with security though, and unfortunately it's hard in cases
| where a human is likely to want to type in your service
| address because most user-facing services have little to no
| support for SRV records.
|
| Two of the few services that do have widespread SRV support
| are SIP VoIP and Minecraft, and coincidentally the former is
| my day job while I've also run a personal Minecraft server
| for over a decade. I can say that the couple of systems I
| still have running public-facing SIP on port 5060 get scanned
| tens of thousands of times per hour while the ones running on
| non-standard ports get maybe one or two activations of
| fail2ban a month. Likewise my Minecraft server has never seen
| a single probe from anyone other than an actual player.
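The SRV indirection wolrah describes can be sketched as a zone fragment. All names and the non-standard ports below are made up for illustration:

```
; Clients that support SRV learn the real host and port from DNS,
; so the service can live on a non-standard port that scanners
; sweeping well-known ports never probe.
_minecraft._tcp.example.com. 3600 IN SRV 0 5 31337 mc.example.com.
_sip._udp.example.com.       3600 IN SRV 0 5 15060 sip.example.com.
```

The four SRV fields after the TTL and class are priority, weight, port, and target; the underscore-prefixed labels encode the service and transport protocol.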
| TZubiri wrote:
| >"If your security relies on "
|
| Again, if your security relies on any one thing, it's a
| problem. A secure system needs redundant mechanisms.
|
| Can you think of a single mechanism that if implemented
| would make a system secure? I think not.
| Diggsey wrote:
| This is the worst take...
|
| People consistently misuse the Swiss cheese security metaphor
| to justify putting multiple ineffective security barriers in
| place.
|
| The holes in the cheese are supposed to represent _unknown_
| or very difficult to exploit flaws in your security layers,
| and that 's why you ideally want multiple layers.
|
| You can't just stack up multiple known to be broken layers
| and call something secure. The extra layers are inconvenient
| to users and readily bypassed by attackers by simply tackling
| them one at a time.
|
| Security by obscurity is one such layer.
| TZubiri wrote:
| So according to you, a picket fence or a wire fence is just
| a useless thing that makes things less usable by users?
|
| Security does not consist only of 100% or 99.99% effective
| mechanisms, there needs to be a flow of information and an
| inherent risk, if you are only designing absolute barriers,
| then you are rarely considering the actual surface of
| relevant user interactions. A life form consisting only of
| skin might be very secure, but it's practically useless.
| nkmnz wrote:
| > other things that we do are things like (...) domain
| bruteforcing on wildcard dns
|
| Are you proud of the work you do?
| remlov wrote:
| If you look at the company they founded, it's a service to
| protect yourself, not to willy-nilly go out onto the open web
| to find hidden subdomains.
| tmerc wrote:
| Why would enumerating a wildcard dns through brute force be
| something that evokes pride or shame?
| yatralalala wrote:
| I sadly did not see the comment above, but I'd like to add
| that these bruteforce and sniffing methods are targeted
| only at our paying customers.
|
| We built global reverse-DNS dataset solely from cert
| transparency logs. Our active scanning/bruteforcing runs
| only for assets owned by our customers.
| 6stringmerc wrote:
| ...as long as your tools are only in your hands to be
| used, correct? Once a tool is created and used on a
| machine with access to the greater internet, doesn't your
| logic hold that its security is compromised inherently?
| Not saying you have been infiltrated, or a rogue employee
| has cleverly exported a copy or the methodology to
| duplicate it off-site, but I'm not saying that hasn't
| happened either.
| cryptonector wrote:
| It's not that hard to write this code. It's not a nuclear
| weapon.
| lkt wrote:
| You can find a dozen projects on Github that do this,
| it's not sensitive information that needs protecting
| ivell wrote:
| Irrespective of whether they are proud of what they are
| doing, I found the post helpful and educational. Let's not
| prevent people from sharing their knowledge, as it might help
| us protect ourselves. A consequence of such a line of
| questioning would be that in the future they would be
| hesitant to share their knowledge to avoid being judged.
| lxgr wrote:
| Given that bad actors can also do this, I'd say that publicly
| advertising the fact and thereby drawing attention to
| misconceptions about security is a net good thing.
| amelius wrote:
| Well, I sure hope the remainder of my URLs are safe.
| amelius wrote:
| Like, in: example.com/secret-id-48723487345
|
| I hope the last bit is not leaked somehow (?)
|
| Btw, we need a "falsehoods programmers believe about URLs"
| ...
|
| Although there is: https://www.netmeister.org/blog/urls.html
| idoubtit wrote:
| > Although there is:
| https://www.netmeister.org/blog/urls.html
|
| I think the section named "Pathname" is wrong. It describes
| the path of an URL as if every server was Apache serving
| static files with its default configuration. It should
| describe how the path is converted into a HTTP request.
|
| For instance, the article states that "all of these go to
| the same place: https://example.org https://example.org/
| https://example.org//
| https://example.org//////////////////". That's wrong. A web
| client sends a distinct HTTP request for each case, e.g.
| starting with `GET // HTTP/1.1`, so the server will receive
| distinct paths. The assertion of "going to the same place"
| makes no sense in the general case.
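The point about distinct request paths is easy to check; a minimal sketch with Python's standard `urllib.parse`, showing that the request-target the client would send differs for each URL:

```python
from urllib.parse import urlsplit

# Each URL below yields a different path component, and therefore
# a different request-target on the wire: "/", "//", and "//////".
paths = [urlsplit(u).path for u in (
    "https://example.org/",
    "https://example.org//",
    "https://example.org//////",
)]
print(paths)  # ['/', '//', '//////']
```

Whether those paths end up "in the same place" is purely a server-side routing decision, not a property of the URL.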
| 1970-01-01 wrote:
| Subdomainfinder.com ??
|
| Dozens of others will also find it.
|
| Really, it's this simple today.
| binarymax wrote:
| I think your comment resulted in a hug of death for that
| service ;)
| yatralalala wrote:
| Sorry for a bit of self-promo, but just to explain: we run
| https://reconwave.com/, basically an EASM product but more
| focused on the network/DNS/setup level.
|
| Finding all things about domains is one of the things that we
| do. And yes, it's very easy.
|
| There are many services like subdomainfinder - i.e.
| dnsdumpster and merklemap. We built our own as well on
| https://search.reconwave.com/. But it's a side project and it
| does not pay our bills.
| cryptonector wrote:
| > Security by obscurity does not work. You can not rely on
| "people won't find it". Once it's online, everyone can find it.
| No matter how you hide it.
|
| Especially do not name your domain names in a way that leaks
| MNPI! Like, imagine if publicly traded companies A and B were
| discussing a merger or acquisition; do not name your domain
| A-and-B.com, m'kay?
| cryptonector wrote:
| DANE would help here: register a harmless sounding domainname
| whose name leaks nothing, use DNSSEC and NSEC3, and host your
| hidden service in a sub-domain whose name is a 63 byte long
| string of randomly selected ASCII characters. But this isn't
| really an option.
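As a sketch of the kind of label meant here, assuming lowercase letters and digits provide enough entropy for the purpose (DNS labels are limited to 63 octets):

```python
import secrets
import string

# Letters and digits are safe in any DNS label.
ALPHABET = string.ascii_lowercase + string.digits

def random_label(length=63):
    """Generate a maximum-length random DNS label, usable as an
    unguessable sub-domain name as described above."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

At 63 characters over a 36-symbol alphabet, brute-force enumeration of the label is infeasible; the remaining leak vectors are CT logs (hence the wildcard cert) and zone walking (hence NSEC3).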
| Dylan16807 wrote:
| Why the DNSSEC, which then requires NSEC3? Shouldn't a
| wildcard certificate do the job in conjunction with normal
| unsigned DNS?
| no-dr-onboard wrote:
| Hi, former pentester here. If any one of your trusted clients
| is using a google/chromium based browser, the telemetry from
| that browser (webdiscovery) would reveal the existence of the
| subdomain in question. As others have said, security by
| obscurity doesn't work.
| geek_at wrote:
| Current pen tester here, and this guy is right. There was a
| Google blog post years ago where Google planted a site with
| an unguessable URL, indexed it, and used Edge to surf the
| site. Shortly after, the site was also listed on Bing.
|
| Google had a "gotcha" moment when Microsoft responded,
| basically, with "yeah, we didn't steal it from Google, you
| had telemetry enabled".
|
| Total shitshow
| AtNightWeCode wrote:
| So, to mostly prevent this:
|
| Disable direct IP access. Use wildcard certificates. Don't use
| guessable subdomains like www or mail.
| AtNightWeCode wrote:
| Assuming this is not direct traffic to your IP, people will
| say it is because of TLS logs. Maybe it is in your case. But
| if you spin up a CF worker on a subdomain, you will also get
| hit by traffic immediately, and those certificates are
| wildcards. I think CF leaks subdomains in some cases. I've
| never seen this behavior when using CF just as a DNS server,
| though.
| jcalx wrote:
| Some bots scan using giant lists of subdomains, e.g.
| https://github.com/danielmiessler/SecLists/tree/master/Disco....
| Your subdomain may be on that giant combined_subdomains list, or
| perhaps some other lists that other tools use.
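A minimal sketch of this kind of wordlist scan, assuming one plain resolver lookup per candidate (real tools parallelize heavily and query dedicated resolvers):

```python
import socket

def brute_force_subdomains(domain, wordlist):
    """Try to resolve each candidate label under `domain`; any name
    that resolves exists in DNS, whether or not it was published."""
    found = []
    for word in wordlist:
        host = f"{word}.{domain}"
        try:
            socket.gethostbyname(host)
            found.append(host)
        except socket.gaierror:
            # NXDOMAIN / lookup failure: the name likely doesn't exist.
            pass
    return found
```

With a list like SecLists' combined_subdomains, a name such as "userfileupload" is well within reach of this approach.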
| TZubiri wrote:
| Maybe it's a cloudflare controlled scanner?
|
| Maybe you published the subdomain in a cert?
|
| Snooped traffic is unlikely.
|
| This is a good question, if you don't publish a subdomain,
| scanners should not reach it. If they do, there's a leak in your
| infra.
| CGamesPlay wrote:
| Be careful with these. I had a subdomain like this (completely
| unlisted) with a Google OAuth flow on it, using a
| development-mode Google app. Somehow, the domain was
| discovered, and Google decided that using their OAuth flow was
| a phishing scam and delisted my entire top-level domain as a
| result!
| yoavm wrote:
| What do you mean "careful with these"? With subdomains?
| CGamesPlay wrote:
| Yes, unlisted subdomains. I updated my post to be clearer.
| joshstrange wrote:
| I must be missing something. What does "unlisted" mean in
| this context?
|
| I have plenty of subdomains I don't "advertise" (tell
| people about online) but "unlisted" is a weird thing to
| call those. Also I don't see how it would matter at all
| when it comes to Google auth.
|
| My guess is they blocked it based on the subdomain name
| itself. I made a "steamgames" subdomain to list Steam games
| I have extra copies of (from bundles) for friends to grab
| for free. Less than a day after I put it up, I started
| getting Chrome scare pages. I switched it to "games" and
| there have been no issues.
| fsflover wrote:
| Could it be that Chrome shared the web page with advertisers?
|
| https://www.ghacks.net/2021/03/16/wonder-about-the-data-goog...
| perching_aix wrote:
| Using the Certificate Transparency logs I'd imagine.
|
| Also note that _your domains are live_ as they're allocated
| (they exist). Whether a web server or anything else actually
| backs them is a different question entirely.
|
| For "secret" subdomains, you'll want a wildcard certificate. That
| way only that will show on the CT logs. Note that if you serve
| over IPv4, the underlying host will be eventually discovered
| anyways by brute-force host enumeration, and the domain can still
| be discovered using dictionary attacks / enumeration.
|
| Never touched Cloudflare so this is as far as I can help you.
| immibis wrote:
| In addition to what other people said, you can assume
| Cloudflare is selling lists of DNS names to someone.
| bbarnett wrote:
| If you ever email a link and it hits gmail, Google will index it.
| whalesalad wrote:
| ICANN zone files -
| https://www.icann.org/resources/pages/czds-2014-03-03-en
| zeagle wrote:
| Can I ask an adjacent question? I have a bunch of DNS A
| record entries for locallyaccessedservice.mydomain.tld
| pointing to my 10.0.0.x NAS's nginx reverse proxy, so I can
| use HTTPS and DNS to access them locally and via Tailscale.
| My cert is for *.domain.tld. It's nothing critical and only
| accessible within my LAN, but is there any reason I shouldn't
| be doing this from a security point of view? I guess someone
| could phish that to another globally accessible server if DNS
| changed and I wouldn't notice, but I don't see how that would
| be an issue. There are a couple of nginx services exposed to
| the public, but not those specific domains, so I guess that
| is an attack vector.
| yatralalala wrote:
| As always, it depends on your threat model. Generally, having
| private IPs in public DNS is not great, because a potential
| attacker gets "a general idea" of what your private net looks
| like.
|
| But I'd say there's no issue if everything else is secured
| properly.
| zeagle wrote:
| Great thank you. I've mulled around running separate reverse
| proxies for public and internal services instead.
| Gabrys1 wrote:
| > Expanse, a Palo Alto Networks company, searches across the
| global IPv4 space multiple
|
| So my guess is reverse DNS
| itscrush wrote:
| > I am using CloudFlare for my DNS.
|
| Based on this it sounds like you exposed your resource and
| advertised it for others. Reverse dns, get IP, scan IP.
|
| Probably simpler: you exposed a resource on IPv4 publicly;
| if it exists, it'll be scanned. There are probably hundreds
| of companies scanning the entire 0.0.0.0/0 space at all
| times.
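The scanning itself is trivial. As a sketch, this is the building block such scanners repeat across every routable IPv4 address and a handful of common ports:

```python
import socket

def tcp_probe(ip, port=443, timeout=0.5):
    """Return True if something answers a TCP connection on ip:port.
    Internet-wide scanners run probes like this across 0.0.0.0/0."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: nothing (reachable) there.
        return False
```

Once a host answers, the scanner can grab the TLS certificate it serves, which often names the domains behind the IP.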
| eat wrote:
| DNS enumeration (brute force) with a good wordlist, zone
| transfer, or leaking the name through a certificate served when
| accessing your host via IP address are all possibilities.
|
| The name "userfileupload" is far from not-obvious, so that would
| be my guess.
| aspbee555 wrote:
| Cloudflare uses certificates with numerous other site names
| included as alt names, so your site name could have been
| discovered by any other site that happens to use that same
| cert.
| 1vuio0pswjnm7 wrote:
| Why not experiment with multiple variations. For example, as part
| of the experiment, run own DNS, use non-standard DNS encryption
| like CurveDNS, or even no DNS at all, use non-standard port for
| HTTPS, self-signed CA, TLS with no SNI extension, or even
| TCPCurve instead of CAs and TLS. If non-discoverability is the
| goal, there are infinite ways to deviate from web developer
| norms.
|
| If "the internet fails to find the subdomain" when using non-
| standard practices and conventions then perhaps "following the
| internet's recommendations", e.g., use Cloudflare, etc., might be
| partially at cause for discoverability.
|
| Would be surprised if Expanse scans more than a relatively small
| selection of common ports.
| codazoda wrote:
| This discussion makes me wonder, how hard is it to find a Google
| Document that was shared with "Anyone with the link"?
| oliwarner wrote:
| Certificate Transparency would also be my guess. These are logs
| published by big TLS certificate issuers to cross-check and make
| sure they're not issuing certificates for domains they have no
| standing on.
|
| The way around this is to issue a wildcard for your root domain
| and use that. Your main domain is discoverable but your subs
| aren't.
|
| There are other routes: leaky extensions, leaky DNS servers, bad
| internet security system utilities that phone home about traffic.
| Who knows?
|
| Unless your IP address redirects to your subdomain --not unheard
| of-- it's not somebody IP/port scanning. Webservers don't
| typically leak anything about the domains they serve for.
___________________________________________________________________
(page generated 2025-03-07 23:01 UTC)