[HN Gopher] Ask HN: How did the internet discover my subdomain?
___________________________________________________________________
Ask HN: How did the internet discover my subdomain?
I have a domain that is not live. As expected, loading the domain
returns: Error 1016. However...I have a subdomain with a not
obvious name, like: userfileupload.sampledomain.com This subdomain
IS LIVE but has NOT been publicized/posted anywhere. It's a custom
URL for authenticated users to upload media with presigned url to
my Cloudflare r2 bucket. I am using CloudFlare for my DNS. How
did the internet find my subdomain? Some sample user agents are:
"Expanse, a Palo Alto Networks company, searches across the global
IPv4 space multiple times per day to identify customers'
presences on the Internet. If you would like to be excluded from
our scans, please send IP addresses/domains to:
scaninfo@paloaltonetworks.com", "Mozilla/5.0 (Macintosh; U; Intel
Mac OS X 10_7; en-us) AppleWebKit/534.20.8 (KHTML, like Gecko)
Version/5.1 Safari/534.20.8", "Mozilla/5.0 (Linux; Android 9; Redmi
Note 5 Pro) AppleWebKit/537.36 (KHTML, like Gecko)
Chrome/76.0.3809.89 Mobile Safari/537.36". The bot requests are
GETs, which are failing as designed, but I'm wondering how the
bots even knew the subdomain existed?!
Author : govideo
Score : 220 points
Date : 2025-03-06 22:34 UTC (1 day ago)
| codingdave wrote:
| If it is on DNS, it is discoverable. Even if it were not, the
| message you pasted says outright that they scan the entire IP
| space, so they could be hitting your server's IP without having a
| clue there is a subdomain serving your stuff from it.
| govideo wrote:
| Ahh yeah, my internet network knowledge was never super strong,
| and now is rusty to boot. Thanks for your note.
| paulnpace wrote:
| Shouldn't the web server only respond to a configured domain,
| else 404?
| precommunicator wrote:
| Depends on whether it's configured like that; by default, usually not
| EQYV wrote:
| Question: How does a subdomain get discovered by a member of
| the public if there are no references to it anywhere online?
|
| The only thing I can think of that would let you do that would
| be a DNS zone transfer request, but those are almost always
| disallowed from most origin IPs.
|
| https://en.m.wikipedia.org/wiki/DNS_zone_transfer
| arccy wrote:
| you also have zone walking with DNS NSEC
|
| https://www.domaintools.com/resources/blog/zone-walking-
| zone...
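To make the zone-walking idea concrete, here's a toy sketch in Python. The names are hypothetical, and a plain dict stands in for real NSEC records, each of which points at the next owner name in the zone:

```python
def walk_nsec(next_name, apex):
    """Enumerate a zone by following the NSEC 'next owner' links.

    Each NSEC record names the next owner in canonical order, and the
    last one wraps back to the apex, so following the chain yields
    every name in the zone -- including "secret" subdomains.
    """
    names, current = [], apex
    while True:
        names.append(current)
        current = next_name[current]
        if current == apex:  # chain wrapped around: zone fully walked
            return names

# Hypothetical NSEC chain for the zone sampledomain.com:
chain = {
    "sampledomain.com": "mail.sampledomain.com",
    "mail.sampledomain.com": "userfileupload.sampledomain.com",
    "userfileupload.sampledomain.com": "www.sampledomain.com",
    "www.sampledomain.com": "sampledomain.com",
}
print(walk_nsec(chain, "sampledomain.com"))
```

NSEC3 hashes the owner names to make this harder, though dictionary attacks against the hashes remain possible.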
| dnsfax wrote:
| Certificate transparency logs.
| yatralalala wrote:
| See my comment above
| https://news.ycombinator.com/item?id=43289743 there are many
| techniques!
| dnsfax wrote:
| If you know what to query, sure. You can't just say "give me
| all subdomains"; it doesn't work that way. The subdomain was
| discovered via certificate transparency logs.
| alexjplant wrote:
| > If it is on DNS, it is discoverable.
|
| In the context of what OP is asking this is not true. DNS zones
| aren't enumerable - the only way to reliably get the complete
| contents of the zone is to have the SOA server approve a zone
| transfer and send the zone file to you. You can ask if a record
| in that zone exists but as a random user you can't say "hand
| over all records in this zone". I'd imagine that tools like
| Cloudflare that need this kind of functionality perform a
| dictionary search since they get 90% of records when importing
| a domain but always seem to miss inconspicuously-named ones.
|
| > Even if it were not, the message you pasted says outright
| that they scan the entire IP space, so they could be hitting
| your server's IP without having a clue there is a subdomain
| serving your stuff from it.
|
| This is likely what's happening. If the bot isn't using SNI or
| sending a host header then they probably found the server by
| IP. The fact that there's a heretofore unknown DNS record
| pointing to it is of no consequence. *EDIT: Or the Cert
| Transparency log as others have mentioned, though this isn't
| DNS per se. I learn something new every day :o)
| fulafel wrote:
| In practice it's not so far-fetched: a zone transfer is just
| another DNS query at the protocol level; I suppose you can
| conceptually view it as sending a file if you consider the
| DNS response a file. Something like "host -t axfr my.domain
| ns1.my.domain" will show the zone depending on how a domain's
| name server is configured (eg in bind, allow-transfer
| directive can be used to make it public, require ip acl to
| match the query source, etc).
| elric wrote:
| No sensible DNS provider has zone transfers enabled by
| default. OP mentioned using CloudFlare, and they certainly
| don't.
| alexjplant wrote:
| > in bind, allow-transfer directive
|
| Configuring BIND as an authoritative server for a corporate
| domain when I was a wee lad is how I learned DNS. It was
| and still is bad practice to allow zone transfers without
| auth. If memory serves I locked it down between servers via
| key pairs.
| walrus01 wrote:
| > In the context of what OP is asking this is not true. DNS
| zones aren't enumerable - the only way to reliably get the
| complete contents of the zone is to have the SOA server
| approve a zone transfer and send the zone file to you.
|
| This is _generally_ true, but if you watch authoritative-
| only DNS server logs for text strings matching ACL
| rejections, there's plenty of things out there which are
| fully automated crawlers _attempting_ to do entire zone
| transfers.
|
| There are a non-zero number of improperly configured
| authoritative dns servers out there on the internet which
| will happily give away a zone transfer to anyone who asks for
| it, at least, apparently enough to be useful that somebody
| wrote crawlers for it. I would guess it's only a few percent
| of servers that host zonefiles but given the total size of
| the public Internet, that's still a lot.
| majke wrote:
| In the context of DNSSEC, DNS zones are very much
| enumerable. Cloudflare does amazing tricks to avoid this:
| https://blog.cloudflare.com/black-lies/
| eqvinox wrote:
| Cloudflare themselves gives more information here:
|
| > NSEC3 was a "close but no cigar" solution to the
| problem. While it's true that it made zone walking
| harder, it did not make it impossible. Zone walking with
| NSEC3 is still possible with a dictionary attack.
|
| So, hardening it against enumerability is a question of
| inserting non-dictionary names.
| yatralalala wrote:
| Zone transfers are a super interesting topic. Thanks for
| mentioning that.
|
| They're basically the way to get all the DNS records a DNS
| server has. Interestingly, in some countries this is illegal
| and in others it's considered best practice.
|
| Generally, enabled zone transfers are considered a
| misconfiguration and should be disabled.
|
| We did research on that a few months back and found that 8%
| of all global name servers have it enabled.[0]
|
| [0] - https://reconwave.com/blog/post/alarming-prevalence-of-
| zone-...
| stwrzn wrote:
| That's concerning. I thought everyone knew that zone
| transfers should generally be disallowed, especially when
| coming from random hosts.
| Kikawala wrote:
| Is it available under HTTPS? Then it's probably in a Certificate
| Transparency log.
| govideo wrote:
| Yes, https via cloudflare's automatic https. Thanks for the
| info.
| thisisgvrt wrote:
| Automated agents can tail the certificate log to discover new
| domains as the certs are issued. But if you want to explore
| subdomains manually, https://crt.sh/ is a nice tool.
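As a sketch of what such tooling does with the data: crt.sh has a JSON endpoint (https://crt.sh/?q=%25.sampledomain.com&output=json), where each entry's name_value field holds one or more newline-separated certificate names. The entries below are made-up sample data in that shape, not real log output:

```python
def extract_names(ct_entries):
    """Collect unique DNS names from crt.sh-style JSON entries.

    Each entry's "name_value" field may hold several newline-separated
    names (the cert's SANs), including wildcards like *.example.com.
    """
    names = set()
    for entry in ct_entries:
        for name in entry.get("name_value", "").splitlines():
            names.add(name.strip().lower())
    return sorted(names)

# Parsed form of hypothetical crt.sh JSON output:
sample = [
    {"name_value": "sampledomain.com\nwww.sampledomain.com"},
    {"name_value": "userfileupload.sampledomain.com"},
]
print(extract_names(sample))
```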
| snailmailman wrote:
| Yeah, this is a surprisingly little-known fact: all certs
| being logged means all subdomain names get logged.
|
| Wildcard certs can hide the subdomains, but then your cert
| works on _all_ subdomains. This could be an issue if the
| certs get compromised.
|
| Usually there isn't sensitive information in subdomain names,
| but I suspect it often accidentally leaks information about
| infrastructure setups. "vaultwarden.example.com" existing
| tells you someone is probably running a vaultwarden instance,
| even if it's not publicly accessible.
|
| The same kind of info can leak via dns records too, I think?
| tialaramex wrote:
| > The same kind of info can leak via dns records too, I
| think?
|
| That's correct. "Passive DNS" is sold by many large public
| DNS providers. They tell you (for a fee) what questions
| were asked and answered that meet your chosen criteria. So
| e.g. maybe you're interested, what questions and answers
| matched A? something.internal.bigcorp.example in February
| 2025.
|
| They won't tell you who asked (IP address, etc.) but
| they're great for discovering that even though it says 404
| for you, bigcorp.famous-brand-hr.example is checked
| regularly by _somebody_, probably BigCorp employees who
| aren't on their VPN - suggesting very strongly that
| although BigCorp told Famous Brand HR not to list them as a
| client that is in fact the HR system used by BigCorp.
| Arrowmaster wrote:
| I had coworkers at a previous employer go change settings
| in CloudFlare trying to troubleshoot instead of reaching
| out to me. They changed the option that caused CF proxy to
| issue a cert for every subdomain instead of using the
| wildcard. They didn't understand why I was pissed that they
| had now written every subdomain we had in use to the public
| record in addition to doing it without an approved change
| request.
| yatralalala wrote:
| If you're using infra in a way [cloudflare -> your VM] I'd
| recommend setting firewall on the VM in a way that it can be
| accessed only from Cloudflare.
|
| This way, you will force everyone to go through Cloudflare
| and utilize all those fancy bot blocking features they have.
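A minimal sketch of that origin-side check in Python, using two example ranges; the authoritative list lives at https://www.cloudflare.com/ips/, and in practice you'd enforce this in the firewall itself rather than in application code:

```python
from ipaddress import ip_address, ip_network

# Example ranges only -- fetch the current list from
# https://www.cloudflare.com/ips/ before relying on this.
CLOUDFLARE_RANGES = [ip_network(n) for n in ("173.245.48.0/20", "104.16.0.0/13")]

def from_cloudflare(ip: str) -> bool:
    """True if the client IP falls inside one of the allowed ranges."""
    addr = ip_address(ip)
    return any(addr in net for net in CLOUDFLARE_RANGES)

print(from_cloudflare("104.16.1.1"))   # inside 104.16.0.0/13
print(from_cloudflare("203.0.113.9"))  # TEST-NET address, not allowed
```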
| system2 wrote:
| Do you know how to access these logs?
| Kikawala wrote:
| Answered below, but https://crt.sh/ is what I use.
| daneel_w wrote:
| https://crt.sh/ is one example, if you sign using e.g. Let's
| Encrypt.
| Eikon wrote:
| https://www.merklemap.com/
| daggersandscars wrote:
| DNS query type AXFR allows for subdomain querying. There are
| security restrictions around who can do it on what DNS servers.
| Given the number of places online one can run a subdomain query,
| I suspect it's mostly a matter of paying the right fees to the
| right DNS provider.
| artursapek wrote:
| presumably it has a DNS record
| vince14 wrote:
| I'm having the same issue.
|
| https://securitytrails.com/ also had my "secret" staging
| subdomain.
|
| I made a catch-all certificate, so the subdomain didn't show up
| in CT logs.
|
| It's still a secret to me how my subdomain ended up in their
| database.
| selcuka wrote:
| They could be purchasing DNS query logs from ISPs.
| arccy wrote:
| maybe your server responded to a plain ip addressed request
| with the real name...
| averageRoyalty wrote:
| Host header is a request header, not a response one, isn't
| it?
| fc417fc802 wrote:
| He said he used a wildcard cert though. So what part of the
| response would contain the subdomain in that case?
| johnklos wrote:
| Serious question: Do you really think that Cloudflare is trying
| to keep these kinds of thing private? If so, I'd suggest that's
| not a reasonable expectation.
| fc417fc802 wrote:
| Related question (not rhetorical). If you do DNS for
| subdomains yourself (and just use Cloudflare to point
| dns.example.com at your box) will the subdomain queries leak
| and show up in aggregate datasets? What I'm asking is if
| query recursion is always handled locally or if any of the
| reasonably common software stacks resolve it remotely.
| immibis wrote:
| As well as assuming Cloudflare sells DNS lists, it's
| probably safe to assume the operators of public resolvers
| like 8.8.8.8, 9.9.9.9 and 1.1.1.1 (that is Google, Quad9
| and Cloudflare again) are looking at their logs and either
| selling them or using them internally.
| parliament32 wrote:
| Certificate Transparency logs, or they don't actually know the
| domain name: just port-scanning[1] then making requests to open
| web ports.
|
| [1] Turns out you can port-scan the entire internet in under 5
| minutes: https://github.com/robertdavidgraham/masscan
| andix wrote:
| Port scanning usually can't discover subdomains. Most servers
| don't expose the names of the domains they serve content for. In
| the case of HTTP, they usually only serve the subdomain's content
| if the Host: request header includes it.
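The behavior described above boils down to a lookup with a fallback. A toy sketch (the vhost names are hypothetical) showing why a bare-IP scan hits the default site rather than the named one:

```python
def pick_vhost(host_header, vhosts, default="default-site"):
    """Name-based virtual hosting: choose a site by the Host: header,
    falling back to the server's default/catch-all vhost."""
    return vhosts.get((host_header or "").strip().lower(), default)

vhosts = {"userfileupload.sampledomain.com": "upload-app"}

print(pick_vhost("userfileupload.sampledomain.com", vhosts))  # upload-app
print(pick_vhost(None, vhosts))  # bare-IP scan, no Host header: default-site
```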
| hombre_fatal wrote:
| Most servers just listen on :80 and respond to all requests.
| Almost nobody checks the Host header intentionally; it's just
| a happy mistake if they use a reverse proxy.
|
| You can often decloak servers behind Cloudflare because of
| this.
|
| But OP's post already answered their question: someone
| scanned ipv4 space. And what they mean is that a server they
| point to via DNS is receiving requests, but DNS is a red
| herring.
| andix wrote:
| This really depends on the setup. Most web servers host
| multiple virtual hosts. IP addresses are expensive.
|
| If you're deploying a service behind a reverse proxy, it
| either must be only accessible from the reverse proxy via
| an internal network, or check the IP address of the reverse
| proxy. It absolutely must not trust X-Forwarded-For:
| headers from random IPs.
| hombre_fatal wrote:
| I just don't see how any of this matters. OP's server is
| reachable via ipv4 and someone sent an http request to
| it. Their post even says that this is the case.
| andix wrote:
| I'm guessing they meant it discovered a virtual host
| behind a subdomain.
| benfortuna wrote:
| I could be wrong, but the Palo Alto scanner says it's using
| global ipv4 space, so not using DNS at all. So actually the
| subdomain has not been discovered at all.
| reactordev wrote:
| This is exactly what's happening based on the log snippet
| posted. Has nothing to do with subdomains, has everything
| to do with it being on the internet.
| parliament32 wrote:
| How deep in the domain hierarchy you are doesn't matter from
| a network layer: a bare tld (yes this exists), a normal
| domain, a subdomain, a sub-subdomain, etc can all be assigned
| different IPs and go different places. You can issue a GET
| against / for any IP you want (like we see in the logs OP
| posted). The only time this would actually matter is if a
| host at an address is serving content for multiple hostnames
| and depends on the Host header to figure out which one to
| serve -- but even those will almost always have a default.
| andix wrote:
| You can discover IP addresses, sure. Just enumerate them.
| But this doesn't give you the domain, as long as there is
| no reverse dns record.
|
| I'm quite sure OP meant a virtual host only reachable with
| the correct Host: header.
| cryptonector wrote:
| And in the case of HTTPS they need to insist on SNI (and
| TLS 1.3 effectively requires it).
| giancarlostoro wrote:
| Last few times I tried to do this my ISP cut off my internet
| every time. Assholes. It comes back, but they're still assholes
| for it.
| icehawk wrote:
| This.
|
| I have a DNS client that feeds into my passive DNS database by
| reading CT logs and then trying to resolve them.
| toomuchtodo wrote:
| What do you use it for?
| fsckboy wrote:
| LPT, this is an object lesson in the weakness of security through
| obscurity
| bangaladore wrote:
| I mean you could argue that this is more of a multi-factor
| authentication lesson.
|
| Just knowing one "secret" (a subdomain in this case) shouldn't
| get you somewhere you shouldn't be.
|
| In general you should always assume that any password has been
| (or could be) compromised. So in this case, more factors should
| be involved such as IP restricting for access, an additional
| login page, certificate validation, something...
| andix wrote:
| Security by obscurity can be a great additional measure for an
| already secure system. It can reduce attack surface, make it
| less likely to get attacked in the first place. In some cases
| (like this one) it can also be much easier to break than
| expected.
| OuterVale wrote:
| https://www.merklemap.com pops to mind.
| govideo wrote:
| Interesting! Just checked them out.
|
| "MerkleMap gathers its information by continuously monitoring
| and live tailing Certificate Transparency (CT) logs, which are
| operated by organizations like Google, Cloudflare, and Let's
| Encrypt. "
| Eikon wrote:
| I made this, thank you!
| 8bitchemistry wrote:
| Did you ever email the URL to somebody? We had the same issue
| years ago where Google seemed to be crawling/indexing new
| subdomains it found in emails.
| govideo wrote:
| Nope, never emailed or posted to anyone. Just me (it's my solo
| project at the moment).
| andix wrote:
| I'm surprised nobody mentioned subfinder yet:
| https://github.com/projectdiscovery/subfinder
|
| Subfinder uses different public and private sources to discover
| subdomains. Certificate Transparency logs are a great source, but
| it also has some other options.
| spl757 wrote:
| Does the IP address for that subdomain have a DNS PTR record set?
| If it does, someone can discover the subdomain by querying the
| PTR record for the IP.
| govideo wrote:
| If it does, I did not set it up; it would have been
| automatically done by CloudFlare when I told it to use my
| custom subdomain for the upload urls.
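For reference, a PTR lookup for an IPv4 address queries a reversed-octet name under in-addr.arpa. A tiny sketch of how that owner name is built:

```python
def ptr_name(ipv4: str) -> str:
    """Build the in-addr.arpa owner name used for a reverse (PTR)
    lookup: the four octets in reverse order, under in-addr.arpa."""
    return ".".join(reversed(ipv4.split("."))) + ".in-addr.arpa"

print(ptr_name("203.0.113.9"))  # 9.113.0.203.in-addr.arpa
```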
| andix wrote:
| If a HTTPS service should be hard to discover, an easy way is to
| hide it behind a subdirectory. Something like
| https://subdomain.domain.example/hard_to_find_secret_string.
|
| Another option is wildcard certificates.
|
| This obviously can't be the only protection. But if an attacker
| doesn't know about a service, or misses it during discovery, they
| can't attack it.
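If you do lean on a secret path like that, compare it in constant time so the secret can't be guessed byte-by-byte via timing. A minimal sketch (the path is a made-up placeholder):

```python
import hmac

SECRET_PATH = "/hard_to_find_secret_string"  # hypothetical placeholder

def path_allowed(request_path: str) -> bool:
    # hmac.compare_digest compares in constant time, unlike ==
    return hmac.compare_digest(request_path, SECRET_PATH)

print(path_allowed("/hard_to_find_secret_string"))  # True
print(path_allowed("/admin"))                       # False
```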
| LinuxBender wrote:
| As others have said, likely cert transparency logs. Use a
| wildcard cert to avoid this. They are free using LetsEncrypt and
| possibly a couple other ACME providers. I have loads of wildcard
| certs. Bots will try guessing names but like you I do not use
| easily guessable names and the bots never find them. _I log all
| DNS answers._ I assume cloudflare supports strict-SNI but no idea
| if they have their own automation around wildcard certs.
| Sometimes I renew wildcard certs I am not even using just to give
| the bots something to do.
| govideo wrote:
| I have been just relying on CloudFlare's automatic https. But I
| will look into my own certs, though will likely just use
| CloudFlare's. I don't mind the internet knowing the subdomain I
| posted about; was curious how the bots found it!
| pabs3 wrote:
| ArchiveTeam has some docs about this:
|
| https://wiki.archiveteam.org/index.php/Finding_subdomains
| govideo wrote:
| I'm so often amazed (but no longer surprised) at the depth of
| niche (relatively) info and tools out there.
| ciaovietnam wrote:
| There is a chance that your subdomain is the first/default
| virtual host in your web server setup (or the subdomain's access
| log is the default log file), so any request to the server's IP
| address gets logged to this virtual host. That means they didn't
| access your subdomain; they accessed your server's IP address
| but got logged in your subdomain's access log.
| BrandoElFollito wrote:
| And this is the correct answer, thank you.
|
| Transparency logs are fine except if you have a wildcard cert
| (or no https, obviously).
|
| IP scans are just that: scans for live ports. If you do not
| provide a Host header in your call, you get whatever default
| response was set up. This can be a default site, a 404, or
| anything else.
| alberth wrote:
| This site will find any subdomain, for any domain, as long as it
| has previously had a certificate (SSL/TLS):
|
| https://crt.sh/
| averageRoyalty wrote:
| This is incorrect (or at least only technically correct). It
| is only true for subdomains whose certificates were signed by a
| public, trusted CA after Certificate Transparency came into
| existence, and only for subdomains with a specific, non-wildcard
| certificate.
| govideo wrote:
| Thanks for mentioning. I checked it out, and am learning lots
| of new stuff (ie, realize how much I do not know).
| nvarsj wrote:
| Doesn't find any of my semi secret subdomains.
| socrateslee wrote:
| https://crt.sh can find your subdomain only when it doesn't
| have a wildcard certificate (*.somedomain.com).
| thedougd wrote:
| Some CAs (Amazon) allow not publishing to the Certificate
| Transparency Log. But if you do this, browsers will block the
| connection by default. Chromium browsers have a policy option to
| skip this check for selected URLs. See:
| CertificateTransparencyEnforcementDisabledForURLs.
|
| Some may find this more desirable than wildcard certificates and
| their drawbacks.
| klntsky wrote:
| > Some may find this more desirable
|
| Why?
| thedougd wrote:
| A CISA article on wildcard security risks. Some of this stems
| from common misimplementations (e.g. reusing private keys
| across servers), but not all of it.
|
| https://www.cisa.gov/news-events/alerts/2021/10/08/nsa-
| relea... Direct: https://media.defense.gov/2021/Oct/07/200286
| 9955/-1/-1/0/CSI...
| navigate8310 wrote:
| To avoid subdomain discovery, I usually acquire a certificate
| at the domain level and add a wildcard SAN.
| snailmailman wrote:
| Firefox is currently rolling out the same thing. They will
| treat any non-publicly-logged certificate as insecure.
|
| I'm surprised amazon offers the option to not log certificates.
| The whole idea is that every issued cert should get logged.
| That way, fraudulently-issued certs are either well documented
| in public logs- or at least not trusted by the browser.
| fc417fc802 wrote:
| It doesn't seem like the choice has any impact on that. It
| just protects user privacy if that's what they want to
| prioritize.
|
| Depending on the issuer logging all certs would never work.
| You can't rely on the untrusted entity to out themselves for
| you.
|
| The security comes from the browser querying the log and
| warning you if the entry is missing. In that sense declining
| to log a cert is similar to self signing one. The browser
| will warn and users will need to accept. As long as the vast
| majority of sites don't do that then we maintain a sort of
| herd immunity because the warnings are unexpected by the end
| user.
| thedougd wrote:
| I should have mentioned in my post that this technique only
| makes sense in the context of private or internal
| endpoints.
| rempargo wrote:
| I assume you host this with a https certificate, so you can look
| your subdomains at:
|
| https://crt.sh/?q=sampledomain.com
| melson wrote:
| Someone might have used an open-source tool like Sublist3r
| rawbytes wrote:
| oh yes that for sure
| ackbar03 wrote:
| yea was gonna mention this as well lol
| paxys wrote:
| Not sure why everyone is going on about certificate transparency
| logs when the answer is right there in the user agent. The
| company is scanning the ipv4 space and came upon your IP and
| port.
| pkulak wrote:
| Okay. But how did they get the proper host header?
| peeters wrote:
| There are a couple easy possibilities depending on server
| config.
|
| 1. Not using SNI, and all https requests just respond with
| the same cert. (Example, go to https://209.216.230.207/ and
| you'll get a certificate error. Go to the cert details and
| you'll see the common name is news.ycombinator.com).
|
| 2. http upgrades to https with a redirect to the hostname,
| not IP address. (Example, go to http://209.216.230.207/ and
| you get a 301 redirect to https://news.ycombinator.com)
| jimnotgym wrote:
| I don't think op said that they had the correct host header?
| INTPenis wrote:
| Could be a number of ways for example a default TLS cert, or
| a default vhost redirect.
|
| I actually had a job once a few years ago where I was asked
| to hide a web service from crawlers and so I did some of
| these things to ensure no info leaked about the real vhost.
| paxys wrote:
| Who says they did?
| peeters wrote:
| It's rather hilarious that nobody mentioned this in 7 hours.
| What am I missing?
|
| ~5 billion scans in a few hours is nothing for a company with
| decent resources. OP: in case you didn't follow, they're
| literally trying every possible IPv4 address and seeing if
| something exists on standard ports at that address.
|
| I believe it would be harder to find out your domain that way
| if you were using SNI and only forwarded/served requests that
| used the correct host. But if you aren't using SNI, your server
| is probably just responding to any TLS connect request with
| your subdomain's cert, which will reveal your hostname.
| Dylan16807 wrote:
| > What am I missing?
|
| That it was in fact mentioned many hours earlier, in more
| than one top level comment.
| peeters wrote:
| I was referring more to the fact that the user agent
| explicitly contained the answer, rather than suggestions
| that it was IP scanning. But you're right I do see one
| comment that mentions that. And many more likely assumed
| the OP already figured that part out.
| Dylan16807 wrote:
| The user agent contains a partial answer. IP scanning
| doesn't give you the actual subdomain, so the question is
| slightly wrong or there are missing pieces.
| diggan wrote:
| Judging by the logs (user agents really) right now in the
| submission, it's hard to tell if the requests were
| actually for the domain (since the request headers aren't
| included) or just for the IP.
| Dylan16807 wrote:
| Yes, that's the "question being wrong" option I listed.
| globular-toast wrote:
| > What am I missing?
|
| It's very common for people to read only up to the point they
| feel they can comment, then skip straight to the comments. So,
| basically, no one read it.
| flemhans wrote:
| Funny, that'd be so unthinkable for me to do! But you're
| probably right.
| fragmede wrote:
| Just the default hostname. It won't reveal all of them or any
| of the IP addresses of that box. secret-freedom-fighter.ice-
| cream-shop.example.com could have the same IP as example.com
| and you'd only know example.com
| A1kmm wrote:
| If you've got one cert with a subject alt name for each
| host, they'd see them all. If you use SNI and they have
| different certificates, the domains might still be in
| Certificate Transparency logs. If a wildcard cert is used,
| that could help to conceal the exact subdomain.
| ozim wrote:
| That perfectly fits midwit meme. Lots of people are smart
| enough to know transparency logs - but not smart enough to read
| OP post and understand the details.
| seba_dos1 wrote:
| The details aren't there, so it's "assume" rather than
| "understand".
|
| The only proper response to OP's question is to ask for
| clarification: is the subdomain pointing to a separate IP?
| Are the logs vhost-specific or not?
|
| If you don't get the answers, all you can do is to assume,
| and both assumptions may end up being right or wrong (with
| varying probability, perhaps).
| 4ndrewl wrote:
| Also it's Palo Alto. They're not some kiddie scripters.
| https://en.m.wikipedia.org/wiki/Palo_Alto_Networks
| chinathrow wrote:
| Hm?
|
| They sell you security but provide you with CVEs en masse.
|
| https://www.cybersecuritydive.com/news/palo-alto-networks--
| h...
| ThatMedicIsASpy wrote:
| Am I Google when I come with the user agent 'google here, no
| evil'?
| bildung wrote:
| Looking at _how_ they earned their 100s of CVEs, script
| kiddie almost looks like a compliment
| p0w3n3d wrote:
| Finding the IP does not mean finding the domain. When making an
| HTTP request to an IP, you specify the domain you want to
| connect to. For example, you can configure your /etc/hosts to
| point xxxnakedhamsters.google.com at 8.8.8.8 and make the
| HTTP request, which will cause Google to receive the domain
| request (i.e. the header Host: xxxnakedhamsters.google.com) and
| refuse it or try to redirect. Of course, this only works for
| plain HTTP, because HTTPS will require a certificate. That's
| why they're speaking about certificates.
| ghusto wrote:
| First thing I'd do for an IP that answers is a reverse
| lookup, so I expect that's at least in the list of things
| they'd try.
| lewiscollard wrote:
| Depending on the web server's configuration, you very much
| _can_ find the domain which is configured on an IP address,
| by attempting to connect to that IP address via HTTPS and
| seeing what certificate gets served. Here's an example:
|
| https://138.68.161.203/
|
| > Web sites prove their identity via certificates. Firefox
| does not trust this site because it uses a certificate that
| is not valid for 138.68.161.203. The certificate is only
| valid for the following names: exhaust.lewiscollard.com,
| www.exhaust.lewiscollard.com
| jchw wrote:
| I don't think that does you any good for Cloudflare,
| though. They will definitely be using SNI.
| kelnos wrote:
| That doesn't really matter, though. While OP is using
| Cloudflare, the actual server behind it is still a
| publicly-accessible IP address that an IPv4 space scanner
| can easily stumble upon.
| jchw wrote:
| I misunderstood, I thought the subdomain _was_ an R2
| bucket. If it 's just normal Cloudflare proxying to some
| backend this is probably the most likely answer.
|
| That said, while I think it's not the case here, using
| Cloudflare doesn't mean the underlying host is
| accessible, as even on the free tier you can use
| Cloudflare Tunnels, which I often do.
| melevittfl wrote:
| But there's no evidence in the OP's post that they have, in
| fact, discovered the domain. The only thing posted is that
| there is a GET request to a listening web server.
|
| The OP and all the people talking about certificates are
| making the same assumption: namely, that the scanning company
| discovered the DNS name for the server and tried to connect,
| when, in fact, they simply iterate through IP address blocks
| and make GET requests to any listening web servers they find.
| p0w3n3d wrote:
| OP states that the domain was discovered
| crazygringo wrote:
| No they didn't. They said "How did the internet find my
| subdomain?" They're _assuming_ the internet found their
| subdomain. They don 't provide any evidence that
| happened, just that they found their IP address.
| paxys wrote:
| > When doing HTTP request to IP you specify the domain you
| want to connect to
|
| No, you make HTTP requests to an IP, not a domain. You
| convert the domain name to an IP in an earlier step (via a
| DNS query). You can connect to servers using their raw IPs
| and open ports all day if you like, which is what's happening
| here. Yes servers will (likely) reject the requests by
| looking at the host header, but they will still _receive_ the
| request.
| DeborahMatthews wrote:
| Your subdomain may have been discovered through certificate
| transparency logs, search engine crawling, passive DNS,
| leaked links, or third-party analytics tools.
| arkfil wrote:
| Palo Alto (network devices like firewalls, etc.) is able to
| scan the sites that users behind their devices want to visit.
| These are very popular devices in many companies. Users can
| also have agents installed on their computers that likewise
| have access to the sites they visit.
| opello wrote:
| This is what I was thinking it must be, along the lines of
| Cisco NAC. Could monitor via browser plugin for full URLs or
| DNS server for domains.
|
| I imagine the certificate transparency log is the avenue, but
| local monitoring and reporting up as a new URL or domain to
| scan for malware seems similarly plausible.
| govideo wrote:
| Thanks for everyone's perspectives. Very educational and
| admittedly lots outside the boundaries of my current knowledge. I
| have thus far relied on CloudFlare's automatic https and simple
| instant subdomain setup for their worker microservice I'm using.
|
| There are evidently technical/footprint implications of that
| convenience. Fortunately, I'm not really concerned with the
| subdomain being publicly known; was more curious how it became
| publicly known.
| groestl wrote:
| I had to scroll pretty far down to see the first comment
| referring to the second most likely leak (after certificate
| transparency lists): some ISP sold their DNS query log, and
| yours was in it.
|
| People buying such records do so for various reasons, for
| example to seed some crawler they've built.
| bashwizard wrote:
| Like people have said already: Certificate Transparency logs.
|
| There are countless tools to use for subdomain enumeration. I
| personally use subfinder or amass when doing recon on bug bounty
| targets.
| 3oil3 wrote:
| What happens if you google your subdomain? Maybe the bots have
| some sort of dictionary files and they just run them, and when
| there is a match, then they append it with some .html extension,
| or maybe they prepend it to the match as a subdomain of it?
| f4c39012 wrote:
| CSP headers can leak urls, but I assume that isn't the cause here
| if the subdomain is an entirely separate project
| ThePowerOfFuet wrote:
| Others are saying CT logs but my own subdomains are on wildcard
| certificates, in which case I suspect they are discovered by DPI
| analysis of DNS traffic and resold, such as by Team Cymru.
| BLKNSLVR wrote:
| There are a number of companies, not just Palo Alto Networks,
| that perform various different scales of scans of the entire IPv4
| space, some of them perform these scans multiple times per day.
|
| I set up a set of scripts to log all "uninvited activity" to a
| couple of my systems, from which I discovered a whole bunch of
| these scanner "security" companies. Personally, I treat them all
| as malicious.
|
| There are also services that track Newly Registered Domains
| (NRDs).
|
| Tangentially:
|
| NRD lists are useful for DNS block lists since a large number of
| NRDs are used for short term scam sites.
|
| My little, very amateur, project to block them can be found here:
| https://github.com/UninvitedActivity/UninvitedActivity
|
| Edited to add: Direct link to the list of scanner IP addresses
| (although hasn't been updated in 8 months - crikey, I've been
| busy longer than I thought):
| https://github.com/UninvitedActivity/UninvitedActivity/blob/...
| mr_mitm wrote:
| Getting the domain name from the IP address is not trivial,
| though. In fact, it should be impossible, if the name really
| hasn't been published (barring guessing attempts), so OP's
| question stands.
| venj wrote:
| I had this issue with internal domains indexed by Google. The
| domains were not published anywhere by my company. They were
| scanned by leakix.net, which apparently scans the whole web
| for vulnerabilities and publishes web pages containing the
| domain names associated with each IP address. I guess they
| read them from the certificates.
| jhart99 wrote:
| There is another source: certificates showing up on a server
| or load balancer during the TLS handshake. When a client
| connects without indicating a server name via SNI, some
| servers will reply with a default certificate or a list
| of valid server names.
| melevittfl wrote:
| The OP is misunderstanding what's happened, based on what's
| been posted. The OP has a server with an IP address. They're
| seeing GET requests in the server's logs and assuming people
| have found the server's DNS name.
|
| In fact, the scanners are simply walking the IP address
| space and sending GET requests to any IP address they
| find. No DNS discovery needed.
| alfiedotwtf wrote:
| Are you sure that's the case? IP addresses != domains, so
| I'm guessing the bots are including the obfuscated domain
| in the Host header of their requests.
|
| My guess is OP is using a public DNS server that sells
| aggregated user requests. All it takes is one request from
| their machine to a public machine on the internet, and it's
| now public knowledge.
| lxgr wrote:
| That entirely depends on whether the GET requests were
| providing the (supposed to be hidden) hostname in the
| `Host` header (and potentially SNI TLS extension).
| okasaki wrote:
| $ host 209.216.230.207
| 207.230.216.209.in-addr.arpa domain name pointer news.ycombinator.com.
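The `host` output above comes from a PTR query against the reversed in-addr.arpa name. A small sketch of how that name is constructed (the PTR lookup itself, e.g. via `socket.gethostbyaddr`, only succeeds when the address owner published a record):

```python
import socket  # only needed for the optional live lookup below

def reverse_name(ipv4: str) -> str:
    """Build the in-addr.arpa name that a PTR query for this
    IPv4 address uses: octets reversed, suffix appended."""
    return ".".join(reversed(ipv4.split("."))) + ".in-addr.arpa"

# Live lookup (network-dependent, works only if a PTR exists):
# socket.gethostbyaddr("209.216.230.207")  # news.ycombinator.com
```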
| mr_mitm wrote:
| Not sure what you are trying to tell me. This isn't
| guaranteed to work. If you define a reverse lookup record
| for your domain, then that counts as published in my book.
| drpossum wrote:
| This is correct.
| dspillett wrote:
| That works when there is an explicit PTR record. For instance,
| one of my assigned addresses can be named that way due to:
| 74.231.187.81.in-addr.arpa. 3600 IN PTR
| ns2.nogoodnamesareleft.com.
|
| in the zone file for that IPv4. But unless they've explicitly
| configured this, or are using a hosting service that does it
| without asking, it won't be what is happening.
|
| It isn't practical to do a reverse lookup from "normal"
| name-to-address records like
| ns2.nogoodnamesareleft.com. IN A 81.187.231.74
|
| (it is _possible_ to build a partial reverse mapping by
| collecting a huge number of DNS query results, but not
| really practical unless you are someone like Google or
| Cloudflare running a popular resolution service)
| DonHopkins wrote:
| I love how the ARPANET still lives on through reverse DNS
| PTRs.
|
| https://www.youtube.com/watch?v=V78GUSOS-EM
| yabones wrote:
| I do something similar. Any hits on the default nginx vhost get
| logged, logs get parsed out and "repeat offenders" get put on
| the shitlist. I use ipset/iptables but this can also be done
| with fail2ban quite simply.
|
| https://nbailey.ca/post/block-scanners/
| immibis wrote:
| This is security theater.
| Sohcahtoa82 wrote:
| Only kinda.
|
| Doing something like this can prevent you from showing up
| on Shodan.io which is used by many users/bots to find
| servers without running massive scans themselves.
| drpossum wrote:
| How does an ip scan help with general DNS resolution at all?
| lockhead wrote:
| Most likely passive DNS data: if you use your subdomain, you do
| DNS queries for it. If the DNS server you use to resolve your
| domains shares this data, it can be picked up by others.
| nusl wrote:
| It's pretty common to bruteforce subdomains of a domain you might
| be interested in, especially by attackers.
| xg15 wrote:
| TIL (from this thread) : You can abuse TLS handshakes to
| effectively reverse-DNS an IP address without ever talking to a
| DNS server! Is this built into dig yet? :)
|
| (Alright, _some_ IP addresses, not all of them)
|
| I also wonder if this is a potential footgun for eSNI
| deployments: If you add eSNI support to a server, you must
| remember to also make regular SNI mandatory - otherwise, an
| eavesdropper can just ask your server nicely for the domain that
| the eSNI encryption was trying to hide from it.
| yatralalala wrote:
| Lifehack - it's especially awesome in cases where the server
| operator is using self-signed certs / private cert authorities.
| Because you will not find these in public cert logs.
| _trampeltier wrote:
| Did you send a link over email, WhatsApp, or something like that?
| ralferoo wrote:
| If you're using HTTPS, then you're probably using letsencrypt and
| so your subdomain will appear on the CT logs that are publicly
| accessible.
|
| One thing you could do is use a wildcard certificate, and then
| use a non-obvious subdomain from that. I actually have something
| similar - in my set up, all my web-traffic goes to haproxy
| frontends which forward traffic to the appropriate backend, and I
| was sick of setting up multiple new certificates for each new
| subdomain, so I just replaced them all with a single wildcard
| cert instead. This means that I'm not advertising each new
| subdomain on the CT list. They all look nominally the same when
| visited - same holding page on index and same /api handling -
| but one of the subdomains decodes an additional URL path that
| provides access to status monitoring.
|
| Separately, that Palo Alto Networks company is a real pain. They
| connect to absolutely everything in their attempts to spam the
| internet. Frankly, I'm sick of even my mail servers being
| bombarded with HTTP requests on port 25 and the resultant log
| spam.
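The wildcard-plus-routing setup ralferoo describes can be sketched as a haproxy config fragment. All names, paths, and the obscure status subdomain here are hypothetical, not taken from the post:

```
# One wildcard cert serves every subdomain, so no per-subdomain
# CT log entry is ever created.
frontend https-in
    bind :443 ssl crt /etc/haproxy/wildcard.example.com.pem
    # Route one non-obvious name to a different backend...
    use_backend status if { hdr(host) -i status-x7k.example.com }
    # ...everything else gets the same holding page.
    default_backend holding-page

backend holding-page
    server web1 127.0.0.1:8080

backend status
    server mon1 127.0.0.1:9090
```

The frontend terminates TLS once with the wildcard cert; only the Host-header routing distinguishes the subdomains.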
| clvx wrote:
| Put it behind IPv6 and it likely won't happen again. The address
| space is massive.
| supermatt wrote:
| 1) Are you sure that they are using the subdomain? They could be
| connecting via IP or an alternate host address.
|
| 2) Are you using TLS? Unless you are using a wildcard cert, then
| the FQDN will have been published as part of the certificate
| transparency logs.
| mightybyte wrote:
| If you've made any kind of DNS entries involving this subdomain,
| then congratulations, you've notified the world of its existence.
| There are tools out there that leverage this information and let
| you get all the subdomains for a domain. Here's the first one I
| found in a quick search:
|
| https://pentest-tools.com/information-gathering/find-subdoma...
| yatralalala wrote:
| Hi, our company does this basically "as-a-service".
|
| The ways to find it are basically limitless. The best source
| is probably the Certificate Transparency project, as others
| suggested. But it does not end there; other things we do
| include internet crawls, domain bruteforcing on wildcard DNS,
| dangling vhost identification, default certs on servers (connect
| to an IP on 443 and get the default cert), and many others.
|
| Security by obscurity does not work. You cannot rely on "people
| won't find it". Once it's online, everyone can find it, no matter
| how you hide it.
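The Certificate Transparency source mentioned throughout this thread can be sketched against crt.sh's public JSON endpoint. The domain below is the hypothetical one from the post, and the response parsing is shown offline on a sample payload rather than a live query:

```python
import json
import urllib.parse

CRT_SH = "https://crt.sh/?output=json&q="

def ct_query_url(domain: str) -> str:
    """URL asking crt.sh for every logged cert naming *.domain
    (crt.sh uses '%' as its wildcard)."""
    return CRT_SH + urllib.parse.quote("%." + domain)

def names_from_ct(json_text: str) -> set:
    """Collect the distinct DNS names out of a crt.sh JSON
    response; each entry's name_value holds newline-separated
    names from one certificate."""
    names = set()
    for entry in json.loads(json_text):
        for name in entry.get("name_value", "").splitlines():
            names.add(name.lower())
    return names
```

Fetching `ct_query_url("sampledomain.com")` would reveal every non-wildcard subdomain that ever got a publicly logged certificate.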
| TZubiri wrote:
| "Security by obscurity does not work"
|
| This is one of those false voyeur OSS internet tenets designed
| to get people to publish their stuff.
|
| Obscurity is a fine strategy, if you don't post your source
| that's good. If you post your source, that's a risk.
|
| The fact that you can't rely on that security measure is just a
| basic security tenet that applies to everything: don't rely on
| a single security measure; use redundant barriers.
|
| Truth is, we don't know how the subdomain got leaked. Subdomains
| can be passwords, and a well-crafted subdomain should not leak;
| if it leaks, there is a reason.
| zevlag wrote:
| > Subdomains can be passwords and a well crafted subdomain
| should not leak,
|
| I disagree. A subdomain is not secret in any way. There are
| many ways in which it is transmitted unencrypted. A couple:
|
| - DNS resolution (multiple resolvers and authoritative servers)
| - TLS SNI
| - HTTP Host header
|
| There are many middle boxes that could perform safety checks
| on behalf of the client, and drop it into a list to be
| rescanned.
|
| - Virus scanners
| - Firewalls
| - Proxies
| dharmab wrote:
| I once worked for a company which was using a subdomain of
| an internal development domain to do some completely
| internal security research on our own products. The entire
| domain got flagged in Safe Browsing despite never being
| exposed to the outside world. We think Chrome's telemetry
| flagged it, and since it was technically routable as a
| public IP (all public traffic on that IP was blackholed),
| Chrome thought it was a public website.
| mkl95 wrote:
| I saw a similar thing happen with a QA team's domains.
| Google flagged them as malicious and the company never
| managed to get them unflagged.
| dharmab wrote:
| Our lawyers knew their lawyers so there was a friendly
| chat and we got added to an internal whitelist within
| Google.
| TZubiri wrote:
| > It's not encrypted in transit
|
| Agree.
|
| But who said that all passwords or shibboleths should be
| encrypted in transit?
|
| It can serve as a canary for someone snooping your traffic.
| Even if you encrypt it, you don't want people snooping.
|
| To date, for the subdomains I never publish, I haven't
| had anyone attempt to connect to them.
|
| It's one of those redundant measures.
|
| And it's also one of those risks that you take. You can
| maximize security by staying at home all day, but going out
| to take the trash is a calculated risk that you must take,
| or you risk overfocusing on security.
|
| It's similar to port knocking. If you are encrypting it,
| it's counterproductive, it's a low effort finishing touch,
| like a nice knot.
| lolinder wrote:
| Truth is we don't know _that_ the subdomain got leaked. The
| example user agent they give says that the methodology they're
| using is to scan the IPv4 space, which is a great example
| of why security through obscurity doesn't work here: The IPv4
| space is tiny and trivial to scan. If your server has an IPv4
| address it's not obscure, you should assume it's publicly
| reachable and plan accordingly.
|
| > Subdomains can be passwords and a well crafted subdomain
| should not leak, if it leaks there is a reason.
|
| The problem with this theory is that DNS was never designed
| to be secret and private and even after DNS over HTTPS it's
| _still_ not designed to be private for the servers. This
| means that getting to "well crafted" is an incredibly
| difficult task with hundreds of possible failure modes which
| need constant maintenance and attention--not only is it
| complicated to get right the first time, you have to
| reconfigure away the failure modes on every device or even on
| every use of the "password".
|
| Here are just a few failure modes I can think of off the top
| of my head. Yes, these have mitigations, but it's a game of
| whack-a-mole and you really don't want to try it:
|
| * Certificate transparency logs, as mentioned.
|
| * A user of your "password" forgets that they didn't
| configure DNS over HTTPS on a new device and leaves a trail
| of logs through a dozen recursive DNS servers and ISPs.
|
| * A user has DNS over HTTPS but doesn't point it at a server
| within your control. One foreign server having the password
| is better than dozens and their ISPs, but you don't have any
| control over that default DNS server nor how many different
| servers your clients will attempt to use.
|
| * Browser history.
|
| Just don't. Work with the grain, assume the subdomain is
| public and secure your site accordingly.
| immibis wrote:
| > The IPv4 space is tiny and trivial to scan
|
| Something many people don't expect is that the IPv6 space
| is also tiny and trivial to scan, if you follow certain
| patterns.
|
| For example, many server hosts give you a /48 or /64
| subnet, and your server is at your prefix::1 by default. If
| they have a /24 and they give you a /48, someone only has
| to scan 2^24 addresses at that host to find all the ones
| using prefix::1.
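immibis's arithmetic can be sketched directly, using the prefix lengths from the example above (one probe per delegated prefix, assuming every server sits at prefix::1):

```python
def probes_needed(provider_prefix_len: int, delegated_prefix_len: int) -> int:
    """Number of probes needed to hit prefix::1 inside every
    prefix a provider delegates out of its own allocation:
    2 ** (bits of delegated prefix beyond the provider's)."""
    return 2 ** (delegated_prefix_len - provider_prefix_len)

# A provider /24 delegated as customer /48s: 2**24 probes,
# versus 2**32 probes for the entire IPv4 internet.
```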
| Sayrus wrote:
| Assuming everyone is using /48 and binding to prefix::1,
| that's a 2^16 difference with scanning the IPv4 address
| space. Assuming a specific host with only one IPv6 /24
| block and delegating /64, this is a 2^12 difference.
| Scanning for /64 on the entire IPv6 space is definitely
| not as tiny.
|
| AWS only allows routing /80 to EC2 instances making a
| huge difference.
|
| It doesn't mean that we should rely on obscurity, but the
| entire space is not as tiny as IPv4's was.
| TZubiri wrote:
| Interesting, so you may see the IPv6 space as a tree, and
| go just for the first addresses of each block.
|
| But if you just choose a random address you would enjoy a
| bit more immunity from brute force scanners here.
| AStonesThrow wrote:
| IPv6 address space may be trivial from this perspective,
| but imagine trying to establish two-way contact with a
| user on a smartphone on a mobile network. Or a user whose
| Interface ID (64 bits) is regenerated randomly every few
| hours.
|
| Just try leaving a User Talk page message on Wikipedia,
| and good luck if the editor even notices, or anyone finds
| that talk page again, before the MediaWiki privacy
| measures are implemented.
| lyu07282 wrote:
| > Obscurity is a fine strategy
|
| > Subdomains can be passwords and a well crafted subdomain
| should not leak
|
| Your comment is really odd to read; I'm not sure I understand
| you, but I'm sure you don't mean it like that. Just to
| reiterate the important points:
|
| 1. Do not rely on subdomains for security, subdomains can
| easily leak in innumerable ways including in ways outside of
| your control.
|
| 2. Security by obscurity must never be relied on for security
| but can be part of a larger defense in depth strategy.
|
| ---
|
| https://cwe.mitre.org/data/definitions/656.html
|
| > This reliance on "security through obscurity" can produce
| resultant weaknesses if an attacker is able to reverse
| engineer the inner workings of the mechanism. Note that
| obscurity can be one small part of defense in depth, since it
| can create more work for an attacker; however, it is a
| significant risk if used as the primary means of protection.
| TZubiri wrote:
| It's a pretty weak CWE category.
|
| "The product uses a protection mechanism whose strength
| depends heavily on its obscurity, such that knowledge of
| its algorithms or key data is sufficient to defeat the
| mechanism."
|
| Defeating the mechanism is not very impactful if it's one
| stage of a multi-stage mechanism, especially if breaching
| that perimeter alerts the admin!
|
| Lots of uncreative blue teamers here
| yapyap wrote:
| > This is one of those false voyeur OSS internet tenets
| designed to get people to publish their stuff.
|
| No it isn't, it's a push to get people to login protect
| whatever they want to keep to themselves.
|
| It's silly to say informing people that security through
| obscurity is a weak concept is trying to convince them to
| publish their stuff.
| HeatrayEnjoyer wrote:
| If security through obscurity didn't provide any benefit
| then governments wouldn't have built entire frameworks for
| protecting classified information.
| ehutch79 wrote:
| So the only thing protecting classified docs is the
| public not knowing where they are? That's what security
| through obscurity is.
| yatralalala wrote:
| So many thoughts on that, but from my perspective: obscurity
| is OK, but you cannot depend on it at all.
|
| Great example is port knocking - it hides your open port from
| random nmap, but would you leave it as the only mechanism
| preventing people getting to your server? No. So does it make
| sense to have it? Well maybe, it's a layer.
|
| Kerckhoffs' principle comes to my mind as well here.
|
| So while I agree with you that obscurity is a fine strategy,
| you can never depend on it alone.
| marcosdumay wrote:
| As long as you don't go into "nah, I have another
| protection barrier, I don't need the best possible security
| for my main barrier" mode...
|
| Or in other words, if you place absolutely zero trust in
| it, consider it as good as broken by every single script
| kid, and publicly known, then yeah, it's fine.
|
| But then, why are you investing time into it? Almost
| everybody that makes low-security barriers is relying on
| it.
| sim7c00 wrote:
| Making things obscure and hard to find is indeed a sound
| choice, as long as it's not the single measure taken. I think
| people tout this sentence because it's popular to say,
| without thinking further.
|
| You don't put an unauthenticated thing on a difficult-to-find
| subdomain and call it secure. But your nicely secured page is
| more secure if it's also very tedious to find; it's less
| low-hanging fruit.
|
| As you state, there is always a leak needed. But the DNS
| system is quite leaky, and often sources won't fix it or
| won't admit it's even broken by design.
|
| Strong passwords are also insecure if they leak, so you
| obscure them from prying eyes, securing them by obscurity.
| TZubiri wrote:
| A lot of the pushback I'm seeing is that people are
| assuming that you always want to make things more secure.
| That security is a number that needs to go up, like income
| or profit, as opposed to numbers that need to go down, like
| cost and taxes.
|
| The possibility that I'm adding this feature to something
| that would otherwise have been published on a public domain
| does not cross people's mind, so it is not thought of an
| additional security measure, but a removal of a security
| feature.
|
| Similarly, it is assumed that there's an authentication
| mechanism (or lack of one) behind the subdomain. There may be
| a simple idempotent server running, such that there is no
| concern for abuse, but it may still be desirable to reduce the
| code executed by random scanners that only have an IP.
|
| This brings me again to the competitive economic take on
| the subject: people believe that this wisdom nugget they
| hold about "security by obscurity" is a valuable tenet,
| and they bet on it and desperately try to find someone to
| use it on. You can tell when a meme is overvalued because
| people try to use it on you even when it doesn't fit; it
| means they are dying to actually apply it.
|
| My bet is that "Security through obscurity" is undervalued,
| not as a rule or law, or a definite thing, but as a basic
| correlation: keep a low profile, and you'll be safer. If
| you want to get more sales, you will need to be a bit more
| open and transparent and that will expose you to more risk,
| same if you want transparency for ethical or regulation
| reasons. You will be less obscure and you will need to
| compensate with additional security mechanisms.
|
| But it seems evident to me that if you don't publish your
| shit, you are going to have much less risk, and need to
| implement fewer security mechanisms for the same risks, as
| compared to advertising your infrastructure and your
| business, duh.
| 1970-01-01 wrote:
| It's become an anti-cliche. Security via obscure technique is
| a valid security layer in the exact same way a physical lock
| tumbler will not unlock when any random key is inserted and
| twisted. It's not great but it's not terrible and it does a
| fine job until someone picks or breaks it open.
| gitgud wrote:
| I don't think that analogy works well, a subdomain that is
| not published is more like hiding the key to the front door
| in the garden somewhere... does a fine job of keeping the
| house secure until someone finds it...
| TZubiri wrote:
| Terrible analogy.
|
| Why not use letters and packages which is the literal
| metaphor these services were built on?
|
| It's like relying on public header information to
| determine whether an incoming letter or package is
| legitimate.
|
| If it says: To "Name LastName" or "Company", then it's
| probably legitimate. Of course it's no guarantee, but it
| filters the bulk of Nigerian Prince spam.
|
| It gets you past the junk box, but you don't have to
| trust it with your life.
|
| Nuance.
| lxgr wrote:
| Keeping a key secret is not security by obscurity, but
| keeping the existence of a lock secret is.
| legitster wrote:
| > "Security by obscurity does not work"
|
| Depends on the context and exposure. Sometimes a key under a
| rock is perfectly fine.
|
| I used to work for a security company that REALLY oversold
| security risks to sell products.
|
| The idea that someone was going to wardrive through your
| suburban neighborhood with a networked cluster of GPUs to
| crack your AES keys and run a MITM attack for web traffic is
| honestly pretty far fetched unless they are a nation-state
| actor.
| natebc wrote:
| Realistically we get into $3 wrench territory pretty
| quickly too.
| throwway120385 wrote:
| They could also just cut and tip both ends of the
| Ethernet cable I have running between my house and my
| outbuilding too. I probably wouldn't notice if I'm
| asleep.
| TZubiri wrote:
| Metaphor forgotten, but this is a very standard attack
| surface. You don't need to imagine such a close tap; just
| imagine that at any point in the multi-node internet an
| attacker has a node and snoops the traffic in its role as
| a relaying router.
| ninju wrote:
| With inflation it looks like it's now a $5 wrench :-)
|
| https://xkcd.com/538/
| natebc wrote:
| AmazonBasics is good enough in this case! ;)
| bob1029 wrote:
| Obscurity can be fantastic.
|
| One of my favorite patterns for sending large files around is
| to drop them in a public blob storage bucket with a type 4
| guid as the name. No consumer needs to authenticate or sign
| in. They just need to know the resource name. After a period
| of time the files can be automatically expired to minimize
| the impact of URL sharing/stealing.
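bob1029's pattern can be sketched in a couple of lines. The bucket base URL is hypothetical; the point is that a version 4 UUID carries 122 random bits, so the name itself is the access control:

```python
import uuid

def unguessable_object_url(bucket_base: str) -> str:
    """Name a public blob with a random (version 4) UUID.
    With 122 random bits, enumeration is hopeless even though
    the bucket requires no authentication."""
    return f"{bucket_base}/{uuid.uuid4()}"
```

Pair this with an object-lifecycle rule that expires files after a few days to limit the damage if a URL is shared or stolen.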
| unethical_ban wrote:
| No, it's a very sensible slogan to keep people from doing a
| common, bad thing.
|
| Obscurity helps cut down on noise and low effort attacks and
| scans. It only helps as a security mechanism in that the
| remaining access/error logs are both fewer and more
| interesting.
| TZubiri wrote:
| I definitely see its value as a very naive recommendation
| to avoid someone literally relying on an algorithmic or
| low-entropy secret - literally something you may learn in
| your first class on security.
|
| However on more advanced levels, a more common error is to
| ignore the risks of open source and being public. If you
| don't publish your source code, you are massively safer,
| period.
|
| I guess your view on the subject depends on whether you
| think you are ahead of the curve by taking the naive
| interpretation. It's like investing in the stock market
| based on your knowledge of supply and demand.
| batch12 wrote:
| Obscurity as a single control does not work. That's what the
| phrase hints at. In combination with other controls, it could
| be part of an effective defense. Context matters though.
| 0hijinks wrote:
| Depending on one's threat model, any technique can be a
| secure strategy.
|
| Is my threat model a network of dumb nodes doing automatic
| port scanning? Tucking a system on an obscure IPv6 address
| and never sharing the address may work OK. Running some
| bespoke, unauthenticated SSH-over-Carrier-Pigeon (SoCP)
| tunnel may be fine. The adversaries in the model are pretty
| dumb, so intrusion detection is also easy.
|
| But if the threat model includes any well-motivated,
| intelligent adversary (disgruntled peer, NSA, evil ex-
| boyfriend), it will probably just annoy them. And as a bonus,
| for my trouble, it will be harder to maintain going forward.
| TZubiri wrote:
| It's a bit more complex than that as well. You might have
| attackers of both types and different datapoints that have
| different security requirements. And these are not
| necessarily scalars, you may need integrity for one,
| privacy for the other.
|
| Even when considering hi sophistication attackers, and
| perhaps especially with regards to them, you may want to
| leave some breadcrumbs for them to access your info.
|
| If the deep state wants my company's info, they can safely
| get it by subpoenaing my provider's info, I don't need to
| worry about them as an attacker for privacy, as they have
| the access to the information if needed.
|
| If your approach to security is to add cryptography
| everywhere, make everything as secure as possible, and
| imagine that you are up against a nation-state adversary
| (or conversely, to add security until you satisfy a
| requirement commensurate with your adversary), then you are
| literally reducing one of the most important design
| requirements of your system to a single scalar that you
| attempt to maximize while not compromising other tradeoffs.
|
| A straightforward lack of nuance. It's like having a tax
| strategy consisting of number go down, or pricing strategy
| of price go up, or cost strategy of cost go down, or risk
| strategy of no risk for me, etc...
| lxgr wrote:
| The only thing you're definitely complicating with security
| by obscurity is getting a clear picture of your own security
| posture.
| wolrah wrote:
| > "Security by obscurity does not work"
|
| The saying is "security by obscurity is not security" which
| is absolutely true.
|
| If your security relies on the attacker not finding it or not
| knowing how it works, it's not actually secure.
|
| Obscurity has its own value of course, I strongly recommend
| running any service that's likely to be scanned for regularly
| on non-standard ports wherever practical simply to reduce the
| number of connection logs you need to sort through. Obscurity
| works for what it actually offers. That has nothing to do
| with security though, and unfortunately it's hard in cases
| where a human is likely to want to type in your service
| address because most user-facing services have little to no
| support for SRV records.
|
| Two of the few services that do have widespread SRV support
| are SIP VoIP and Minecraft, and coincidentally the former is
| my day job while I've also run a personal Minecraft server
| for over a decade. I can say that the couple of systems I
| still have running public-facing SIP on port 5060 get scanned
| tens of thousands of times per hour while the ones running on
| non-standard ports get maybe one or two activations of
| fail2ban a month. Likewise my Minecraft server has never seen
| a single probe from anyone other than an actual player.
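The SRV indirection wolrah describes can be sketched as a zone fragment. All names and the non-standard ports below are made up for illustration:

```
; Clients that support SRV learn the real host and port from DNS,
; so the service can live on a non-standard port that scanners
; sweeping well-known ports never probe.
_minecraft._tcp.example.com. 3600 IN SRV 0 5 31337 mc.example.com.
_sip._udp.example.com.       3600 IN SRV 0 5 15060 sip.example.com.
```

The four SRV fields after the TTL and class are priority, weight, port, and target; the underscore-prefixed labels encode the service and transport protocol.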
| TZubiri wrote:
| >"If your security relies on "
|
| Again, if your security relies on any one thing, it's a
| problem. A secure system needs redundant mechanisms.
|
| Can you think of a single mechanism that if implemented
| would make a system secure? I think not.
| Diggsey wrote:
| This is the worst take...
|
| People consistently misuse the Swiss cheese security metaphor
| to justify putting multiple ineffective security barriers in
| place.
|
| The holes in the cheese are supposed to represent _unknown_
| or very difficult to exploit flaws in your security layers,
| and that 's why you ideally want multiple layers.
|
| You can't just stack up multiple known to be broken layers
| and call something secure. The extra layers are inconvenient
| to users and readily bypassed by attackers by simply tackling
| them one at a time.
|
| Security by obscurity is one such layer.
| TZubiri wrote:
| So according to you, a picket fence or a wire fence is just
| a useless thing that makes things less usable by users?
|
| Security does not consist only of 100% or 99.99% effective
| mechanisms, there needs to be a flow of information and an
| inherent risk, if you are only designing absolute barriers,
| then you are rarely considering the actual surface of
| relevant user interactions. A life form consisting only of
| skin might be very secure, but it's practically useless.
| nkmnz wrote:
| > other things that we do are things like (...) domain
| bruteforcing on wildcard dns
|
| Are you proud of the work you do?
| remlov wrote:
| If you look at the company they founded, it's a service to
| protect yourself, not to willy-nilly go out onto the open web
| to find hidden subdomains.
| tmerc wrote:
| Why would enumerating a wildcard dns through brute force be
| something that evokes pride or shame?
| yatralalala wrote:
| I sadly did not see the comment above, but I'd like to add
| that these bruteforce and sniffing methods are targeted
| only at our paying customers.
|
| We built global reverse-DNS dataset solely from cert
| transparency logs. Our active scanning/bruteforcing runs
| only for assets owned by our customers.
| 6stringmerc wrote:
| ...as long as your tools are only in your hands to be
| used, correct? Once a tool is created and used on a
| machine with access to the greater internet, doesn't your
| logic hold that its security is compromised inherently?
| Not saying you have been infiltrated, or a rogue employee
| has cleverly exported a copy or the methodology to
| duplicate it off-site, but I'm not saying that hasn't
| happened either.
| cryptonector wrote:
| It's not that hard to write this code. It's not a nuclear
| weapon.
| lkt wrote:
| You can find a dozen projects on Github that do this,
| it's not sensitive information that needs protecting
| ivell wrote:
| Irrespective of whether they are proud of what they are
| doing, I found the post helpful and educational. Let's not
| prevent people from sharing their knowledge, as it might help
| us protect ourselves. A consequence of such a line of
| questioning would be that in the future they would be
| hesitant to share their knowledge to avoid being judged.
| lxgr wrote:
| Given that bad actors can also do this, I'd say that publicly
| advertising the fact and thereby drawing attention to
| misconceptions about security is a net good thing.
| amelius wrote:
| Well, I sure hope the remainder of my URLs are safe.
| amelius wrote:
| Like, in: example.com/secret-id-48723487345
|
| I hope the last bit is not leaked somehow (?)
|
| Btw, we need a "falsehoods programmers believe about URLs"
| ...
|
| Although there is: https://www.netmeister.org/blog/urls.html
| idoubtit wrote:
| > Although there is:
| https://www.netmeister.org/blog/urls.html
|
| I think the section named "Pathname" is wrong. It describes
| the path of an URL as if every server was Apache serving
| static files with its default configuration. It should
| describe how the path is converted into a HTTP request.
|
| For instance, the article states that "all of these go to
| the same place: https://example.org https://example.org/
| https://example.org//
| https://example.org//////////////////". That's wrong. A web
| client sends a distinct HTTP request for each case, e.g.
| starting with `GET // HTTP/1.1`, so the server will receive
| distinct paths. The assertion of "going to the same place"
| makes no sense in the general case.
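The point about distinct request paths is easy to check; a minimal sketch with Python's standard `urllib.parse`, showing that the request-target the client would send differs for each URL:

```python
from urllib.parse import urlsplit

# Each URL below yields a different path component, and therefore
# a different request-target on the wire: "/", "//", and "//////".
paths = [urlsplit(u).path for u in (
    "https://example.org/",
    "https://example.org//",
    "https://example.org//////",
)]
print(paths)  # ['/', '//', '//////']
```

Whether those paths end up "in the same place" is purely a server-side routing decision, not a property of the URL.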
| 1970-01-01 wrote:
| Subdomainfinder.com ??
|
| Dozens of others will also find it.
|
| Really, it's this simple today.
| binarymax wrote:
| I think your comment resulted in a hug of death for that
| service ;)
| yatralalala wrote:
| Sorry for a bit of self-promo, but just to explain: we run
| https://reconwave.com/, basically an EASM product but more
| focused on the network/DNS/setup level.
|
| Finding all things about domains is one of the things that we
| do. And yes, it's very easy.
|
| There are many services like subdomainfinder - i.e.
| dnsdumpster and merklemap. We built our own as well on
| https://search.reconwave.com/. But it's a side project and it
| does not pay our bills.
| cryptonector wrote:
| > Security by obscurity does not work. You can not rely on
| "people won't find it". Once it's online, everyone can find it.
| No matter how you hide it.
|
| Especially do not name your domain names in a way that leaks
| MNPI! Like, imagine if publicly traded companies A and B were
| discussing a merger or acquisition; do not name your domain
| A-and-B.com, m'kay?
| cryptonector wrote:
| DANE would help here: register a harmless sounding domainname
| whose name leaks nothing, use DNSSEC and NSEC3, and host your
| hidden service in a sub-domain whose name is a 63 byte long
| string of randomly selected ASCII characters. But this isn't
| really an option.
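As a sketch of the kind of label meant here, assuming lowercase letters and digits provide enough entropy for the purpose (DNS labels are limited to 63 octets):

```python
import secrets
import string

# Letters and digits are safe in any DNS label.
ALPHABET = string.ascii_lowercase + string.digits

def random_label(length=63):
    """Generate a maximum-length random DNS label, usable as an
    unguessable sub-domain name as described above."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

At 63 characters over a 36-symbol alphabet, brute-force enumeration of the label is infeasible; the remaining leak vectors are CT logs (hence the wildcard cert) and zone walking (hence NSEC3).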
| Dylan16807 wrote:
| Why the DNSSEC, which then requires NSEC3? Shouldn't a
| wildcard certificate do the job in conjunction with normal
| unsigned DNS?
| no-dr-onboard wrote:
| Hi, former pentester here. If any one of your trusted clients
| is using a google/chromium based browser, the telemetry from
| that browser (webdiscovery) would reveal the existence of the
| subdomain in question. As others have said, security by
| obscurity doesn't work.
| geek_at wrote:
| Current pen tester here, and this guy is right. There was a
| Google blog post years ago where Google planted a site with
| an unguessable URL, indexed it, and used Edge to surf the
| site. Shortly after, the site was also listed on Bing.
|
| Google had a "gotcha" moment when Microsoft responded,
| basically, with "yeah, we didn't steal it from Google, you
| had telemetry enabled".
|
| Total shitshow
| AtNightWeCode wrote:
| So, to mostly prevent this:
|
| Disable direct IP access. Use wildcard certificates. Don't use
| guessable subdomains like www or mail.
| AtNightWeCode wrote:
| Assuming this is not direct traffic to your IP, people will
| say it is because of TLS logs. Maybe it is in your case. But
| if you spin up a CF worker on a subdomain, you will also get
| hit by traffic immediately, and those certificates are
| wildcards. I think CF leaks subdomains in some cases. I've
| never seen this behavior when using CF just as a DNS server,
| though.
| jcalx wrote:
| Some bots scan using giant lists of subdomains, e.g.
| https://github.com/danielmiessler/SecLists/tree/master/Disco....
| Your subdomain may be on that giant combined_subdomains list, or
| perhaps some other lists that other tools use.
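A minimal sketch of this kind of wordlist scan, assuming one plain resolver lookup per candidate (real tools parallelize heavily and query dedicated resolvers):

```python
import socket

def brute_force_subdomains(domain, wordlist):
    """Try to resolve each candidate label under `domain`; any name
    that resolves exists in DNS, whether or not it was published."""
    found = []
    for word in wordlist:
        host = f"{word}.{domain}"
        try:
            socket.gethostbyname(host)
            found.append(host)
        except socket.gaierror:
            # NXDOMAIN / lookup failure: the name likely doesn't exist.
            pass
    return found
```

With a list like SecLists' combined_subdomains, a name such as "userfileupload" is well within reach of this approach.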
| TZubiri wrote:
| Maybe it's a cloudflare controlled scanner?
|
| Maybe you published the subdomain in a cert?
|
| Snooped traffic is unlikely.
|
| This is a good question, if you don't publish a subdomain,
| scanners should not reach it. If they do, there's a leak in your
| infra.
| CGamesPlay wrote:
| Be careful with these. I had a subdomain like this (completely
| unlisted) with a Google OAuth flow on it, using a
| development-mode Google app. Somehow, the domain was
| discovered, and Google decided that using their OAuth flow was
| a phishing scam and delisted my entire top-level domain as a
| result!
| yoavm wrote:
| What do you mean "careful with these"? With subdomains?
| CGamesPlay wrote:
| Yes, unlisted subdomains. I updated my post to be clearer.
| joshstrange wrote:
| I must be missing something. What does "unlisted" mean in
| this context?
|
| I have plenty of subdomains I don't "advertise" (tell
| people about online) but "unlisted" is a weird thing to
| call those. Also I don't see how it would matter at all
| when it comes to Google auth.
|
| My guess is they blocked it based on the subdomain name
| itself. I made a "steamgames" subdomain to list Steam games
| I have extra copies of (from bundles) for friends to grab
| for free. Less than a day after I put it up, I started
| getting Chrome scare pages. I switched it to "games" and
| there have been no issues.
| fsflover wrote:
| Could it be that Chrome shared the web page with advertisers?
|
| https://www.ghacks.net/2021/03/16/wonder-about-the-data-goog...
| perching_aix wrote:
| Using the Certificate Transparency logs I'd imagine.
|
| Also note that _your domains are live_ as they're allocated
| (they exist). Whether a web server or anything else actually
| backs them is a different question entirely.
|
| For "secret" subdomains, you'll want a wildcard certificate. That
| way only that will show on the CT logs. Note that if you serve
| over IPv4, the underlying host will be eventually discovered
| anyways by brute-force host enumeration, and the domain can still
| be discovered using dictionary attacks / enumeration.
|
| Never touched Cloudflare so this is as far as I can help you.
| immibis wrote:
| In addition to what other people said, you can assume
| Cloudflare is selling lists of DNS names to someone.
| bbarnett wrote:
| If you ever email a link and it hits gmail, Google will index it.
| whalesalad wrote:
| ICANN zone files -
| https://www.icann.org/resources/pages/czds-2014-03-03-en
| zeagle wrote:
| Can I ask an adjacent question? I have a bunch of DNS A
| record entries for locallyaccessedservice.mydomain.tld
| pointing to my 10.0.0.x NAS's nginx reverse proxy, so I can
| use HTTPS and DNS to access them locally and via Tailscale.
| My cert is for *.domain.tld. It's nothing critical and only
| accessible within my LAN, but is there any reason I shouldn't
| be doing this from a security point of view? I guess someone
| could phish that to another globally accessible server if DNS
| changed and I wouldn't notice, but I don't see how that would
| be an issue. There are a couple of nginx services exposed to
| the public, but not those specific domains, so I guess that
| is an attack vector.
| yatralalala wrote:
| As always, it depends on your threat model. Generally, having
| private IPs in public DNS is not great, because a potential
| attacker gets "a general idea" of what your private net looks
| like.
|
| But I'd say there's no issue if everything else is secured
| properly.
| zeagle wrote:
| Great thank you. I've mulled around running separate reverse
| proxies for public and internal services instead.
| Gabrys1 wrote:
| > Expanse, a Palo Alto Networks company, searches across the
| global IPv4 space multiple
|
| So my guess is reverse DNS
| itscrush wrote:
| > I am using CloudFlare for my DNS.
|
| Based on this it sounds like you exposed your resource and
| advertised it for others. Reverse dns, get IP, scan IP.
|
| Probably simpler: you exposed a resource on IPv4 publicly;
| if it exists, it'll be scanned. There are probably hundreds
| of companies scanning the entire 0.0.0.0/0 space at all
| times.
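The scanning itself is trivial. As a sketch, this is the building block such scanners repeat across every routable IPv4 address and a handful of common ports:

```python
import socket

def tcp_probe(ip, port=443, timeout=0.5):
    """Return True if something answers a TCP connection on ip:port.
    Internet-wide scanners run probes like this across 0.0.0.0/0."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: nothing (reachable) there.
        return False
```

Once a host answers, the scanner can grab the TLS certificate it serves, which often names the domains behind the IP.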
| eat wrote:
| DNS enumeration (brute force) with a good wordlist, zone
| transfer, or leaking the name through a certificate served when
| accessing your host via IP address are all possibilities.
|
| The name "userfileupload" is far from not-obvious, so that would
| be my guess.
| aspbee555 wrote:
| Cloudflare uses certificates with numerous other site names
| included as alt names, so your site name could have been
| discovered by any other site that happens to use that same
| cert.
| 1vuio0pswjnm7 wrote:
| Why not experiment with multiple variations. For example, as part
| of the experiment, run own DNS, use non-standard DNS encryption
| like CurveDNS, or even no DNS at all, use non-standard port for
| HTTPS, self-signed CA, TLS with no SNI extension, or even
| TCPCurve instead of CAs and TLS. If non-discoverability is the
| goal, there are infinite ways to deviate from web developer
| norms.
|
| If "the internet fails to find the subdomain" when using non-
| standard practices and conventions then perhaps "following the
| internet's recommendations", e.g., use Cloudflare, etc., might be
| partially at cause for discoverability.
|
| Would be surprised if Expanse scans more than a relatively small
| selection of common ports.
| codazoda wrote:
| This discussion makes me wonder, how hard is it to find a Google
| Document that was shared with "Anyone with the link"?
| oliwarner wrote:
| Certificate Transparency would also be my guess. These are logs
| published by big TLS certificate issuers to cross-check and make
| sure they're not issuing certificates for domains they have no
| standing on.
|
| The way around this is to issue a wildcard for your root domain
| and use that. Your main domain is discoverable but your subs
| aren't.
|
| There are other routes: leaky extensions, leaky DNS servers, bad
| internet security system utilities that phone home about traffic.
| Who knows?
|
| Unless your IP address redirects to your subdomain --not unheard
| of-- it's not somebody IP/port scanning. Webservers don't
| typically leak anything about the domains they serve for.
___________________________________________________________________
(page generated 2025-03-07 23:01 UTC)