[HN Gopher] You cannot simply publicly access private secure lin...
       ___________________________________________________________________
        
       You cannot simply publicly access private secure links, can you?
        
       Author : vin10
       Score  : 379 points
       Date   : 2024-03-07 16:29 UTC (1 day ago)
        
 (HTM) web link (vin01.github.io)
 (TXT) w3m dump (vin01.github.io)
        
       | internetter wrote:
       | The fundamental issue is that links without any form of access
       | control are presumed private, simply because there is no public
       | index of the available identifiers.
       | 
       | Just last month, a story with a premise of discovering AWS
       | account ids via buckets[0] did quite well on HN. The consensus
       | established in the comments is that if you are relying on your
       | account identifier being private as some form of security by
       | obscurity, you are doing it wrong. The same concept applies here.
       | This isn't a novel security issue, this is just another method of
       | dorking.
       | 
       | [0]: https://news.ycombinator.com/item?id=39512896
        
         | ta1243 wrote:
         | The problem is links leak.
         | 
          | In theory a 256-hex-character link (so 1024 bits) is near
          | infinitely more secure than a 32-character username and a
          | 32-character password. To guess
          | 
          | https://site.com/[256chars]
          | 
          | you'd have to search 2^1024 combinations. You'd never brute
          | force it
          | 
          | vs
          | 
          | https://site,com/[32chars] with a password of [32chars]
          | 
          | where there's 2^256 combinations. Again you can't brute force
          | it, but it's far more likely than the 2^1024 case.
         | 
         | Imagine it's
         | 
         | https://site,com/[32chars][32chars] instead.
         | 
         | But while guessing the former is harder than the latter, URLs
         | leak a lot, far more than passwords.
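        [The arithmetic above checks out if every character is hex, i.e.
        carries 4 bits — a quick sketch, with the hex alphabet as the
        assumption:]

```python
import math

def entropy_bits(num_chars: int, alphabet_size: int = 16) -> float:
    """Bits of entropy in a random string: chars * log2(alphabet size)."""
    return num_chars * math.log2(alphabet_size)

url_bits = entropy_bits(256)      # a 256-hex-char path: 1024 bits
cred_bits = entropy_bits(32) * 2  # 32-char user + 32-char password: 256 bits
print(url_bits, cred_bits)
```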
        
           | internetter wrote:
           | Dorking is the technique of using public search engine
           | indexes to uncover information that is presumed to be
           | private. It has been used to uncover webcams, credit card
           | numbers, confidential documents, and even spies.
           | 
            | The problem is the website administrators who are encoding
            | authentication tokens into URL state, _not_ the naive
            | crawlers that find them.
        
             | shkkmo wrote:
             | It can be OK to put authentication tokens in urls, but
             | those tokens need to (at a bare minimum) have short
             | expirations.
        
               | knome wrote:
               | >It can be OK to put authentication tokens in urls
               | 
               | When would this ever be necessary? URL session tokens
               | have been a bad idea ever since they first appeared.
               | 
               | The only things even near to auth tokens I can reasonably
               | see stuffed into a URL are password reset and email
               | confirmation tokens sent to email for one time short
               | expiration use.
               | 
               | Outside of that, I don't see any reason for it.
        
               | dylanowen wrote:
               | They're useful for images when you can't use cookies and
               | want the client to easily be able to embed them.
        
               | albert_e wrote:
               | "presigned" URLs[1] are a pretty standard and recommended
               | way of providing users access to upload/download content
               | to Amazon S3 buckets without needing other forms of
               | authentication like IAM credential pair, or STS token,
               | etc
               | 
                | Web applications use this pattern very frequently.
                | 
                | But as noted in a previous comment, these do have short
                | (configurable) expiry times, so there is no permanent
                | or long-term risk along the lines of the OP article.
               | 
               | [1]: https://docs.aws.amazon.com/AmazonS3/latest/userguid
               | e/using-...
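        [The mechanism behind presigned URLs can be sketched with nothing
        but an HMAC: the server signs the path plus an expiry timestamp,
        and refuses the request once the timestamp passes. A stdlib
        sketch of the idea only — S3's real scheme (SigV4, via boto3's
        generate_presigned_url) is more involved, and the key and path
        below are hypothetical:]

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

SECRET_KEY = b"service-side-signing-key"  # hypothetical; never sent to clients

def presign(path: str, ttl: int = 300) -> str:
    """Return a URL path that is self-authenticating until Expires passes."""
    expires = int(time.time()) + ttl
    msg = f"GET\n{path}\n{expires}".encode()
    sig = hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()
    return path + "?" + urlencode({"Expires": expires, "Signature": sig})

def verify(path: str, expires: int, signature: str) -> bool:
    if time.time() > expires:
        return False  # the key property: a leaked URL goes stale quickly
    msg = f"GET\n{path}\n{expires}".encode()
    good = hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(good, signature)
```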
        
               | vin10 wrote:
               | You are right about short expiry times but another catch
               | here is that if pre-signed URLs are being leaked in an
               | automated fashion, these services also keep the
               | downloaded content from these URLs around. I found
               | various such examples where links no longer work, but
               | PDFs downloaded from pre-signed URLs were still stored by
               | scanning services.
               | 
               | From https://urlscan.io/blog/2022/07/11/urlscan-pro-
               | product-updat...
               | 
               | > In the process of scanning websites, urlscan.io will
               | sometimes encounter file downloads triggered by the
               | website. If we are able to successfully download the
               | file, we will store it, hash it and make it available for
               | downloading by our customers.
        
               | couchand wrote:
               | Indeed, the only valid operation with the magic URL is
               | exchanging the URL-based token with something else (your
               | PDF, a session token, etc.) and then expiring the URL, so
               | by the time the scanner gets around to it the original
               | URL is invalid.
        
               | ta1243 wrote:
               | That seems ripe for race condition class problems.
        
               | couchand wrote:
                | If anybody but the intended recipient gets the magic URL
                | first, something more critical is wrong with the
                | assumptions in your authentication scheme.
        
               | knome wrote:
               | Interesting. I haven't built on s3, and if I did my first
               | instinct would probably have been to gate things through
               | a website.
               | 
               | Thanks for sharing your knowledge in that area.
        
             | layer8 wrote:
             | I wonder if there would be a way to tag such URLs in a
             | machine-recognizable, but not text-searchable way. (E.g.
             | take every fifth byte in the URL from after the authority
             | part, and have those bytes be a particular form of hash of
             | the remaining bytes.) Meaning that crawlers and tools in
             | TFA would have a standardized way to recognize when a URL
             | is meant to be private, and thus could filter them out from
             | public searches. Of course, being recognizable in that way
             | may add new risks.
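        [The tagging scheme above can be prototyped in a few lines:
        interleave check characters derived from a hash of the payload,
        so a crawler can cheaply test whether a URL was meant to be
        private. Purely illustrative — the every-fifth-byte layout and
        the SHA-256 choice are assumptions taken from the comment:]

```python
import hashlib
import secrets

def tag(raw: str) -> str:
    """Interleave one check character after every 4 payload characters."""
    check = hashlib.sha256(raw.encode()).hexdigest()
    out = []
    for i in range(0, len(raw), 4):
        out.append(raw[i:i + 4] + check[i // 4])
    return "".join(out)

def is_tagged(token: str) -> bool:
    """Recognize tagged tokens without any text-searchable marker."""
    if len(token) % 5:
        return False
    payload = "".join(c for i, c in enumerate(token) if (i + 1) % 5)
    checks = "".join(c for i, c in enumerate(token) if not (i + 1) % 5)
    digest = hashlib.sha256(payload.encode()).hexdigest()
    return checks == digest[:len(checks)]

secret_path = tag(secrets.token_hex(16))  # e.g. https://site.com/<secret_path>
```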
        
               | loa_in_ wrote:
               | We already have robots.txt in theory.
        
               | layer8 wrote:
               | I didn't think robots.txt would be applicable to URLs
               | being copied around, but actually it might be, good
               | point. Though again, collecting that robots.txt
               | information could make it easier to search for such URLs.
        
               | internetter wrote:
                | We already have a solution to this: it's called not
                | including authentication information within URLs.
                | 
                | Even if search engines knew to exclude such URLs, would
                | every insecure place a user puts a link know to? Bad
                | actors with their own indexes certainly wouldn't care.
        
               | layer8 wrote:
               | How do you implement password-reset links otherwise? I
               | mean, those should be short-lived, but still.
        
               | andersa wrote:
               | You could send the user a code that they must copy paste
               | onto the page rather than sending them a link.
        
               | vmfunction wrote:
                | Hopefully using POST, not GET. GET links get logged by
                | the HTTP server most of the time. Just another great way
                | to store your 'security credential' in plain text. Logs
                | get zipped and archived. Good luck with any security
                | measure after that.
        
               | andersa wrote:
               | I mean of course the idea was to put it in a form that is
               | sent using POST, but even then, it's a single-use reset
               | code so once it shows in the log it's worthless.
        
               | fullspectrumdev wrote:
               | This makes a large assumption about application logic
               | that is often incorrect.
               | 
               | t. security auditor/researcher.
        
               | rapind wrote:
               | It certainly does. Security usually comes at the cost of
               | convenience and can incur confusion.
               | 
               | In this example, where best practice may be to use one
               | time tokens, you will end up with users who click on the
               | secure link again (from their email) in the future to
               | access the secure site and they'll be frustrated when
               | they have to go through the secure link generation dance
               | again.
               | 
               | Of course you can mitigate this with sessions / cookies,
               | but that is also a security compromise and not device
               | portable.
               | 
                | It's easy to say that these are minor UX concerns, but
                | enforcing a high level of security may have a significant
                | user cost depending on your demographic. I have a
                | demographic that skews older and non-technical, and they
                | are pretty loud when they complain about this stuff...
                | meanwhile they are also more likely to reuse passwords
                | and forward emails with secure links in them!
        
               | conductr wrote:
               | Some people will always find something to complain about.
               | I feel like it's completely reasonable to give a "sorry
               | this link was only valid for 5 minutes and is now
               | expired, request a new code here" message. State it in
               | the email that originally contained the link and state it
               | again on the page when they click it afterwards. This is
               | incredibly common practice and very unlikely to be the
               | first time someone has seen this workflow. If they want
               | to complain further, direct them to a password manager
               | and remind them there's probably one built into their
               | browser already
        
               | rapind wrote:
               | > State it in the email that originally contained the
               | link and state it again on the page when they click it
               | afterwards.
               | 
               | No one reads this stuff. I'm not saying this to be
               | argumentative. I have a large user base and I know from
               | experience.
        
               | conductr wrote:
               | Oh I definitely agree. But the point is that you've
               | informed them of the process before expiring their link.
               | These types of complainers are just looking for an easy
               | button and don't care about your security policies so I
               | say to do this just so you can point at it and say it's
               | your process if someone really gets their panties in a
               | wad over it.
               | 
               | It's also why you say it on the site when the link is
               | found to be expired. You basically remind them of the
               | email even though they didn't read it. Just consistent
               | messaging is all. It might reduce the number of folks
               | that decide to yell at you over it but will never fully
               | eliminate them.
               | 
               | IMO the appropriate easy button is them using a password
                | manager, which is what I'd recommend. Also, just ignore
                | these complaints; if they don't take your explanation and
                | really push hard, at some point it's a customer not worth
                | pleasing.
        
               | hnlmorg wrote:
               | As you said, short lived codes. And the codes don't
               | contain any PII. So even if the link does get indexed,
               | it's meaningless and useless.
        
               | 1231232131231 wrote:
               | A short-lived link that's locked down to their user
               | agent/IP would work as well.
        
               | dmurray wrote:
               | Also, it would allow bad actors to just opt out of
               | malware scans - the main vector whereby these insecure
               | URLs were leaked.
        
               | fullspectrumdev wrote:
                | So there was an interesting vector a while back: some
                | email firewalls would reliably click on any link sent to
                | them, and spammers abused this.
                | 
                | Spammers would sign up for services that required a
                | click on a confirmation link, using
                | blabla@domainusingsuchservice
                | 
                | The service's phishing-check bots would reliably click
                | the link, rendering the account creation valid.
               | 
               | One particularly exploitable vendor for getting such
               | links clicked was one that shares the name with a
               | predatory fish that also has a song about it :)
        
               | rkagerer wrote:
               | SharkGate?
               | 
               | Why coy about naming them?
        
               | reaperman wrote:
               | Barracuda. And for plausible deniability so they don't
               | have as much of a chance of catching a libel suit. Not
               | sure how necessary or effective that is, but I do
               | understand the motivation.
        
               | tsimionescu wrote:
               | Actually, there are cases where this is more or less
               | unavoidable.
               | 
               | For example, if you want a web socket server that is
               | accessible from a browser, you need authentication, and
               | can't rely on cookies, the only option is to encode the
               | Auth information in the URL (since browsers don't allow
               | custom headers in the initial HTTP request for
               | negotiating a web socket).
        
               | zer00eyz wrote:
               | Authentication: Identify yourself
               | 
               | Authorization: Can you use this service.
               | 
               | Access Control/Tokenization: How long can this service be
               | used for.
               | 
               | I swipe my badge on the card reader. The lock unlocks.
               | 
               | Should we leave a handy door stopper or 2x4 there, so you
               | can just leave it propped open? Or should we have tokens
               | that expire in a reasonable time frame.. say a block of
               | ice (in our door metaphor) so it disappears at some point
               | in future? Nonce tokens have been a well understood
               | pattern for a long time...
               | 
                | It's not that these things are unavoidable; it's that
                | security isn't a first principle, or easy to embed, due
                | to issues of design.
        
               | bigiain wrote:
               | > Or should we have tokens that expire in a reasonable
               | time frame.
               | 
               | And that are single-use.
               | 
               | (Your password reset "magic link" should expire quickly,
               | but needs a long enough window to allow for slow mail
               | transport. But once it's used the first time, it should
               | be revoked so it cannot be used again even inside that
               | timeout window.)
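        [The combination described above — short expiry plus single use —
        fits in a few lines server-side. A minimal in-memory sketch; the
        class name and TTL are illustrative:]

```python
import secrets
import time

class ResetTokens:
    """Password-reset tokens: short expiry window, strictly single-use."""

    def __init__(self, ttl_seconds: float = 900.0):
        self.ttl = ttl_seconds
        self._live: dict[str, float] = {}  # token -> absolute expiry time

    def issue(self) -> str:
        token = secrets.token_urlsafe(32)
        self._live[token] = time.time() + self.ttl
        return token

    def redeem(self, token: str) -> bool:
        # pop() revokes the token even while it is still inside its
        # window, so a second use (e.g. by a URL scanner) always fails.
        expiry = self._live.pop(token, None)
        return expiry is not None and time.time() <= expiry
```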
        
               | skissane wrote:
               | > the only option is to encode the Auth information in
               | the URL (since browsers don't allow custom headers in the
               | initial HTTP request for negotiating a web socket).
               | 
               | Put a timestamp in the token and sign it with a private
               | key, so that the token expires after a defined time
               | period.
               | 
                | If the URL is only valid for the next five minutes, the
                | odds that it will leak and be exploited within that
                | window are very low.
        
               | bigiain wrote:
               | Yeah - that's just red-flagging "interesting" urls to
               | people running greyhat and blackhat crawlers.
        
             | FrustratedMonky wrote:
             | "public search engine indexes"
             | 
             | Then it should be the search engine at fault.
             | 
             | If you leave your house unlocked is one thing.
             | 
              | If there is a company trying everyone's doors, then
              | posting a sign in the yard saying "this house is
              | unlocked" has to count for something.
        
               | lmm wrote:
               | A plain URL is an open door not a closed one. Most
               | websites are public and expected to be public.
        
               | FrustratedMonky wrote:
               | Isn't that the point of the post?
               | 
                | There are URLs out there 'as-if' public that really
                | should be private.
                | 
                | And some people argue they should be treated as private,
                | even if it is just a plain URL and public.
        
               | lmm wrote:
               | You can't blame the search engine for indexing plain
               | URLs. Listing a closed-but-unlocked door is a bad
               | analogy.
        
               | FrustratedMonky wrote:
                | Well, you also can't charge Joe Blow with a crime for
                | browsing URLs that happen to be private but were
                | accidentally made public.
                | 
                | Just by looking, you are guilty. That is wrong.
        
               | pixl97 wrote:
               | You've been appropriately downvoted for a terrible take.
               | 
               | Imagine if you left your house unlocked it would be
               | broken into seconds later. Even worse, the people that
               | broke into it live in a different country with no
               | extradition law and you'd never figure out who they are
               | anyway.
               | 
                | In this case your insurance company would tell you to
                | lock your damned doors, and the police may even charge
                | you under public nuisance laws.
        
               | FrustratedMonky wrote:
               | Yeah, it is a terrible take. It's a bad situation.
               | 
               | Just like charging people for a crime for accessing
               | private material, simply by browsing a public URL.
               | 
                | Maybe a better take:
                | 
                | It is like being charged for breaking and entering
                | simply by looking at a house from the street when the
                | door was left open. You're guilty simply by looking and
                | seeing inside. But you were just walking by; you saw
                | inside before realizing it was a crime, and now you're
                | guilty.
                | 
                | If you are going to charge people for accessing private
                | sites, potentially by accident, simply because a search
                | engine handed them a public URL, then shouldn't the
                | search engine have some culpability?
               | 
               | Or. Better. Change the law so the onus is on the site to
               | protect itself.
        
             | thayne wrote:
             | That isn't an inherent problem with having a secret in the
             | url. The problem is the url was leaked somewhere where it
             | could get indexed.
             | 
             | And sometimes it isn't practical to require a POST request
             | or a cookie.
             | 
             | And the risk of a url leaking can be greatly mitigated if
             | the url is only valid for a short period of time.
        
               | ta1243 wrote:
               | > That isn't an inherent problem with having a secret in
               | the url. The problem is the url was leaked somewhere
               | where it could get indexed.
               | 
                | Technically you're right -- after all, sending the
                | authentication as a separate header doesn't make any
                | difference.
                | 
                |     GET /endpoint/?Auth=token
                | 
                | or
                | 
                |     GET /endpoint
                |     Auth: token
                | 
                | Both send the same data over the wire.
               | 
               | However software treats URLs differently to headers. They
               | sit in browser histories, server logs, get parsed by MITM
               | firewalls, mined by browser extensions, etc
               | 
               | using https://user:pass@site.com/endpoint or
               | https://auth:token@site.com/endpoint
               | 
               | Would be better than
               | 
               | https://site.com/endpoint/user/pass or
               | https://site.com/endpoint/?auth=token
               | 
               | As the former is less likely to be stored, either on the
               | client or on the server. I don't do front end (or backend
               | authentication -- I just rely on x509 client certs or
               | oidc and the web server passes the validated username)
        
               | paulgb wrote:
               | For better or worse, basic auth in the URL isn't really
               | an option any more, (e.g. see
               | https://stackoverflow.com/a/57193064). I think the issue
               | was that it reveals the secret to anyone who can see the
               | URL bar, but the alternative we got still has that
               | problem _and also_ has the problem that the secret is no
               | longer separable from the resource identifier.
        
               | thayne wrote:
               | The browser could hide the secret after it is entered.
        
           | 4death4 wrote:
           | Passwords are always private. Links are only sometimes
           | private.
        
             | QuinnyPig wrote:
             | Yup. There's a reason putting credentials into url
             | parameters is considered dangerous.
        
             | bachmeier wrote:
             | Well-chosen passwords stored properly are always private.
             | Passwords also tend to have much longer lifetimes than
             | links.
        
           | masom wrote:
            | You won't find a specific link, but at some point, if you
            | generate millions of urls, the 1024 bits will start to
            | return values pretty quickly through brute force.
           | 
           | The one link won't be found quickly, but a bunch of links
           | will. You just need to fetch all possibilities and you'll get
           | data.
        
             | blueflow wrote:
             | 1024 bits seems a bit too much for the birthday problem to
             | be a thing.
             | 
             | I looked at [1] to do the calculation but (2^1024)! is a
             | number too large for any of my tools. If someone has a math
             | shortcut to test this idea properly...
             | 
             | [1] https://en.wikipedia.org/wiki/Birthday_problem#Calculat
             | ing_t...
        
               | saagarjha wrote:
               | Stirling's approximation?
        
               | Dylan16807 wrote:
               | This isn't the birthday problem. That would be the chance
               | of two random links overlapping. The birthday problem
               | scales with n^2, while trying to guess links scales with
               | m * n, number of guesses multiplied by number of links.
               | 
               | (Well, before you apply the logistic taper to it. So you
               | wanted an approximation? There you go. Until you get the
               | chance of a hit to be quite high, it's basically equal to
               | guesses * valid links / 2^1024.)
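        [That taper-free approximation — guesses times valid links over
        2^1024 — is easy to evaluate in log space, which sidesteps the
        (2^1024)! blowup from upthread. The guess and link counts below
        are illustrative:]

```python
import math

def log10_hit_probability(guesses: float, valid_links: float, bits: int) -> float:
    """log10 of the expected number of hits, guesses * valid_links / 2**bits.

    A good approximation while the result is far below zero
    (i.e. before any logistic taper matters).
    """
    return math.log10(guesses) + math.log10(valid_links) - bits * math.log10(2)

# e.g. 10^30 guesses against 10^20 live 1024-bit links:
print(log10_hit_probability(1e30, 1e20, 1024))  # about -258
```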
        
               | ta1243 wrote:
               | The chance is less than guessing a random 128 bit
               | username and random 128 bit password. And then guessing a
               | completely different username and password on the very
               | next go.
               | 
               | You'd get far more return on investment breaking bitcoin
               | wallets.
               | 
               | 2^1024 is 10^308
               | 
                | Let's say there are 12 billion links per person, and 8
                | billion people. That's 100 billion billion, or 10^20
                | links.
                | 
                | 10^20 / 10^308 is effectively zero.
                | 
                | Let's say you can test 10 trillion links a second and
                | started when the Big Bang happened; you'll have tested
                | about 10^30 links so far.
               | 
               | The number of links you'll have found so far is zero.
        
               | Dylan16807 wrote:
               | Yes, but I'm not sure why you replied to me?
        
               | rjmunro wrote:
                | 2^1024 ≈ 10^308. There are only ~10^80 atoms in the
                | whole known universe. And we haven't even done the
                | factorial.
        
             | duskwuff wrote:
             | > You won't find a specific link, but at some point if you
             | generate millions of urls the 1024 bits will start to
             | return values pretty quick through bruteforce.
             | 
             | Not even close. 1024 bits is a really, really big address
             | space.
             | 
             | For the sake of argument and round numbers, let's say that
             | there are 4.2 billion (2^32) valid URLs. That means that
             | one out of every 2^992 randomly generated URLs is valid.
             | Even if you guessed billions of URLs every second, the
             | expected time to come up with a valid one (~2^960 seconds)
             | is still many orders of magnitude greater than the age of
             | the universe (~2^59 seconds).
        
             | charleslmunger wrote:
              | I'm not sure your math checks out. With 1024 bits of
              | entropy and, say, 1 trillion valid links, the chance of
              | any one guess being valid is 1/2^984.
              | 
              | So test a million links: your probability of finding a
              | real one is 1 - (1 - 1/2^984)^1000000, which is around
              | 1/10^290. Even if you avoid ever checking the same URL
              | twice, it will still take an impractical amount of time.
        
               | mbrumlow wrote:
               | All this is fine and dandy until your link shows up in a
               | log at /logs.
        
               | ummonk wrote:
               | The same can almost as easily happen with user-submitted
               | passwords.
        
               | 1231232131231 wrote:
               | Passwords usually don't show up in server logs if
               | submitted correctly.
        
             | eknkc wrote:
             | We call 128 bit random data "universally" unique ids. 1024
             | bits won't ever get close to returning any random hits.
        
           | Y_Y wrote:
           | > site-comma-com
           | 
           | Did you do that just to upset me?
        
           | hiddencost wrote:
           | No. In theory they are both totally insecure.
        
           | noahtallen wrote:
            | You can easily rate-limit an authentication attempt, making
            | brute-forcing account access practically impossible, even
            | for relatively insecure passwords.
            | 
            | How would you do that for the URLs? 5 requests to
            | site.com/[256chars] which all 404 block your IP because you
            | don't have a real link? I guess the security relies on the
            | fact that only a very small percentage of the total
            | possible links would ever be used? Though the likelihood of
            | randomly guessing a link is the same as the percentage of
            | addressable links in use.
        
             | ummonk wrote:
             | I don't think you realize how exponentially large the
             | possible combinations of 256 characters would be. In fact
             | it doesn't need to be anywhere near 256 characters. 64
             | hexadecimal characters would suffice.
        
           | ablob wrote:
           | Which alphabet did you take as a basis to reach 2^256
           | combinations?
        
             | 1231232131231 wrote:
             | Binary?
        
         | bo1024 wrote:
         | There's probably details I'm missing, but I think the
         | fundamental issue is that "private" messages between people are
         | presumed private, but actually the platforms we use to send
         | messages do read those messages and access links in them. (I
         | mean messages in a very broad sense, including emails, DMs,
         | pasted links in docs, etc.)
        
           | internetter wrote:
           | URL scanners are not scanning links contained within
           | platforms that require access control. They haven't guessed
           | your password, and to my knowledge no communications platform
           | is feeding all links behind authentication into one of these
           | public URL scanning databases. As the article acknowledged in
           | the beginning, these links are either exposed as the result
           | of deliberate user action, or misconfigured extensions (that,
           | I might add, are suffering from this exact same
           | misconception).
           | 
           | If the actual websites are configured to not use the URL as
           | the authentication state, all this would be avoided
        
             | tobyjsullivan wrote:
              | The suggestion (in both the article and the parent) is that
              | the platforms themselves are submitting URLs. For example,
              | if I send a link in a Discord[0] DM, it might show the
              | recipient a message like "warning: this link is malicious".
              | How does it know that? It submitted the URL to one of these
              | services without your explicit consent.
             | 
             | [0] Discord is a hypothetical example. I don't know if they
             | have this feature. But an increasing number of platforms
             | do.
        
               | internetter wrote:
                | Where in the article does it suggest this? The two bullet
                | points at the very top of TFA are what I cited to
                | discredit this notion. I read it again and still haven't
                | found anything suggesting the communication platforms are
                | submitting these themselves.
        
               | bombcar wrote:
               | Falcon Sandbox is explicitly mentioned - which is a
               | middleware that can be installed on various communication
               | platforms (usually enterprise):
               | https://www.crowdstrike.com/products/threat-
               | intelligence/fal...
               | 
               | Microsoft has "safe links":
               | https://learn.microsoft.com/en-
               | us/microsoft-365/security/off... - Chrome has its own
               | thing, but there are also tons of additional hand-rolled
               | similar features.
               | 
               | My main annoyance is when they kill a one-time use URL.
        
               | anonymousDan wrote:
               | Do you know if safe links is guilty of the issue in the
               | OP?
        
               | bombcar wrote:
               | I suspect not because Microsoft is using their own
               | internal system.
               | 
               | However, it likely exposes the content internally to
               | Microsoft.
               | 
               | They do 100% break Salesforce password reset links, which
               | is a major PITA.
        
               | tobyjsullivan wrote:
               | I thought I read it in the article but I may have
               | unconsciously extrapolated from and/or misread this part:
               | 
               | "I came across this wonderful analysis by Positive
               | Security[0] who focused on urlscan.io and used canary
               | tokens to detect potential automated sources (security
               | tools scanning emails for potentially malicious [links])"
               | 
               | I don't see any mention of messaging platforms generally.
               | It only mentions email and does not suggest who might be
               | operating the tooling (vendor or end users). So I seem to
               | have miscredited that idea.
               | 
               | [0] https://positive.security/blog/urlscan-data-leaks
        
             | nightpool wrote:
             | The article says "Misconfigured scanners". Many, many
             | enterprise communication tools have such a scanner, and if
             | your IT team is using the free plan of whatever url scan
             | tool they signed up for, it's a good bet that these links
             | may end up being public.
        
         | mikepurvis wrote:
         | Bit of a tangent, but I was recently advised by a consultant
         | that pushing private Nix closures to a publicly-accessible S3
         | bucket was fine since each NAR file has a giant hash in the
         | name. I didn't feel comfortable with it so we ended up going a
          | different route, but I've kept thinking about it since: how
          | different is it _really_ to have the "secret" be in the URL
          | vs. in a token you submit as part of the request for the
          | URL?
         | 
         | And I think for me it comes down to the fact that the tokens
         | can be issued on a per-customer basis, and access logs can be
         | monitored to watch for suspicious behaviour and revoke
         | accordingly.
         | 
         | Also, as others have mentioned, there's just a different
         | mindset around how much it matters that the list of names of
         | files be kept a secret. On the scale of things Amazon might
         | randomly screw up, accidentally listing the filenames sitting
         | in your public bucket sounds pretty low on the priority list
         | since 99% of their users wouldn't care.
        
           | johnmaguire wrote:
           | > how different is it really to have the "secret" be in the
           | URL vs in a token you submit as part of the request for the
           | URL?
           | 
           | I'm not sure I grok this. Do you mean, for example, sending a
           | token in the POST body, or as a cookie / other header?
           | 
           | One disadvantage to having a secret in the URL, versus in a
           | header or body, is that it can appear in web service logs,
           | unless you use a URI fragment. Even then, the URL is visible
           | to the user, and will live in their history and URL bar -
           | from which they may copy and paste it elsewhere.
        
             | mikepurvis wrote:
             | In this case it's package archives, so they're never
             | accessed from a browser, only from the Nix daemon for
             | binary substitution [1]:
             | https://nixos.wiki/wiki/Binary_Cache
        
           | nmadden wrote:
           | I wrote about putting secrets in URLs a few years ago:
           | https://neilmadden.blog/2019/01/16/can-you-ever-safely-
           | inclu...
        
             | Sn0wCoder wrote:
              | Question: in the Waterken-Key flow with the token in the
              | URL fragment, the URL looks like
              | https://www.example.com/APP/#mhbqcmmva5ja3 - but in the
              | diagram it's hitting example.com/API/#mhbqcmmva5ja3. Is
              | this a typo, OR are we mapping APP to API with the proxy
              | so the user thinks they are going to the APP with their
              | key? Or does the browser do this for us automatically
              | when it sees APP in the URL and then stores the key in
              | window.location.hash? I am confused and might just find
              | the answer on Google, but since you appear to be the
              | author maybe you can answer the question here.
        
               | nmadden wrote:
               | Oops, that's a typo.
        
           | cxr wrote:
           | > I've continued to think about that since how different is
           | it _really_ to have the  "secret" be in the URL vs in a token
           | you submit as part of the request for the URL
           | 
           | Extremely different. The former depends on the existence of a
           | contract about URL privacy (not to mention third parties
           | actually adhering to it) when no such contract exists. Any
           | design for an auth/auth mechanism that depends on private
           | links is inherently broken. The very phrase "private link" is
           | an oxymoron.
           | 
           | > _I am not sure why you think that having an obscure URI
           | format will somehow give you a secure call (whatever that
           | means). Identifiers are public information._
           | 
           | <https://roy.gbiv.com/untangled/2008/rest-apis-must-be-
           | hypert...>
        
         | bachmeier wrote:
         | > The fundamental issue is that links without any form of
         | access control are presumed private, simply because there is no
         | public index of the available identifiers.
         | 
         | Is there a difference between a private link containing a
         | password and a link taking you to a site where you input the
         | password? Bitwarden Send gives a link that you can hand out to
         | others. It has # followed by a long random string. I'd like to
         | know if there are security issues, because I use it regularly.
         | At least with the link, I can kill it, and I can automatically
         | have it die after a few days. Passwords generally don't work
         | that way.
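[Editor's note: one point in Bitwarden Send's favour is that the random string sits in the URL fragment (after the `#`), which browsers strip before sending the request. A small sketch, with a made-up Send-style link, showing what the server actually sees:

```python
from urllib.parse import urlsplit

# Hypothetical Send-style link; the key after '#' is invented for illustration.
url = "https://send.example.com/#/send/abc123/secretKeyMaterial"

parts = urlsplit(url)

# Browsers strip the fragment before issuing the request, so only the
# scheme/host/path reach the server (and its access logs, proxies, etc.).
sent_to_server = f"{parts.scheme}://{parts.netloc}{parts.path or '/'}"
print(sent_to_server)    # https://send.example.com/
print(parts.fragment)    # /send/abc123/secretKeyMaterial  (client-side only)
```

The fragment still lives in browser history and anywhere the full link gets pasted, so this protects against server-side logging, not against link-scanning tools that see the whole string.]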
        
           | koolba wrote:
           | If there's a live redirect at least there's the option to
           | revoke the access if the otherwise public link is leaked. I
           | think that's what sites like DocuSign do with their public
           | links. You can always regenerate it and have it resent to the
           | intended recipients email, but it expires after some fixed
           | period of time to prevent it from being public forever.
        
           | 7952 wrote:
           | There is a difference in that people intuitively know that
           | entering passwords gives access. Also, it may be different
           | legally as the user could reasonably be expected to know that
           | they are not supposed to access something.
        
             | bachmeier wrote:
             | > There is a difference in that people intuitively know
             | that entering passwords gives access.
             | 
             | This is a valid argument. However, I'd say that there are
             | two standard practices with links that are a big advantage:
             | giving them a short life, and generating extremely hard to
             | guess URLs. I was a Lastpass customer before their security
             | problems came out. I had many passwords that I made years
             | ago but don't use the service any longer. I moved more into
             | the URL camp at that time. Who knows how many passwords I
             | made 15 or 20 years ago that today are no longer secure.
        
           | PeterisP wrote:
           | Yes, the difference is in what all our tools and
           | infrastructure presume to be more or less sensitive.
           | 
            | Sending a GET request for the password-input screen and
            | then POST'ing the password gets very different treatment
            | than sending the same number of "authorization bits" in the
            | URL; in the first case, your browser won't store the secret
            | in its history, the webserver and reverse proxy won't
            | include it in their logs, various tools won't consider it
            | appropriate to cache, etc, etc.
           | 
           | Our software infrastructure is built on an assumption that
           | URLs aren't really sensitive, not like form content, and so
           | they get far more sloppy treatment in many places.
           | 
            | If the secret URL is short-lived or preferably single-use-
            | only (as e.g. many password reset links are) then that's
            | not an issue, but if you want to keep something secret
            | long-term, then putting it in a URL means it's very likely
            | to end up in various places which don't really try to keep
            | things secret.
        
         | fddrdplktrew wrote:
         | legend.
        
         | XorNot wrote:
         | Worked for a company which ran into an S3 bucket naming
         | collision when working with a client - turns out that both
         | sides decided hyphenated-company-name was a good S3 bucket name
         | (my company lost that race obviously).
         | 
          | One of those little formative experiences: every time I do
          | AWS now, all the bucket names are named
          | <project>-<deterministic hash from a seed value>.
          | 
          | If it's really meant to be private then you encrypt the
          | project name too and provide a script to list buckets with
          | "friendly" names.
          | 
          | There's always a weird tradeoff with hosted services where
          | the technically perfect thing (totally random identifiers) is
          | mostly an operational burden compared to the imperfect thing
          | (descriptive names).
        
           | cj wrote:
           | What would encrypting the project name accomplish? Typically
           | if you're trying to secure a S3 bucket you'll do that via
           | bucket settings. Many years ago you had to jump through hoops
           | to get things private, but these days there's a big easy
           | button to make a bucket inaccessible publicly.
        
             | XorNot wrote:
              | The point is that in some cases the name of the project
              | might itself be considered sensitive, so preventing
              | people from testing bucket names by trying to create
              | them helps. It doesn't completely lock you out of being
              | able to associate the bucket back to its internal name,
              | and it allows the names to be deterministic internally -
              | i.e. someone spinning up a test environment still gets
              | everything named appropriately, deterministically, and
              | uniquely.
        
       | scblock wrote:
        | When it comes to the internet, if something like this is not
        | protected by anything more than a random string in a URL then
        | it isn't really private. Same story with all the internet-
        | connected web cams you can find if you go looking. I thought we
        | knew this already. Why doesn't the "Who is responsible" section
        | even mention this?
        
         | AnotherGoodName wrote:
         | Such links are very useful in an 'it's OK to have security
         | match the use case' type of way. You don't need maximum
         | security for everything. You just want a barrier to widespread
         | sharing in some cases.
         | 
          | As an example, I hit 'create link share' on a photo in my
          | photo gallery and send someone the link to that photo. I
          | don't want them to have to enter a password. I want the link
          | to show the photo. It's OK for the link to do this. One of
          | the examples they have here is exactly that, and it's fine
          | for that use case. In terms of privacy fears, the end user
          | could re-share a screenshot at that point anyway even if
          | there was a login. The security matches the use case. The
          | user now has a link to a photo; they could reshare it, but I
          | trust they won't intentionally do this.
         | 
          | The big issue here isn't the links, imho. It's the security
          | analysis tools scanning all links a user received via email
          | and making them available to other users in that community.
          | That's more re-sharing than I intended when I sent someone a
          | photo.
        
           | nonrandomstring wrote:
           | > Such links are very useful in an 'it's OK to have security
           | match the use case'
           | 
           | I think you give the most sensible summary. It's about
           | "appropriate and proportional" security for the ease of use
           | trade-off.
           | 
           | > the user now has a link to a photo, they could reshare but
           | i trust they won't intentionally do this.
           | 
           | Time limits are something missing from most applications to
           | create ephemeral links. Ideally you'd want to choose from
           | something like 1 hour, 12 hours, 24 hours, 72 hours... Just
           | resend if they miss the message and it expires.
           | 
           | A good trick is to set a cron job on your VPS to clear
           | /www/tmp/ at midnight every other day.
           | 
           | > The big issue here isn't the links imho. It's the security
           | analysis tools scanning all links a user received via email
           | 
           | You have to consider anything sent to a recipient of Gmail,
           | Microsoft, Apple - any of the commercial providers - to be
           | immediately compromised. If sending between private domains
           | on unencrypted email then it's immediately compromised by
            | your friendly local intelligence agency. If using PGP or an
            | E2E chat app, assume it _will_ be compromised at the end
            | point eventually, so use an ephemeral link.
        
           | marcosdumay wrote:
           | The situation is greatly improved if you make the link short-
           | lived and if you put the non-public data in a region of the
           | URL that expects non-public data, like in the password, as in
           | "https://anonymous:32_chars_hash@myphotolibrary.example.com/u
           | ...".
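[Editor's note: to illustrate the idea with a made-up URL, standard URL parsers already expose the userinfo section as credentials, so well-behaved tooling can recognize and strip it before logging, unlike a secret hidden in the path:

```python
from urllib.parse import urlsplit

# Hypothetical link with the secret in the password position of the URL.
url = "https://anonymous:32_chars_hash@myphotolibrary.example.com/u/42"

p = urlsplit(url)
print(p.password)   # 32_chars_hash -- parsers expose this as a credential
# A log scrubber can rebuild the URL without the userinfo part:
print(f"{p.scheme}://{p.hostname}{p.path}")   # https://myphotolibrary.example.com/u/42
```
]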
        
       | victorbjorklund wrote:
        | Can someone smarter explain to me the difference between:
        | 
        | 1) domain.com/login user: John password: 5 char random password
        | 
        | 2) domain.com/12 char random url
        | 
        | If we assume both have the same bruteforce/rate limiting
        | protection (or none at all), why is 1 safer than 2?
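[Editor's note: for the brute-force half of the question, the raw numbers can be compared directly, assuming uniformly random choice from a 62-symbol alphanumeric alphabet:

```python
import math

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Bits of entropy in a uniformly random string."""
    return length * math.log2(alphabet_size)

# 2) a 12-char random URL segment: ~71 bits
print(round(entropy_bits(62, 12), 1))
# 1) a 5-char random password: ~30 bits. The 12-char URL is actually far
# harder to guess; the replies focus on handling, not raw entropy.
print(round(entropy_bits(62, 5), 1))
```
]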
        
         | amanda99 wrote:
         | Two things:
         | 
         | 1. "Password" is a magic word that makes people less likely to
         | just paste it into anything.
         | 
         | 2. Username + passwords are two separate pieces of information
         | that are not normally copy-pasted at the same time or have a
         | canonical way of being stored next to each other.
        
           | victorbjorklund wrote:
           | 1) Make sense. 2) Not sure about that. If someone shares
           | their password with someone else they probably share both the
           | username/email and the password
        
             | amanda99 wrote:
             | Yes, people share usernames and passwords, but there's no
             | single canonical string, like
             | "username=amanda99&password=hithere". For example most of
             | the time when I share user/pass combos, they are in
             | separate messages on Signal. You type them into two
             | different boxes, so you normally copy the username, then
             | the password in separate actions.
        
               | nightpool wrote:
                | I mean, for HTTP Basic there literally _is_ a single
                | canonical string, and it's not uncommon to see people
                | send you links like
                | https://user:somepasswordhere@example.com.
               | 
               | I think the arguments other commenters have made about
               | logging, browser history storage, etc are more convincing
        
         | munk-a wrote:
          | Assuming that 5-char password is handled in a reasonable way,
          | that data is not part of the publicly visible portion of the
          | request, which anyone along the chain of the communication
          | can trivially eavesdrop on. In a lot of cases the mere
          | existence of that password (even if there's no significant
          | data there) will transform a request from a cacheable one
          | into an uncacheable one, so intermediate servers won't keep a
          | copy of the response in case anyone else wants the document
          | (there are other ways to do this, but this will also force it
          | to be the case).
        
         | koliber wrote:
         | From the information theory angle, there is no difference.
         | 
         | In practice, there is.
         | 
         | There is a difference between something-you-have secrets and
         | something-you-know secrets.
         | 
          | A URL is something you have. It can be taken from you if you
          | leave it somewhere accessible. Passwords are something-you-
          | know and if managed well cannot be taken (except via the
          | lead-pipe attack).
         | 
         | There is also something-you-are, which includes retina and
         | fingerprint scans.
        
         | kube-system wrote:
         | The difference is that people (and software that people write)
         | often treat URLs differently than a password field. 12
         | characters might take X amount of time to brute force, but if
         | you already have the 12 characters, that time drops to zero.
        
         | rkangel wrote:
         | This article is the exact reason why.
         | 
         | (1) Requires some out-of-band information to authenticate.
         | Information that people are used to keeping safe.
         | 
          | On the other hand, the URLs in (2) are handled as URLs. URLs
          | are often logged, recorded, shared, passed around. E.g. your
          | work firewall logging the username and password you used to
          | log into a service would obviously be bad, but logging the
          | URLs you've accessed would probably seem fine.
         | 
         | [the latter case is just an example - the E2E guarantees of TLS
         | mean that neither should be accessible]
        
         | wetpaste wrote:
          | In the context of this article, it is that security scanning
          | software that companies/users are using seems to be indexing
          | some of the 12-char links out of emails, which in some cases
          | end up in public scan results. Additionally, if
          | domain.com/12-char-password is requested without https, even
          | if there is a redirect, that initial request went over the
          | wire unencrypted and therefore could be MITM'd, whereas with
          | a login page, there are more ways to guarantee that the
          | password submission would only ever happen over https.
        
         | jarofgreen wrote:
          | As well as what the others have said, various bits of
          | software assume that 1) may be private and should be handled
          | carefully, and that 2) isn't.
          | 
          | eg Your web browser will automatically save any URLs to its
          | history for any user of the computer to see, but will ask
          | first before saving passwords.
          | 
          | eg Any web proxies your traffic goes through, or other
          | software that's watching like virus scanners, will probably
          | log URLs but probably won't log form contents (yes, HTTPS
          | makes this one more complicated, but still).
        
         | hawski wrote:
          | You can easily make a regex to filter out URLs. There is no
          | universal regex (other than maybe a costly LLM) to match the
          | URL, the username and the password together.
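[Editor's note: a sketch of that asymmetry, with the pattern and log line invented for illustration. URLs have a recognizable shape, so a scrubber can redact token-like path segments, whereas no comparable pattern reliably picks a password out of free text:

```python
import re

# Redact long token-like path segments in anything URL-shaped.
TOKEN_IN_URL = re.compile(r"(https?://[^\s/]+/)[A-Za-z0-9_-]{16,}")

line = "GET https://site.example/3kT9xQvL0pZr8mWn2yBs HTTP/1.1"
print(TOKEN_IN_URL.sub(r"\1[REDACTED]", line))
# -> GET https://site.example/[REDACTED] HTTP/1.1
```
]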
        
         | ApolloFortyNine wrote:
         | I researched this a while ago when I was curious if you could
         | put auth tokens as query params.
         | 
          | One of the major issues is that many logging applications
          | will log the full URL somewhere, so now you're logging
          | 'passwords'.
        
           | laurels-marts wrote:
            | You can definitely pass a JWT as a query param (and often
            | they are in embedded scenarios) and no, it's not the same
            | as logging passwords unless you literally place the
            | password in the payload (which would be stupid).
        
       | amanda99 wrote:
       | Off topic: but that links to cloudflare radar which apparently
       | mines data from 1.1.1.1. I was under the impression that 1.1.1.1
       | did not use user data for any purposes?
        
         | kube-system wrote:
         | CF doesn't sell it or use it for marketing, but the entire way
         | they even got the addresses was because APNIC wanted to study
         | the garbage traffic to 1.1.1.1.
        
           | amanda99 wrote:
           | > CF doesn't sell it or use it for marketing
           | 
           | Any source for this? Do you work there? I checked their docs
           | and they say they don't "mine user data", so I wouldn't trust
           | anything they say, at least outside legal documents.
        
             | kube-system wrote:
             | https://1.1.1.1/dns/
             | 
             | > We will never sell your data or use it to target ads.
             | 
             | https://developers.cloudflare.com/1.1.1.1/privacy/public-
             | dns...
             | 
             | > Cloudflare will not sell or share Public Resolver users'
             | personal data with third parties or use personal data from
             | the Public Resolver to target any user with advertisements.
             | 
             | There's a lot of transparency on that page in particular,
             | down to the lists of the fields in the logs.
        
             | autoexec wrote:
             | > so I wouldn't trust anything they say, at least outside
             | legal documents
             | 
             | I wouldn't trust them even if it were in legal docs.
             | Companies have a long history of being perfectly fine with
             | breaking the law when doing so is profitable, especially
             | when they're likely to get little more than a slap on the
             | wrist when caught, and the odds of being caught in the
             | first place are slim.
        
       | boxed wrote:
       | Outlook.com leaks links to bing. At work it's a constant attack
       | surface that I have to block by looking at the user agent string.
       | Thankfully they are honest in the user agent!
        
       | overstay8930 wrote:
       | Breaking news: Security by obscurity isn't actually security
        
         | makapuf wrote:
         | Well, I like my password/ssh private key to be kept in
         | obscurity.
        
           | fiddlerwoaroof wrote:
           | Yeah, I've always hated this saying because all security
           | involves something that is kept secret, or "obscure". Also,
           | obscurity is a valid element of a defense in depth strategy
        
             | koito17 wrote:
             | To play devil's advocate, people discourage "security by
             | obscurity" but not "security with obscurity". That is to
             | say, secrets or "obscurity" as part of a layer in your
             | overall security model isn't what gets contested, it's
             | solely relying on obscure information staying obscure that
             | gets contested.
             | 
             | e.g. configuring an sshd accepting password auth and
             | unlimited retries to listen on a non-22 port is "security
             | by obscurity". configuring an sshd to disallow root logins,
             | disallow password authentication, only accept connections
             | from a subset of "trustworthy" IP addresses, and listen on
             | a non-22 port, is "security with obscurity"
        
             | maxcoder4 wrote:
              | The idea behind "security through obscurity" is that even
              | if the adversary knows everything about your setup
              | *except the secret keys*, you should be secure. Security
              | through obscurity is any method of protection other than
              | the secret key, for example:
              | 
              | * serving ssh on a random high port
              | 
              | * using a custom secret encryption algorithm
              | 
              | * hosting an unauthenticated service on a secret
              | subdomain in the hope nobody will find out, or with a
              | long directory name
              | 
              | Some security through obscurity is OK (for example, high
              | ports or port knocking help buy time when protecting
              | against a zero-day in the service). It's just that
              | relying only on security through obscurity is bad.
              | 
              | In this case, I wouldn't call URLs with embedded keys
              | security through obscurity, just poor key management.
        
               | fiddlerwoaroof wrote:
               | But, this is just relying on the obscurity of the key:
               | all security comes down to some form of secret knowledge.
               | It's just better to use a space that's hard to enumerate
               | than a low-cardinality space: if we had 1024 bits of port
               | numbers, picking a random port would be as hard to crack
               | as a 1024 bit encryption key.
        
           | overstay8930 wrote:
           | If you use an HSM you wouldn't have to worry about that
           | either
        
         | panic wrote:
         | "Security by obscurity" means using custom, unvetted
         | cryptographic algorithms that you believe others won't be able
         | to attack because they're custom (and therefore obscure).
         | Having a key you are supposed to keep hidden isn't security by
         | obscurity.
        
       | QuercusMax wrote:
       | I've always been a bit suspicious of infinite-use "private"
       | links. It's just security thru obscurity. At least when you share
       | a Google doc or something there's an option that explicitly says
       | "anyone with the URL can access this".
       | 
       | Any systems I've built that need this type of thing have used
       | Signed URLs with a short lifetime - usually only a few minutes.
       | And the URLs are generally an implementation detail that's not
       | directly shown to the user (although they can probably see them
       | in the browser debug view).
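[Editor's note: a minimal sketch of that signed-URL pattern (names, query format, and 5-minute lifetime invented here, not any particular cloud's scheme): sign the path plus an expiry with a server-side key, and verify both on every request:

```python
import hashlib
import hmac
import time

SECRET = b"server-side-signing-key"  # stays on the server, never in the URL

def sign_url(path: str, lifetime_s: int = 300) -> str:
    """Issue a short-lived signed URL for `path`."""
    expires = int(time.time()) + lifetime_s
    sig = hmac.new(SECRET, f"{path}|{expires}".encode(), hashlib.sha256).hexdigest()
    return f"{path}?expires={expires}&sig={sig}"

def verify(path: str, expires: int, sig: str) -> bool:
    """Reject expired links and forged signatures."""
    if time.time() > expires:
        return False
    expected = hmac.new(SECRET, f"{path}|{expires}".encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)  # constant-time comparison

url = sign_url("/files/report.pdf")
_, query = url.split("?")
params = dict(kv.split("=") for kv in query.split("&"))
print(verify("/files/report.pdf", int(params["expires"]), params["sig"]))  # True
print(verify("/files/report.pdf", int(params["expires"]), "forged"))       # False
```

A leaked link of this shape goes stale on its own within minutes, which is exactly the property infinite-use "private" links lack.]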
        
         | empath-nirvana wrote:
         | There's functionally no difference between a private link and a
         | link protected by a username and password or an api key, as
         | long as the key space is large enough.
        
           | ses1984 wrote:
           | You can't revoke an individual user's access to a hard to
           | guess link.
        
             | colecut wrote:
             | You can if it's one link per user
        
               | vel0city wrote:
               | Lots of platforms I've used with these public share links
               | don't really support multiple share links, and if they do
               | the management of it is pretty miserable. Clicking share
               | multiple times just gives the same link.
        
               | ses1984 wrote:
               | True but if you're generating one link per user, at what
               | point do you lift up your head and wonder if it wouldn't
               | be easier to just use authentication?
        
               | jddj wrote:
               | The friction that semi-private links remove is that the
               | recipient doesn't need an account for your service.
               | 
               | Any tradeoffs should be viewed in that context.
        
               | anonymousDan wrote:
               | I like how google docs does it. You can specify the email
               | of a user allowed to access the link (doesn't need to be
               | gmail). When they click it they will be told to check for
               | a validation email containing a link to the actual
               | document.
        
               | ses1984 wrote:
               | Isn't that basically a form of authentication?
               | 
               | I'm not sure if short lived temporary private links fit
               | the model of private links as described above.
               | 
               | If that counts as a private link, what if I'm using a
               | conventional session based app, I go into dev tools and
               | "copy as curl", does that qualify as a private link?
        
               | anonymousDan wrote:
               | Yes it is. My point was more that it's a relatively
               | lightweight way to create a shareable link that does not
               | require the consumers to create a new account on the
               | service hosting the linked resource in order to access
               | it. At the same time, merely having access to the link
               | doesn't really gain you anything, and so it is immune to
               | the kind of issues discussed in the article
        
           | kriops wrote:
           | There is a big difference in how the browser treats the
           | information, depending on how you provide it. Secrets in URLs
           | leak more easily.
        
           | rfoo wrote:
            | Most developers are aware that usernames and passwords are
            | PII, and that logging them is likely to get them fired.
            | 
            | Meanwhile our HTTP servers happily log every URI they
            | receive in access logs. Oh, and if you ever send a link in
            | a non-E2EE messenger, it's likely their server generated
            | the link preview for you.
        
           | vel0city wrote:
           | There's one big functional difference. People don't normally
           | have their username and password or API key directly in the
           | URL.
           | 
           | Example 1:
           | 
           | Alice wants Bob to see CoolDocument. Alice generates a URL
           | that has the snowflake in the URL and gives it to Bob. Eve
           | manages to see the chat, and can now access the document.
           | 
           | Example 2:
           | 
           | Alice wants Bob to see CoolDocument. Alice clicks "Share with
           | Bob" in the app, grabs the URL to the document with no
           | authentication encoded within and sends it to Bob. Bob clicks
           | the link, is prompted to login, Bob sees the document. Eve
           | manages to see the chat, follows the link, but is unable to
           | login and thus cannot see the document.
           | 
           | Later, Alice wants to revoke Bob's access to the document.
           | Lots of platforms don't offer great tools to revoke
           | individual generated share URLs, so it can be challenging to
           | revoke Bob's access without potentially cutting off other
           | people's access in Example 1, as that link might have been
            | shared with multiple people. In example 2, Alice just
            | removes Bob's access to the document, and now his login
            | doesn't have permission to see it. Granted, better link
            | management tools could solve this, but it often seems like
            | these snowflake systems don't really expose much control
            | over multiple share links.
        
             | Dylan16807 wrote:
             | Example 2 sounds like a pretty big pain if I can't go
             | directly from Bob's chat account to his document account.
             | Which is the case the vast majority of the time.
        
           | LaGrange wrote:
           | I mean, there's a functional difference if your email client
           | will try to protect you by submitting the URL to a public
           | database. Which is incredible and mind-boggling, but also
           | apparently the world we live in.
        
           | nkrisc wrote:
           | There's a big difference. The latter requires information not
           | contained in the URL to access the information.
        
             | ironmagma wrote:
              | That's not a fundamental difference but a difference of
              | convention. A lot of us have been in the convention long
              | enough that it seems fundamental.
        
             | deathanatos wrote:
              | > _Here's the URL to the thing:
              | https://example.com/a/url?secret=hunter2_
             | 
             | This is indexable by search engines.
             | 
              | > _Here's the URL to the thing: https://example.com/a/url
              | and the password is "hunter2"._
             | 
             | This is indexable by search engines.
             | 
             | Yes, the latter is marginally harder, but you're still
             | leaning on security through obscurity, here.
             | 
              | The number of times I have had "we need to securely
              | transmit this data!" end with exactly this, or something
              | equivalent: emailing an encrypted ZIP _with the password
              | in the body of the email_ (or sometimes in some other
              | insecure channel...) ...
        
               | nkrisc wrote:
               | Sure if you're comparing worst case of one to best case
               | of the other it's functionally similar, but if the
               | password is strong and handled properly then they are not
               | functionally similar at all.
        
               | pests wrote:
               | Right, but you settled on the answer as well. You must
               | communicate the password via a different medium, which is
               | impossible with links.
        
           | OtherShrezzing wrote:
           | There's at least one critical functional difference: The URL
           | stays in the browser's history after it's been visited.
        
           | cxr wrote:
           | > There's functionally no difference between a private link
           | and a link protected by a username and password or an api key
           | 
           | You mean mathematically there is no difference. Functionally
           | there is a very, very big difference.
        
         | voiper1 wrote:
         | >At least when you share a Google doc or something there's an
         | option that explicitly says "anyone with the URL can access
         | this".
         | 
         | Unfortunately, it's based on the document ID, so you can't re-
         | enable access with a new URL.
        
           | nightpool wrote:
           | Not true, as you may have heard they closed this loophole in
           | 2021 by adding a "resource key" (that can be rotated) to
           | every shared URL: https://9to5google.com/2021/07/28/google-
           | drive-security-upda....
        
       | sbr464 wrote:
       | All media/photos you upload to a private airtable.com app are
       | public links. No authentication required if you know the url.
        
         | internetter wrote:
         | This is actually fairly common for apps using CDNs - not just
         | airtable. I agree it's potentially problematic
        
           | blue_green_maps wrote:
           | Yes, this is the case for images uploaded through GitHub
           | comments, I think.
        
             | eddythompson80 wrote:
              | That's not true. There is a JWT in the URL with about a
              | 5-minute expiration window.
        
         | andix wrote:
         | There is a dilemma for web developers with images loaded from
         | CDNs or APIs. Regular <img> tags can't set an Authorization
         | header with a token for the request, like you can do with
          | fetch() for API requests. The only options are adding a
          | token to the URL or using cookie authentication.
         | 
         | Cookie auth only works if the CDN is on the same domain, even a
         | subdomain can be problematic in many cases.
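The token-in-the-URL option mentioned above is commonly implemented as a short-lived signed URL. A minimal sketch of that idea, assuming a hypothetical `sign_image_url`/`verify_image_url` pair and an illustrative signing key (in practice, loaded from configuration and kept server-side):

```python
import hashlib
import hmac
import time

SIGNING_KEY = b"server-side-secret"  # illustrative only; load from config


def sign_image_url(path: str, ttl_seconds: int = 300) -> str:
    """Mint a short-lived token and append it to the image URL,
    since <img> tags cannot send an Authorization header."""
    expires = int(time.time()) + ttl_seconds
    msg = f"{path}:{expires}".encode()
    sig = hmac.new(SIGNING_KEY, msg, hashlib.sha256).hexdigest()
    return f"{path}?expires={expires}&sig={sig}"


def verify_image_url(path: str, expires: int, sig: str) -> bool:
    """Server-side check: reject expired or tampered URLs."""
    if int(time.time()) > expires:
        return False  # token expired
    msg = f"{path}:{expires}".encode()
    expected = hmac.new(SIGNING_KEY, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)  # constant-time comparison
```

Because the token expires quickly, a URL that leaks into logs or link previews is only useful for a short window, which narrows (but does not eliminate) the exposure discussed in the article.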
        
       | ttymck wrote:
       | Zoom meeting links often have the password appended as a query
       | parameter. Is this link a "private secure" link? Is the link
       | without the password "private secure"?
        
         | bombcar wrote:
         | If the password is randomized for each meeting, the URL link is
         | not so bad, as the meeting will be dead and gone by the time
         | the URL appears elsewhere.
         | 
         | But in reality, nobody actually cares and just wants a "click
         | to join" that doesn't require fumbling around - but the
         | previous "just use the meeting ID" was too easily guessed.
        
           | runeb wrote:
            | Unless it's a recurring meeting.
        
       | godelski wrote:
       | There's a clear UX problem here. If you submit a scan it doesn't
       | tell you it is public.
       | 
       | There can be a helpful fix: make clear that the scan is public!
       | When submitting a scan it isn't clear, as the article shows. But
       | you have the opportunity to also tell the user that it is public
       | during the scan, which takes time. You also have the opportunity
       | to tell them AFTER the scan is done. There should be a clear
       | button to delist.
       | 
       | urlscan.io does a bit better but the language is not quite clear
       | that it means the scan is visible to the public. And the colors
        | just blend in. If something isn't catching your eye, it might
        | as well be treated as invisible. If there is a way to easily
        | misinterpret language, it will always be misinterpreted. And
        | if you have to scroll to find something, it'll never be found.
        
         | heipei wrote:
         | Thanks for your feedback. We show the Submit button on our
         | front page as "Public Scan" to indicate that the scan results
         | will be public. Once the scan has finished it will also contain
         | the same colored banner that says "Public Scan". On each scan
         | result page there is a "Report" button which will immediately
         | de-list the scan result without any interaction from our side.
         | If you have any ideas on how to make the experience more
         | explicit I would be happy to hear it!
        
           | godelski wrote:
           | I understand, but that is not clear enough. "Public scan" can
           | easily be misinterpreted. Honestly, when I looked at it, I
           | didn't know what it meant. Just looked like idk maybe a
           | mistranslation or something? Is it a scan for the public? Is
           | the scanning done in public? Are the results public? Who
           | knows. Remember that I'm not tech literate and didn't make
           | the project.
           | 
            | I'd suggest having two buttons, "public scan" and "private
            | scan". That contrast would contextualize the public scan
            | and clarify that what you are scanning is publicly
            | __listed__. And different colors. I think red for "public"
            | would actually be the better choice.
           | 
           | Some information could be displayed while scanning. Idk put
           | something like "did you know, using the public scan makes the
           | link visible to others? This helps security researchers. You
           | can delist it by clicking ____" or something like that and do
           | the inverse. It should stand out. There's plenty of time
           | while the scan happens.
           | 
           | > On each scan result page there is a "Report" button which
           | will immediately de-list the scan result without any
           | interaction from our side.
           | 
           | "Report" is not clear. That makes me think I want to report a
           | problem. Also I think there is a problem with the color
            | scheme. The palette is nice but, at least for myself, it
            | all kinda blends in. Nothing pops. Which can be nice at
           | but we want to draw the user to certain things, right? I
           | actually didn't see the report button at first. I actually
           | looked around, scrolled, and then even felt embarrassed when
           | I did find it because it is in an "obvious" spot. One that I
           | even looked at! (so extra embarrassing lol)
           | 
           | I think this is exactly one of those problems where when you
           | build a tool everything seems obvious and taken care of. You
           | clearly thought about these issues (far better than most!)
           | but when we put things out into public, we need to see how
           | they get used and where our assumptions miss the mark.
           | 
           | I do want to say thank you for making this. I am criticizing
           | not to put you down or dismiss any of the work you've done.
           | You've made a great tool that helps a lot of people. You
           | should feel proud for that! I am criticizing because I want
           | to help make the tool the best tool it can be. Of course
           | these are my opinions. My suggestion would be to look at
           | other opinions as well and see if there are common themes.
           | Godelski isn't right, they're just one of many voices that
           | you have to parse. Keep up the good work :)
        
             | heipei wrote:
             | Thanks, that is great feedback and we'll try to improve how
             | the scan visibility is shown and what it actually means.
             | The suggestion of adding a text to the loading page is a
             | great idea, and the feedback about the colors on the result
             | page is totally valid.
             | 
             | I'm the last person who wants to see private data
             | accidentally leak into the public domain. However
             | experience has shown that combating the massive amounts of
             | fraud and malicious activity on the web nowadays requires
             | many eyes that are able to access that data and actually do
             | something about it. That is the reason we have these public
             | scans in the first place.
        
               | godelski wrote:
               | And thank you for being receptive and listening! I hope
               | my thoughts and others can help make your tools better.
               | 
               | I really appreciate that people like you are out there
               | trying to defend our data and privacy. I know it is such
               | a difficult problem to solve and you got a lot of work
               | ahead of you. But appreciation is often not said enough
               | and left implied. So I want to make it explicit: Thank
               | you.
               | 
               | (and I'll say this interaction is the best advertisement
               | you could make, at least to me haha)
        
             | vin10 wrote:
             | This is a very well formulated suggestion. Nicely written!
        
       | r2b2 wrote:
       | To create private shareable links, store the private part in the
       | hash of the URL. The hash is not transmitted in DNS queries or
       | HTTP requests.
       | 
        | Ex. When links.com?token=<secret> is visited, that link will be
        | transmitted and potentially saved (query parameters included)
        | by intermediaries like Cloudflare.
       | 
       | Ex. When links.com#<secret> is visited, the hash portion will not
       | leave the browser.
       | 
        |  _Note: It's often nice to work with data in the hash portion
        | by encoding it as a URL-safe Base64 string (i.e. JS object ->
        | JSON string -> URL-safe Base64 string)._
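The encoding chain in that note can be sketched server-side when generating the link. A minimal example, assuming hypothetical helper names (`make_fragment_link`, `decode_fragment`); the decoding half is what client-side JS would do after reading `location.hash` and before POSTing the secret to the server:

```python
import base64
import json
import secrets


def make_fragment_link(base_url: str, payload: dict) -> str:
    # Carry the secret in the URL fragment: the part after "#" is never
    # sent in the HTTP request line, so server/CDN access logs and most
    # intermediaries never see it.
    data = dict(payload, secret=secrets.token_urlsafe(32))
    blob = base64.urlsafe_b64encode(json.dumps(data).encode())
    return f"{base_url}#{blob.decode().rstrip('=')}"


def decode_fragment(fragment: str) -> dict:
    # Re-add the Base64 padding stripped above, then reverse the chain:
    # URL-safe Base64 -> JSON string -> object.
    padded = fragment + "=" * (-len(fragment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

The trade-off, as noted elsewhere in the thread, is that the fragment still lands in browser history and is visible to any JavaScript running on the page.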
        
         | eterm wrote:
         | If it doesn't leave the browser, how would the server know to
         | serve the private content?
        
           | jadengeller wrote:
           | Client web app makes POST request. It leaves browser, but not
           | in URL
        
             | Gigachad wrote:
             | That only works if you can run JavaScript. If you want to
             | download a file with curl for example it fails.
        
         | klabb3 wrote:
         | Thanks, finally some thoughts about how to solve the issue. In
         | particular, email based login/account reset is the main
         | important use case I can think of.
         | 
         | Do bots that follow links in emails (for whatever reason)
         | execute JS? Is there a risk they activate the thing with a JS
         | induced POST?
        
           | 369548684892826 wrote:
           | Yes, I've seen this bot JS problem, it does happen.
        
           | r2b2 wrote:
            | To somewhat mitigate the _link-loading bot_ issue, the link
            | can land on a "confirm sign in" page with a button the user
           | must click to trigger the POST request that completes
           | authentication.
           | 
           | Another way to mitigate this issue is to store a secret in
           | the browser that initiated the link-request (Ex. local
           | storage). However, this can easily break in situations like
           | private mode, where a new tab/window is opened without access
           | to the same session storage.
           | 
           | An alternative to the in-browser-secret, is doing a browser
           | fingerprint match. If the browser that opens the link doesn't
           | match the fingerprint of the browser that requested the link,
           | then fail authentication. This also has pitfalls.
           | 
            | Unfortunately, if your threat model requires blocking
            | _bots that click too_, you're likely stuck adding some
            | semblance of a second factor (PIN/password, biometric,
            | hardware key, etc.).
           | 
           | In any case, when using link-only authentication, best to at
           | least put sensitive user operations (payments, PII, etc.)
           | behind a second factor at the time of operation.
        
             | klabb3 wrote:
             | > a button the user must click
             | 
             | Makes sense. No action until the user clicks something on
             | the page. One extra step but better than having "helpful
             | bots" wreak havoc.
             | 
             | > to store a secret in the browser [...] is doing a browser
             | fingerprint match
             | 
             | I get the idea but I really dislike this. Assuming the user
             | will use the same device or browser is an anti-pattern that
             | causes problems with people especially while crossing the
             | mobile-desktop boundary. Generally any web functionality
             | shouldn't be browser dependent. Especially hidden state
             | like that..
        
               | r2b2 wrote:
               | I agree, better to use an additional factor than
               | fingerprinting.
        
             | eadmund wrote:
             | > Another way to mitigate this issue is to store a secret
             | in the browser that initiated the link-request (Ex. local
             | storage).
             | 
             | Or just a cookie ...
             | 
             | But this approach breaks anyway in cases such as a user on
             | a desktop who checks his email on his phone for the
             | confirmation.
        
         | andix wrote:
         | Is there a feature of DNS I'm unaware of, that queries more
         | than just the domain part? https://example.com?token=<secret>
         | should only lead to a DNS query with "example.com".
        
           | erikerikson wrote:
           | The problem isn't DNS in GP. DNS will happily supply the IP
           | address for a CDN. The HTTP[S] request will thereafter be
           | sent by the caller to the CDN (in the case of CloudFlare,
           | Akamai, etc.) where it will be handled and potentially logged
           | before the result is retrieved from the cache or the
           | configured origin (i.e. backing server).
        
             | andix wrote:
             | This sounds like a big security flaw in the system that
             | uses access links. Secrets should not be logged (in most
             | cases).
             | 
             | When opening a Dropbox/GoogleDocs/OneDrive link, I expect
             | the application not to route them through potentially
             | unsafe CDNs.
        
           | r2b2 wrote:
           | Correct, DNS only queries the hostname portion of the URL.
           | 
            |  _Maybe my attempt to be thorough - by making note of DNS
            | alongside HTTP, since it's part of the browser - network -
            | server request diagram - was too thorough._
        
         | jmholla wrote:
          | > Ex. When links.com?token=<secret> is visited, that link
          | will be transmitted and potentially saved (query parameters
          | included) by intermediaries like Cloudflare.
         | 
         | Note: When over HTTPS, the parameter string (and path) is
         | encrypted so the intermediaries in question need to be able to
         | decrypt your traffic to read that secret.
         | 
         | Everything else is right. Just wanted to provide some nuance.
        
           | r2b2 wrote:
            | Good to point out. This distinction is especially
            | important to keep in mind when thinking about when and
            | where TLS/SSL is terminated for your service, and any
            | relevant threat models for the portion of the HTTP request
            | _after_ termination.
        
           | mschuster91 wrote:
           | Cloudflare, Akamai, AWS Cloudfront are all legitimate
           | intermediaries.
        
             | anonymouscaller wrote:
             | Yes, see "Cloudbleed"
        
         | loginatnine wrote:
         | It's called a fragment FYI!
        
           | shiomiru wrote:
           | However, window.location calls it "hash". (Also, the query
           | string is "search". I wonder why Netscape named them this
           | way...)
        
             | loginatnine wrote:
             | Interesting, thanks for the additional info.
        
           | SilasX wrote:
           | Yeah I was confused by it being referred to as the hash.
           | 
           | https://en.wikipedia.org/wiki/URI_fragment?useskin=vector
        
         | nightpool wrote:
         | The secret is still stored in the browser's history DB in this
         | case, which may be unencrypted (I believe it is for Chrome on
         | Windows last I checked). The cookie DB on the other hand I
         | think is always encrypted using the OS's TPM so it's harder for
         | malicious programs to crack
        
           | r2b2 wrote:
           | Yes, adding max-use counts and expiration dates to links can
           | mitigate against some browser-history snooping. However, if
           | your browser history is compromised you probably have an even
           | bigger problem...
        
         | phyzome wrote:
         | Huge qualifier: Even otherwise benign Javascript running on
         | that page can pass the fragment anywhere on the internet.
         | Putting stuff in the fragment helps, but it's not perfect. And
         | I don't just mean this in an ideal sense -- I've actually seen
         | private tokens leak from the fragment this way multiple times.
        
           | eadmund wrote:
           | Which is yet another reason to disable Javascript by default:
           | it can see everything on the page, and do anything with it,
           | to include sending everything to some random server
           | somewhere.
           | 
           | I am not completely opposed to scripting web pages (it's a
           | useful capability), but the vast majority of web pages are
           | just styled text and images: Javascript adds nothing but
           | vulnerability.
           | 
           | It would be awesome if something like HTMX were baked into
           | browsers, and if enabling Javascript were something a user
           | would have to do manually when visiting a page -- just like
           | Flash and Java applets back in the day.
        
       | rpigab wrote:
       | Links that are not part of a fast redirect loop will be copied
       | and pasted to be shared because that's what URLs are for, they're
       | universal, they facilitate access to a resource available on a
       | protocol.
       | 
       | Access control on anything that is not short-lived must be done
       | outside of the url.
       | 
       | When you share links on any channel that is not e2ee, the first
       | agent to access that url is not the person you're sending it to,
       | it is the channel's service, it can be legitimate like Bitwarden
       | looking for favicons to enhance UX, or malicious like FB
       | Messenger crawler that wants to know more about what you are
       | sharing in private messages.
       | 
       | Tools like these scanners won't get better UX, because if you
       | explicitly tell users that the scans are public, some of them
       | will think twice about using the service, and this is bad for
        | business, whether they're using it for free or paying for a pro
       | license.
        
       | qudat wrote:
       | Over at pico.sh we are experimenting with an entirely new type of
       | private link by leveraging ssh local forward tunnels:
       | https://pgs.sh/
       | 
       | We are just getting started but so far we are loving the
       | ergonomics.
        
       | zzz999 wrote:
       | You can if you use E2EE and not CAs
        
       | Terr_ wrote:
       | A workaround for this "email-based authentication" problem
       | (without going to a full "make an account with a password" step)
       | is to use temporary one-time codes, so that it doesn't matter if
       | the URL gets accidentally shared.
       | 
       | 1. User visits "private" link (Or even a public link where they
       | re-enter their e-mail.)
       | 
       | 2. Site e-mails user _again_ with time-limited single-use code.
       | 
       | 3. User enters temporary code to confirm ownership of e-mail.
       | 
       | 4. Flow proceeds (e.g. with HTTP cookies/session data) with
       | reasonable certainty that the e-mail account owner is involved.
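Steps 2 and 3 of that flow can be sketched as a small issue/redeem pair. This is a minimal illustration, assuming hypothetical names (`issue_code`, `redeem_code`) and an in-memory store; a real service would use a database or cache and send the code by email:

```python
import secrets
import time

# Hypothetical in-memory store: code -> (email, expiry timestamp).
_pending: dict = {}

CODE_TTL = 600  # codes are valid for 10 minutes


def issue_code(email: str) -> str:
    """Step 2: mint a time-limited, single-use code to email to the user."""
    code = secrets.token_urlsafe(8)
    _pending[code] = (email, time.time() + CODE_TTL)
    return code


def redeem_code(code: str, email: str) -> bool:
    """Step 3: consume the code exactly once; reuse or expiry fails."""
    entry = _pending.pop(code, None)  # pop => single use
    if entry is None:
        return False
    stored_email, expiry = entry
    return stored_email == email and time.time() < expiry
```

Because the code is single-use and short-lived, a leaked link from step 1 is useless on its own: an attacker would also need access to the victim's mailbox within the validity window.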
        
       | andix wrote:
        | A while ago I started to only send password-protected links
        | via email, with the plaintext password in the same email. This
        | might seem absurd and unsafe at first glance, but it safely
        | prevents exactly these kinds of attacks. Adding an expiration
        | time is also a good idea, even if it is as long as a few
        | months.
        
       | figers wrote:
        | We have used one-time query string codes at the end of a URL,
        | sent to the user's email address or as a text message, to
        | allow for this...
        
       | kgeist wrote:
        | Tried it with the local alternative to Google Drive. Oh my...
        | Immediately found lots of private data, including photos of
        | credit cards (with security codes), scans of IDs, passports...
        | How do you report a site?
        
       | rvba wrote:
       | Reminds me how some would search for bitcoin wallets via google
       | and kazaa.
       | 
        | On a side note, can someone remind me what was the name of
        | the file? I think I have some tiny fraction of a bitcoin on an
        | old computer.
        
       | snthd wrote:
       | "private secure links" are indistinguishable from any other link.
       | 
       | With HTTP auth links you know the password is a password, so
       | these tools would know which part to hide from public display:
       | 
       | > https://username:password@example.com/page
        
         | jeroenhd wrote:
         | I think it's quite funny that the URL spec has a section
         | dedicated to authentication, only for web devs to invent ways
         | to pass authentication data in any way but using the built-in
         | security mechanism.
         | 
          | I know there are valid reasons (the "are you sure you want to
          | log in as username on example.com?" prompt, for example) but this
         | is just one of the many ways web dev has built hacks upon hacks
         | where implementing standards would've sufficed. See also: S3 vs
         | WebDAV.
        
       | getcrunk wrote:
        | What's wrong with using signed URLs and encrypting the object
        | with a unique per-user key? It adds some CPU time, but if it's
        | encrypted, it's encrypted.
       | 
       | * this obviously assumes the objects have a 1-1 mapping with
       | users
        
       | dav43 wrote:
        | A classic one that has a business built on this is Pigeonhole
        | - literally private links for events, with people hosting
        | internal company events and users posting private, sometimes
        | confidential information. And even banks sign on to these
        | platforms!
        
       | BobbyTables2 wrote:
       | What happened to REST design principles?
       | 
       | A GET isn't supposed to modify server state. That is reserved for
       | POST, PUT, PATCH...
        
       | 65 wrote:
       | Well this is interesting. Even quickly searching
       | "docs.google.com" on urlscan.io gets me some spreadsheets with
       | lists of people's names, emails, telephone numbers, and other
       | personal information.
        
       | egberts1 wrote:
       | Sure, you can!
       | 
       | This is the part where IP filtering by country and subnet can
       | keep your ports hidden.
       | 
        | A stateful firewall can also be crafted to only let certain
        | IPs through after receiving a specially-crafted TOTP in an
        | ICMP packet - a form of port knocking - just to get the
        | firewall to open for your IP.
        
       | JensRantil wrote:
        | I'm surprised no one has mentioned creating a standard that
        | would allow these sites to check whether a link is private or
        | not.
       | 
        | For example, either a special HTTP header returned for a HEAD
        | request to the URL, or a downloadable file similar to
        | robots.txt that defines globs marking paths public/private.
       | 
       | At least this would (mostly) avoid these links becoming publicly
       | available on the internetz.
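The header variant of that proposal could look like the sketch below. No such standard exists today: the header name `X-Link-Visibility` and the `should_publish_scan` helper are entirely hypothetical, shown only to make the idea concrete. A scanner would issue a HEAD request and consult the response headers before listing a result publicly:

```python
# Hypothetical convention (not a real standard): a scanner treats a page
# as private if the site's HEAD response carries this header.
PRIVACY_HEADER = "X-Link-Visibility"


def should_publish_scan(response_headers: dict) -> bool:
    """Return True if the scan result may be listed publicly.
    Defaults to public, matching today's behavior for sites
    that don't opt in to the convention."""
    # HTTP header names are case-insensitive, so normalize first.
    headers = {k.lower(): v for k, v in response_headers.items()}
    value = headers.get(PRIVACY_HEADER.lower(), "public")
    return value.strip().lower() != "private"
```

As with robots.txt, this would only protect against well-behaved scanners; it does nothing against a party that ignores the convention.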
        
       ___________________________________________________________________
       (page generated 2024-03-08 23:01 UTC)