[HN Gopher] Behind GitHub's new authentication token formats
___________________________________________________________________
Behind GitHub's new authentication token formats
Author : todsacerdoti
Score : 146 points
Date : 2021-04-05 16:32 UTC (6 hours ago)
(HTM) web link (github.blog)
(TXT) w3m dump (github.blog)
| RyJones wrote:
| ok, great, when will I be able to scope tokens to acting on only
| one repo, or only one org, or in any meaningful way?
| nathanaldensr wrote:
| I completely agree. This change is interesting and all, but
| GitHub (Enterprise, in my case) tokens aren't granular enough.
| There's a lot more benefit to be had in fixing that issue.
| tptacek wrote:
| I'm sure this is a valid complaint about Github, but it has
| nothing to do with the article, which is a bit annoying since
| there's some cleverness in that article (checksumming tokens,
| for instance) that we could be talking about stealing, rather
| than turning this thread into a generic referendum on whether
| Github is good.
| chatmasta wrote:
| The article is about GitHub's authentication tokens. It seems
| relevant to bring up a complaint about their scopes.
| tptacek wrote:
| And yet it's not, because this is an article about token
| formats, and not about entire authorization schemes.
| threeseed wrote:
| a) I don't see what is particularly clever about their token
| algorithm.
|
| b) Talking about fundamental flaws in their token
| implementation seems relevant in a discussion about their
| token implementation.
|
| c) Public discussions about flaws are often the best way to
| educate others and make the company aware of them.
| paxys wrote:
| You can already do that today. Create a Github App and add it
| only to a single repo.
| chatmasta wrote:
| I definitely agree this is needed, but I have to imagine it's
| quite a complex change on GitHub's side. It probably entails
| changing a lot of their authentication architecture, in terms
| of what's stateless and stateful and what requires a trip to
| the DB to check (i.e., it's hard to encode a list of which
| resources you should have access to in a single token). I'm
| sure they're aware this is a problem though. Maybe this recent
| change sets the groundwork for fixing it.
| RyJones wrote:
| That the only method I have to scope tokens is to break the
| TOS for GitHub - by creating single-use accounts - is super
| lame.
| threeseed wrote:
| Don't forget that creating new accounts costs money.
|
| If you want to take advantage of Environments for example
| then you need to pay for the Enterprise license which means
| every account is another $21/month. That adds up for
| individual and startup use.
|
| I am actually confused why we can't just have tokens
| assigned to organisations and not users.
| JamesSwift wrote:
| What TOS does it violate? They explicitly recommend
| creating machine users in certain areas of their help docs.
| jpalomaki wrote:
| This one? "you may not have more than one free Account"
|
| https://docs.github.com/en/github/site-policy/github-
| terms-o...
|
| But there's also later more detailed guidance:
|
| "One person or legal entity may maintain no more than one
| free Account (if you choose to control a machine account
| as well, that's fine, but it can only be used for running
| a machine)."
| kroltan wrote:
| Presumably then it would be non-violating to have many
| paid accounts? Sure is pricey for smaller operations but
| depending on the value it brings to have this more
| granular scoping it surely might be worth it?
|
| Doesn't remove their need to implement a more reasonable
| way of scoping tokens, though.
| tptacek wrote:
| Whatever the letter of this restriction is, that's not
| its spirit. Our practice at my last company was per-
| client segregated accounts, and I have a mailbox full of
| discussions with Github support staff telling us that was
| OK.
| marcinzm wrote:
| This seems like a classic case of rules with high
| potential for selective enforcement which generally leads
| to unfair enforcement. It's fine as long as you don't
| somehow get on Github's bad side and if you do it's an
| instant reason to close your accounts.
| ArchOversight wrote:
| This means that if you create a work account that is
| separate from your personal account (because you don't
| use personal credentials on work machines and vice-versa)
| then you are technically in violation of the TOS...
|
| Yet this is something I, and many others, do because we
| don't want to mix business with pleasure. In fact I
| absolutely refuse to do so because of security reasons.
| Arnavion wrote:
| Many Microsoft employees also have separate personal and
| work accounts. I would be surprised if that was a
| violation.
| masklinn wrote:
| It is explicitly a technical violation of the TOS.
|
| In practice this is mostly so GitHub has grounds to act against
| misuse, such as bots, which might not be caught by the anti-
| spam measures.
|
| They do mind a bit but not too much, and you can get help
| from support having literally stated that you have
| multiple accounts. For instance if you're testing an
| extension or integration with github, and there are
| specific interactions between different users... you
| kinda need different users to test it. And mocking github
| may not be sufficient.
| Arnavion wrote:
| Yes. I'm talking specifically about the "separate
| personal and work accounts" case. It may very well be a
| TOS violation as the TOS is written. I'm saying I'd be
| surprised if they treated it as one.
| chatmasta wrote:
| I'm actually dealing with something like this right now
| and am curious what solutions people use for e2e testing
| of OAuth flows. I'm leaning toward creating a test
| account at each Identity Provider, but then I have to
| deal with things like 2FA. I guess it's not so bad if I
| just use a TOTP generator on the client, but if they want
| to send an email to verify my account, that's just
| annoying.
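For the "TOTP generator on the client" idea above, here is a minimal sketch of RFC 6238 code generation using only the Python standard library. It assumes the identity provider uses the common defaults (HMAC-SHA1, 30-second step, 6 digits) and that you already hold the raw secret bytes; providers usually hand out the secret base32-encoded, so it would need decoding with `base64.b32decode` first. The function name `totp` and its parameters are made up for this example.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for a raw secret."""
    now = time.time() if at is None else at
    counter = int(now // step)
    # HOTP (RFC 4226): HMAC the counter as an 8-byte big-endian integer.
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte selects a 4-byte window.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


if __name__ == "__main__":
    # Secret taken from the RFC 6238 test vectors, not a real credential.
    print(totp(b"12345678901234567890"))
```

An e2e test could call `totp(secret)` right before submitting the 2FA form. That does not help with providers that insist on email verification.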
| derefr wrote:
| Aren't the work accounts, paid accounts? The TOS only
| restricts having multiple free accounts.
| ArchOversight wrote:
| No, the work accounts are not paid accounts. My work
| requires me to interact with Open Source projects and the
| like. We have our own hosted Gitlab instance for our
| internal projects.
|
| The work account is strictly to communicate with/provide
| patches back to upstream projects.
|
| I have multiple personal accounts on Github. One for each
| employer I have worked at in the past couple of years,
| and my personal account that is tied to my own identity
| and is used for my personal time projects/open source
| work that is not tied to $work.
| TheSoftwareGuy wrote:
| Well the solution is simple, just make a new LLC for each
| GitHub account you need!
|
| I'm definitely kidding, but unless there is more in their
| TOS (which I don't intend to read) I don't see why this
| wouldn't be a workable loophole.
| threeseed wrote:
| GitHub for me is running out of excuses for why the
| fundamentals of their platform are so poor. And why compared
| to Gitlab they deliver improvements at such an anaemic pace.
| They act like a company with 15 employees rather than 1,500+.
|
| Everything from Security, Actions, Containers, Packages,
| Terraform, JIRA Integration etc is either completely broken
| or has major outstanding issues that haven't been fixed for
| years.
| chatmasta wrote:
| Personally, we use GitLab for all the internal repos at our
| company. We originally migrated because GitLab CI was free
| and GitHub didn't even have a CI solution at the time. We
| still use GitHub for our public repos since that's "where
| the community is." GitHub actions is great, although IMO a
| bit too prescriptive (I'd rather write a script that can
| run anywhere rather than spend time building up the mental
| model of the abstractions that are unique to GHA). But
| nothing beats GitLab CI + container registry; we put a lot
| of work into our CI pipeline and now we've got incremental
| builds with a Docker image for every service tagged per
| commit. And since GitLab Container Registry supports
| manifest v2, we can take advantage of BuildKit layer
| caching (I think GitHub registry supports this now too, but
| haven't played with it).
|
| That said, GitLab has its fair share of problems too.
| GitHub UI is way better, community/discussion features are
| better, and forking/public collaboration workflow is
| better.
|
| I'm glad there are two big players in the space, though;
| GitLab really lit a fire under GitHub to finally get them
| to start pushing new features.
| systemvoltage wrote:
| Side rant: Github devs, if you're listening, please give us a REST
| API. Not everyone knows GraphQL or has the motivation to learn it.
| The industry standard is REST for public facing APIs, including
| companies such as Stripe (widely considered to be the gold
| standard for public API design and documentation). You can use
| GraphQL internally.
| minitech wrote:
| https://docs.github.com/en/rest
| systemvoltage wrote:
| I remember having to use GraphQL to delete a Docker image
| that was stuck in my private repo and there was no GUI to
| clear it. Wasted a couple of hours trying to send a GraphQL
| query which would have been a 2 minute jobbie using cURL.
| Github's public REST API didn't have this feature.
| bastardoperator wrote:
| They have a curl example in the docs
|
| https://docs.github.com/en/packages/learn-github-
| packages/de...
| azimuth11 wrote:
| A lot of REST APIs are just as hard to grok as GraphQL is as a
| whole. Companies often lack schemas and documentation, which
| GraphQL helps with out of the box.
| glsdfgkjsklfj wrote:
| You know a company does not take tech seriously when they use fancy
| quotation marks in code blocks.
| seoaeu wrote:
| It seems weird that in a blog post about a new format for tokens,
| there isn't a single example of what a GitHub token now looks
| like.
| Mandatum wrote:
| I wonder why they went with 2 rather than 3 or 4 for company
| identifier. Stock ticker for instance would make sense. Not
| really practical.
| paxys wrote:
| This isn't meant to be a standard, just something they picked
| for themselves. And it doesn't even need to be a company
| identifier. Slack tokens are prefixed with "xox<token type>-",
| for example.
| ramses0 wrote:
| https://tools.ietf.org/html/rfc8959 - "secret-
| token:E92FB7EB-D882-47A4-A265-A0B6135DC842%20foo"
| Wowfunhappy wrote:
| Does anyone know if these new tokens are backwards compatible
| with software that used the old tokens? By which I mean, I'm
| using a version of Git Tower from before they switched to a
| subscription model, and I'm wondering whether regenerating my
| tokens will make me unable to log in.
| mperham wrote:
| The key insight here is that random tokens should be self-
| describing, so you know their intended use and therefore can make
| decisions and take action when one is detected.
|
| If a script sees "ABC123" in a code commit, that's meaningless.
| If you see "secret-token:ABC123", now you can fail the commit
| with an error message: "Secret token detected in public commit,
| aborting."
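The commit check described above can be sketched in a few lines of Python. This assumes the `secret-token:` prefix from RFC 8959 (linked elsewhere in this thread); the character class in the regex is a simplification picked for this example, not the RFC's full grammar, and the names `find_secret_tokens` and `check_diff` are hypothetical.

```python
import re
import sys
from typing import List

# Simplified pattern: the "secret-token:" scheme followed by URL-safe characters.
SECRET_TOKEN_RE = re.compile(r"secret-token:[A-Za-z0-9._~%-]+")


def find_secret_tokens(text: str) -> List[str]:
    """Return every substring that looks like a secret-token URI."""
    return SECRET_TOKEN_RE.findall(text)


def check_diff(diff_text: str) -> int:
    """Return a non-zero exit status if the diff appears to contain a secret."""
    if find_secret_tokens(diff_text):
        print("Secret token detected in public commit, aborting.", file=sys.stderr)
        return 1
    return 0


if __name__ == "__main__":
    # Intended use in a hook: pipe the staged diff in on stdin.
    sys.exit(check_diff(sys.stdin.read()))
```

A bare "ABC123" passes through untouched, which is the point of the comment above: without the prefix the scanner has nothing to match on.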
| staticassertion wrote:
| FWIW it is still very much worth reading the article, since
| they talk about how they implement that approach. They bring up
| why they use an underscore, checksum'ing, entropy, etc.
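To make the underscore, checksum and entropy points concrete, here is a rough sketch of a prefixed token with a checksum suffix. As I understand the article, the checksum is a CRC32 encoded as six Base62 characters at the end of the token. Everything else below is an assumption for illustration: the `xx` prefix, the 30-character random body, the alphabet ordering, and the choice to checksum only the body.

```python
import secrets
import string
import zlib

# Assumed alphabet ordering; the first symbol is "0" so padding reads as leading zeros.
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62(n: int, width: int) -> str:
    """Encode a non-negative integer in Base62, left-padded with zeros."""
    out = ""
    while n:
        n, r = divmod(n, 62)
        out = BASE62[r] + out
    return out.rjust(width, "0")


def new_token(prefix: str = "xx", body_len: int = 30) -> str:
    """Build <prefix>_<random body><6-char checksum of the body>."""
    body = "".join(secrets.choice(BASE62) for _ in range(body_len))
    return f"{prefix}_{body}{base62(zlib.crc32(body.encode()), 6)}"


def looks_valid(token: str) -> bool:
    """Offline sanity check: does the trailing checksum match the body?"""
    _prefix, sep, rest = token.partition("_")
    if not sep or len(rest) <= 6:
        return False
    body, check = rest[:-6], rest[-6:]
    return base62(zlib.crc32(body.encode()), 6) == check
```

Six Base62 characters are enough for any 32-bit value, since 62**6 is larger than 2**32. The checksum is not a security feature; it only lets a scanner discard random strings that happen to start with the prefix, without a database lookup.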
| echelon wrote:
| One of the biggest learnings of our org was to prefix tokens
| with the entity type. It's helped immensely.
|
| entity-type:RANDOM_TOKEN
|
| * Helps in migrations, especially complex ones where you split
| up entity types.
|
| * Identifies what tokens are so people can look them up if they
| see them in logs.
|
| * Polymorphic relationships can delegate to the appropriate
| owning service easily without additional bookkeeping.
|
| You can also encode other stuff in the token entropy, too, such
| as the author DC/region for active-active setups where you need
| to forward the request to the source of truth in the brief
| window where the other regions don't know about it yet.
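A small sketch of the entity-type prefix and the "delegate to the owning service" point, assuming the `entity-type:RANDOM_TOKEN` layout shown above. The entity types, handler table and service names are invented for this example.

```python
import secrets

# Hypothetical routing table: entity type -> function standing in for the owning service.
HANDLERS = {
    "user": lambda raw: f"user-service:{raw}",
    "invoice": lambda raw: f"billing-service:{raw}",
}


def new_id(entity_type: str) -> str:
    """Mint an ID of the form entity-type:RANDOM_TOKEN."""
    # token_urlsafe never emits ":", so the first colon is always the separator.
    return f"{entity_type}:{secrets.token_urlsafe(16)}"


def route(token: str) -> str:
    """Dispatch on the prefix alone, with no lookup to discover the type."""
    entity_type, sep, raw = token.partition(":")
    if not sep or entity_type not in HANDLERS:
        raise ValueError(f"unknown or missing entity type in {token!r}")
    return HANDLERS[entity_type](raw)
```

The same `partition` call is what someone reading logs does by eye: the part before the colon says where to look the ID up.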
| fiddlerwoaroof wrote:
| I've always thought that a Java-style reverse domain name
| format (or, perhaps, URLs) is a great way to encode IDs:
| com.foo.bar.Person:0000-11111-22222-33333 or whatever. That
| way, any code that logs IDs or transfers IDs across the
| network gets tracing "for free" and, when you see an ID in a
| bug report, you can use it to help focus the investigation.
| ljm wrote:
| Ruby/Rails has this in the form of GlobalID[1]. To be
| honest I haven't seen it used outside of whatever Rails
| itself automatically does, but the concept is there.
|
| [1]https://github.com/rails/globalid
| 11235813213455 wrote:
| that's how Stripe prefixes their IDs too, depending on what
| type of entity it is. Makes debugging, docs, .. easier
| xPaw wrote:
| For those that haven't seen it, "secret-token:" is an RFC. I've
| started using it at work.
|
| https://tools.ietf.org/html/rfc8959
| nine_k wrote:
| An unencrypted _version_ marker would be pretty useful, too.
| If anything is long-term, you can safely bet that it'll
| need to evolve.
| cornstalks wrote:
| Note that the RFC's category is "informational", which
| doesn't give it as much weight as something that is
| "standards track". Usually the important RFCs are "standards
| track" though there are some "informational" RFCs that are
| also important.
|
| From Wikipedia[1]:
|
| > _An informational RFC can be nearly anything from April 1
| jokes to widely recognized essential RFCs like Domain Name
| System Structure and Delegation (RFC 1591). Some
| informational RFCs formed the FYI sub-series._
|
| [1]: https://en.wikipedia.org/wiki/Request_for_Comments#Infor
| mati...
| revicon wrote:
| The ID prefixing is cool from an identification point of view,
| but we've been using UUIDs for tokens and if we implemented
| this we wouldn't be able to use the UUID optimized datatype
| field in Postgres.
| marksomnian wrote:
| Surely you can just strip off the prefix in the application
| layer before sending it to Postgres? You still get the
| benefits, while being able to use the native query.
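Stripping the prefix in the application layer could look roughly like this. The `tk_` prefix and the function names are hypothetical; the only real API used is Python's `uuid` module, and the returned `uuid.UUID` is what you would hand to your database driver for a native `uuid` column.

```python
import uuid

PREFIX = "tk_"  # hypothetical prefix, chosen for this example


def to_external(value: uuid.UUID) -> str:
    """What users and logs see: prefix plus the canonical UUID string."""
    return f"{PREFIX}{value}"


def to_db(token: str) -> uuid.UUID:
    """What the database sees: the bare UUID, with the prefix checked and removed."""
    if not token.startswith(PREFIX):
        raise ValueError("token does not carry the expected prefix")
    # uuid.UUID raises ValueError itself if the remainder is not a valid UUID.
    return uuid.UUID(token[len(PREFIX):])
```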
| edoceo wrote:
| I do it like that. We're only using a two char prefix (I
| copied Twilio)
| orf wrote:
| Why not? You don't have to store the prefix and the UUID in
| the same column?
| asimpletune wrote:
| Can't you just add a column to your schema with the prefix?
| [deleted]
| gkop wrote:
| Would someone comment on this idea in context of JWTs? Not
| trolling, just curious as I use JWTs and embed this kind of
| metadata as a custom claim, which accomplishes some but far from
| all of what GitHub accomplishes here, but then I have no need for
| the easy scanning. So seeking wisdom from anyone who has thought
| carefully about whether or not to prefix their JWTs in this way.
| benatkin wrote:
| JWTs purposely contain information in plain text (unencrypted
| and not stored in a database), however it is in base64 so you
| don't need to worry about url encoding issues and so it looks
| like a token.
|
| You could add a prefix to a jwt. That would make it a token
| that contains a jwt.
|
| I don't think the tiny prefix is what they want to obscure. So
| it wouldn't go against the design of JWT to add one.
|
| I would do it. I don't see any issues with it.
|
| It would be something like:
|
| BA_<base64>.<base64>.<base64>
|
| If you wanted to be able to double click to copy and paste,
| which I don't think is a huge usability improvement, you could
| replace the . with _, and I think a lot of devs would be able
| to figure out that it's a representation of a JWT.
| paxys wrote:
| A big motivation for such token formats is to quickly and
| easily identify when they are shared somewhere they shouldn't
| be. JWTs aren't helpful in that regard, since they always
| present themselves as a base64 encoded blob.
| gkop wrote:
| Totally, but JWT-like blobs can be detected (see sibling
| comment), and parsing attempted, so for the automated
| scanning use case, if I understand correctly, it can be done
| with perfect accuracy, just at a larger computation expense
| and worse security exposure due to the complexity of the
| scanning and the need to parse.
|
| The more interesting side to me is the benefit to humans,
| from the prefix technique.
| threeseed wrote:
| The question is why you couldn't just have "${prefix}-${JWT}"
| as your format.
|
| Then you can just strip the prefix before parsing. Which
| means you don't need to worry about checksumming or entropy and
| you have the ability to embed large amounts of data as well
| as plenty of client support and libraries.
|
| Would be curious if this implementation is somehow more
| performant.
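A sketch of the "${prefix}-${JWT}" idea, with one wrinkle worth spelling out: base64url output can itself contain "-", so the wrapper should split on the first hyphen only and the prefix must not contain one. The function names and the `acme` prefix are made up; the three-segment check assumes a signed (JWS compact) token and does not verify anything cryptographically.

```python
def wrap(prefix: str, jwt: str) -> str:
    """Produce <prefix>-<jwt>."""
    if "-" in prefix:
        raise ValueError("prefix must not contain '-'")
    return f"{prefix}-{jwt}"


def unwrap(token: str, expected_prefix: str) -> str:
    """Strip the prefix and return the JWT for a normal JWT library to parse."""
    # partition splits on the FIRST hyphen, leaving any hyphens in the JWT alone.
    prefix, sep, jwt = token.partition("-")
    if not sep or prefix != expected_prefix:
        raise ValueError("missing or unexpected prefix")
    if jwt.count(".") != 2:
        raise ValueError("remainder does not look like header.payload.signature")
    return jwt
```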
| parhamn wrote:
| It's pretty greppable because {" (the JSON opener) always encodes
| to "ey". So a base64 that starts with "ey" and has 3 dot
| separated sections is a good start for a regex. I'm sure you
| can go further by looking at the spec.
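The "ey" trick can be turned into a starting-point regex like the one below. The two bytes `{"` fix the first two base64 characters, because 12 of their 16 bits fall into the first two 6-bit groups. The pattern is deliberately loose and is an assumption of this sketch rather than anything taken from the JWT spec: it will match non-JWT strings that happen to have the same shape, and the `\b` means it will not fire on a JWT glued directly to an underscore prefix.

```python
import re
from typing import List

# "ey" + base64url chars, then two more dot-separated base64url segments.
# The last segment may be empty (unsigned tokens end with a trailing dot).
JWT_LIKE_RE = re.compile(r"\bey[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*")


def find_jwt_like(text: str) -> List[str]:
    """Return substrings shaped like a compact JWT; candidates only, not verified."""
    return JWT_LIKE_RE.findall(text)
```

A scanner would then try to base64-decode the first segment of each candidate and parse it as JSON to cut down false positives, which is the extra parsing cost mentioned upthread.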
| vsareto wrote:
| >One other neat thing about _ is it will reliably select the
| whole token when you double click on it
|
| Shout out and kudos to whoever brought that up
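As a loose analogy for why the underscore helps: many "word" definitions treat `_` as a word character and `-` as a separator, and regex `\w` is one of them. This is only an illustration using Python's `re` module; it is not how any particular browser or editor implements double-click selection, and the token strings are invented.

```python
import re


def words(text: str) -> list:
    """Split text into runs of regex word characters (letters, digits, underscore)."""
    return re.findall(r"\w+", text)


# With an underscore the whole token is a single "word".
print(words("xx_abc123"))
# With a hyphen it breaks into two.
print(words("xx-abc123"))
```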
| chatmasta wrote:
| Frankly the fact that this doesn't happen with `-` in a
| `<code>` block should be considered a browser bug.
| williamdclt wrote:
| Well it doesn't happen in IDEs either (by default, at least)
| chatmasta wrote:
| I can see the argument for multi-line pre-formatted code
| blocks, but for inline `<code>` it would be nice if double
| clicking anywhere selected the whole thing.
| mbauman wrote:
| Is it `a-long-identifier` or is it `x-y`?
| williamdclt wrote:
| I'd rather have consistent behaviour TBH. I'm not too happy
| that `-` is a non-word character but I'd rather it always
| behaves the same everywhere without having to think about
| context
___________________________________________________________________
(page generated 2021-04-05 23:00 UTC)