[HN Gopher] Okta Outage
___________________________________________________________________
Okta Outage
Author : hunter2_
Score : 96 points
Date : 2021-12-15 15:57 UTC (7 hours ago)
(HTM) web link (status.okta.com)
(TXT) w3m dump (status.okta.com)
| nickdothutton wrote:
| Auth services need to be engineered for at least five nines if
| not six. System design fail.
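|
| For scale, a rough sketch of what those targets allow per year.
| The function name is made up for illustration, and the year is
| assumed to be 365 days:

```python
# Downtime budget implied by an availability target ("N nines").
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years ignored


def downtime_budget_minutes(nines: int) -> float:
    """Allowed downtime per year, in minutes, for an N-nines target."""
    unavailability = 10 ** -nines  # five nines -> 1e-5
    return MINUTES_PER_YEAR * unavailability


for n in (3, 4, 5, 6):
    print(f"{n} nines: {downtime_budget_minutes(n):8.2f} min/year")
```

| Under these assumptions five nines is a bit over five minutes a
| year, so a single 45-minute outage would use up roughly eight and
| a half years of that budget.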
| justapassenger wrote:
| You can engineer for any number of nines and still have massive
| outages.
| thrashh wrote:
| With that logic, you can do anything and it's A-OK if nothing
| you do succeeds.
| salawat wrote:
| Can confirm. This is how the world works smh.
| neurotixz wrote:
| OKTA SLAs and support terms specifically exclude AWS outages,
| so why would they?
| netghost wrote:
| Ahh, but how many of the nines need to go on which side of the
| decimal point?
| teddyh wrote:
| https://news.ycombinator.com/item?id=29567170
| whalesalad wrote:
| TIL okta is in us-west-2
| gabrielsroka wrote:
| With redundancy in other regions
| https://www.okta.com/video/oktane19-roadmap-why-building-cus...
| marcinzm wrote:
| Given they were down I would say there was, in practice, no
| redundancy. Simply claiming something doesn't make it true.
| gabrielsroka wrote:
| I don't know the details but they don't fail over
| automatically. There has to be a reason to push the button.
| Perhaps the reason did not exist this time. But I know for
| sure that there is redundancy.
| stevehawk wrote:
| So Okta's redundancy is about as reliable as AWS's status
| page?
| marcinzm wrote:
| Redundancy is only relevant if it helps you in an outage.
| Otherwise it's just a pointless marketing term no matter
| how much effort you put into it.
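|
| A minimal sketch of the difference, assuming a health probe and a
| threshold of three consecutive failures. The class, the threshold
| and the standby region are hypothetical, not how Okta does it:

```python
from dataclasses import dataclass


@dataclass
class FailoverController:
    """Hypothetical controller; names and thresholds are illustrative."""

    primary: str
    standby: str
    failure_threshold: int = 3  # consecutive failed probes before failover
    consecutive_failures: int = 0
    active: str = ""

    def __post_init__(self) -> None:
        # Start out serving from the primary region.
        self.active = self.primary

    def record_health_check(self, primary_healthy: bool) -> str:
        """Feed in one probe result; return the region that should serve."""
        if primary_healthy:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        if (
            self.active == self.primary
            and self.consecutive_failures >= self.failure_threshold
        ):
            # Automatic failover: nobody has to push a button.
            self.active = self.standby
        return self.active


# us-east-2 is an assumed standby region, chosen only for the example.
controller = FailoverController(primary="us-west-2", standby="us-east-2")
for healthy in (True, False, False, False):
    serving = controller.record_health_check(healthy)
print(serving)
```

| With a manual process the last branch is a human decision
| instead, which is where "a reason to push the button" comes in.
| Failing back to the primary is left out of the sketch.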
| pdx6 wrote:
| Okta uses nearly all the us regions, with older (and larger)
| customers in us-east-1.
|
| I used to work there and know the internals well. These aws
| outages must be causing massive chaos there.
| dingosity wrote:
| Oh. It wasn't just me.
| anonymousiam wrote:
| Heh. Our company just switched to their TFA from MS as of this
| morning. Poor timing.
| polskibus wrote:
| What was the business rationale for such a switch? I usually see
| people migrate toward MS auth not away from it.
| oneplane wrote:
| We're moving everything that is still tied to it away from
| MS. The last products we have to solve are Dynamics and Excel
| but that scope is so small compared to everything else that
| it might not matter to leave those as-is for now if we can
| get at least Dynamics as SaaS and ditch AD (which only
| remains for Dynamics).
|
| MS doesn't do the things we need in a better way than other
| options, and it's almost always more expensive at product
| level and TCO level.
| jiggawatts wrote:
| I've found the opposite -- as long as you're _happy_ with
| staying within the MS ecosystems. So Azure, Office 365,
| Teams, etc...
|
| You hear people calling Microsoft expensive when they're on
| some random mix of Gmail, Notes, CM9, or whatever.
|
| Then MS seems expensive because it's all or nothing.
| Dipping your toe in the water turns into a dive to the
| bottom of the pool.
| oneplane wrote:
| To be honest not a lot of our users and administrators
| have actually been happy in the MS ecosystem. There are a
| few outliers, some licensing middlemen, a few MSPs and a
| couple of hardcore Excel number crunchers that use the
| Axapta or Dynamics connectors. But you can find those
| anywhere like with SAS and SPSS. A lot of users don't
| really care at all so that just makes it a cost and 'does
| it do the bare minimum'-deal for them.
|
| A few people who really invest in and enjoy a specific
| application do not make it great, especially when it
| turns out they are just doing more than they should be
| doing; i.e. when you have an InDesign professional that
| would be typesetting materials for publication but the
| person that writes the copy is also trying to 'typeset'
| the source in Word. It's great if you then feel like Word
| gets you cool typeset documents as a power user, but if
| 9999 people in a 10k company don't do that and just let
| the publication team do that properly in InDesign
| according to the media standards, it's no reason to keep
| it as a default available application.
|
| A lot of the usage comes from "well, it was already there
| so I went and did it in that". Not because it was
| actually the standard, best choice or in scope of the
| task that was supposed to be done.
|
| Same goes for things like notes and documentation:
|
| - Code-level docs go in the repo (MD, RST mostly)
| - Org-level docs go in the wiki (Confluence)
| - Publications are delivered as copy to the publication team
|   which then uses the DTP/typesetting thing of choice
|
| Yet someone who would ignore that creates extra work by
| doing it in a different application first, then copying
| it around and converting it. That means that the
| person/process needs to be fixed, and doesn't mean we
| need Word as an expensive WordPad/Pages replacement.
|
| Now, this might not apply to things like mini-orgs inside
| a bigger org, or very small companies and individuals.
| But I wasn't writing about those anyway ;-) At that level
| you don't really have the size and scope to make good
| choices anyway, and you're best off just sticking with
| one big vendor, not because they are the best, but
| because you won't be handling multi-vendor management
| anyway.
| anonymousiam wrote:
| According to our CTO, it was related to security. With
| Microsoft, the TFA options did not include a hardware token.
| So now we can authenticate with a phone call, a text message,
| or a security token. (
| https://www.okta.com/identity-101/security-token/ )
|
| The main advantage is that the hardware token can be used in
| areas where mobile phones are prohibited, and of course
| immunity from a SIM swap attack.
| vladvasiliu wrote:
| They do support hardware tokens. I use a Yubikey with it.
| However, support is spotty outside of Windows.
| anonymousiam wrote:
| There may have been other reasons not mentioned, such as
| Microsoft's tiered services model. It sometimes seems as
| if they deliberately provide poor solutions at the lower
| tiers. The USG is pretty pissed off about that.
|
| Also, Yubikey would not work, because like mobile
| phones, USB devices are restricted in some areas.
| vladvasiliu wrote:
| > Also, Yubikey would not work, because like mobile
| phones, USB devices are restricted in some areas.
|
| What kind of token would work, then? Something that only
| generates a TOTP, like those fobs some banks used to give
| out?
| anonymousiam wrote:
| Yes. Same method as Google/Microsoft Authenticator, but
| implemented in a separate hardware device (fob).
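|
| For reference, a sketch of the code generation such a fob would
| do, assuming standard TOTP (RFC 6238 on top of RFC 4226) with
| HMAC-SHA1 and a 30-second step. The secret below is the RFC
| test-vector value, not a real key:

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP: HMAC-SHA1 over the counter, dynamically truncated."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # low nibble of the last byte picks the window
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(
    secret: bytes,
    now: Optional[float] = None,
    step: int = 30,
    digits: int = 6,
) -> str:
    """RFC 6238 TOTP: HOTP with the counter derived from Unix time."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step), digits)


demo_secret = b"12345678901234567890"  # RFC test-vector secret
print(totp(demo_secret))  # changes every 30 seconds
```

| The fob and the server only need the shared secret and roughly
| synchronized clocks, so nothing has to be plugged in or carry a
| radio.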
| Closi wrote:
| Ah, Microsoft has this in public preview at the moment so
| full support looks imminent.
| bsder wrote:
| Link please so I can track?
| amw-zero wrote:
| I think that depends on many factors, like where you work,
| what class of companies you work for, etc. For example, I
| have not even seen a Windows machine in 10 years at work. I
| can't think of anyone in my professional circle either who
| would suggest using any MS product.
| jiggawatts wrote:
| I saw a Linux machine recently... I think.
| nefitty wrote:
| I had to use Okta at a company once. I asked myself the very
| same question you posed, every single day.
| vosper wrote:
| Is MS auth better? Does that mean Active Directory?
|
| We're in the middle of a migration from in-house auth
| (which we need to get rid of) to Okta and I think the
| people involved are finding Okta pretty confusing. But it's
| a big product and auth stuff is complicated, so I'm not
| sure how much it's Okta's fault.
| vladvasiliu wrote:
| In the context of Okta it's probably AzureAD. But yes,
| it's related to Active Directory, you can easily sync the
| two. It's probably why many companies use it: it's easy
| to add on to your existing Windows infrastructure.
|
| My client uses it, it works mostly well. It does have its
| annoying limitations, though, such as no group
| inheritance and limited support for hardware tokens
| outside of Windows (no support on Safari/iOS,
| Safari/macOS, Firefox/Linux).
| count wrote:
| What do you mean hardware tokens outside of windows? For
| primary workstation authentication? Or AzureAD auth? By
| hardware tokens do you mean RSA or TOTP hardware fobs?
| marcinzm wrote:
| I've used Okta at two companies now and I found it fairly
| pleasant. Most issues were around people getting locked out
| in my experience.
| mooreds wrote:
| I think this is why having the ability to self-host is
| worthwhile. This option gives you flexibility to bring this stuff
| in house if you want to build a team to operate it (or put it on
| a current team's todo list).
|
| Should you? I don't know your situation and whether you can build
| an Okta-caliber level team internally. (My guess is that many
| smaller or non-tech focused orgs would have a hard time with
| that, but that's just a guess.) It's a hard question worth
| asking.
|
| It's easy to think "we could have done better" when things are on
| fire, as opposed to all the times when the status chart is all
| green and you don't have to think about Okta (feel free to
| s/Okta/other service provider/) at all.
|
| Disclosure: I work for FusionAuth, an auth provider that has both
| SaaS and self-hosted installation options.
| zitterbewegung wrote:
| When I worked for an event services company and a fairly large
| SR22 / budget insurance company, the better question was not
| cloud or no cloud but really hybrid cloud systems. The first one
| had a rather long downtime event, after which they invested in a
| remote failsafe. The insurance company was largely resilient:
| when power went out the servers stayed up, but the people doing
| the work weren't able to get there. They bought a generator but
| weren't able to turn it on.
|
| Blaming Okta or any other group isn't the issue. Your customers
| don't care how you are down; they only care if you are down.
|
| Also, I got a bill from Amazon when I forgot to shut down a
| SageMaker instance and that cost me $700. I self host now,
| buying a business internet package with a static IP. I also
| upgraded the machine but it wasn't necessary and in hindsight I
| shouldn't have done the upgrade but just fixed the case.
| tyingq wrote:
| I know there are complexities involved, but auth is one of those
| things that needs to be very insulated from single region issues.
| How long was Okta not working for customers?
| tw04 wrote:
| Someone should tell Amazon... IAM is single homed to us-east-1
| AFAIK.
| dastbe wrote:
| (I used to work at aws)
|
| The control API (i.e. adding/removing roles, modifying
| policies, etc.) is available out of us-east-1. However, the
| bits of IAM that relate to distributing credentials to
| instances/tasks/lambdas and STS are all regionalized and
| isolated.
| dboreham wrote:
| So...parent is correct.
| dastbe wrote:
| no, because most applications don't have an online
| dependency for creating roles and modifying policies.
| what they do typically have an online dependency for is
| provisioning credentials from those roles, which is
| architected to be regionally independent.
| sharpy wrote:
| Disclaimer: Former AWS engineer, never worked on IAM
| directly.
|
| AWS is divided into multiple partitions. For the vast
| majority of users, there is one partition - the regular
| commercial - other partitions being China, GovCloud, etc.
|
| Within each partition, there is a primary region that
| needs to be available for creation/mutation of
| credentials and policies. However, that data is
| replicated to other regions within the partition. That
| means the use of credentials that exist does NOT depend
| on the primary region being available. The replication is
| something that is closely monitored, and SLA breaches will
| result in pages.
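|
| A toy model of that split, to make the failure mode concrete.
| This is not AWS's implementation: the class, method and role
| names are hypothetical and replication is treated as instant:

```python
from typing import Dict, List, Set


class ReplicatedIdentityStore:
    """Toy model: writes need the primary region, reads are regional."""

    def __init__(self, primary: str, regions: List[str]) -> None:
        self.primary = primary
        self.replicas: Dict[str, Dict[str, str]] = {r: {} for r in regions}
        self.down: Set[str] = set()

    def put_policy(self, role: str, policy: str) -> None:
        """Control plane: requires the primary region to be available."""
        if self.primary in self.down:
            raise RuntimeError(f"control plane unavailable: {self.primary}")
        for replica in self.replicas.values():
            replica[role] = policy  # replication modeled as instantaneous

    def get_policy(self, region: str, role: str) -> str:
        """Data plane: served from the caller's own regional replica."""
        if region in self.down:
            raise RuntimeError(f"{region} is down")
        return self.replicas[region][role]


store = ReplicatedIdentityStore("us-east-1", ["us-east-1", "us-west-2"])
store.put_policy("app-role", "allow-read")
store.down.add("us-east-1")  # simulate losing the primary region

print(store.get_policy("us-west-2", "app-role"))  # existing data still served
try:
    store.put_policy("new-role", "allow-write")
except RuntimeError as exc:
    print(exc)  # mutations fail until the primary is back
```

| Existing credentials keep working in the other regions; only
| creating or changing them is blocked while the primary is out.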
| t0mas88 wrote:
| Not really. The parts that can take your app down are
| distributed.
| [deleted]
| kache_ wrote:
| If you only knew how bad modern software infrastructure really
| was
| tyingq wrote:
| Yeah, for a similar vendor, it's interesting to read this
| page and look at the diagram:
|
| https://auth0.com/availability-trust
|
| And then read this tweet:
|
| https://twitter.com/auth0/status/1471159935597793290
|
| Edit: Ah, seems they picked us-west-1 and us-west-2 as the
| two regions... _"In this case, we use two AWS regions:
| us-west-2 (our primary) and us-west-1 (our failover)."_ [1]
| So bit by a double-region failure.
|
| [1] https://auth0.com/blog/auth0-architecture-running-in-
| multipl...
| crescentfresh wrote:
| Now that Okta bought Auth0 what's the developer experience
| like I wonder? I imagine the infrastructure of the two
| products is still completely isolated. But is it still a
| separate product you can use for identity management or are
| new customers forced to use Okta?
| bouzouk wrote:
| Auth0 customer: they kept everything isolated (and
| promised it will stay like that for a while)
| drooby wrote:
| Looks like about 45 mins
| activitypea wrote:
| Seconded
| abruzzi wrote:
| Cisco Duo SSO/MFA was also out this morning during the AWS
| outage. I guess usable redundancy for these services is a
| difficult problem.
| sbilstein wrote:
| rolling your own auth is underrated.
___________________________________________________________________
(page generated 2021-12-15 23:01 UTC)