[HN Gopher] CrowdStrike ex-employees: 'Quality control was not p...
       ___________________________________________________________________
        
       CrowdStrike ex-employees: 'Quality control was not part of our
       process'
        
       Author : everybodyknows
       Score  : 528 points
       Date   : 2024-09-13 20:17 UTC (1 day ago)
        
 (HTM) web link (www.semafor.com)
 (TXT) w3m dump (www.semafor.com)
        
       | Alupis wrote:
       | > "Speed was the most important thing," said Jeff Gardner, a
       | senior user experience designer at CrowdStrike who said he was
       | laid off in January 2023 after two years at the company. "Quality
       | control was not really part of our process or our conversation."
       | 
       | This type of article - built upon disgruntled former employees -
       | is worth about as much as the apology GrubHub gift card.
       | 
       | Look, I think just as poorly about CrowdStrike as anyone else out
       | there... but you can find someone to say anything, especially
       | when they have an axe to grind and a chance at some spotlight.
       | Not to mention this guy was a designer and wouldn't be involved
       | in QC anyway.
       | 
       | > Of the 24 former employees who spoke to Semafor, 10 said they
       | were laid off or fired and 14 said they left on their own. One
       | was at the company as recently as this summer. Three former
       | employees disagreed with the accounts of the others. Joey
       | Victorino, who spent a year at the company before leaving in
       | 2023, said CrowdStrike was "meticulous about everything it was
       | doing."
       | 
       | So basically we have nothing.
        
         | nyc_data_geek1 wrote:
         | >>So basically we have nothing.
         | 
         | Except the biggest IT outage ever. And a postmortem showing
         | their validation checks were insufficient. And a rollout
         | process that did not stage at all, just rawdogged straight to
         | global prod. And no lab where the new code was actually
         | installed and run prior to global rawdogging.
         | 
          | I'd say there's smoke, and numerous accounts of fire, and
          | this article can be read in that context.
        
           | mewpmewp2 wrote:
           | There definitely was a huge outage, but based on the given
           | information we still can't know for sure how much they
           | invested in testing and quality control.
           | 
           | There's always a chance of failure even for the most
           | meticulous companies.
           | 
           | Now I'm not defending or excusing the company, but a singular
           | event like this can happen to anyone and nothing is 100%.
           | 
            | If a thorough investigation revealed underinvestment in
            | quality control compared to what would be appropriate for
            | a company like this, then we could say so for sure.
        
             | daedrdev wrote:
             | Two things are clear though
             | 
             | Nobody ran this update
             | 
             | The update was pushed globally to all computers
             | 
             | With that alone we know they have failed the simplest of
             | quality control methods for a piece of software as
             | widespread as theirs. This is even excluding that there
             | should have been some kind of error handling to allow the
             | computer to boot if they did push bad code.
        
               | busterarm wrote:
               | Also it's the _second_ time that they had done this in a
               | few short months.
               | 
                | They had previously bricked Linux hosts with a
                | similar type of update.
               | 
               | So we also know that they don't learn from their
               | mistakes.
        
               | rblatz wrote:
               | The blame for the Linux situation isn't as clear cut as
               | you make it out to be. Red hat rolled out a breaking
               | change to BPF which was likely a regression. That wasn't
               | caused directly by a crowdstrike update.
        
               | IcyWindows wrote:
               | At least one of the incidents involved Debian machines,
               | so I don't understand how Red Hat's change would be
               | related.
        
               | rblatz wrote:
               | Sorry, that's correct it was Debian, but Debian did apply
               | a RHEL specific patch to their kernel. That's the
               | relationship to red hat.
        
               | busterarm wrote:
               | It's not about the blame, it's about how you respond to
               | incidents and what mitigation steps you take. Even if
               | they aren't directly responsible, they clearly didn't
               | take proper mitigation steps when they encountered the
               | problem.
        
               | roblabla wrote:
               | How do you mitigate the OS breaking an API below you in
               | an update? Test the updates before they come out? Even if
               | you could, you'd still need to deploy a fix before the OS
               | update hits the customers, and anyone that didn't update
               | would still be affected.
               | 
               | The linux case is just _very_ different from the windows
               | case. The mitigation steps that could have been taken to
               | avoid the linux problem would not have helped for the
               | windows outage anyways, the problems are just too
               | different. The linux update was about an OS update
               | breaking their program, while the windows issue was about
               | a configuration change they made triggering crashes in
               | their driver.
        
               | busterarm wrote:
               | You're missing the forest for the trees.
               | 
               | It's: a) an update, b) pushed out globally without proper
               | testing, c) that bricked the OS.
               | 
               | It's an obvious failure mode that if you have a proper
               | incident response process would be revealed from that
               | specific incident and flagged for needing mitigation.
               | 
               | I do this specific thing for a living. You don't just
               | address the exact failure that happened but try to
               | identify classes of risk in your platform.
               | 
               | > Even if you could, you'd still need to deploy a fix
               | before the OS update hits the customers, and anyone that
               | didn't update would still be affected.
               | 
               | And yet the problem would still only affect Crowdstrike's
               | paying customers. No matter how much you blame upstream
               | your paying customers are only ever going to blame their
               | vendor because the vendor had discretion to test and not
               | release the update. As their customers should.
        
               | ScottBurson wrote:
               | > there should have been some kind of error handling
               | 
               | This is the point I would emphasize. A kernel module that
               | parses configuration files must defend itself against a
               | failed parse.
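ScottBurson's point can be sketched in a few lines: a parser that rejects malformed input and lets the caller fall back, instead of crashing. The names (`parse_channel_file`, `load_config`) and the comma-separated format are hypothetical stand-ins, not CrowdStrike's actual (undisclosed) format:

```python
def parse_channel_file(raw: bytes, max_params: int = 20):
    """Parse a hypothetical config blob defensively.

    Returns a list of parameters, or None if the blob is malformed,
    so the caller can fall back instead of crashing.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return None  # corrupt bytes: reject, don't crash
    params = text.split(",")
    if len(params) > max_params:
        return None  # wrong arity: reject, don't crash
    return params

def load_config(raw: bytes, current):
    """Apply a new config only if it parses; otherwise keep the old one."""
    parsed = parse_channel_file(raw)
    return parsed if parsed is not None else current
```

In a real kernel module the same structure applies: the failure path of the parser must leave the driver in a state where the machine can still boot.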
        
               | hn_throwaway_99 wrote:
               | While I agree with this, from a software engineering
               | perspective I think it's more useful to look at the
               | lessons learned. I think it's too easy to just throw
               | "Crowdstrike is a bunch of idiots" against the wall, and
               | I don't think that's true.
               | 
                | It's clear to me that CrowdStrike saw this as a
                | _data_ update vs. a _code_ update, and that they had
                | much more stringent QA procedures for code updates
                | than they did for data updates. It's very easy for
                | organizations to lull themselves into this false
                | sense of security when they make these kinds of
                | delineations (sometimes even subconsciously at
                | first), and then over time they lose sight of the
                | fact that a bad data update can be just as
                | catastrophic as a bad code update. I've seen shades
                | of this issue elsewhere many times.
               | 
               | So all that said, I think your point is valid. I know
               | Crowdstrike had the posture that they wanted to get
               | vulnerability files deployed globally as fast as possible
               | upon a new threat detection in order to protect their
               | clients, but it wouldn't have been that hard to build in
               | some simple checks in their build process (first deploy
               | to a test bed, then deploy globally) even if they felt a
               | slower staged rollout would have left too many of their
               | clients unprotected for too long.
               | 
               | Hindsight is always 20/20, but I think the most important
               | lesson is that this code vs data dichotomy can be
               | dangerous if the implications are not fully understood.
        
               | llm_trw wrote:
               | I'm sorry but there comes a point where you have to call
               | a spade a spade.
               | 
               | When you have the trifecta of regex, *argv packing and
               | uninitialized memory you're reaching levels of
               | incompetence which require being actively malicious and
               | not just stupid.
        
               | abraae wrote:
                | > It's clear to me that CrowdStrike saw this as a
                | data update vs. a code update, and that they had much
                | more stringent QA procedures for code updates than
                | they did for data updates.
               | 
               | It cannot have been a surprise to Crowdstrike that
               | pushing bad data had the potential to bork the target
               | computer. So if they had such an attitude that would
               | indicate striking incompetence. So perhaps you are right.
        
               | Comma2976 wrote:
               | Crowdstrike is a bunch of idiots
        
               | mavhc wrote:
               | If they weren't idiots they wouldn't be parsing data in
               | the kernel level module
        
               | GuB-42 wrote:
                | It could have been OK to expedite data updates, had
                | the code treated configuration data as untrusted
                | input, as if it could have been written by an
                | attacker. That means fuzz testing and all that.
               | 
               | Obviously the system wasn't very robust, as a simple,
               | within specs change could break it. A company like
               | CrowdStrike, which routinely deals with memory exploits
               | and claims to do "zero trust" should know better.
               | 
               | As often, there is a good chance it is an organization
               | problem. The team in charge of the parsing expected that
               | the team in charge of the data did their tests and made
               | sure the files weren't broken, while on the other side,
               | they expected the parser to be robust and at worst, a
               | quick rollback could fix the problem. This may indeed be
               | the sign of a broken company culture, which would give
               | some credit to the ex-employees.
        
               | Izkata wrote:
               | > Obviously the system wasn't very robust, as a simple,
               | within specs change could break it.
               | 
               | From my limited understanding, the file was corrupted in
               | some way. Lots of NULL bytes, something like that.
        
               | GuB-42 wrote:
               | From the report, it seems the problem is that they added
               | a feature that could use 21 arguments, but there was only
               | enough space for 20. Until now, no configuration used all
               | 21 (the last one was a wildcard regex, which apparently
               | didn't count), but when they finally did, it caused a
               | buffer overflow and crashed.
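The failure mode GuB-42 describes — an instance supplying more fields than the reader has room for — is the classic fixed-capacity mismatch. A minimal sketch with hypothetical names; in C the unchecked read would be undefined behavior, whereas here the bounds check turns it into a clean rejection:

```python
FIELD_CAPACITY = 20  # slots reserved for argument values

def read_fields(instance_args, capacity=FIELD_CAPACITY):
    """Copy instance arguments into a fixed-size table.

    Without the explicit check, a 21-argument instance would index
    past the end of the table (out-of-bounds access in C); with it,
    the oversized instance is rejected before any memory is touched.
    """
    if len(instance_args) > capacity:
        raise ValueError(
            f"instance supplies {len(instance_args)} fields, "
            f"but only {capacity} slots exist"
        )
    table = [None] * capacity
    for i, arg in enumerate(instance_args):
        table[i] = arg
    return table
```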
        
               | acdha wrote:
               | That rumor floated around Twitter but the company quickly
               | disavowed it. The problem was that they added an extra
               | parameter to a common function but never tested it with a
               | non-wildcard value, revealing a gap in their code
               | coverage review:
               | 
               | https://www.crowdstrike.com/wp-
               | content/uploads/2024/08/Chann...
        
               | RaftPeople wrote:
                | > _It's clear to me that CrowdStrike saw this as a
                | data update vs. a code update_
                | 
                | > _Hindsight is always 20/20, but I think the most
                | important lesson is that this code vs data dichotomy
                | can be dangerous if the implications are not fully
                | understood._
               | 
               | But it's not some new condition that the industry hasn't
               | already been dealing with for many many decades (i.e.
               | code vs config vs data vs any other type of change to
               | system, etc.).
               | 
               | There are known strategies to reduce the risk.
        
             | idkwhatimdoin wrote:
             | > If thorough investigation revealed poor quality control
             | investment compared to what would be appropriate for a
             | company like this, then we can say for sure.
             | 
             | We don't really need that thorough of an investigation.
             | They had no staged deploys when servicing millions of
             | machines. That alone is enough to say they're not running
             | the company correctly.
        
               | dartos wrote:
               | Totally agree.
               | 
               | I'd consider staggering a rollout to be the absolute
               | basics of due diligence.
               | 
               | Especially when you're building a critical part of
               | millions of customer machines.
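The staged-rollout basics dartos mentions amount to very little code. A sketch with hypothetical `deploy_to` and `healthy` hooks standing in for real deployment infrastructure:

```python
def staged_rollout(hosts, deploy_to, healthy, stages=(0.01, 0.1, 0.5, 1.0)):
    """Deploy to progressively larger fractions of the fleet.

    Halts on the first stage where any freshly-updated host is
    unhealthy, returning how far the rollout got, so a bad update
    hits 1% of the fleet instead of all of it.
    """
    done = 0
    for frac in stages:
        target = max(1, int(len(hosts) * frac))
        batch = hosts[done:target]
        for h in batch:
            deploy_to(h)
        done = target
        if not all(healthy(h) for h in batch):
            return ("halted", done)  # stop before the blast radius grows
    return ("complete", done)
```

Even the fastest stage schedule (minutes per stage) would have confined the July 19 crash to the first cohort.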
        
               | mewpmewp2 wrote:
               | I would say that canary release is an absolute must 100%.
               | Except I can think of cases where it might still not be
               | enough. So, I just don't feel comfortable judging them
               | out of the box. Does all the evidence seem to point
               | against them? For sure. But I just don't feel comfortable
               | giving that final verdict without knowing for sure.
               | 
               | Specifically because this is about fighting against
               | malicious actors, where time can be of essence to deploy
               | some sort of protection against a novel threat.
               | 
                | If there are deadlines you can slip past with
                | nothing bad happening, then for sure: always have
                | canary releases, perfect QA, and thorough monitoring
                | of everything. But I'm just saying there can be
                | cases where the damage done by not acting fast
                | enough is just so much worse.
               | 
               | And I don't know that it wasn't the case for them. I just
               | don't know.
        
               | dartos wrote:
               | In this case, they pretty much caused a worst case
               | scenario...
        
               | acdha wrote:
               | > Specifically because this is about fighting against
               | malicious actors, where time can be of essence to deploy
               | some sort of protection against a novel threat.
               | 
               | This is severely overstating the problem: an extra few
               | minutes is not going to be the difference between their
               | customers being compromised. Most of the devices they run
               | on are never compromised, because anyone remotely serious
               | has defense in depth.
               | 
               | If it was true, or even close to true, that would make
               | the criticism more rather than less strong. If time is of
               | the essence, you invest in things like reviewing test
               | coverage (their most glaring lapse), fuzz testing, and
               | common reliability engineering techniques like having the
               | system roll back to the last known good configuration
                | after it's failed to load. We think of progressive
                | rollouts as common now, but they became mainstream in
                | large part because the Google Chrome team realized
                | rapid updates are important and then asked what they
                | needed to do to make them safe. CrowdStrike's report
               | suggests that they wanted rapid but weren't willing to
               | invest in the implementation because that isn't a
               | customer-visible feature - until it very painfully became
               | one.
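The last-known-good fallback acdha mentions is a small amount of state. A sketch, assuming a `parse` callable that returns None on malformed input:

```python
class ConfigStore:
    """Keep the last config that loaded successfully and fall back
    to it when a new one fails to parse, so a bad update degrades
    protection slightly instead of taking the host down."""

    def __init__(self, parse, initial):
        self.parse = parse
        self.last_good = initial

    def update(self, raw):
        cfg = self.parse(raw)
        if cfg is None:
            return self.last_good  # bad update: keep running on the old config
        self.last_good = cfg
        return cfg
```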
        
               | wlonkly wrote:
               | I also fall on the side of "stagger the rollout" (or
               | "give customers tools to stagger the rollout"), but at
               | the same time I recognize that a lot of customers would
               | not accept delays on the latest malware data.
               | 
               |  _Before_ the incident, if you asked a customer if they
               | would like to get updates faster even if it means that
               | there is a remote chance of a problem with them... I bet
                | they'd still want to get updates faster.
        
               | canucker2016 wrote:
               | They literally half-assed their deployment process - one
               | part enterprisey, one part "move fast and break things".
               | 
               | Guess which part took down much of the corporate world?
               | 
               | from Preliminary Post Incident Review at
               | https://www.crowdstrike.com/falcon-content-update-
               | remediatio... :
               | 
               | "CrowdStrike delivers security content configuration
               | updates to our sensors in two ways: Sensor Content that
               | is shipped with our sensor directly, and Rapid Response
               | Content that is designed to respond to the changing
               | threat landscape at operational speed.
               | 
               | ...
               | 
               | The sensor release process begins with automated testing,
               | both prior to and after merging into our code base. This
               | includes unit testing, integration testing, performance
               | testing and stress testing. This culminates in a staged
               | sensor rollout process that starts with dogfooding
               | internally at CrowdStrike, followed by early adopters. It
               | is then made generally available to customers. Customers
               | then have the option of selecting which parts of their
               | fleet should install the latest sensor release ('N'), or
               | one version older ('N-1') or two versions older ('N-2')
               | through Sensor Update Policies.
               | 
               | The event of Friday, July 19, 2024 was not triggered by
               | Sensor Content, which is only delivered with the release
               | of an updated Falcon sensor. Customers have complete
               | control over the deployment of the sensor -- which
               | includes Sensor Content and Template Types.
               | 
               | ...
               | 
               | Rapid Response Content is used to perform a variety of
               | behavioral pattern-matching operations on the sensor
               | using a highly optimized engine.
               | 
               | Newly released Template Types are stress tested across
               | many aspects, such as resource utilization, system
               | performance impact and event volume. For each Template
               | Type, a specific Template Instance is used to stress test
               | the Template Type by matching against any possible value
               | of the associated data fields to identify adverse system
               | interactions.
               | 
               | Template Instances are created and configured through the
               | use of the Content Configuration System, which includes
               | the Content Validator that performs validation checks on
               | the content before it is published.
               | 
               | On July 19, 2024, two additional IPC Template Instances
               | were deployed. Due to a bug in the Content Validator, one
               | of the two Template Instances passed validation despite
               | containing problematic content data.
               | 
               | Based on the testing performed before the initial
               | deployment of the Template Type (on March 05, 2024),
               | trust in the checks performed in the Content Validator,
               | and previous successful IPC Template Instance
               | deployments, these instances were deployed into
               | production."
        
               | hello_moto wrote:
               | > one part enterprisey, one part "move fast and break
               | things".
               | 
                | When there's a 0day, how enterprisey would you like
                | to be about catching it?
        
               | tsimionescu wrote:
               | Not sure, but definitely more enterprisey than "release a
               | patch to the entire world at once before running it on a
               | _single machine_ in-house ".
        
               | mewpmewp2 wrote:
               | So it would be preferable to have your data encrypted,
               | taken hostage unless you pay, and be down for days,
                | instead of just 6 hours of downtime?
        
               | xeromal wrote:
               | That's a false dichotomy
        
               | tsimionescu wrote:
               | Do you seriously believe that _all_ CrowdStrike on
               | Windows customers were at such imminent risk of
               | ransomware that one-two hours to run this on one internal
               | setup and catch the critical error they released would
               | have been dangerous?
               | 
               | This is a ludicrous position, and has been proven
               | obviously false by the proceedings: all systems that were
               | crashed by this critical failure were not, in fact,
               | attacked with ransomware once the CS agent was un-
               | installed (at great pain).
        
               | Aeolun wrote:
               | Nonsense. You don't need any staged deploys if you simply
               | make no mistakes.
               | 
               | /s
        
           | quietbritishjim wrote:
           | The sentence you quoted clearly meant, from the context,
           | "clearly we have nothing [to learn from the opinions of these
           | former employees]". Nothing in your comment is really
           | anything to do with that.
        
             | tomrod wrote:
             | Triangulation versus new signal.
        
           | sundvor wrote:
           | "Everyone" piles on Tesla all the time; a worthwhile
           | comparison would be how Tesla roll out vehicle updates.
           | 
           | Sometimes people are up in arms "where's my next version" (eg
            | when adaptive headlights were introduced), yet Tesla
           | prioritise a safe, slow roll out. Sometimes the updates fail
           | (and get resolved individually), but never on a global scale.
           | (None experienced myself, as a TM3 owner on the "advanced"
           | update preference).
           | 
           | I understand the premise of Crowdstrike's model is to have up
           | to date protection everywhere but clearly they didn't think
           | this through enough times, if at all.
        
             | kccqzy wrote:
             | You can also say the same thing about Google. Just go look
             | at the release notes on the App Store for the Google Home
             | app. There was a period of more than six months where every
             | single release said "over the next few weeks we're rolling
             | out the totally redesigned Google Home app: new easier to
             | navigate 5-tab layout."
             | 
             | When I read the same release notes so often I begin to
             | question whether this redesign is really taking more than
             | six months to roll out. And then I read the Sonos app
             | disaster and I thought that was the other extreme.
        
               | cesarb wrote:
               | > Just go look at the release notes on the App Store for
               | the Google Home app. [...] When I read the same release
               | notes so often I begin to question whether this redesign
               | is really taking more than six months to roll out.
               | 
                | Google is terrible at release notes. For several
                | years now, the release notes for the "Google" app on
                | the Android app store have shown the exact same four
               | unchanging entries, loosely translating from Portuguese:
               | "enhanced search page appearance", "new doodles designed
               | for app experience", "offline voice actions (play music,
               | enable Wi-Fi, enable flashlight) - available only in the
               | USA", "web pages opened directly within the app". I
               | heavily doubt it's taking these many years to roll out
               | these changes; they probably simply don't care anymore,
               | and never update these app store release notes.
        
           | hello_moto wrote:
           | > And no lab where the new code was actually installed and
           | run prior to global rawdogging.
           | 
           | I thought the new code was actually installed, the running
           | part depends on the script input...?
        
         | sonofhans wrote:
         | If design isn't involved in QC you're not doing QC very well.
         | If design isn't plugged into development process enough to
         | understand QC then you're not doing design very well.
        
           | tw04 wrote:
           | Why would a UX designer be involved in any way, shape, or
           | form in kernel level code patches? They would literally never
           | ship an update if they had that many hands in the pot for
           | something completely unrelated. Should they also have their
           | sales reps and marketing folks pre-brief before they make any
           | code changes?
        
             | sonofhans wrote:
             | A UX designer might have told them it was a bad idea to
             | deploy the patch widely without testing a smaller cohort,
             | for instance. That's an obvious measure that they skipped
             | this time.
        
               | newshackr wrote:
               | But that doesn't have anything to do with what UX
               | designers typically do
        
               | hello_moto wrote:
                | the person you're replying to will not accept any
                | sane argument once they've decided that UX must be
                | involved in kernel technical decisions...
        
               | sonofhans wrote:
               | Pfft, I never said that at all. I'm not talking about
               | technical decisions. OP was talking about QC, which is
               | verifying software for human use. If you don't have user-
               | centered people involved (UX or product or proserve) then
               | you end up with user-hostile decisions like these people
               | made.
        
               | sigseg1v wrote:
               | How would it not be related? Jamming untested code down
               | the pipe with no way for users to configure when it's
               | deployed and then rendering their machines inoperable is
               | an extremely bad user experience and I would absolutely
               | expect a UX expert to step in to try to avoid that.
        
               | diatone wrote:
               | Not true; UX designers typically are responsible for
               | advocating for a robust, intuitive experience for users.
               | The fact that kernel updates don't have a user interface
               | doesn't make them exempt from asking the simple question:
               | how will this affect users? And the subsequent question:
               | is there a chance that deploying this eviscerates the
               | user experience?
               | 
               | Granted, a company that isn't focused on the user
               | experience as much as it is on other things might not
               | prioritise this as much in the first place.
        
               | fzeroracer wrote:
               | I can't believe people on HN are posting this stuff over
               | and over again. Either you are holistically disconnected
               | from what proper software development should look like or
               | outright creating the same environments that resulted in
               | the crowdstrike issue.
               | 
               | Software security and quality is the responsibility of
               | everyone on the team. A good UX designer should be
               | thinking of ways a user can escape the typical flow or
               | operate in unintended ways and express that to testers.
               | And in decisions where management is forcing untested
               | patches everyone should chime in.
        
             | zipy124 wrote:
             | I would agree if it was a UI designer, but a good UX
             | designer designs for the users, which in this case
              | includes the system admins who will be updating kernel
              | level code patches. Ensuring they have a good experience,
              | e.g. no crashes, is their job. A recommendation would likely
             | be for example small roll-outs to minimise the number of
             | people having a bad user experience on a roll-out that goes
             | wrong.
        
         | darby_nine wrote:
         | I feel like crowdstrike is perfectly capable of mounting its
         | own defense
        
         | JumpCrisscross wrote:
         | > _This type of article - built upon disgruntled former
         | employees - is worth about as much as the apology GrubHub gift
         | card_
         | 
         | To you and me, maybe. To the insurers and airlines paying out
         | over the problem, maybe not.
        
         | bdcravens wrote:
         | I'm going with principle of least astonishment, where
         | productivity is more highly valued in most companies than
         | quality control.
        
         | insane_dreamer wrote:
         | > So basically we have nothing.
         | 
         | Except the fact that CrowdStrike fucked up the one thing they
         | weren't supposed to fuck up.
         | 
         | So yeah, at this point I'm taking the ex-employees' word,
         | because it confirms the results that we already know -- there
         | is no way that update could have gone out had there been proper
         | "safety first" protocols in place and CrowdStrike was
         | "meticulous".
        
         | theideaofcoffee wrote:
          | I just don't think a company like CrowdStrike has a leg to
          | stand on when leveling the "disgruntled" label in the face
          | of their, let's face it, astoundingly epic fuck up. It's the
          | disgruntled employees who I think would have the clearest
          | picture of what was going on, regardless of whether they
          | were in QA/QC, because at that point they don't really care
          | any more and will be more forthright with their thoughts.
          | I'd certainly trust their info more than a company yes-man,
          | which is probably where some of that opposing messaging came
          | from.
        
           | paulcole wrote:
           | Why would you trust a company no-man any more than a company
           | yes-man? They both have agendas and biases. Is it just that
           | you personally prefer one set of biases (anti-company) more
           | than the other (pro-company)?
        
             | theideaofcoffee wrote:
             | Yes, I am very much biased toward being anti-company and I
             | make no apologies for that. I've been in the corporate
             | world long enough to know first-hand the sins that PR and
             | corporate management commits on the company's behalf and
             | the harm it does. I find information coming from the
             | individual more reliable than having it filtered through
             | corpo PR, legal, ass-covering nonsense, the latter group
             | often wanting to preserve the status quo than getting out
             | actual info.
        
               | paulcole wrote:
               | OK just checking. Nice that you at least acknowledge your
               | bias.
        
             | noisy_boy wrote:
              | Because there is still an off chance that an employee
              | who has been let go isn't speaking out of spite but
              | merely stating the facts - it depends on a combination
              | of their honesty and the feelings they harbor about
              | being let go. Not everyone who is let go is bitter
              | and/or a liar.
             | 
             | However, every company yes-man is paid to be a yes-man and
             | will speak in favor of the company without exception - that
             | literally is the job. Otherwise they will be fired and will
             | join the ranks of the aforementioned people.
             | 
             | So logically it makes more sense for me to believe the
              | former more than the latter. The two sides are not
              | equivalent (as you may have implied) in terms of
              | trustworthiness.
        
               | nullvoxpopuli wrote:
               | Agreed. As a data point, i'm not disgruntled (i'm quoted
               | in this article).
               | 
               | Mostly disappointed.
        
             | insane_dreamer wrote:
             | Well, in this case, we know one side (pro-company) fucked
             | up big time. The other side (anti-company) may or may not
             | have fucked up.
             | 
             | That makes it easier to trust one side over another.
        
               | paulcole wrote:
               | You've kind of set yourself up in a no-lose situation
               | here.
               | 
               | If the employees fucked up then you'll say the company
               | still fucked up because it wasn't managing the employees
               | well.
               | 
               | And then in that situation you'll still believe the lying
               | employees who say its the company's fault while leaving
               | out their culpability.
        
         | tooltower wrote:
         | This is like online reviews. If you selectively take positive
         | or negative reviews and somehow censor the rest, the reviews
         | are worthless. Yet, if you report on all the ones you find,
         | it's still useful.
         | 
         | Yes, I'm more likely to leave reviews if I'm unsatisfied. Yes,
         | people are more likely to leave CS if they were unhappy. Biased
         | data, but still useful data.
        
         | denkmoon wrote:
         | Well they certainly don't care about the speed of the endpoints
         | their malware runs on. Shit has ruined my macos laptop's
         | performance.
        
           | nullvoxpopuli wrote:
           | All EDR software does (at least on macos)
           | 
           | Source: me, a developer who also codes in free time and
           | notices how bad fs perf is especially.
           | 
           | I've had the CrowdStrike sensor, and my current company is
           | using cyberhaven.
           | 
            | So... while two data points don't technically make a
            | pattern, they do begin to raise suspicion.
        
         | Aeolun wrote:
         | Honestly, this article describes nearly all companies (from the
         | perspective of the engineers) so I'm not sure I find it hard to
         | believe this one is the same.
        
         | zik wrote:
         | Here's some anecdotal evidence - a friend worked at CrowdStrike
         | and was horrified at how incredibly disorganised the whole
         | place was. They said it was completely unsurprising to them
         | that the outage occurred. More surprising to them was that it
         | hadn't happened more often given what a clusterfrock the place
         | was.
        
         | wpietri wrote:
         | > So basically we have nothing.
         | 
         | No, what we have is a publication who is claiming that the
         | people they talked to were credible and had points that were
         | interesting and tended to match one another and/or other
         | evidence.
         | 
         | You can make the claim that Semafor is bad at their jobs, or
         | even that they're malicious. But that's a hard claim to make
         | given that in the paragraph you've quoted they are giving you
         | the contrary evidence that they found.
         | 
         | And this is a process many of us have done informally. When we
         | talk to one ex-employee of a company, well maybe it was just
         | that guy, or just where he was in the company. But when a bunch
         | of people have the same complaint, it's worth taking it much
         | more seriously.
        
         | lr4444lr wrote:
         | In principle yes, I agree that former employees' sentiments
         | have an obvious bias, but if they all trend in the same
         | direction - people who worked in different times and functions
         | and didn't know each other while on the job - that points to a
         | likely underlying truth.
        
         | _heimdall wrote:
         | I do agree with having to expect bias there, but who else do
          | you really expect to speak out? Any current employee would very
         | quickly become an ex-employee if they speak out with any
         | specifics.
         | 
         | I would expect any contractor that may have worked for
         | CrowdStrike, or done something like a third-party audit, would
         | be under an NDA covering their work.
         | 
         | Who's left to speak out with any meaningful details?
        
         | _fat_santa wrote:
         | > Quality control was not really part of our process or our
         | conversation.
         | 
         | Is anyone really surprised or learned any new information? For
         | us that have worked for tech companies, this is one of those
         | repeating complaints that you hear across orgs that indicates a
         | less than stellar engineering culture.
         | 
          | I've worked with numerous F500 orgs, and in three out of
          | five of them the code was so bad it made me wonder how they
          | hadn't had a major incident yet.
        
         | skenderbeu wrote:
          | The truly disgruntled ones are the CrowdStrike customers
          | who had to deal with the outage. These employees have a lot
          | of reputation to lose by coming forward. CrowdStrike is a
          | disgrace of a company, and many others like it are engaging
          | in the same behaviors - they just haven't gotten caught yet.
          | Software development became a disgrace when squeezing
          | margins to please investors took over.
        
         | iudqnolq wrote:
         | There are some very specific accusations backed up by non-
         | denials from crowdstrike.
         | 
         | Ex-employees said bugs caused the log monitor to drop entries.
         | Crowdstrike responded the project was never designed to alert
         | in real time. But Crowdstrike's website currently advertises it
         | as working in real time.
         | 
         | Ex-employees said people trained to monitor laptops were
         | assigned to monitor AWS accounts with no extra training.
         | Crowdstrike replied that "there were no experienced 'cloud
         | threat hunters' to be had" in 2022 and that optional training
         | was available to the employees.
        
       | Sarkie wrote:
       | It was shown in the RCA that their QA processes were shit
        
       | monksy wrote:
       | No shit.
        
       | nine_zeros wrote:
        | Typical of tech companies these days. Quality is considered
        | immaterial - or worse, pushed onto low-level managers and
        | engineers who don't have the time to properly examine quality
        | and good roll-out practices.
       | 
       | C-Suite and investors don't seem to want to spend on quality.
       | They should just price in that their stock investment could
       | collapse any day.
        
       | 0xbadcafebee wrote:
       | Critical software infrastructure should be regulated the way
       | critical physical infrastructure is. We don't trust the people
       | who make buildings and bridges to "do the right thing" - we
       | mandate it with regulations and inspections. (When your software
       | not working strands millions of people around the globe, it's
       | critical) And this was just a regular old "accident"; imagine the
       | future, when a war has threat actors _trying_ to knock things
       | out.
        
         | theideaofcoffee wrote:
          | "We can't regulate the industry because then the US loses
          | to China" or "regulation will kill the US competitive
          | advantage!" are the responses I've gotten when suggesting
          | the same, and I just can't. But I agree with you 100%. If
          | it's safety critical, it should be under even more scrutiny
          | than other things; it shouldn't be left to self-regulating
          | QA-like processes in profit-seeking companies, and it needs
          | real review before the big button gets pressed.
         | 
         | Edit: Disclaimer: The quotes aren't mine, just retorts I've
         | received from others when I suggest the R-word.
        
           | janalsncm wrote:
           | > then the US loses to China
           | 
           | Yeah it makes no sense. Was the US not losing to China when
           | we own-goaled the biggest cybersecurity incident in history?
        
             | worik wrote:
             | > then the US loses to China
             | 
              | Such a silly meme, too. Economics 101: China and the
              | USA would both benefit by halting the conflict and
              | trading with each other.
        
           | Zigurd wrote:
           | Not to mention humans going extinct because regulators are to
           | blame for there being no city on Mars. Because that's
           | definitely the reason there's no city on Mars.
        
         | owl57 wrote:
         | Did you notice that the piece of software in question was
         | apparently installed mostly in companies where regulations and
         | inspections already override sysadmins' common sense? Are you
         | sure the answer is simply more of the same?
        
           | acdha wrote:
           | It's not true that "common sense" is being overridden: most
           | companies and sysadmins do need that baseline to avoid
           | "forgetting" about things which aren't trivial to implement
           | (if you didn't work in the field 10+ years ago, it was common
           | to see systems getting patched annually or worse, people
           | opening up SSH/Remote Desktop to the internet for
           | convenience, shared/short passwords even for privileged
           | accounts, vendors would require horribly insecure
           | configuration because they didn't want to hire anyone who
           | knew how to do things better, etc.). There are drawbacks to
           | compliance security but it has been useful for flushing all
           | of that mess out.
           | 
            | Even if that premise were true, it would still be the
            | wrong reaction.
           | We're in this situation because so many companies were
           | negligent in the past and the status quo was obviously
           | untenable. If there is a problem with a given standard the
           | solution is to make a better system (e.g. like Apple did)
           | rather than to say one of the most important industries in
           | the world can't be improved because that'd require a small
           | fraction of its budget.
        
           | sitkack wrote:
           | I sure noticed how much snark you packed into two sentences!
        
           | 0xbadcafebee wrote:
           | I've worked in these enterprise organizations for a long
           | time. They don't run on common sense, or even what one might
           | consider "business sense". Their existing incentives create
           | bizarre behavior.
           | 
            | For example, you might think _"if a big security exploit
            | happens, the stock price might tank"_. So if they value
            | the stock price, they'll focus on security, right? In
            | reality what they do is focus on _burying the evidence_ of
            | security exploits. Because if nobody finds out, the stock
            | price won't tank. Much easier than doing the work of
            | actually securing things. And apparently it's often legal.
           | 
           | And when it's not a bizarre incentive, often people just
           | ignore risks, or even low-level failures, until it's too
           | late. Four-way intersections can pile up accidents for years
           | until a school bus full of kids gets T-boned by a dump truck.
           | We can't expect people to do the right thing even if they
           | notice a problem. Something has to force the right thing.
           | 
           | The _only_ thing I have ever seen force an executive to do
           | the right thing is a law that says they will be held liable
            | if they don't. That's still not a guarantee it will
            | actually happen correctly, of course. But they will put
            | pressure on their
           | underlings to at least try to make it happen.
           | 
           | On top of that, I would have standards that they are required
           | to follow, the way building codes specify the standard
           | tolerances, sizes, engineering diagrams, etc that need to be
           | followed and inspected before someone is allowed into the
           | building. This would enforce the quality control (and someone
           | impartial to check it) that was lacking recently.
           | 
           | This will have similar results as building codes - increased
           | bureaucracy, cost, complexity, time... but also, more safety.
           | I think for _critical_ things, we really do need it.
           | Industrial controls, like those used for water, power
           | (nuclear...), gas, etc, need it. Tanker and container ships,
           | trains /subways, airlines, elevators, fire suppressants,
           | military/defense, etc. The few, but very, very important,
           | systems.
           | 
           | If somebody else has better ideas, believe me, I am happy to
           | hear them....
        
             | abbadadda wrote:
             | Probably there should be an independent body that oversees
             | postmortems on tech issues, with the ability to suggest
             | changes. This is what airlines face during crash
             | investigations and often new rules are put in place (e.g.,
             | don't let the shift manager self-certify his own work in
              | the incident where the pilot's window popped off). What
              | this would look like for software companies, and what
              | the bar
             | is for being subject to this rigor I don't know (I suspect
             | not for a Candy Crush outage though).
             | 
             | In general, the biggest problem I see with late stage
             | capitalism, and a lack of accountability in general, is
             | that given the right incentives people will "fuck things
             | up" faster than you can stop them. For example, say
             | CrowdStrike was skirting QA - what's my incentive as an
             | individual employee versus the incentive of an executive at
             | the company? If the exec can't tell the difference between
             | good QA and bad QA, but can visually see the accounting
             | numbers go up when QA is underfunded, he's going to
             | optimize for stock price. And as an IC there's not much you
             | can do unless you're willing to fight the good fight day in
             | and day out. But when management repeatedly communicates
             | they do not reward that behavior, and indeed may not care
             | at all about software quality over a 5 year time horizon,
             | what do you do? The key lies in finding ways to convince
             | executives or short of that holding them to account like
             | you say.
        
               | theideaofcoffee wrote:
               | I've commented on this before, but in this case I think
                | it starts to fall into the laps of the individual
               | employees themselves by way of licensing, or at least
               | some sort of certification system. Sure, you could skirt
               | a test here or there, but then you'd only be shorting
               | yourself when shit hits the fan. It'd be your license and
               | essentially your livelihood on the line.
               | 
               | "Proper" engineering disciplines have similar systems
               | like the Professional Engineer cert via the NSPE that
               | requires designs be signed off. If you had the
               | requirement that all software engineers (now with the
               | certification actually bestowing them the proper title of
               | 'engineer') sign off on their design, you could prevent
               | the company from just finding someone else more
               | unscrupulous to push that update or whatever through. If
               | the entirety of the department or company is employing
               | properly certificated people, they'd be stuck actually
               | doing it the right way.
               | 
               | That's their incentive to do it correctly: sign your name
               | to it, or lose your license, and just for drama's sake,
               | don't collect $200, directly to jail. For the companies,
               | employ properly licensed engineers, or risk unlimited
               | downside liability when shit goes sideways, similar to
               | what might happen if an engineering firm built a shoddy
               | bridge.
               | 
               | Would a firm that peddles some sort of CRUD app need to
               | go through all of this? If it handles toxic data like
               | payments or health data or other PII, sure. Otherwise,
               | probably not, just like you have small contracting
               | outfits that build garden sheds or whatever being a bit
               | different than those that maintain, say, cooling systems
               | for nuclear plants. Perhaps a law might be written to
               | include companies that work in certain industries or
               | business lines to compel them to do this.
        
             | chii wrote:
             | While good, those ideas will all increase costs.
             | 
             | Would you pay 10x (or more, even) for these systems? That
             | means 10x the price of water, utilities, transport etc,
             | which then accumulate up the chain to make other things
             | which don't have criticality but do depend on the ones that
             | do.
             | 
             | The thing is, what exists today exists because it's the
              | path of least resistance.
        
               | solidninja wrote:
                | No, it exists because all must bow to the deity of
                | increasing shareholder value. Remember that a good
                | product is not necessarily equal to, or even a subset
                | of, the easy-to-sell product. Only once the incentives
                | are aligned towards building quality software that
                | lasts will we see change.
        
               | duckmysick wrote:
               | You're right (not sure about the exact factor though) -
               | and there's also additional costs when those systems
               | fail. Someone, somewhere lost money when all those planes
               | were grounded and services suspended.
               | 
               | At some point - maybe it already happened, I don't know -
               | spending more on preventive measures and maintenance will
               | be the path of least resistance.
        
               | tempodox wrote:
               | Cars without seat belts were the path of least resistance
               | for a long time. I wonder how that changed.
        
               | insane_dreamer wrote:
               | > Would you pay 10x (or more, even) for these systems?
               | 
               | if it's critical to your business, then yes; but you
               | quickly find out whether or not it's actually critical to
               | your business or whether it's something you can do
               | without
        
               | Vegenoid wrote:
               | Consumer costs would not go up 10x to put more care into
               | ensuring the continuous operation of critical IT
               | infrastructure. Things like "an update to the software or
               | configuration of critical systems must first be performed
               | on a test system".
        
         | tedk-42 wrote:
          | Like everything, the cheap, quick, or good rule applies
          | (pick two).
          | 
          | Software is pretty much always made cheaply and quickly.
          | Even NASA has software blunders and rockets that explode
          | mid-flight.
        
         | TiredOfLife wrote:
         | The regulations were the reason the companies were running
         | Crowdstrike in the first place.
        
           | 0xbadcafebee wrote:
           | I'm saying that a (different) regulation, standard, and
           | inspection, should apply to the whole software bill of
           | materials, as it relates to the critical-ness of the product.
           | Like, if security is important, the security-critical
           | components should be inspected/tested. That's how you build a
           | building safely: the nails are built to a certain
           | specification and the nail vendor signs off on that.
        
       | pclmulqdq wrote:
       | Everything that we know about CrowdStrike stinks of Knight
       | Capital to me. A minor culture problem snowballed into complete
       | dysfunction, eventually resulting in a company-ending bug.
        
         | ForOldHack wrote:
          | Knight Capital:
         | 
         | "$10 million a minute.
         | 
         | That's about how much the trading problem that set off turmoil
         | on the stock market on Wednesday morning is already costing the
         | trading firm.
         | 
         | The Knight Capital Group announced on Thursday that it lost
         | $440 million when it sold all the stocks it accidentally bought
          | Wednesday morning because of a computer glitch."
         | 
         | Glitch. Oh...
         | 
         | https://en.wikipedia.org/wiki/Therac-25
        
           | 0cf8612b2e1e wrote:
           | I do not work in finance, but surely every trading company
           | has had an algorithm go wild at some point. Just becomes a
           | matter of how fast someone can pull the circuit breaker
           | before the expensive failure becomes public.
        
             | worik wrote:
             | > surely every trading company has had an algorithm go wild
             | at some point.
             | 
             | You would think so.
             | 
             | Cynical me.
             | 
             | But no. When money is at stake much more care is taken than
             | when lives are at stake.
        
             | pclmulqdq wrote:
             | Shamelessly plugging my own blog post on this:
             | https://specbranch.com/posts/knight-capital/
             | 
             | The TL;DR of Knight is that Knight had several things go
             | wrong at the same time, and had no circuit breaker for the
             | problem that did not stop trading for the whole firm for
             | the day. Most trading firms have had things go badly, but
             | the holes in the Swiss cheese aligned for Knight (and they
             | were larger than many other firms). This all comes from a
             | sort of culture of carelessness.
        
               | odyssey7 wrote:
               | I always thought the Swiss cheese model was used to
               | suggest that no one party could possibly be responsible
               | for a bad thing that happened. Interesting to see the
               | company's culture blamed for the cheese itself.
        
               | pclmulqdq wrote:
               | Personally, I think there are too many things in modern
               | American society that involve diffusion of
               | responsibility, presumably so that people avoid negative
               | consequences. If you're going to suggest that a system
               | gives 1/10th of the responsibility to 10 different
               | people, the one who made the system is the enabler of
               | that and IMO should suffer the consequences.
        
               | odyssey7 wrote:
               | The Swiss cheese model fits better as a rebuttal when the
               | cheese comprises both the finger-pointer and the finger-
               | pointee. Think: sure, our software had a bug that said up
               | was down, but what about all of your own employees who
               | used the software, had certifications, and should have
               | known better than to accept its conclusions?
               | 
               | Your usage, in assigning blame rather than diffusing it,
               | was novel to me.
        
             | bitcharmer wrote:
             | We have circuit breakers for that very purpose. Everyone on
             | the street does. It's just that theirs seems to have failed
             | for some reason.
        
               | pclmulqdq wrote:
               | Theirs didn't fail, and they did have one. The circuit
               | breaker they had that would have worked was a big red
               | button that killed all of their trading processes, which
               | would have meant spending the rest of the day figuring
               | out and unwinding their positions.
               | 
                | They were unwilling to push that button in the short time
               | they had. If you read the reports to the SEC or the
               | articles about it, you will note that. The follow-ups
               | recommended that all firms adopt a big red button that is
               | less catastrophic.
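The "less catastrophic big red button" the follow-ups recommended can be illustrated with a toy per-strategy circuit breaker that halts one strategy when its losses breach a limit, instead of killing all trading firm-wide. This is a sketch under stated assumptions; the class name, limit, and numbers are hypothetical, not any firm's actual risk controls:

```python
class CircuitBreaker:
    """Halt one trading strategy once its cumulative loss breaches a limit."""

    def __init__(self, loss_limit):
        self.loss_limit = loss_limit  # max tolerated cumulative loss, in dollars
        self.pnl = 0.0
        self.tripped = False

    def record_fill(self, pnl_change):
        """Account for a fill; trip the breaker if losses exceed the limit."""
        if self.tripped:
            raise RuntimeError("strategy halted; no new orders accepted")
        self.pnl += pnl_change
        if self.pnl < -self.loss_limit:
            self.tripped = True  # halt this strategy only, not the whole firm

breaker = CircuitBreaker(loss_limit=1_000_000)
breaker.record_fill(-400_000)
breaker.record_fill(-700_000)  # cumulative loss of 1.1M breaches the 1M limit
print(breaker.tripped)  # True
```

Compared with a firm-wide kill switch, tripping at the strategy level limits the damage without forcing a full halt and an afternoon of unwinding every position.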
        
               | bitcharmer wrote:
               | Gotcha, thanks for correcting me, I need to read up more
               | about the incident.
        
       | bb88 wrote:
        | Most interesting quote in the article:
        | 
        | "It was hard to get people to do sufficient testing
        | sometimes," said Preston Sego, who worked at CrowdStrike from
        | 2019 to 2023. His job was to review the tests completed by
        | user experience developers that alerted engineers to bugs
        | before proposed coding changes were released to customers.
        | Sego said he was fired in February 2023 as an "insider
        | threat" after he criticized the company's return-to-work
        | policy on an internal Slack channel.
       | 
       | Okay clearly that company has a culture issue. Imagine
       | criticizing a policy and then getting labeled "insider threat".
        
         | nullvoxpopuli wrote:
         | I'd like to clarify: that my job was also to educate,
         | modernize, and improve developer velocity through tooling and
         | framework updates / changes (impacting every team in my
         | department (UX / frontend engineering)).
         | 
         | Reviewing tests is part of PR review.
         | 
         | --- and before anyone asks, this is my statement on CrowdStrike
         | calling everyone disgruntled:
         | 
         | "I'm not disgruntled.
         | 
         | But as a shareholder (and probably more primarily, someone who
         | cares about coworkers), I am disappointed.
         | 
         | For the most part, I'm still mourning the loss of working with
         | the UX/Platform team."
        
           | bb88 wrote:
           | I mourn the fact that your ex co-workers are still working
           | for a shitty company.
        
             | nullvoxpopuli wrote:
             | The market for jobs isn't great, so i don't blame them.
             | 
             | At the same time, i feel like big profit-chasing software
             | companies are _all_ like how CrowdStrike is.
             | 
             | Many may be in the same type of company, but situations
             | have not arisen that reveal how leadership really feels
             | about employees.
        
         | Aeolun wrote:
         | > Imagine criticizing a policy and then getting labeled
         | "insider threat".
         | 
          | Especially because that's incredibly dumb. A true insider
          | threat would play nice while quietly leaking all your
          | confidential data.
        
           | bb88 wrote:
           | I mean, that's just insanely true. I think this is maybe the
           | most dystopian company I've ever heard of so far.
        
         | wesselbindt wrote:
         | > return to work
         | 
         | I know you're just quoting the phrase, but what a gross and
         | dishonest way of phrasing "return to office". Implies working
         | remotely doesn't count as work. Smacks of PR. Yuck.
        
       | seanw444 wrote:
       | And everybody gasped in surprise.
        
       | tamimio wrote:
       | I think the whole world knew that already.
        
       | insane_dreamer wrote:
       | > CrowdStrike disputed much of Semafor's reporting
       | 
       | I expect some ex-employees to be disgruntled and present things
       | in a way that makes CroudStrike look bad. That happens with every
       | company.
       | 
       | BUT, CrowdStrike has ZERO credibility at this point. I don't
       | believe a word they say.
        
         | Zigurd wrote:
         | At some companies, like Boeing, the shorter list would be the
         | gruntled employees.
        
           | insane_dreamer wrote:
           | > gruntled
           | 
            | have never heard that word used in a non-negative way
        
             | beng-nl wrote:
             | Off-Topic, but do I have a story for you
             | 
             | https://www.ling.upenn.edu/~beatrice/humor/how-i-met-my-
             | wife...
        
             | tsimionescu wrote:
             | Fun linguistics fact, but gruntled as the antonym of
             | disgruntled is a back-formation. The word disgruntled is a
             | bit strange, in that it uses "dis-" not as a reversal
             | prefix (such as in dissatisfied or dissimilar), but as an
             | intensifier. The original "gruntle" was related to grunt,
             | grunting, it was similar to "grumble", denoting the sounds
             | an annoyed crowd might make. But this old sense of gruntle,
             | gruntling, gruntled has not been used since the 16th
             | century. And in the past century, people have started back-
             | forming a new "gruntle" by analyzing "dis-gruntled" as
             | using the more common meaning of "dis-".
             | 
              | A similar use of dis- as an intensifier apparently happened
              | in "dismayed" (here from an Old French verb, esmaier, which
              | meant to trouble, to disturb), and in "disturbed" (from a
              | Latin word, turba, meaning turmoil). I haven't heard anyone
              | say they are "mayed" or "turbed", but people would probably
              | react the same way as to "gruntled" if you used them.
        
             | dbattaglia wrote:
             | I've only heard it from Michael Scott: "Everyone here is
             | extremely gruntled".
        
       | chaps wrote:
       | Worked on a team that deployed crowdstrike agents to our org
       | and... Yeah. One of the biggest problems we had was that the
       | daemon would log a massive amount of stuff... But had no config
       | for it to stop or reduce it.
        
       | st3fan wrote:
       | Found out that the CrowdStrike Mac agent (Falcon) sends all your
       | secrets from environment variables to their cloud hosted SIEM. In
       | plain text.
       | 
       | Anyone with access to your CS SIEM can search for GitHub, AWS,
       | etc. creds. Anything your devs, ops and sec teams use on their
       | Macs.
       | 
       | Only the Mac version does this. There is no way to disable this
       | behaviour or a way to redact things.
       | 
       | Another really odd design decision. They probably have many many
       | thousands of plain text secrets from their customers stored in
       | their SIEM.
        
         | x3n0ph3n3 wrote:
         | Can you provide some more info on this? How do you know? Is
         | this documented somewhere?
         | 
          | I'm sure this is going to raise red flags in my IT department.
        
           | st3fan wrote:
           | Ask them to search for the usual env var names like
           | GITHUB_TOKEN or AWS_ACCESS_KEY_ID.
        
           | skewer99 wrote:
           | AKIDs... ugh. They'll be there if you use AWS + Mac.
           | 
           | Again, the plaintext is the problem.
           | 
           | These environment variables get loaded from the command line,
           | scripts, etc. - CrowdStrike and all of the best EDRs also
           | collect and send home all of that, but probably in an
           | encrypted stream?
        
             | zxexz wrote:
             | I usually remote dev on an instance in a VPC because of
             | crap like this. If you like terrible ideas (I don't use
             | this except for debugging IAM stuff, occasionally), you can
             | use the IMDS like you were an AWS instance by giving a
             | local loopback device the link-local ipv4 address
             | 169.254.169.254/32 and binding traffic on the instance's
             | 169.254.169.254/32 port 80 to your lo's port 80, and a
             | local AWS SDK will use the IAM instance profile of the
             | instance you're connected to. I'll repeat, this is not a
             | good idea.
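For the record, a rough sketch of what the parent describes (commands are illustrative and assume macOS plus an SSH-reachable EC2 instance with an instance profile; as the parent says, debugging only, not a good idea):

```shell
# 1. Alias the IMDS link-local address onto the loopback interface (macOS).
sudo ifconfig lo0 alias 169.254.169.254 255.255.255.255

# 2. Forward port 80 on that alias to the instance's own IMDS endpoint.
#    (hostname and user are placeholders)
sudo ssh -N -L 169.254.169.254:80:169.254.169.254:80 ec2-user@my-instance

# 3. A local AWS SDK/CLI now resolves credentials from the "instance"
#    profile, assuming the IMDS token exchange works through the tunnel.
aws sts get-caller-identity
```

Tearing it down is `sudo ifconfig lo0 -alias 169.254.169.254` plus killing the SSH tunnel.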
        
         | apimade wrote:
          | Is this really a criticism? Because this has been the case
          | forever with all security and SIEM tools. It's one of the
          | reasons why the SIEM is among the most locked-down pieces of
          | software in the business.
          | 
          | Realistically, secrets alone shouldn't allow an attacker access
          | - they should need access to infrastructure or certificates on
          | machines as well. But unfortunately that's not the case for
          | many SaaS vendors.
        
           | st3fan wrote:
            | But why is this only forced on macOS?
            | 
            | I think some configurability would be great. I would like to
            | provide an allow list or the ability to redact. Or exclude
            | specific host groups.
            | 
            | We all have different levels of acceptable risk.
        
             | btilly wrote:
             | Conspiracy theory time. Because Apple is the only OS
             | company that has reliably proven that it won't decrypt hard
             | drives at government request.
        
               | vagrantJin wrote:
                | This is a true conspiracy.
        
               | jordanb wrote:
               | Seriously? Crowdstrike is obviously NSA just like
               | Kaspersky is obviously KGB and Wiz is obviously Mossad.
                | Why else are countries so anxious about local businesses
                | not using agents made by foreign actors?
        
               | smolder wrote:
               | KGB is not even a thing. Modern equivalent is FSB, no?
               | I'm skeptical. I don't think it's obvious that these are
               | all basically fronts, as much as I'm willing to believe
               | that IC tentacles reach wide and deep.
        
               | iml7 wrote:
                | It depends on the country it is in: it rejects the US
                | government's requests, but fully complies with any
                | request from the Chinese government.
        
               | EE84M3i wrote:
               | I'd be interested to learn more about that.
               | 
               | My mental model was that Apple provides backdoor
                | decryption keys to China _in advance_ for devices sold in
                | China/Chinese iCloud accounts, but that they cannot/will
                | not bypass device encryption for China for devices sold
                | outside of the country/foreign iCloud accounts.
        
               | throwaway48476 wrote:
               | The venn diagram of users who don't want the government
               | to access their data and crowdstrike customers is two
               | circles in different galaxies.
        
               | xnyan wrote:
               | It's probably being run on an enterprise-managed mac. The
               | only person who can be locked out via encryption is the
               | user.
        
           | immibis wrote:
           | The certificate private key is also a secret.
        
           | meowface wrote:
           | All SIEM instances certainly contain a lot of sensitive data
           | in events, but I'm not sure if most agents forward all
           | environment variables to a SIEM.
        
             | hello_moto wrote:
             | Agents don't just read env vars and send them to SIEM.
             | 
             | There's a triggering action that caused the env vars to be
              | used by another ... ehem... Process ... that any EDR
              | software on this beautiful planet would have tracked.
        
               | st3fan wrote:
                | No, it logs every command macOS runs or that you type in
                | a terminal, either directly or indirectly: from macOS
                | internal periodic tasks to you running "ls".
        
           | worik wrote:
           | > Because this has been the case forever with all security
           | and SIEM tools.
           | 
           | Why?
           | 
           | There is no need to send your environment variables.
        
             | gruez wrote:
             | Otherwise malware can hide in environment variables
        
               | llm_trw wrote:
               | Ok, suppose you're right.
               | 
               | Why are they only doing it for macs then?
        
               | st3fan wrote:
               | It may depend a bit on your organization but I bet most
               | folks using an EDR solution can tell you that Macs are
               | probably very low on the list when it comes to malware.
               | You can guess which OS you will spend time on every day
               | ...
        
               | llm_trw wrote:
               | So because macs are not the targets of malware ... we're
               | locking them down tighter than any other system?
        
               | namaria wrote:
               | No, see, they're leveling the playing field by storing
               | all secrets they find on macs in plaintext
        
               | batch12 wrote:
               | I don't think this is limited to just Macs based on my
               | experience with the tool. It also sends command line
               | arguments for processes which sometimes contain secrets.
               | The client can see everything and run commands on the
               | endpoints. What isn't sent automatically can be collected
               | for review as needed.
        
               | st3fan wrote:
               | It does redact secrets passed as command line arguments.
               | This is what makes it so inconsistent. It does recognize
               | a GitHub token as an argument and blanks it out before
               | sending it. But then it doesn't do that if the GitHub
               | token appears in an env var.
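The inconsistency described above is just a question of where the pattern matching is applied: the same redaction pass that catches tokens in argv could cover the environment too. A minimal sketch (patterns and names are illustrative, not CrowdStrike's actual implementation):

```python
import re

# Illustrative subset of well-known credential formats; real secret
# scanners ship hundreds of such patterns.
SECRET_PATTERNS = [
    re.compile(r"ghp_[A-Za-z0-9]{36}"),  # GitHub personal access token
    re.compile(r"AKIA[0-9A-Z]{16}"),     # AWS access key ID
]

def redact(value: str) -> str:
    """Blank out anything that looks like a known credential format."""
    for pattern in SECRET_PATTERNS:
        value = pattern.sub("[REDACTED]", value)
    return value

def redact_env(env: dict) -> dict:
    """Apply the same redaction to environment variables as to argv."""
    return {name: redact(value) for name, value in env.items()}
```

The point is simply that once the argv scrubber exists, running it over `os.environ` before shipping the event costs almost nothing.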
        
               | worik wrote:
               | They do not need to take the data off the computer to do
               | that
        
               | cma wrote:
               | Malware can hide in the frame buffer at modern
               | resolutions. They could keep a full copy of it and each
               | frame transition too.
        
           | chelmzy wrote:
           | Most sane SIEM engineers would implement masking for this.
           | Not sure if CS still uses Splunk but they did at one point.
           | No excuse really.
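For what it's worth, index-time masking in Splunk is a one-stanza affair; a hypothetical `props.conf` fragment (the sourcetype name and patterns here are illustrative) might look like:

```ini
# props.conf -- mask credential-shaped strings before events are indexed
[crowdstrike:falcon]
SEDCMD-mask_aws_akid = s/AKIA[0-9A-Z]+/AKIA-REDACTED/g
SEDCMD-mask_github   = s/ghp_[A-Za-z0-9]+/ghp_REDACTED/g
```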
        
           | wbl wrote:
           | What do you think grants the access to the infra or ability
           | to get a certificate?
        
           | ants_everywhere wrote:
           | Ideally secrets never leave secure enclaves and humans at the
           | organization can't even access them.
           | 
           | It's totally insane to send them to a remote service
           | controlled by another organization.
        
             | Natsu wrote:
             | I mean it's right there in the name. They're not really
             | secrets any longer if you're sharing them in plaintext with
             | another company.
        
             | cj wrote:
             | Essentially, it's straddling two extremes:
             | 
             | 1) employees are trusted with secrets, so we have to audit
             | that employees are treating those secrets securely (via
             | tracking, monitoring, etc)
             | 
             | 2) we don't allow employees to have access to secrets
             | whatsoever, therefore we don't need any auditing or
             | monitoring
        
               | ants_everywhere wrote:
               | You give employees the ability to _use_ the secrets, and
               | that usage is tracked and audited.
               | 
               | It works the same way for biometrics like face unlock on
               | mobile phones
        
               | stogot wrote:
               | Exporting to a SIEM does not correlate to either of those
               | extremes. It's stupidity and makes auditing worse
        
               | cj wrote:
               | SIEM = Security Information & Event Management
               | 
               | Factually, it is necessary for auditing and absolutely
               | correlates with the extreme of needing to monitor the
               | "usage" of "secrets".
               | 
               | In a highly auditable/"secure" environment, you can't
               | give secrets to employees with no tracking of when the
               | secrets are used.
        
               | davorak wrote:
               | > In a highly auditable/"secure" environment, you can't
               | give secrets to employees with no tracking of when the
               | secrets are used.
               | 
                | This does not seem to require regularly exporting secrets
                | from the employees' machines though, which is the main
                | complaint I am reading. You would log when the secret is
                | used to access something, presumably remote to the user's
                | machine.
        
               | halayli wrote:
               | That's far from factual and you are making things up. You
               | don't need to send the actual keys to a siem service to
               | monitor the usage of those secrets. You can use a
               | cryptographic hash and send the hash instead. And they
               | definitely don't need to dump env values and send them
               | all.
               | 
               | Sending env vars of all your employees to one place
               | doesn't improve anything. In fact, one can argue the
               | company is now more vulnerable.
               | 
                | It feels like a decision made by a clueless school
                | principal, instead of a security expert.
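The keyed-hash idea is simple to sketch (all names here are illustrative, not any vendor's actual telemetry format): the agent ships a fingerprint that lets the SIEM correlate sightings of the same secret without ever holding the secret itself.

```python
import hashlib
import hmac

# Hypothetical per-tenant key held by the agent, never by the SIEM.
TELEMETRY_KEY = b"per-tenant-telemetry-key"

def fingerprint(secret: str) -> str:
    """Stable, non-reversible identifier for a secret (HMAC-SHA256)."""
    return hmac.new(TELEMETRY_KEY, secret.encode(), hashlib.sha256).hexdigest()

# Two sightings of the same token produce the same identifier, so usage
# can still be audited and correlated across events...
event = {"process": "git", "secret_id": fingerprint("ghp_notarealtoken")}
# ...but the raw value never leaves the machine.
```

Keying the hash (rather than using plain SHA-256) matters because common token formats have enough structure that an unkeyed hash of a short secret can be brute-forced offline.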
        
               | Too wrote:
               | In a highly secure environment, don't use long lived
               | secrets in the first place. You use 2FA and only give out
               | short lived tokens. The IdP (ID Provider) refreshing the
               | token for you provides the audit trail.
               | 
                | Repeat after me: Security is not a bolt-on tool.
        
               | defrost wrote:
               | More like a triple lock steel core reinforced door laying
               | on its side in an open field?
               | 
               | Good start, might need a little more work around the
               | edges.
        
               | smolder wrote:
               | A secure environment doesn't involve software
               | exfiltrating secrets to a 3rd party. It shouldn't even
               | centralize secrets in plaintext. The thing to collect and
               | monitor is behavior: so-and-so logged into a dashboard
               | using credentials user+passhash and spun up a server
               | which connected to X Y and Z over ports whatever... And
               | those monitored barriers should be integral to an
               | architecture, such that every _behavior_ in need of
               | auditing is provably recorded.
               | 
               | If you lean in the direction of keylogging all your
               | employees, that's not only lazy but ineffective on
               | account of the unnecessary noise collected, and it's
               | counterproductive in that it creates a juicy central
               | target that you can hardly trust anyone with. Good
               | auditing is minimally useful to an adversary, IMO.
        
               | userbinator wrote:
               | _employees are trusted with secrets, so we have to audit
               | that employees are treating those secrets securely_
               | 
               | IMHO needing to be monitored constantly is not being
               | "trusted" by any sense of the word.
        
               | fragmede wrote:
               | I can trust you enough to let you borrow my car and not
               | crash it, but still want to know where my car is with an
               | Airtag.
               | 
               | Similarly employees can be trusted enough with access to
               | prod, while the company wants to protect itself from
               | someone getting phished or from running the wrong "curl |
               | bash" command, so the company doesn't get pwned.
        
             | cortesoft wrote:
             | > Ideally secrets never leave secure enclaves and humans at
             | the organization can't even access them.
             | 
             | Right, but doesn't that mean there is no risk from sending
             | employee laptop ENV variables, since they shouldn't have
             | any secrets on their laptops?
        
           | AmericanChopper wrote:
            | Keeping secrets and other sensitive data out of your SIEM is
            | a very important part of SIEM design. Depending on what
            | you're dealing with you might want to tokenize it, or redact
            | it, but you absolutely don't want to just ingest them in
            | plaintext.
           | 
           | If you're a PCI company then ending up with a credit card
           | number in your SIEM can be a massive disaster. Because you're
           | never allowed to store that in plaintext, and your SIEM data
           | is supposed to be immutable. In theory that puts you out of
           | compliance for a minimum of one year with no way to fix it,
           | in reality your QSAs will spend some time debating what to do
           | about it and then require you to figure out some way to
           | delete it, which might be incredibly onerous. But I have no
           | idea what they'd do if your SIEM somehow became full of
           | credit card numbers, that probably is unfixable...
        
             | ronsor wrote:
             | > But I have no idea what they'd do if your SIEM somehow
             | became full of credit card numbers, that probably is
             | unfixable...
             | 
             | You'd get rid of it.
        
               | AmericanChopper wrote:
               | If that's straightforward then congratulations, you've
               | failed your assessment for not having immutable log
               | retention.
               | 
               | They certainly wouldn't let you keep it there, but if
               | your SIEM was absolutely full of cardholder data, I
                | imagine they'd require you to extract ALL of it, redact
                | the cardholder data, and then import it to a new instance,
                | nuking the old one. But for a QSA to sign off on that
               | they'd be expecting to see a lot of evidence that
               | removing the cardholder data was the only thing you
               | changed.
        
           | Aeolun wrote:
            | If my security software exfiltrates my secrets _by design_,
            | I'm just going to give up on keeping anything secure now.
        
           | benreesman wrote:
           | Arbitrary bad practices as status quo without criticism, far
           | from absolving more of the same, demand scrutiny.
           | 
           | Arbitrarily high levels of market penetration by sloppy
           | vendors in high-stakes activities, far from being an argument
           | for functioning markets, demand regulation.
           | 
           | Arbitrarily high profile failures of the previous two, far
           | from indicating a tolerable norm, demand criminal
           | prosecution.
           | 
            | It was only recently that this seemingly ubiquitous vendor,
            | with zero-day access to a critical kernel space that any red
            | team adversary would kill for, said "lgtm shipit" instead of
            | running a test suite, with consequences and costs (depending
            | on who you listen to) ranging from billions in lost treasure
            | to loss of innocent life.
           | 
            | We know who fucked up, and have an idea of how much
            | corrupt-ass market-failure crony capitalism it takes to
            | admit such a thing.
           | 
           | The only thing we don't know is how much worse it would have
           | to be before anyone involved suffers any consequences.
        
           | kmacdough wrote:
            | "Oh, but our system is _so secure_, you don't need other
            | layers."
        
           | lolinder wrote:
            | > Realistically, secrets alone shouldn't allow an attacker
            | access - they should need access to infrastructure or
            | certificates on machines as well.
           | 
           | This isn't realistic, it's idealistic. In the real world
           | secrets are enough to grant access, and even if they weren't,
           | exposing one half of the equation in clear text by design is
           | still _really bad_ for security.
           | 
           | Two factor auth with one factor known to be compromised is
           | actually only one factor. The same applies here.
        
         | skewer99 wrote:
         | The monitoring and collection isn't the problem, that's what
         | modern EDR does - collect, analyze, compare, and do statistics
         | on all of the things.
         | 
         | The plaintext part is not okay.
        
           | notepad0x90 wrote:
            | Thank you, that's a sound perspective, but it is the
            | responsibility of the security staff who deploy EDRs like
            | Crowdstrike to scrub any data at ingestion time into their
            | SIEM. But within CS's platform, it makes little sense to talk
            | about scrubbing, since CS doesn't know what you want scrubbed
            | unless it is standardized data (like SSNs, credit cards,
            | etc.).
           | 
            | Another way to look at it is that the CS cloud environment is
            | effectively part of your environment. The secrets can get
            | scrubbed, but CS still has access to your devices; they can
            | remotely access them and get those secrets at any time
            | without your knowledge. That is the product. The security
            | boundary of OP's mac is inclusive of the CS cloud.
        
             | st3fan wrote:
             | Unfortunately the software doesn't allow for scrubbing or
             | redacting to be configured. Those features simply do not
             | exist.
        
               | notepad0x90 wrote:
                | For their own cloud, yeah, you basically accept their
                | cloud as an extension of your devices. But the back-end
                | they use(d?), Splunk, does have scrubbing capability they
                | can expose to customers, if actual customers requested
                | it.
               | 
               | In reality, you can take steps to prevent PII from being
               | logged by Crowdstrike, but credentials are too non-
               | standard to meaningfully scrub. It would be an exercise
               | in futility. If you trust them to have unrestricted
               | access to the credential, the fact that they're
               | inadvertently logging it because of the way your
               | applications work should not be considered an increase in
               | risk.
        
         | SoftTalker wrote:
         | Secrets in clear text in environment variables is never a good
         | idea though.
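One common alternative is sketched below: keep the secret in an owner-only file and read it on demand, so it never sits in the environment block that every child process (and, per this thread, the EDR) can see. Paths and names are illustrative.

```python
import os
import tempfile

def write_secret(path: str, secret: str) -> None:
    # Create with mode 0600 (owner read/write only) from the start,
    # rather than chmod-ing after the contents are already on disk.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(secret)

def read_secret(path: str) -> str:
    # Read on demand instead of exporting into os.environ.
    with open(path) as f:
        return f.read()

# Demo in a throwaway directory.
demo_path = os.path.join(tempfile.mkdtemp(), "github_token")
write_secret(demo_path, "ghp_notarealtoken")
token = read_secret(demo_path)
```

Tools like the `gh` and `aws` CLIs already support file- or keychain-backed credentials, so this is mostly a matter of not taking the env var shortcut.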
        
           | dchftcs wrote:
           | There are secrets like passwords, but there are also secrets
           | like "these are the parameters for running a server for our
           | assembly line for X big corp".
        
         | brundolf wrote:
         | Do you have a source?
        
         | madcadmium wrote:
         | Does it also monitor the contents of your copy/paste buffer? It
         | would scoop up a ton of privileged data if so.
        
         | hiddencost wrote:
         | This kind of information seems like it should have a CVE and a
         | responsible disclosure process.
         | 
         | Kidding, mostly, but wow that's a hell of a vulnerability.
        
           | notepad0x90 wrote:
           | It is not a vulnerability, you literally pay for this
           | feature. I really don't want to defend Crowdstrike but HN
           | keeps making it hard not to.
        
             | hiddencost wrote:
             | Storing secrets in unsecured environments in plaintext is
             | literally a vulnerability.
             | 
             | One of the most famous examples can be seen in the NSA
             | slide at the top of this article:
             | 
             | https://www.washingtonpost.com/world/national-
             | security/nsa-i...
        
               | notepad0x90 wrote:
               | the security tools' storage system is always considered a
               | secured environment.
        
               | j4coh wrote:
               | Without even having to secure it?
        
               | throw_a_grenade wrote:
               | Yes, but also No.
               | 
               | So there's this thing called "Threat model" and it
               | includes some assumptions about some moving parts of the
               | infra, and it very often includes assertion that a
               | particular environment (like IDS log, signing infra
               | surrounding HSM etc.) is "secure" (they mean outside of
               | the scope of that particular threat model). So it often
                | gets papered over, and it takes some reflex to say "hey,
                | how will we secure that other part". There needs to be
                | some consciousness about it, because it's not part of
                | this model under discussion, so not part of the agenda of
                | this meeting...
               | 
               | And it gets lost.
               | 
               | That's how shit happens in compliance-oriented security.
        
         | notepad0x90 wrote:
          | That's what EDRs do. Anyone with access to your SIEM or CS data
          | should also be trusted with response access (i.e.: remotely
          | access those machines).
          | 
          | If you want this redacted, it is a SIEM functionality, not
          | Crowdstrike's. Depends on the SIEM, but even older generation
          | SIEMs have a data scrubbing feature.
          | 
          | This isn't a Crowdstrike design decision as you've put it.
          | _Any_ endpoint monitoring tool, including the free and open
          | source ones, behaves just as you described. You won't just see
          | env vars from Macs but things like domain admin creds and PKI
          | root signing private keys. If you give someone access to an
          | EDR, or they are incident responders with SIEM access, you've
          | trusted them with full -- yet auditable and monitored --
          | access to that deployment.
        
           | pmlnr wrote:
           | Don't downvote this, this is the sad truth.
        
           | Fnoord wrote:
           | Sure, storage. Networking though? SIEMs receive and send data
           | unencrypted? They should not. By sending the data in plain
           | text you open up an attack surface to anyone sniffing the
           | network.
        
             | notepad0x90 wrote:
             | Crowdstrike like many EDRs uses mutually authenticated TLS
             | to send the data over the network to their cloud.
        
         | jgtrosh wrote:
         | Did somebody say GDPR?
        
           | pmlnr wrote:
           | Companies believe GDPR doesn't apply to their human
           | resources.
        
             | riedel wrote:
              | They have IT policies to make sure it largely does not
              | apply. Even in our policy, any personal use is officially
              | forbidden. Funnily, there is also an agreement with our
              | employee board that any personal use will not be
              | sanctioned. So guess what happens. This is done to
              | circumvent not only the GDPR but also the TTDSG in Germany
              | (which is harsher on 'spying' as it applies to telecoms).
              | For any 'officially' gathered personal information, very
              | specific agreements with our employee board exist
              | (reporting of illness, etc.). I wonder how such
              | information, which is also sensitive in a workplace, is
              | handled. I also see those systems used in hospitals etc.;
              | if other people's data is pumped through these systems,
              | the GDPR definitively applies and auditors may find it (I
              | only know of such auditing in finance though). In the
              | future NIS2 will also apply, so exactly the people that
              | use such systems will be put under additional scrutiny. I
              | hope this triggers some auditing of the systems used and
              | not just the use of more such systems.
        
           | unilynx wrote:
            | What would you expect the GDPR to say? This is allowed as
            | long as the GDPR's requirements are followed.
        
           | raverbashing wrote:
            | Not applicable. It is not related to personal data.
        
         | philshem wrote:
         | SIEM = Security information and event management
         | 
         | https://en.wikipedia.org/wiki/Security_information_and_event...
        
         | debarshri wrote:
          | It is common in the world of SIEM. Logs with secrets and PII
          | data are often sent and stay in the SIEM for years until an
          | incident occurs.
        
         | MasterIdiot wrote:
         | Having worked for a SIEM vendor, I can say that all security
         | software is extremely invasive, and most security people can
         | probably track every action you make on company-issued devices,
         | and that includes HTTPS decryption.
        
           | firtoz wrote:
           | Reminds me of a guy I know openly bragging that he can watch
           | all of his customers who installed his company's security
           | cameras. I won't reveal his details but just imagine any
           | cloud security camera company doing the same and you would
           | probably be right.
           | 
           | I guess it's pretty much the same principle.
        
           | blablabla123 wrote:
           | Yeah the question is always if the cure is better than the
           | disease. I'm quite ambivalent on this. On the one hand I tend
           | to agree with the "Anti AV camp" that a sufficiently
           | maintained machine can do well when following best practices.
           | Of course that includes SIEM which can also be run on-premise
           | and doesn't necessarily have to decrypt traffic if it just
           | consumes properly formatted logs.
           | 
           | On the other hand there was e.g. WannaCry in 2017, where
           | ransomware hit some 200,000 systems across 150 countries
           | running Windows XP and other unsupported Windows versions.
           | It shows that companies world-wide had trouble
           | properly maintaining the life cycle of their systems. I think
           | it's too easy to only accuse security vendors of quality
           | problems.
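
The "consume properly formatted logs" approach mentioned above can be sketched as newline-delimited JSON events that a SIEM ingests without any traffic decryption; the field names here are illustrative assumptions, not any vendor's schema:

```python
# Sketch of log-based SIEM ingestion: the agent emits structured
# (newline-delimited JSON) events, so the SIEM never has to decrypt
# traffic to get usable telemetry. Field names are illustrative only.
import json
from datetime import datetime, timezone

def siem_event(host, action, outcome, **fields):
    """Serialize one event as a single JSON line a SIEM can index."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "host": host,
        "action": action,
        "outcome": outcome,
        **fields,  # arbitrary extra context (user, process, etc.)
    }
    return json.dumps(event, sort_keys=True)
```

Each line is independently parseable, which is what makes this format easy to ship, filter, and index without touching the wire traffic itself.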
        
         | batch12 wrote:
         | Anyone with the right level of access to your Falcon instance
         | can run commands on your endpoints (using RTR) and collect any
         | data not already being collected.
        
       | avree wrote:
       | ""Speed was the most important thing," said Jeff Gardner, a
       | senior user experience designer at CrowdStrike who said he was
       | laid off in January 2023 after two years at the company. "Quality
       | control was not really part of our process or our conversation."
       | 
       | Their 'expert' on engineering process is a senior UX designer?
       | Somehow, I doubt they were very close to the kernel patch
       | deployment process.
        
         | acdha wrote:
         | They probably weren't, but that still speaks to their general
         | culture and is compatible with what we know about their kernel
         | engineering culture (limited testing, no review, no use of
         | common fail safe mechanisms).
        
           | hello_moto wrote:
           | A company can have different business units with different
           | culture/mentality.
           | 
           | I bet my ass anyone working in low-level code don't ship the
           | way you do in Cloud.
        
             | acdha wrote:
             | > I bet my ass anyone working in low-level code don't ship
             | the way you do in Cloud.
             | 
             | Their technical report says otherwise - and we know they
             | didn't adopt the common cloud practices of doing real
             | testing before shipping or having a progressive deployment.
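
The progressive deployment acdha refers to can be sketched as a canary gate; the stage fractions, error threshold, and function names below are invented for illustration and are not CrowdStrike's actual process:

```python
# Sketch of a staged (canary) rollout gate: each cohort must stay healthy
# before the update is promoted to a wider slice of the fleet.
# Stage fractions, threshold, and names are invented for illustration.

STAGES = [0.001, 0.01, 0.1, 0.5, 1.0]  # fraction of fleet per stage

def promote(fraction, error_rate, max_error_rate=0.001):
    """Allow the rollout to widen only if the current cohort is healthy."""
    if error_rate > max_error_rate:
        raise RuntimeError(
            f"halting rollout at {fraction:.1%}: "
            f"error rate {error_rate:.2%} exceeds {max_error_rate:.2%}"
        )
    return True

def deploy(update, fleet_size, get_error_rate):
    """Push `update` stage by stage, halting at the first unhealthy cohort.

    Returns the number of hosts reached.
    """
    deployed = 0
    for fraction in STAGES:
        deployed = int(fleet_size * fraction)  # ship to next cohort (stubbed)
        promote(fraction, get_error_rate())
    return deployed
```

The point is that a bad update halts at the first tiny cohort instead of reaching the whole fleet at once.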
        
           | esperent wrote:
           | > is compatible with what we know
           | 
           | In other words, it confirms our biases and we're willing to
           | accept it at face value despite there being only a single
           | anecdotal piece of evidence.
        
             | acdha wrote:
             | It sounds like you might want to read their technical
             | report. That's neither anecdotal nor a single point, and it
             | showed a pretty large gap in engineering leadership with
             | numerous areas well behind the state of the art.
             | 
             | That's why I said it was compatible: both these former
             | employees and their own report showed an emphasis on
             | shipping rapidly but not the willingness to invest serious
             | money in the safeguards needed to do so safely. If you want
             | to construct another theory, feel free to do so.
        
       | panic wrote:
       | Why would it matter? The absolute worst case scenario happened
       | and their stock is still up 50% YoY, beating the S&P 500.
        
         | 0cf8612b2e1e wrote:
         | I thought you were joking. The stock market is incredible.
         | 
         | Everyone must realize that crowdstrike has a captive audience
         | with no alternatives that can meet corporate compliance.
        
           | intelVISA wrote:
           | Can't think of a bigger flex of how locked-in their market
           | share is.
           | 
           | On the plus side this should spur some disruptors into gear,
           | assuming VCs are willing to pivot from wasting money funding
           | LLM wrappers.
        
         | hyperpape wrote:
         | It's down 30% since the incident, and flat since 3 years ago.
         | 
         | If it runs up a huge amount in the first half of the year and
         | then the incident knocks 30% off their market cap, that still
         | means the incident was really bad.
        
           | hello_moto wrote:
           | Their stock has always been volatile but you can't ignore the
           | fact that it hasn't been that bad after the incident.
        
       | goralph wrote:
       | What are some alternatives to CrowdStrike?
        
         | taspeotis wrote:
         | Personal: Nothing - Windows Defender is built into Windows.
         | 
         | Business: Nothing - Windows Defender Advanced Threat Protection
         | is built into the higher Microsoft 365 license tiers.
         | 
         | It amazes me people chose to pay money to have all their PCs
         | bluescreen.
        
           | neverrroot wrote:
           | This is a good example of very limited thinking.
        
           | Aeolun wrote:
           | mdatp is also a virus. So slow...
        
             | taspeotis wrote:
             | It can record some telemetry to help you understand why
             | it's slow: https://learn.microsoft.com/en-us/defender-
             | endpoint/troubles...
        
           | digitalsushi wrote:
           | if you had used 'some' before 'people' i could agree but some
           | industries have to use a siem or they can be fined, so, i
           | mean if there's a list of siems that are definitely not going
           | to ever crash by messing around in the kernel lets get a list
           | going
        
             | taspeotis wrote:
             | Microsoft Sentinel seems like a pretty unlikely SIEM
             | candidate to crash every machine it's receiving data from.
        
           | qaq wrote:
           | Large orgs want something that will run across all of their
           | fleet: Linux servers, Macs, etc.
        
             | taspeotis wrote:
             | Linux: https://learn.microsoft.com/en-us/defender-
             | endpoint/microsof...
             | 
             | macOS: https://learn.microsoft.com/en-us/defender-
             | endpoint/microsof...
             | 
             | It does iOS and Android too.
             | 
             | Again, if you're an organisation big enough to care about
             | single-pane-of-glass-monitoring you probably already have
             | access to this via the Microsoft 365 license tier you're
             | on.
        
         | worik wrote:
         | > What are some alternatives to CrowdStrike?
         | 
         | In house competence
        
           | rnts08 wrote:
           | But then you can't blame anyone else when shit hits the fan!
           | Isn't that what you're really paying for with EDR? No one is
           | safe from a targeted attack, regardless of software.
           | 
           | /s
        
           | duckmysick wrote:
           | Insurers often require having Endpoint Detection and
           | Response for all devices, from a third party. In-house often
           | won't cut it, even if it makes more practical sense.
        
         | strunz wrote:
         | Carbon Black was, though now they're owned by Broadcom and
         | folded into Symantec
        
         | TillE wrote:
         | Everything that describes itself as "endpoint security".
        
         | iamhamm wrote:
         | SentinelOne
        
       | ramesh31 wrote:
       | If their (or your) shop is anything like mine, it's been a
       | constant whittling of ancillary support roles (SDET, QA, SRE) and
       | a shoving of all of the above into the sole responsibility of
       | devs over the last few years. None of this is surprising at all.
        
       | nittanymount wrote:
       | Does it have competitors?
        
       | xyst wrote:
       | Switch off CrowdStrike junk. Those companies renewing contracts
       | with them have idiots for leaders.
       | 
       | Many competing platforms can serve as a drop-in replacement for
       | ClownStrike.
        
       | paulcole wrote:
       | Well if they say that QA was part of the process then they'll
       | look like idiots because they sucked at the process.
       | 
       | Don't find this particularly interesting news.
        
       | hinkley wrote:
       | I have only just begun to consider this question: when does risk
       | taking become thrill seeking?
       | 
       | At some point you go past questions of laziness or discipline and
       | it becomes a neurosis. Like an addiction.
        
       | Cyclone_ wrote:
       | Not justifying what they did with qc, but qc is missing from
       | quite a few places in software development that I've been a part
       | of. People might get the impression from the article that every
       | software project is well tested, whereas in my experience most
       | are rushed out.
        
         | Borborygymus wrote:
         | Exactly.
         | 
         | Much of the discourse around this topic has described ideal
         | testing and deployment practice. Maybe it's different in
         | Silicon Valley or investment banks, but for the sorts of
         | companies I work for (telco mostly) things are very far from
         | that ideal.
         | 
         | My view of the industry is one of shocking technical ineptitude
         | from all but a minority of very competent people who actually
         | keep things running... Of management who prioritize short term
         | cost reduction over quality at every opportunity, leading to
         | appalling technical debt and demoralized, over-worked staff who
         | rapidly stop giving a damn about quality, because speaking out
         | about quality problems is penalized.
        
         | padjo wrote:
         | I've worked for several multi billion dollar software
         | companies. None of them had a dedicated QA function by design.
         | Everything is about moving fast. That culture is ok if you're
         | making entertainment software or low criticality business
         | software. It's a very bad idea for critical software.
         | Unfortunately the "move fast" attitude has metastasised to
         | places where it has no place.
        
       | mattfrommars wrote:
       | Side effect of the old adage, "move fast, fail fast"?
        
       | Timber-6539 wrote:
       | Doesn't matter now. CRWD didn't go to zero. Meaning they get the
       | chance to do this again.
        
       | noisy_boy wrote:
       | Would be interesting to hear from their employees whether there
       | have been any tangible changes (less blind pursuit of velocity,
       | better QA, etc.) in the aftermath of this fiasco.
        
       | jokoon wrote:
       | We need laws and regulations on software the same way we have for
       | toys, cars, airplanes, boats, buildings.
       | 
       | This Silicon Valley libertarian nonsense needs to stop.
        
       | jrm4 wrote:
       | Does anyone have a logical reason why this company should _not_
       | be sued into oblivion?
        
         | superposeur wrote:
         | Yes, because in point of fact this company is the best at what
         | it does -- preventing security breaches. The outage --
         | disruptive as it was -- was not a breach. This elemental fact
         | is lost amidst all the knee jerk HN hate, but goes a long way
         | toward explaining why the stock only took a modest hit.
        
           | hun3 wrote:
           | That's a somewhat narrow definition of "security."
           | 
           | The 3rd component of the CIA triad is often overlooked, yet
           | availability is what makes the _protected_ asset--and,
           | transitively, the _protection_ itself--useful in the first
           | place.
           | 
           | The disruption is effectively a Denial of Service.
        
       | nailer wrote:
       | It's a UX designer. I don't particularly like crowdstrike, but
       | this person will know very little about their kernel drivers.
        
       | ricardobayes wrote:
       | I believe one of the biggest bad trends of the software industry
       | as a whole is cutting down on QA/testing effort. A buggy product
       | is almost always an unsuccessful one.
        
         | breadwinner wrote:
         | Blame Facebook and Google for that. They became successful
         | without QA engineers, so the rest of the industry decided to
         | follow suit in an effort to stay modern.
        
       | bitcharmer wrote:
       | Another company that got MBA-ified
        
       | sersi wrote:
       | Crowdstrike was heavily pushed on us at a previous company both
       | for compliance reasons by some of our clients (BCG were the ones
       | pushing us to use crowdstrike) and from our liability insurance
       | company.
       | 
       | It was really an uphill battle to convince everyone not to use
       | Crowdstrike. Eventually I managed to but after many meetings
       | where I had to spend a significant amount of time convincing
       | different stakeholders. I'm sure a lot of people just fold and go
       | with them.
        
         | mikeocool wrote:
         | Curious -- did you go with a different EDR solution? Or were
         | you able to convince people not to roll one out at all?
        
         | wesselbindt wrote:
         | What made you unwilling to use CS at the time?
        
       | manvillej wrote:
       | anyone feel like this and Boeing sound remarkably similar?
       | 
       | Its almost like there is a lesson for executives here. hmmmm
        
         | bitcharmer wrote:
         | The only lesson for these people is loss of bonuses. This will
         | keep happening for as long as golden parachutes are a thing.
        
           | wesselbindt wrote:
           | How can we get rid of golden parachutes?
        
       | bmitc wrote:
       | Has anyone _actually_ worked at a place where quality control was
       | treated as important? I wouldn't consider this exactly
       | surprising.
        
         | 6h6n56 wrote:
         | Nope. Did everyone forget the tech motto "move fast and break
         | things"? Where is the room for quality control in that
         | philosophy?
         | 
         | Corps won't even put resources into anti-fraud efforts if they
         | believe the millions being stolen from their bottom line isn't
         | worth the effort. I have seen this attitude working in FAANGS.
         | 
         | None of this will change until tech workers stop being
         | masochists and actually unionize.
        
         | sudosysgen wrote:
         | Yes, at a trading company, where important central systems had
         | a multiweek testing process (unless the change was marked as
         | urgent, in which case it was faster) with a dedicated team and
         | a full replica environment which would replay historical
         | functions 1:1 (or in some cases live), and every change needed
         | to have an automated rollback process. Unsurprising since it
         | directly affects the bottom line.
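
The automated-rollback requirement described above can be sketched roughly like this (all names are hypothetical; the commenter's actual system is not public):

```python
# Rough sketch of "every change needs an automated rollback": a deploy
# wrapper that remembers the previous version and restores it, with no
# human in the loop, if the post-deploy health check fails.
# All names are hypothetical.

class DeployError(Exception):
    pass

def deploy_with_rollback(current_version, new_version, apply, health_check):
    """Apply `new_version`; on a failed health check, reapply `current_version`."""
    apply(new_version)
    if health_check():
        return new_version
    apply(current_version)  # automated rollback path
    raise DeployError(
        f"{new_version} failed health check; rolled back to {current_version}"
    )
```

A real system would also verify that the rollback itself succeeded and page someone on the failure; the sketch only shows the no-human-in-the-loop restore step.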
        
           | bmitc wrote:
           | Very interesting. Thanks for sharing.
           | 
           | > every change needed to have an automated rollback process
           | 
           | How did you accomplish that?
        
         | m3047 wrote:
         | Yes. It was a manufacturing facility and since the products
         | were photosensitive the entire line operated in total darkness.
         | It was two months before they turned the lights on and I could
         | see what I was programming for.
         | 
         | This was the first place I saw standups. [Edit: this was the
         | 1990s] They were run by and for the "meat", the people running
         | the line. "Level 2" only got to speak if we were blocked, or to
         | briefly describe any new investigations we would be
         | undertaking.
         | 
         | Weirdly (maybe?) they didn't drug test. Of all the places I've
         | worked, I thought they would. But they didn't. They were
         | firmly committed to the "no SPOFs" doctrine and had a "tap out"
         | policy: if anyone felt you were distracted, they could "tap you
         | out" for the day. It was no fault. I was there for six months
         | and three or four times I was tapped out and (after the first
         | time, because they asked what I did with my time off the first
         | time) told to "go climb a rock". I tapped somebody out once,
         | for what later gossip suggested was a family issue.
        
         | insane_dreamer wrote:
         | I haven't worked there but I would presume that systems running
         | nuclear reactors or ICBM launchers have a strong emphasis on
         | QC.
        
       | hitekker wrote:
       | I was surprised by how dismissive these comments are. Former
       | staff members, engineers included, are claiming that their former
       | company's unsafe development culture contributed to a colossal
       | world-wide outage & other previous outages. These employees'
       | allegations ought to be seen as credible, or at least as
       | informative. Instead, many seem to be attacking the UX designer
       | commenting on 'Quality control was not part of our process'.
       | 
       | My guess is that people are identifying with the sentence said
       | before: "Speed [of shipping] is everything." Aka "Move fast and
       | break things."
       | 
       | The culture described by this article must mirror many of our
       | lived experiences. The pure pleasure of shipping code, putting
       | out fires, making an impact (positive or negative)... and then
       | leaving it to the next engineers & managers to sort out, ignoring
       | the mess until it explodes. Even when it does, no one gets blamed
       | for the outage and soon everyone goes back to building features
       | that get them promoted, regardless of quality.
       | 
       | Through that ZIRP light, these process failures must look like a
       | feature, not a bug. The emphasis on "quality" must also look like
       | annoying roadblocks in the way of having fun on the customer's
       | dime.
        
         | wesselbindt wrote:
         | There's folks out there who enjoy putting out proverbial fires?
         | I find rework like that quite frustrating.
        
           | MichaelZuo wrote:
           | Well there are a handful of expert consultants who do, since
           | they charge an eye watering price per hour for putting out
           | fires.
        
           | hitekker wrote:
           | Absolutely. Some people are born firefighters. Nothing wrong
           | with that.
           | 
           | I once worked with a senior engineer who loved running
           | incidents. He felt it was _real_ engineering. He loved
           | debugging thorny problems on a strict timeline, getting every
           | engineer in a room and ordering them about, while also
           | communicating widely to the company. Then, there's the rush
           | of the all-clear and the kudos from stakeholders.
           | 
           | Specific to his situation, I think he enjoyed the inflated
           | ownership that the sudden urgency demanded. The system we
           | owned was largely taken for granted by the org; a dead-end
           | for a career. Calling incidents was a good way to get
           | visibility at low-cost, i.e., no one would follow-up on our
           | postmortem action items.
           | 
           | It eventually became a problem, though, when the system we
           | owned was essentially put into maintenance mode, aka zero
           | development velocity. I estimate (controlling for other
           | variables) that the rate at which he called incidents for
           | non-incidents then went up by 3x...
        
             | oooyay wrote:
             | That's called hero culture and there's definitely something
             | wrong with it.
        
             | wesselbindt wrote:
             | I agree that enjoying firefighting is not inherently
             | harmful. However, the situation you describe afterward irks
             | me in some way I can't quite put my finger on. A lot of
             | words (toxic, dishonest, marketing, counterproductive, bus
             | factor) come to mind, but none of them quite fit.
        
           | jamesmotherway wrote:
           | Some people rise to the occasion during crises and find it
           | rewarding. There's a lot of pop science around COMT (the
           | "warrior gene" associated with stress resilience), which I
           | take with a grain of salt. There does seem to be something
           | there, though, and it overlaps with my personal experience
           | that many great security operations people tend to have ADHD
           | traits.
        
           | 1000100_1000101 wrote:
           | I've volunteered to fight a share of fires from people who
           | check things in untested, change infrastructure randomly,
           | etc.
           | 
           | What I've learned is that fixing things for these people (and
           | even having entire teams fixing things for weeks) just leads
           | to a continued lax attitude to testing, and leaving the
           | fallout for others to deal with. To them, it all worked out
           | in the end, and they get kudos for rapidly getting a solution
           | in place.
           | 
           | I'm done fixing their work. I'd rather work on my own tasks
           | than fix all the problems with theirs. I'm strongly
           | considering moving on, as this has become an entrenched
           | pattern.
        
         | righthand wrote:
         | Former QA engineer here, and can confirm quality is seen as an
         | annoying roadblock in the way of self-interested workers,
         | disguised as in the way of having fun on the customer's dime.
         | 
         | My favorite repeated reorg strategy over the years is "that we
         | will train everyone in engineering to be hot swappable in their
         | domains". Talk about spinning wheels.
        
         | ClickedUp wrote:
         | This is not a game. I would normally agree but not when it
         | comes to low-level kernel drivers. They're a cyber security
         | company, which makes it even worse.
         | 
         | Not very long ago we had this client who ordered a custom high
         | security solution (using a kernel driver). I can't reveal too
         | much but basically they had this offline computer running this
         | critical database and they needed a way to account for every
         | single system call to guarantee that any data could have not
         | been changed without the security system alerting and logging
         | the exact change. No backups etc were allowed to leave the
         | computer ever. We were even required to check ntdll (this was
         | on Windows) for hooks before installing the driver on-site &
         | other safety precautions. Exceptions, freezes or a deadlock? No
         | way. Any system call missed = disaster.
         | 
         | We took this seriously. Whenever we made a change to the driver
         | code we had to re-test the driver on 7 different computers (in-
         | office) running completely different hardware doing a set test
         | procedure. Last test before release entailed an even more
         | extensive test procedure.
         | 
         | This may sound harsh, but CrowdStrike are total amateurs and
         | always have been. Besides, what have they contributed to the
         | cyber security community? Nothing! Their research is at the
         | level of a junior cyber security researcher. They are willing
         | to outright lie and jump to wild conclusions, which is very
         | frowned upon in the community. I've also heard others comment
         | on how CS doesn't really fit the mold of a standard cyber
         | security company.
         | 
         | Nah, CS should take a close look at true professional companies
         | like Kaspersky and Checkpoint: industry leaders who've created
         | proven top-notch security solutions (software/services) and,
         | not least, actually contributed their valuable research to the
         | community for free, catching zero-days and reporting them
         | before anyone even had a chance of exploiting them.
         | 
         | They deserve some criticism.
        
           | musicale wrote:
           | I don't trust Kaspersky and Checkpoint either. But CS should exit
           | the market.
        
       | addled wrote:
       | Yesterday morning I learned that someone I was acquainted with
       | had just passed away and the funeral is scheduled for next week.
       | 
       | They recently had a stroke at home just days after spending over
       | a month in the hospital.
       | 
       | Then I remembered that they were originally supposed to be
       | getting an important surgery, but it was delayed because of the
       | CrowdStrike outage. It took weeks for the stars to align again
       | and the surgery to happen.
       | 
       | It makes me wonder what the outcome would have been if they had
       | gotten the surgery done that day, and not spent those extra weeks
       | in the hospital with their condition and stressing about their
       | future?
        
         | oehpr wrote:
         | I appreciate your post here and I'm glad you shared, because
         | it's an example of a distributed harm. One of millions to shake
         | out of this incident, that doesn't have a dollar figure, so it
         | doesn't really "count".
         | 
         | To illustrate:
         | 
         | If I were to do something horrible like kick a 3-year-old's knee
         | out and cripple them for life, I would be rightly labeled a
         | monster.
         | 
         | But If I were to say... advocate for education reform to push
         | American Sign Language out of schools, so that deaf children
         | grow up without a developmental language? We don't have words
         | for that, and if we did, none of them would get near the
         | cumulative scope and harm of that act.
         | 
         | We simply do not address distributed harms correctly. And a big
         | part of it is that we don't, we _can't_, see all the tangible
         | harms it causes.
        
         | namdnay wrote:
         | Not to defend Crowdstrike in any way, but it's a bit unfair to
         | only look at the downside. What if his hospital hadn't bought
         | an antivirus, and got hit by ransomware?
        
       | SlightlyLeftPad wrote:
       | Just another example of technical leadership being completely
       | irresponsible and another example of tech companies prioritizing
       | the wrong things. As a security company, this completely blows
       | their credibility. I'm not convinced they learned anything from
       | this and don't expect it to change anything. This is a
       | culture issue, not a technical one. One RCA isn't going to change
       | this.
       | 
       | Reliability is a critical facet of security from a business
       | continuity standpoint. Any business still using crowdstrike is
       | out of their mind.
        
       ___________________________________________________________________
       (page generated 2024-09-14 23:01 UTC)