[HN Gopher] The CrowdStrike file that broke everything was full ...
       ___________________________________________________________________
        
       The CrowdStrike file that broke everything was full of null
       characters
        
       Author : behnamoh
       Score  : 240 points
       Date   : 2024-07-19 18:47 UTC (4 hours ago)
        
 (HTM) web link (twitter.com)
 (TXT) w3m dump (twitter.com)
        
       | TillE wrote:
       | That probably explains how it got past internal testing.
       | Something went wrong after that, during deployment.
        
         | behnamoh wrote:
         | It could be as simple as cosmic radiation that flipped a bit
         | (it has happened before:
         | https://www.independent.co.uk/news/science/subatomic-
         | particl...), or as sophisticated as an adversarial hacking.
        
           | rvnx wrote:
           | The same cosmic radiation that flips the bits to make some
           | specific political party win.
        
         | zimpenfish wrote:
         | Someone on the Fediverse conjectured that it might have been
         | down to the Azure glitch earlier in the day. An empty file
         | would fit that if they weren't doing proper error checking on
         | their downloads, etc.
        
           | homero wrote:
           | It's crazy if they weren't signing and verifying downloads
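
A minimal sketch of the download check being discussed, assuming the publisher distributes a SHA-256 digest of the tested file through a separate trusted channel. The function names and the sample payload are hypothetical. A bare digest only detects corruption; authenticating the publisher would need an asymmetric signature, which this sketch does not attempt.

```python
import hashlib
import hmac


def sha256_hex(payload: bytes) -> str:
    """Return the hex SHA-256 digest of the payload."""
    return hashlib.sha256(payload).hexdigest()


def verify_download(payload: bytes, expected_hex: str) -> bool:
    """Accept the payload only if its digest matches the expected one.

    expected_hex is assumed to arrive through a trusted, separate channel.
    """
    # compare_digest performs a constant-time comparison.
    return hmac.compare_digest(sha256_hex(payload), expected_hex.lower())


# Hypothetical content file and the digest recorded when it was tested.
tested = b"\xaa\xaa\xaa\xaa example channel data"
published_digest = sha256_hex(tested)

# The tested bytes pass; a same-length buffer of NUL bytes does not.
ok = verify_download(tested, published_digest)
corrupted_ok = verify_download(bytes(len(tested)), published_digest)
```

Here `ok` is true and `corrupted_ok` is false, so an emptied or zeroed download would be rejected before anything tries to parse it.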
        
         | janice1999 wrote:
         | It's still crazy that a security tool does not validate content
         | files it loads from disk that get regularly updated. Clearly
         | fuzzing was not a priority either.
        
           | Zigurd wrote:
           | How many years has this Crowdstrike code been running without
           | issues? You have put your finger on it: Fuzzing should have
           | been part of a test plan. Even TDD isn't a bastard test
           | engineer writing tests that probe edge cases. Even observing
           | that your unit tests have good code coverage isn't a
           | substitute for fuzzing. There is even a counter-argument that
           | something that has been reliable in the field should not be fixed
           | for reasons like failing a test case never seen in real
           | deployments, so why go making trouble.
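
To make the fuzzing point concrete, here is a small sketch of a fuzz harness. The file format (a `CHNL` magic followed by length-prefixed records), `parse_records`, and `ParseError` are all invented for illustration and say nothing about CrowdStrike's actual format. The property under test is that malformed input is always rejected with `ParseError` and never fails in any other way.

```python
import random
import struct


class ParseError(ValueError):
    """Raised for any malformed input; the only failure the parser may show."""


def parse_records(blob: bytes) -> list:
    """Parse a hypothetical format: 4-byte magic, then length-prefixed records."""
    if len(blob) < 4 or blob[:4] != b"CHNL":
        raise ParseError("bad magic")
    records, pos = [], 4
    while pos < len(blob):
        if pos + 2 > len(blob):
            raise ParseError("truncated length field")
        (size,) = struct.unpack_from("<H", blob, pos)
        pos += 2
        if size == 0 or pos + size > len(blob):
            raise ParseError("bad record size")
        records.append(blob[pos:pos + size])
        pos += size
    return records


def fuzz(iterations: int = 2000, seed: int = 0) -> int:
    """Feed edge-case and random inputs; return how many were rejected cleanly."""
    rng = random.Random(seed)
    # Hand-picked edge cases run first, including an all-NUL file.
    corpus = [b"", bytes(4096), b"CHNL", b"CHNL" + bytes(64)]
    rejected = 0
    for i in range(iterations):
        if i < len(corpus):
            blob = corpus[i]
        else:
            body = bytes(rng.randrange(256) for _ in range(rng.randrange(64)))
            # Half the time keep a valid magic so the record loop is exercised.
            blob = (b"CHNL" + body) if rng.random() < 0.5 else body
        try:
            parse_records(blob)
        except ParseError:
            rejected += 1
        # Any other exception propagates and fails the fuzz run.
    return rejected
```

A real harness would more likely use a coverage-guided fuzzer, but even this random loop covers the empty file and the all-zero file that unit tests built from realistic data tend to miss.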
        
         | password4321 wrote:
         | https://news.ycombinator.com/item?id=41006104#41006555
         | 
         |  _the flawed data was added in a post-processing step of the
         | configuration update, which is after it's been tested
         | internally but before it's copied to their update servers_
        
         | gbin wrote:
         | I don't understand: how did the signature even work? Please
         | please tell me those drivers are signed... Right? ...
        
       | Retr0id wrote:
       | While I won't discount it entirely, I think the people acting
       | like this (alone) implies malice are being very silly.
        
         | JumpCrisscross wrote:
         | It demonstrates terrible QC.
        
           | loloquwowndueo wrote:
           | Don't mess with Quebec :D
        
           | Retr0id wrote:
           | Clearly _something_ failed catastrophically, but it could
           | well be post-QC
        
             | cjbprime wrote:
             | There should be no "post-QC". You do gradual rollout across
             | the fleet, while checking your monitoring to ensure the
             | fleet hasn't gone down.
        
               | Retr0id wrote:
               | Non-gradual-rollout updates are an exacerbating factor,
               | but it isn't a root cause.
        
         | midtake wrote:
         | I disagree. In the current day the stakes are too high to
         | naively attribute software flaws to incompetence. We should
         | assume malice until it is ruled out, otherwise it will become a
         | vector for software implants. These are matters of national
         | security at this point.
        
           | Retr0id wrote:
           | You can (and should) want to identify the root cause, without
           | assuming malice.
        
           | gerdesj wrote:
           | "These are matters of national security at this point."
           | 
           | Which nation exactly? Who on earth "wins" by crashing vast
           | numbers of PCs worldwide?
           | 
           | Many of the potential foes you might be thinking of are
           | unlikely to actually run CS locally but it's bad for business
           | if your prey can't even boot their PCs and infra so you can
           | scam them.
           | 
           | I might allow for a bunch of old school nihilists getting off
           | on this sort of rubbish but it won't last and now an entire
           | class of security software, standards and procedures will be
           | fixed up. This is no deliberate "killer blow".
           | 
           | Who knew that well meaning security software running in Ring
           | 0 could fuck up big style if QA takes a long walk off a short
           | plank? Oh, anyone who worked in IT during the '90s and '00s!
           | I remember Sophos and McAfee (now Trellix) and probably
           | others managing to do something similar, back in the day.
           | 
           | Mono-cultures are subject to pretty catastrophic failures, by
           | definition. If you go all in with the same thing as everyone
           | else then if they sneeze, you will catch the 'flu too.
        
       | fourteenfour wrote:
       | At least it compressed well, which must have saved network
       | resources during the update. :)
        
         | sitkack wrote:
         | Resources like wall clock time.
        
       | bloopernova wrote:
       | Note to self: on Monday, add a null character check to pre-commit
       | hooks, and add the same check to pipelines.
        
         | Retr0id wrote:
         | It's perfectly normal for binary artifacts to contain null
         | bytes, even long runs of them.
        
           | bloopernova wrote:
           | Yeah, I'd need to figure it out properly, but for unicode
           | text files it should be OK. Good point about the binaries
           | though, thank you!
        
             | hawski wrote:
             | You say Unicode, but you mean UTF-8. Now for 16 bit Unicode
             | the story is different :)
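
A rough sketch of the pre-commit check discussed in this subthread, under the assumptions raised in the replies: only files that decode as UTF-8 are checked, and files starting with a UTF-16 byte order mark are skipped because NUL bytes are normal there. The function name is hypothetical.

```python
import codecs

UTF16_BOMS = (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def has_unexpected_nul(data: bytes) -> bool:
    """Return True if data looks like UTF-8 text yet contains NUL bytes.

    Files starting with a UTF-16 BOM are skipped, since NUL bytes are normal
    there. Data that is not valid UTF-8 is treated as binary and skipped too.
    """
    if data.startswith(UTF16_BOMS):
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False  # likely a binary artifact; out of scope for this check
    return b"\x00" in data
```

Two limits are worth stating. UTF-16 text without a BOM still decodes as UTF-8 when it is pure ASCII, so it would be flagged. And a binary artifact made only of NUL bytes is also flagged, because a run of NULs is technically valid UTF-8.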
        
         | MilStdJunkie wrote:
         | I mentioned it in a separate parent, but null purge is - for
         | the stuff I work with - completely non-negotiable. Nulls seem
         | to break virtually everything, just by existing. Furthermore,
         | old-timey PDFs are chock full of the things, for God knows what
         | reason, and a huge amount of data I work with are old-timey
         | PDF.
        
           | toast0 wrote:
           | > Furthermore, old-timey PDFs are chock full of the things,
           | for God knows what reason, and a huge amount of data I work
           | with are old-timey PDF.
           | 
           | Probably UCS-2/UTF-16 encoding with ascii data.
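
A quick illustration of that guess: ASCII text stored as UTF-16 carries one NUL byte per character, because the high byte of every code unit is zero. The sample string is arbitrary.

```python
text = "PDF"

utf8 = text.encode("utf-8")        # one byte per ASCII character
utf16 = text.encode("utf-16-le")   # two bytes per character, high byte is 0

nul_count = utf16.count(b"\x00")   # one NUL for each ASCII character
```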
        
         | mrguyorama wrote:
         | The problem ISN'T the null character though. The problem is
         | that they tested the system, THEN changed stuff, then uploaded
         | the changed stuff.
         | 
         | Their standard methodology was to deploy untested stuff.
        
       | thrill wrote:
       | So ... the checksum of all the Seinfeld episodes?
        
         | geor9e wrote:
         | Explain joke please
        
           | themagicteeth wrote:
           | Seinfeld is a show about nothing
           | 
           | http://seinfeldscripts.com/ThePitch.htm
        
           | WhyCause wrote:
           | Seinfeld was "a show about nothing."
        
       | kristjansson wrote:
       | Something like
       | 
       |     zero_output_file(fh, len(file))
       |     flush()
       |     fill_output_file(fh, data)
       | 
       | with an oops in line 3?
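
The conjecture above can be reproduced in a few lines. This is purely a simulation of the hypothesized sequence, not a claim about what actually happened; the function name, the file name, and the `fail_before_fill` switch are invented for the demonstration.

```python
import os
import tempfile


def zero_then_fill(path: str, data: bytes, fail_before_fill: bool = False) -> None:
    """Preallocate the file with zeros, then overwrite it with the real data."""
    with open(path, "wb") as fh:
        fh.write(bytes(len(data)))   # step 1: zero the output file
        fh.flush()                   # step 2: the zeros are now visible to readers
        if fail_before_fill:
            raise RuntimeError("simulated failure before the fill step")
        fh.seek(0)
        fh.write(data)               # step 3: the fill that never happens


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "channel.bin")   # hypothetical file name
    payload = b"real definitions"
    try:
        zero_then_fill(target, payload, fail_before_fill=True)
    except RuntimeError:
        pass
    with open(target, "rb") as fh:
        left_on_disk = fh.read()

# The file has the right size but contains nothing except NUL bytes.
all_nul = left_on_disk == bytes(len(payload))
```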
        
       | AndrewKemendo wrote:
       | This should not have passed a competent CI pipeline for a system
       | in the critical path.
       | 
       | I'm not even particularly stringent when it comes to automated
       | test across-the-board but for this level of criticality of
       | system, you need exceptionally good state management
       | 
       | To the point where you should not roll to production without an
       | integration test on every environment that you claim to support
       | 
       | Like it's insane to me that this size and criticality of a
       | company doesn't have a staging or even a development test server
       | that tests all of the possible target images that they claim to
       | support.
       | 
       | Who is running stuff over there - total incompetence
        
         | martinky24 wrote:
         | A lot of assumptions here that probably aren't worth making
         | without more info -- For example it could certainly be the case
         | that there was a "real" file that worked and the bug was in the
         | "upload verified artifact to CDN code" or something, at which
         | point it passes a lot of things before the failure.
         | 
         | We don't have the answers, but I'm not in a rush to assume that
         | they don't test anything they put out at all on Windows.
        
           | EvanAnderson wrote:
           | I haven't seen the file, but surely each build artifact
           | should be signed and verified when it's loaded by the client.
           | The failure mode of bit rot / malice in the CDN should be
           | handled.
        
             | gjsman-1000 wrote:
             | Perhaps - but if I made a list of all of the things your
             | company _should_ be doing and didn't, or even things that
             | your side project _should_ be doing and didn't, or even
             | things in your personal life that you _should_ be doing and
             | haven't, I'm sure it would be very long.
        
               | EvanAnderson wrote:
               | A company deploying kernel-mode code that can render huge
               | numbers of machines unusable should have done better.
               | It's one of those "you had one job" kind of situations.
               | 
               | They would be a gigantic target for malware. Imagine
               | pwning a CDN to pwn millions of client computers. The CDN
               | being malicious would be a major threat.
        
               | soraminazuki wrote:
               | Oh, they have one job for sure. Selling compliance. All
               | else isn't their job, including actual security.
               | 
               | Antiviruses are security cosplay that works by using a
               | combination of bug-riddled custom kernel drivers and
               | unsandboxed C++ parsers running with the highest level of
               | privileges to tamper with every bit of data it can get
               | its hands on. They violate every security common sense.
               | They also won't even hesitate to disable or delay
               | rollouts of actual security mechanisms built into
               | browsers and OSes if it gets in the way.
               | 
               | The software industry needs to call out this scam and put
               | them out of business sooner than later. This has been the
               | case for at least a decade or two and it's sad that
               | nothing has changed.
               | 
               | https://ia801200.us.archive.org/1/items/SyScanArchiveInfo
               | con... https://robert.ocallahan.org/2017/01/disable-your-
               | antivirus-...
        
               | heraldgeezer wrote:
               | Nope, I have seen software like Crowdstrike, S1, Huntress
               | and Defender E5 stop active ransomware attacks.
        
               | soraminazuki wrote:
               | That anecdote doesn't justify installing gaping security
               | holes into the kernel with those tools. Actual security
               | requires knowledge, good practice, and good engineering.
               | Antiviruses can never be a substitute.
        
               | cduzz wrote:
               | Which is their "One Job" ?
               | 
               | Options include:
               | 
               | 1. protected systems always work even if things are
               | messed up
               | 
               | 2. protected systems are always protected even when
               | things are messed up
               | 
               | The two failure modes are exclusive; ideally you let the
               | end user decide what to do if the protection mechanism is
               | itself unstable.
               | 
               | One could suggest "the system must always work" but
               | that's ignoring that sometimes things don't go to plan.
               | 
               | None of the systems in boot loops were p0wned by known
               | exploits while they were boot looping. As far as we know
               | anyhow.
               | 
               | (edited to add the obvious default of "just make a
               | working system" which is of course both a given and not
               | going to happen)
        
               | bn-l wrote:
               | I think in this case it's reasonable for us to expect
               | that they are doing what they _should_ be doing.
        
               | jjav wrote:
               | > all of the things your company should be doing and
               | didn't
               | 
               | Processes need to match the potential risk.
               | 
               | If your company is doing some inconsequential social app
               | or whatever, then sure, go ahead and move fast and break
               | things if that's how you roll.
               | 
               | If you are a company, let's call them Crowdstrike, that
               | has access to push root privileged code to a significant
               | percentage of all machines on the internet, the minimum
               | quality bar is vastly higher.
               | 
               | For this type of code, I would expect a comprehensive
               | test suite that covers everything and a fleet of QA
               | machines representing every possible combination of
               | supported hardware and software (yes, possibly thousands
               | of machines). A build has to pass that and then get
               | rolled into dogfooding usage internally for a while. And
               | then very slowly gets pushed to customers, with
               | monitoring that nothing seems to be regressing.
               | 
               | Anything short of that is highly irresponsible given the
               | access and risk the Crowdstrike code represents.
        
               | Denvercoder9 wrote:
               | > A build has to pass that and then get rolled into
               | dogfooding usage internally for a while. And then very
               | slowly gets pushed to customers, with monitoring that
               | nothing seems to be regressing.
               | 
               | That doesn't work in the business they're in. They need
               | to roll out definition updates quickly. Their clients
               | won't be happy if they get compromised while CrowdStrike
               | was still doing the dogfooding or phased rollout of the
               | update that would've prevented it.
        
             | xyst wrote:
             | Hindsight is 20/20
             | 
             | This is a public company after all. In this market, you
             | don't become a "Top-Tier Cybersecurity Company At A Premium
             | Valuation" with amazing engineering practices.
             | 
             | Priority is sales, increasing ARR, and shareholders.
        
               | fsloth wrote:
               | This is the market. Good engineering practices don't hurt
               | but they are not mandatory. If Boeing can wing it so can
               | everybody.
        
               | StressedDev wrote:
               | Boeing has been losing market share to Airbus for
               | decades. That is what happens when you cannot fix your
               | problems, sell a safe product, keep costs in line, etc.
        
               | MBCook wrote:
               | That's too much of an excuse.
               | 
               | This isn't hindsight. It's "don't blow up 101" level
               | stuff they messed up.
               | 
               | It's not that this got past their basic checks, they
               | don't appear to have had them.
               | 
               | So let's ask a different question:
               | 
               | The file parser in their kernel extension clearly never
               | expected to run into an invalid file, and had no
               | protections to prevent it from doing the wrong thing _in
               | the kernel_.
               | 
               | How much you want to bet that module could be trivially
               | used to do a kernel exploit early in boot if you managed
               | to feed it your "update" file?
               | 
               | I bet there's a good pile of 0-days waiting to be found.
               | 
               | And this is _security software_.
               | 
               | This is "we didn't know we were buying rat poison to put
               | in the bagels" level dumb.
               | 
               | Not "hindsight is 20/20".
        
               | SoftTalker wrote:
               | Truly an "the emperor has no clothes" moment.
        
               | StressedDev wrote:
               | Not caring about the actual product will eventually kill
               | a company. All companies have to constantly work to
               | maintain and grow their customer base. Customers will
               | eventually figure out if a company is selling snake oil,
               | or a shoddy product.
               | 
               | Also, the tech industry is extremely competitive. Leaders
               | frequently become laggards or go out of business. Here
               | are some companies who failed or shrank because their
               | products could not compete: IBM, Digital Equipment, Sun,
               | Borland, Yahoo, Control Data, Lotus (later IBM),
               | Evernote, etc. Note all of these companies were at some
               | point at the top of their industry. They aren't anymore.
        
               | worik wrote:
               | > Not caring about the actual product will eventually
               | kill a company.
               | 
               | Eventually
               | 
               | By then the principals are all very rich, and no longer
               | care.
               | 
               | Do you think Bill Gates sleeps well?
        
               | geodel wrote:
               | Keyword is _eventually_. By then C-level would've been
               | retired. Others in top management would've changed
               | multiple jobs.
               | 
               | IMO the point is not where these past top companies are
               | now but where the top people in those companies are now.
               | I believe they end up in a very comfortable situation no
               | matter the place.
               | 
               | Exceptions of course would be criminal prosecution,
               | financial frauds etc.
        
             | AdamJacobMuller wrote:
             | The file was just full of null bytes.
             | 
             | It's very possible the signature validation and
             | verification happens after the bug was triggered.
        
               | wk_end wrote:
               | "Load a kernel module and _then_ verify it" is not the
               | way any remotely competent engineer would do things.
               | 
               | (...which doesn't rule out the possibility that CS was
               | doing it.)
        
               | justinclift wrote:
               | The ClownStrike Falcon software that runs on both Linux
               | and macOS was _incredibly_ flaky and a constant source of
               | kernel problems at my previous work place. We had to push
               | back on it regardless of the security team's (strongly
               | stated) wishes, just to keep some of the more critical
               | servers functional.
               | 
               | Pretty sure "competence" wasn't part of the job
               | description of the ClownStrike developers, at least for
               | those pieces. :( :( :(
        
               | soraminazuki wrote:
               | ClownStrike left kernel panics unfixed for a year until
               | macOS deprecated kernel extensions altogether. It was
               | scary because crash logs indicated that memory was
               | corrupted while processing network packets. It might've
               | been exploitable.
        
               | usr1106 wrote:
               | Haven't used Windows for close to 15 years, but I read
               | the file is (or rather is supposed to be) an NT kernel
               | driver.
               | 
               | Are those drivers signed? Who can sign them? Only
               | Microsoft?
               | 
               | If it's true the file contained nothing but zeros, that
               | also seems to be a kernel vulnerability. Even if signing
               | were not mandatory, shouldn't the kernel check for some
               | structure, symbol tables or the like before
               | proceeding?
        
               | dagaci wrote:
               | Think more: imagine that your CrowdStrike security
               | layer detects an 'unexpected' kernel-level data file.
               | 
               | Choice #1: Disable security software and continue.
               | Choice #2: Stop. BSOD message: contact your administrator.
               | 
               | There may be nothing wrong with the drivers.
        
               | derefr wrote:
               | Choice #3 structure the update code so that verifying the
               | integrity of the update (in kernel mode!) is upstream of
               | installing the update / removing the previous definitions
               | package, such that a failed update (for _whatever_
               | reason) results in the definitions remaining in their
               | existing pre-update state.
               | 
               | (This is exactly how CPU microcode updates work -- the
               | CPU "takes receipt" of the new microcode package, and
               | integrity-verifies it internally, before starting to do
               | anything involving updating.)
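
A user-space sketch of the ordering described as Choice #3: verify the candidate first, and only then swap it in, so a failed check leaves the existing definitions untouched. A SHA-256 digest stands in for whatever integrity or signature check a real product would use, and the function name, file names, and `.staged` suffix are assumptions made for the example.

```python
import hashlib
import os
import tempfile


def install_update(live_path: str, candidate: bytes, expected_sha256: str) -> bool:
    """Replace live_path with candidate only if the candidate verifies.

    Returns True when the swap happened, False when the old file was kept.
    """
    if hashlib.sha256(candidate).hexdigest() != expected_sha256:
        return False                       # keep the existing definitions
    staging_path = live_path + ".staged"   # hypothetical naming scheme
    with open(staging_path, "wb") as fh:
        fh.write(candidate)
        fh.flush()
        os.fsync(fh.fileno())              # ask the OS to push the bytes to disk
    os.replace(staging_path, live_path)    # rename over the old file (atomic on POSIX)
    return True


with tempfile.TemporaryDirectory() as tmp:
    live = os.path.join(tmp, "definitions.bin")
    with open(live, "wb") as fh:
        fh.write(b"old definitions")

    good = b"new definitions"
    digest = hashlib.sha256(good).hexdigest()

    rejected = install_update(live, bytes(len(good)), digest)  # all-NUL candidate
    with open(live, "rb") as fh:
        after_bad = fh.read()

    accepted = install_update(live, good, digest)
    with open(live, "rb") as fh:
        after_good = fh.read()
```

The all-NUL candidate is refused and the old file survives; the good candidate replaces it. Whether "keep the old definitions" is the right failure policy is exactly what the surrounding comments disagree on.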
        
               | warkdarrior wrote:
               | > a failed update (for whatever reason) results in the
               | definitions remaining in their existing pre-update state
               | 
               | Fantastic solution! You just gave the attackers a way to
               | stop all security updates to the system.
        
               | rahkiin wrote:
               | The file was data used by the actual driver like some
               | virus database. It is not code loaded by the kernel
        
               | poizan42 wrote:
               | No the file is not a driver. It's a file loaded by a
               | driver, some sort of threat/virus definition file I
               | think?
               | 
               | And yes Windows drivers are signed. If it had been a
               | driver it would just have failed to load. Nowadays they
               | must be signed by Microsoft, see
               | https://learn.microsoft.com/en-us/windows-
               | hardware/drivers/d...
        
               | MBCook wrote:
               | That was my read.
               | 
               | The kernel driver was signed. The file it loaded as input
               | with garbage data had seemingly no verification on it at
               | all, and it crashed the driver and therefore the kernel.
        
               | usr1106 wrote:
               | Hmm, the driver must be signed (by Microsoft I assume).
               | So they sign a driver which in turn loads unsigned files.
               | That does not seem to be good security.
        
               | anonymfus wrote:
               | NT kernel drivers are Portable Executables, and the
               | kernel does such checks, displaying a BSOD with stop code
               | 0xC0000221 STATUS_IMAGE_CHECKSUM_MISMATCH if something
               | went wrong.
               | 
               | https://learn.microsoft.com/en-us/windows-
               | hardware/drivers/d...
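
For a sense of why an all-zero file cannot pass as a Portable Executable, here is a cheap structural check. It looks only at the DOS "MZ" magic and the "PE" signature that the header's `e_lfanew` field (offset 0x3C) points to. This is an illustration of the file layout, not the checksum validation the kernel performs, and `looks_like_pe` is a made-up helper.

```python
import struct


def looks_like_pe(image: bytes) -> bool:
    """Check for the DOS 'MZ' magic and the 'PE' signature it points to."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        return False
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)   # e_lfanew field
    # An out-of-range offset yields an empty slice, which fails the comparison.
    return image[pe_offset:pe_offset + 4] == b"PE\x00\x00"


# Smallest buffer that satisfies the check: DOS header stub, then the signature.
minimal = b"MZ" + bytes(0x3A) + struct.pack("<I", 0x40) + b"PE\x00\x00"

valid = looks_like_pe(minimal)
zeros = looks_like_pe(bytes(4096))
```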
        
             | chatmasta wrote:
             | The _actual_ bug is not that they pushed out a data file
             | with all nulls. It's that their kernel module crashes when
             | it reads this file.
             | 
             | I'm not surprised that there is no test pipeline for new
             | data files. Those aren't even really "build artifacts." The
             | software assumes they're just data.
             | 
             | But I am surprised that the kernel module was deployed with
             | a bug that crashed on a data file with all nulls.
             | 
             | (In fact, it's so surprising, that I wonder if there is a
             | known failing test in the codebase that somebody marked
             | "skip" and then someone else decided to prove a point...)
             | 
             | Btw: is that bug in the kernel module even fixed? Or did
             | they just delete the data file filled with nulls?
        
               | hansvm wrote:
               | > Btw: is that bug in the kernel module even fixed? Or
               | did they just delete the data file filled with nulls?
               | 
               | Is that a real question? They definitely didn't do
               | anything more than delete the file, perhaps just rename
               | it.
        
               | SoftTalker wrote:
               | The instructions that my employer emailed were:
               | 
               | 1. Start Windows in Safe Mode or the Windows Recovery
               |    Environment (Windows 11 option).
               | 2. Navigate to the C:\Windows\System32\drivers\CrowdStrike
               |    directory.
               | 3. Locate the file matching C-00000291*.sys and delete it.
               | 4. Restart your device normally.
        
           | chrisjj wrote:
           | > it could certainly be the case that there was a "real" file
           | that worked and the bug was in the "upload verified artifact
           | to CDN code" or something
           | 
           | I.e. only one link in the chain wasn't tested.
           | 
           | Sorry, but that will not do.
           | 
           | > We don't have the answers, but I'm not in a rush to assume
           | that they don't test anything they put out at all on Windows.
           | 
           | The parent post did not suggest they don't test anything. It
           | suggested they did not test the whole chain.
        
             | martinky24 wrote:
             | From the parent comment:
             | 
             | > it's insane to me that this size and criticality of a
             | company doesn't have a staging or even a development test
             | server that tests all of the possible target images that
             | they claim to support
             | 
             | I know nothing about Crowdstrike, but I can guarantee that
             | "they need to test target images that they claim to
             | support" isn't what went wrong here. The implication that
             | they don't test against Windows is so incredible, it's
             | hard to take the poster of that comment seriously.
        
               | StressedDev wrote:
               | Thank you for pointing this out. Whenever I read articles
               | about security, or reliability failures, it seems like
               | the majority of the commenters assume that the person or
               | organization which made the mistake is a bunch of bozos.
               | 
               | The fact is mistakes happen (even huge ones), and the
               | best thing to do is learn from the mistakes. The other
               | thing people seem to forget is they are probably doing a
               | lot of the same things which got CrowdStrike into
               | trouble.
               | 
               | If I had to guess, one problem may be that CrowdStrike's
               | Windows code did not validate the data it received from
               | the update process. Unfortunately, this is very common.
               | The lesson is to validate any data received from the
               | network, from an update process, received as user input,
               | etc. If the data is not valid, reject it.
               | 
               | Note I bet at least 50% of the software engineers
               | commenting in this thread do not regularly validate
               | untrusted data.
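
The "validate it, and reject it if invalid" advice might look like the sketch below. The container format (a `DEFS` magic, a length, and a CRC-32, followed by the payload) is invented for the example, as are `pack` and `load`; nothing here reflects the real file layout.

```python
import struct
import zlib
from typing import Optional

MAGIC = b"DEFS"                      # hypothetical 4-byte file signature
HEADER = struct.Struct("<4sII")      # magic, payload length, CRC-32 of payload


def pack(payload: bytes) -> bytes:
    """Build a file in the hypothetical format (used here to make test data)."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def load(blob: bytes) -> Optional[bytes]:
    """Return the payload if every check passes, otherwise None."""
    if len(blob) < HEADER.size:
        return None
    magic, length, crc = HEADER.unpack_from(blob)
    if magic != MAGIC:
        return None
    payload = blob[HEADER.size:]
    if len(payload) != length or zlib.crc32(payload) != crc:
        return None
    return payload
```

With these checks, `load(bytes(4096))` returns `None` instead of handing zeros to the rest of the program. The magic check matters more than it looks: a header of all zeros claims length 0 and CRC 0, which is self-consistent for an empty payload.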
        
         | 0xcafecafe wrote:
         | They could even have done slow rollouts. Roll it out to a
         | geographical region and wait an hour or so before deploying
         | elsewhere.
        
           | xyst wrote:
           | Or test in local environments first. Slow rollouts like this
           | tend to make deployments very very painful.
        
             | koliber wrote:
             | Slow rollouts can be quite quick. We used to do 3-day
             | rollouts. Day one was a tiny fraction. Day two was about
             | 20%. Day three was a full rollout.
             | 
             | It was ages ago, but from what I remember, the first day
             | rollout did occasionally catch issues. It only affected a
             | small number of users and the risk was within the tolerance
             | window.
             | 
             | We also tested locally before the first rollout.
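
A staged rollout like the one described can be driven by a stable hash of each host ID, so the same machines stay in the early cohort from stage to stage. In this sketch the 1% first stage is an assumption standing in for "a tiny fraction", and the host IDs and function names are hypothetical.

```python
import hashlib

# Cumulative share of the fleet per stage, in units of 1/10,000.
# Stage 0 (1%) is an assumed value; stages 1 and 2 are 20% and 100%.
STAGES = [100, 2_000, 10_000]


def bucket(host_id: str) -> int:
    """Map a host ID to a stable bucket in range(10_000)."""
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 10_000


def in_stage(host_id: str, stage: int) -> bool:
    """Return True if the host should have the update at the given stage."""
    return bucket(host_id) < STAGES[stage]


fleet = [f"host-{n}" for n in range(10_000)]   # hypothetical host IDs
counts = [sum(in_stage(h, s) for h in fleet) for s in range(len(STAGES))]
```

Because the thresholds are cumulative, every host included at one stage is also included at all later stages, and the last stage always covers the whole fleet. Advancing from one stage to the next would be gated on monitoring, which is outside this sketch.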
        
               | rplnt wrote:
               | I don't know about this particular update, but when I
               | used to work for an AV vendor we did like 4 "data"
               | updates a day. It is/was about being quick a lot of the
               | time, you can't stage those over 3 days. Program updates
               | are different, drivers of this level were very different
               | (Microsoft had to sign those, among many things).
               | 
               | Not that it excuses anything, just that this probably
               | wasn't treated as an update at all.
        
           | daseiner1 wrote:
           | You say _even_ (emphasis mine). Is this not industry
           | standard?
        
           | saati wrote:
           | In theory CrowdStrike protects you from threats, leaving
           | regions unprotected for an hour would be an issue.
        
             | Thaxll wrote:
             | Not really, even security updates are not needed by the
             | minute. Do you think Microsoft rolls out worldwide updates
             | to everyone at once?
        
         | notabee wrote:
         | Without delving into any kind of specific conspiratorial
         | thinking, I think people should also include the possibility
         | that this was malicious. It's much more likely to be
         | incompetence and hubris, but ever since I found out that this
         | is basically an authorized rootkit, I've been concerned about
         | what happens if another Solarwinds incident occurs with
         | Crowdstrike or another such tool. And either way, we have the
         | answer to that question now: it has extreme consequences. We
         | really need to end this blind checkbox compliance culture and
         | start doing real security.
        
         | sonotathrowaway wrote:
         | That's not even getting into the fuckups that must have
         | happened to allow a bad patch to get rolled out everywhere all
         | at once.
        
         | carterschonwald wrote:
         | The strange thing is that when I interviewed there years ago
         | with the team that owns the language that runs in the kernel,
         | they said their CI has 20k or 40k machine/OS
         | combinations/configurations. Surely some of them were vanilla
         | Windows!
        
           | dboreham wrote:
           | They used synthetic test data in CI that doesn't consist of
           | zeros.
        
             | dlisboa wrote:
             | Fuzz testing would've saved the day here.
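             | 
             | A minimal sketch of what that could look like, assuming a
             | toy record format. MAGIC, parse and fuzz are hypothetical
             | names and the layout is invented for illustration; it is
             | not CrowdStrike's real file format.

```python
import random
import struct
from typing import List

MAGIC = b"SIG1"  # hypothetical magic number, invented for this sketch


def parse(blob: bytes) -> List[bytes]:
    """Parse MAGIC + uint32 count + count * (uint16 length + payload)."""
    if len(blob) < 8 or blob[:4] != MAGIC:
        raise ValueError("bad header")
    (count,) = struct.unpack_from("<I", blob, 4)
    records, offset = [], 8
    for _ in range(count):
        if offset + 2 > len(blob):
            raise ValueError("truncated length field")
        (length,) = struct.unpack_from("<H", blob, offset)
        offset += 2
        if offset + length > len(blob):
            raise ValueError("truncated payload")
        records.append(blob[offset:offset + length])
        offset += length
    return records


def fuzz(iterations: int = 2000, seed: int = 0) -> int:
    """Feed all-NUL and random inputs; anything but ValueError is a bug."""
    rng = random.Random(seed)
    inputs = [bytes(n) for n in (0, 1, 8, 4096)]  # all-NUL files
    for _ in range(iterations):
        tail = bytes(rng.randrange(256) for _ in range(rng.randrange(64)))
        inputs.append(MAGIC + tail)
    rejected = 0
    for blob in inputs:
        try:
            parse(blob)
        except ValueError:
            rejected += 1  # a clean rejection is the expected outcome
    return rejected
```

             | Any exception other than ValueError escapes the loop and
             | fails the run, which is the signal a fuzzer looks for.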
        
               | azemetre wrote:
               | I'm sure some team had it in their backlog for years.
        
               | 0x6c6f6c wrote:
               | Oh yeah, FEAT#927261? Would love to see that ticket go
               | out
        
               | queuebert wrote:
               | That team was probably laid off because they weren't
               | shipping product fast enough.
        
         | russdill wrote:
         | You can have all the CI, staging, test, etc. If some bug after
         | that process nulls the file, the rest doesn't matter
        
           | Jtsummers wrote:
           | If a garbage file is pushed out, the program could have
           | handled it by ignoring it. In this case, it did not and now
           | we're (the collective IT industry) dealing with the
           | consequences of one company that can't be bothered to
           | validate its input (they aren't the only ones, but this is a
           | particularly catastrophic demonstration of the importance of
           | input validation).
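           | 
           | A minimal sketch of "ignore the garbage file", assuming the
           | loader can fall back to the last content it accepted. The
           | MAGIC header and the function names are hypothetical; the
           | real file format is not described in this thread.

```python
from typing import Optional

MAGIC = b"SIG1"  # hypothetical header, invented for this sketch


def looks_valid(blob: bytes) -> bool:
    """Cheap sanity checks before the content is handed to a parser."""
    if len(blob) < len(MAGIC):
        return False
    if blob.count(0) == len(blob):  # nothing but NUL bytes
        return False
    return blob.startswith(MAGIC)


def choose_content(new: bytes, last_good: Optional[bytes]) -> Optional[bytes]:
    """Prefer the new file, but keep the previous one if the new one is garbage."""
    return new if looks_valid(new) else last_good
```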
        
             | russdill wrote:
             | I'll agree that this appears to have been preventable.
             | Whatever goes through CI should have a hash, deployment
             | should validate that hash, and the deployment system itself
             | should be rigorously tested to ensure it breaks properly if
             | the hash mismatches at some point in the process.
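             | 
             | A minimal sketch of that check, assuming CI publishes a
             | SHA-256 hex digest next to the artifact. sha256_hex and
             | verify_artifact are hypothetical helper names.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Digest that CI would record for the artifact it tested."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_hex: str) -> None:
    """Raise instead of deploying when the bytes differ from what CI hashed."""
    actual = sha256_hex(data)
    if not hmac.compare_digest(actual, expected_hex):
        raise ValueError(f"hash mismatch: expected {expected_hex}, got {actual}")
```

             | A file that was zeroed after testing has the right size but
             | the wrong digest, so verify_artifact raises.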
        
           | fabian2k wrote:
           | Those signature files should have a checksum, or even a
           | digital signature. I mean even if it doesn't crash the entire
           | computer, a flipped bit in there could still turn the entire
           | thing against a harmless component of the system and lead to
           | the same result.
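           | 
           | A minimal sketch of the idea using only the Python standard
           | library. It uses an HMAC (a shared-key tag) as a stand-in;
           | a real digital signature would use an asymmetric key pair.
           | The names sign and is_authentic are hypothetical.

```python
import hashlib
import hmac


def sign(data: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the file content."""
    return hmac.new(key, data, hashlib.sha256).digest()


def is_authentic(data: bytes, tag: bytes, key: bytes) -> bool:
    """True only if the tag matches; a single flipped bit makes this fail."""
    return hmac.compare_digest(sign(data, key), tag)
```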
        
             | HL33tibCe7 wrote:
             | What happens when your mechanism for checksumming doesn't
             | work? What happens when your mechanism for installing after
             | the checksum is validated doesn't work?
             | 
             | It's just too early to tell what happened here.
             | 
             | The likelihood is that it _was_ negligence. But we need a
             | proper post-mortem to be able to determine one way or
             | another.
        
           | LorenPechtel wrote:
           | Yup. I had quite a battle with some sort of system bug (never
           | fully traced) where I wrote valid data but what ended up on
           | disk was all zero. It appeared to involve corrupted packets
           | being accepted as valid.
           | 
           | It doesn't matter how much you test if something down the
           | line zeroes out your stuff.
        
           | Cerium wrote:
           | What sort of sane system modifies the build output after
           | testing?
           | 
           | Our release process is more like: build and package, sign
           | package, run CI tests on signed package, run manual tests on
           | signed package, release signed package. The deployment
           | process should check those signatures. A test process should
           | by design be able to detect any copy errors between test and
           | release in a safe way.
        
         | hnlmorg wrote:
         | It seems unlikely that a file entirely full of null characters
         | was the output of any automated build pipeline. So I'd wager
         | something got built, passed the CI tests, then the system broke
         | at some point after that when the file was copied ready for
         | deployment.
         | 
         | But at this stage, all we are doing is speculating.
        
         | dagaci wrote:
         | /* Acceptance criteria #1: do not allow machine to boot if
         | invalid data signatures are present, this could indicate a
         | compromised system. Booting could cause presidents diary to
         | transmit to rival 'Country' of the week */
         | 
         | if(dataFileIsNotValid) { throw FatalKernelException("All your
         | base are compromised"); }
         | 
         | EDIT+ Explanation:
         | 
         | With hindsight not booting may be exactly the right thing to do
         | since a bad datafile would indicate a compromised distribution/
         | network.
         | 
         | The machines should not fully boot until a file with a valid
         | signature is downloaded.
        
         | arp242 wrote:
         | > Like it's insane to me that this size and criticality of a
         | company doesn't have a staging or even a development test
         | server that tests all of the possible target images that they
         | claim to support.
         | 
         | Who is saying they don't have that? Who is saying it didn't
         | pass all of that?
         | 
         | You're making tons of assumptions here.
        
           | martinky24 wrote:
           | Yeah... the comment above reads like someone who has read a
           | lot of books on CI deployment, but has zero experience in a
           | real world environment actually doing it. Quick to throw
           | stones with absolutely no understanding of any of the nuances
           | involved.
        
             | chrisjj wrote:
             | So let's hear the "nuances" that excuse this.
        
               | cweld510 wrote:
               | It's not a matter of excusing or not excusing it.
               | Incidents like this one happen for a reason, though, and
               | the real solution is almost never "just do better."
               | 
               | Presumably crowdstrike employs some smart engineers. I
               | think it's reasonable to assume that those engineers know
               | what CI/CD is, they understand its utility, and they've
               | used it in the past, hopefully even at Crowdstrike.
               | Assuming that this is the case, then how does a bug like
               | this make it into production? Why aren't they doing the
               | things that would have prevented this? If they cut
               | corners, why? It's not useful or productive to throw
               | around accusations or demands for specific improvements
               | without answering questions like these.
        
               | arp242 wrote:
               | I am not defending or excusing anything. I am saying
               | there is not enough information to make a judgement one
               | way or the other. Right now, we have almost zero
               | technical details.
               | 
               | Call me old-fashioned and boring, but I'd like to have
               | some basic facts about the situation first. After this I
               | decide who does and doesn't deserve a bollocking.
        
               | chrisjj wrote:
               | I think we do have enough info to judge, e.g.: "This should
               | not have passed a competent CI pipeline for a system in
               | the critical path."
               | 
               | That info includes that the faulty file consisted
               | entirely of zeros.
        
               | arp242 wrote:
               | > That info includes that the faulty file consisted
               | entirely of zeros.
               | 
               | Even that is not certain. Some people are reporting that
               | this isn't the case and that the all-zeroed file may be a
               | "quick hack" to send out a no-op.
               | 
               | So no, we have very little info.
        
               | jacobr1 wrote:
               | Not an excuse - they should be testing for this exact
               | thing - but Crowdstrike (and many similar security tools)
               | have a separation between "signature updates" and
               | "agent/code" updates. My (limited) reading of this
               | situation is that this was an update of their "data", not
               | the application. Now apparently the dynamic update
               | included operating code, not just something the
               | equivalent of a yaml file or whatever, but I can see how
               | different kinds of changes like this go through different
               | pipelines. Of course, that is all the more reason to
               | ensure you have integration coverage.
        
             | AndrewKemendo wrote:
             | There is no nuance needed - this is a giant corporation
             | that sells kernel layer intermediation at global scale. You
             | better be spending billions on bulletproof deployment
             | automation because *waves hands around in the air pointing
             | at what's happening just like with solarwinds*
             | 
             | Bottom line this was avoidable and negligent
             | 
             | For the record I owned global infrastructure as CTO for the
             | USAF Air Operations weapons system - one of the largest
             | multi-classification networked IT systems ever created for
             | the DoD - even moreso during a multi-region refactor as a
             | HQE hire into the AF
             | 
             | So I don't have any patience for millionaires not putting
             | the work in when it's critical infrastructure
             | 
             | People need to do better and we need accountability for
             | people making bad decisions for money saving
        
               | arp242 wrote:
               | Almost everything that goes wrong in the world is
               | avoidable one way or the other. Simply stating "it was
               | avoidable" as an axiom is simplistic to the point of
               | silliness.
               | 
               | Lots of very smart people have been hard at work to
               | prevent airplanes from crashing for many decades now, and
               | planes still crash for all sorts of reasons, usually
               | considered "avoidable" in hindsight.
               | 
               | Nothing is "bulletproof"; this is a meaningless buzzword
               | with no content. The world is too complex for this.
        
               | HL33tibCe7 wrote:
               | > You better be spending billions on bulletproof
               | deployment automation
               | 
               | There is no such thing.
        
           | JKCalhoun wrote:
           | To be sure. But the fact is the release broke.
           | 
           | I'm not sure: is having test servers that it passed any
           | better than none at all?
        
             | martinky24 wrote:
             | Yes, yes it is. Because there's tons more breakages that
             | have likely been caught.
             | 
             | One uncaught downstream failure doesn't invalidate the
             | effort into all the previously caught failures.
        
             | strken wrote:
             | It is absolutely better to catch some errors than none.
             | 
             | In this case it gives me vibes of something going wrong
             | _after_ the CI pipeline, during the rollout. Maybe they
             | needed advice a bit more specific than "just use a staging
             | environment bro", like "use checksums to verify a release
             | was correctly applied before cutting over to the new
             | definitions" and "do staged rollouts, and ideally release
             | to some internal canary servers first".
        
               | exe34 wrote:
               | I don't understand why you wouldn't do staged roll outs
               | at this scale. even a few hours delay might have been
               | enough to stop the release from going global.
        
               | martinky24 wrote:
               | "Have these idiots even heard of CI/CD???" strangely
               | seems to be a common condescending comment in this
               | thread.
               | 
               | I honestly thought HN was slightly higher quality than
               | most of the comments here. I am proven wrong.
        
               | kristjansson wrote:
               | Big threads draw a lot of people; we regress toward the
               | mean
        
               | StressedDev wrote:
               | Agreed - The worst part is most of the people making
               | these unhelpful comments are probably doing the same
               | sorts of things which caused this outage.
        
               | chuckadams wrote:
               | > I honestly thought HN was slightly higher quality
               | 
               | HN reminds me of nothing so much as Slashdot in the early
               | 2000's, for both good and ill. Fewer stupid memes about
               | Beowulf Clusters and Natalie Portman tho.
        
               | chuckadams wrote:
               | They almost certainly have such a process, but it got
               | bypassed by accident, probably got put into a "minor
               | updates" channel (you don't run your model checker every
               | time you release a new signature file after all).
               | Surprise, business processes have bugs too.
               | 
               | But naw, must be every random commentator on HN knows how
               | to run the company better.
        
             | chatmasta wrote:
             | The release didn't break. A data file containing nulls was
             | downloaded by a buggy kernel module that crashed when
             | reading the file.
             | 
             | For all we know there is a test case that failed and they
             | decided to push the module anyway ("it's not like anyone is
             | gonna upload a file of all nulls").
             | 
             | Btw: where are these files sourced from? Could a malicious
             | Crowdstrike customer trick the system into generating this
             | data file, by e.g. reporting it saw malware with these
             | (null) signatures?
        
             | leptons wrote:
             | A lot of the software industry focuses on strong types,
             | testing of all kinds, linting, and plenty of other
             | sideshows that make programmers feel like they're in
             | control, but these things only account for the problems you
             | can test for and the systems you control. So what if a
             | function gets a null instead of a float? It shouldn't crash
             | half the tech-connected world. Software resilience is kind
             | of lacking in favor of trusting that strong types and tests
             | will catch most bugs, and that's good enough?
        
           | ikiris wrote:
           | Dude, the fact that it breaks directly.
           | 
           | You sound like the guy that a few years ago tried to argue
           | (the company in question) tested os code that didn't include
           | any drivers for their gear's local storage. It's obvious it
           | wasn't to anyone competent.
        
         | dheera wrote:
         | I don't know if people on Microsoft ecosystems even know what
         | CI pipelines are.
         | 
         | Linux and Unix ecosystems in general work by people thoroughly
         | testing and taking responsibility for their work.
         | 
         | Windows ecosystems work by blame passing. Blame Ron, the IT
         | guy. Blame Windows Update. Blame Microsoft. That's how stuff
         | works.
         | 
         | It has always worked this way.
         | 
         | But also, all the _good_ devs got offered 3X the salary at
         | Google, Meta, and Apple. Have you ever applied for a job at
         | CrowdStrike? No? That's why they suck.
         | 
         | * A disproportionately large number of Windows IT guys are
         | named Ron, in my experience.
        
           | kabdib wrote:
           | That's a pretty broad brush.
        
         | miki123211 wrote:
         | Keep in mind that this was probably a data file, not
         | necessarily a code file.
         | 
         | It's possible that they run tests on new commits, but not when
         | some other, external, non-git system pushes out new data.
         | 
         | Team A thinks that "obviously the driver developers are going
         | to write it defensively and protect it against malformed data",
         | team B thinks "obviously all this data comes from us, so we
         | never have to worry about it being malformed"
         | 
         | I don't have any non-public info about what actually happened,
         | but something along these lines seems to be the most likely
         | hypothesis to me.
         | 
         | Edit: Now what would have helped here is a "staged rollout"
         | process with some telemetry. Push the update to 0.01% of your
         | users and solicit acknowledgments after 15 minutes. If the vast
         | majority of systems are still alive and haven't been restarted,
         | keep increasing the threshold. If, at any point, too many of
         | the updated systems stop responding or indicate a failure,
         | immediately stop the rollout, page your on-call engineers and
         | give them a one-click process to completely roll the update
         | back, even for already-updated clients.
         | 
         | This is exactly the kind of issue that non-invasive, completely
         | anonymous, opt-out telemetry would have solved.
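         | 
         | A minimal sketch of that control loop, assuming a callback
         | that reports the share of updated machines that checked in
         | after the waiting period. The stage fractions after 0.01%,
         | the 99% threshold and the name run_rollout are illustrative
         | assumptions.

```python
from typing import Callable, List

# Hypothetical stages: fraction of the fleet that should have the update.
STAGES = [0.0001, 0.01, 0.10, 0.50, 1.00]


def run_rollout(
    healthy_fraction: Callable[[float], float],
    min_healthy: float = 0.99,
) -> List[float]:
    """Advance stage by stage; stop at the first stage whose check-ins look bad.

    healthy_fraction(stage) stands in for the telemetry step: it returns the
    share of updated machines that acknowledged after the waiting period.
    Returns the list of stages that completed successfully.
    """
    completed: List[float] = []
    for stage in STAGES:
        if healthy_fraction(stage) < min_healthy:
            break  # halt here; paging and rollback would be triggered
        completed.append(stage)
    return completed
```

         | With a file that crashes every machine, the callback reports
         | close to 0.0 at the first stage and the rollout never gets
         | past 0.01% of the fleet.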
        
           | adzm wrote:
           | This was a .dll in all but name fwiw.
        
         | ar_lan wrote:
         | > tests all of the possible target images that they claim to
         | support.
         | 
         | Or even at the very least the most popular OS that they
         | support. I'm genuinely imagining right now that for this
         | component, the entirety of the company does not have a single
         | Windows machine they run tests on.
        
       | bryanlarsen wrote:
       | Segue: What the heck is it about Windows files and null characters?
       | I've been almost exclusively dealing with POSIX file systems for
       | the last 30 years, but I'm currently shipping a cross-platform
       | app and a lot of my Windows users are encountering corrupted
       | files which exhibit a bunch of NULs in random places in the file.
       | I've added ugly hacks to deal with them but it'd be nice to get
       | down to root causes. Is there a handy list of things that are
       | safe on POSIX but not on Windows so I can figure out what I'm
       | doing wrong?
       | 
       | I'm at the stage where I'm thinking "%$#@ this, I'm never going
       | to write to the Windows file system again, I'm just going to
       | create an SQLite DB and write to that instead". Will that fix my
       | problems?
        
         | nradov wrote:
         | The Windows NTFS is safe and reliable. It doesn't corrupt
         | files. You have probably misunderstood the problem.
        
           | bryanlarsen wrote:
           | However I'm using it, it certainly isn't. It's certainly quite
           | likely the problem is me, not Windows.
           | 
           | Most of the files that are getting corrupted are being
           | written to in an append-only fashion, which is generally one
           | of the mechanisms for writing to files to avoid corruption,
           | at least on POSIX.
        
           | omoikane wrote:
           | I have observed a file filled with NULs that was caused by a
           | power loss in the middle of a write -- my UPS alerted me that
           | utility power was gone, I tried to shut down cleanly, but the
           | battery died before a clean shutdown completed. This was NTFS
           | on a HDD and not a SSD.
           | 
           | I am not saying it happens often, but it does happen once in
           | a while.
        
             | bryanlarsen wrote:
             | Yes, corruption does appear to be correlated with power
             | cycles.
        
             | ale42 wrote:
             | Had the same on journaled ext4 on Linux. Lots of NULL bytes
             | in the middle of the syslog because of unclean shutdown.
        
             | dist-epoch wrote:
             | NTFS guarantees file-system metadata integrity, not file
             | data integrity. Subtle but important difference.
             | 
             | The file was corrupted, but the file-system remained
             | consistent.
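             | 
             | Since the file system only promises consistent metadata,
             | the application has to ask for its data to be flushed. A
             | minimal sketch, assuming the goal is either a durable
             | append or an all-or-nothing replace; append_record and
             | atomic_write are hypothetical helper names. fsync is only
             | a request, so a drive that ignores flushes can still lose
             | data.

```python
import os
import tempfile


def append_record(path: str, record: bytes) -> None:
    """Append one record and ask the OS to push it to stable storage."""
    with open(path, "ab") as f:
        f.write(record)
        f.flush()             # drain Python's userspace buffer
        os.fsync(f.fileno())  # ask the OS to flush its cache to the device


def atomic_write(path: str, data: bytes) -> None:
    """Replace a file so readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic rename on POSIX; overwrites on Windows too
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```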
        
         | tatersolid wrote:
         | Returning all zeros randomly is one of the failure modes of
         | crappy consumer SSDs with buggy controllers. Especially those
         | found in cheap laptops and on Amazon. If it's a fully
         | counterfeit drive it might even be maliciously reporting a size
         | larger than it has flash chips to support. It will accept
         | writes without error but return zeros for sectors that are
         | "fake". This can appear random due to wear-leveling in the
         | controller.
        
         | alexisread wrote:
         | I'd guess yes, and you get the SQL goodness to boot :)
         | 
         | Sounds like you have an encoding issue somewhere, Windows has
         | its own charset - Windows-1252, so I'd vet all your libs that
         | touch the file (including eg. .Net libs etc). If one of them
         | defaults to that encoding you may get it either mislabelling
         | the file encoding, or adding in null after each append etc.
         | 
         | SQLite is tested cross-platform so 100% the file will be cross-
         | platform compatible.
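         | 
         | A minimal sketch of the SQLite route, assuming an append-only
         | log stored as rows. The table name log and the helpers
         | open_log, append_line and read_lines are hypothetical. SQLite
         | commits are designed to be all-or-nothing, but that still
         | depends on the OS and the drive honoring flush requests.

```python
import sqlite3
from typing import List


def open_log(path: str) -> sqlite3.Connection:
    """Open (or create) the log database."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA synchronous = FULL")  # sync to disk at each commit
    conn.execute(
        "CREATE TABLE IF NOT EXISTS log (id INTEGER PRIMARY KEY, line TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def append_line(conn: sqlite3.Connection, line: str) -> None:
    """Each append is its own transaction: fully stored or absent."""
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT INTO log (line) VALUES (?)", (line,))


def read_lines(conn: sqlite3.Connection) -> List[str]:
    """Return all lines in insertion order."""
    return [row[0] for row in conn.execute("SELECT line FROM log ORDER BY id")]
```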
        
         | dist-epoch wrote:
         | A lot of your users encountering file corruption on Windows
         | either means that somehow your users are much more likely to
         | have broken hardware, or more realistically that you have a bug
         | in your code/libraries.
        
       | neffy wrote:
       | There's a claim over on Mastodon from Kevin Beaumont that the
       | file is different on every customer he's received the file from.
       | 
       | https://cyberplace.social/@GossiTheDog/112812454405913406
       | 
       | (scroll down a little)
        
         | drewg123 wrote:
         | I thought windows required all kernel modules to be signed..?
         | If there are multiple corrupt copies, rather than just some
         | test escape, how could they have passed the signature
         | verification and been loaded by the kernel?
        
           | dist-epoch wrote:
           | This is not even a valid executable.
           | 
           | Most likely it is not loaded as a driver binary, but instead is
           | some data file used by the CrowdStrike driver.
        
       | millero wrote:
       | Yes, this fits in with what I heard on the grapevine about this
       | bug from a friend who knows someone working for Crowdstrike. The
       | bug had been sitting there in the kernel driver for years before
       | being triggered by this flawed data, which actually was added in
       | a post-processing step of the configuration update - after it had
       | been tested but before being copied to their update servers for
       | clients to obtain.
       | 
       | Apparently, Crowdstrike's test setup was fine for this
       | configuration data itself, but they didn't catch it before it was
       | sent out in production, as they were testing the wrong thing.
       | Hopefully they own up to this, and explain what they're going to
       | do to prevent another global-impact process failure, in whatever
       | post-mortem writeup they may release.
        
         | finaard wrote:
         | You need to be a very special kind of stupid to think
         | postprocessing anything after you've tested it is a good idea.
        
           | dgfitz wrote:
           | Hmm, I post-process autonomous vehicle logs probably daily.
           | 
           | Why is this stupid? It's pretty useful to see a graph of
           | coolant temp vs ambient temp vs motor speed vs roll/pitch.
           | 
           | I must be especially stupid I suppose. Nuts.
        
             | Flockster wrote:
             | That is not remotely what was meant..
        
               | dgfitz wrote:
               | Perhaps word choice and sentence structure are important
               | then.
        
               | heylook wrote:
               | What you have just said is one of the most insanely
               | idiotic things I have ever heard. At no point in your
               | rambling, incoherent response were you even close to
               | anything that could be considered a rational thought.
               | Everyone in this room is now dumber for having listened
               | to it. I award you no points, and may God have mercy on
               | your soul.
        
               | jmull wrote:
               | I don't think people should restate the basic context of
               | the thread for every post... That's a lot of work and
               | noise, and probably the same people who ignore the thread
               | context would also ignore any context a post provided.
        
             | wri321 wrote:
             | This is comparable to modifying the system under test after
             | it has been validated and not simply looking at recorded
             | data.
        
           | dist-epoch wrote:
           | "We need to ship this by Friday. Just add a quick post-
           | processing step, and we'll fix it next week properly" - how
           | these things tend to happen.
        
             | yard2010 wrote:
             | In my first engineering job ever, I worked with this snarky
             | boss who was mean to everyone and just said NO every time
             | to everything. She also had a red line: NO RELEASES ON THE
             | LAST DAY OF THE WEEK. I couldn't understand why. Now, 10
             | years later, I understand I just had the best boss ever. I
             | miss you, Vik.
        
               | arp242 wrote:
               | I still have a 10-year old screenshot from a colleague
               | breaking production on a Friday afternoon and posting
               | "happy weekend everyone!" just as the errors from
               | production started to flood in on the same chat. And he
               | really did just fuck off leaving us to mop up the
               | hurricane of piss he unleashed.
               | 
               | He was not my favourite colleague.
        
           | arp242 wrote:
           | "I heard on the grapevine from a friend who knows someone
           | working for Crowdstrike" is perhaps not the most reliable
           | source of information, due to the game of telephone effect if
           | nothing else.
           | 
           | And post-processing can mean many things. Could be something
           | relatively simple such as "testing passed, so let's mark the
           | file with a version number and release it".
        
       | MilStdJunkie wrote:
       | Holy smokes. I'm no programmer, but I've built out bazillions of
       | publishing/conversion/analysis systems, and null purge is pretty
       | much the first thing that happens, every time. x00 breaks
       | virtually everything just by existing - like, seriously, markup
       | with one of these damn things will make the rest of the system
       | choke and die as soon as it looks at it. Numpy? Pytorch? XSL?
       | Doesn't matter. _cough cough cough GACK_
       | 
       | And my systems are all halfass, and I don't really know what I'm
       | doing. I can't imagine actual real professionals letting that
       | moulder its way downstream. Maybe their stuff is just way more
       | complex and amazing than I can possibly imagine.
        
         | wormlord wrote:
         | Not a C programmer, why is 0x00 so bad? It's the string
         | terminator character right?
        
           | bagful wrote:
           | Indeed, '\0' is the sentinel character for uncounted strings
           | in C, and even if your own counted string implementation is
           | "null-byte clean", aspects of the underlying system may not
           | be (Windows and Unix filesystems forbid embedded null
           | characters in file names, for example).
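           | 
           | A small illustration of the difference, assuming Python's
           | ctypes module as a stand-in for C string handling. The
           | variable names and open_rejects_nul are made up for this
           | sketch.

```python
import ctypes

raw = b"abc\x00def"

# Python bytes are counted, so the embedded NUL is just another byte.
counted_length = len(raw)

# A C-style view stops at the first NUL: .value reads up to the terminator,
# while .raw exposes the whole buffer (plus the terminator ctypes appends).
buf = ctypes.create_string_buffer(raw)
c_view = buf.value
full_buffer = buf.raw


def open_rejects_nul(name: str) -> bool:
    """Python refuses file names containing NUL before touching the OS."""
    try:
        open(name)
    except ValueError:
        return True
    except OSError:
        return False
    return False
```

           | Here counted_length is 7 while c_view is only b"abc": the
           | bytes after the NUL are still in the buffer, but C string
           | functions never see them.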
        
           | tedunangst wrote:
           | It's a byte like any other. You're more likely to see big
           | files full of 0x0 than 0x1, but it's really not so different.
        
         | hawski wrote:
         | Binary files are full of null bytes; it is one of the main
         | criteria of binary file recognition. Large swaths of null
         | bytes are also common, common enough we have sparse files -
         | files with holes in them. Those holes are all zeroes, but are
         | not allocated in the file system. For an easy example think
         | about a disk image.
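         | 
         | A minimal sketch of both points, assuming the common "any
         | NUL byte means binary" heuristic. looks_binary and
         | read_back_hole are hypothetical names. Whether the hole is
         | actually stored sparsely depends on the file system.

```python
import os
import tempfile


def looks_binary(sample: bytes) -> bool:
    """Common heuristic: text files rarely contain NUL bytes."""
    return b"\x00" in sample


def read_back_hole(hole_size: int) -> bytes:
    """Seek past the end before writing; the gap reads back as zeros."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "image.bin")
        with open(path, "wb") as f:
            f.seek(hole_size)  # leave a hole of hole_size bytes
            f.write(b"end")
        with open(path, "rb") as f:
            return f.read()
```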
        
       | j-wags wrote:
       | It's possible that these aren't the original file contents, but
       | rather the result of a manual attempt to stop the bleeding.
       | 
       | Someone may have hoped that overwriting the bad file with an
       | all-0 file of the correct size would make the update benign.
       | 
       | Or following the "QA was bypassed because there was a critical
       | vulnerability" hypothesis, stopping distribution of the real
       | patch may be an attempt to reduce access to the real data and
       | slow reverse-engineering of the vulnerability.
        
       | 0cf8612b2e1e wrote:
       | On the plus side of this disaster, I am holding out some pico-
       | sized hope that maybe organizations will rethink kernel level
       | access. No, random gaming company, you are not good enough to
       | write kernel level anti cheat software.
        
         | majormajor wrote:
         | I can't imagine gaming software being affected at all, unless
         | MS does a ton of cracking down (and would still probably give
         | hooks for gaming since they have gaming companies in their
         | umbrella).
         | 
         | No corporate org is gonna bat an eye at Riot's anti-cheat
         | practices, because they aren't installing LoL on their line of
         | business machines anyway.
        
           | InitialLastName wrote:
           | Right, MS just paid $75e9 for a company whose main products
           | are competitive multiplayer games. They are never going to be
           | incentivized to compromise that sector by limiting what anti-
           | cheat they can do.
        
             | Y_Y wrote:
             | That's 7.5e10 USD in SI.
        
               | InitialLastName wrote:
               | Engineering notation used for prefix convenience.
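
For anyone puzzled by the exchange above: 75e9 and 7.5e10 are the same number, and engineering notation simply keeps the exponent a multiple of three so it lines up with prefixes such as giga. A minimal sketch, with `to_engineering` being a made-up helper name:

```python
import math


def to_engineering(value: float) -> str:
    """Format a number so the exponent is a multiple of 3."""
    if value == 0:
        return "0e0"
    exponent = math.floor(math.log10(abs(value)))
    eng_exponent = 3 * (exponent // 3)  # round down to a multiple of 3
    mantissa = value / 10 ** eng_exponent
    return f"{mantissa:g}e{eng_exponent}"
```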
        
           | minetest2048 wrote:
           | Until the malware brings its own compromised signed anti-
           | cheat driver, like what happened with Genshin Impact's
           | anti-cheat driver mhyprot2
        
           | tgsovlerkhgsel wrote:
           | > because they aren't installing LoL on their line of
           | business machines anyway
           | 
           | But if their business is incompatible with strict software
           | whitelisting, their employees might...
        
         | pvillano wrote:
         | imo anti-cheat should mostly be server-side behavior based
        
           | gruez wrote:
           | How are you going to catch wallhackers that aren't blatantly
           | obvious?
        
             | JasonSage wrote:
             | You may not, and that's ok.
        
               | chowells wrote:
               | It's not ok for people playing those games. They'll quit
               | playing that game and go to one with invasive client-side
               | anti-cheat instead.
               | 
               | The incentives and priorities are _very_ different for
               | people who want to play fair games than they are for
               | people who want to maximize their own freedom.
        
               | kjkjadksj wrote:
               | This is a solved issue already. Vote kicks or server
               | admin intervention. Aimbotting was never an issue for the
               | old primitive fps games I would play because admins could
               | spectate and see you are aimbotting.
               | 
               | A modern game need only telemetry that captures what a
               | spectating admin picks up, rather than active
               | surveillance.
               | 
               | Hackers are only a problem when servers are left
               | unmoderated and players can't vote kick.
        
               | nemothekid wrote:
               | You can't have vote kicks/server admins/hosted servers
               | with competitive ranked ladders. If your solution is
               | "don't have competitive ranked ladders" then you are just
               | telling the majority of people who even care about anti-
               | cheat to just not play their preferred game mode.
        
               | chowells wrote:
               | That stopped being a solution when winning online started
               | mattering. There are real money prizes for online game
               | tournaments. Weekly events can have hundreds of dollars
               | in their prize pools. Big events can have thousands.
               | 
               | Suddenly vote kicking had to go, because it was abused.
               | Not in the tournaments themselves, but in open ranked
               | play which serves as qualifiers. An active game can rack
               | up thousands of hours of gameplay per day, far beyond the
               | ability of competent admins to validate. Especially
               | because cheating is often subtle. An expert can spend
               | more than real time looking for subtle patterns that
               | automated tools haven't been built to detect.
               | 
               | Games aren't between you and your 25 buddies for bragging
               | rights anymore. They're between you and 50k other active
               | players for cash prizes. The world has changed. Anti-
               | cheat technology _followed_ that change.
        
               | JasonSage wrote:
               | I play one of those games that doesn't strongly enforce
               | anti-cheating, and I agree with you that it's a huge
               | detraction compared to games with strong anti-cheat.
               | 
               | But I strongly disagree about the use of invasive client-
               | side anti-cheat. Server-side anti-cheat can reduce the
               | number of cheaters to an acceptably low level.
               | 
               | See for example how lichess detects and aids in detection
               | of cheaters: https://github.com/clarkerubber/irwin
               | 
               | And chess is a game where I feel like it would be
               | relatively hard to detect cheating. An algorithm looking
               | at games with actors moving in 3D space and responding to
               | relative positions and actions of multiple other actors
               | should have a great many more ways to detect cheating
               | over the course of many games.
        
               | JasonSage wrote:
               | And frankly, I think the incentive structure has nothing
               | to do with whether tournaments are happening with money
               | on the line, and a great deal more whether the company
               | has the cash and nothing better to do.
               | 
               | Anti-cheat beyond a very basic level is nothing to these
               | companies except a funnel optimization to extract the
               | maximum lifetime value out of the player base. Only the
               | most successful games will ever have the money or reach
               | the technical capability to support this. Nobody making
               | these decisions is doing it for player welfare.
        
             | mrguyorama wrote:
             | The only reason wallhacking is possible in the first place
             | is a server sending a client information on a competitor
             | that the client should not know about.
             | 
             | I.e. the server sends locations and details about all players
             | to your client, even if you are in the spawn room and can't
             | see anyone else and your client has to hide those details
             | from you. It is then trivial to just pull those details out
             | of memory.
             | 
             | The solution forever has been to just not send clients
             | information they shouldn't have. My copy of CS:GO should
             | not know about a terrorist on the other side of the map.
             | The code to evaluate that literally already exists, since
             | the client will answer that question when it goes to render
             | visuals and sound. They just choose to not do that testing
             | server side.
             | 
             | Aimbotting however is probably impossible to stop. Your
             | client has to know where the model for an enemy is to
             | render it, so you know where the hitbox roughly should be,
             | and most games send your client the hitbox info directly so
             | it can predict whether you hit them. I don't think you
             | can do it behaviorally either.
        
               | snailmailman wrote:
               | To some extent though, the games _do_ need information
               | about players that are behind walls. In CSGO/CS2, even
               | if you can't see the player you can hear their footsteps
               | or them reloading, etc. The sound is _very_ positional.
               | Plus, you can shoot through some thin walls at these
               | players, even if they can't be _seen_.
               | 
               | I don't believe server side anti cheat can truly be
               | effective against some cheats. But also Vanguard is trash
               | and makes my computer bluescreen. I've stopped playing
               | league entirely because of it.
        
               | 0cf8612b2e1e wrote:
               | Nit, but surely hit detection happens on the server?
               | Shooting wildly should always register a hit, regardless
               | of what the client knows.
        
               | pohuing wrote:
               | You don't happen to have used some means to install Win
               | 11 on an unsupported device, have you? People bypassing
               | the Windows install requirements and then Vanguard making
               | false assumptions have been a source of issues.
        
               | anonymoushn wrote:
               | You may have players complain that when they walk around
               | a corner, the enemy who they should be able to see
               | immediately is briefly invisible.
        
               | andy81 wrote:
               | Aside from aimbots, there's plenty of abusable legitimate
               | information exposed to the client.
               | 
               | E.g. For CS:GO, the volume of footsteps and gunshots vary
               | by distance so you could use them to triangulate an
               | enemy's position.
        
               | bsder wrote:
               | > The only reason wallhacking is possible in the first
               | place is a server sending a client information on a
               | competitor that the client should not know about.
               | 
               | Some information is required to cover the network and
               | server delays.
               | 
               | The client predicts what things should look like and then
               | corrects to what they actually are if there is a
               | discrepancy with the server. You _cannot_ get around this
               | short of going back to in-person LAN games.
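
The predict-then-correct loop described above might look roughly like this. It is a one-dimensional toy, not any real engine's netcode; `PredictingClient`, `apply_input`, `SPEED`, and the sequence-numbered inputs are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field

SPEED = 1.0  # hypothetical distance moved per input tick


def apply_input(position: float, direction: int) -> float:
    """Movement rule that client and server are assumed to share."""
    return position + SPEED * direction


@dataclass
class PredictingClient:
    position: float = 0.0
    next_seq: int = 0
    # (sequence number, direction) pairs the server has not acknowledged yet
    pending: list = field(default_factory=list)

    def local_input(self, direction: int) -> None:
        """Apply the input immediately instead of waiting a round trip."""
        self.position = apply_input(self.position, direction)
        self.pending.append((self.next_seq, direction))
        self.next_seq += 1

    def on_server_state(self, server_position: float, last_acked_seq: int) -> None:
        """Adopt the authoritative state, then replay unacknowledged inputs."""
        self.pending = [(s, d) for s, d in self.pending if s > last_acked_seq]
        self.position = server_position
        for _, direction in self.pending:
            self.position = apply_input(self.position, direction)
```

The point for the cheating discussion is that the client has to hold enough world state to run `apply_input` locally, which is exactly the state a cheat can read.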
        
               | SigmundA wrote:
               | So the server must render the 3d world from each players
               | perspective to do these tests? Sounds ridiculously
               | expensive.
        
               | Ukv wrote:
               | > So the server must render the 3d world from each
               | players perspective to do these tests?
               | 
               | Just some raycasts through the geometry should be
               | sufficient, which the server is already doing (albeit on
               | likely-simplified collision meshes) constantly.
               | 
               | If you really do have a scenario where occlusion
               | noticeably depends on more of the rendering pipeline (a
               | window that switches between opaque and transparent based
               | on a GPU shader?) you could just treat it as always
               | transparent for occlusion checking and accept the tiny
               | loss that wallhackers will be able to see through it, or
               | add code to simulate that server-side and change the
               | occlusion geometry accordingly.
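
A minimal sketch of the raycast idea, assuming a flat 2D map where walls are line segments and each player is a single point. All names here are invented for the example; a real server would cast several rays per player against its collision meshes rather than one centre-to-centre ray.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Signed area: > 0 if a->b->c turns left, < 0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p1: Point, p2: Point, q1: Point, q2: Point) -> bool:
    """True if segment p1-p2 properly crosses q1-q2 (mere touching is ignored)."""
    d1 = _orient(q1, q2, p1)
    d2 = _orient(q1, q2, p2)
    d3 = _orient(p1, p2, q1)
    d4 = _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def has_line_of_sight(viewer: Point, target: Point, walls: List[Segment]) -> bool:
    """Cast one ray from viewer to target and check it against every wall."""
    return not any(segments_cross(viewer, target, a, b) for a, b in walls)


def visible_positions(
    viewer_id: str, positions: Dict[str, Point], walls: List[Segment]
) -> Dict[str, Point]:
    """Positions the server would be willing to send to viewer_id."""
    viewer = positions[viewer_id]
    return {
        pid: pos
        for pid, pos in positions.items()
        if pid != viewer_id and has_line_of_sight(viewer, pos, walls)
    }
```

With a wall between players `a` and `b`, the packet built for `a` simply omits `b`, so there is nothing for a wallhack to pull out of memory.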
        
               | kjkjadksj wrote:
               | Of course you can. You can measure telemetry like where
               | the aimpoint is on a hitbox. Is it centered or at least
               | more accurate than your global population? Hacker, ban.
               | How about time to shoot after hitting target? Are they
               | shooting instantly, is the delay truly random? If not
               | then banned. You can effectively force the hacking tools
               | to only be about as good as a human player, at which
               | point it hardly matters whether you have hackers or not.
               | 
               | Of course, no one handles hacking like this because it's
               | cheaper to just ship fast and early and never maintain
               | your servers. Not even valve cares about their games and
               | they are the most benevolent company in the industry.
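
As a rough illustration of the telemetry check described above, here is a sketch that looks at one signal only: the delay between a target becoming hittable and the shot. The thresholds are made-up placeholders rather than measured values, and the function only flags an account; what happens after a flag is a separate decision.

```python
import statistics
from typing import Sequence

# Assumed placeholder thresholds, not measured values.
MIN_HUMAN_REACTION_MS = 100.0
MIN_REACTION_STDEV_MS = 10.0


def suspicious_reaction_times(
    samples_ms: Sequence[float], min_samples: int = 20
) -> bool:
    """Flag a player whose shot delays are too fast or too uniform."""
    if len(samples_ms) < min_samples:
        return False  # not enough data to say anything
    median = statistics.median(samples_ms)
    spread = statistics.pstdev(samples_ms)
    return median < MIN_HUMAN_REACTION_MS or spread < MIN_REACTION_STDEV_MS
```

A cheat that wants to stay under these limits has to add human-like delay and jitter, which is the "only about as good as a human player" outcome the comment is after.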
        
               | nemothekid wrote:
               | Valve does not have kernel level anticheat. Faceit does.
               | Most high ranked players prefer to play on Faceit because
               | of the amount of cheaters in normal CS2 matchmaking.
        
             | Ukv wrote:
             | Minimize the possible advantage by not sending the client
             | other players' positions until absolutely necessary (either
             | the client can see the other player, or there's a movement
             | the client could make that would reveal the other player
             | before receiving the next packet), and eliminate the
             | cheaters you can with server-side behavior analysis and
             | regular less-invasive client-side anticheat.
             | 
             | Ultimately even games with kernel anticheat have cheating
             | issues; at some point you have to accept that you cannot
             | stop 100.0% of cheaters. The solution to someone making an
             | aimbot using a physically separate device (reading monitor
             | output, giving mouse input) cannot be to require keys to
             | the player's house.
        
               | lutoma wrote:
               | > not sending the client other players' positions until
               | absolutely necessary (either the client can see the other
               | player, or there's a movement the client could make that
               | would reveal the other player before receiving the next
               | packet)
               | 
               | I think the problem with this is sounds like footsteps or
               | weapons being fired that need to be positional.
               | 
               | Which makes me wonder if you could get away with mixing
               | these sounds server-side and then streaming them to the
               | client to avoid sending positions. Probably infeasible in
               | practice due to latency and game server performance, but
               | fun to think about.
        
               | Ukv wrote:
               | To whatever extent the sound is intended to only give a
               | general direction, I'd say quantize the angle and volume
               | of the sound before it's sent such that cheaters also
               | only get that same vague direction. Obviously don't send
               | inaudible/essentially-inaudible sounds to the client at
               | all.
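
A sketch of the quantization idea, assuming a 2D map, a linear volume falloff, and arbitrary bucket counts. Every name and constant here is an assumption for the example; a real game would use its own attenuation model.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]


def quantize_sound(
    listener: Point,
    source: Point,
    loudness: float = 1.0,
    angle_steps: int = 8,
    volume_steps: int = 4,
    max_range: float = 50.0,
) -> Optional[Tuple[float, float]]:
    """Return (angle in degrees, volume) snapped to coarse buckets.

    Returns None when the sound would be inaudible, so nothing is sent.
    """
    dx = source[0] - listener[0]
    dy = source[1] - listener[1]
    distance = math.hypot(dx, dy)
    # Assumed linear falloff to silence at max_range.
    volume = loudness * max(0.0, 1.0 - distance / max_range)
    level = round(volume * volume_steps)
    if level == 0:
        return None
    step = 2 * math.pi / angle_steps
    bucket = round(math.atan2(dy, dx) / step) % angle_steps
    return (bucket * 360.0 / angle_steps, level / volume_steps)
```

Two sources that are close together produce the same packet, so a cheat reading the client's memory learns no more than an honest player's ears would.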
        
               | Workaccount2 wrote:
               | They need to just make CPUs, GPUs, and memory modules
               | with hardware level anti-cheat. Totally optional
               | purchase, but grants you access to very-difficult-to-
               | cheat-in servers.
        
               | didntcheck wrote:
               | That sort of already exists - I believe a small number of
               | games demand that you have Secure Boot enabled, meaning
               | you should only have a Microsoft-approved kernel and
               | drivers running. And then the anticheat is itself
               | probably kernel level, so can see anything in userspace
               | 
               | It may still be possible to get round this by using your
               | own machine owner key or using PreLoader/shim [1] to sign
               | a hacked Windows kernel
               | 
               | [1] https://wiki.archlinux.org/title/Unified_Extensible_F
               | irmware...
        
               | bpye wrote:
               | I guess you've just invented an Xbox/PlayStation.
        
             | Am4TIfIsER0ppos wrote:
             | Standalone servers. Run your own then you can ban anyone
             | you like, or better still only allow anyone you like.
        
               | cobalt60 wrote:
               | Nothing like sourcemodded server! Good old days!
        
             | kjkjadksj wrote:
             | Did their hitbox clip through the wall? Yes? Banned. You
             | could do it with telemetry.
        
               | Arnavion wrote:
               | You're confusing wallhacking with noclipping. Wallhacking
               | is being able to see through walls, like drawing an
               | outline around all characters that renders with highest
               | z-order, or making wall textures transparent.
               | 
               | It does not result in any server-side-detectable
               | difference in behavior other than the hacker seemingly
               | being more aware of their surroundings than they should,
               | which can be hard to determine for sure. Depending on how
               | the hack is done, it may not be detectable by the client
               | either, eg by intercepting the GPU driver calls to render
               | the outlines or switch the wall textures.
        
             | josephcsible wrote:
             | Stop thinking about trying to catch wallhackers. Instead,
             | make wallhacking impossible. Do that by fixing the server
             | to, instead of sending all player positions to everyone,
             | only send player positions to clients that they have an
             | unobstructed view of.
        
         | frizlab wrote:
         | Unless I'm mistaken on macOS at least kernel access is just not
         | possible, so at least there's that.
        
         | pityJuke wrote:
         | The problem you're fighting is cheat customers who go "random
         | kernel-level driver? no problem!"
        
       | hn_throwaway_99 wrote:
       | On a related note, I don't think that it's a coincidence that 2
       | of the largest tech meltdowns in history (this one and the
       | SolarWinds hack from a few years ago) were both the result of
       | "security software" run amok. (Also sad that both of these
       | companies are based in Austin, which certainly gives Austin's
       | tech scene a black eye).
       | 
       | IMO, I think a root cause issue is that the "hacker types" who
       | are most likely to want to start security software companies are
       | also the least likely to want to implement the "boring" pieces of
       | a process-oriented culture. For example, I can't speak so much
       | for CrowdStrike, but it came out that SolarWinds had an
       | _egregiously_ bad security culture at their company. When the
       | root cause comes out about this issue dollars-to-donuts it was
       | just a fast and loose deployment process.
        
         | koliber wrote:
         | Don't forget Heartbleed, a vulnerability in OpenSSL, the
         | software that secures pretty much everything.
        
         | NegativeK wrote:
         | Alternate hypothesis that's not necessarily mutually exclusive:
         | security software tends to need significant and widespread
         | access. That means that fuckups and compromises tend to be more
         | impactful.
        
           | hn_throwaway_99 wrote:
           | 100% agree with that. The thing that baffles me a bit, then,
           | is that if you are writing software that _is_ so critical and
           | can have such a catastrophic impact when things go wrong,
           | that you double and triple check everything you do - what you
           | DON'T do is use the same level of care you may use with some
           | social media CRUD app (move fast and break things and all
           | that...)
           | 
           | To emphasize, I'm really just thinking about the bad
           | practices that were reported after the SolarWinds hack
           | (password of "solarwinds123" and a bunch of other insider
           | reports), so I can't say that totally applies to CrowdStrike,
           | but in general I don't feel like these companies that can
           | have such a catastrophic impact take appropriate care of
           | their responsibilities.
        
         | compacct27 wrote:
         | The Austin tech culture is...interesting. I stopped trying to
         | find a job here and went remote Bay Area, and talking to tech
         | workers in the area gave me the impression it's a mix of
         | slacker culture and hype chasing. After moving back here, tech
         | talent seems like a game of telephone, and we're several jumps
         | past the original.
         | 
         | When I heard CrowdStrike was here, it just kinda made sense
        
         | moandcompany wrote:
         | Crowdstrike was originally founded and headquartered in Irvine,
         | CA (Southern California). In those days, most of its
         | engineering organization was either remote/WFH or in Irvine,
         | CA.
         | 
         | As they got larger, they added a Sunnyvale office, and later
         | moved the official headquarters to Austin, TX.
         | 
         | They've also been expanding their engineering operations
         | overseas which likely includes offshoring in the last few
         | years.
        
           | nullify88 wrote:
           | They bought out Humio in Aarhus, Denmark. Now Falcon
           | Logscale.
        
         | ajsnigrutin wrote:
         | Security software needs kernel level access.. if something
         | breaks, you get boot loops and crashes.
         | 
         | Most other software doesn't need that low level of access, and
         | even if it crashes, it doesn't take the whole system with it,
         | and a quick, automated upgrade process is possible.
        
           | rahkiin wrote:
           | Security software needs kernel level access.. *on Windows.
           | macOS has an Endpoint Security userland extension api
        
             | sharkjacobs wrote:
             | This seems like a pretty clear example of the philosophical
             | divide between MacOS and Windows.
             | 
             | A good developer with access to the kernel can create
             | "better" security software which does less context
             | switching and has less performance impact. But a bad
             | (incompetent or malicious) developer can do a lot more harm
             | with direct access to the kernel.
             | 
             | We see the exact same reasoning with restricting execution
             | of JIT-compiled code in iOS.
        
         | cedws wrote:
         | > the "hacker types" who are most likely to want to start
         | security software companies are also the least likely to want
         | to implement the "boring" pieces of a process-oriented culture
         | 
         | I disagree, security companies suffer from "too big to fail"
         | syndrome where the money comes easy because they have customers
         | who want to check a box. Security is NOT a product you pay for,
         | it's a culture that takes active effort and hard work to embed
         | from day one. There's no product on the market that can provide
         | security, only products to point a finger at when things go
         | wrong.
        
           | Andrex wrote:
           | The market is crying for some kind of '10s "agile hype"
           | equivalent for security evangelism and processes.
        
       | OutOfHere wrote:
       | This looks like a test file that got deployed. Perhaps a QA test
       | was newly added which ran and overwrote the build. This is all I
       | can think of.
        
       | markus_zhang wrote:
       | I'm starting to think that the timing (Friday) and the scale as
       | well as other things (like this finding) might -- just might --
       | point to a bad actor.
       | 
       | We will probably have to wait for CS' own report.
        
       | breadwinner wrote:
       | I blame Microsoft. Why? Because they rely on third parties to
       | fill in their gaps. When I buy a Mac it already has drivers for
       | my printers, but not if I buy a Windows PC. Some of these
       | printer drivers are 250 MB, which is a crazy size for a driver.
       | If it is more than a few 100 KB it means the manufacturer does
       | not know how to write driver software. Microsoft should make it
       | unnecessary to rely on crappy third party software so much.
        
         | luuurker wrote:
         | CrowdStrike's mess up is CrowdStrike's fault, not Microsoft's.
         | We might not like the way Windows works, but it usually works
         | fine and more restrictive systems also have downsides. In any
         | case, it was CrowdStrike who dropped the ball and created this
         | mess.
         | 
         | I don't like what Microsoft is doing with Windows and only use
         | it for gaming (I'm glad Linux is becoming a good option for
         | that), so I'm far from being a "Microsoft fan", but Windows is
         | very good at installing the software needed. Plug a GPU, mouse,
         | etc, from any well known brand and it should work without you
         | doing much.
         | 
         | I didn't have to install anything on my Windows PC (or my MBP)
         | last time I bought a new printer (Epson). The option to let
         | Windows install the drivers needed is enabled though... some
         | people disable that.
        
           | breadwinner wrote:
           | > _CrowdStrike's mess up is CrowdStrike's fault, not
           | Microsoft's._
           | 
           | Disagree. It is everyone's fault. It is CrowdStrike's fault
           | for not testing their product. It is Microsoft's fault for
           | allowing CrowdStrike to mess with kernel and not vetting such
           | critical third parties. It is the end customers' fault for
           | installing crapware and not vetting the vendor.
        
             | yoavm wrote:
             | so now we're vouching for more restrictive operating
             | systems? the last thing I want is an operating system that
             | can only install vetted apps, and that these apps are
             | restricted even if I provide my root password.
        
             | luuurker wrote:
             | We expect different things from the OS we use, I guess.
             | 
             | My main machine is a Macbook Pro and one thing that annoys
             | me a lot is the way Apple handles apps that are not
             | notarized. I don't use iPhones because of the system
             | restrictions (file access, background running, etc) and
             | because I can only install what Apple allows on their
             | store. You can see why I don't want Microsoft to hold my
             | hand when I use Windows... it's my machine, I paid for it,
             | I should be able to install crapware and extend the system
             | functionality if that's what I want especially when I pick
             | an OS that allows me to do that.
             | 
             | In this case, enterprise customers decided to use an OS
             | that allows them to also use CrowdStrike. Maybe Microsoft
             | could handle this stuff better and not show a BSOD? I guess
             | so, but I won't blame them for allowing these tools to
             | exist.
             | 
             | Don't get me wrong, there's a place for very restrictive
             | operating systems like iOS or ChromeOS, but they're not for
             | everyone or enough for all tasks. Windows is a very capable
             | OS, certainly not the best option for everyone, but the day
             | Microsoft cripples Windows like that, it's the day I am
             | forced to stop using it.
        
         | Alghranokk wrote:
         | I think this is unfair; m$ does provide perfectly usable
         | generic printer drivers, as long as you only use basic
         | universal features. The problem is that the printer producers
         | each want to provide a host of features on top of that, each in
         | their own proprietary way, with post print hole-punching, 5
         | different paper trays, user boxes, 4 different options for
         | duplex printing.
         | 
         | Also, label printers, why the heck does zebra only do EPL or
         | ZPL? Why not pcl6 or PS like the rest of the universe?
         | 
         | The point is that printers are bullshit. Nobody knows how they
         | work, and assuming that microsoft should just figure it out on
         | its own is at least in my opinion, unreasonable.
        
           | breadwinner wrote:
           | What Windows was known for in the 1990s was good quality 1st
           | party drivers. Then after Windows achieved monopoly status
           | they shifted driver responsibility to device manufacturers. I
           | have never had to install a third party driver on a Mac, but
           | on Windows I do. If Apple can do it Microsoft can too.
        
         | mardifoufs wrote:
         | Which printer did you try it with? I've never had issues with
         | printing out of the box with windows or mac. At least not for
         | the past 5 years.
         | 
         | Also, I'm glad Microsoft doesn't provide an easy way to get
         | what is essentially complete control over a machine, and every
         | single event/connection/process that it has.
        
       | cmrdporcupine wrote:
       | Is there not responsibility at some level as well to _Microsoft_
       | for having a kernel which even _loaded_ this? Not just because of
       | the apparent corruption, but also ... it was, I heard.. signed
       | and given a bit of an MS blessing.
       | 
       | This crap shouldn't be run in kernel space. But putting that
       | aside, we need kernels that will be resilient to and reject this
       | stuff.
        
         | ale42 wrote:
         | The thing is that, although the file has a confusing .sys
         | extension, it's not the driver, but rather a data file loaded
         | by the Crowdstrike driver.
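
To make the "validate data files before using them" point concrete, here is a sketch of the kind of check a loader could run first. The file layout (a 4-byte magic followed by a little-endian payload length) is entirely hypothetical and is NOT CrowdStrike's real format, which is not described in this thread.

```python
import struct

MAGIC = b"CHNL"  # hypothetical signature, invented for this example
HEADER = struct.Struct("<4sI")  # magic + little-endian payload length


def parse_content_file(data: bytes) -> bytes:
    """Return the payload, or raise ValueError instead of trusting bad input."""
    if len(data) < HEADER.size:
        raise ValueError("file too short for header")
    if not any(data):
        raise ValueError("file is all null bytes")
    magic, length = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise ValueError("bad magic")
    payload = data[HEADER.size:]
    if len(payload) != length:
        raise ValueError("payload length does not match header")
    return payload
```

An empty file, a zero-filled file, and a truncated file are all rejected before any parsing logic sees them; a caller would then fall back to the last known-good file rather than crash.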
        
       | ThinkBeat wrote:
       | Maybe Crowdstrike has adopted the modern ethos: "Move fast and
       | break things. With continuous integration we ship a thousand
       | times a day. Fuck QA."
        
       | motohagiography wrote:
       | conspiracy prediction: I don't think CS will give a complete
       | public RCA on it, but I do think the impact and crisis will be a
       | pretext for granting new internet governance powers, maybe via
       | EO, or a co-ordinated international response via the UN/ITU and
       | the EU.
        
         | cherryteastain wrote:
         | EU recently passed a law in this domain:
         | https://www.eiopa.europa.eu/digital-operational-resilience-a...
        
       | kragen wrote:
       | this seems like the second or third test file any qa person would
       | have tried, after an empty file and maybe a minimal valid file.
       | the level of pervasive incompetence implied here is staggering
       | 
       | in a market where companies compete by impressing nontechnical
       | upper management with presentations, it should be no surprise
       | that technically competent companies have no advantage over
       | incompetent ones
       | 
       | i recently read through the craig wright decision
       | https://www.judiciary.uk/judgments/copa-v-wright/ (the guy who
       | fraudulently claimed to be satoshi nakamoto) and he lacked even
       | the most basic technical competence in the fields where he was
       | supposedly a world-class specialist (decompiling malware to c);
       | he didn't know what 'unsigned' meant when questioned on the
       | witness stand. he'd been doing infosec work for big companies
       | going back to the 90s. he'd apparently been tricking people with
       | technobabble and rigged demos and forged documents for his entire
       | career
       | 
       | george kurtz, ceo and founder of crowdstrike, was the cto of
       | mcafee when they did the exact same thing 14 years ago:
       | https://old.reddit.com/r/sysadmin/comments/1e78l0g/can_crowd...
       | https://en.wikipedia.org/wiki/George_Kurtz
       | 
       | it's horrifying that pci compliance regulations have injected
       | crowdstrike (and antivirus) into virtually every aspect of
       | today's it infrastructure
        
         | GardenLetter27 wrote:
         | Also ironic that the compliance ended up introducing the
         | biggest vulnerability as a massive single point of failure.
         | 
         | But that's government regulation for you.
        
           | kragen wrote:
           | pci-dss is not a government agency but it might as well be;
           | it's a collusion between visa, mastercard, american express,
           | discover, and jcb to prevent them from having different data
           | security standards (and therefore being able to compete on
           | security)
        
             | derefr wrote:
             | You mean "and therefore requiring businesses that take
             | credit cards to enforce the union of all the restrictions
             | imposed by all five companies (which might not even be
             | possible--the restrictions might be contradictory) in order
             | to accept all five types of cards"
        
           | acdha wrote:
           | > But that's government regulation for you.
           | 
           | You misspelled "private sector". Use of endpoint monitoring
           | software is coming out of private auditing companies driven
           | by things like PCI or insurers' requirements - almost nobody
           | wants to pay for highly-skilled security people so they're
           | outsourcing it to the big auditing companies and checklists
           | so that if they get sued they can say they were following
           | industry practices and the audit firms okayed it.
        
           | babypuncher wrote:
           | this had nothing to do with government regulation, thank
           | private sector insurance companies.
        
             | czbond wrote:
             | PCI-DSS is a method for card companies to shift the risk
             | onto the merchant and away from themselves - just like
             | insurance.
        
           | moandcompany wrote:
           | It's definitely ironic, and compatible with the security
           | engineering world joke that the most secure system is one
           | that cannot be accessed or used at all.
           | 
           | I suppose one way to "stop breaches" is to shut down every
           | host entirely.
           | 
           | In the military world, there is a concept of an "Alpha
           | Strike" which generally relates to a fast-enough and strong-
           | enough first-strike that is sufficient to disable the
           | adversary's ability to respond or fight back (e.g. taking
           | down an entire fleet at once). Perhaps people that have been
           | burned by this event will start calling it a Crowdstrike.
        
           | phatfish wrote:
           | It seems government IT systems in general fared pretty well
           | over the last 12 hrs, but loads of large private companies
           | were effectively taken offline, so there's that.
        
         | worstspotgain wrote:
         | I don't mean to sound conspiratorial, but it's a little early
         | to rule out malfeasance just because of Hanlon's Razor. Most
         | fuckups are not on a ridonkulous global scale. This is looking
         | like the biggest one to date, the Y2K that wasn't.
        
         | martin-t wrote:
         | We as a society need to start punishing incompetence the same
         | way we punish malice.
         | 
         | Of course, we also need to first start punishing individuals
         | for intentionally causing harm through their decisions even if
         | the harm was caused indirectly through other people. Power
         | allows people to distance themselves from the act. Distance
         | should not affect the punishment.
        
           | worik wrote:
           | > We as a society need to start punishing incompetence the
           | same way we punish malice.
           | 
           | Yes
           | 
           | But competence is marketed
           | 
           | The trade names like "Crowdstrike" and "Microsoft"
        
         | worik wrote:
         | > george kurtz, ceo and founder of crowdstrike, was the cto of
         | mcafee when they did the exact same thing 14 years ago:
         | https://old.reddit.com/r/sysadmin/comments/1e78l0g/can_crowd...
         | 
         | I find it amusing that the people commenting on that link are
         | offended that this is called a "Microsoft" outage, when it is
         | "Crowdstrike's fault".
         | 
         | This is just as much a Microsoft failure.
         | 
         | Even more, this is another industry failure.
         | 
         | How many times does this have to happen before we get some
         | industry reform that lets us do our jobs and build the secure
         | reliable systems we have spent seven decades researching?
         | 
         | 1988 all over again again again
        
           | TeMPOraL wrote:
           | It's simple: the failure is not specific to the OS.
           | 
           | Crowdstrike runs on MacOS and Linux workstations too. And
           | it's just as dangerous there; the big thread has stories of
           | Crowdstrike breaking Linux systems in the past months.
           | 
           | Crowdstrike isn't needed by/for Windows, it's mandated by
           | corporate and government bureaucracies, where it serves as a
           | tool of employee control and a compliance checkbox to check.
           | 
           | That's why it makes no sense to blame Microsoft. If the world
           | ran on Linux, ceteris paribus, Crowdstrike would be on those
           | machines too, and would fuck them up just as badly globally.
        
         | baxtr wrote:
         | I think the worst part of the incident is that state actors now
         | have a clear blueprint for a large scale infrastructure attack.
        
           | IAmNotACellist wrote:
           | I can think of a lot better things to put in a kernel-level
           | driver installed on every critical computer ever than a bunch
           | of 0s.
        
       | olliej wrote:
       | We can argue all we want about CI infrastructure, manual testing,
       | test nets/deployment, staged deployment.
       | 
       | All of that is secondary: they wrote and shipped code that
       | blindly loaded and tried to parse content from the network, and
       | crashed when that failed. In kernel mode.
       | 
       | Honestly it's probably good that this happened, because
       | presumably someone malicious could use this level of broken logic
       | to compromise kernel space.
       | 
       | Certainly the trust they put in the safety of parsing content
       | downloaded from the internet makes me wonder about the
       | correctness of their code for reading data from userspace.
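       | 
       | A minimal sketch of what "don't blindly parse" can look like, in
       | Python rather than kernel C. The record layout (a u32 count, then
       | u16-length-prefixed records) and the name parse_records are
       | assumptions for illustration, not Crowdstrike's actual format.
       | Every read is bounds-checked, and malformed input yields None
       | instead of a crash:

```python
import struct
from typing import List, Optional


def parse_records(blob: bytes) -> Optional[List[bytes]]:
    """Parse a hypothetical layout: [u32 count] then count x ([u16 len][payload]).

    Returns None on any malformed input instead of raising, so the caller
    can fall back to the last known-good content.
    """
    if len(blob) < 4:
        return None  # too short to even hold the record count
    (count,) = struct.unpack_from("<I", blob, 0)
    if count == 0:
        return None  # a zero-filled file decodes as count == 0; reject it
    offset = 4
    records: List[bytes] = []
    for _ in range(count):
        if offset + 2 > len(blob):
            return None  # length field would run past the end of the buffer
        (length,) = struct.unpack_from("<H", blob, offset)
        offset += 2
        if length == 0 or offset + length > len(blob):
            return None  # empty record, or payload would run past the end
        records.append(blob[offset:offset + length])
        offset += length
    if offset != len(blob):
        return None  # trailing bytes that the declared count does not cover
    return records
```

       | With this layout a file full of null characters is rejected at
       | the count check, and a truncated download fails a bounds check
       | rather than reading out of range.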
        
       | bb88 wrote:
       | We've had security software in the past break software
       | compilation in this way, by replacing entire files with zeros.
       | I'm not saying this is the case, but it wouldn't surprise me if
       | it were.
       | 
       | Basically the linker couldn't open the file on Windows (because
       | it was locked by another process scanning it), and didn't error.
       | It just replaced the object code to be linked with zeros.
       | 
       | People couldn't figure out what was wrong until they opened a
       | debugger and saw large chunks of object code replaced with zeros.
        
       | slashdave wrote:
       | I don't get it. Shouldn't the file have a standard format, with a
       | required header and checksum (among other things), that the
       | driver checks before executing?
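       | 
       | A rough sketch of such a check, in Python for brevity. The
       | header layout, the "CHNL" magic, and the function names are all
       | hypothetical, not the real file format. The point is only that
       | magic, length, and checksum are verified before anything is
       | parsed:

```python
import struct
import zlib

MAGIC = b"CHNL"                    # hypothetical 4-byte magic value
HEADER = struct.Struct("<4sHII")   # magic, version, payload length, CRC32


def pack_content(payload: bytes, version: int = 1) -> bytes:
    """Build a content file: fixed header followed by the payload."""
    return HEADER.pack(MAGIC, version, len(payload), zlib.crc32(payload)) + payload


def validate_content(blob: bytes) -> bytes:
    """Return the payload if the header checks out, else raise ValueError."""
    if len(blob) < HEADER.size:
        raise ValueError("file shorter than header")
    magic, version, length, crc = HEADER.unpack_from(blob, 0)
    # The magic check matters: for an all-zero file the length field is 0
    # and the CRC32 of an empty payload is also 0, so a checksum alone
    # would not reject a 14-byte file of nulls.
    if magic != MAGIC:
        raise ValueError("bad magic")
    if version != 1:
        raise ValueError("unsupported version")
    payload = blob[HEADER.size:]
    if len(payload) != length:
        raise ValueError("length mismatch")
    if zlib.crc32(payload) != crc:
        raise ValueError("checksum mismatch")
    return payload
```

       | A CRC only catches accidental corruption; guarding against
       | tampering would need a cryptographic signature on top of this.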
        
       | fhub wrote:
       | Anytime critical infrastructure goes down I always have a
       | fleeting thought back to the movie "Spy Game", where the CIA cut
       | power to part of a Chinese city to help with a prison escape.
        
       | Thaxll wrote:
       | I'm not well versed in how Windows loads DLLs / drivers, but
       | isn't the caller able to handle that situation? Or Windows
       | itself? Can loading an empty driver file be handled in a way
       | that does not make the OS crash?
        
       | yamumsahoe wrote:
       | that's a lot of prod to test in.
        
       | jmspring wrote:
       | Poor testing. But we also need to stop CISOs, etc. doing
       | "checkbox compliance" and installing every 3rd party thing on
       | employee laptops. At my prior employer, there were literally 13
       | things installed for "laptop security" - half of them
       | overlapped. Developers had the same policy as an AE, a Sales
       | Engineer, or an HR person. Crowdstrike was one of the worst.
       | Updating third party packages in Go was 30-40% faster in an
       | emulated arm64 VM (qemu) - virtualized disk / disk just a large
       | file - on an Intel MBP compared to doing the same operation on
       | the native system in OSX.
        
       ___________________________________________________________________
       (page generated 2024-07-19 23:03 UTC)