Today I'm grateful I'm using Linux - Global IT issues caused by Crowdstrike update causes BSOD on Windows

Thorned_Rose@sh.itjust.works · 2 years ago

TCB13@lemmy.world · 2 years ago

While I don’t totally disagree with you, this has mostly nothing to do with Windows and everything to do with a piece of corporate spyware garbage that some IT Manager decided to install. If tools like that existed for Linux, doing what they do to the OS, trust me, we would be seeing kernel panics as well.

tenchiken@lemmy.dbzer0.com · 2 years ago

Hate to break it to you, but CrowdStrike falcon is used on Linux too…

kautau@lemmy.world · edit-2 1 year ago

And if it was a kernel-level driver that failed, Linux machines would fail to boot too. The number of people seeing this and saying “MS Bad” (which is true, but has nothing to do with this) instead of “how does an 83 billion dollar IT security firm push an update this fucked” is hilarious.

DigitalDilemma@lemmy.ml · 1 year ago

And Macs, we have it on all three OSs. But only Windows was affected by this.

Mikina@programming.dev · 1 year ago

I wouldn’t call Crowdstrike corporate spyware garbage. I work as a Red Teamer in cybersecurity, and EDRs are the bane of my existence - they are useful, and pretty good at what they do. In the last few years, I’ve been struggling more and more with the engagements we do, because EDRs just get in the way and catch a lot of what would have passed undetected a month ago. Staying on top of them with our tooling is getting more and more difficult, and I would call that a good thing.

I’ve recently tested a company without EDR, and boy was it a treat. Not defending Crowdstrike, to call that a major fuckup is a great understatement, but calling it “corporate spyware garbage” feels a little bit unfair - EDRs do make a difference, and this wasn’t an issue with their product in itself, but with the irresponsibility of their patch management.

TCB13@lemmy.world · 1 year ago

Fair enough.

Still, this fiasco proved once again that the biggest threat to IT is sometimes on the inside. At the end of the day a bunch of people decided to buy Crowdstrike and got screwed over. Some of them actually had good reason to use a product like that; for others it was just paranoia and FOMO.

Swarfega@lemm.ee · 1 year ago

I’ve just spent the past 6 hours booting into safe mode and deleting crowd strike files on servers.

allywilson@lemmy.ml · 1 year ago

Feel you there. 4 hours here. All of them cloud instances whereby getting access to the actual console isn’t as easy as it should be, and trying to hit F8 to get the menu to get into safe mode can take a very long time.

Swarfega@lemm.ee · 1 year ago

Ha! Yes. Same issue. Clicking Reset in vSphere and then quickly switching tabs to hold down F8 has been a ball ache to say the least!

Avatar_of_Self@lemmy.world · 1 year ago

What I usually do is set next boot to BIOS so I have time to get into the console and do whatever.

Also instead of using a browser, I prefer to connect vmware Workstation to vCenter so all the consoles insta open in their own tabs in the workspace.

Blank@lemmy.world · 1 year ago

Just go into settings and add a boot delay, then set it back when you’re done.

resin85@lemmy.ca · 1 year ago

fin@sh.itjust.works · 1 year ago

That’s hell of a strike to the crowd

areyouevenreal@lemm.ee · 1 year ago

Crowdstrike already killed some Linux machines. Let’s not pretend Windows is at fault here or Linux is magically better in this area. No one is immune from software that can run as a kernel module going bad.

capital@lemmy.world · 1 year ago

But, my superiority!

electricprism@lemmy.ml · 1 year ago

Every system has its faults. And I’m still going to dogpile the system with the most faults. But hell, Microsoft did buy GitHub, Halo, Minecraft, and a million other things; they will probably find a way to buy Linux and ruin it for us, just like they ruin everything else.

Let’s see, …we are somewhere in between Extend and Extinguish on the roadmap.

Edit: Case in point, RIP RedHat & IBM and GitHub CoPilot, what a great idea. RIP Atom Editor and probably a million other things. Do we have a KilledByMicrosoft website yet? I hope people in the pharmacy could get their prescriptions or we might have to add people’s names to the list.

areyouevenreal@lemm.ee · 1 year ago

None of this has to do with the current outage though.

I hope people in the pharmacy could get their prescriptions or we might have to add people’s names to the list.

Which isn’t Microsoft’s fault. Linux systems have also been taken down by Crowdstrike’s fuck ups in the recent past.

electricprism@lemmy.ml · 1 year ago

Microsoft has many faults and I’ll criticize them as I please. And if Linux is a culprit in a global outage someday I’ll contemplate criticizing them too.

This “Not Microsoft’s Fault” comes off as white knighting for Muh Billion Dolla Corporation.

Do we really need to SIMP for the company town.

Microsoft, Google, Apple, Amazon and others deserve every ounce of vitriol they earn through their shitty practices. Again, I am criticizing them for being shitty, not for the particulars of System X vs System Z but for the aftermath.

Wereduck@lemmy.blahaj.zone · 1 year ago

I get where you are coming from, but this event is pretty much entirely the fault of Crowdstrike and the countless organizations that trusted them. It’s definitely a show of how massive outages are more likely when things are overly centralized and proprietary, and managed by big, shitty, profit driven organizations. Since crowdstrike operates in kernel space, it doesn’t matter which operating system it’s on, it can break it if it does something stupid. In fact they managed to break some redhat machines not too long ago, and some Debian machines not long before that. It’s just the impact wasn’t as far reaching as this recent utter fuckup, just because fewer critical machines were affected, so we didn’t hear about those smaller fuckups in the news.

electricprism@lemmy.ml · 1 year ago

Yes, thank you, exactly. The centralized model has its benefits but it also can act as a single point of failure.

If I were going to analyze this from an engineering perspective, I would focus on what happens when these inevitable events occur due to human error: do we have adequate tools to roll back updates? Do we snapshot OS drives before updates? Are there adequate Safe Mode or fallback tools to diagnose which files are offending, so the user can remove them?

In my view, the Windows user isn’t credited with the skills or intelligence needed to work around a “setback” like the one yesterday.

It doesn’t help that NTFS is missing modern capabilities, or that there isn’t an easy-to-use diff tool that lets the layman see which files were added to the filesystem and may be causing the breakage.
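
To make the "which files were added" idea concrete, here is a minimal sketch in Python. It assumes nothing about NTFS or any real snapshot tool: `snapshot` and `added_files` are hypothetical helper names, and the temporary directory and file names are made up for the demonstration. It records the set of file paths before a change, records it again afterwards, and reports the difference.

```python
import os
import tempfile


def snapshot(root):
    """Return the set of file paths under root, relative to root."""
    files = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            files.add(os.path.relpath(full, root))
    return files


def added_files(before, after):
    """Files present in the later snapshot but not in the earlier one."""
    return sorted(after - before)


# Demonstration on a throwaway directory (all names are illustrative only).
with tempfile.TemporaryDirectory() as root:
    open(os.path.join(root, "existing.sys"), "w").close()
    before = snapshot(root)

    # Simulate an update dropping a new file into a subdirectory.
    os.makedirs(os.path.join(root, "drivers"))
    open(os.path.join(root, "drivers", "new_update.sys"), "w").close()
    after = snapshot(root)

    added = added_files(before, after)
    print(added)
```

This only compares file names. A real tool would probably also compare sizes, hashes, or modification times so that it catches files that were changed rather than added.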

To be fair though, even with those potholes filled, the entire design paradigm of Windows as a proprietary platform is part of the problem. Software is not broken up into package modules that can be assembled into a functioning system; it is encumbered with an “anti-piracy” bogeyman, where the software treats the user as an enemy and is designed to break.

Linux isn’t like that. I’ve cloned many distro drives and swapped them into new machines and with 1 or 2 tweaks they JustWork

I see many people on the net defending Microsoft as blameless for technical reasons.

My criticisms were that Microsoft just sucks, as you interpreted correctly and offered an eloquent summary. Thank You.

Where I think the entire conversation should move is –

What are the design flaws that allowed this to happen?

I see some people suggest “More Rust & Less C,” as this was allegedly a null pointer issue.

And is Windows Broken By Design? In my opinion, yes.

(Okay, and what to do about it before the next billion dollars is lost. I would think critical infrastructure should have a model similar to NixOS in immutability but that’s just my opinion.)

areyouevenreal@lemm.ee · 1 year ago

Windows does have a fallback mode called safe mode and that’s exactly what’s being used to fix this utter mess.

Package management isn’t going to save you from this as it didn’t save the Linux systems affected last time. It didn’t stop Arch Linux from failing to boot after a Grub update either.

Windows also has drive cloning tools, that isn’t unique to Linux.

NixOS isn’t immutable. It’s not an a/b root system and / isn’t read only. Rather it’s what’s known as reproducible. I am not convinced NixOS would make this any easier either, given how simple the fix was. Funnily enough though, tools called Ansible and Puppet exist for configuring systems in repeatable ways, and they apply to other Linux systems, Windows systems, and even macOS.

There are like one or two valid points in this whole comment and the rest is pretty much falsehoods and misconceptions.

Edit: Forgot to mention tools exist to make Windows immutable as well. So that is an option.

lemmyreader@lemmy.ml · 1 year ago

Windows does have a fallback mode called safe mode and that’s exactly what’s being used to fix this utter mess.

The other fix was to reboot your Windows computer at least 15 times.

https://arstechnica.com/information-technology/2024/07/crowdstrike-fixes-start-at-reboot-up-to-15-times-and-get-more-complex-from-there/

Package management isn’t going to save you from this as it didn’t save the Linux systems affected last time. It didn’t stop Arch Linux from failing to boot after a Grub update either.

Not everyone was affected though :

https://web.archive.org/web/20240324083115/https://endeavouros.com/news/full-transparency-on-the-grub-issue/

How come not everyone was impacted?

Prior to the most recent version, grub only registered the fwsetup if detected support. If your machine detected support, you would have had the fwsetup command registered and the failure wouldn’t occur.

areyouevenreal@lemm.ee · 1 year ago

Except they haven’t done anything shitty this time. What you are doing would be a bit like claiming the Nazis are responsible for micro plastics. Like yeah Nazis are shit but making false allegations is just giving their defenders something to throw in your face. It makes you, and everyone who is critical of Microsoft look dumb. How about you criticize the company that actually screwed up? They are also a multi-billion dollar company, yet you aren’t blaming them for something that is clearly their fault.

Catsrules@lemmy.ml · 1 year ago

Sure, you can criticize as much as you want, but if you are wrong in your criticism it just damages all of your criticism overall.

In my opinion it is important to state facts, not fiction. This was not Microsoft’s fault. No matter how much you hate Microsoft, it still wasn’t their fault, and saying that it was is incorrect and doesn’t solve the issue.

areyouevenreal@lemm.ee · 1 year ago

Well said, that’s one of the points I have been trying to get across.

areyouevenreal@lemm.ee · 1 year ago

Also fyi Red Hat and IBM are still around and aren’t really a force for good anyway. Stop SIMPing for large companies.

LeFantome@programming.dev · 1 year ago

Hilarious. I am sure that, out of principle, you have stopped using all the software that Red Hat contributes to your distribution.

If it is ok with you, I am not going to define my morality in terms of corporate interest. They are not my friends but I do not believe that shitting on their contributions does much for me either.

areyouevenreal@lemm.ee · 1 year ago

I am not shitting on their contributions. All I am saying is that as a large company they aren’t anymore my friend than Microsoft. Generally they still exist and make contributions. Microsoft didn’t kill them like the person I am replying to is insinuating.

axzxc1236@lemm.ee · 1 year ago

I was born too late to understand what the Y2K problem was; this (the result) might be what people thought could happen.

cannedtuna@lemmy.world · 1 year ago

Kinda I guess. It was about clocks rolling over from 1999 to 2000 and causing a buffer overflow that would supposedly crash all systems everywhere, causing the country to come to a halt.

caseyweederman@lemmy.ca · 1 year ago

And it was okay because a lot of people worked really really hard to make it be okay.

Hildegarde@lemmy.world · 1 year ago

Most old systems used two digits for years. The year would go from 99 to 0. Any software doing a date comparison will get a garbage result. If a task needs to be run every 5 minutes, what will the software do if that task was last run 99 years from now? It will not work properly.
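
A rough sketch of that failure, assuming a hypothetical scheduler that stores only the last two digits of the year and expands them as 19YY. The function names here are invented for illustration, and real legacy systems varied widely in how they stored dates.

```python
from datetime import datetime


def to_two_digit_stamp(dt):
    """Store a timestamp with a two-digit year: (YY, MM, DD, hh, mm)."""
    return (dt.year % 100, dt.month, dt.day, dt.hour, dt.minute)


def is_due_legacy(last_run, now, interval_minutes=5):
    """Naive check that expands YY to 19YY before comparing."""
    def expand(stamp):
        yy, month, day, hour, minute = stamp
        return datetime(1900 + yy, month, day, hour, minute)

    elapsed = (expand(now) - expand(last_run)).total_seconds() / 60
    return elapsed >= interval_minutes


def is_due_fixed(last_run, now, interval_minutes=5):
    """Same check using full four-digit years."""
    elapsed = (now - last_run).total_seconds() / 60
    return elapsed >= interval_minutes


last_run = datetime(1999, 12, 31, 23, 58)
now = datetime(2000, 1, 1, 0, 4)  # six minutes later

# With two-digit years, "now" becomes 1900 and the last run appears to be
# almost a century in the future, so the elapsed time is hugely negative.
legacy_due = is_due_legacy(to_two_digit_stamp(last_run), to_two_digit_stamp(now))
fixed_due = is_due_fixed(last_run, now)
print(legacy_due, fixed_due)
```

In this sketch the legacy check never fires again after the rollover, because the elapsed time stays negative until the two-digit clock catches up. That is the kind of quiet misbehaviour the patching work was meant to prevent.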

Governments and businesses spent lots of money and time patching critical systems to handle the date change. The media made a circus out of it, but when the year rolled over, everything was fine.

Aceticon@lemmy.world · 1 year ago

Also a lot of people were “on call” to handle any problems when the year changed, so the few problems that had gone unnoticed during the fixes and did pop up when the year changed got solved a lot faster than they normally would.

cannedtuna@lemmy.world · 1 year ago

We also got the worst version of Windows ever, ME. Tho maybe with all the BS they’ve done with 11 that might change.

ikidd@lemmy.world · 1 year ago

I’d use ME before the adware that is the current version. It wasn’t that bad, it was just Win98 with some visual slop on top that crashed slightly more often.

zod000@lemmy.ml · 1 year ago

I’m not sure I’d stick to calling it the worst version “ever” since MS is trying really hard to out do themselves.

pirat@lemmy.world · 1 year ago

Millennium Editions ruin everything!! 🤬

caseyweederman@lemmy.ca · 1 year ago

Y2K was going to be the end of civilisation. This was basically done by the time I woke up today.

DigitalDilemma@lemmy.ml · 1 year ago

Am on holiday this week - called in to help deal with this shit show :(

Botzo@lemmy.world · 1 year ago

Don’t worry, George Kurtz (crowdstrike CEO) is unavailable today. He’s got racing to do #04 https://www.gt-world-challenge-america.com/event/95/virginia-international-raceway

Reddfugee42@lemmy.world · 1 year ago

Most people are completely oblivious because it only affects people using crowdstrike, which practically excludes general consumers.

0ops@lemm.ee · 1 year ago

I just had an Amazon package delayed for a week it says. It doesn’t name names but…

A small number of deliveries may arrive a day later than anticipated due to a third-party technology outage.

bricklove@midwest.social · 1 year ago

I wanted to share the article with friends and copy a part of the text I wanted to draw attention to but the asshole site has selection disabled. Now I will not do that and timesnownews can go fuck themselves

cryoistalline@lemmy.ml · 1 year ago

here’s the entire article

Latest Crowdstrike Update Issue: Many Windows users are experiencing Blue Screen of Death (BSOD) errors due to a recent CrowdStrike update. The issue affects various sensor versions, and CrowdStrike has acknowledged the problem and is investigating the cause, as stated in a pinned message on the company’s forum.
Who Have Been Affected
Australian banks, airlines, and TV broadcasters first reported the issue, which quickly spread to Europe as businesses began their workday. UK broadcaster Sky News couldn’t air its morning news bulletins, while Ryanair experienced IT issues affecting flight departures. In the US, the Federal Aviation Administration grounded all Delta, United, and American Airlines flights due to communication problems, and Berlin airport warned of travel delays from technical issues.
In India too, numerous IT organisations were reporting company-wide issues. Akasa Airlines and Spicejet experienced technical issues affecting online services. Akasa Airlines’ booking and check-in systems were down at Mumbai and Delhi airports due to service provider infrastructure issues, prompting manual check-in and boarding. Passengers were advised to arrive early, and the airline assured swift resolution. Spicejet also faced problems updating flight disruptions, actively working to fix the issue. Both airlines apologized for the inconvenience caused and promised updates as soon as the problems were resolved.
Crowdstrike’s Response
CrowdStrike acknowledged the problem, linked to their Falcon sensor, and reverted the faulty update. However, affected machines still require manual intervention. IT admins are resorting to booting into safe mode and deleting specific system files, a cumbersome process for cloud-based servers and remote laptops. Reports from IT professionals on Reddit highlight the severity, with entire companies offline and many devices stuck in boot loops. The outage underscores the vulnerability of interconnected systems and the critical need for robust cybersecurity solutions. IT teams worldwide face a long and challenging day to resolve the issues and restore normal operations.
What to Expect:

-A Technical Alert (TA) detailing the problem and potential workarounds is expected to be published shortly by CrowdStrike.
-The forum thread will remain pinned to provide users with easy access to updates and information.

What Users Should Do:

-Hold off on troubleshooting: Avoid attempting to fix the issue yourself until the official Technical Alert is released.
-Monitor the pinned thread: This thread will be updated with the latest information, including the TA and any temporary solutions.
-Be patient: Resolving software conflicts can take time. CrowdStrike is working on a solution, and updates will be posted as soon as they become available.

In an automated reply from Crowdstrike, the company had stated: CrowdStrike is aware of reports of crashes on Windows hosts related to the Falcon Sensor. Symptoms include hosts experiencing a blue screen error related to the Falcon Sensor. The course of current action will be - our Engineering teams are actively working to resolve this issue and there is no need to open a support ticket. Status updates will be posted as we have more information to share, including when the issue is resolved.
For Users Experiencing BSODs:
If you’re encountering BSOD errors after a recent CrowdStrike update, you’re not alone. This appears to be a widespread issue. The upcoming Technical Alert will likely provide specific details on affected CrowdStrike sensor versions and potential workarounds while a permanent fix is developed.
If you have urgent questions or concerns, consider contacting CrowdStrike support directly.

stringere@sh.itjust.works · 1 year ago

If you have urgent questions or concerns, consider contacting CrowdStrike support directly.

Something tells me that isn’t going to provide the comfort it was meant to.

ArrogantAnalyst@infosec.pub · 1 year ago

It is annoying. Some possible solutions:

On desktop: Using Shift + ALT you often can overrule this and select text anyway.

On mobile: Using the reader mode or the Print preview often works. It does for me on this website.

Asidonhopo@lemmy.world · 2 years ago

US and UK flights are grounded because of the issue; banks, media and some businesses are not fully functioning. Likely we’ll see more effects as the day goes on.

isolatedscotch@discuss.tchncs.de · 1 year ago

after reading all the comments I still have no idea what the hell crowdstrike is

Ok_imagination@lemmy.world · 1 year ago

AV and EDR (endpoint detection and response); they offer other solutions as well. I think their main selling point is tamper-proof protection.

abbiistabbii@lemmy.blahaj.zone · 1 year ago

We’re all going to be so smug.

beeng@discuss.tchncs.de · 1 year ago

This is exactly why centralisation of services and large corporations gobbling up smaller companies and becoming behemoth services is so dangerous.

It’s true, but the other side of the same coin is that with too much solo implementation you lose benefits of economy of scale.

But indeed the world seems like a village today.

DigitalDilemma@lemmy.ml · 1 year ago

you lose benefits of economy of scale.

I think you mean - the shareholders enjoy the profits of scale.

When a company scales up, prices are rarely reduced. Users do get increased community support through common experiences especially when official channels are congested through events like today, but that’s about the only benefit the consumer sees.

Treczoks@lemmy.world · 1 year ago

Same here. I was totally busy writing software in a new language and a new framework, and had a gazillion tabs on Google and stackexchange open. I didn’t notice any network issues until I was on my way home, and the Windows f-up was the one big thing on the radio news. Looks like Windows admins will have a busy weekend.

SquigglyEmpire@lemmy.world · 1 year ago

Only if they manage Crowdstrike systems, thankfully.

Thorned_Rose@sh.itjust.works · 2 years ago

For reference, this was the article I first read about this on: https://www.nzherald.co.nz/nz/bank-problems-reports-bnz-asb-kiwibank-anz-visa-paywave-services-down/R2EY42QKQBALXNF33G5PA6U3TQ/

Latest Crowdstrike Update Causes Blue Screen Of Death On Microsoft Windows, Multiple Users Affected