All Things You Should Know About Testing and DevOps: #Process

Monday, 22 July 2024

Microsoft’s ‘Blue Screen of Death’: a scapegoating mechanism, or does the process need to be enhanced?

CrowdStrike is an American cybersecurity firm that specializes in cloud-based endpoint security and anti-virus software. Its product offers advanced threat detection, real-time response, and a cloud-native architecture.

The recent Microsoft ‘Blue Screen of Death’ outage was caused by a bug in an update that CrowdStrike released for Windows.

The latest CrowdStrike patch executes in kernel mode and monitors system activity at a very low level in order to keep watch over system or server resources.

Because the latest CrowdStrike patch tried to access an invalid memory location, Windows crashed with a ‘Blue Screen of Death’.

It is said that a recently joined employee at CrowdStrike, working as a System Administrator, optimized the code base by updating a single line of code, that this change caused the CrowdStrike security patch to fail on Windows machines and servers, and that he has now been dismissed from his job.

MY VIEWPOINT ON WHAT CROWDSTRIKE SPECIFIES:

When a new security patch or release is planned, requirement refinement sessions would first have been planned and conducted.

Next, the Change Management Board would analyse these changes and give a go or no-go decision. In both meetings, the risks and impacts would be analysed, discussed, and documented in detail.

Once the requirements, risks, and impacts are finalized, development starts. Here, because the employee who developed the change was new to the organization, his work and deliverables would have been monitored and reviewed by a senior employee during and after development.

The newly developed security patch would have been tested multiple times during development and reviewed by one or more senior employees, and unit test cases should have been prepared. In this case, the tested code base was then placed on the infrastructure pipeline.

Next, the team lead should have reviewed the work delivered. According to the process followed at CrowdStrike, unit test cases or the infrastructure pipeline should have been created and tested.

Then, in the testing phase, functional tests (manual and automated regression) and non-functional tests (such as security or performance) might have been conducted by the testing team at CrowdStrike.

Finally, the Product Owner or the manager should have reviewed all the deliverables, including the test results from the various levels, before approving the new security patch for production.
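To make the chain of checks above easier to picture, here is a minimal sketch of it as a release gate checklist. The gate names, the `Release` class, and the helper functions are my own hypothetical labels for the steps I described; they are an assumption for illustration, not CrowdStrike's actual process or tooling.

```python
from dataclasses import dataclass, field

# Hypothetical gate names, taken from the steps described in this post.
# They are illustrative only, not any vendor's real process.
GATES = [
    "requirement_refinement",
    "change_board_go",
    "senior_review",
    "unit_tests",
    "team_lead_review",
    "functional_and_nonfunctional_tests",
    "product_owner_approval",
]


@dataclass
class Release:
    name: str
    passed: set = field(default_factory=set)  # gates already signed off


def missing_gates(release):
    """Return the gates, in process order, that the release has not passed."""
    return [gate for gate in GATES if gate not in release.passed]


def can_ship(release):
    """A release may go to production only when no gate is missing."""
    return not missing_gates(release)


# Example: a patch that skipped several reviews cannot ship.
patch = Release(
    "security-patch",
    passed={"requirement_refinement", "change_board_go", "unit_tests"},
)
print("Missing gates:", missing_gates(patch))
print("Can ship:", can_ship(patch))
```

The point of the sketch is that a release is blocked by the process as a whole: if any single gate is missing, the patch does not go to production, no matter who wrote the code.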

Now, from the Microsoft side: when a new security update or patch for Windows or Windows Server is delivered by a partner, the intake would be tested at the multiple levels discussed earlier and on different product versions, such as Windows 10, Windows 11, Windows Server 2016, etc.

Now, the companies installing or applying the security update or patch on their servers or Windows machines should have tested it in their own sandbox.

As a best practice, whenever a new patch is released by a vendor (in this case Microsoft), it is first tested in sandboxes while the N-1 patch remains in production and in all other environments. Only after thorough testing is the Nth version of the patch released.
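The N-1 practice described above can be sketched in a few lines. This is only an illustration under my own assumptions: the function name `version_to_deploy`, the environment label `"sandbox"`, and the version strings are hypothetical and do not come from any real patch management tool.

```python
def version_to_deploy(released, verified, environment):
    """Pick the patch version for an environment under an N-1 policy.

    released:    list of versions in release order, newest (N) last
    verified:    set of versions that have passed sandbox testing
    environment: "sandbox" or any other environment name (hypothetical labels)
    """
    if not released:
        raise ValueError("no released versions")
    newest = released[-1]
    # The sandbox always receives version N first; other environments
    # receive it only after it has been verified in the sandbox.
    if environment == "sandbox" or newest in verified:
        return newest
    # Otherwise stay on N-1 (or on nothing if N is the first release).
    return released[-2] if len(released) > 1 else None


released = ["1.0", "1.1", "1.2"]
verified = {"1.0", "1.1"}
print(version_to_deploy(released, verified, "sandbox"))
print(version_to_deploy(released, verified, "production"))
```

With this rule, production moves to the newest version only once that version appears in the verified set, which is exactly the delay that gives a faulty patch time to show itself in the sandbox.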

So, to conclude: CrowdStrike did not adequately review or test the security patch, Microsoft did not test the incoming update from the vendor, the companies applying the patch had a poorly validated process for installing it, and finally the one employee who developed the change has been made a scapegoat and asked to leave.

When a process fails, it is always better to learn from the incident, avoid such scenarios in future, and improve the process in place, instead of playing the blame game and scapegoating.

Kindly share your valuable thoughts as review comments.