The Cyber Security Remediation Bottleneck

Manual remediation today has serious difficulty keeping up. It is often too slow to prevent damage and sometimes, out of fear of crashing the system, not performed at all.

Cyber crime is becoming harder to deal with because of system complexity, data volumes, and shrinking cycle times, all while more and more lives depend on data. As digitization proceeds, the volume of data that behavioral analysis systems have to process is growing dramatically. Some organizations already face petabyte-scale data loads and worry that the growth will accelerate.

Cycle time refers to the time it takes attackers to conclude that their attack has been detected and then to modify it. This is particularly important for behavioral analysis systems because each change in the attack produces a new behavior pattern to look for. If remediation doesn't happen fast enough, an organization may face a compounding set of rogue software and unauthorized intrusions at the same time. This compounding makes further detection and remediation very difficult. It also hampers the organization's ability to warn those affected by the attack, which leads to further losses, as seen in the Yahoo and Equifax cases.
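The compounding effect can be illustrated with a toy calculation. The sketch below is not taken from any measured data: it assumes, purely for illustration, that attackers release one modified variant per cycle and that each variant stays active until the SOC finishes remediating it. The function name and parameters are hypothetical.

```python
import math


def peak_concurrent_variants(remediation_hours: float, cycle_hours: float) -> int:
    """Toy estimate of the peak number of attack variants active at once.

    Assumptions (illustrative only):
      - one new variant appears every `cycle_hours`
      - each variant stays active for `remediation_hours` after it appears
    """
    if remediation_hours <= 0 or cycle_hours <= 0:
        raise ValueError("durations must be positive")
    # Variants that appeared within the last `remediation_hours` are still
    # active, so the peak count is the remediation time divided by the
    # cycle time, rounded up.
    return math.ceil(remediation_hours / cycle_hours)


if __name__ == "__main__":
    # Remediation takes three days, attackers adapt daily.
    print(peak_concurrent_variants(remediation_hours=72, cycle_hours=24))
    # Remediation finishes within half a cycle.
    print(peak_concurrent_variants(remediation_hours=12, cycle_hours=24))
```

Under these assumptions, the number of simultaneous intrusions grows in proportion to how much slower remediation is than the attackers' cycle time, which is the compounding described above.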

All this is happening against a background of increasing system complexity, driven by layers of technologies, organizational units, vendor proprietary products, and other variables. 

Lives depend more and more on digital systems. Large-scale examples include air traffic control, electric utility operations, and health care systems. Consider that a hospital patient has, on average, three to six connected devices attached at any given moment. Autonomous vehicles are another example.

Remediation Today

Remediation is generally performed by a SOC (Security Operations Center) that functions 24/7. Large organizations maintain an internal SOC, contract with an external entity, or use a combination of the two. External providers are generally called MSSPs (Managed Security Service Providers), and their services are often offered through telcos' Enterprise and Government business units. To do this, a telco either develops an internal capability and white-labels an external MSSP, or openly subcontracts to an external MSSP.

Once an attack has been detected, it is up to the SOC to remediate. Typical remediation includes activities such as: changing firewall settings; disabling IP addresses; quarantining a system component; installing a patch; initiating a system restore function; rebooting a component; reloading software from a known good source; reconfiguring a system component; threat hunting; and so on.

Today these remediation functions are performed manually by the SOC. In a Forbes piece entitled "Take Human Error, Inertia Out Of Security," Larry Ellison of Oracle is quoted as saying, “Why is it that the worst data thefts have occurred after a software patch was available to prevent the system vulnerability that the hackers ultimately exploited? It’s often because the target organization never applied the patch.”

The problem is that performing these functions manually works best with a staff person who is fully knowledgeable about the underlying technology. Keeping a complete contingent of staff with expertise in every technology available around the clock ranges from very expensive to impossible. This is because of the complexity and volatility of today’s systems and the many layers of legacy and emerging technologies. This is generally referred to as the SOC Staffing Problem.

One common approach to the Staffing Problem is to use “playbooks”. A playbook is a step-by-step handbook for each type of technology and each type of threat. Unfortunately, manual implementation of playbooks can lead to serious problems. Even a competent staff member who follows a playbook on an unfamiliar technology can run into difficulties that the playbook does not cover. For example, one such staff person, using a playbook on a portion of the S3 Corp. system, inadvertently hit a wrong key. Because the staff person did not understand the underlying technology, that person could not recover from the keystroke error and could only watch as a series of cascading system failures brought down the entire network. It took most of a business day to bring the network back up. Since S3’s business is the provision of service through its network, the company was effectively out of business for a day, with serious direct financial and brand value damage. This wrong-key problem is often called the “fat finger” problem.
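To make the idea of a playbook concrete, here is a minimal sketch of how one might be represented as data: an ordered list of steps keyed by technology and threat type. The class, the lookup function, and the sample entry are all hypothetical illustrations, not a description of any real SOC tool or of the S3 system; the steps are loosely based on the typical remediation activities listed earlier.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Playbook:
    """A step-by-step handbook for one technology and one threat type."""
    technology: str
    threat: str
    steps: Tuple[str, ...]


# Hypothetical catalog; a real SOC would hold one entry per
# (technology, threat) combination it expects to handle.
PLAYBOOKS: Dict[Tuple[str, str], Playbook] = {
    ("firewall", "malicious-ip"): Playbook(
        technology="firewall",
        threat="malicious-ip",
        steps=(
            "Confirm the offending IP address from the detection alert",
            "Disable the IP address in the firewall settings",
            "Verify that traffic from the address is now rejected",
        ),
    ),
}


def find_playbook(technology: str, threat: str) -> Optional[Playbook]:
    """Return the matching playbook, or None if no playbook covers this case."""
    return PLAYBOOKS.get((technology, threat))


if __name__ == "__main__":
    playbook = find_playbook("firewall", "malicious-ip")
    if playbook is not None:
        for number, step in enumerate(playbook.steps, start=1):
            print(f"{number}. {step}")
    # No entry exists for this combination, so the operator has nothing to follow.
    print(find_playbook("mainframe", "malicious-ip"))
```

The sketch also shows the limitation described above: a playbook lists only the anticipated steps. It says nothing about how to recover when a step is carried out incorrectly, so the operator's own knowledge of the technology still matters.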

In the words of Jon Oltsik, senior principal analyst, Enterprise Strategy Group, “Today’s security operations teams are experiencing pain — too many manual processes, too many disconnected point tools, and a real shortage of the right skills. Manual remediation is time-intensive and can be prone to human errors.”

Because manual remediation is often slow (days, weeks, sometimes even months), it can also lead to attempts to cover up a breach. Recently, a number of such breach cover-ups have received a lot of negative attention. It is reasonable to assume that at least as many breach cover-ups have not yet been exposed.

