AI + Cybersecurity:
Infrastructure Needs a New Model

By: Sam Jones

Cybersecurity systems are ripe for disruption. Over the years, individual tools have proliferated, each with its own data format, causing a deluge of disparate data. In addition, there is a global shortage of skilled cybersecurity analysts who can evaluate that data (and they are very expensive if you can find them). Finally, and perhaps most critically, hackers are getting smarter and more creative all the time.

Artificial intelligence was supposed to be the cure for these issues, but it has been of limited use in addressing the problem at scale because it requires large, thoughtfully planned infrastructure. In this article, we’ll look at the role of AI in cybersecurity systems and how it can become a truly transformative technology.

AI as snake oil

AI is mentioned a lot in marketing literature describing cybersecurity solutions, but so far, it hasn’t been as transformative as you might think. Although the market is growing at a 20.5 percent compound annual rate, AI remains operationally difficult to deploy on security problems. If you were to walk into a modern security operations center (SOC), you’d probably find big TVs showing difficult-to-read dashboards and CNN. You’d also find security analysts who likely find their jobs painful, because they spend their time manually correlating data and trying to discern what is happening in their enterprise in the face of ever-more-complex attacks. If humans are doing this, it raises the question, “Where is the AI?”

Cybersecurity is a messy operational problem, and this is the simple reason why AI has been slow to transform it. Finding threats in an enterprise across hundreds of sources of telemetry when threats often look identical to normal activity is a very difficult problem. Moreover, data from each security tool can take different forms, and it must be normalized before it can be used to train an AI system.
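To make the normalization step concrete, here is a minimal Python sketch. It assumes two made-up tool formats: a firewall log and an endpoint agent log. The field names (`ts`, `src`, `verdict`, `time`, `host_ip`, `event_type`) and the `Event` schema are hypothetical, chosen only for illustration, and do not come from any real product.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """Hypothetical common schema that every tool's records are mapped into."""
    timestamp: datetime
    source_ip: str
    action: str
    tool: str


def from_firewall(record: dict) -> Event:
    # Assumed firewall format: epoch seconds in "ts", upper-case "verdict".
    return Event(
        timestamp=datetime.fromtimestamp(record["ts"], tz=timezone.utc),
        source_ip=record["src"],
        action=record["verdict"].lower(),
        tool="firewall",
    )


def from_endpoint(record: dict) -> Event:
    # Assumed endpoint format: ISO 8601 string in "time", mixed-case "event_type".
    return Event(
        timestamp=datetime.fromisoformat(record["time"]),
        source_ip=record["host_ip"],
        action=record["event_type"].lower(),
        tool="endpoint",
    )


# Two records describing activity at the same moment, in different shapes.
fw = from_firewall({"ts": 1622548800, "src": "10.0.0.5", "verdict": "BLOCK"})
ep = from_endpoint({
    "time": "2021-06-01T12:00:00+00:00",
    "host_ip": "10.0.0.5",
    "event_type": "Login",
})
```

Once both records share one schema, they can be correlated by time and IP address or fed to the same training pipeline. A real deployment would need one such mapping per tool, which is a large part of why this work is hard to scale.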

Regardless of the industry and use case, AI learns from data: the AI engine must be trained with data so it can begin to learn what is or isn’t an anomaly. This is what is so messy about the security problem: every enterprise’s security data looks at least a little different, because tools and behavior patterns differ, and in some cases it looks wildly different. There is no golden training dataset in security that can be licensed like there might be for image or speech recognition systems. If you want to use AI to address the security problem, you have to create or acquire your own data.

Normalizing data so it’s useful to an AI engine is a huge challenge. Solving that problem is so valuable that Scale AI, a startup that builds data APIs for AI development and focuses primarily on driverless car applications, reached a $7 billion valuation less than five years after its founding. Scale AI already counts many of the world’s most innovative organizations as its customers.

What transformative AI will take

AI in security will eventually be transformative, likely both for offense and defense, but that is a story for another day. Here, “transformative” means broadly transformative, across all parts of security, so it fundamentally alters how an enterprise goes about security. For now, we have to be content with some limited applications where AI can improve security.

Still, there are some bright spots for AI in security, and they are easy to find by thinking through the data problem. What parts of the security stack generate clean, trainable data? Email fraud and malware detection are two great examples: the AI engine can learn from available phishing examples or malware signatures and spot similar exploits. Data across customer emails and malware sandboxes can be used to train AI models that power enterprise products. The same training is much harder to implement on problems like detecting attacks that move laterally through a network.

