others to filter false positives and, for true positives, to stop the breach and fix the problems it caused. This pattern of leaving identification and remediation to others cuts across the rest of an organization’s cybersecurity tools. It is not unusual for a mid- to large-size organization to have 150 or more separate cybersecurity tools in use. Each works well within its own sphere but not with others, particularly when they come from different vendors. Staff are left with the job of manually tying all of these together. In addition, they must determine what is a real breach rather than a false positive, what remediation is required, and what manual intervention to perform in technically challenging, complex, volatile systems to carry out that remediation. The result is many security holes that attackers can exploit.
Large-scale use of AI systems was initially targeted at selling products. This involved determining which consumers were good targets for a particular product, and what content to present, when, and how, to most effectively get them to buy. Inference engines were built on existing racks of x86 and ARM processors (CPUs) deployed in central-site cloud systems. Timely response was important: systems needed to figure out what to show a person visiting a website before that person clicked through to another page. For this workload, existing processors were slow and consumed a lot of expensive electricity.
As gaming technology evolved, graphics processing units (GPUs) were developed for gaming PCs. People began using these GPUs to accelerate inference. Over time, the CPU and GPU vendors started customizing their chips for AI. NVIDIA became one of the leading GPU vendors and started offering an AI chip in the $1,500 range. Such chips demonstrated that there was a very attractive market opportunity for AI accelerator chips.
Today there are approximately 40 chip startups addressing the AI market, with still others in stealth mode. Some are focused on other segments such as autonomous vehicles. The rest tend to be focused on central-site AI. In general, each has an innovative architecture that seeks to dramatically increase speed and decrease power consumption. Below are four examples.
Esperanto’s architecture uses 1,000 processors on a chip to generate a large amount of processing power, speeding up AI inference while keeping power consumption low. According to company materials, Esperanto has working silicon and rack-mounted systems that interconnect large numbers of chips. The chips interface to the most widely used types of inference models.
Groq’s architecture breaks down processors into their constituent parts, then arranges a matrix of them in a bucket brigade fashion. Argonne National Laboratory says this dramatically speeds up inference. According to Groq, its architecture allows very efficient data stream-oriented processing that can apply a single data input stream to many models. Like Esperanto, Groq has working silicon and rack-mounted systems, and its chips interface to the most widely used types of inference models. Furthermore, Groq offers software development tools that end users can use.
Cerebras' architecture uses wafer-scale integration: a single piece of silicon the size of a dinner plate. This wafer consumes 20,000 watts of power, so it needs a large, sophisticated cooling system. The company appears to be focused on training AI models, where it can reduce training times from weeks to minutes.
Abacus Semiconductor Corporation's (ASC) architecture is focused on making the interconnect between processors more efficient. They describe how current processors in multi-chip arrays spend 95 percent of their time handling communication between processors or waiting for results to come back from another processor. Reducing these delays can make ASC systems far faster and more power-efficient than the others. According to their materials, they have interfaces that support common AI models, but also common general purpose cloud apps, and high-performance scientific apps. They do not say that they have working silicon.
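To make the arithmetic behind that claim concrete, the sketch below works through the 95 percent figure using a simple Amdahl's-law style model. This model is an assumption introduced here for illustration, not something taken from ASC's materials: it treats communication and waiting as a fixed share of total run time and asks what overall speedup would result if some fraction of that share were eliminated. The function name `speedup` and its parameters are hypothetical.

```python
def speedup(comm_fraction: float, comm_reduction: float) -> float:
    """Overall speedup when part of the communication time is removed.

    comm_fraction:  share of total run time spent communicating or waiting (0..1)
    comm_reduction: share of that communication time that is eliminated (0..1)

    Illustrative model only: compute time is assumed to stay unchanged.
    """
    if not 0.0 <= comm_fraction <= 1.0:
        raise ValueError("comm_fraction must be between 0 and 1")
    if not 0.0 <= comm_reduction <= 1.0:
        raise ValueError("comm_reduction must be between 0 and 1")

    # Fraction of the original run time that remains after the reduction.
    remaining = 1.0 - comm_fraction * comm_reduction
    if remaining == 0.0:
        return float("inf")  # all of the run time was removable overhead
    return 1.0 / remaining


if __name__ == "__main__":
    # 0.95 is the communication share quoted in the text; the reduction
    # levels below are arbitrary examples, not vendor figures.
    for reduction in (0.25, 0.5, 0.9, 1.0):
        print(f"remove {reduction:.0%} of communication time: "
              f"{speedup(0.95, reduction):.1f}x faster")
```

Under this simplified model, removing half of the communication time roughly doubles throughput, and the theoretical ceiling is a 20-fold speedup if communication and waiting were eliminated entirely. Real systems would fall short of that ceiling, since some communication is unavoidable.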
The creation of new chips introduces the possibility of creating two new types of threats. First, the new chips themselves will be subject to attack. It is possible that the designers have taken precautions against that, but there is always the possibility that the new architectures will open up new vulnerabilities. Possibly more significant is