Recent measurements show that, depending on the particular public GenAI system deployed, the rate of simple, easily detected hallucinations ranges from 3 percent to 27 percent, and may be even higher. No research data is available on hallucinations in the operations domain. However, there is data from the legal and medical domains, which share many properties with operations.
Medical diagnosis is similar to network problem identification. Hallucinations in the legal domain are “pervasive and disturbing…ranging from 69 percent to 88 percent.” Hallucinations in medical diagnostics are also high. A recent study found that in the best GenAI output “30 percent of individual statements are unsupported and nearly half of its responses are not fully supported.” This doesn’t necessarily mean that the diagnoses are wrong, but it does raise serious doubt about their validity.
Another problem with hallucinations is maintenance of error: GenAI systems often insist that their output is true and correct even when confronted with proof that it is hallucinated. This makes it more difficult for staff to rely on GenAI operations output. There is also evidence of GenAI systems manipulating people and other systems, apparently to prompt actions based on hallucinated premises.
Another consequence of the arrival of GenAI is that bad actors are using it to create new, more frequent, and constantly changing dynamic attacks. Previously installed security operations systems have trouble quickly identifying these dynamic attacks and determining and implementing remediation. To make matters worse, when problems do become apparent, it is increasingly difficult to tell whether they are operational or security problems.
Operations functions tend to fall into four basic categories: subsystem installation/provisioning, problem identification, problem remediation, and subsystem retirement. Hallucinations in provisioning and retirement can cause dramatic problems, but the big push has been to use GenAI in problem identification and problem remediation, and that is where the greatest risks lie.
On the problem identification side, adding uncontrolled hallucinations to the mix will only make things worse, producing more false positives and false negatives. Operations teams already suffer from alert fatigue: too many false-positive alerts exhaust resources and divert attention from the real problems. The rapidly spreading, GenAI-driven dynamic attacks are making problem identification harder still. They don’t fit the widely deployed pattern recognition systems, or don’t fit them fast enough, so secondary and tertiary effects show up as operational problems. This makes root cause analysis more difficult.
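The alert-fatigue trade-off above can be made concrete with a small sketch. Assume a triage step that escalates alerts only above a confidence threshold: set the threshold low and hallucinated alerts flood the team (false positives); set it high and real problems are silently dropped (false negatives). All alert fields, sources, and threshold values here are illustrative assumptions, not any real product’s API.

```python
# Hypothetical sketch: confidence-threshold triage, showing how hallucinated
# alerts force a trade-off between false positives and false negatives.
from dataclasses import dataclass

@dataclass
class Alert:
    source: str
    confidence: float  # detector's self-reported confidence, 0.0-1.0
    is_real: bool      # ground truth, known here only for illustration

def triage(alerts, threshold):
    """Escalate alerts at or above the threshold; count both error types."""
    escalated = [a for a in alerts if a.confidence >= threshold]
    false_positives = sum(1 for a in escalated if not a.is_real)
    false_negatives = sum(1 for a in alerts
                          if a.confidence < threshold and a.is_real)
    return escalated, false_positives, false_negatives

# Illustrative alert stream; device names are made up.
alerts = [
    Alert("edge-router-17", 0.95, True),
    Alert("dns-cache-03", 0.80, False),   # hallucinated finding
    Alert("core-switch-02", 0.40, True),  # real, but low confidence
    Alert("firewall-09", 0.70, False),    # hallucinated finding
]

for threshold in (0.3, 0.75):
    _, fp, fn = triage(alerts, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

No threshold eliminates both error types at once; the more hallucinated alerts the GenAI emits, the worse both counts get at any setting.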
On the remediation side, operating our large, complex, and volatile networks has become increasingly challenging. As the technical problem has grown, the critical infrastructure nature of these networks has produced constantly increasing demands for improved performance, reliability, privacy, and security while also controlling costs. This means dramatically shortening latency, i.e., finding problems and performing remediation in fractions of a second.
These latency requirements can’t be fully met by manual operators alone; operations needs automated systems. But if we introduce GenAI into automated operations without the necessary controls, we risk hallucinations in both problem identification and remediation: hallucinations creating serious problems that the GenAI systems maintain are not there, problems so obscure that manual operations staffs will struggle to find the root cause. In our modern networks, one small change in a parameter at one edge of the network can cause serious trouble at another corner, and remediating a security attack that is not there can itself produce new problems.
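One form the "necessary controls" can take is a guardrail between the GenAI and the network: proposed remediations are checked against a pre-approved action set, and high-risk actions are held for human sign-off, so a hallucinated remediation never executes automatically. This is a minimal sketch; the action names, allow-list, and risk rule are all illustrative assumptions.

```python
# Hypothetical guardrail for GenAI-proposed remediations. Actions outside
# the allow-list are refused outright; high-risk actions wait for a human.
ALLOWED_ACTIONS = {"restart_service", "adjust_qos", "rotate_key"}
HIGH_RISK_ACTIONS = {"rotate_key"}  # assumed to always need human sign-off

def guard_remediation(proposal, human_approved=False):
    """Return (apply?, reason) for a GenAI-proposed remediation."""
    action = proposal.get("action")
    if action not in ALLOWED_ACTIONS:
        # Hallucinated or out-of-scope actions never reach the network.
        return False, f"action '{action}' not in allow-list"
    if action in HIGH_RISK_ACTIONS and not human_approved:
        return False, "high-risk action queued for human review"
    return True, "auto-applied"

print(guard_remediation({"action": "restart_service"}))   # low risk: applied
print(guard_remediation({"action": "rotate_key"}))        # held for review
print(guard_remediation({"action": "disable_firewall"}))  # hallucinated: refused
```

The design choice is that the allow-list is maintained by humans, not the GenAI, so a hallucination can at worst delay a fix; it cannot invent a new kind of change to the network.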
We already have issues with Shelf/Ware, that is, software that Boards and senior executives commit to but operations staffs are afraid to install and use. We don’t need more Shelf/Ware. So organization leadership and operations staff have to maintain good lines of communication surrounding GenAI, and all must make themselves aware of the hallucination and cybersecurity problems inherent in the technology. This requires ongoing efforts to upgrade background knowledge in both groups. Doing so is