
LLMs on Commodity Hardware: What, Where, Why AI PCs Are Already Here


exceeding that of LLaMA-2-7B-chat, which is roughly twice its size. Phi-2-2.7B and the tiny-vicuna-1B model show that when a small model is known to produce sufficient inference quality, even CPU-only execution yields adequate inference speed. Even the low-power i7-8550U produces double-digit tokens per second with these small models, and even its worst efficiency rating competes well with that of the Xeon. The comparison is slightly mismatched, since the design points of the laptop and server chips are very different, but it is nonetheless revealing.
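A rough back-of-the-envelope calculation helps show why these small, quantized models fit comfortably on commodity hardware: the memory footprint is dominated by parameter count times bits per weight. The sketch below is illustrative only; the 4-bit setting and the ~10% runtime overhead are assumptions, and actual footprints vary by runtime, quantization scheme, and context length.

```python
# Back-of-the-envelope estimate of quantized-LLM memory footprints.
# Figures are illustrative approximations, not measured values.

def model_footprint_gib(n_params: float, bits_per_weight: float,
                        overhead: float = 1.10) -> float:
    """Approximate RAM needed to hold the weights, in GiB.

    overhead: ~10% extra for runtime buffers and KV cache --
    an assumption; real overhead varies with context length.
    """
    bytes_needed = n_params * bits_per_weight / 8 * overhead
    return bytes_needed / 2**30

# Parameter counts of the models discussed above; 4-bit quantization
# is a common setting in CPU inference runtimes such as llama.cpp.
for name, params in [("LLaMA-2-7B-chat", 7e9),
                     ("Phi-2-2.7B", 2.7e9),
                     ("tiny-vicuna-1B", 1.1e9)]:
    print(f"{name}: ~{model_footprint_gib(params, 4):.1f} GiB at 4-bit")
```

By this estimate the 7B model needs only a few GiB of RAM, and the 1B-class model well under one GiB, which is why even a low-power laptop CPU can hold and run them entirely in memory.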

The main conclusion from this data is that ubiquitous LLM deployment is possible today. Apple has already announced that its next notebooks will be more specialized to provide even better performance. Both AMD and Intel have announced that forthcoming x86 chips will have enhanced AI capabilities. We can expect these developments to be followed by improved chips for smartphones and tablets.

Implications

The implications of end users running LLMs on their own machines are extensive enough to deserve an article of their own. Here are some brief considerations.

Individual users running LLMs on personal computers, with models they can choose and tailor to their specific needs, will have more options than merely going online to a multi-user public GenAI system. Local LLMs also offer significant advantages. On the ethical-use side, running in this mode can avoid IP leakage, and it can allow for safe and ethical cybersecurity testing; both are serious concerns on today's multi-user public systems. These Personal LLMs can also be adapted to the specific needs of the user: improved performance, lower power consumption, models specialized for particular disciplines, and so on.

These Personal LLMs can, however, also remove the guard rails against creating cyber attacks, pornography, social engineering, fake news, etc.

Then there are the implications of embedding LLMs in our networks, infrastructure, appliances, etc. This can have very beneficial effects, but it can also have negative side effects such as hallucinations: the creation of undetected false responses. One of the central questions surrounding LLM deployment concerns so-called alignment: Can LLMs be made to avoid hallucinations, refuse to produce dangerous output, be empathetic to human needs, and so on, so as to avoid these negative side effects? This, again, is a big question that will require significant community effort and resources, and it deserves an article of its own.

One thing that is clear: the social impact of personal LLMs will be significant. The impacts need to be studied and reported on.

Conclusion

GenAI LLMs have been shown to run with acceptable single-user performance on today's laptop computers. Soon-to-be-released laptops will provide even better LLM performance. Smartphone, tablet, and IoT LLM platforms will follow soon after. The implications, both good and bad, are apparent but not yet fully clear. More study and analysis of the likely future path and its consequences are called for.


