Qualcomm Unveils AI200 and AI250 – Redefining Rack-Scale Data Center Inference Performance for the AI Era

Qualcomm Technologies, Inc. today announced the launch of its next-generation AI inference-optimized solutions for data centers: the Qualcomm AI200 and AI250 chip-based accelerator cards and racks. Building on the company’s NPU technology leadership, these solutions offer rack-scale performance and superior memory capacity for fast generative AI inference at high performance per dollar per watt, marking a major leap forward in enabling scalable, efficient, and flexible generative AI across industries.

Key Highlights:

  • Qualcomm AI200: A purpose-built rack-level AI inference solution designed to deliver low total cost of ownership (TCO). It supports 768 GB of LPDDR memory per card, enabling exceptional scale for large language models (LLMs) and large multimodal models (LMMs).

  • Qualcomm AI250: Debuts an innovative near-memory computing architecture, providing a generational leap with greater than 10x higher effective memory bandwidth and significantly lower power consumption.

  • Software Ecosystem: Both solutions feature a rich software stack with seamless compatibility for leading AI frameworks and "one-click" deployment of Hugging Face models via the Qualcomm AI Inference Suite (see the deployment sketch following this list).

  • Roadmap: These products are part of a multi-generation data center roadmap with an annual cadence. The AI200 and AI250 are expected to be commercially available in 2026 and 2027, respectively.
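
To make the "one-click" Hugging Face claim concrete: the announcement does not document the Qualcomm AI Inference Suite's API, so the sketch below only shows the starting point such a flow builds on, namely a model referenced by its standard Hugging Face Hub identifier, loaded with the public transformers library. The model id is an arbitrary small example; the suite itself does not appear here.

    # Plain Hugging Face usage only: this demonstrates the Hub model
    # identifier that a "one-click" deployment would start from. How the
    # Qualcomm AI Inference Suite consumes that identifier is not
    # documented in this announcement.
    from transformers import pipeline

    # Any Hub-hosted causal LM works; distilgpt2 keeps the download small.
    generator = pipeline("text-generation", model="distilgpt2")

    result = generator(
        "Near-memory computing speeds up inference because",
        max_new_tokens=40,
    )
    print(result[0]["generated_text"])

In a one-click flow, that same Hub identifier would presumably be the only input the suite needs, with serving and scaling handled by the Qualcomm stack.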

“These innovative new AI infrastructure solutions empower customers to deploy generative AI at unprecedented TCO, while maintaining the flexibility and security modern data centers demand,” said Durga Malladi, Senior Vice President & General Manager, Technology Planning, Edge Solutions & Data Center, Qualcomm Technologies, Inc. “Our rich software stack and open ecosystem support make it easier than ever for developers and enterprises to integrate, manage, and scale already trained AI models.”

Enterprise-Grade AI at Scale

The Qualcomm AI200 and AI250 solutions are designed to address two of the most pressing challenges in the data center today: energy efficiency and memory bottlenecks. By leveraging the Qualcomm Hexagon NPU, these accelerators allow enterprises to run massive models locally with reduced latency and power overhead.

Developers also benefit from Qualcomm Technologies’ Efficient Transformers Library, which simplifies model onboarding and operationalizes AI agents for real-world applications; a hedged usage sketch follows below.
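
As a rough illustration of what model onboarding with the Efficient Transformers Library can look like: the class and method names below are assumptions drawn from Qualcomm's open-source efficient-transformers project, not an API confirmed by this announcement, so treat the sketch as indicative rather than definitive.

    # Hedged sketch: the QEfficient names and signatures below are
    # assumptions based on Qualcomm's open-source efficient-transformers
    # project; they are not confirmed by this announcement.
    from transformers import AutoTokenizer
    from QEfficient import QEFFAutoModelForCausalLM  # assumed import path

    model_id = "gpt2"  # any Hugging Face causal-LM identifier
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # Assumed drop-in replacement for transformers' AutoModelForCausalLM
    # that retargets inference at Qualcomm accelerators.
    model = QEFFAutoModelForCausalLM.from_pretrained(model_id)
    model.compile(num_cores=14)  # ahead-of-time compile (assumed signature)
    model.generate(prompts=["Hello, world"], tokenizer=tokenizer)

The appeal of this pattern, if it holds, is that onboarding reuses the familiar from_pretrained workflow, so an existing Hugging Face model needs no architectural changes before it targets the accelerator.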


Source: Qualcomm media announcement