MosaicML Enables Cost-efficient Deployment of Generative AI Models

MosaicML Launches Inference API and Foundation Series for Generative AI; Leading Open Source GPT Models, Enterprise-Grade Privacy and 15x Cost Savings

The offering allows organizations including Replit, Stanford, and Twelve Labs to deploy custom generative AI models with unprecedented cost efficiency, quality, and scale

MosaicML, the leading Generative AI infrastructure provider, announced MosaicML Inference and its Foundation Series of models for enterprises to build on. This new offering allows developers to quickly, easily, and affordably deploy Generative AI models for 15x less than comparable services. With the addition of inference capabilities, MosaicML now offers a complete, end-to-end solution for Generative AI training and deployment at the most efficient cost available today.

Generative AI models have quickly become a catalyst for innovation across industries from healthcare to financial services to e-commerce. However, off-the-shelf models have well-documented issues around data security, model transparency, and availability. Access to the alternative—custom Generative AI models—has been limited, until now.

“We believe that MosaicML Inference is a game-changer for Generative AI. It radically reduces the cost of serving large models and enables enterprises to do so in their own secure environments. Together with the MosaicML Foundation Series, enterprises now have more capabilities than ever before to achieve their own state-of-the-art AI without concerns about cost, scale, and security.” – Naveen Rao, CEO

Organizations are Building Custom LLMs on MosaicML

Today, organizations including Replit, Stanford, and Twelve Labs are building their own custom LLMs and VLMs on MosaicML for the control, privacy, and cost efficiency it affords. MosaicML customers have found that smaller models trained on their own domain-specific data outperform large generic models such as GPT-3.5, the original model behind ChatGPT.

“Using the MosaicML platform, we were able to train and deploy our Ghostwriter 2.7B LLM for code generation with our own data within a week and achieve leading results.” – Amjad Masad, CEO, Replit

MosaicML Inference Curates the Best Open Source Models

MosaicML Inference delivers maximum flexibility and choice for developers who want to add Generative AI to their applications. Developers can deploy their own custom LLMs or choose from a curated selection of the best open source LLMs available today, including the MosaicML Foundation Series of models, Instructor-XL, Dolly, and GPT-NeoX. The cost and time advantages of MosaicML Inference come from efficient ML systems engineering and optimizations that make it practical to serve smaller, lightweight, domain-specific models.

MosaicML Inference offers two tiers for Generative AI developers to get started easily with their model deployments:

  1. Starter Tier: Open source models curated and hosted by MosaicML, offered as API endpoints so developers can quickly add Generative AI to their applications (see the sketch after this list).
  2. Enterprise Tier: Custom models developed by enterprises to address specific business use cases. Models and data remain fully secured in the customer’s enterprise environment.
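
For illustration, here is a minimal sketch of what calling a hosted Starter Tier endpoint might look like. The endpoint URL, model identifier, and request schema below are assumptions for illustration only, not MosaicML's documented API; substitute the values from the actual MosaicML Inference documentation.

    import os
    import requests

    # Hypothetical endpoint and payload shape -- illustrative only, not
    # MosaicML's documented API.
    ENDPOINT = "https://models.example-inference-host.com/v1/completions"

    response = requests.post(
        ENDPOINT,
        headers={"Authorization": f"Bearer {os.environ['INFERENCE_API_KEY']}"},
        json={
            "model": "example-foundation-7b",  # assumed model identifier
            "prompt": "Summarize the benefits of domain-specific LLMs:",
            "max_tokens": 128,
        },
        timeout=30,
    )
    response.raise_for_status()
    print(response.json())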

MosaicML Foundation Series

The MosaicML Foundation Series is a set of pre-trained GPT-style models for customers to fine-tune and deploy. The LLMs in this series are in many cases higher performing than comparable open source models, with unique capabilities that go beyond GPT-4. The first set of models in the series will be open-sourced to the community starting this week.

MosaicML Inference Delivers Privacy & Control

According to a recent KPMG study of 225 US executives, while two-thirds of executives believe that Generative AI will have a major impact on their business, nearly the same percentage say they are still one or two years away from deploying it extensively in their operations. Two of the main reasons: concerns about cybersecurity (81%) and data privacy (78%).

Beyond cost efficiency, MosaicML Inference lets organizations develop and deploy their own generative AI models with complete data privacy and control. Developers can deploy on a secure cluster hosted by MosaicML or on their infrastructure of choice, such as AWS, Oracle Cloud Infrastructure, or GCP. A saved model checkpoint can be turned into a secure, inexpensive API hosted within the developer's own virtual private cloud (VPC) environment in under a minute, and inference data never leaves the secured environment of the user's infrastructure. MosaicML Inference also offers continuous monitoring of cluster and model metrics for enterprise-grade DevOps, giving complete transparency into model behavior.
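
As a rough sketch of the checkpoint-to-endpoint pattern described above (not MosaicML's actual deployment tooling), the snippet below wraps a saved, Hugging Face-compatible checkpoint in a small FastAPI service that could run inside a private VPC. The checkpoint path and route name are placeholders.

    # Generic sketch of serving a saved checkpoint behind a private API.
    # NOT MosaicML's deployment tooling; it only illustrates the pattern
    # using FastAPI and Hugging Face Transformers.
    from fastapi import FastAPI
    from pydantic import BaseModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    CHECKPOINT_DIR = "/models/my-finetuned-llm"  # placeholder checkpoint path

    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT_DIR)
    model = AutoModelForCausalLM.from_pretrained(CHECKPOINT_DIR)

    app = FastAPI()

    class GenerateRequest(BaseModel):
        prompt: str
        max_new_tokens: int = 128

    @app.post("/generate")
    def generate(req: GenerateRequest):
        inputs = tokenizer(req.prompt, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=req.max_new_tokens)
        return {"completion": tokenizer.decode(output_ids[0], skip_special_tokens=True)}

Served with a standard ASGI server (for example, uvicorn) inside the VPC, requests and completions never leave the private network.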

Source: MosaicML media announcement
