Hippocratic AI Launches as First Safety-Focused Model for Healthcare

Hippocratic AI Launches to Build Safety-Focused Large Language Model for Healthcare

The company’s LLM has passed 100+ healthcare certifications and exceeded the performance of GPT-4 and other commercial models on those same benchmarks. The company has also developed a novel benchmark that measures the bedside manner of large language models, with the aim of protecting the emotional well-being of patients.

Hippocratic AI launched out of stealth to announce the industry’s first safety-focused Large Language Model (LLM) designed specifically for healthcare, as well as a $50M seed round co-led by General Catalyst and Andreessen Horowitz.

Large language models (LLMs) and Foundation Models (FMs) like ChatGPT and GPT-4 have surprised the world with their abilities. While researchers have shown that these AI models can pass the USMLE (US Medical Licensing Exam), no company has built a commercial model specifically tuned for healthcare applications. Hippocratic AI is building the first LLM for Healthcare with an initial focus on non-diagnostic, patient-facing applications. This will allow the company to ensure patient safety while improving healthcare access and outcomes.

"The healthcare industry needs its own AI platform, one that is focused on empowering the workforce, reducing burnout, and improving patient safety and experiences with the healthcare system. We joined forces with the Hippocratic AI team, our health assurance ecosystem, and the a16z team to build this platform. Our goal is to fundamentally increase the supply and scalability of healthcare professionals. This is the key to achieving the health assurance vision: a more proactive, more affordable, and equitable system of care for all," said Hemant Taneja, CEO and Managing Director at General Catalyst.

Hippocratic AI was founded by a group of physicians, hospital administrators, Medicare professionals, and artificial intelligence researchers from El Camino Health, Johns Hopkins, Washington University in St. Louis, Stanford, UPenn, Google, and Nvidia.

"After working with Munjal and team for years in his prior company, we know that his lived experience as a healthcare and tech operator gives him an edge in understanding what it takes to bring high-ROI products to market - especially at a time when existing industry players are in such dire need of better operating leverage and financial sustainability. We believe Hippocratic AI’s cross-disciplinary, safety-first approach is what the healthcare industry needs to be able to maintain trust in the power of responsible deployment of generative AI solutions," said Julie Yoo, General Partner at Andreessen Horowitz.

To build a safer large language model, the company has focused on three main areas: certification, RLHF via healthcare professionals, and bedside manner.

Certification

Passing the USMLE is not enough to ensure a model is ready for the wide variety of healthcare roles that exist in care and payor settings. Therefore, Hippocratic AI tested its model on 114 healthcare certifications and roles. The company aimed not just to achieve a passing score but to outperform existing state-of-the-art language models such as GPT-4 and other commercially available models. The company outperformed GPT-4 on 105 of the 114 tests and certifications, by 5% or more on 74 of them, and by 10% or more on 43 of them. Below are some sample results. Full results here: (www.HippocraticAI.com/benchmarks)

 

| Name | Commercial LLM #1 | Commercial LLM #2 | GPT-4 | Hippocratic | Δ Improvement vs Best Competitor |
|---|---|---|---|---|---|
| NAPLEX (North American Pharmacist Licensure Examination) | 51.0% | 0.0% | 70.9% | 91.1% | 20.2% |
| NCLEX-RN (Registered Nurse) | 58.8% | 25.8% | 76.2% | 88.6% | 12.4% |
| CPNP-AC (Acute Care Certified Pediatric NP) | 64.0% | 22.0% | 86.7% | 96.0% | 9.3% |
| CPC (Certified Professional Coder) | 54.7% | 50.0% | 65.3% | 71.0% | 5.7% |
| ABOG (American Board of Obstetrics and Gynecology Licensing Exam) | 44.00% | 24.00% | 80.30% | 92.33% | 12.03% |
| ABU (American Board of Urology - Licensing Exam) | 42.09% | 24.24% | 67.30% | 77.10% | 9.80% |
| Hospital Safety Training (Hospital Safety Training Compliance Quiz) | 39.4% | 27.3% | 48.5% | 72.7% | 24.2% |
| RD (Registered Dietician) | 57.1% | 46.9% | 71.4% | 83.7% | 12.3% |
| CLC (Certified Lactation Consultant) | 60.9% | 51.7% | 79.3% | 98.9% | 19.6% |
| CPCO (Certified Professional Compliance Officer) | 60.7% | 54.0% | 67.3% | 86.0% | 18.7% |

RLHF with Healthcare Professionals

Hippocratic AI has decided that the best people to determine whether an LLM is ready for deployment in a given healthcare role are the experts who serve in that role in today’s system. Large language models can be shaped using human feedback through a technique called Reinforcement Learning from Human Feedback (RLHF). Many believe this technique is what led to the remarkable performance of ChatGPT compared to that of prior versions of OpenAI’s language models.

In building Hippocratic AI, the company has engaged healthcare professionals to help guide and train the LLM by rating its responses.
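
As a rough illustration of how ratings can feed RLHF in general, one common first step is to turn rated responses into preference pairs (a preferred and a rejected response for the same prompt), which a reward model can then be trained on. The Python sketch below shows only that data-preparation step. It is a minimal sketch under stated assumptions and not a description of Hippocratic AI's actual pipeline: the `RatedResponse` type, the 1-5 rating scale, and the example prompt and responses are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class RatedResponse:
    prompt: str
    response: str
    rating: int  # hypothetical 1-5 score assigned by a healthcare professional


def preference_pairs(rated):
    """Return (prompt, preferred, rejected) tuples for responses to the same prompt.

    Pairs with equal ratings are skipped because they carry no preference signal.
    """
    by_prompt = {}
    for item in rated:
        by_prompt.setdefault(item.prompt, []).append(item)

    pairs = []
    for prompt, items in by_prompt.items():
        for a, b in combinations(items, 2):
            if a.rating == b.rating:
                continue
            preferred, rejected = (a, b) if a.rating > b.rating else (b, a)
            pairs.append((prompt, preferred.response, rejected.response))
    return pairs


# Hypothetical example data, invented purely for illustration.
ratings = [
    RatedResponse("When should I take my medication?", "Response A", 5),
    RatedResponse("When should I take my medication?", "Response B", 2),
    RatedResponse("When should I take my medication?", "Response C", 2),
]

pairs = preference_pairs(ratings)
for prompt, preferred, rejected in pairs:
    print(f"{preferred} preferred over {rejected}")
```

Here Response A is preferred over both B and C, while B and C are tied and produce no pair.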

“RLHF with healthcare professionals isn’t just a feature but is really our commitment to partner deeply with the industry,” said Munjal Shah, Co-Founder and CEO of Hippocratic AI. “We aren’t just saying these professions will help us evaluate our system. We are saying we won’t launch each unique role for the LLM unless the professionals who do that exact task today agree the system is ready and safe.”

Some of the roles and tasks the company is exploring include patient navigator, dietician, genetic counselor, enrollment specialist, medication reminders, and more.

Bedside Manner

“In healthcare settings, it isn’t just important to answer the patient accurately. It is equally important that it is done with great bedside manner. Many studies have shown that bedside manner impacts emotional well-being and quality of outcomes. This isn’t just true for doctors but also true for everyone interacting with patients: billing agents, schedulers, and more,” said Meenesh Bhimani MD, Co-Founder and Chief Medical Officer of Hippocratic AI.

To date, there are no benchmarks for evaluating the bedside manner of a language model when it interacts with patients. Hippocratic AI will be releasing the first of many bedside manner benchmarks for the entire community to use. Below are the initial results the company has achieved against these benchmarks.

| Name | Commercial LLM #1 | GPT-4 | Hippocratic | Δ Improvement vs Best Competitor |
|---|---|---|---|---|
| Shows Empathy | 30.0% | 68.3% | 75.0% | 6.7% |
| Shows care and compassion | 43.3% | 75.0% | 85.0% | 10.0% |
| Making Patient feel at ease | 5.0% | 29.2% | 57.5% | 28.3% |
| Taking a personal interest in patient’s life | 33.3% | 63.3% | 70.0% | 6.7% |
| Helps patient take control | 35.0% | 61.7% | 65.0% | 3.3% |


Hippocratic AI will use language models to massively increase healthcare access, reduce costs, and close the healthcare skills gap left behind by the global pandemic. Large language models are one of the best new ways to achieve this, but they have to be deployed safely and tuned for the healthcare industry.

Source: Hippocratic AI media announcement
