CHAI 2025: Where Healthcare AI Moved from Promise to Practice

By Abdus-Salaam Muwwakkil, Chief Executive Officer · 15 min read
CHAI 2025 Conference at Stanford: Healthcare AI's Inflection Point


How CHAI’s Annual Conference Revealed an Industry at Its Inflection Point

The Moment Everything Changed

Picture this: A patient receives an HIV test result on their phone. The medical terminology is confusing—“4th generation test” instead of the familiar finger stick. They message their provider through the portal and wait. One minute passes. Then two. Frustrated, they click the AI chat option instead.

Over the next fifteen minutes, they engage in a detailed conversation with an AI agent that explains window periods, test accuracy, and even reviews their testing history. When the human provider finally responds 45 minutes later with “Yeah, that’s our new test. Don’t worry about it,” the patient simply replies: “Thanks, I already got my answer from the AI.”

This real-world example, shared at the Coalition for Health AI’s 2025 conference at Stanford, perfectly captures where healthcare finds itself today—at the intersection of overwhelming demand, technological capability, and human limitation. As David Entwistle, Stanford Health Care’s CEO, told the packed auditorium, “If you asked a room of healthcare CEOs six years ago what it would take to get 20% productivity improvement, they’d say impossible. Today with AI, we might actually achieve that.”

From Hallway Experiment to Hospital Reality

The CHAI conference has become healthcare AI’s most important gathering, and this year’s event marked a profound shift. Gone were the tentative discussions about whether AI belonged in healthcare. Instead, walking the halls of Stanford, you’d hear health system executives comparing deployment strategies, startup founders sharing outcome metrics, and clinicians debating workflow integration.

The numbers tell the story. According to data shared at the conference, 66% of physicians are now using AI tools. Major health systems report thousands of patient interactions daily through AI agents. And perhaps most tellingly, procurement conversations have shifted from “What is AI?” to “How quickly can we deploy this?”—a transformation that Michael Blumenthal from Hiro noted took just six years.

But numbers only capture part of the narrative. The real story emerged in the conference rooms and corridors where healthcare’s transformation is being negotiated, one workflow at a time.

The NIH Makes Its Move

When Dr. Christopher Muller, Principal Deputy Director of the NIH, took the stage, the room fell silent. The NIH funds 87% of global biomedical research—when it speaks, the world listens. And what it announced was nothing short of revolutionary: the first comprehensive NIH AI strategic plan.

“The kind of science that’s going to move the fields forward is no longer one scientist sitting in their lab by themselves doing experiments,” Muller explained. The vision he outlined wasn’t just about funding AI research. It was about fundamentally reimagining how medical discovery happens in an age where, as he put it, “the data we’re generating is really beyond the ability of one individual or a small number of individuals to look at and make all the connections.”

The strategic plan, backed by executive orders positioning AI as essential for “human flourishing, economic competitiveness and national security,” signals a sea change in how the nation’s premier research institution views artificial intelligence. They’re not just funding AI projects—they’re building the infrastructure for AI-powered discovery.

What makes this particularly significant is the NIH’s focus on reliability and reproducibility. As Muller stressed, “We have to be able to feed reliable data into them, otherwise what they’re generating is not going to be reliable.” This emphasis on quality over quantity represents a mature approach to AI adoption, one that acknowledges both its transformative potential and its current limitations.

The Trust Revolution

Perhaps the most surprising theme of the conference was the enthusiasm for governance—not typically a word that generates excitement in Silicon Valley. Yet presenter after presenter emphasized that governance frameworks aren’t barriers to innovation; they’re enablers of it.

Barry Stein from Hartford Healthcare captured this perfectly with his Formula One analogy: “Drivers can only go around the track at 100 miles an hour because they’ve got good brakes.” Hartford has built six core capabilities over a decade to safely accelerate AI adoption, treating each new AI tool like “a new drug or angioplasty balloon” that requires rigorous evaluation before clinical use.

This governance-first approach is paying dividends. CHAI announced five new Quality Assurance Resource Providers joining BeeKeeperAI—Alignment AI, Signal 1, Lens AI, Click, and Ferrum—creating an ecosystem for independent AI validation. The first 25 solutions to enroll in CHAI’s new model-card registry will receive free independent verification, turning transparency into a market incentive. The message was clear: in healthcare, “move fast and break things” isn’t just inappropriate—it’s potentially lethal.

The startups get it too. In a revealing panel discussion, Sam Varma from Healthvana admitted something you rarely hear in Silicon Valley: “We are actually really excited about governance.” Why? Because in a market where anyone can “take ChatGPT, scrub off the letters and write ‘my cool health app’ on top,” governance frameworks help serious companies differentiate themselves from what he called “convincing fake products.”

The Race for Real-World Impact

Stanford’s Dr. Nigam Shah presented sobering data about the current state of medical AI research. His team’s analysis found that 95% of healthcare AI papers used no actual electronic health record data. The top “healthcare task” being evaluated? Taking medical licensing exams. “Pretty sorry state of affairs,” he concluded.

This disconnect between research and reality motivated Shah’s team to create MedHELM, a comprehensive evaluation framework for medical language models. With 35 benchmarks across 121 clinical tasks, it’s the most ambitious attempt yet to measure what actually matters in healthcare AI.

The results are illuminating. While AI excels at generating clinical documentation, it struggles with administrative tasks like billing and coding—ironically, some of the most pressing pain points for health systems. The framework costs $11,700 to run completely, but Shah is giving it away free through CHAI, hoping to crowdsource the thousands of private datasets needed to truly validate these models.

Agents Take the Stage

If there was a breakout star of the conference, it was agentic AI—autonomous systems that can take actions on behalf of users. The demonstrations were impressive: Hiro’s AI agent handling complex appointment rescheduling while checking multiple systems and applying business logic. BrainHi serving half the population of Puerto Rico with AI receptionists. Healthvana showing those remarkable patient engagement statistics.

But what struck attendees wasn’t just the technology—it was the scale of deployment. These aren’t pilots anymore. Intermountain Health’s Mona Bassett captured the moment perfectly: “We have too many ideas now.” The challenge has shifted from finding AI use cases to managing the flood of possibilities.

Recognizing this new reality, CHAI announced the formation of an Agentic AI Working Group launching this summer to draft safety guidelines for semi-autonomous care systems. The group will tackle thorny questions about autonomy boundaries, human oversight requirements, and fail-safe mechanisms.

The top five use cases have crystallized around scheduling, prescription refills, general FAQs, call routing, and billing inquiries. Not the sexiest applications, perhaps, but they address real pain points that affect millions of interactions daily. As Emmanuel Okwuendo from BrainHi noted, these systems can help patients find specialty care “in minutes rather than weeks or months.”

The Ambient Revolution Grows Up

The ambient documentation market has matured remarkably. What began as simple transcription services has evolved into sophisticated clinical partners. The panel featuring leaders from Suki, Nuance, Abridge, and Ambience revealed an industry moving beyond competing on accuracy to differentiating on workflow integration and user experience.

“We’re very early,” cautioned Ed Lee from Nuance. “If you’re talking about truly ambient technology using our five senses to help clinicians, all we’ve done is take speech data and EHR data and put it together into essentially a billing artifact.” The vision extends far beyond current capabilities—real-time decision support, multi-modal inputs, predictive documentation.

Yet even today’s “basic” capabilities are transforming clinical practice. As one panelist noted, these tools don’t just save time—“they make people’s work lives more sustainable and let them go home and recharge.”

The Foundation Model Dilemma

How do you choose between OpenAI, Anthropic, Google, and others when new models drop weekly? The foundation model panel tackled this head-on, revealing strategies that would surprise those expecting brand loyalty.

“Test every model empirically for each use case,” advised Zubair from Anthropic, adding that “models cannot forecast in advance how they might perform.” The cost variations are staggering—panelists reported 10 to 1,000x differences for similar performance depending on the use case.

Stanford’s approach, shared by Kathleen, was pragmatic: “The model itself beyond a certain threshold for performance doesn’t matter that much. It has to be good enough and easy for us to access.” They’ve standardized on Azure OpenAI not because it’s always best, but because it enables rapid experimentation.

The consensus? Healthcare organizations will operate in a multi-model world, optimizing different models for different tasks. The key is building the organizational capability to rapidly test and deploy, not picking a winner.
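The "test every model empirically for each use case" advice above can be sketched as a small evaluation harness. This is a minimal, hypothetical illustration—the model callables, cost figures, and scoring function are placeholders, not any vendor's real API—showing how a team might pick the cheapest model that clears a quality threshold for a given task:

```python
from statistics import mean

def evaluate(models, test_cases, score):
    """Run each candidate model over the same task-specific test set.

    models: name -> (callable taking a prompt, estimated cost per call)
    test_cases: list of {"prompt": ..., "expected": ...}
    score: function (output, expected) -> float in [0, 1]
    """
    results = {}
    for name, (ask, cost_per_call) in models.items():
        scores = [score(ask(case["prompt"]), case["expected"])
                  for case in test_cases]
        results[name] = {
            "mean_score": mean(scores),
            "est_cost": cost_per_call * len(test_cases),
        }
    return results

# Toy stand-ins for two models with very different per-call costs
# (mirroring the 10-1,000x cost spreads panelists reported).
models = {
    "model_a": (lambda p: p.upper(), 0.01),
    "model_b": (lambda p: p.upper(), 10.00),  # ~1,000x the cost, same answers
}
test_cases = [{"prompt": "refill rx", "expected": "REFILL RX"}]
exact_match = lambda out, want: 1.0 if out == want else 0.0

report = evaluate(models, test_cases, exact_match)

# "Good enough and easy to access" in practice: among models clearing the
# quality bar, take the cheapest.
best = min(
    (m for m, r in report.items() if r["mean_score"] >= 0.9),
    key=lambda m: report[m]["est_cost"],
)
```

The harness, not any particular model choice, is the durable asset: when a new model drops, it becomes one more entry in `models` rather than a procurement crisis.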

The Global Expansion

CHAI’s announcement of Singapore as its first international chapter signals recognition that healthcare AI can’t be solved by any one country. As Brian Anderson, CHAI’s leader, explained, they’re not building “high-level international standards” but rather taking technical frameworks and applying them to specific communities with “appropriate adherence to local regulations, priorities, and values.” The Singapore chapter will adapt CHAI frameworks to Asia-Pacific regulations starting in Q3 2025.

The Q1 2026 global summit co-hosted with Spain and the OECD represents the next phase—moving from American-centric standards to truly global frameworks. This matters because healthcare’s challenges are universal, even if the solutions need localization.

Closer to home, CHAI is partnering with the National Association of Community Health Centers to create governance guides specifically tailored to resource-constrained environments. These playbooks will help safety-net clinics adopt AI responsibly without requiring enterprise-level resources.

What This Means for Healthcare Leaders

After three days of presentations, panels, and corridor conversations, several imperatives emerged for healthcare executives:

Start with problems, not technology. Barry Stein mentioned this principle six times in his presentation, and for good reason. The successful implementations all began with clearly defined pain points, not exciting technology looking for applications.

Governance is your competitive advantage. In a world where base AI capabilities are commoditizing, your ability to safely and reliably deploy AI becomes the differentiator. Build those muscles now.

Think portfolio, not projects. The leaders are managing AI as a portfolio of initiatives, not one-off projects. This requires new organizational capabilities and governance structures.

Invest in education at scale. Hartford’s partnership with MIT to educate everyone from ward clerks to executives recognizes a fundamental truth: AI transformation requires widespread literacy, not just expert knowledge.

Prepare for the talent war. As one lunch conversation revealed, “Those really passionate about healthcare tend to outlast” others in this space. Build teams that combine technical expertise with healthcare commitment.

Workforce Education at Scale

The AI transformation demands more than just technical deployment—it requires comprehensive workforce readiness. Hartford Healthcare’s partnership with MIT has become the model, training everyone from ward clerks to cardiac surgeons on AI fundamentals. This isn’t about creating AI specialists; it’s about ensuring every healthcare worker understands how to work alongside AI systems.

CHAI is scaling this approach nationally. A new nursing-specific curriculum developed with the American Nurses Association addresses the unique challenges nurses face as AI enters bedside care. Specialty societies in cardiology, radiology, and pathology are creating discipline-specific guideline tracks, recognizing that AI implementation varies dramatically by clinical context.

Most importantly, CHAI announced a patient-facing literacy program with the National Health Council. The initiative helps patients understand what it means when an AI system participates in their care—including how to read a model card before consenting to AI-assisted treatment. As one panelist emphasized, “Education, education, education—which is governance.”

Solving the Alignment Problem

Perhaps the most philosophical yet practical challenge emerged from Mercy Health’s announcement of their Value-Alignment Task Force, developed in partnership with faith-based bioethicists. The question they’re tackling is deceptively simple: When AI systems disagree with human clinicians, who decides what constitutes the “right” answer?

Moderator Patrick VN posed the uncomfortable reality: “We can have five specialists look at the same case and get five different opinions. Now we’re adding AI as a sixth voice. But unlike human disagreement, AI operates at scale. When it’s wrong, it’s wrong thousands of times.”

The task force is developing frameworks for encoding institutional values into AI systems—not just clinical accuracy, but ethical priorities around end-of-life care, resource allocation, and patient autonomy. It’s an acknowledgment that healthcare AI isn’t just a technical challenge; it’s fundamentally about values, and different communities may reasonably want their AI systems to reflect different priorities.

The View from 2030

During the foundation model panel, moderator Sue Cho asked panelists to envision healthcare in 2030. The responses were telling. Graham from Kaiser worried that despite AI adoption, “the staffing shortage will get way worse” due to demographics. Others saw administrative roles transformed while clinical positions evolved rather than disappeared.

But perhaps the most provocative vision came from Graham’s twenty-year outlook: “Will we even need health systems?” It’s a question that would have seemed absurd just five years ago. Today, with AI agents handling routine interactions and patients increasingly comfortable with AI-mediated care, it deserves serious consideration.

The Road Ahead

As attendees departed Stanford, the mood was distinctly different from previous years. The question has shifted from “if” to “how fast.” The experimenters have become implementers. The governance skeptics have become framework advocates.

Yet challenges remain formidable. The pace mismatch between healthcare’s cautious culture and AI’s exponential advancement creates constant tension. Shadow AI usage threatens patient privacy. The workforce fears displacement even as shortages worsen. And underneath it all runs the fundamental question: Who decides what constitutes “good” medical AI when even specialists disagree?

What’s clear is that healthcare AI has reached escape velocity. The combination of technological capability, economic pressure, and demonstrated outcomes has created irreversible momentum. As one attendee observed during lunch, “It’s nice to be up here with a bunch of people just talking innovation like it’s a regular day in the office.”

That normalization—from exotic experiment to operational necessity—may be CHAI 2025’s most important signal. Healthcare AI isn’t coming. It’s here. The only question now is how quickly and responsibly we can scale it to meet healthcare’s mounting challenges.

For organizations like OrbDoc, focused on bringing AI’s benefits to rural hospitals, the opportunity has never been clearer. As the market matures, success will come not from the most sophisticated AI, but from the most thoughtful implementation. In healthcare’s AI revolution, the winners will be those who remember that behind every algorithm is a human needing care.

The plane is indeed being built while flying, as Brian Anderson memorably put it. But after CHAI 2025, at least we know where we’re headed. And for the first time, we have the navigation tools to get there safely.