The promise of AI in healthcare operations is often framed in terms of speed and scale: how many calls can be handled, how quickly, and at what cost. Those things certainly matter. But if efficiency is the only metric, something essential gets lost.

A phone conversation that completes a task but leaves a patient confused, unheard, or uncomfortable hasn’t really succeeded. Healthcare interactions carry inherent stakes. Patients are often navigating complex, stressful situations. The way information is delivered – the warmth, the clarity, the pacing – shapes whether they trust the process and act on what they’ve been told.

This is why, alongside operational metrics, at Infinitus we evaluate our AI conversations on something harder to quantify: the human quality of the interaction itself.

A rubric built around people, not just tasks

We’ve developed an internal scoring framework for assessing conversation quality that goes well beyond whether the right information was collected or delivered. It’s organized around four areas: Professionalism, Empathy, Information Accuracy, and Clinical Safety.

Information accuracy and safety are table stakes, as they must be right every time. But professionalism and empathy are where the human dimension lives, and they’re worth examining closely.

Under professionalism, we ask whether the conversation felt smooth and natural, whether it respected the patient’s time, and whether it ended with real closure, so the person on the other end actually knew what to expect next. These might sound like soft criteria, but they have hard consequences. A conversation that leaves someone uncertain about next steps is a conversation that generates callbacks, confusion, and eroded trust.

An illustration of the rating rubric for professionalism in calls conducted by Infinitus AI agents.

Empathy criteria go even further. We evaluate whether the AI left space for the patient to speak, responded appropriately to what was actually said, and built enough comfort for the patient to engage openly. We look at personalization – whether the conversation reflected what the patient had already shared, rather than treating every interaction as a blank slate.
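To make the shape of such a rubric concrete, here is a minimal sketch in Python. It is not the actual Infinitus scoring framework: the criterion names paraphrase the descriptions above, and the 1-to-5 scale, the simple averaging, and the treatment of accuracy and safety as pass/fail gates are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Criterion names paraphrase the post. The 1-5 scale, the pass/fail gates,
# and the averaging are illustrative assumptions, not the real rubric.
PROFESSIONALISM = ("smooth_and_natural", "respected_time", "clear_closure")
EMPATHY = ("left_space_to_speak", "responded_to_what_was_said",
           "built_comfort", "personalized")


@dataclass
class CallReview:
    """One reviewer's ratings for a single call (hypothetical structure)."""
    ratings: dict          # criterion name -> rating from 1 (poor) to 5 (excellent)
    info_accurate: bool    # assumed gate: must be right every time
    clinically_safe: bool  # assumed gate: must be right every time


def area_score(review, criteria):
    """Average the 1-5 ratings for one rubric area."""
    return sum(review.ratings[name] for name in criteria) / len(criteria)


def summarize(review):
    """Return per-area averages plus whether both gates were met."""
    return {
        "professionalism": area_score(review, PROFESSIONALISM),
        "empathy": area_score(review, EMPATHY),
        "passed_gates": review.info_accurate and review.clinically_safe,
    }


review = CallReview(
    ratings={
        "smooth_and_natural": 4, "respected_time": 5, "clear_closure": 3,
        "left_space_to_speak": 4, "responded_to_what_was_said": 4,
        "built_comfort": 3, "personalized": 5,
    },
    info_accurate=True,
    clinically_safe=True,
)
print(summarize(review))
```

In this sketch, accuracy and safety are kept as gates rather than averaged in, so a warm, smooth call cannot offset a wrong or unsafe one, while professionalism and empathy are graded by degree.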

None of this is easy to build. Picking up on conversational cues, adapting phrasing for a patient who seems confused, and gracefully navigating an interruption are genuinely hard technical problems for an AI. But they’re the right problems to be solving.

Why this matters for healthcare specifically

Healthcare is not a domain where the human element is optional. Patients make decisions based on how they experience interactions. That includes whether they feel respected, whether they understood what they were told, and whether they trust the system they’re engaging with. A voice AI that sounds robotic or indifferent doesn’t just create a bad experience; it can undermine care.

There’s also the matter of the workforce. Healthcare organizations turn to AI not only because of cost but because human staff are stretched, and the most talented people should be spending their time on the interactions that require human judgment, compassion, and expertise. AI can take on volume. But to do that well, it has to meet a standard that reflects the values of the humans it’s working alongside.

Raising the bar

We don’t think the right question is whether AI can replicate human warmth. It can’t, not exactly. But it can be designed to genuinely serve the person on the other end of the call. It can be clear, responsive, respectful, and useful. It can remember what someone said and let that shape the conversation. It can know when to slow down.

The rubric we use to evaluate our conversations is, at its core, a statement of values. It says that professionalism and empathy aren’t nice-to-haves layered on top of a functional system; they’re part of what makes the system work.

We believe that healthcare AI that feels human isn’t a feature. It’s the point.