What People Think of Machines as Doctors

Research Seminars: ZEW Research Seminar

Unveiling the Value of Gen-AI for e-Health

Large language models (LLMs), trained on vast amounts of data, generate human-like text, enabling natural language communication even for tasks demanding expert knowledge. As LLMs increasingly become an alternative to experts, understanding how non-experts perceive and respond to automated responses by machines (in this case, LLMs) is crucial. Framed within the context of patient-physician communication, the authors of the paper presented in this ZEW Research Seminar investigate how non-experts (typical patients) perceive LLM responses versus physician responses and explore the factors influencing their perception. In a survey-based experiment, they compare non-experts’ (survey participants’) evaluations of responses to patient queries from physicians and from ChatGPT (Chat Generative Pre-trained Transformer). Their findings reveal that non-experts overwhelmingly prefer ChatGPT responses over responses by physicians, even when machine responses are of low quality (as judged by a blinded panel of experts). Two key factors influencing this preference emerge from the study: longer prose from ChatGPT heightens non-experts’ preference for machines, while disclosing the response source diminishes this preference, especially when the ChatGPT response quality is lower. The study indicates the need for careful use of LLMs when responding to laypersons, particularly patients, in their search for answers to health-related questions.

Venue