What happens when millions of sick people hand their most intimate data to an AI that legally isn’t a doctor, isn’t covered by medical-privacy law, and still can’t tell a heart attack from heartburn? ChatGPT Health, launched in January 2026, invites users to upload labs, imaging, and Apple Watch data while promising not to reuse the chats for model training.
OpenAI frames the tool as an information service, not a diagnostic one. Nate Gross said:
"We train our models specifically to guide patients to health care professionals for diagnosis and treatment. We’re looking to give people information, not tell them if they’re sick, not tell them if they’re healthy."
A triage study released as an accelerated preview in February 2026 found that the system misjudged urgency at both extremes: it under-triaged the most serious physician-written vignettes and over-triaged the least serious ones, a pattern the authors warn could delay life-saving care or inflate unnecessary visits.
Neither ChatGPT Health nor xAI’s Grok is subject to HIPAA: as consumer services, they are neither covered entities nor business associates, so the law’s privacy protections do not apply. When Elon Musk urged X users to upload medical files, Grok itself replied: “Grok is not HIPAA compliant, and we strongly advise against uploading sensitive medical data.”
The Oxford Internet Institute tested large language models with 1,300 participants working through ten physician-drafted vignettes. Given the vignettes directly, the LLMs identified the conditions correctly 95% of the time and the appropriate action 56% of the time.
Once lay users joined the loop, diagnostic accuracy fell to roughly 33% and action accuracy to below 44%, no better than a Google search.
Rebecca Payne said:
"The limiting factor wasn’t just the model’s medical knowledge. It was the human-AI communication loop: people providing incomplete information, the model misinterpreting key details, and, importantly, people failing to carry forward a relevant diagnostic suggestion that the model did raise during the exchange."
Haider Warraich, who directs a new ARPA-H cardiovascular LLM effort, rejects the “AI doctor” label.
"I hate the term AI doctor. There’s a lot more to me than what these technologies can do."
The FDA has not authorized any large language model for autonomous care. ARPA-H’s ADVOCATE program aims to submit a heart-failure LLM for FDA review within two years, but experts stress that more research is needed before such tools change clinical practice.
Danielle Bitterman counsels patients to treat AI output as a conversation starter, not a verdict.
"Don’t take immediate action just based on what you find online. We can discuss it together."
The consensus advice: use these tools only for low-stakes tasks, such as explaining medical terms or preparing for a visit, and always confirm next steps with a clinician.
Source: JAMA
⚠️ LEGAL DISCLAIMER: This article is for informational purposes only. It is not a substitute for professional medical advice, diagnosis, or treatment. Always consult your doctor with any questions about your health.