AI Daily Brief
Healthcare
Latest Intelligence
The latest AI stories, analysis and developments relevant to Healthcare — curated daily by Best Practice AI.
Use Cases for Healthcare
200 articles
New AI model spots pancreatic cancer up to 3 years earlier than human doctors in test
A new AI diagnostic model has demonstrated the ability to detect pancreatic cancer significantly earlier than traditional human screening.
Artera Launches AI Service Squads for Tailored Healthcare Solutions
Artera introduced AI Service Squads to integrate custom AI solutions within healthcare providers' operations, enhancing both front and back-office tasks.
ADAPTS: Agentic Decomposition for Automated Protocol-agnostic Tracking of Symptoms
arXiv:2605.03212v2 Announce Type: new Abstract: Modeling latent clinical constructs from unconstrained clinical interactions is a unique challenge in affective computing. We present ADAPTS (Agentic Decomposition for Automated Protocol-agnostic Tracking of Symptoms), a framework for automated rating of depression and anxiety severity using a mixture-of-agents LLM architecture. This approach decomposes long-form clinical interviews into symptom-specific reasoning tasks, producing auditable justifications while preserving temporal and speaker alignment. Generalization was evaluated across two independent datasets (N = 204) with distinct interview structures. On high-discrepancy interviews, automated ratings approximated expert benchmarks (absolute error = 22) more closely than original human ratings (absolute error = 26). Implementing an "extended" protocol that incorporates qualitative clinical conventions significantly stabilized ratings, with absolute agreement reaching ICC(2,1) = 0.877. These findings suggest that the ADAPTS framework enables promising evaluations of psychiatric severity. While the current implementation is purely text-based, the underlying architecture is readily extensible to multimodal inputs, including acoustic and visual features. By approximating expert-level precision in a protocol-agnostic manner, this framework provides a foundation for objective and scalable psychiatric assessment, especially in resource-limited settings.
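The agreement statistic the abstract reports, ICC(2,1), is the Shrout-Fleiss two-way random-effects, absolute-agreement, single-rater intraclass correlation. A minimal stdlib sketch of that standard computation (illustrative, not code from the paper; the input layout is an assumption):

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one per subject, each holding one score
    per rater (assumed layout for this sketch).
    """
    n = len(ratings)        # subjects
    k = len(ratings[0])     # raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

With two raters in perfect agreement the statistic is exactly 1.0, and it shrinks as rater disagreement grows relative to between-subject spread.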
Are Multimodal LLMs Ready for Clinical Dermatology? A Real-World Evaluation in Dermatology
arXiv:2605.04098v1 Announce Type: cross Abstract: Multimodal large language models (MLLMs) have demonstrated promise on publicly available dermatology benchmarks. However, benchmark performance may not generalize to real-world dermatologic decision-making. To quantify this benchmark-to-bedside gap, we evaluated four open-weight MLLMs (InternVL-Chat v1.5, LLaVA-Med v1.5, SkinGPT4 and MedGemma-4B-Instruct) and one commercial MLLM (GPT-4.1) across three publicly available dermatology datasets and a retrospective multi-site hospital-based dermatology consultation cohort comprising 5,811 cases and 46,405 clinical images. Models were evaluated on two clinically relevant tasks: differential diagnosis generation and severity-based triage. Diagnostic performance was modest on public datasets and declined substantially in the real-world cohort. On public benchmarks, top-3 diagnostic accuracy reached 26.55% for the best open-weight model and 42.25% for GPT-4.1. On real-world consultation cases using images alone, top-3 diagnostic accuracy fell to 1.50%-13.35% among open-weight models and 24.65% for GPT-4.1. Incorporating clinical context improved performance across all models, increasing top-3 diagnostic accuracy up to 28.75% among open-weight models and 38.93% for GPT-4.1. However, model outputs were highly sensitive to incomplete or erroneous consultation context. For severity-based triage, models achieved moderate sensitivity (above 60%), suggesting potential utility for screening but insufficient reliability for clinical deployment. These findings demonstrate that benchmark performance substantially overestimates the real-world clinical capability of current dermatology MLLMs.
AI and Suicide Prevention: A Cross-Sector Primer
arXiv:2605.04321v1 Announce Type: new Abstract: AI chatbots already function as de facto mental health support tools for millions of people, including people in crisis. Yet, they lack the clinical validation, shared standards, and coordinated oversight that their societal role demands. This primer was developed in conjunction with a multistakeholder workshop hosted by Partnership on AI in 2026, convening AI labs, mental health practitioners, people with lived experience, and policymakers, to provide a common cross-sector reference point for the current state of the field of AI and suicide prevention. It begins with an overview of clinical best practices, then turns to how frontier AI systems (as of winter 2026) detect and respond to suicide and non-suicidal self-injury (NSSI) queries. Together, these provide insight into what it would take to design and implement AI tools that not only better prevent suicide and NSSI, but also promote overall well-being. Drawing on clinical literature, publicly available AI lab policies, an emerging landscape of evaluation frameworks, and conversations with leaders across the AI and mental health fields, we map challenges posed by general-purpose AI chatbots for mental health across model, product, and policy layers, ultimately highlighting priority areas where cross-industry alignment is both urgently needed and achievable.
Evaluating Patient Safety Risks in Generative AI: Development and Validation of a FMECA Framework for Generated Clinical Content
arXiv:2605.04085v1 Announce Type: new Abstract: Objectives: Large language models (LLMs) are increasingly used for clinical text summarization, yet structured methods to assess associated patient safety risks remain limited. Failure Mode, Effects, and Criticality Analysis (FMECA) provides a proactive framework for systematic risk identification but has not been adapted to LLM-generated clinical content. This study aimed to develop and validate a novel FMECA framework for the prospective assessment of patient safety risks in LLM-generated clinical summaries. Materials and Methods: An interdisciplinary expert panel (n = 8) developed a taxonomy of failure modes through literature review and brainstorming. Standard FMECA dimensions (occurrence, severity, detectability) were adapted into 5-point ordinal scales. The framework was applied to 36 discharge summaries from four patients, generated by an open LLM (GPT-OSS 120B) using real-world clinical data from the Geneva University Hospitals. Reviewers independently annotated the summaries across two rounds. Inter-rater reliability was assessed at failure mode, severity and detectability score levels. Usability and content validity were evaluated using an adapted System Usability Scale and structured feedback. Results: The final framework comprised 14 failure modes organized into categories. Inter-rater agreement improved between rounds, reaching moderate-to-substantial agreement for failure mode identification and good agreement for severity and detectability scoring. Usability was rated as good (mean SUS: 79.2/100), with high evaluator confidence. Discussion and Conclusion: This study presents the first FMECA-based framework for systematic patient safety risk assessment of LLM-generated clinical summaries. The framework provides a structured and reproducible method for identifying clinically relevant risks caused by these summaries.
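The adapted FMECA dimensions (occurrence, severity, detectability, each on a 5-point ordinal scale) are conventionally combined into a risk priority number for ranking failure modes. The abstract does not state the paper's exact aggregation; the sketch below uses the classic multiplicative RPN, and the failure-mode names and scores are purely hypothetical:

```python
def risk_priority_number(occurrence, severity, detectability):
    """Classic FMEA-style RPN on 5-point ordinal scales (range 1-125).

    A higher detectability score is assumed to mean the failure is
    *harder* to detect, so it raises the risk, as in conventional FMEA.
    """
    for score in (occurrence, severity, detectability):
        if not 1 <= score <= 5:
            raise ValueError("scores must be on the 1-5 ordinal scale")
    return occurrence * severity * detectability

# Rank hypothetical failure modes of an LLM-generated discharge summary.
failure_modes = {
    "omitted medication change": (3, 5, 4),
    "wrong dosage unit": (2, 5, 3),
    "stylistic redundancy": (4, 1, 1),
}
ranked = sorted(failure_modes.items(),
                key=lambda kv: risk_priority_number(*kv[1]), reverse=True)
```

The multiplicative form makes a frequent but harmless failure (4, 1, 1) rank far below a rare but severe, hard-to-detect one (3, 5, 4), which is the behavior a safety review wants.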
Roche to Buy PathAI for Up to $1.05 Billion to Bolster AI Diagnostics Tools
The deal seeks to bolster the artificial-intelligence offerings of Roche’s diagnostics division and to help accelerate clinical-therapy development.
To Use AI as Dice of Possibilities with Timing Computation
arXiv:2605.01134v1 Announce Type: new Abstract: The dominant noun-based modeling paradigm has fundamentally constrained AI development, precluding any adequate representation of the future as an open temporal dimension. This paper introduces a verb-based paradigm, together with precise definitions of "timing computation" and "possibility", that enables AI to function as an effective instrument for realizing the grammar of our thought. Applied to longitudinal EHR data from 3,276 breast cancer patients, the framework empirically demonstrates: (1) automatic discovery of clinically significant patient trajectories, and (2) counterfactual timing deduction. Both results are purely data-driven, require no prior domain knowledge, and, to our knowledge, represent the first such demonstrations in the machine learning literature.
EQUITRIAGE: A Fairness Audit of Gender Bias in LLM-Based Emergency Department Triage
arXiv:2605.03998v1 Announce Type: cross Abstract: Emergency department triage assigns patients an acuity score that determines treatment priority, and clinical evidence documents persistent gender disparities in human acuity assessment. As hospitals pilot large language models (LLMs) as triage decision support, a critical question is whether these models reproduce or mitigate known biases. We present EQUITRIAGE, a fairness audit of LLM-based ESI assignment evaluating five models (Gemini-3-Flash, Nemotron-3-Super, DeepSeek-V3.1, Mistral-Small-3.2, GPT-4.1-Nano) across 374,275 evaluations on 18,714 MIMIC-IV-ED vignettes under four prompt strategies. Of 9,368 originals, 9,346 are paired with a gender-swapped counterfactual. All five models produced flip rates above a pre-registered 5% threshold (9.9% to 43.8%). Two showed directional female undertriage (DeepSeek F/M 2.15:1, Gemini 1.34:1); two were near-parity; one had high sensitivity with weak male-direction asymmetry. DeepSeek's directional bias coexisted with a low outcome-linked calibration gap (0.013 against MIMIC-IV admission), a Chouldechova-style dissociation between within-group calibration and between-pair counterfactual invariance. Demographic blinding reduced Gemini's flip rate to 0.5%; an age-preserving blind variant left DeepSeek with residual F/M 1.25, implicating age as a residual channel. Chain-of-thought prompting degraded accuracy for all five models. A two-model ablation reveals opposite underlying mechanisms for the same directional phenotype: in Gemini the signal is emergent in the combined name+gender swap, while in DeepSeek the gender token alone carries it. EQUITRIAGE shows that group parity, counterfactual invariance, and gender calibration are distinct fairness properties, that intervention effectiveness is model-dependent, and that per-model counterfactual auditing should precede clinical deployment.
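The audit's core quantities, the flip rate over gender-swapped counterfactual pairs and the directional F/M undertriage ratio, can be sketched in a few lines. This is an illustration, not the EQUITRIAGE code; the tuple convention and the ESI ordering (lower score = higher acuity) are assumptions for this sketch:

```python
def flip_rate(pairs):
    """Fraction of counterfactual pairs whose assigned acuity changes.

    `pairs` is a list of (esi_as_female, esi_as_male) tuples; a "flip"
    is any difference in ESI level between the two versions.
    """
    flips = sum(1 for f, m in pairs if f != m)
    return flips / len(pairs)

def directional_ratio(pairs):
    """Female-vs-male undertriage ratio among flipped pairs.

    With lower ESI meaning higher acuity, f > m means the female version
    of the vignette was assigned *lower* urgency than the male version.
    """
    female_under = sum(1 for f, m in pairs if f > m)
    male_under = sum(1 for f, m in pairs if f < m)
    return female_under / male_under if male_under else float("inf")
```

The two metrics deliberately separate sensitivity (how often any flip occurs) from directionality (which gender is disadvantaged when flips occur), which is why a model can show a high flip rate yet near-parity direction, as the abstract reports.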
ClinicBot: A Guideline-Grounded Clinical Chatbot with Prioritized Evidence RAG and Verifiable Citations
arXiv:2605.00846v1 Announce Type: new Abstract: Clinical diagnosis requires answers that are accurate, verifiable, and explicitly grounded in official guidelines. While large language models excel at natural language processing, their tendency to hallucinate undermines their utility in high-stakes medical contexts where precision is essential. Existing retrieval-augmented generation (RAG) systems treat all evidence equally, producing noisy context and generic answers misaligned with clinical practice. We present ClinicBot, an AI system that translates guideline recommendations into trustworthy clinical support through three key advances: (1) structured extraction of clinical guidelines into semantic units (recommendations, tables, definitions, narrative) with explicit provenance, (2) evidence prioritization that ranks content by clinical significance and guideline structure rather than textual similarity, and (3) a web-based interface that presents concise, actionable answers with verifiable evidence. We will demonstrate ClinicBot using diabetes questions from real patients and an additional diabetes risk assessment tool that is faithful to the American Diabetes Association (ADA) Standards of Care in Diabetes (2025). The demonstration will illustrate how semantic knowledge extraction and hierarchical evidence ranking can reliably operate in a multi-agent setting to process complex clinical guidelines at scale.
NHS to close-source hundreds of GitHub repos over AI, security concerns
Healthcare giant's maintainers handed May deadline to enact the change.
Virtual Speech Therapist: A Clinician-in-the-Loop AI Speech Therapy Agent for Personalized and Supervised Therapy
arXiv:2605.01101v1 Announce Type: new Abstract: This paper develops Virtual Speech Therapist (VST), an intelligent agent-based platform that streamlines stuttering assessment and delivers customized therapy planning through automated and adaptive AI-driven workflows. VST integrates state-of-the-art deep learning-based stuttering classification, and multi-agent large language model (LLM) reasoning to support evidence-based clinical decision-making. The VST begins with the acquisition and feature extraction of patient speech samples, followed by robust classification of stuttering types. Building on these outputs, VST initiates an agentic reasoning process in which specialized LLM agents autonomously generate, critique, and iteratively refine individualized therapy plans. A dedicated critic agent evaluates all generated therapy plans to ensure clinical safety, methodological soundness, and alignment with peer-reviewed evidence and established professional guidelines. The resulting output is a comprehensive, patient-specific therapy draft intended for clinician review. Incorporating clinician feedback, the system then produces a finalized therapy plan suitable for patient delivery, thereby maintaining a clinician-in-the-loop paradigm. Experimental evaluation by expert speech therapists confirms that VST consistently generates high-quality, evidence-based therapy recommendations. These findings demonstrate the system's potential to augment clinical workflows, reduce clinician burden, and improve therapeutic outcomes for individuals with speech impairments. An interactive user interface for the proposed system is available online at: https://vocametrix.com/ai/stuttering-therapy-planning-agent , facilitating real-time stuttering assessment and personalized therapy planning.
Swiss startup Moonlight AI raises €2.8 million to turn routine blood and cytology imaging into genomic insights
Moonlight AI, a Swiss startup building image analysis software for clinical-grade diagnostics, has closed a €2.8 million ($3.3 million) Seed funding round. The round was co-led by Lotus One Investment (Singapore), VP Venture Partners (Switzerland), and MEDIN Fund (Tunisia), with participation from N&V Capital (Liechtenstein) and existing investor QAI Ventures (Switzerland). “Our technology enables labs […]
Pennsylvania Sues AI Company Saying Its Chatbots Give Dangerous Medical Advice
The state of Pennsylvania has taken legal action against an AI company over concerns that its chatbots provide harmful medical guidance.
Pa. suit alleging unlicensed medical practice is latest state action against chatbots
Pennsylvania has sued Character Technologies for allegedly practicing medicine without a license through its Character.AI platform.
Character.AI sued by Pa. over alleged doctor impersonation by chatbot
Pennsylvania's Department of State has sued chatbot developer Character.AI, alleging the company misrepresented its companion chatbots as licensed medical professionals.
Validation of an AI-based end-to-end model for prostate pathology using long-term archived routine samples
Artificial intelligence (AI) is becoming a clinical tool for prostate pathology, but generalization across variations in sample preparation and preservation over prolonged time periods remains poorly understood. We evaluated GleasonAI, an end-to-end attention-based multiple instance learning model, on an independent validation cohort comprising 10,366 biopsy cores from 1,028 patients across 14 Swedish regions, using archival diagnostic specimens from the ProMort cohorts collected between 1998-20...
Tailoring AI solutions for health care needs
The AI market is full of big promises of grand transformation. Health care is a prime target for those promises, beset as it is by financial pressures, labor shortages, and the growing burden of caring for an aging population. AI developers are targeting functions that vary widely, from curing cancer and performing surgery to streamlining…
The Paradox of Medical AI Implementation - by Eric Topol
There have been 44 randomized trials for colonoscopy that consistently, and in aggregate, demonstrate a substantial advantage of AI-assist for detecting adenomatous polyps compared with gastroenterologists without AI, yet that has not been made part of standard medical practice.
Flaws in Kenya’s AI-driven health reforms driving up costs for the poorest
Exclusive: amid unrest, President William Ruto promised to give all Kenyans access to healthcare. But the algorithm favours the rich, an investigation has found. An AI system used to predict how much Kenyans can afford to pay for access to healthcare has systematically driven up costs for the poor, the investigation found. The healthcare system being rolled out across the country, a key electoral promise of President William Ruto, was launched in October 2024 and intended to replace Kenya's decades-old national insurance system.
AI finds signs of pancreatic cancer before tumors develop
New AI research shows promise in detecting early signs of pancreatic cancer before physical tumors are even present.
A decade after the ‘Godfather of AI’ said radiologists were obsolete, their salaries are up to $571K and demand is growing fast
"As long as AI doesn't make this quantum leap of becoming sort of AGI,” most jobs are going to be reasonably safe, said one economist.
Adoption and Use of LLMs at an Academic Medical Center
arXiv:2602.00074v2 Announce Type: replace Abstract: While large language models (LLMs) can support clinical documentation needs, standalone tools struggle with "workflow friction" from manual data entry. We developed ChatEHR, a system that enables the use of LLMs with the entire patient timeline spanning several years. ChatEHR enables automations - which are static combinations of prompts and data that perform a fixed task - and interactive use in the electronic health record (EHR) via a user interface (UI). The resulting ability to sift through patient medical records for diverse use-cases such as pre-visit chart review, screening for transfer eligibility, monitoring for surgical site infections, and chart abstraction, redefines LLM use as an institutional capability. This system, accessible after user-training, enables continuous monitoring and evaluation of LLM use. In 1.5 years, we built 7 automations and 1075 users have trained to become routine users of the UI, engaging in 23,000 sessions in the first 3 months of launch. For automations, being model-agnostic and accessing multiple types of data was essential for matching specific clinical or administrative tasks with the most appropriate LLM. Benchmark-based evaluations proved insufficient for monitoring and evaluation of the UI, requiring new methods to monitor performance. Generation of summaries was the most frequent task in the UI, with an estimated 0.73 hallucinations and 1.60 inaccuracies per generation. The resulting mix of cost savings, time savings, and revenue growth required a value assessment framework to prioritize work as well as quantify the impact of using LLMs. Initial estimates are $6M savings in the first year of use, without quantifying the benefit of the better care offered. Such a "build-from-within" strategy provides an opportunity for health systems to maintain agency via a vendor-agnostic, internally governed LLM platform.
Healthcare’s AI Agents Aim to Give Doctors Time Back | PYMNTS.com
Healthcare’s next AI test will be whether agents can give doctors and nurses back something far more valuable: time. Across new reports and commentary,
Carlyle Acquires Healthcare RCM Providers Knack and EqualizeRCM
Carlyle Group Inc. has acquired a majority stake in healthcare revenue cycle management firms Knack RCM and EqualizeRCM, it said in a statement Monday, without disclosing terms.
AI outshines doctors in Harvard's ER study
A new study from Harvard indicates that AI models are demonstrating high performance in emergency room settings, potentially outperforming human doctors in certain diagnostic tasks.
AI Rivals Doctors in Emergency Decision-Making, Harvard Study Reveals
AI models now rival doctors in emergency diagnosis accuracy, but experts stress human oversight remains essential for safe clinical decision-making.
Infor's Technology Tackles Healthcare's AI Execution Gap | Healthcare Digital
Infor's new platform tackles industry-specific AI scaling challenges with robust governance and compliance features for healthcare providers
Trends in 2026 for healthcare – How is AI making insight-driven patient care a reality?
Get insights into healthcare trends 2026 and how AI and predictive analytics are reshaping patient care and service delivery.
Download XRPH AI: Earn Rewards for Healthy Actions With an AI Healthcare App | by XRP Healthcare | May, 2026 | Medium
With XRPH AI, users can access AI-powered healthcare tools today – and participate in a system designed to reward healthy actions through real usage.
In Harvard study, AI offered more accurate emergency room diagnoses than two human doctors
A recent Harvard study found that AI models outperformed human doctors in making accurate emergency room diagnoses.
AI Co-Clinician for Healthcare
This article from Google DeepMind introduces an AI co-clinician research initiative aimed at supporting doctors with evidence-grounded, supervised AI in healthcare. Our analysts noted the small sample size but found the multimodal clinical reasoning and broader applicability to regulated industries important for AI leaders to monitor.
AI Enhances Medical Diagnostics
AI is enhancing healthcare by supporting diagnostics and decision-making, but not replacing doctors.
Beacon Biosignals is mapping the brain during sleep
Researchers are using AI to analyze brain activity during sleep, providing new insights into neurological health and sleep patterns.
Galway’s Orreco signs up with MLS Innovation Lab
Orreco uses AI, computer vision and biomarker data to optimise athlete performance, predict injury risk and accelerate recovery, according to the company. Read more: Galway’s Orreco signs up with MLS Innovation Lab
Enabling A New Model for Healthcare with AI Co-Clinician
Google DeepMind introduces an AI co-clinician research initiative to support doctors with evidence-grounded, supervised AI, demonstrating potential for regulated industries.
OpenAI’s Big Reset + A.I. in the Doctor’s Office + Talkie, a pre-1930s LLM
Will the rising tide of A.I. adoption lift all boats?
Evaluating TabPFN for Mild Cognitive Impairment to Alzheimer's Disease Conversion in Data Limited Settings
arXiv:2604.27195v1 Announce Type: new Abstract: Accurate prediction of conversion from Mild Cognitive Impairment (MCI) to Alzheimer's disease (AD) is essential for early intervention; however, reliable conversion prediction models are difficult to develop due to limited longitudinal data availability. We evaluate TabPFN (Tabular Prior-Data Fitted Network) against traditional machine learning methods for predicting 3-year MCI-to-AD conversion using the TADPOLE dataset derived from ADNI. Using multimodal biomarker features extracted from demographics, APOE4, MRI volumes, CSF markers, and PET imaging, we conducted an experimental comparison across varying training set sizes (N=50 to 1000) and models including XGBoost, Random Forest, LightGBM, and Logistic Regression. TabPFN achieved one of the highest performances (AUC=0.892), outperforming LightGBM (AUC=0.860) and demonstrating advantages in low-data settings. At N=50 training samples, TabPFN maintained strong AUC while the traditional machine learning models struggled. These findings demonstrate that foundation models are promising for disease prediction in data-limited scenarios, such as Alzheimer's disease.
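The AUC figures quoted above have a simple rank interpretation: the probability that a randomly chosen converter is scored above a randomly chosen non-converter, counting ties as half. A stdlib sketch of that Mann-Whitney formulation (illustrative, not the study's evaluation code):

```python
def auc(scores, labels):
    """AUC as a rank statistic: the probability that a random positive
    case (label 1) outscores a random negative case (label 0), with
    ties counted as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

This pairwise form is O(P*N) and fine for small cohorts like N=50; production evaluation code typically uses a sort-based equivalent.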
Opinion | AI-automated prescriptions need safeguards: Responses to readers - The Washington Post
Artificial intelligence can make medical practice more efficient and accessible, but there must be safeguards
Toward Personalized Digital Twins for Cognitive Decline Assessment: A Multimodal, Uncertainty-Aware Framework
arXiv:2604.27217v1 Announce Type: new Abstract: Cognitive decline is highly heterogeneous across individuals, which complicates prognosis, trial design, and treatment planning. We present the Personalized Cognitive Decline Assessment Digital Twin (PCD-DT), a multimodal and uncertainty-aware framework for modeling patient-specific disease trajectories from sparse, noisy, and irregular longitudinal data. The framework combines three methodological components: (1) latent state-space models for individualized temporal dynamics, (2) multimodal fusion for clinical, biomarker, and imaging features, and (3) uncertainty-aware validation and adaptive updating for robust digital twin operation. We also outline how conditional generative models can support data augmentation and stress testing for underrepresented progression patterns. As a preliminary feasibility study, we analyze longitudinal TADPOLE trajectories and show clear separation between cognitively normal and Alzheimer's disease cohorts in ADAS13, ventricle volume, and hippocampal volume over five years. We further conduct a multimodal next-visit prediction ablation using an LSTM sequence model on 3,003 visit-pair sequences derived from TADPOLE, where the combined cognitive plus MRI configuration achieves the lowest standardized RMSE for both ADAS13 (0.4419) and ventricle volume (0.5842), outperforming a Last Observation Carried Forward baseline. A Bayesian tensor modeling component for high-dimensional imaging fusion is also discussed. These results support the feasibility of the proposed architecture while also highlighting the need for stronger uncertainty calibration and longer-horizon predictive evaluation. The PCD-DT framework provides a principled starting point for personalized in silico modeling in neurodegenerative disease. This work positions PCD-DT as a foundational step toward clinically deployable, uncertainty-aware digital twin systems.
Frontier AI Models Outperform Human Physicians in Clinical Benchmarks and Emergency Scenarios
New research indicates that advanced LLMs can surpass human performance in specific medical diagnostic tasks. These findings underscore an urgent requirement for prospective clinical trials to validate AI efficacy in healthcare settings.
CareGuardAI: Context-Aware Multi-Agent Guardrails for Clinical Safety & Hallucination Mitigation in Patient-Facing LLMs
arXiv:2604.26959v1 Announce Type: new Abstract: Integrating large language models (LLMs) into patient-facing healthcare systems offers significant potential to improve access to medical information. However, ensuring clinical safety and factual reliability remains a critical challenge. In practice, AI-generated responses may be conditionally correct yet medically inappropriate, as models often fail to interpret patient context and tend to produce agreeable responses rather than challenge unsafe assumptions. Unlike clinicians, who infer risk from incomplete information, LLMs frequently lack contextual awareness. Moreover, real-world patient interactions are open-ended and underspecified, unlike structured benchmark settings. We present CareGuardAI, a risk-aware safety framework for patient-facing medical question answering that addresses two key failure modes: clinical safety risk and hallucination risk. The framework introduces Clinical Safety Risk Assessment (SRA), inspired by ISO 14971, and Hallucination Risk Assessment (HRA) to evaluate medical risk and factual reliability. At inference time, CareGuardAI employs a multi-stage pipeline consisting of a controller agent, safety-constrained generation, and dual risk evaluation, followed by iterative refinement when necessary. Responses are released only when both SRA and HRA are less than or equal to 2, ensuring clinically acceptable outputs with bounded latency. We evaluate CareGuardAI on PatientSafeBench, MedSafetyBench, and MedHallu, covering both safety and hallucination detection. Across these benchmarks, the framework consistently outperforms strong baseline models, including GPT-4o-mini, demonstrating the importance of context-aware, risk-based, inference-time safety mechanisms for reliable deployment in healthcare.
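The release rule described in the abstract (respond only when both SRA and HRA are at most 2, with iterative refinement and bounded latency) can be sketched as a gating loop. The refinement budget, the fallback message, and the callback signatures are assumptions for this sketch, not details from the paper:

```python
RISK_THRESHOLD = 2    # release only when both risk scores are <= 2, per the paper
MAX_REFINEMENTS = 3   # bounded latency: cap on the refine loop (assumed value)

def release_or_refine(generate, assess_sra, assess_hra, question):
    """Sketch of a CareGuardAI-style gate: generate an answer, score
    clinical safety risk (SRA) and hallucination risk (HRA), then refine
    with the risk feedback until both clear the threshold or the
    refinement budget runs out."""
    answer = generate(question, feedback=None)
    for _ in range(MAX_REFINEMENTS):
        sra, hra = assess_sra(answer), assess_hra(answer)
        if sra <= RISK_THRESHOLD and hra <= RISK_THRESHOLD:
            return answer  # clinically acceptable on both axes
        answer = generate(question, feedback=(sra, hra))
    return "I can't answer that safely; please consult a clinician."
```

Gating on both scores separately, rather than on their sum, reflects the paper's framing of clinical safety and hallucination as distinct failure modes: an answer that is factually grounded can still be medically inappropriate, and vice versa.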
Generative AI In Healthcare: Adoption Matures As Agentic AI Emerges
Generative AI adoption in healthcare is shifting from pilot programs to production, with a focus on clinical documentation and administrative automation.
A Scoping Review of LLM-as-a-Judge in Healthcare and the MedJUDGE Framework
arXiv:2604.25933v1 Announce Type: new Abstract: As large language models (LLMs) increasingly generate and process clinical text, scalable evaluation has become critical. LLM-as-a-Judge (LaaJ), which uses LLMs to evaluate model outputs, offers a scalable alternative to costly expert review, but its healthcare adoption raises safety and bias concerns. We conducted a PRISMA-ScR scoping review of six databases (January 2020-January 2026), screening 11,727 studies and including 49. The landscape was dominated by evaluation and benchmarking applications (n=37, 75.5%), pointwise scoring (n=42, 85.7%), and GPT-family judges (n=36, 73.5%). Despite growing adoption, validation rigor was limited: among 36 studies with human involvement, the median number of expert validators was 3, while 13 (26.5%) used none. Risk of bias testing was absent in 36 studies (73.5%), only 1 (2.0%) examined demographic fairness, and none assessed temporal stability or patient context. Deployment remained limited, with 1 study (2.0%) reaching production and four (8.2%) prototype stage. Importantly, these gaps may interact: when judges and evaluated systems share training data or architectures, they may inherit similar blind spots, and agreement metrics may fail to distinguish true validity from shared errors. Minimal human oversight, limited bias assessment, and model monoculture together represent a governance gap where current validation may miss clinically significant errors. To address this, we propose MedJUDGE (Medical Judge Utility, De-biasing, Governance and Evaluation), a risk-stratified three-pillar framework organized around validity, safety, and accountability across clinical risk tiers, providing deployment-oriented evaluation guidance for healthcare LaaJ systems.
Benchmarking the Safety of Large Language Models for Robotic Health Attendant Control
arXiv:2604.26577v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly considered for deployment as the control component of robotic health attendants, yet their safety in this context remains poorly characterized. We introduce a dataset of 270 harmful instructions spanning nine prohibited behavior categories grounded in the American Medical Association Principles of Medical Ethics, and use it to evaluate 72 LLMs in a simulation environment based on the Robotic Health Attendant framework. The mean violation rate across all models was 54.4%, with more than half exceeding 50%, and violation rates varied substantially across behavior categories, with superficially plausible instructions such as device manipulation and emergency delay proving harder to refuse than overtly destructive ones. Model size and release date were the primary determinants of safety performance among open-weight models, and proprietary models were substantially safer than open-weight counterparts (median 23.7% versus 72.8%). Medical domain fine-tuning conferred no significant overall safety benefit, and a prompt-based defense strategy produced only a modest reduction in violation rates among the least safe models, leaving absolute violation rates at levels that would preclude safe clinical deployment. These findings demonstrate that safety evaluation must be treated as a first-class criterion in the development and deployment of LLMs for robotic health attendants.
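Scoring such a benchmark reduces to per-category and overall violation rates over (category, violated) outcomes, which is what makes the category-level contrasts in the abstract possible. A minimal sketch, with category names and data purely illustrative:

```python
from collections import defaultdict

def violation_rates(results):
    """Per-category and overall violation rates from benchmark runs.

    `results` is a list of (category, violated) pairs, one per harmful
    instruction issued to the model under test (assumed record format).
    """
    per_cat = defaultdict(lambda: [0, 0])  # category -> [violations, trials]
    for category, violated in results:
        per_cat[category][0] += bool(violated)
        per_cat[category][1] += 1
    rates = {c: v / n for c, (v, n) in per_cat.items()}
    overall = sum(v for v, _ in per_cat.values()) / len(results)
    return rates, overall
```

Keeping per-category counts rather than a single aggregate is what surfaces findings like device-manipulation and emergency-delay prompts being refused less often than overtly destructive ones.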
AI #166: Google Sells Out - by Zvi Mowshowitz
Senator Maria Cantwell says that if we let AI make healthcare decisions instead of doctors, we are going to have some real problems. Did you know we already have some real problems the other way? Her objection here is that AI systems designed to catch ‘wasteful spending’ (often but not always read: outright fraud) might deny care.
AI & Tech brief: States regulate AI in health care - The Washington Post
The White House is attempting a rapprochement with Anthropic over its new AI model, Mythos.
Governance for safe and responsible AI in healthcare organisations: a scoping review of frameworks | npj Digital Medicine
This scoping review synthesises current evidence on artificial intelligence (AI) governance in healthcare organisations, outlining key components of AI governance frameworks. Following PRISMA-ScR guidelines, we searched MEDLINE, Embase, and Scopus (April 2024, updated March 2025) for AI governance ...
Zuckerberg Bets $500M on AI Biology
Biohub, the nonprofit spearheaded by Mark Zuckerberg and Priscilla Chan, is committing $500 million to help create better AI simulations of the human body. The bet is that more data and compute will produce more useful models.
MITRE flags rising cyber risks as medical devices adopt AI, cloud and post-quantum technologies - Industrial Cyber
Safety Drift After Fine-Tuning: Evidence from High-Stakes Domains
arXiv:2604.24902v1 Announce Type: new Abstract: Foundation models are routinely fine-tuned for use in particular domains, yet safety assessments are typically conducted only on base models, implicitly assuming that safety properties persist through downstream adaptation. We test this assumption by analyzing the safety behavior of 100 models, including widely deployed fine-tunes in the medical and legal domains as well as controlled adaptations of open foundation models alongside their bases. Across general-purpose and domain-specific safety benchmarks, we find that benign fine-tuning induces large, heterogeneous, and often contradictory changes in measured safety: models frequently improve on some instruments while degrading on others, with substantial disagreement across evaluations. These results show that safety behavior is not stable under ordinary downstream adaptation, raising critical questions about governance and deployment practices centered on base-model evaluations. Without explicit re-evaluation of fine-tuned models in deployment-relevant contexts, such approaches fall short of adequately managing downstream risk, overlooking practical sources of harm -- failures that are especially consequential in high-stakes settings and challenge current accountability paradigms.
AI-enabled medtech introduces risks facilities aren't ready for, cybersecurity report says
AI-enabled devices are introducing new risks that organizations aren’t fully equipped to manage, the cybersecurity report said.
GITEX Future Health Africa 2026: The Ethics of AI in Healthcare Under Global Focus
In a 700-bed public hospital in Kimberley, South Africa, a sole radiologist fell ill at the height of the COVID-19 pandemic.
Berlin-based Patronus raises €11 million for senior-friendly emergency smartwatch and family app
Patronus, a Berlin-based elderly care startup developing a mobile emergency smartwatch and a family app, today announced the closing of its €11 million funding round to expand its leadership in the mobile emergency response segment and develop new products around family, wellbeing, and an AI-powered daily companion. The round was led by 3TS Capital Partners, […]
Utah dismisses medical board call to halt its pioneering AI prescription program | KUER
The Utah Medical Licensing Board has “major concerns” and worries Utahns could potentially be harmed. But the Department of Commerce stood by the pilot program.
New Hyderabad Centre Revolutionizes Neuro-Ophthalmology with AI Diagnostics and Integrated Care
LV Prasad Eye Institute in Hyderabad has opened an Eye and Brain Centre, supported by D. E. Shaw India, to revolutionize neuro-ophthalmology through AI-driven diagnostics.
Generative AI in healthcare to hit $30.4b by 2032 on imaging boom | Asian Business Review
The global generative artificial intelligence in healthcare market is projected to reach $30.4b by 2032, expanding at a compound annual growth rate of 34.9%.
Secure On-Premise Deployment of Open-Weights Large Language Models in Radiology: An Isolation-First Architecture with Prospective Pilot Evaluation
arXiv:2604.22768v1 Announce Type: new Abstract: Purpose: To design, implement, evaluate, and report on the regulatory requirements of a self-hosted LLM infrastructure for radiology adhering to the principle of least privilege, emphasizing technical feasibility, network isolation, and clinical utility. Materials and Methods: The isolation-first, containerized LLM inference stack relies on strict ...
Policies and Safeguards for the Safe Use of AI
Considerations for creating an AI governance and safeguards framework. Throughout 2025 and early 2026, a team of AI-focused security professionals in the ...
HTEC Research: Only One in Three Healthcare Organizations is Ready to Scale AI | National Business | joplinglobe.com
PALO ALTO, Calif. (BUSINESS WIRE), Apr 28, 2026
Case-Specific Rubrics for Clinical AI Evaluation: Methodology, Validation, and LLM-Clinician Agreement Across 823 Encounters
An empirical evaluation of the risks of AI model updates using clinical data: stability, arbitrariness, and fairness
Artificial Intelligence and Machine Learning (AI/ML) models used in clinical settings are increasingly deployed to support clinical decision-making. However, when training data become stale due to changes in demographics, environment, or patient behaviors, model performance can degrade substantially. While updating models with new training data is necessary, such updates may also introduce new risks.
An Artifact-based Agent Framework for Adaptive and Reproducible Medical Image Processing
arXiv:2604.21936v1 Announce Type: new Abstract: Medical imaging research is increasingly shifting from controlled benchmark evaluation toward real-world clinical deployment. In such settings, applying analytical methods extends beyond model design to require dataset-aware workflow configuration and provenance tracking. Two requirements therefore become central: \textbf{adaptability}, the ability to configure workflows according to dataset-specific conditions and evolving analytical goals; and \textbf{reproducibility}, the guarantee that all transformations and decisions are explicitly recorded and re-executable. Here, we present an artifact-based agent framework that introduces a semantic layer to augment medical image processing. The framework formalizes intermediate and final outputs through an artifact contract, enabling structured interrogation of workflow state and goal-conditioned assembly of configurations from a modular rule library. Execution is delegated to a workflow executor to preserve deterministic computational graph construction and provenance tracking, while the agent operates locally to comply with most privacy constraints. We evaluate the framework on real-world clinical CT and MRI cohorts, demonstrating adaptive configuration synthesis, deterministic reproducibility across repeated executions, and artifact-grounded semantic querying. These results show that adaptive workflow configuration can be achieved without compromising reproducibility in heterogeneous clinical environments.
CognitiveTwin: Robust Multi-Modal Digital Twins for Predicting Cognitive Decline in Alzheimer's Disease
arXiv:2604.22428v1 Announce Type: new Abstract: Predicting individual cognitive decline in Alzheimer's disease (AD) is difficult due to the heterogeneity of disease progression. Reliable clinical tools require not only high accuracy but also fairness across demographics and robustness to missing data. We present CognitiveTwin, a digital twin framework that predicts patient-specific cognitive trajectories. The model integrates multi-modal longitudinal data (cognitive scores, magnetic resonance imaging, positron emission tomography, cerebrospinal fluid biomarkers, and genetics). We use a Transformer-based architecture to fuse these modalities and a Deep Markov Model to capture temporal dynamics. We trained and evaluated the framework using data from 1,666 patients in the TADPOLE (Alzheimer's Disease Neuroimaging Initiative) dataset. We assessed the model for prediction error, demographic fairness, and robustness to missing-not-at-random (MNAR) data patterns. CognitiveTwin provides accurate and personalized predictions of cognitive decline. Its demonstrated fairness across patient demographics and resilience to clinical dropout make it a reliable tool for clinical trial enrichment and personalized care planning.
AI-Driven Automation in Healthcare
The Robotics Intelligence Seminar at Stanford Research Institute spotlights the future of human-robot collaboration, particularly in healthcare and logistics, driven by AI-enabled full-stack autonomy.
AI and X-Ray Breakthroughs
AI and non-contact imaging have successfully revealed the contents of a charred Herculaneum papyrus without damaging it.
Contributor: AI could democratize medicine, but better regulation comes first - Los Angeles Times
Artificial intelligence has the potential to fundamentally change healthcare, and the possibilities are neither radical nor experimental.
Joseph Ologunja MD - NHS Fellowship in Clinical AI | LinkedIn
Black professionals are not waiting for an invitation to the AI table. They are the ones building it. From fixing bias to rethinking how we see data, the future of this technology is being shaped by the people I met today.
As Trump Officials Pushed Health Savings Accounts, RFK Jr. Aide Ran Wellness Company Poised to Benefit
Calley Means remained president of a company that relied on health savings accounts last year as the Trump administration developed policies to expand them.
Therapy company mixes emotional and artificial intelligence to top ranking
Grow Therapy heads The Americas’ Fastest-Growing Companies 2026 list while testing AI’s potential
Health-care AI is here. We don’t know if it actually helps patients.
I don’t need to tell you that AI is everywhere. Or that it is being used, increasingly, in hospitals. Doctors are using AI to help them with notetaking. AI-based tools are trawling through patient records, flagging people who may require certain support or treatments. They are also used to interpret medical exam results and X-rays. A…
HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering
arXiv:2604.21027v1 Announce Type: new Abstract: Electronic health record (EHR) question answering is often handled by LLM-based pipelines that are costly to deploy and do not explicitly leverage the hierarchical structure of clinical data. Motivated by evidence that medical ontologies and patient trajectories exhibit hyperbolic geometry, we propose HypEHR, a compact Lorentzian model that embeds codes, visits, and questions in hyperbolic space and answers queries via geometry-consistent cross-attention with type-specific pointer heads. HypEHR is pretrained with next-visit diagnosis prediction and hierarchy-aware regularization to align representations with the ICD ontology. On two MIMIC-IV-based EHR-QA benchmarks, HypEHR approaches LLM-based methods while using far fewer parameters. Our code is publicly available at https://github.com/yuyuliu11037/HypEHR.
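The Lorentzian geometry HypEHR builds on can be sketched in a few lines. These are the standard hyperboloid-model formulas (Minkowski inner product, lift, geodesic distance), not code from the paper, and the embedding values are illustrative stand-ins:

```python
import numpy as np

def lorentz_inner(u, v):
    # Minkowski inner product: -u0*v0 + sum_i ui*vi
    return -u[..., 0] * v[..., 0] + np.sum(u[..., 1:] * v[..., 1:], axis=-1)

def lift(x):
    # Map a Euclidean vector onto the hyperboloid <x, x>_L = -1
    x0 = np.sqrt(1.0 + np.sum(x * x, axis=-1, keepdims=True))
    return np.concatenate([x0, x], axis=-1)

def lorentz_distance(u, v):
    # Geodesic distance on the hyperboloid: arccosh(-<u, v>_L)
    inner = np.clip(-lorentz_inner(u, v), 1.0, None)  # numerical guard
    return np.arccosh(inner)

code = lift(np.array([0.3, -0.1]))   # e.g. an ICD code embedding (made up)
visit = lift(np.array([0.2, 0.05]))  # e.g. a visit embedding (made up)
print(lorentz_distance(code, visit))
```

Distances like this grow quickly as points move apart, which is what lets tree-like structures such as the ICD ontology embed with low distortion in few dimensions.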
Agentic AI for Personalized Physiotherapy: A Multi-Agent Framework for Generative Video Training and Real-Time Pose Correction
arXiv:2604.21154v1 Announce Type: new Abstract: At-home physiotherapy compliance remains critically low due to a lack of personalized supervision and dynamic feedback. Existing digital health solutions rely on static, pre-recorded video libraries or generic 3D avatars that fail to account for a patient's specific injury limitations or home environment. In this paper, we propose a novel Multi-Agent System (MAS) architecture that leverages Generative AI and computer vision to close the tele-rehabilitation loop. Our framework consists of four specialized micro-agents: a Clinical Extraction Agent that parses unstructured medical notes into kinematic constraints; a Video Synthesis Agent that utilizes foundational video generation models to create personalized, patient-specific exercise videos; a Vision Processing Agent for real-time pose estimation; and a Diagnostic Feedback Agent that issues corrective instructions. We present the system architecture, detail the prototype pipeline using Large Language Models and MediaPipe, and outline our clinical evaluation plan. This work demonstrates the feasibility of combining generative media with agentic autonomous decision-making to scale personalized patient care safely and effectively.
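The Diagnostic Feedback Agent's corrective loop presumably reduces to comparing measured joint angles against clinician-derived limits. A minimal sketch under that assumption; the landmark coordinates, the constraint check, and the feedback strings are invented here, and a real pipeline would take landmarks from a pose estimator such as MediaPipe:

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by landmarks a-b-c."""
    v1, v2 = np.asarray(a) - np.asarray(b), np.asarray(c) - np.asarray(b)
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

def check_constraint(angle_deg, max_flexion_deg):
    # Compare measured knee flexion against a clinician-derived limit
    # (flexion = deviation from a straight, 180-degree leg)
    if angle_deg < 180.0 - max_flexion_deg:
        return "reduce flexion: exceeds prescribed range"
    return "ok"

# hip-knee-ankle landmarks, e.g. as returned by a pose estimator
hip, knee, ankle = (0.0, 1.0), (0.0, 0.5), (0.3, 0.2)
angle = joint_angle(hip, knee, ankle)
print(round(angle, 1), check_constraint(angle, max_flexion_deg=60.0))
```

The kinematic constraints themselves would come from the Clinical Extraction Agent's parse of the medical notes; the vision side only supplies the landmark coordinates.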
InVitroVision: a Multi-Modal AI Model for Automated Description of Embryo Development using Natural Language
arXiv:2604.21061v1 Announce Type: new Abstract: The application of artificial intelligence (AI) in IVF has shown promise in improving consistency and standardization of decisions, but often relies on annotated data and does not make use of the multimodal nature of IVF data. We investigated whether foundational vision-language models can be fine-tuned to predict natural language descriptions of embryo morphology and development. Using a publicly available embryo time-lapse dataset, we fine-tuned PaliGemma-2, a multi-modal vision-language model, with only 1,000 images and corresponding captions, describing embryo morphology, embryonic cell cycle and developmental stage. Our results show that the fine-tuned model, InVitroVision, outperformed a commercial model, ChatGPT 5.2, and base models in overall metrics, with performance improving with larger training datasets. This study demonstrates the potential of foundational vision-language models to generalize to IVF tasks with limited data, enabling the prediction of natural language descriptions of embryo morphology and development. This approach may facilitate the use of large language models to retrieve information and scientific evidence from relevant publications and guidelines, and has implications for few-shot adaptation to multiple downstream tasks in IVF.
Clinical Reasoning AI for Oncology Treatment Planning: A Multi-Specialty Case-Based Evaluation
arXiv:2604.20869v1 Announce Type: new Abstract: Background: More than 80% of U.S. cancer care is delivered in community settings, where survival remains worse than at academic centers. Clinicians must integrate genomics, staging, radiology, pathology, and changing guidelines, creating cognitive burden. We evaluated OncoBrain, an AI clinical reasoning platform for oncology treatment-plan generation, as an early step toward OGI. Methods: OncoBrain combines general-purpose LLMs with a cancer-specific graph retrieval-augmented generation layer, a gold-standard treatment-plan corpus as long-term memory, and a model-agnostic safety layer (CHECK) for hallucination detection and suppression. We evaluated clinician-enriched case summaries across gynecologic, genitourinary, neuro-oncology, gastrointestinal/hepatobiliary, and hematologic malignancies. Three clinician groups completed structured evaluations of 173 cases using a common 16-item instrument: subspecialist oncologists reviewed 50 cases, physician reviewers 78, and advanced practice providers 45. Results: Ratings were highest for scientific accuracy, evidence support, and safety, with lower but favorable scores for workflow integration and time savings. On a 5-point scale, mean alignment with evidence and guidelines was 4.60, 4.56, and 4.70 across subspecialists, physician reviewers, and advanced practice providers. Mean scores for absence of safety or misinformation concerns were 4.80, 4.40, and 4.60. Workflow integration averaged 4.50, 3.94, and 4.00; perceived time savings averaged 5.00, 3.89, and 3.60. Conclusions: In this multi-specialty vignette-based evaluation, OncoBrain generated oncology treatment plans judged guideline-concordant, clinically acceptable, and easy to supervise. These findings support the potential of a carefully engineered AI reasoning platform to assist oncology treatment planning and justify prospective real-world evaluation in community settings.
Post Next: Future of Cancer - The Next Frontiers
Post Next: Future of Cancer - The Next Frontiers - The Washington Post Democracy Dies in Darkness By Washington Post Live Register for the program here. Technological breakthroughs in artificial intelligence and beyond are transforming cancer research and care for patients around the world. Join Washington Post Live for a conversation with Microsoft Science President Peter Lee about the progress made and the future of cancer.
Automated Detection of Dosing Errors in Clinical Trial Narratives: A Multi-Modal Feature Engineering Approach with LightGBM
arXiv:2604.19759v1 Announce Type: new Abstract: Clinical trials require strict adherence to medication protocols, yet dosing errors remain a persistent challenge affecting patient safety and trial integrity. We present an automated system for detecting dosing errors in unstructured clinical trial narratives using gradient boosting with comprehensive multi-modal feature engineering. Our approach combines 3,451 features spanning traditional NLP (TF-IDF, character n-grams), dense semantic embeddings (all-MiniLM-L6-v2), domain-specific medical patterns, and transformer-based scores (BiomedBERT, DeBERTa-v3), used to train a LightGBM model. Features are extracted from nine complementary text fields (median 5,400 characters per sample) ensuring complete coverage across all 42,112 clinical trial narratives. On the CT-DEB benchmark dataset with severe class imbalance (4.9% positive rate), we achieve 0.8725 test ROC-AUC through 5-fold ensemble averaging (cross-validation: 0.8833 ± 0.0091 AUC). Systematic ablation studies reveal that removing sentence embeddings causes the largest performance degradation (2.39%), demonstrating their critical role despite contributing only 37.07% of total feature importance. Feature efficiency analysis demonstrates that selecting the top 500-1000 features yields optimal performance (0.886-0.887 AUC), outperforming the full 3,451-feature set (0.879 AUC) through effective noise reduction. Our findings highlight the importance of feature selection as a regularization technique and demonstrate that sparse lexical features remain complementary to dense representations for specialized clinical text classification under severe class imbalance.
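The reported metrics hinge on ROC-AUC under heavy class imbalance and on averaging fold predictions. A minimal sketch of rank-based AUC and 5-fold ensemble averaging, using synthetic labels and noisy stand-in scorers rather than the paper's features or LightGBM model:

```python
import numpy as np

def roc_auc(y_true, scores):
    # Rank-based (Mann-Whitney) AUC: probability a positive outranks a negative
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    for s in np.unique(scores):          # average ranks over ties
        m = scores == s
        ranks[m] = ranks[m].mean()
    pos = y_true == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(0)
y = (rng.random(1000) < 0.049).astype(int)  # ~4.9% positive rate, as in CT-DEB
# five stand-in "fold models", each a noisy scorer; ensemble = mean of outputs
fold_scores = [y + rng.normal(0.0, 1.0, size=y.size) for _ in range(5)]
ensemble = np.mean(fold_scores, axis=0)
print(round(roc_auc(y, fold_scores[0]), 3), round(roc_auc(y, ensemble), 3))
```

Rank-based AUC is insensitive to the 4.9% base rate, which is why it is the natural headline metric here; accuracy on such data would be trivially high for a constant predictor.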
Health Systems Race to Contain AI Misinformation ‘Domino Effect’ - Newsweek
Marketing experts can't control the AI algorithm. But health care leaders aren't "comfortably ceding" their brands just yet.
On-Demand: CE Marking in Europe - Medical Device Regulations and AI [1/1/2026-9/30/2026] - Alabama Small Business Development Center
Live webinar recorded on 10/1/2024 Please join the Alabama International Trade Center and BSI Group for a webinar: CE Marking in Europe – the State for the Medical Device Regulations in Europe and AI Agenda: The State for the Medical Device Regulations in Europe MDR/IVDR/CE Marking/UKCA QMS ...
Emma the joke-telling robot cracks up the care home: Paula Hornickel’s best photograph
‘The first resident that Emma – a social robot – was introduced to was called Peter. After that, Emma assumed they were all called Peter, which everyone found hilarious. Then she broke down’ One morning in July 2025, I arrived in the small, quiet town of Albershausen in south-west Germany.
Opinion | Utah program to let AI refill prescriptions is not a crazy idea - The Washington Post
It’s easy to dismiss Utah’s latest artificial intelligence experiment as dangerous and dystopian. The state has partnered with a company called Doctronic to empower an “AI doctor,” rather than a human clinician, to refill medication ...
AI Startup Has Helped Reverse Thousands of Denied Health Insurance Claims
Americans rarely fight back when insurers reject treatments their doctors have prescribed. Claimable is working to change that, with a little help from Mark Cuban.
The Godmother of Silicon Valley and her former student want to fix how healthcare gets built
Fail fast, revise, repeat: Esther Wojcicki brings her classroom philosophy to healthcare investing with the launch of Treehub.
Error-free Training for MedMNIST Datasets
arXiv:2604.18916v1 Announce Type: new Abstract: In this paper, we introduce a new concept called Artificial Special Intelligence by which Machine Learning models for the classification problem can be trained error-free, thus acquiring the capability of not making repeated mistakes. The method is applied to 18 MedMNIST biomedical datasets. Except for three datasets, which suffer from the double-labeling problem, all are trained to perfection.
Merck Partners With Google
Merck to partner with Google Cloud on AI initiatives.
AI/ML Scientist – Operational Twinning & Healthcare ...
New Gallup poll finds that low-income Americans are turning to AI for healthcare
A Gallup poll shows that 32% of low-income Americans use AI as a substitute for doctor visits, compared to 14% of the general population.
Why Healthcare AI Still Struggles To Deliver
Why healthcare AI stalls after the pilot stage, where governance becomes the bottleneck, and where CIOs are finally seeing measurable ROI.
Can you rely on AI chatbots for medical advice?
Carsten Eickhoff of the University of Tübingen explores the problems observed when using AI chatbots for medical queries. Read more: Can you rely on AI chatbots for medical advice?
WHO: Rapid rise of AI in EU healthcare calls for clear frameworks | ICT&health
The use of AI in European healthcare is growing rapidly, but the conditions for responsible implementation are lagging.
New studies show how often chatbots get health answers wrong - The Washington Post
Two studies put ChatGPT, Gemini and others to the test on questions of health. In one, they got almost half the answers wrong.
How the world regulates AI in health - and why it’s complex | ICT&health
In regions such as Europe, the United States, Australia, and China, AI is mainly governed under existing medical device laws.
Murder, she wrote: Ex-FBI chief wants some ransomware crims charged with homicide
If a cyberattack leads to a death, that's murder. A former FBI cyber division chief urged the US Justice Department to consider felony homicide charges against ransomware actors when attacks on hospitals lead to patient deaths.…
Dental AI adoption expands in the U.S. as Heartland rolls out DentalXChange, HOOTL raises US$6M+ - Oral Health Group
Artificial intelligence (AI) adoption in dental practice management is accelerating in the United States, with a major dental support organization (DSO) deal and fresh venture financing highlighting growing investment in automation across the revenue cycle. On Tuesday, DentalXChange, a dental revenue cycle management (RCM) technology company, announced a new enterprise agreement with Heartland Dental, the largest DSO in the U.S., to deploy ...
Toward Zero-Egress Psychiatric AI: On-Device LLM Deployment for Privacy-Preserving Mental Health Decision Support
Privacy represents one of the most critical yet underaddressed barriers to AI adoption in mental healthcare -- particularly in high-sensitivity operational environments such as military, correctional, and remote healthcare settings, where the risk of patient data exposure can deter help-seeking behavior entirely. Existing AI-enabled psychiatric decision support systems predominantly rely on cloud-based inference pipelines, requiring sensitive patient data to leave the device and traverse external ...
AI Approach for MRI-only Full-Spine Vertebral Segmentation and 3D Reconstruction in Paediatric Scoliosis
MRI is preferred over CT in paediatric imaging because it avoids ionising radiation, but its use in spine deformity assessment is largely limited by the lack of automated, high-resolution 3D bony reconstruction, which continues to rely on CT. MRI-based 3D reconstruction remains impractical due to manual workflows and the scarcity of labelled full-spine datasets. This study introduces an AI framework that enables fully automated thoracolumbar spine (T1-L5) segmentation and 3D reconstruction from ...
DeepER-Med: Advancing Deep Evidence-Based Research in Medicine Through Agentic AI
arXiv:2604.15456v1 Announce Type: new Abstract: Trustworthiness and transparency are essential for the clinical adoption of artificial intelligence (AI) in healthcare and biomedical research. Recent deep research systems aim to accelerate evidence-grounded scientific discovery by integrating AI agents with multi-hop information retrieval, reasoning, and synthesis. However, most existing systems lack explicit and inspectable criteria for evidence appraisal, creating a risk of compounding errors and making it difficult for researchers and clinicians to assess the reliability of their outputs. In parallel, current benchmarking approaches rarely evaluate performance on complex, real-world medical questions. Here, we introduce DeepER-Med, a Deep Evidence-based Research framework for Medicine with an agentic AI system. DeepER-Med frames deep medical research as an explicit and inspectable workflow of evidence-based generation, consisting of three modules: research planning, agentic collaboration, and evidence synthesis. To support realistic evaluation, we also present DeepER-MedQA, an evidence-grounded dataset comprising 100 expert-level research questions derived from authentic medical research scenarios and curated by a multidisciplinary panel of 11 biomedical experts. Expert manual evaluation demonstrates that DeepER-Med consistently outperforms widely used production-grade platforms across multiple criteria, including the generation of novel scientific insights. We further demonstrate the practical utility of DeepER-Med through eight real-world clinical cases. Human clinician assessment indicates that DeepER-Med's conclusions align with clinical recommendations in seven cases, highlighting its potential for medical research and decision support.
Palantir's NHS future in doubt as ministers eye contract break
£330M deal leaves service with no ownership of software built to connect trusts to the platform The UK government is considering ending Palantir's involvement in a central NHS data platform after coming under fire from MPs, unions, and campaigners.…
How people use Copilot for Health
arXiv:2604.15331v1 Announce Type: cross Abstract: We analyze over 500,000 de-identified health-related conversations with Microsoft Copilot from January 2026 to characterize what people ask conversational AI about health. We develop a hierarchical intent taxonomy of 12 primary categories using privacy-preserving LLM-based classification validated against expert human annotation, and apply LLM-driven topic-clustering for prevalent themes within each intent. Using this taxonomy, we characterize the intents and topics behind health queries, identify who these queries are about, and analyze how usage varies by device and time of day. Five findings stand out. First, nearly one in five conversations involve personal symptom assessment or condition discussion, and even the dominant general information category (40%) is concentrated on specific treatments and conditions, suggesting that this is a lower bound on personal health intent. Second, one in seven of these personal health queries concern someone other than the user, such as a child, a parent, a partner, suggesting that conversational AI can be a caregiving tool, not just a personal one. Third, personal queries about symptoms and emotional health queries increase markedly in the evening and nighttime hours, when traditional healthcare is most limited. Fourth, usage diverges sharply by device: mobile concentrates on personal health concerns, while desktop is dominated by professional and academic work. Fifth, a substantial share of queries focuses on navigating healthcare systems such as finding providers, and understanding insurance, highlighting friction in the delivery of existing healthcare. These patterns have direct implications for platform-specific design, safety considerations, and the responsible development of health AI.
Is AI actually improving healthcare? | Nature Medicine
HSCC warns AI-driven supply chains are outpacing healthcare cybersecurity defenses and oversight models - Industrial Cyber
HSCC warns AI-driven supply chains are outpacing healthcare cybersecurity defenses, exposing gaps in vendor oversight and risk visibility.
New WHO/Europe report provides first-ever snapshot of AI in health care across European Union Member States
WHO/Europe has released a new report assessing the rapidly evolving use of artificial intelligence (AI) in health care across the 27 European Union (EU) Member States. The first comprehensive review of its kind, the report reveals strong and consistent momentum across EU Member States, with ...
AI in health care: Experts discuss the future of AI practices and policies | AHA News
Jim VandeHei, CEO of Axios; Marc Boom, M.D., AHA board chair and president and CEO of Houston Methodist; Anne Klibanski, M.D., president and CEO of Mass General Brigham; Jonathan Perlin, M.D., president and CEO of Joint Commission; and Ladd Wiley, senior vice president of global corporate affairs, ...
Preparing Healthcare Data for AI: Why Health Systems Must Fix Legacy Systems
Xsolis CTO Zach Evans explains why healthcare AI pilots frequently stall, revealing that less than 20% of enterprise data is ready for AI.
Regulatory Considerations for Artificial Intelligence in Healthcare: A WHO Perspective
The mission of the World Health Organization (WHO), to promote health, keep the world safe, and serve the vulnerable, is articulated in its global strategy on digital health 2020–2025. At the heart of...
SciFi: A Safe, Lightweight, User-Friendly, and Fully Autonomous Agentic AI Workflow for Scientific Applications
arXiv:2604.13180v1 Announce Type: new Abstract: Recent advances in agentic AI have enabled increasingly autonomous workflows, but existing systems still face substantial challenges in achieving reliable deployment in real-world scientific research. In this work, we present a safe, lightweight, and user-friendly agentic framework for the autonomous execution of well-defined scientific tasks. The framework combines an isolated execution environment, a three-layer agent loop, and a self-assessing do-until mechanism to ensure safe and reliable operation while effectively leveraging large language models of varying capability levels. By focusing on structured tasks with clearly defined context and stopping criteria, the framework supports end-to-end automation with minimal human intervention, enabling researchers to offload routine workloads and devote more effort to creative activities and open-ended scientific inquiry.
AI in Healthcare: Aid, Not Replace—Clinicians Warn of Risks as Users Seek Speed and Privacy
Physicians stress AI should support, not replace, professional care amid concerns about misinformation and privacy.
How AI Is Being Used To Detect Cancer at The Earliest Stage
Dr. Bea Bakshi, CEO & Co-Founder of C the Signs joins Bloomberg Businessweek to discuss the future of cancer detection and how AI is part of the solution even in the early stages. (Source: Bloomberg)
A longitudinal health agent framework
arXiv:2604.12019v1 Announce Type: new Abstract: Although artificial intelligence (AI) agents are increasingly proposed to support potentially longitudinal health tasks, such as symptom management, behavior change, and patient support, most current implementations fall short of facilitating user intent and fostering accountability. This contrasts with prior work on supporting longitudinal needs, where follow-up, coherent reasoning, and sustained alignment with individuals' goals are critical for both effectiveness and safety. In this paper, we draw on established clinical and personal health informatics frameworks to define what it would mean to orchestrate longitudinal health interactions with AI agents. We propose a multi-layer framework and corresponding agent architecture that operationalizes adaptation, coherence, continuity, and agency across repeated interactions. Through representative use cases, we demonstrate how longitudinal agents can maintain meaningful engagement, adapt to evolving goals, and support safe, personalized decision-making over time. Our findings underscore both the promise and the complexity of designing systems capable of supporting health trajectories beyond isolated interactions, and we offer guidance for future research and development in multi-session, user-centered health AI.
China Launches AI Doctor Platform for Parkinson's
Xuanwu Hospital in Beijing has launched El.kz, China's first AI-powered platform for Parkinson's disease, aimed at automating routine patient inquiries. This initiative is part of a broader strategy to digitize healthcare, leveraging over 20 years of clinical data to alleviate physician workload and better manage chronic conditions in an aging population.
Adoption and Effectiveness of AI-Based Anomaly Detection for Cross Provider Health Data Exchange
arXiv:2604.09630v1 Announce Type: new Abstract: This study investigates the adoption and effectiveness of AI-based anomaly detection in cross-provider electronic health record (EHR) environments. It aims to (1) identify the organisational and digital capabilities required for successful implementation and (2) evaluate the performance and interpretability of lightweight anomaly detection approaches using contextual audit data. A semi-systematic scoping synthesis is conducted to derive a four-pillar readiness framework covering governance, infrastructure/interoperability, workforce, and AI integration, operationalised as a 10-item checklist with measurable indicators. This is complemented by a simulation of cross-provider audit logs incorporating contextual features such as provider mismatch, time of access, days since discharge, session duration, and access frequency. A rule-based approach is benchmarked against Isolation Forest, with SHAP used to explain model behaviour. Results show that rule-based methods achieve high recall but generate higher alert volumes, while Isolation Forest reduces alert burden at the cost of lower sensitivity. SHAP analysis highlights provider mismatch and off-hours access as dominant anomaly drivers. The study proposes a staged deployment strategy combining rules for coverage and machine learning for prioritisation, supported by explainability and continuous monitoring. The findings contribute a practical readiness framework and empirical insights to guide the implementation of AI-based anomaly detection in multi-provider healthcare environments.
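The rule-based baseline the abstract benchmarks can be sketched from the contextual features it names (provider mismatch, time of access, days since discharge, access frequency). The thresholds and field names below are illustrative assumptions, not the study's configuration:

```python
# Illustrative rule-based anomaly flags for cross-provider EHR audit logs.
# Field names and thresholds are assumptions for the sketch; real deployments
# would tune them against governance policy and alert-volume budgets.
def flag_access(event):
    """Return the list of rules triggered by one audit-log event."""
    rules = []
    if event["accessing_provider"] != event["treating_provider"]:
        rules.append("provider_mismatch")          # dominant driver per SHAP
    if event["hour"] < 6 or event["hour"] >= 22:
        rules.append("off_hours")                  # second dominant driver
    if event["days_since_discharge"] > 30:
        rules.append("stale_record")
    if event["accesses_last_24h"] > 50:
        rules.append("high_frequency")
    return rules

event = {"accessing_provider": "clinic_B", "treating_provider": "hosp_A",
         "hour": 2, "days_since_discharge": 3, "accesses_last_24h": 4}
print(flag_access(event))  # ['provider_mismatch', 'off_hours']
```

This mirrors the staged strategy the paper proposes: rules like these give high-recall coverage, and a learned detector (e.g. Isolation Forest) then prioritizes the resulting alert queue.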
China Launches AI Doctor Platform for Parkinson's, Streamlining Patient Support
Xuanwu Hospital in Beijing has launched El.kz, China's first AI-powered platform for Parkinson's disease, aimed at automating routine patient inquiries.
Investigating Vaccine Buyer's Remorse: Post-Vaccination Decision Regret in COVID-19 Social Media Using Politically Diverse Human Annotation
arXiv:2604.09626v1 Announce Type: new Abstract: A significant gap exists in datasets regarding post-COVID-19 vaccination experiences, particularly ``vaccine buyer's remorse''. Understanding the prevalence and nature of vaccine regret, whether based on personal or vicarious experiences, is vital for addressing vaccine hesitancy and refining public health communication. In this paper, we curate a novel dataset from a large YouTube news corpus capturing COVID-19 vaccination experiences, and construct a benchmark subset focused on vaccine regret, annotated by a politically diverse panel to account for the subjective and often politicized nature of the topic. We utilize large language models (LLMs) to identify posts expressing vaccine regret, analyze the reasons behind this regret, and quantify its occurrence in both first and second-person accounts. This paper aims to (1) quantify the prevalence of vaccine regret; (2) identify common reasons for this sentiment; (3) analyze differences between first-person and vicarious experiences; and (4) assess potential biases introduced by different LLMs. We find that while vaccine buyer's remorse appears in only $<2\%$ of public discourse, it is disproportionately concentrated in vaccine-skeptic influencer communities and is predominantly expressed through first-person narratives citing adverse health events.
AI chatbots misdiagnose in over 80% of early medical cases, study finds
Top models including OpenAI and DeepSeek make judgments too quickly when patient data is incomplete
AI to predict how bowel cancer patients will respond to new NHS drug | Bowel cancer | The Guardian
PhenMap tool could spare thousands of patients from treatment that would be ineffective for them
7 ways AI is advancing healthcare and wellbeing around the world - Source
AI-powered tools are being used to bring greater efficiency and security to healthcare around the world and increase access to medicines and care.
Healthcare AI Faces Scaling Challenges
A Qventus study reveals that while many health IT leaders deploy AI, only 4% have achieved measurable outcomes due to integration hurdles.
AI Data Governance for Healthcare | Health AI
Data quality, privacy, and governance requirements for clinical AI
India Unveils Futuristic Surgical Tech at SMRSC 2026
India showcased battlefield care and tele-surgery innovations from SS Innovations International at the SMRSC 2026 event.
AI Policy for Reproductive Medicine in California | Health AI
AI compliance requirements for reproductive medicine in California. State-specific regulation, HIPAA, and governance guidance.
Healthcare AI Faces Scaling Challenges
A Qventus study reveals that while 42% of health IT leaders deploy AI across multiple use cases, only 4% have measurable outcomes, highlighting challenges in scaling AI pilots. Experts emphasize the risks of poor technology bets and the necessity of AI integration for competitive advantage, as healthcare systems push for …
Co-design for Trustworthy AI: An Interpretable and Explainable Tool for Type 2 Diabetes Prediction Using Genomic Polygenic Risk Scores
arXiv:2604.08217v1 Announce Type: new Abstract: The polygenic risk scores (PRS) have emerged as an important methodology for quantifying genetic predisposition to complex traits and clinical disease. Significant progress has been made in applying PRS to conditions such as obesity, cancer, and type 2 diabetes (T2DM). Studies have demonstrated that PRS can effectively identify individuals at high risk, thereby enabling early screening, personalized treatment, and targeted interventions for diseases with a genetic predisposition. One current limitation of PRS, however, is the lack of interpretability tools. To address this problem for T2DM, researchers at the Graduate School of Data Science at the Seoul National University introduced eXplainable PRS (XPRS). This visualization tool decomposes PRSs into gene-level and single-nucleotide polymorphism (SNP) contribution scores via Shapley Additive Explanations (SHAP), providing granular insights into the specific genetic factors driving an individual's risk profile. We used a co-design approach to assess XPRS trustworthiness by considering legal, medical, ethical, and technical robustness during early design and potential clinical use. For that, we used Z-inspection, an ethically aligned Trustworthy AI co-design methodology, and piloted the Council of Europe's Human Rights, Democracy, and the Rule of Law Impact Assessment for AI Systems (HUDERIA) (Council of Europe (CAI) 2025). The findings of this use-case comprise a comprehensive set of ethical, legal, and technical lessons learned. These insights, identified by a multidisciplinary team of experts (ethics, legal, human rights, computer science, and medical), serve as a framework for designers to navigate future challenges with this and other AI systems. The findings also provide a useful reference for researchers developing explainability frameworks for PRS in diverse clinical contexts.
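For a purely linear PRS with independent SNPs, the SHAP decomposition the abstract describes reduces to a closed form: each SNP's contribution is its effect size times the deviation of the genotype from its population mean. The effect sizes, SNP IDs' values and allele frequencies below are made-up illustrations, not values from XPRS:

```python
# Sketch of per-SNP contribution scores for a linear polygenic risk score.
# For a linear model with independent features, the exact Shapley value of
# SNP i is beta_i * (g_i - E[g_i]).  All numbers are illustrative assumptions.
betas = {"rs7903146": 0.30, "rs1801282": -0.15, "rs5219": 0.10}  # effect sizes
genotype = {"rs7903146": 2, "rs1801282": 0, "rs5219": 1}         # allele counts
pop_mean = {"rs7903146": 0.6, "rs1801282": 0.4, "rs5219": 0.9}   # E[g_i]

prs = sum(betas[s] * genotype[s] for s in betas)
contrib = {s: betas[s] * (genotype[s] - pop_mean[s]) for s in betas}

print(round(prs, 3))                                   # 0.7
print({s: round(c, 3) for s, c in contrib.items()})
```

The contributions sum to the individual's PRS minus the population-average PRS, which is what lets a tool like XPRS attribute an elevated score to specific SNPs.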
IatroBench: Pre-Registered Evidence of Iatrogenic Harm from AI Safety Measures
arXiv:2604.07709v1 Announce Type: cross Abstract: Ask a frontier model how to taper six milligrams of alprazolam (psychiatrist retired, ten days of pills left, abrupt cessation causes seizures) and it tells her to call the psychiatrist she just explained does not exist. Change one word ("I'm a psychiatrist; a patient presents with...") and the same model, same weights, same inference pass produces a textbook Ashton Manual taper with diazepam equivalence, anticonvulsant coverage, and monitoring thresholds. The knowledge was there; the model withheld it. IatroBench measures this gap. Sixty pre-registered clinical scenarios, six frontier models, 3,600 responses, scored on two axes (commission harm, CH 0-3; omission harm, OH 0-4) through a structured-evaluation pipeline validated against physician scoring (kappa_w = 0.571, within-1 agreement 96%). The central finding is identity-contingent withholding: match the same clinical question in physician vs. layperson framing and all five testable models provide better guidance to the physician (decoupling gap +0.38, p = 0.003; binary hit rates on safety-colliding actions drop 13.1 percentage points in layperson framing, p = 1 (kappa = 0.045)); the evaluation apparatus has the same blind spot as the training apparatus. Every scenario targets someone who has already exhausted the standard referrals.
AI Legislative Update: April 10, 2026 — Transparency Coalition. Legislation for Transparency in AI Now.
Every Friday, TCAI brings you the nation’s most comprehensive update of AI-related legislation moving through state legislatures. This week: Therapy chatbot bans are picking up speed. Maine sent a therapy bot ban to the governor, while Missouri is moving on a similar ban via an omnibus health ...
UnitedHealth Just Dropped $3 Billion On AI. Not To Save Your Life. To Deny Your Claim Faster.
The company now employs 22,000 software engineers. More than 80 percent are building AI tools. Not to find cures. Not to coordinate care.
Opinion | Meet Abi, the AI robot senior care companion - The Washington Post
Abi, an AI -powered companion, at a senior care home in Melbourne, Australia, in 2025.
How AI could change patient care, not replace it | Medical Economics
Patients are bringing their health questions to AI, often before even scheduling a visit with their physician. Medallia's Amber Maraccini, Ph.D., M.A., explains how tools such as ChatGPT Health can either erode trust or create more space for human conversations.
Navigating the European Union’s AI and health data framework - Atlantic Council
The EU is strengthening AI and health data governance, creating a more secure and trusted framework for innovation.
Watch: As AI Makes More Health Coverage Decisions, the Risks to Patients Grow - KFF Health News
Major health insurers and even Medicare are using artificial intelligence to make coverage decisions. But class action lawsuits have accused insurers of using AI to wrongfully withhold treatment, and new research illuminates the risks.
An Analysis of Artificial Intelligence Adoption in NIH-Funded Research
arXiv:2604.07424v1 Announce Type: cross Abstract: Understanding the landscape of artificial intelligence (AI) and machine learning (ML) adoption across the National Institutes of Health (NIH) portfolio is critical for research funding strategy, institutional planning, and health policy. The advent of large language models (LLMs) has fundamentally transformed research landscape analysis, enabling researchers to perform large-scale semantic extraction from thousands of unstructured research documents. In this paper, we illustrate a human-in-the-loop research methodology for LLMs to automatically classify and summarize research descriptions at scale. Using our methodology, we present a comprehensive analysis of 58,746 NIH-funded biomedical research projects from 2025. We show that: (1) AI constitutes 15.9% of the NIH portfolio with a 13.4% funding premium, concentrated in discovery, prediction, and data integration across disease domains; (2) a critical research-to-deployment gap exists, with 79% of AI projects remaining in research/development stages while only 14.7% engage in clinical deployment or implementation; and (3) health disparities research is severely underrepresented at just 5.7% of AI-funded work despite its importance to NIH's equity mission. These findings establish a framework for evidence-based policy interventions to align the NIH AI portfolio with health equity goals and strategic research priorities.
Grounding Clinical AI Competency in Human Cognition Through the Clinical World Model and Skill-Mix Framework
IatroBench: Pre-Registered Evidence of Iatrogenic Harm from AI Safety Measures
Ask a frontier model how to taper six milligrams of alprazolam (psychiatrist retired, ten days of pills left, abrupt cessation causes seizures) and it tells her to call the psychiatrist she just explained does not exist. Change one word ("I'm a psychiatrist; a patient presents with... ") and the same model, same weights, same inference pass produces a textbook Ashton Manual taper with diazepam equivalence, anticonvulsant coverage, and monitoring thresholds.
For Oura and Whoop, health has a reasonable chance of being wealth
The two health technology wearable companies have recently raised money at valuations of $11bn and $10bn respectively
Health - The Washington Post
Top health officials point to AI as the solution for dying hospitals, but experts from rural areas of the country say the solution isn’t that simple.
SymptomWise: A Deterministic Reasoning Layer for Reliable and Efficient AI Systems
arXiv:2604.06375v1 Announce Type: new Abstract: AI-driven symptom analysis systems face persistent challenges in reliability, interpretability, and hallucination. End-to-end generative approaches often lack traceability and may produce unsupported or inconsistent diagnostic outputs in safety-critical settings. We present SymptomWise, a framework that separates language understanding from diagnostic reasoning. The system combines expert-curated medical knowledge, deterministic codex-driven inference, and constrained use of large language models. Free-text input is mapped to validated symptom representations, then evaluated by a deterministic reasoning module operating over a finite hypothesis space to produce a ranked differential diagnosis. Language models are used only for symptom extraction and optional explanation, not for diagnostic inference. This architecture improves traceability, reduces unsupported conclusions, and enables modular evaluation of system components. Preliminary evaluation on 42 expert-authored challenging pediatric neurology cases shows meaningful overlap with clinician consensus, with the correct diagnosis appearing in the top five differentials in 88% of cases. Beyond medicine, the framework generalizes to other abductive reasoning domains and may serve as a deterministic structuring and routing layer for foundation models, improving precision and potentially reducing unnecessary computational overhead in bounded tasks.
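The deterministic reasoning step the abstract describes, ranking a finite hypothesis space against validated symptom representations, can be sketched with a toy codex. The conditions and feature sets below are illustrative assumptions, not SymptomWise's curated knowledge base:

```python
# Toy deterministic differential: score each candidate diagnosis in a finite
# hypothesis space by the fraction of its codex features matched by the
# extracted symptoms, then rank.  Codex contents are illustrative only.
codex = {
    "migraine": {"headache", "photophobia", "nausea"},
    "tension headache": {"headache", "neck stiffness"},
    "absence seizure": {"staring spells", "unresponsiveness"},
}

def differential(symptoms):
    """Rank diagnoses by feature overlap; fully traceable, no generation."""
    scores = {dx: len(symptoms & feats) / len(feats)
              for dx, feats in codex.items()}
    return sorted(scores.items(), key=lambda kv: -kv[1])

ranked = differential({"headache", "nausea"})
print(ranked[0][0])  # migraine
```

Because the inference is a fixed function over a finite space, every ranking is reproducible and auditable; the LLM's role is confined to mapping free text onto the symptom vocabulary upstream.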
Front-End Ethics for Sensor-Fused Health Conversational Agents: An Ethical Design Space for Biometrics
arXiv:2604.06203v1 Announce Type: new Abstract: The integration of continuous data from built-in sensors and Large Language Models (LLMs) has fueled a surge of "Sensor-Fused LLM agents" for personal health and well-being support. While recent breakthroughs have demonstrated the technical feasibility of this fusion (e.g., Time-LLM, SensorLLM), research primarily focuses on "Ethical Back-End Design for Generative AI", concerns such as sensing accuracy, bias mitigation in training data, and multimodal fusion. This leaves a critical gap at the front end, where invisible biometrics are translated into language directly experienced by users. We argue that the "illusion of objectivity" provided by sensor data amplifies the risks of AI hallucinations, potentially turning errors into harmful medical mandates. This paper shifts the focus to "Ethical Front-End Design for AI", specifically, the ethics of biometric translation. We propose a design space comprising five dimensions: Biometric Disclosure, Monitoring Temporality, Interpretation Framing, AI Stance, and Contestability. We examine how these dimensions interact with context (user- vs. system-initiated) and identify the risk of biofeedback loops. Finally, we propose "Adaptive Disclosure" as a safety guardrail and offer design guidelines to help developers manage fallibility, ensuring that these cutting-edge health agents support, rather than destabilize, user autonomy.
XRPH AI: Building the Right Foundation for What Comes Next | by XRP Healthcare | Apr, 2026 | Medium
The XRPH AI App is already live, with growing user engagement and a developing framework designed to connect real healthcare activity to measurable outcomes.
Tennessee Bill on AI Friends
I think what Tennessee is doing is they recently passed SB 1580, which makes it illegal to even advertise that an AI can act as a mental health professional.
Accenture global health lead on scaling AI in healthcare with governance and intent | TechTarget
It's about deploying it deliberately where it demonstrably improves outcomes and can be governed with confidence. And the winners are not the ones who adopt AI fastest. They're the ones who adopt it wisely. Jill Hughes has covered health tech news since 2021.
Advita Ortho Unveils AI-Driven Advances
Advita Ortho is spearheading AI-driven advancements in orthopedic care, unveiling 16 new studies at ORS 2026 focused on enhancing joint replacement surgeries with data-driven technologies.
MedGemma 1.5 Technical Report
arXiv:2604.05081v1 Announce Type: new Abstract: We introduce MedGemma 1.5 4B, the latest model in the MedGemma collection. MedGemma 1.5 expands on MedGemma 1 by integrating additional capabilities: high-dimensional medical imaging (CT/MRI volumes and histopathology whole slide images), anatomical localization via bounding boxes, multi-timepoint chest X-ray analysis, and improved medical document understanding (lab reports, electronic health records). We detail the innovations required to enable these modalities within a single architecture, including new training data, long-context 3D volume slicing, and whole-slide pathology sampling. Compared to MedGemma 1 4B, MedGemma 1.5 4B demonstrates significant gains in these new areas, improving 3D MRI condition classification accuracy by 11% and 3D CT condition classification by 3% (absolute improvements). In whole slide pathology imaging, MedGemma 1.5 4B achieves a 47% macro F1 gain. Additionally, it improves anatomical localization with a 35% increase in Intersection over Union on chest X-rays and achieves a 4% macro accuracy gain for longitudinal (multi-timepoint) chest X-ray analysis. Beyond its improved multimodal performance over MedGemma 1, MedGemma 1.5 improves on text-based clinical knowledge and reasoning, improving by 5% on MedQA accuracy and 22% on EHRQA accuracy. It also achieves an average of 18% macro F1 on 4 different lab report information extraction datasets (EHR Datasets 2, 3, 4, and Mendeley Clinical Laboratory Test Reports). Taken together, MedGemma 1.5 serves as a robust, open resource for the community, designed as an improved foundation on which developers can create the next generation of medical AI systems. Resources and tutorials for building upon MedGemma 1.5 can be found at https://goo.gle/MedGemma.
AI Boom Could Cause Health Care Costs to Soar - Penn LDI
AI is already affecting health care delivery, and the choices policymakers make about payment will define its future trajectory, says LDI Fellow Amol Navathe.
THE DAILY SCRAPE - by Brent Orrell - Help Desk
Your daily rundown on AI and what it means for the American workforce (Apr 08, 2026). Wall Street is embracing AI agents. While some older workers are choosing early retirement over reskilling, healthcare professionals are discovering that AI anxiety often gives way to cautious adoption once workers see the tools in action. The Trump administration has forced Brown University to invest $50 million in job training programs. Data Point of the Day: $30 billion. Anthropic's annual revenue run rate shows just how fast AI companies are scaling up. How soon can other AI companies move past capital burn and into profitability? (Bloomberg) Top story: AI Agents Are Coming for Wall Street (Inc.).
Uncertainty-Guided Latent Diagnostic Trajectory Learning for Sequential Clinical Diagnosis
arXiv:2604.05116v1 Announce Type: new Abstract: Clinical diagnosis requires sequential evidence acquisition under uncertainty. However, most Large Language Model (LLM) based diagnostic systems assume fully observed patient information and therefore do not explicitly model how clinical evidence should be sequentially acquired over time. Even when diagnosis is formulated as a sequential decision process, it is still challenging to learn effective diagnostic trajectories. This is because the space of possible evidence-acquisition paths is relatively large, while clinical datasets rarely provide explicit supervision information for desirable diagnostic paths. To this end, we formulate sequential diagnosis as a Latent Diagnostic Trajectory Learning (LDTL) framework based on a planning LLM agent and a diagnostic LLM agent. For the diagnostic LLM agent, diagnostic action sequences are treated as latent paths and we introduce a posterior distribution that prioritizes trajectories providing more diagnostic information. The planning LLM agent is then trained to follow this distribution, encouraging coherent diagnostic trajectories that progressively reduce uncertainty. Experiments on the MIMIC-CDM benchmark demonstrate that our proposed LDTL framework outperforms existing baselines in diagnostic accuracy under a sequential clinical diagnosis setting, while requiring fewer diagnostic tests. Furthermore, ablation studies highlight the critical role of trajectory-level posterior alignment in achieving these improvements.
Algebraic Structure Discovery for Real World Combinatorial Optimisation Problems: A General Framework from Abstract Algebra to Quotient Space Learning
arXiv:2604.04941v1 Announce Type: new Abstract: Many combinatorial optimisation problems hide algebraic structures that, once exposed, shrink the search space and improve the chance of finding the global optimal solution. We present a general framework that (i) identifies algebraic structure, (ii) formalises operations, (iii) constructs quotient spaces that collapse redundant representations, and (iv) optimises directly over these reduced spaces. Across a broad family of rule-combination tasks (e.g., patient subgroup discovery and rule-based molecular screening), conjunctive rules form a monoid. Via a characteristic-vector encoding, we prove an isomorphism to the Boolean hypercube $\{0,1\}^n$ with bitwise OR, so logical AND in rules becomes bitwise OR in the encoding. This yields a principled quotient-space formulation that groups functionally equivalent rules and guides structure-aware search. On real clinical data and synthetic benchmarks, quotient-space-aware genetic algorithms recover the global optimum in 48% to 77% of runs versus 35% to 37% for standard approaches, while maintaining diversity across equivalence classes. These results show that exposing and exploiting algebraic structure offers a simple, general route to more efficient combinatorial optimisation.
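The characteristic-vector encoding the abstract proves isomorphic to the Boolean hypercube can be sketched with bitmasks: each atomic condition gets one bit, and conjoining rules (logical AND) becomes bitwise OR of their masks, so functionally equivalent rule combinations collapse to the same integer. The clinical conditions below are illustrative assumptions:

```python
# Characteristic-vector encoding of conjunctive rules as bitmasks.
# Logical AND of rules maps to bitwise OR of their encodings, so the quotient
# space (equivalence classes of rules) is just the set of distinct integers.
conds = ["age>60", "HbA1c>7", "BMI>30"]          # illustrative atomic conditions
bit = {c: 1 << i for i, c in enumerate(conds)}   # one bit per condition

def encode(rule):
    """Map a rule (a set of atomic conditions) to its characteristic vector."""
    mask = 0
    for c in rule:
        mask |= bit[c]
    return mask

r1 = encode({"age>60"})
r2 = encode({"HbA1c>7", "age>60"})
combined = r1 | r2                               # AND of rules -> OR of bits
assert combined == encode({"age>60", "HbA1c>7"})  # redundant forms collapse
print(bin(combined))  # 0b11
```

Searching over these integers rather than over raw rule strings is what lets the quotient-space-aware genetic algorithm avoid re-evaluating equivalent rules.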
Resource-Conscious Modeling for Next-Day Discharge Prediction Using Clinical Notes
arXiv:2604.03498v1 Announce Type: new Abstract: Timely discharge prediction is essential for optimizing bed turnover and resource allocation in elective spine surgery units. This study evaluates the feasibility of lightweight, fine-tuned large language models (LLMs) and traditional text-based models for predicting next-day discharge using postoperative clinical notes. We compared 13 models, including TF-IDF with XGBoost and LGBM, and compact LLMs (DistilGPT-2, Bio_ClinicalBERT) fine-tuned via LoRA. TF-IDF with LGBM achieved the best balance, with an F1-score of 0.47 for the discharge class, a recall of 0.51, and the highest AUC-ROC (0.80). While LoRA improved recall in DistilGPT2, overall transformer-based and generative models underperformed. These findings suggest interpretable, resource-efficient models may outperform compact LLMs in real-world, imbalanced clinical prediction tasks.
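The TF-IDF featurization underlying the best-performing models can be sketched in a few lines of standard-library Python. The clinical notes below are invented illustrations, not MIMIC or study data, and a gradient-boosted classifier would consume these features downstream:

```python
import math
from collections import Counter

# Minimal TF-IDF over whitespace-tokenized notes: term frequency within a note
# times log inverse document frequency across the corpus.  Notes are invented
# examples for illustration only.
notes = ["ambulating well pain controlled discharge planned",
         "febrile overnight wound drainage continue monitoring",
         "tolerating diet ambulating discharge criteria met"]

def tfidf(docs):
    tokenized = [d.split() for d in docs]
    df = Counter(t for doc in tokenized for t in set(doc))  # document frequency
    n = len(docs)
    return [{t: c / len(doc) * math.log(n / df[t])
             for t, c in Counter(doc).items()} for doc in tokenized]

vecs = tfidf(notes)
# "discharge" appears in 2 of 3 notes, so it keeps a modest nonzero weight
print(round(vecs[0]["discharge"], 4))  # 0.0676
```

The appeal the paper highlights is visible even at this scale: the representation is interpretable (each weight names a token) and costs almost nothing to compute, in contrast to fine-tuning even a compact LLM.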
How AI is used in Pittsburgh hospitals, and what workers think
AI tools are becoming more commonplace in Pittsburgh hospitals, promising productivity gains for practitioners and solving the nursing crisis. But to local hospital employees, enduring understaffing and burnout, another solution is clear: Hire more workers.
AI in the mental health care workforce is met with fear, pushback — and enthusiasm | WFSU News
Artificial intelligence tools that help mental health therapists take notes and keep records are quickly entering the marketplace. But some question the safety of AI in mental health care delivery.
How MassMutual and Mass General Brigham Turned AI Pilot Sprawl into Production Results
This article provides practical lessons on moving from disconnected AI pilots to governed production value, emphasizing success metrics and flexible architecture.
neuroClues raises €10 million Series A to become the brain’s stethoscope for early diagnosis of neurological disorders
neuroClues, a French-Belgian MedTech startup empowering clinicians with biomarkers allowing them to identify neurological disorders years before visible symptoms, has raised a €10 million Series A, along with additional non-dilutive funding, bringing the total capital raised by the company to €25 million. The round is led by Teampact Ventures, White Fund and the EIC Fund […]
Health Care Roundup: Market Talk
Find insight on UltraGreen.ai, Pro Medicus and more in the latest Market Talks covering the health care sector.
AI in Healthcare: How Machine Learning Is Transforming Faster Disease Diagnosis in 2026
AI healthcare diagnostics improve disease detection accuracy and speed, transforming medical care with machine learning and predictive analytics.
AI in the mental health care workforce is met with fear, pushback — and enthusiasm
Artificial intelligence tools that help mental health therapists take notes and keep records are quickly entering the marketplace. But some question the safety of AI in mental health care delivery.
Navigating AI in Healthcare: Can Patients Opt Out of AI Note-Taking? - Dr. Matthew Lynch
In an era where technology continuously reshapes various aspects of our lives, the healthcare sector is no exception. One of the most notable advancements is the integration of artificial intelligence (AI) in clinical settings. Family physician Eric Boose from the Cleveland Clinic has embraced ...
World Health Day: AI doctors vs real doctors — where machines win and where they fail - The Times of India
Science News: A simple test by Dr Mikhail Varshavski, a practising, board-certified family medicine doctor based in New York, shows both AI's strengths and its blind spots.
State of the Art Report for Smart Habitat for Older Persons -- Working Group 3 -- Healthcare
arXiv:2604.03255v1 Announce Type: new Abstract: This document reports the State of the Art of science and practice on three topics related to smart and healthy ageing at home: furniture and habitats, Information and Communication Technologies (ICT), and healthcare. The reports were prepared by the working groups of COST Action CA16226, Sheld-on. Sheld-on is a network of researchers, user representatives, industry members, and other stakeholders. The three domains covered in this report were the areas of interest for three working groups from the COST Action. The aim of each working group was to assess the State of the Art for disciplinary understanding, identification of advances in smart furniture and habitat, products, industries and success stories. The findings on these topics of all working groups are compiled here. Due to the different backgrounds of the members of each of the working groups, the document is divided into three separate parts that can be considered as separate State of the Art reports. The goal of this document is to be used as input in the fourth working group of Sheld-on COST Action: Solutions for Ageing Well at Home, in the Community, and at Work, where experts from the three different domains converge to a single working group in order to achieve the action objectives.
VERT: Reliable LLM Judges for Radiology Report Evaluation
arXiv:2604.03376v1 Announce Type: new Abstract: Current literature on radiology report evaluation has focused primarily on designing LLM-based metrics and fine-tuning small models for chest X-rays. However, it remains unclear whether these approaches are robust when applied to reports from other modalities and anatomies. Which model and prompt configurations are best suited to serve as LLM judges for radiology evaluation? We conduct a thorough correlation analysis between expert and LLM-based ratings. We compare three existing LLM-as-a-judge metrics (RadFact, GREEN, and FineRadScore) alongside VERT, our proposed LLM-based metric, using open- and closed-source models (reasoning and non-reasoning) of different sizes across two expert-annotated datasets, RadEval and RaTE-Eval, spanning multiple modalities and anatomies. We further evaluate few-shot approaches, ensembling, and parameter-efficient fine-tuning using RaTE-Eval. To better understand metric behavior, we perform a systematic error detection and categorization study to assess alignment of these metrics against expert judgments and identify areas of lower and higher agreement. Our results show that VERT improves correlation with radiologist judgments by up to 11.7% relative to GREEN. Furthermore, fine-tuning Qwen3 30B yield gains of up to 25% using only 1,300 training samples. The fine-tuned model also reduces inference time up to 37.2 times. These findings highlight the effectiveness of LLM-based judges and demonstrate that reliable evaluation can be achieved with lightweight adaptation.
How MassMutual and Mass General Brigham turned AI pilot sprawl into production results
A case study on how two major organizations successfully scaled their AI initiatives from experimental pilots to full production.
FDA Clears Adjunctive AI Mapping of White Matter on Diffusion MRI | Diagnostic Imaging
Through automated processing of diffusion-weighted magnetic resonance imaging (MRI) scans, the Advanced Neuro Diagnostic Imaging (ANDI) software performs detailed analysis of white matter microstructure.
Healthcare’s AI inflection point: The organizations that win will be the ones with the strongest data foundations | Healthcare Dive
Healthcare doesn’t have an AI experimentation problem. It has an execution gap — and that gap is widening.
AI dolls offer companionship to the elderly
South Korea’s strained social care system turns to ChatGPT-enabled devices as population ages
CMEF 2026: Unveiling Global Medical Innovations and AI Breakthroughs
The 93rd China International Medical Equipment Fair in Shanghai will feature over 5,000 brands and highlight significant AI advancements in medical technology.
NHS staff resist using Palantir software
Staff reportedly cite ethics concerns, privacy worries, and doubt the platform adds much.
Episode 62 - by Karim Hanna, MD - AI+MedEd
Carl Preiksaitis, a Stanford emergency medicine physician and educator, makes a thoughtful case that the answer isn’t to ban AI — it’s to be deliberate about what we protect as human-only cognitive work. His list is clear: first-pass differential diagnosis, initial problem representation, and the assessment and plan.
Enhance Rehab Center Efficiency with AI-Powered Documentation Audits
Enhance rehab center efficiency with AI-powered documentation audits for improved accuracy and care.
The Rise of AI in Healthcare
AI is helping tackle some of the world's most pressing challenges.
AI Tool Empowers Caregivers Managing Stage 4 Cancer Treatment – ICO Optics
Pratik Desai, a 34-year-old with a background in systems integration and AI, built a free, AI-assisted workflow to help his […]
AI In Healthcare Across Sub-Saharan Africa - Voxilens
The decade of the “promising pilot” is over. In 2026, the narrative of AI in African healthcare has shifted from aspirational headlines to verifiable, population-level impact. As the continent faces…
NBER AI in Healthcare: Everything You Need to Know - AI Expert Magazine
KEY FACTS Date: May 7-8, 2026 Location: Cambridge, MA, USA Type: Academic-Industry Symposium Website: rapidscale.net What Is NBER AI in Healthcare? The NBER AI in Healthcare symposium is a premier academic-industry event focused on the rigorous, evidence-based evaluation of artificial ...
AWS and UnitedHealthcare Take Back-Office to Front-End Approach to Healthcare AI | PYMNTS.com
In this week’s Prompt Economy update, new examples from AWS and UnitedHealthcare highlight how agentic AI is being applied across healthcare.
AI in Healthcare and Digital Health Today—April 6, 2026 - LucidQuest Ventures
Weekly AI in Healthcare and Digital Health update highlighting Eli Lilly-Insilico deal and Medtronic strategy shift, plus more.
Clinical AI Shifts to Reasoning in Medical Coding Systems
Corti advances clinical AI with reasoning-based medical coding, improving accuracy and auditability in healthcare systems.
One-Third of U.S. Adults Use AI Chatbots for Health Information, KFF Poll Finds
April 6, 2026. A growing number of Americans are bypassing traditional search engines and doctor’s offices in favor of artificial intelligence, with 32% of adults reporting they have turned to AI chatbots for health information in the past year. This shift marks a significant pivot in how the public consumes medical advice, as the share of people using AI for health now equals the proportion who rely on social media for the same purpose. The trend is driven largely by a desire for immediacy and privacy, but it also reveals deeper systemic fractures in the U.S. healthcare system.
Open@Epic to Return in 2026 as Healthcare Data Sharing Accelerates | Epic
The second Open@Epic will be held in Verona, Wis. on Wednesday, October 21 and Thursday, October 22, 2026. Registration opens on July 16. Details and session information are available at open.epic.com/Conference. Alongside an expanding AI roadmap, Epic is highlighting measurable outcomes happening ...
Ember: Meal Scan, Macros & AI Coach
Ember: Meal scan, macros & AI coach.
UnitedHealth Group is making a $3 billion bet on AI. What does it mean for patients?
UHG is spending billions to embed AI to manage claims and care decisions. As 22,000 software engineers go to work, what are the benefits — and risks?
Wearable Robotics: €5 Million Raised For Neuromotor Rehabilitation Expansion
Wearable Robotics, a spin-off of the Sant’Anna School of Advanced Studies, has raised €5 million in a Series A funding round to accelerate its international expansion and advance its neuromotor rehabilitation technologies.
Chatbots Are Now Prescribing Psychiatric Drugs
Chatbots are now being used to prescribe psychiatric drugs, raising concerns about the role of AI in healthcare.
LatentView Soars 14% After $3M Investment in Healtheon AI
LatentView Analytics' stock soared 14% after the company announced a $3 million investment in Healtheon AI, aiming to enhance healthcare revenue cycle management through agentic AI.
Ethical AI or character assassination? - by Gene Balfour
Whether AI is perceived as a force for good or for evil varies from person to person. The selection of Palantir to improve access to medical records across the UK highlights this debate. Topic group: Labor & Society
AI Can Now Prescribe You Psychiatric Medication in Utah
The pilot program will see an AI chatbot, provided by health technology company Doctronic, prescribe psychiatric medication. There are plenty of caveats, however, in terms of what it can and can't prescribe. Would you trust an AI to prescribe you mind-altering psychiatric medication? Amid numerous controversies around chatbot therapy and AI giving bad (or dangerous) medical advice, one healthcare provider is betting you will and has received regulatory approval … Topic group: Adoption & Impact
I Uploaded My Blood Work to AI. Am I Oversharing?
When you connect medical records and health data to a chatbot, you get results. But you must understand the risks. Topic group: Labor & Society
These charts show the bulk of March’s job gains were concentrated in just a handful of sectors
Healthcare continued to drive gains in employment, while better weather in March also helped.
Work in healthcare, including nursing, boomed again in March. The sector has provided some of the most consistent job growth since the 1980s
Work in healthcare, including nursing, boomed again in March. The sector has provided some of the most consistent job growth since the 1980s.
It's Not Easy to Get Depression-Detecting AI Through the FDA
It's not easy to get depression-detecting AI through the FDA.
This new Nature paper (using old models) illustrates the point of my latest Substack post on AI interfaces. AI did a good job diagnosing medical issues, but when users had to interact with chatbots the interface led to confusion & worse answers
This new Nature paper (using old models) illustrates the point of my latest Substack post on AI interfaces. AI did a good job diagnosing medical issues, but when users had to interact with chatbots the interface led to confusion & worse answers My post: https://www.oneusefulthing.
AI Meets Egg Freezing
Sunfish, an AI-powered fertility platform, is introducing an egg-freezing program that uses predictive models to estimate the cost of reaching a target number of eggs.
Matthew Gallagher used a suite of AI tools to create a telehealth company that generated $401 million in sales in its first full year. It's a great example of what I call The New Rules of Wealth. I'm launching a @MasterClass today so more people can learn how to use AI to rapidly launch and scale businesses.
Matthew Gallagher used a suite of AI tools to create a telehealth company that generated $401 million in sales in its first full year. It's a great example of what I call The New Rules of Wealth.
IKS Health Acquires AI Team to Boost Patient Access Solutions
IKS Health acquires Tij Bedi's AI team, ThinkDTM, to enhance its patient access solutions with advanced AI capabilities.
Whoop hits $10bn valuation as health tracking takes off
Wearable device maker eyes IPO after raising $575mn from sovereign wealth funds and top athletes
Whoop, a Wearable Health Device Maker, Raises $575 Million
With elite athletes like LeBron James and Cristiano Ronaldo as investors, the company, now valued at $10 billion, is courting everyday health enthusiasts.
The Download: AI health tools and the Pentagon’s Anthropic culture war
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. There are more AI health tools than ever—but how well do they work? In the last few months alone, Microsoft, Amazon, and OpenAI have all launched medical chatbots. There’s a clear demand… Topic group: Adoption & Impact
Yuhan USA and Huinno Join Forces
Yuhan USA and Huinno join forces to introduce AI-driven ECG monitoring and prediction solutions in the US digital healthcare market.
Jonathon Trionfi - Group benefits guru | Specializing in self- ...
"There's no such thing as bad healthcare risk, only mispriced risk." Ali Panjwani built a predictive AI platform to prove that.
FDA Clears AI-Powered ECG Tool
Anumana's AI-driven ECG tool for pulmonary hypertension has achieved FDA clearance, marking a first for standard 12-lead ECGs in early PH detection.