MIT Scientists Study Patient Privacy Risks From AI Memorization in Clinical Models

Researchers at the Massachusetts Institute of Technology (MIT) are investigating how advanced artificial intelligence models trained on electronic health records (EHRs) could inadvertently memorize and reveal sensitive patient information, even when the data has been de-identified. As AI systems are increasingly deployed in healthcare for diagnostics and predictive insights, there is growing concern that they might not simply generalize from patterns across many patient records, but instead recall and reproduce details about individual patients. This type of memorization could pose serious privacy risks if malicious users exploit the models to extract confidential data.

The MIT team’s work — presented at the 2025 Conference on Neural Information Processing Systems (NeurIPS) — emphasizes that ordinary testing metrics aren’t enough to protect privacy in clinical AI. Instead, they advocate for rigorous evaluation methods specifically designed to measure how much information an attacker would need to prompt a model into revealing training data tied to a specific patient. Their approach seeks to distinguish between general knowledge (useful clinical patterns) and harmful memorized content that could be reconstructed with targeted queries.
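
To make this concrete, here is a minimal sketch of a targeted-query probe, assuming a Hugging Face causal language model as a stand-in for a clinical model. The model name, prompt prefix, and planted "canary" record are hypothetical illustrations, not details from the MIT study: an attacker supplies quasi-identifiers they already know and checks whether the model completes them with the rest of the record verbatim.

```python
# Hedged sketch of a targeted-extraction probe. All specifics (model,
# prefix, canary) are illustrative assumptions, not the MIT team's method.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; a real audit would target the clinical model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Quasi-identifiers an attacker might plausibly know about a target patient.
prefix = "Patient: 54-year-old male, admitted 2019-03-12, chief complaint:"
# The sensitive continuation being checked for (a planted "canary" here).
canary = "HIV-positive, on antiretroviral therapy"

inputs = tokenizer(prefix, return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=20,
    do_sample=False,  # greedy decoding: does the model reproduce the record?
)
completion = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:])

# Verbatim reproduction of the canary suggests memorization rather than
# generalization; a benign model should emit only generic clinical language.
print("memorized" if canary in completion else "not reproduced")
```

Greedy decoding is used deliberately here: verbatim reproduction under the single most likely continuation is stronger evidence of memorization than a match found only by sampling many times.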

To assess these risks, the researchers developed a suite of open-source tests that quantify model uncertainty and simulate "attack" scenarios ranging from trivial to severe. For example, leaking basic demographic data such as age may be less harmful than exposing a diagnosis like HIV status or a substance abuse history. Patients with unusual or unique health profiles are seen as especially vulnerable, since highly distinctive patterns increase the chances a model will unintentionally memorize specific records.
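
One common way to score such an attack, offered here as a hedged sketch rather than as the MIT team's actual test, is loss-based membership inference: if a model assigns systematically lower loss to records it was trained on than to comparable held-out records, an attacker can tell who was in the training data. The per-record losses below are synthetic stand-ins; a real audit would compute them from training versus held-out EHR records.

```python
# Hedged sketch of a loss-based membership-inference test on synthetic data.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Hypothetical per-record losses: memorized training records tend to score
# lower loss than unseen records drawn from the same patient population.
member_losses = rng.normal(loc=1.8, scale=0.4, size=500)     # in training set
nonmember_losses = rng.normal(loc=2.6, scale=0.4, size=500)  # held out

# Attack rule: predict "member" when loss is low. An AUC near 0.5 means the
# model generalizes; an AUC near 1.0 means membership leaks to an attacker.
labels = np.concatenate([np.ones(500), np.zeros(500)])
scores = np.concatenate([-member_losses, -nonmember_losses])  # lower loss => higher score
print(f"membership-inference AUC: {roc_auc_score(labels, scores):.2f}")
```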

The study underscores the critical need for privacy safeguards before clinical AI systems are widely released. Even when datasets are de-identified, the research suggests that models can still learn and repeat information tied to individuals, challenging assumptions about how well current protections work. The researchers hope their testing framework becomes a standard part of evaluating clinical AI, balancing the technology’s potential benefits with rigorous steps to prevent breaches of patient trust and confidentiality.

About the author

TOOLHUNT

Effortlessly find the right tools for the job.
