
This week, headlines surfaced around an issue little known outside healthcare IT circles, yet one with implications for nearly anyone who has visited a doctor, taken a genetic test, or spoken with a therapist in recent years.
Epic, along with health systems including Reid Health and UMass Memorial Health, filed a federal lawsuit against Health Gorilla, alleging that patient medical records were improperly accessed and monetized through the healthcare data exchange ecosystem.
According to the complaint, sensitive information, including genetic data, mental health details, and reproductive health information, was accessed via the Carequality network, enabling at least two downstream entities to obtain patient records without appropriate authorization or patient consent.
While the lawsuit alleges a deliberate effort to circumvent safeguards such as the federal Trusted Exchange Framework and Common Agreement (TEFCA), the broader significance of this moment extends beyond any single actor or network. It exposes a structural weakness in how healthcare governs data once it moves beyond the EHR, one that is becoming increasingly dangerous in an era defined by instant data exchange and AI-driven reuse of unstructured clinical text.
Healthcare has invested heavily in permissioning structures: access controls, network participation rules, and exchange frameworks (like TEFCA) designed to regulate who can request data and how it flows between systems.
But permissioning alone does not answer a more fundamental question: once data has been accessed, what actually happens to it, and how is it used?
This is where governance breaks down. Access is treated as authorization, and authorization is assumed to imply appropriate use.
Healthcare needs privacy-by-design controls for text, not just permissioning structures at the network level. Sensitive information must be identified, minimized, or transformed before it leaves a system, not retroactively addressed after misuse occurs.
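To make that concrete, here is a minimal Python sketch of what "identify, minimize, or transform before it leaves a system" can look like. The patterns and the redact_before_export function are illustrative stand-ins only, not a reference to any particular product; production de-identification of clinical text relies on trained models and far broader entity coverage than a handful of regexes.

```python
import re

# Illustrative detection patterns only; real clinical de-identification
# depends on trained NER models, not a few regexes.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.IGNORECASE),
}

def redact_before_export(note: str) -> str:
    """Replace detected sensitive spans with typed placeholders
    so the note leaves the source system in minimized form."""
    for label, pattern in PATTERNS.items():
        note = pattern.sub(f"[{label}]", note)
    return note

if __name__ == "__main__":
    raw = "Pt seen 04/12/2024, MRN: 00482913. Call back at 617-555-0142 re: therapy follow-up."
    print(redact_before_export(raw))
    # -> "Pt seen [DATE], [MRN]. Call back at [PHONE] re: therapy follow-up."
```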
Much of healthcare data governance today relies on contracts, attestations, and good-faith agreements. These mechanisms assume that misuse will be rare, visible, and slow to emerge.
Modern data ecosystems do not behave that way.
Manual oversight does not scale when:
- data is exchanged instantly and at volume across many networks and intermediaries,
- misuse happens out of sight of the source system once records leave it, and
- consequences emerge long before an audit or contract dispute would surface them.
Relying on polite agreements and post-hoc audits is not governance; it's risk management by optimism.
Effective governance requires technical enforcement, not just policy language. Controls must be embedded in the systems that prepare and share data, ensuring that safeguards are applied consistently, automatically, and before exposure occurs.
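As a rough illustration of what that enforcement can look like, consider the sketch below. The enforce_then_share function and its minimize, detect, and send parameters are hypothetical placeholders for whatever components an organization actually runs; the point is that the safeguard executes in code on every export, rather than living only in contract language.

```python
from typing import Callable, Iterable

class PolicyViolation(Exception):
    """Raised when a document fails its pre-exchange check."""

def enforce_then_share(
    note: str,
    minimize: Callable[[str], str],
    detect: Callable[[str], Iterable[str]],
    send: Callable[[str], None],
) -> None:
    """Hypothetical export gate: minimize first, verify that detection
    finds nothing left, and only then hand the text to the transport
    layer. The check runs before exposure, every time."""
    minimized = minimize(note)
    leftovers = list(detect(minimized))
    if leftovers:
        raise PolicyViolation(f"export blocked; unresolved entities: {leftovers}")
    send(minimized)

if __name__ == "__main__":
    # Trivial stand-ins for the real minimization and detection steps.
    def scrub(text: str) -> str:
        return text.replace("00482913", "[MRN]")

    def scan(text: str) -> list[str]:
        return ["MRN"] if "00482913" in text else []

    enforce_then_share("Pt MRN 00482913, stable on current meds.", scrub, scan, print)
    # Prints the minimized note; raises PolicyViolation if anything slips through.
```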
Current healthcare privacy policies were designed for a world of structured data: fields, codes, and predefined schemas. But the most sensitive and valuable data today lives in unstructured text.
Clinical notes, referrals, discharge summaries, and patient communications contain nuance and context that structured frameworks were never built to handle. These documents resist simple redaction rules and are especially vulnerable to downstream misuse once accessed.
This gap is becoming more dangerous in the era of AI.
As unstructured text is increasingly used to train models, automate workflows, and generate insights, the consequences of insufficient governance multiply. Policy is not keeping pace with technology, and AI will only accelerate the mismatch.
As unstructured clinical text increasingly flows across organizations, the healthcare industry needs safeguards that travel with the data itself. That means shifting governance upstream: inspecting, minimizing, and protecting sensitive information before it is exchanged, reused, or repurposed.
This is where privacy-by-design becomes practical, not theoretical. When organizations can understand what sensitive information exists in their text data, apply consistent protections automatically, and maintain clear auditability, interoperability no longer has to come at the expense of patient trust.
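One way to picture that auditability, sketched in Python with a hypothetical audit_record helper: each exchange leaves behind a structured entry describing what was protected and when, without retaining the raw patient text itself.

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(doc_id: str, original: str, protected: str, entity_counts: dict) -> dict:
    """Hypothetical audit entry: what was transformed, when, and a
    fingerprint of the source, without storing the raw text."""
    return {
        "doc_id": doc_id,
        "source_sha256": hashlib.sha256(original.encode()).hexdigest(),
        "entities_transformed": entity_counts,  # e.g. {"DATE": 2, "MRN": 1}
        "protected_length": len(protected),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

# Each exchange appends one entry to an audit log that reviewers can
# inspect without ever touching the underlying patient text.
entry = audit_record("note-0001", "raw note text...", "[DATE] ...", {"DATE": 1})
print(json.dumps(entry, indent=2))
```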
At Tonic.ai, we see this shift happening in real time. Tools like Tonic Textual are designed to help healthcare teams identify and protect sensitive information in unstructured text at scale, so data can be shared, analyzed, and used for AI development without exposing raw patient details or relying solely on downstream promises of good behavior.
