
We’re excited to announce Tonic Textual for Microsoft Fabric, now available in public preview to all Fabric users. Textual brings advanced entity detection, redaction, and synthesis for unstructured text data directly into the Fabric ecosystem, so organizations can unlock datasets that were previously off limits due to privacy concerns and use them for responsible, compliant AI/ML development.
With Tonic Textual integrated into the Fabric user experience, teams can prepare office documents, PDFs and images containing sensitive text for AI and machine learning tasks—all while protecting privacy and maintaining compliance with regulations such as HIPAA and GDPR.
For most organizations, the biggest barrier to AI innovation isn’t compute power or LLMs; it’s access to compliant data. While structured data can be easily masked or tokenized, unstructured text remains an unruly challenge. Its variety is effectively unbounded, and personal information is often embedded in nuance and context that traditional tools struggle to detect.
Manually de-identifying large volumes of sensitive data is an overwhelming task, and often an unachievable one. Each document must be reviewed line by line to locate private information, which makes in-house solutions both unsustainable and error-prone, especially as data volumes grow and compliance requirements evolve.
Tonic Textual for Microsoft Fabric eliminates that burden. By combining Fabric’s lake-centric architecture and governance with Tonic’s AI-driven de-identification engine, users can easily and automatically identify and protect sensitive entities—such as names, dates, and medical or financial identifiers—without moving data out of Fabric.
The result: privacy-preserving datasets that are immediately ready for downstream workflows, including model training, generative AI workloads, and AI agents.
Consider a healthcare organization developing an AI assistant to help clinicians summarize patient case notes. These records contain valuable medical insights—but also a significant amount of personally identifiable information (PII) and protected health information (PHI).
Using Tonic Textual in Microsoft Fabric, the organization can safely process and transform its unstructured EHR data directly in OneLake. Textual automatically detects and anonymizes sensitive entities while maintaining the integrity of clinical language—ensuring data utility for downstream analytics, ML training and AI.
This enables data scientists, clinicians, and business users to collaborate confidently, knowing that sensitive data is protected and never leaves their secure Fabric environment.
Example: How to prepare sensitive text data for AI in just a few clicks.

From your Fabric console, navigate to Workloads and select the Tonic Textual workload to add it to your workspace. Learn more about adding workloads. Once added, you will have access to the Textual UI directly from your Microsoft Fabric console.


To use Textual, first open your workspace and click New Item. Select Tonic Textual from the list of available items.
Next, choose the OneLake Lakehouse containing the files you want to process, then choose a target folder where the sanitized files will be saved.

Select the specific files, or entire folders of files, containing the sensitive data you want to sanitize. In this example, we have scanned two files from the folder ‘Patient Data’. On the right-hand side, the job status shows multiple detections of first and last names.

After reviewing the initial analysis of sensitive text identified within your documents, decide what action to take for each entity type. You can combine redaction and synthesis (replacing a value with a true-to-life substitute, e.g., “John” becomes “William”), or leave certain entities untouched. In this example, we use the “Bulk Edit” option to automatically redact all of the sensitive entities.
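To make the difference between the three actions concrete, here is a minimal Python sketch of what redacting, synthesizing, or keeping a detected entity means for a piece of text. It is an illustration only and not Textual’s API or implementation: the `Entity` type, the label names, the `SUBSTITUTES` table, and the `deidentify` function are all hypothetical, and the entity spans are supplied by hand rather than detected by a model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A detected span of sensitive text (hypothetical structure)."""
    start: int
    end: int
    label: str


# Hypothetical replacement values used when an entity is synthesized.
SUBSTITUTES = {"NAME_GIVEN": "William", "NAME_FAMILY": "Harper"}


def find_entity(text: str, needle: str, label: str) -> Entity:
    """Stand-in for detection: locate a known string and label it."""
    start = text.index(needle)
    return Entity(start, start + len(needle), label)


def deidentify(text: str, entities: list[Entity], actions: dict[str, str]) -> str:
    """Apply 'redact', 'synthesize', or 'keep' per entity label.

    Labels missing from `actions` default to 'redact'.
    Entities are assumed not to overlap.
    """
    pieces = []
    cursor = 0
    for ent in sorted(entities, key=lambda e: e.start):
        pieces.append(text[cursor:ent.start])
        action = actions.get(ent.label, "redact")
        if action == "redact":
            # Replace the value with a placeholder naming the entity type.
            pieces.append(f"[{ent.label}]")
        elif action == "synthesize":
            # Replace the value with a realistic substitute, if one is known.
            pieces.append(SUBSTITUTES.get(ent.label, f"[{ent.label}]"))
        elif action == "keep":
            pieces.append(text[ent.start:ent.end])
        else:
            raise ValueError(f"unknown action: {action}")
        cursor = ent.end
    pieces.append(text[cursor:])
    return "".join(pieces)


note = "John Smith was admitted on 2024-03-01."
entities = [
    find_entity(note, "John", "NAME_GIVEN"),
    find_entity(note, "Smith", "NAME_FAMILY"),
    find_entity(note, "2024-03-01", "DATE"),
]
actions = {"NAME_GIVEN": "synthesize", "NAME_FAMILY": "redact", "DATE": "keep"}
result = deidentify(note, entities, actions)
print(result)
```

In this sketch the given name is synthesized, the family name is redacted, and the date is left untouched, mirroring the per-entity choices described above. A bulk redaction corresponds to passing an empty `actions` mapping, since every label then falls back to the redact default.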

Once the de-identification job is complete, your sanitized files are accessible in the target folder you chose earlier. These files are replicas of the originals, with entities redacted or replaced according to your de-identification strategy. Your original files remain unaltered in their source Lakehouse, and the sanitized versions are ready for downstream use. With this data unlocked, you can use Azure AI Foundry service to build AI agents in Azure Copilot Studio, enable search using Azure AI Search, or train your own ML models using Azure Machine Learning.
Bringing Tonic Textual into Microsoft Fabric makes it easier than ever for organizations to responsibly harness the power of their unstructured data.
With this integration, Microsoft and Tonic.ai are helping organizations move from blocked to AI-ready, at scale.
Tonic Textual for Microsoft Fabric is now available in Public Preview.
Explore the integration, attend upcoming sessions at Microsoft Ignite, and enable Textual directly in your Fabric workspace.
Visit https://www.tonic.ai/partners/microsoft/fabric to learn more about this integration.
