Prepare Unstructured Data for AI in Microsoft Fabric with Tonic Textual | Blog

AI teams need to move quickly, but the reality is that most of the data they need simply isn’t ready for AI. In fact, Gartner predicts that through 2026, 60 % of AI projects will be abandoned because they lack AI-ready data.

Across the enterprise, the most valuable inputs for AI live in unstructured text — notes, support tickets, contracts, call transcripts, internal documentation, and customer communications all contain rich signals models need to learn from. Yet this data is frequently off-limits because it embeds sensitive information that jeopardizes compliance or violates internal privacy policies.

Today, we are excited to announce the general availability of Tonic Textual on Microsoft Fabric, giving teams a secure, Fabric-native way to detect, de-identify, and prepare unstructured text data for AI development.

With this release, organizations can finally move large backlogs of sensitive text from “off limits” to “AI ready” without moving data out of their governed Fabric environment.

Unstructured data is the bottleneck for AI development

The signals that matter most for AI live in unstructured text: how customers speak, how clinicians document care, how work actually gets done. This data holds enormous potential, but its complexity and sensitivity make it the primary constraint on AI progress.

That said, this type of data is laborious to inspect, difficult to govern, and risky to share with models. Sensitive entities such as names, identifiers, locations, and domain-specific values are deeply embedded in free text, PDFs, and scanned documents. Manual redaction does not scale, and one-off scripts can be brittle and incomplete.

As a result, teams either delay AI projects or accept unnecessary privacy risk by exposing raw text to models during training, evaluation, or inference.

AI success depends on having data that is not only high quality, but safe to use.

Preparing unstructured data inside Fabric

Tonic Textual brings AI-native data preparation directly into Microsoft Fabric.

Running inside the Fabric ecosystem, Textual automatically detects sensitive entities across unstructured files stored in OneLake and applies consistent transformations that make text safe for AI development. Teams can redact values, replace them with high-fidelity synthetic data, or apply custom handling based on their use case.

This allows organizations to prepare text data for a wide range of AI workflows, including:

Model training and fine-tuning
Retrieval-augmented generation pipelines
Evaluation and testing environments
Secure prompt and inference workflows

All this is now possible without exporting data to external systems or breaking existing governance controls.

From backlog to usable data

With Tonic Textual on Fabric, AI teams can automatically detect sensitive information across large volumes of unstructured text and apply de-identification or high-fidelity synthetic replacement at scale. Document structure and contextual integrity are preserved, ensuring that prepared data remains useful for downstream AI workflows rather than stripped of meaning.

By enforcing consistent privacy policies across training, evaluation, and inference, teams can move beyond isolated experiments and confidently operationalize AI in production environments. What was once a massive backlog of unusable text becomes a trusted, governed input for AI development.

Ready for production workloads

General availability of Tonic Textual on Microsoft Fabric marks an important milestone for teams building AI in production environments.

Textual on Fabric is built for scale, reliability, and enterprise governance. It integrates directly with Fabric workloads and OneLake storage, enabling privacy-first AI data preparation as a native part of the AI lifecycle rather than a downstream control.

By preparing unstructured data before it ever reaches a model, organizations reduce risk, accelerate development, and create a foundation for responsible AI at scale.

Available Now

Tonic Textual for Microsoft Fabric is generally available today and can be added directly from the Fabric Workload Hub.

If you are building AI with unstructured data and need a secure way to prepare sensitive text for production, you can start a project with Textual on Fabric today.

From off-limits to AI-Ready: Preparing unstructured data directly in Microsoft Fabric with Tonic Textual

Unstructured data is the bottleneck for AI development

Preparing unstructured data inside Fabric

From backlog to usable data

Ready for production workloads

Available Now

Want to make your data usable?

Related blog posts

How to de-identify financial documents with Tonic Textual

Tonic.ai + Microsoft: Accelerating AI adoption with privacy-compliant synthetic data

How synthetic data can help solve AI’s data crisis

Make your sensitive data usable for testing and development.