
Tonic.ai product updates: August 2025

Author
Chiara Colombi
August 29, 2025

We're excited to share the latest updates and announcements designed to improve your experience with our products. This month's issue includes:

  • Introducing Tonic Structural's Data Vending Machine
  • Streamlining Structural with schema caching
  • Structural's Document View now available for PostgreSQL
  • Object and Array generators for JSON and XML data in Fabricate
  • LLM synthesis strengthened with built-in models in Textual
  • Textual pipelines unified with datasets workflow
  • and our new Audio Redaction and Synthesis playbook

Introducing Structural's Data Vending Machine

Bring the power of Tonic Structural directly into your CI/CD pipeline and into the hands of your developers with the Data Vending Machine, a new offering currently in preview. Built specifically for data end users, this feature lets Structural automate the creation of ephemeral, ticket-specific databases that your developers order directly, eliminating data bottlenecks and accelerating your entire development cycle. Connect with our team today to schedule a private preview and help direct the future of the product.


Streamlining Structural with schema caching

For those of you with sizeable schemas, we’ve released an exciting performance improvement in Structural: a new caching option for schemas. It significantly speeds up UI loading in places where fetching a schema previously lagged, so you can get to generating and refreshing your data faster.

Find complete instructions on how to use the capability in our product docs. As always, we welcome your feedback and any performance questions you may have—or if you just want to talk cool engineering problems, this was a fun one to solve!

Document View now available for PostgreSQL

JSON data in Structural is more manageable than ever, thanks to expanded functionality of our UI for semi-structured data: Document View now supports PostgreSQL databases, giving you a clear, visual way to explore your JSON content.


This update also lets PostgreSQL users assign generators to path expressions, a feature that allows you to create bulk rules that automatically protect sensitive fields matching the configured path expressions. This simplifies generator configuration and extends our existing capabilities for supported JSON data sources, like MongoDB and Files, so that you can manage large datasets quickly and easily. Connect with our team and check out our product docs to get started.
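To make the idea of bulk rules keyed on path expressions more concrete, here is a minimal Python sketch. It is not Structural's implementation and does not use Structural's path syntax: the `$.field[*]` notation, the `RULES` mapping, and the helper names `mask_email`, `mask_digits`, and `apply_rules` are all assumptions made for illustration.

```python
import hashlib


def mask_email(value):
    """Hypothetical generator: replace an email with a stable fake address."""
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()[:8]
    return f"user_{digest}@example.com"


def mask_digits(value):
    """Hypothetical generator: keep the last four characters, star the rest."""
    text = str(value)
    return "*" * max(len(text) - 4, 0) + text[-4:]


# Assumed rule format: path expression -> generator function.
# "[*]" stands for "every element of this array".
RULES = {
    "$.customer.email": mask_email,
    "$.orders[*].card_number": mask_digits,
}


def apply_rules(node, rules, path="$"):
    """Return a copy of a parsed JSON value with matching leaves replaced."""
    if isinstance(node, dict):
        return {k: apply_rules(v, rules, f"{path}.{k}") for k, v in node.items()}
    if isinstance(node, list):
        return [apply_rules(item, rules, f"{path}[*]") for item in node]
    generator = rules.get(path)
    return generator(node) if generator else node


document = {
    "customer": {"name": "Ann", "email": "ann@corp.test"},
    "orders": [
        {"card_number": "4111111111111111", "total": 10},
        {"card_number": "5500000000000004", "total": 5},
    ],
}
protected = apply_rules(document, RULES)
```

The point of the sketch is that one rule covers every field that shares a path, so a single entry protects the card number in every order rather than requiring per-field configuration.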

Object and Array generators for JSON and XML data in Fabricate

Generate synthetic columns of JSON and XML data in Fabricate for a variety of use cases with two newly released and highly customizable generators: Object and Array. These generators significantly broaden the types of data that Fabricate can spin up, unlocking the ability to model the JSON columns in your target database schema or generate XML data within database columns or as a collection of XML documents.

Already a Fabricate power user? Try the SQL generator within Objects to simulate realistic data interdependencies: reference other properties within the same object or any ancestor using "{parent(n)}", or columns on the root table using "{root_table}". Or use Array to create an array of events, each with a timestamp property that uses the Datetime generator with type: Series, to simulate events that occur between (min) and (max) seconds apart.
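For readers who want a feel for what a series of events spaced between a minimum and maximum number of seconds apart looks like, here is a small Python sketch. It only imitates the described output and is not how Fabricate generates data; the function name `generate_events`, its parameters, and the `event_id` property are hypothetical.

```python
import random
from datetime import datetime, timedelta


def generate_events(count, start, min_gap_s, max_gap_s, seed=None):
    """Build a list of event objects whose timestamps form an increasing series.

    Each event follows the previous one by a random whole number of seconds
    between min_gap_s and max_gap_s (inclusive).
    """
    rng = random.Random(seed)  # seeded so a run can be reproduced
    events = []
    current = start
    for index in range(count):
        events.append({"event_id": index + 1, "timestamp": current.isoformat()})
        current += timedelta(seconds=rng.randint(min_gap_s, max_gap_s))
    return events


events = generate_events(
    count=5,
    start=datetime(2025, 8, 1, 9, 0, 0),
    min_gap_s=30,
    max_gap_s=120,
    seed=7,
)
```

Serializing `events` with `json.dumps` would give one JSON array value of the kind that could sit in a single column.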

Support for complex, nested data types means more flexibility and more possibilities in crafting the data you need. Get started with a free account of Fabricate today.


LLM synthesis strengthened with built-in models in Textual

LLM synthesis in Textual, formerly reliant on OpenAI, has been upgraded to run on built-in Gemma-based models with custom LoRA adapters, making it faster, more affordable, and fully self-contained within the Tonic platform. This upgrade also brings a notable improvement in Textual’s ability to link entities: the system recognizes variations such as “Sarah,” “S-A-R-A-H,” and “Sarah Smith,” groups them logically, picks a canonical form, and applies consistent replacements.
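The group, canonicalize, and replace steps described above can be illustrated with a deliberately simple Python sketch. Textual performs this with its built-in models; the string heuristic below (matching on a normalized first token and choosing the longest mention as canonical) is only an assumption made for illustration, and the function names are hypothetical.

```python
import re


def normalize(mention):
    """Collapse spelled-out forms like 'S-A-R-A-H' and lowercase the result."""
    if re.fullmatch(r"(?:[A-Za-z]-)+[A-Za-z]", mention):
        mention = mention.replace("-", "")
    return mention.lower()


def link_entities(mentions):
    """Group mentions by normalized first token; map each mention to a canonical form."""
    groups = {}
    for mention in mentions:
        key = normalize(mention).split()[0]
        groups.setdefault(key, []).append(mention)
    linked = {}
    for members in groups.values():
        canonical = max(members, key=len)  # toy choice: the longest mention
        for member in members:
            linked[member] = canonical
    return linked


def build_replacements(linked, fake_names):
    """Assign one synthetic name per canonical entity so replacements stay consistent."""
    names = iter(fake_names)
    by_canonical = {}
    replacements = {}
    for mention, canonical in linked.items():
        if canonical not in by_canonical:
            by_canonical[canonical] = next(names)
        replacements[mention] = by_canonical[canonical]
    return replacements


linked = link_entities(["Sarah", "S-A-R-A-H", "Sarah Smith", "Tom"])
replacements = build_replacements(linked, ["Maya Lopez", "Ben Carter"])
```

A real system would also adapt the replacement to the form of each mention, for example substituting only a first name where the original used only a first name; this sketch skips that step.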

Format preservation during data synthesis—maintaining the original text’s style, spacing, and capitalization—was already supported and continues to ensure smooth, realistic synthetic results. LLM synthesis can be invoked through the SDK or in the Textual playground UI on the homepage. Haven't created an account yet? Sign up for free.

Textual pipelines unified with datasets workflow

In Textual, all pipeline functionality is now available directly within the datasets workflow, giving you a single place to manage end-to-end processing without relying on separate Pipelines workflows. With a datasets workflow, you can connect to data in your cloud services (Amazon S3, Azure, or SharePoint) and generate data in the JSON Output format for easier downstream integration. Both cloud-based and local file workflows now run seamlessly in datasets, unifying pipeline functionality into a more streamlined and flexible experience that simplifies setup and keeps your workflows future-proof.

Since all pipeline functionality is now supported through datasets, the Pipelines feature in Textual is now deprecated. We recommend migrating your existing pipelines as soon as possible to avoid disruption. The Pipelines feature in Textual will be sunset on October 1, 2025.


Audio Redaction and Synthesis playbook

Working with sensitive audio data? Our latest playbook equips you to operationalize audio data for AI development while still protecting privacy and adhering to compliance restrictions on the governance of PII. The Audio Redaction and Synthesis playbook includes step-by-step guidance for taking audio files and recordings and transforming them into sanitized transcripts that are ready for downstream use, including model training and development.

With Tonic Textual’s audio capabilities, teams can redact sensitive entities within these files or replace them with synthetic but true-to-life alternatives that preserve context and allow for complete and realistic datasets. Check out the playbook to:

  • View a video walkthrough of the audio redaction and synthesis use case
  • Access a pre-built Jupyter notebook to try for yourself
  • Download a sample audio file for experimentation

Sign up for a free trial of Textual to get started today.


Small updates; big impacts

Often it's the little things that matter most. Here's a roundup of our smaller releases.

Tonic Structural

  • Structural now offers performance visualization on the jobs page to help you identify and mitigate bottlenecks in your data generation runs. Because who doesn’t love a little performance optimization?
  • We’ve improved file connector performance in Structural by adding a file read parallelism setting configurable at a workspace level—a significant optimization for customers with lots of files in a single file group.

Tonic Fabricate

  • When you create a database or add a table, a new AI Hints field allows you to provide an additional prompt to help further define the database or table. For example, you might indicate specific generators to use or specify limits on column values.
  • The new Markov Chain generator equips you to simulate a realistic flow through different states in your structured data, including allowing you to define the probability of transitioning from any given state to another. How’s that for triggering a flow state?
  • Self-hosted customers can now use Ollama as their LLM provider. One more option in the BYO LLM menu.
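As a rough illustration of the Markov chain idea mentioned in the list above, the following Python sketch walks through states using per-state transition probabilities. It shows the general technique only, not Fabricate's generator or its configuration format; the state names, the `TRANSITIONS` table, and the `simulate` function are made up for this example.

```python
import random

# Hypothetical transition table: each state maps to {next_state: probability}.
# The probabilities in each non-empty row sum to 1; an empty row marks an end state.
TRANSITIONS = {
    "visited": {"signed_up": 0.3, "churned": 0.7},
    "signed_up": {"purchased": 0.5, "churned": 0.5},
    "purchased": {"purchased": 0.4, "churned": 0.6},
    "churned": {},
}


def simulate(transitions, start, max_steps, seed=None):
    """Return the sequence of states visited, starting from `start`."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    state = start
    path = [state]
    for _ in range(max_steps):
        options = transitions.get(state, {})
        if not options:  # end state reached: stop early
            break
        next_states = list(options)
        weights = list(options.values())
        state = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path


path = simulate(TRANSITIONS, start="visited", max_steps=10, seed=1)
```

Each call produces one plausible sequence of states, such as the lifecycle of a single customer; repeating it per row yields a column of realistic state flows.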

As always, we'd love to hear your feedback on our products. What do you need? What do you love? What could be better? Send us a note at hello@tonic.ai or book time directly with our team. And for all the latest updates, be sure to check out our complete release notes.

Chiara Colombi
Director of Product Marketing
A bilingual wordsmith dedicated to the art of engineering with words, Chiara has over a decade of experience supporting corporate communications at multinational companies. She once translated for the Pope; it has more overlap with translating for developers than you might think.