We're excited to share the latest updates and announcements designed to improve your experience with our products. This month's issue includes:
Bring the power of Tonic Structural directly into your CI/CD pipeline and into the hands of your developers with the Data Vending Machine, our up-and-coming offering currently in preview. With this new feature built specifically for data end-users, Structural automates the creation of ephemeral, ticket-specific databases, ordered directly by your developers to eliminate data bottlenecks and accelerate your entire development cycle. Connect with our team today to schedule a private preview and help direct the future of the product.
For those of you with sizeable schemas, we’ve released an exciting performance improvement in Structural: a new caching option for schemas. It significantly speeds up UI loading where fetching a schema previously lagged, so you can generate and refresh your data faster.
Find complete instructions on how to use the capability in our product docs. As always, we welcome your feedback and any performance questions you may have. And if you just want to talk through cool engineering problems, this was a fun one to solve!
JSON data in Structural is more manageable than ever, thanks to expanded functionality of our UI for semi-structured data: Document View now supports PostgreSQL databases, giving you a clear, visual way to explore your JSON content.
This update also lets PostgreSQL users assign generators to path expressions, so you can create bulk rules that automatically protect sensitive fields matching the configured path expressions. This simplifies generator configuration and expands on our existing capabilities for supported JSON data sources, like MongoDB and Files, so you can manage large datasets quickly and easily. Connect with our team and check out our product docs to get started.
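To make the idea of bulk rules keyed on path expressions concrete, here is a minimal Python sketch. It is not Structural's implementation or its path syntax: the dotted pattern format, the `*` wildcard, and the names `path_matches` and `apply_rules` are all assumptions made for illustration.

```python
from typing import Any, Callable

# A rule pairs a hypothetical dotted path pattern with a generator function.
Rule = tuple[str, Callable[[Any], Any]]


def path_matches(pattern: str, path: tuple[str, ...]) -> bool:
    """Return True if every segment matches; '*' matches any single segment."""
    parts = pattern.split(".")
    return len(parts) == len(path) and all(
        p == "*" or p == s for p, s in zip(parts, path)
    )


def apply_rules(node: Any, rules: list[Rule], path: tuple[str, ...] = ()) -> Any:
    """Walk a JSON-like value and replace leaf values whose path matches a rule."""
    if isinstance(node, dict):
        return {k: apply_rules(v, rules, path + (k,)) for k, v in node.items()}
    if isinstance(node, list):
        # Array indices become path segments, so '*' can cover every element.
        return [apply_rules(v, rules, path + (str(i),)) for i, v in enumerate(node)]
    for pattern, generator in rules:
        if path_matches(pattern, path):
            return generator(node)
    return node


doc = {
    "customer": {"email": "sarah@example.com", "plan": "pro"},
    "contacts": [{"email": "a@example.com"}, {"email": "b@example.com"}],
}
rules = [
    ("customer.email", lambda value: "redacted@example.com"),
    ("contacts.*.email", lambda value: "redacted@example.com"),
]
masked = apply_rules(doc, rules)
```

One rule with a wildcard covers every element of the `contacts` array, which is the practical benefit of matching on paths instead of configuring each field individually.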
Generate synthetic columns of JSON and XML data in Fabricate for a variety of use cases with two newly released and highly customizable generators: Object and Array. These generators significantly broaden the types of data that Fabricate can spin up, unlocking the ability to model the JSON columns in your target database schema or generate XML data within database columns or as a collection of XML documents.
Already a Fabricate power user? Try the SQL generator within Objects to simulate realistic data interdependencies: reference other properties within the same object or any ancestor using "{parent(n)}", or columns on the root table using "{root_table}". Or leverage Array to create an array of events with a timestamp property that uses the Datetime generator with type: Series to simulate events between (min) and (max) seconds apart.
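For a sense of the data shape this produces, the following Python sketch builds an array of events whose timestamps fall a random number of seconds apart, within a minimum and maximum gap. It does not use Fabricate; the function name `event_series`, its parameters, and the `event_id` and `timestamp` property names are assumptions for illustration only.

```python
import random
from datetime import datetime, timedelta
from typing import Optional


def event_series(
    start: datetime,
    count: int,
    min_gap: int,
    max_gap: int,
    seed: Optional[int] = None,
) -> list[dict]:
    """Return `count` events, each min_gap..max_gap seconds after the previous one."""
    rng = random.Random(seed)  # a seed keeps the output reproducible
    events = []
    current = start
    for i in range(count):
        events.append({"event_id": i + 1, "timestamp": current.isoformat()})
        # randint is inclusive on both ends.
        current += timedelta(seconds=rng.randint(min_gap, max_gap))
    return events


events = event_series(datetime(2025, 1, 1), count=5, min_gap=30, max_gap=300, seed=7)
```

Each event depends on the one before it, so the timestamps always increase, which is what distinguishes a series from independently sampled datetimes.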
Support for complex, nested data types means more flexibility and more possibilities in crafting the data you need. Get started with a free account of Fabricate today.
LLM synthesis in Textual, formerly reliant on OpenAI, has been upgraded to run on built-in Gemma-based models with custom LoRA adapters, making it faster, more affordable, and fully self-contained within the Tonic platform. This upgrade also brings a notable improvement in Textual’s ability to link entities: the system recognizes variations such as “Sarah,” “S-A-R-A-H,” and “Sarah Smith,” groups them logically, picks a canonical form, and applies consistent replacements.
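The group, canonicalize, and replace flow described above can be pictured with a toy Python sketch. Textual's linking is model-based; the string heuristics below (collapsing spelled-out letters and grouping on the first name token) and the names `normalize`, `link_entities`, and `canonical_forms` are simplifying assumptions, not the product's method.

```python
import re


def normalize(mention: str) -> str:
    """Lowercase a mention and collapse spelled-out forms like 'S-A-R-A-H'."""
    if re.fullmatch(r"(?:[A-Za-z]-)+[A-Za-z]", mention):
        mention = mention.replace("-", "")
    return mention.lower().strip()


def link_entities(mentions: list[str]) -> dict[str, list[str]]:
    """Group mentions that share the same normalized first token."""
    groups: dict[str, list[str]] = {}
    for mention in mentions:
        key = normalize(mention).split()[0]
        groups.setdefault(key, []).append(mention)
    return groups


def canonical_forms(mentions: list[str]) -> dict[str, str]:
    """Map every mention to its group's canonical form (most tokens, then longest)."""
    mapping: dict[str, str] = {}
    for group in link_entities(mentions).values():
        canonical = max(group, key=lambda m: (len(normalize(m).split()), len(m)))
        for mention in group:
            mapping[mention] = canonical
    return mapping


canon = canonical_forms(["Sarah", "S-A-R-A-H", "Sarah Smith", "Tom"])
```

Because every variant resolves to one canonical form, a replacement chosen for that form can be applied consistently wherever any variant appears in the text.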
Format preservation during data synthesis (maintaining the original text’s style, spacing, and capitalization) was already supported and continues to ensure smooth, realistic synthetic results. LLM synthesis can be invoked through the SDK or in the Textual playground UI on the homepage. Haven't created an account yet? Sign up for free.
In Textual, all pipeline functionality is now available directly within the datasets workflow, giving you a single place to manage end-to-end processing without relying on separate Pipelines workflows. With a datasets workflow, you can connect to data in your cloud services (Amazon S3, Azure, or SharePoint) and generate data in the JSON Output format for easier downstream integration. Both cloud-based and local file workflows now run seamlessly in datasets, unifying pipeline functionality into a more streamlined and flexible experience that simplifies setup and keeps your workflows future-proof.
Since all pipeline functionality is now supported through datasets, the Pipelines feature in Textual is now deprecated. We recommend migrating your existing pipelines as soon as possible to avoid disruption. The Pipelines feature in Textual will be sunset on October 1, 2025.
Working with sensitive audio data? Our latest playbook equips you to operationalize audio data for AI development while still protecting privacy and adhering to compliance restrictions on the governance of PII. The Audio Redaction and Synthesis playbook includes step-by-step guidance for transforming audio files and recordings into sanitized transcripts that are ready for downstream use, including model training and development.
With Tonic Textual’s audio capabilities, teams can redact sensitive entities within these files or replace them with synthetic but true-to-life alternatives that preserve context and allow for complete and realistic datasets. Check out the playbook for the details.
Sign up for a free trial of Textual to get started today.
Often it's the little things that matter most. Here's a roundup of our smaller releases.
As always, we'd love to hear your feedback on our products. What do you need? What do you love? What could be better? Send us a note at hello@tonic.ai or book time directly with our team. And for all the latest updates, be sure to check out our complete release notes.