We're excited to share the latest updates and announcements designed to improve your experience with our products. This month's issue includes:
JSON in Document View in Structural 📄
BYO LLM in Fabricate ✨
Amazon DynamoDB out of beta for Structural ⚙️
An LLM fine-tuning playbook for Textual 🎛️
And a slew of exciting "small things" to round out our updates
JSON in a column? Meet Document View in Tonic Structural 📄
Working with JSON data inside your database columns just got way easier.Tonic Structural now supports Document View for embedded JSON, whether it’s in a file, a column, or a full JSON document, to offer greater efficiency and data utility. You can now,
quickly scan and detect sensitive fields, with no more manual clicking
get smart, field-level generator recommendations
link generators across fields to preserve data relationships
tap into an expanded library of transformations for better realism
Pro tip: This can be used in place of our JSON Mask generator, up-leveling our existing capabilities with a streamlined UI and more powerful indexing engine.
BYO LLM for unstructured data in Tonic Fabricate ✨
Tonic Fabricate, our AI-powered platform for synthetic data from scratch, now allows you to connect to the LLM of your choice to fuel your unstructured data generation. Make use of any OpenAI-compatible model by way of the OpenRouter API. Choose which model to use for each type of content, whether free text, conversation, or JSON, to leverage the LLM best suited to your use case or to your organization’s policies and existing tech stack. To learn more, visit our docs.
Amazon DynamoDB is out of beta for Structural! ⚙️
Tonic Structural's DynamoDB connector is officially out of beta! We’ve seen customers use it, continued to expand our internal test cases as we identify more use cases, and optimized performance for your Dynamo usage patterns. Connect to your Dynamo data to give it a try today!
New technical resource for Tonic Textual: LLM fine-tuning playbook 🎛️
We’re excited to share the first playbook in a new series of technical resources, the LLM fine-tuning playbook. This is a prescriptive guide to using Tonic Textual and unstructured data, like clinical notes, to train and fine-tune large language models—while protecting privacy and preserving context. The playbook contains:
a video demo from Tonic.ai’s Head of AI, Ander Steele
step-by-step guidance
ungated access to a Huggingface dataset and pre-configured Jupyter notebook
everything you need to jump in and try for yourself!
We also recently held a webinar on operationalizing unstructured text for ML workflows, which makes the perfect companion to the playbook. Not using Tonic Textual yet? We’ve got you covered. Start experimenting with your unstructured data (PDFs, notes, audio and video files) with a free trial of Tonic Textual today.
Small updates; big impacts
Often it's the little things that matter most. Here's a round up of our smaller releases.
Tonic Structural
Two new webhooks are available for Structural. The first is triggered when Structural detects a schema change (improving upon our existing trigger based on job status). The second runs a GitHub action when triggered: specify the GitHub repository and credentials, and select the workflow; Structural then prompts you to provide the input values. Streamlining your workflows one webhook at a time.
We’ve improved how Structural categorizes and responds to detected schema changes to enhance the efficiency of your automated jobs. Sensitive schema changes include schema changes that could result in data leakage and always block data generation. Non-sensitive notifications (e.g. removed tables and columns) are schema changes that Structural resolves automatically for each data generation.
On the job details page, the Reports and Logs dropdown list now includes a View Gantt option, which allows you to view where time was spent during job execution so you can better track your data generation efficiency. Note that this is available for jobs that use the Data Pipeline V2 process.
Tonic Textual
In Textual, from the global permission sets list, you can now select a global permission set to assign to all Textual users in your organization. By default, all users are granted the built-in General User global permission set.
We’ve improved Textual’s NER model performance at predicting the ORGANIZATION entity type in legal documents.
For a given entity type, you can now specify how to replace specific values. For example, for a Given Name, you could indicate to always replace John with Michael.
You can now configure a Number generator variant to be based on a percentage condition instead of a SQL expression. For example, you can configure a variant to be used in 5% of the rows…or perhaps just 1% to create outliers for testing edge cases or simulating errors in a dataset. 👀
The new Sum From Another Table generator populates a column based on the sum of values for a column in a set of rows in another table, unlocking greater realism in cross-table relationships.
As always, we'd love to hear your feedback on our products. What do you need? What do you love? What could be better? Send us a note at hello@tonic.ai or book time directly with our team. And for all the latest updates, be sure to check out our complete release notes.
Want to make your data usable?
Unblock product innovation with safe, high-fidelity data de-identification and synthesis.
A bilingual wordsmith dedicated to the art of engineering with words, Chiara has over a decade of experience supporting corporate communications at multi-national companies. She once translated for the Pope; it has more overlap with translating for developers than you might think.