Playbook

Audio redaction and synthesis

Textual

The problem

Teams working with voice data such as customer service calls, telehealth sessions, or financial consultations face a shared challenge: how to use these insight-rich touchpoints for model training and analysis while still protecting privacy and meeting compliance requirements.

The solution

Tonic Textual’s Synthetic Audio capability unlocks audio for AI development – securely and at scale. 

Users can redact large quantities of audio files with a quick SDK call, or simply drag and drop audio files in the UI, to generate privacy-protected transcripts.

Demo: Generating privacy-protected transcripts from audio files using Tonic Textual.

Playbook steps

1. Install the Textual Python SDK

2. Create an SDK client

3. Specify local file locations to ingest recordings via the Textual SDK

4. Specify a generator:

  • ‘Redact’ for simple redactions of sensitive entities
  • ‘Synthesis’ to replace sensitive entities with synthesized alternatives

5. Generate transcripts in JSON

6. Export for downstream use
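
Steps 3 through 6 can be sketched roughly as follows. This is a minimal illustration and not the Textual SDK itself: the playbook does not spell out the SDK method names, so the actual audio redaction call is passed in as a callable (`redact_audio`), and `fake_redact_audio` is a hypothetical stand-in that lets the sketch run offline. The file extensions, function names, and the transcript shape are likewise assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Callable

# Assumed set of audio formats; adjust to whatever your recordings use.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a"}

# The two generators named in the playbook steps.
GENERATORS = {"Redact", "Synthesis"}


def find_recordings(folder: Path) -> list[Path]:
    """Step 3: collect local audio files, sorted for a stable order."""
    return sorted(
        p for p in folder.iterdir() if p.suffix.lower() in AUDIO_EXTENSIONS
    )


def export_transcripts(
    recordings: list[Path],
    output_dir: Path,
    generator: str,
    redact_audio: Callable[[Path, str], dict],
) -> list[Path]:
    """Steps 4-6: choose a generator, produce transcripts, write JSON files."""
    if generator not in GENERATORS:
        raise ValueError(f"Unknown generator: {generator!r}")
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for recording in recordings:
        # In real use, this callable would wrap the Textual SDK client call.
        transcript = redact_audio(recording, generator)
        target = output_dir / f"{recording.stem}.json"
        target.write_text(json.dumps(transcript, indent=2), encoding="utf-8")
        written.append(target)
    return written


def fake_redact_audio(path: Path, generator: str) -> dict:
    """Hypothetical stand-in for the SDK call; returns a canned transcript."""
    return {
        "file": path.name,
        "generator": generator,
        "text": "Hello, this is [NAME_GIVEN].",
    }
```

Calling `export_transcripts(find_recordings(Path("recordings")), Path("transcripts"), "Synthesis", fake_redact_audio)` would write one JSON file per recording. Swapping `fake_redact_audio` for a function that wraps your SDK client keeps the file handling and export logic unchanged.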

Try for yourself

Want to test it out? We’ve included all of the assets from the video in the playbook so that you can experiment on your own.

Built-in Intelligence for real-world data

Tonic.ai comes ready with out-of-the-box support for a rich library of entity types—so your data is understood from day one. From names, dates, and locations to nuanced healthcare, finance, and developer-specific fields, our proprietary models are designed to recognize the structures and semantics that matter most. Whether you're redacting sensitive information or sanitizing datasets with true-to-life synthetic replacements, these built-in types form the backbone of smarter, safer data workflows.

| Entity type | Description | Identifier |
| --- | --- | --- |
| CC Exp | The expiration date of a credit card. | CC_EXP |
| CVV | The card verification value for a credit card. | CVV |
| City | The name of a city. | LOCATION_CITY |
| Country | The name of a country. | LOCATION_COUNTRY |
| Credit Card | A credit card number. | CREDIT_CARD |
| DOB | A person's date of birth. | DOB |
| Date Time | A date or timestamp. | DATE_TIME |
| Email Address | An email address. | EMAIL_ADDRESS |
| Event | The name of an event. | EVENT |
| Family Name | A family name or surname. | NAME_FAMILY |
| Full Mailing Address | A full postal address. By default, the entity type handling option for this entity type is Off. | LOCATION_COMPLETE_ADDRESS |
| Gender Identifier | An identifier of a person's gender. | GENDER_IDENTIFIER |
| Given Name | A given name or first name. | NAME_GIVEN |
| Healthcare Identifier | An identifier associated with healthcare, such as a patient number. | HEALTHCARE_ID |
| IBAN Code | An international bank account number used to identify an overseas bank account. | IBAN_CODE |
| IP Address | An IP address. | IP_ADDRESS |
| Language | The name of a spoken language. | LANGUAGE |
| Law | A title of a law. | LAW |
| Location | A value related to a location. Can include any part of a mailing address. | LOCATION |
| Medical License | The identifier of a medical license. | MEDICAL_LICENSE |
| Money | A monetary value. | MONEY |
| NRP | A nationality, religion, or political group. | NRP |
| Numeric Identifier | A numeric value that acts as an identifier. | NUMERIC_PII |
| Numeric Value | A numeric value. | NUMERIC_VALUE |
| Occupation | A job title or profession. | OCCUPATION |
| Organization | The name of an organization. | ORGANIZATION |
| Password | A password used for authentication. | PASSWORD |
| Person Age | The age of a person. | PERSON_AGE |
| Phone Number | A telephone number. | PHONE_NUMBER |
| Product | The name of a product. | PRODUCT |
| State | A state name or abbreviation. | LOCATION_STATE |
| Street Address | A street address. | LOCATION_ADDRESS |
| URL | A URL to a web page. | URL |
| US Bank Number | The routing number of a bank in the United States. | US_BANK_NUMBER |
| US ITIN | An Individual Taxpayer Identification Number in the United States. | US_ITIN |
| US Passport | A United States passport identifier. | US_PASSPORT |
| US SSN | A United States Social Security number. | US_SSN |
| Zip | A postal code. | LOCATION_ZIP |
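
One way to work with the identifiers listed above is to build a per-entity handling map before running a job. The sketch below is illustrative only: the option labels `Redact`, `Synthesis`, and `Off` are taken from this playbook's wording, and the exact strings and configuration format the Textual SDK expects may differ. The function name `build_handling_config` is hypothetical.

```python
# Identifiers copied from the built-in entity type list above.
ENTITY_TYPES = {
    "CC_EXP", "CVV", "LOCATION_CITY", "LOCATION_COUNTRY", "CREDIT_CARD",
    "DOB", "DATE_TIME", "EMAIL_ADDRESS", "EVENT", "NAME_FAMILY",
    "LOCATION_COMPLETE_ADDRESS", "GENDER_IDENTIFIER", "NAME_GIVEN",
    "HEALTHCARE_ID", "IBAN_CODE", "IP_ADDRESS", "LANGUAGE", "LAW",
    "LOCATION", "MEDICAL_LICENSE", "MONEY", "NRP", "NUMERIC_PII",
    "NUMERIC_VALUE", "OCCUPATION", "ORGANIZATION", "PASSWORD",
    "PERSON_AGE", "PHONE_NUMBER", "PRODUCT", "LOCATION_STATE",
    "LOCATION_ADDRESS", "URL", "US_BANK_NUMBER", "US_ITIN",
    "US_PASSPORT", "US_SSN", "LOCATION_ZIP",
}

# Assumed option labels, following the playbook's wording.
HANDLING_OPTIONS = {"Redact", "Synthesis", "Off"}


def build_handling_config(default="Redact", overrides=None):
    """Return a dict mapping every entity type identifier to a handling option."""
    if default not in HANDLING_OPTIONS:
        raise ValueError(f"Unknown handling option: {default!r}")
    config = {entity: default for entity in ENTITY_TYPES}
    # The list above notes that full mailing addresses are Off by default.
    config["LOCATION_COMPLETE_ADDRESS"] = "Off"
    for entity, option in (overrides or {}).items():
        if entity not in ENTITY_TYPES:
            raise KeyError(f"Unknown entity type: {entity!r}")
        if option not in HANDLING_OPTIONS:
            raise ValueError(f"Unknown handling option: {option!r}")
        config[entity] = option
    return config
```

For example, `build_handling_config(overrides={"NAME_GIVEN": "Synthesis", "MONEY": "Off"})` redacts everything by default, replaces given names with synthesized values, and leaves monetary values untouched. Validating identifiers up front catches typos such as `NAME_FIRST` before any audio is processed.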