Playbook

Centralized vs decentralized data de-identification


The problem

Identifying a software solution to meet your data transformation needs is just the first step in defining your approach to compliant data de-identification. How you leverage that solution within your company surfaces a number of new questions to consider:

  • Who will be responsible for managing the data de-identification solution?
  • What roles or permissions will your users have?
  • How will you govern specific de-identification techniques?
  • What security and audit controls will you put in place?

Determining a management plan is critical to maximizing the solution’s value across your use cases and ensuring its scalability across your organization.

The solution

Decide from day one whether you’ll take a centralized or decentralized approach to managing your data de-identification software, and designate the resources you’ll need, including staff. Identify the built-in capabilities that support your approach, such as role-based access control (RBAC) or audit trails.
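To make the RBAC and audit trail idea concrete, here is a minimal, generic sketch in Python. It is not Tonic.ai's API: the role names, permission names, and the `authorize` and `AuditLog` helpers are all hypothetical, chosen only to show how a permission check and an audit record can be tied together.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical roles and permissions for illustration only;
# a real product defines its own role model.
ROLE_PERMISSIONS = {
    "admin": {"configure_rules", "run_job", "view_audit_log"},
    "editor": {"configure_rules", "run_job"},
    "viewer": set(),
}


@dataclass
class AuditLog:
    """In-memory audit trail; a real system would persist these entries."""
    entries: list = field(default_factory=list)

    def record(self, user: str, action: str, allowed: bool) -> None:
        self.entries.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "action": action,
            "allowed": allowed,
        })


def authorize(user: str, role: str, action: str, log: AuditLog) -> bool:
    """Check the role's permissions and record the attempt, allowed or not."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    log.record(user, action, allowed)
    return allowed


log = AuditLog()
print(authorize("dana", "editor", "run_job", log))         # True
print(authorize("sam", "viewer", "configure_rules", log))  # False
print(len(log.entries))                                    # 2
```

Recording denied attempts as well as allowed ones is a deliberate choice in this sketch: an audit trail that only shows successful actions cannot answer who tried to change a de-identification rule without permission.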

Using the Tonic.ai platform as an example, we’ll outline what this could look like at your organization. Read on for key considerations, recommendations, and the functionality that your team needs to successfully manage and maximize the value of data de-identification at your organization.

Get the full step-by-step playbook for rolling out, managing, and scaling data de-identification to meet software and AI development needs across your organization.

Playbook steps

1. Identify an organizational model for data de-identification that works for you:

  • Centralized
  • Partly centralized
  • Decentralized

2. Develop a process for expanding data de-identification to new use cases, workflows, and teams.

3. Leverage the built-in functionality of data de-identification solutions like Tonic Structural and Tonic Textual to manage access, audit workflows, and standardize processes.


Define your strategy

Connect with a data de-identification expert to build an approach to data privacy compliance tailored to your organization’s development workflows and data needs.

Built-in intelligence for real-world data

Tonic.ai offers out-of-the-box support for detecting a comprehensive catalog of sensitive data types and entities. From names, dates, and locations to nuanced healthcare, finance, and developer-specific fields, our pre-trained models are designed to recognize and flag personally identifiable information (PII) and protected health information (PHI) in support of compliant data de-identification. This built-in data discovery forms the backbone of strategic, compliant data workflows.

Here’s a look at some of the sensitive data types that Tonic.ai’s products automatically detect.

  • CC Exp (CC_EXP): The expiration date of a credit card.
  • CVV (CVV): The card verification value for a credit card.
  • City (LOCATION_CITY): The name of a city.
  • Country (LOCATION_COUNTRY): The name of a country.
  • Credit Card (CREDIT_CARD): A credit card number.
  • DOB (DOB): A person's date of birth.
  • Date Time (DATE_TIME): A date or timestamp.
  • Email Address (EMAIL_ADDRESS): An email address.
  • Event (EVENT): The name of an event.
  • Family Name (NAME_FAMILY): A family name or surname.
  • Full Mailing Address (LOCATION_COMPLETE_ADDRESS): A full postal address. By default, the entity type handling option for this entity type is Off.
  • Gender Identifier (GENDER_IDENTIFIER): An identifier of a person's gender.
  • Given Name (NAME_GIVEN): A given name or first name.
  • Healthcare Identifier (HEALTHCARE_ID): An identifier associated with healthcare, such as a patient number.
  • IBAN Code (IBAN_CODE): An international bank account number used to identify an overseas bank account.
  • IP Address (IP_ADDRESS): An IP address.
  • Language (LANGUAGE): The name of a spoken language.
  • Law (LAW): A title of a law.
  • Location (LOCATION): A value related to a location. Can include any part of a mailing address.
  • Medical License (MEDICAL_LICENSE): The identifier of a medical license.
  • Money (MONEY): A monetary value.
  • NRP (NRP): A nationality, religion, or political group.
  • Numeric Identifier (NUMERIC_PII): A numeric value that acts as an identifier.
  • Numeric Value (NUMERIC_VALUE): A numeric value.
  • Occupation (OCCUPATION): A job title or profession.
  • Organization (ORGANIZATION): The name of an organization.
  • Password (PASSWORD): A password used for authentication.
  • Person Age (PERSON_AGE): The age of a person.
  • Phone Number (PHONE_NUMBER): A telephone number.
  • Product (PRODUCT): The name of a product.
  • State (LOCATION_STATE): A state name or abbreviation.
  • Street Address (LOCATION_ADDRESS): A street address.
  • URL (URL): A URL to a web page.
  • US Bank Number (US_BANK_NUMBER): The routing number of a bank in the United States.
  • US ITIN (US_ITIN): An Individual Taxpayer Identification Number in the United States.
  • US Passport (US_PASSPORT): A United States passport identifier.
  • US SSN (US_SSN): A United States Social Security number.
  • Zip (LOCATION_ZIP): A postal code.
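As a rough illustration of how entity type identifiers like those above could drive per-type handling, the Python sketch below pairs two of them with a handling option. It is a toy under stated assumptions, not Tonic.ai's implementation: the regex detectors stand in for trained models, and the `deidentify` function, the `DEFAULT_HANDLING` mapping, and the "Redaction" option name are hypothetical. Only the "Off" option name comes from the list above.

```python
import re

# Toy regex detectors for two of the entity types listed above.
# Assumption: real detection relies on trained models, not regexes.
DETECTORS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

# Hypothetical handling options: "Redaction" replaces a detected value
# with its entity type label; "Off" leaves the value untouched.
DEFAULT_HANDLING = {"EMAIL_ADDRESS": "Redaction", "US_SSN": "Redaction"}


def deidentify(text, handling=None):
    """Apply the default handling, with optional per-type overrides."""
    config = {**DEFAULT_HANDLING, **(handling or {})}
    for entity_type, pattern in DETECTORS.items():
        if config.get(entity_type, "Off") == "Redaction":
            text = pattern.sub(f"[{entity_type}]", text)
    return text


sample = "Contact jane.doe@example.com, SSN 123-45-6789."
print(deidentify(sample))
# Contact [EMAIL_ADDRESS], SSN [US_SSN].
print(deidentify(sample, {"US_SSN": "Off"}))
# Contact [EMAIL_ADDRESS], SSN 123-45-6789.
```

Keeping the defaults in one shared mapping and passing overrides explicitly is one way a central team could standardize handling while still letting individual workflows turn a type off, much as Full Mailing Address is Off by default in the list above.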