New Db2 for LUW data connector - Tonic now has a data connector for IBM Db2 for Linux, Unix, and Windows (Db2 for LUW). Tonic supports Db2 for LUW version 11.5.
Other updates
When the AI Synthesizer is used in a workspace, Tonic now verifies before data generation that the AI Synthesizer does not use more than the maximum allowed categories.
Amazon EMR
<view name>_tonic_table
Fixed an issue with the Name generator where capitalization was not preserved if consistency was disabled.
For Table View, fixed an issue where the delete button to remove the generator assignment was sometimes hidden.
Oracle
PostgreSQL
SQL Server
Redesigned Database View
We redesigned Database View to improve the display and the filtering.
In the updated columns list, the Column column contains the schema, table, and column name, and the column data type. It provides access to the data preview option.
The Applied Generator column shows the applied generator. Applied Generator indicates when a column is unprotected, when the column is a primary or foreign key, and when the configuration overrides the parent workspace. If the table mode is not De-Identify, it shows the table mode. It provides access to the commenting option.
Filters other than the column name filter are moved under the Filters option. There are also new filters for the sensitivity type (the type of sensitive data that Tonic detected in the column) and whether the column has a recommended generator.
Privacy Report updates
In the Privacy Report, a new column, Column Privacy Rank, indicates the privacy ranking for a column based on the assigned generator and generator configuration. The generator summary and generator reference include the possible privacy ranking values for each generator.
Added a new column, Tonic Detected Sensitivity, that indicates whether the Tonic sensitivity scan identified the column as sensitive. Renamed the Is Sensitive column to Current Sensitivity. Current Sensitivity indicates whether the column is currently marked as sensitive.
Also corrected an earlier issue with the order of the columns.
Other updates
Fixed an issue that caused all subset runs to record the percentage of rows in the subset as 100%. Subset runs that occur after updating to this version display the correct percentage.
The option to write output data to a container repository is out of beta.
Databricks
Google BigQuery
TONIC_GRPC_ENABLED was false.
NUMERIC or BIGNUMERIC column.
Oracle
IDENTITY columns. Before this change, IDENTITY columns caused errors during destination database creation.
For the Custom Categorical generator, you can now add a NULL value to the available custom category values. To indicate a NULL value, use the keyword {NULL}.
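As a rough illustration of the {NULL} keyword, the sketch below turns a list of custom category entries into values, mapping the keyword to Python's None. The function name and the whitespace trimming are assumptions made for this example; it is not Tonic's parser.

```python
from typing import List, Optional

# Keyword named in the release note above.
NULL_KEYWORD = "{NULL}"


def parse_categories(entries: List[str]) -> List[Optional[str]]:
    """Convert category entries to values; the {NULL} keyword becomes None.

    Trimming whitespace and skipping blank entries are assumptions
    made for this sketch.
    """
    values: List[Optional[str]] = []
    for entry in entries:
        cleaned = entry.strip()
        if not cleaned:
            continue  # ignore blank entries
        values.append(None if cleaned == NULL_KEYWORD else cleaned)
    return values
```

Calling `parse_categories(["red", "{NULL}", "blue"])` yields two strings with a None between them.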
Made the following API updates to better accommodate users of the previous version of the API:
jobs/{id}/workspace_snapshot now returns the WorkspaceDataModel object.
GET jobs/{id}/workspace_snapshot?api-version=v2023_07_00, that returns V17WorkspaceDataModel
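The following sketch shows one way to build the request URL with and without the api-version query parameter. The base URL (including any path prefix that your instance uses) and the helper name are hypothetical; only the endpoint path and the version string come from the note above.

```python
from typing import Optional
from urllib.parse import urlencode


def workspace_snapshot_url(base_url: str, job_id: str,
                           api_version: Optional[str] = None) -> str:
    """Build the workspace snapshot URL, optionally pinned to an API version.

    base_url is a placeholder supplied by the caller; it must already
    include any path prefix that the API requires.
    """
    url = f"{base_url.rstrip('/')}/jobs/{job_id}/workspace_snapshot"
    if api_version is not None:
        url += "?" + urlencode({"api-version": api_version})
    return url
```

Passing `api_version="v2023_07_00"` requests the previous data model, and omitting it requests the current one.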
Databricks
TBLPROPERTIES from the source delta table, including 'delta.feature.allowColumnDefaults'.
Redesigned data model for generator assignments - The new version of the Tonic API includes a redesign of the data model for generator assignments. To use the previous version of the generator assignment data model, make sure that your API calls specify version 2023.07.0.
The generator assignment data model redesign includes the following changes:
metadata object in the link object:
presetId
generatorId
customValueProcessor
encryptionProcessor
pathExpression
to the metadata object in the link object.
link object:
subPresetId
subGeneratorId
customSubGeneratorValueProcessor
subGeneratorMetadata object under metadata:
presetId
generatorId
customValueProcessor
TONIC_GRPC_ENABLED was set to true.
NOTE: v1074 was removed.
If your instance of Tonic is deployed on Docker, you can now use an external Kubernetes cluster to enable the option to write destination data to container artifacts.
You can now assign the Integer Key generator to a column with a decimal data type. The actual column values must still be integers.
Fixed an issue in Table View where an error displayed if you changed the selected table while the data was loading.
Databricks
File connector
SQL Server
NOTE: v1051 through v1053 were removed.
Enable administration functions in Tonic Cloud - For Tonic Cloud customers, the new Account Admin permission set provides access to Tonic administration functions for their organization. The Account Admin can reset passwords, delete users, copy and share all workspaces, and download the usage report. The Account Admin permission set is initially granted to the first user in the organization.
Databricks
File connector
MySQL
TONIC_MYSQL_MAX_CONCURRENT_INDEX_CREATION, to limit the number of concurrent indexes that are created. The default value is 0, which indicates that there is no limit.
SQL Server
The Enable Diagnostic Logging global permission is now granted to the built-in Account Admin permission set.
Databricks
CREATE CATALOG or CREATE SCHEMA permissions are no longer required if the destination catalog or schema already exists.
Diagnostic logging for data generation - By default, Tonic now redacts sensitive data in data generation log files.
When users start a data generation or upsert job, if they have the new global permission Enable diagnostic logging, they can choose to enable diagnostic logging, which does not redact the logs. The Enable diagnostic logging permission is also required to download the diagnostic logs. By default, the permission is only granted to the Admin and Admin (Environment) global permission sets.
In addition to the option for individual jobs, there are environment settings that enable diagnostic logging for specific data connectors.
Other updates
In the Release Candidate version of the API, the response model for the GET /api/workspace/minimal endpoint has been updated for more straightforward deserialization.
Fixed an issue where a non-unique composite primary key column could only be assigned unique generators.
Users can now press Enter to finish copying a workspace or a generator preset, instead of having to click Copy.
File connector
Google BigQuery
Oracle
SQL Server
WITH INLINE clauses from definitions of user-defined functions (UDF). Inlining does not require these clauses. WITH INLINE clauses in UDF definitions that do not meet the requirements for inlining can prevent the UDF from being restored properly in the destination database.
For the OpenID Connect (OIDC) SSO integration, Tonic now supports authentication by client secret that uses HTTP basic authentication (client_secret_basic). To provide the client secret, configure the TONIC_SSO_CLIENT_SECRET environment setting.
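For context, client_secret_basic means that the client ID and secret are sent in an HTTP Authorization header that uses the Basic scheme. The sketch below follows the encoding in RFC 6749, section 2.3.1 (form-encode each part, join with a colon, then Base64-encode). The function name is hypothetical, and the code shows the general mechanism rather than Tonic's internal implementation.

```python
import base64
from urllib.parse import quote_plus


def client_secret_basic_header(client_id: str, client_secret: str) -> str:
    """Return the Authorization header value for client_secret_basic.

    Each part is form-urlencoded before the two are joined with a colon
    and Base64-encoded, as RFC 6749 section 2.3.1 describes.
    """
    credentials = f"{quote_plus(client_id)}:{quote_plus(client_secret)}"
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return f"Basic {token}"
```

An identity provider that is configured for client_secret_basic expects this header on requests to its token endpoint.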
SQL Server
TONIC_SQL_SERVER_SKIP_CREATE_DB, indicates whether to skip schema creation for the destination database. If true, then Tonic does not create the schema. It uses the existing schema to populate the destination database. The default is false. You can configure this environment setting from the Environment Settings tab on Tonic Settings.
NOTE: These releases were removed.
During free trial signup, the data connector options now include an option to use local files for the source data. This creates a file connector workspace for local files, and displays the File Groups view to allow the free trial user to start to add file groups to the workspace.
Added an environment setting, Tonic Test Connection Timeout In Seconds (TONIC_TEST_CONNECTION_TIMEOUT_IN_SECONDS), that you can set from the Environment Settings tab on Tonic Settings. This setting configures the timeout for testing a database connection. Previously, connection test attempts timed out after 5 seconds. The new default is 15 seconds.
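The default-and-override behavior of this setting can be pictured as follows. This is a minimal sketch that reads the value from a mapping of environment settings; the helper name and the handling of blank values are assumptions for illustration.

```python
import os
from typing import Mapping, Optional

# Default stated in the release note above.
DEFAULT_TIMEOUT_SECONDS = 15


def connection_test_timeout(env: Optional[Mapping[str, str]] = None) -> int:
    """Return the configured connection test timeout in seconds.

    Falls back to the 15-second default when the setting is absent
    or blank (the blank-value handling is an assumption).
    """
    settings = os.environ if env is None else env
    raw = settings.get("TONIC_TEST_CONNECTION_TIMEOUT_IN_SECONDS")
    if raw is None or not raw.strip():
        return DEFAULT_TIMEOUT_SECONDS
    return int(raw)
```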
When you configure a workspace to write the output to container artifacts, you can now specify custom resources for the Kubernetes pod, including the ephemeral storage, memory, and CPU millicores.
Improved performance when marking a large number of columns as not sensitive.
Fixed an issue that caused Tonic workers that are deployed on Docker to crash unexpectedly.
For numeric columns that support arbitrary precision and scale, when the scale is 0 (for example, NUMERIC(N,0)), or when the underlying values are all integers, these columns are now supported as primary keys for the purpose of subsetting.
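The rule above can be expressed as a small check: a column qualifies when its declared scale is 0 or when every stored value is integer-valued. The sketch below uses Python's decimal module. The function names are hypothetical, and how Tonic inspects the values is not described in the note.

```python
from decimal import Decimal
from typing import Iterable


def is_integer_valued(values: Iterable[Decimal]) -> bool:
    """True when every value equals its own integral rounding."""
    return all(v == v.to_integral_value() for v in values)


def usable_as_subsetting_key(scale: int, values: Iterable[Decimal]) -> bool:
    """Apply the rule: scale 0, or all underlying values are integers."""
    return scale == 0 or is_integer_valued(values)
```

For example, a NUMERIC(10,2) column that holds only values such as 1.00 and 42.00 passes the check, while a column that contains 1.50 does not.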
Amazon EMR
Amazon EMR and Databricks
TONIC_WORKSPACE_DEFAULT_SAVE_MODE indicates the mode to use. If set to a value other than null (Ignore, Append, Overwrite), this setting takes precedence over TONIC_WORKSPACE_DEFAULT_ERROR_ON_OVERRIDE.
Google BigQuery
MongoDB
TONIC_DOCUMENT_MAX_DEPTH, to configure the maximum depth of JSON document that can be handled. The default value, which is also the recommended minimum value, is 32.
SQL Server
When you select the option to write destination data to container artifacts, you can now use Google Artifact Registry (GAR) authorization using Google Cloud Platform (GCP) service account keys.
For the JSON Mask and XML Mask generators, fixed the data preview for JSON or XML field samples that are larger than 120MB by generating a smaller subset of the field.
The Name generator now supports consistency with other columns.
Added new API endpoints to retrieve and set table replacements. These new endpoints are compatible with workspaces for data connectors that do not have schemas, such as Spark-based databases and the file connector. The existing endpoints, which require you to provide a schema, eventually will be deprecated.
Amazon EMR
File connector
MySQL
Oracle
PostgreSQL
SQL Server
Snowflake
Added an environment setting, TONIC_DELETE_COLUMN_SCHEMA_ON_WORKSPACE_DELETE. If the setting is true, then when a workspace is deleted, Tonic also deletes the associated rows from the ColumnSchemas table in the Tonic application database.
The new environment setting TONIC_NOTIFICATION_SMTP_TRUST_CERTIFICATE indicates whether to allow the SMTP server certificate to be trusted.
Improved the performance of previewing data in Privacy Hub.
Fixed an issue where SSO groups were not removed when the value of TONIC_SSO_GROUP_FILTER_REGEX changed in a way that excluded previously imported groups. The removed groups are removed from any workspaces that they were granted access to.
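To picture the fix, the sketch below computes which previously imported groups no longer match a new filter expression and so would be removed. The function name is hypothetical. Using `re.search` (a match anywhere in the name) is an assumption, because the note does not say how Tonic applies the expression.

```python
import re
from typing import Iterable, Set


def groups_to_remove(imported_groups: Iterable[str],
                     group_filter_regex: str) -> Set[str]:
    """Return the imported groups that the new filter regex excludes.

    Assumes the filter keeps a group when the pattern matches anywhere
    in the group name.
    """
    pattern = re.compile(group_filter_regex)
    return {g for g in imported_groups if pattern.search(g) is None}
```

With imported groups eng-data, eng-web, and sales, changing the filter to `^eng-` leaves sales as the only group to remove.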
For the Timestamp Shift generator, added Month and Year as options for the date part to use to set the allowed range.
When writing data to container artifacts, Tonic now first shuts down the temporary database before it begins to write data to the container.
Amazon EMR
Databricks
MongoDB
Oracle
TONIC_ORACLE_DBLINK_ENABLED is false), privileges were not copied from the source to the destination.
Amazon EMR
Databricks
MySQL
REPLICATION CLIENT and REPLICATION SLAVE grants.
Spark SDK
Spark with Livy
When an email notification contains a link to a comment, clicking the link now correctly navigates to and displays the comment.
Added a flag to mark log files that might contain sensitive information.
You can now assign the Timestamp Shift generator to primary key columns, including columns that are part of a composite primary key.
When writing destination data to a container repository, based on the size of the source database, Tonic attaches resource requests to the datapacker pod to verify that the cluster has sufficient resources for the data.
Tonic now supports writing destination data to container registries on Amazon Elastic Container Registry (Amazon ECR).
Fixed an issue where users could not assign access to permission sets when they did not have access to create and manage custom permission sets.
For foreign keys that have multiple primary keys, Tonic now prevents data generation if the assigned generation configuration for the primary key columns is inconsistent.
Standardized the display format of the heading area of the workspace management views.
File connector
MongoDB
MySQL
PostgreSQL
Snowflake
Apply recommended generators to multiple detected sensitive columns - On Privacy Hub, a new banner displays the number of detected (not manually designated) sensitive columns that are not protected. From the banner, you can display the list of columns, grouped by sensitivity type. For each sensitivity type, for the selected columns, you can:
There is also an option to apply the recommended generators to all of the selected columns across all of the sensitivity types.
Writing destination data to container registries (beta feature) - For data connectors that support it, you can now configure a workspace to write destination data to container artifacts on a container registry instead of to a database server. The Job History view provides access to the generated artifacts for each job.
Other updates
Fixed an issue where generated API documentation did not exclude endpoints and schemas from API versions other than the version being viewed.
Fixed an issue where the Protection Audit Trail displayed the incorrect override status of columns in child workspaces.
On the System Status tab of Tonic Settings, fixed an issue where the Data Sharing section incorrectly showed logging as unable to connect.
Databricks
File connector
MongoDB
Oracle
Toggle between database server and container repository - On the workspace configuration view, if the data connector supports writing to a container repository, you can now switch between writing to a database server and writing to a container repository. Tonic saves the information you provide for each option.
Workspace override for statistics seed - From the workspace configuration view, you can now override the Tonic-wide statistics seed that is set as the value of the TONIC_STATISTICS_SEED environment setting. You can either provide a custom seed value for the workspace, or disable consistency across data generation runs for the workspace. Consistency also applies across workspaces that have the same custom seed value.
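The effect of a shared seed can be illustrated with a keyed hash: the same seed and input always give the same output, and a different seed gives a different output. The sketch below is a generic illustration that uses HMAC-SHA256. It is not Tonic's algorithm, and the function name and 12-character output length are arbitrary choices for the example.

```python
import hashlib
import hmac
import secrets
from typing import Callable, Optional


def make_masker(seed: Optional[int]) -> Callable[[str], str]:
    """Create a masking function for one simulated generation run.

    A fixed seed gives the same output in every run that uses it.
    A seed of None models disabled cross-run consistency by drawing
    a fresh random key for each run.
    """
    key = secrets.token_bytes(32) if seed is None else str(seed).encode("utf-8")

    def mask(value: str) -> str:
        digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256)
        return digest.hexdigest()[:12]

    return mask
```

Two maskers created with the same seed return identical output for the same input, which mirrors how workspaces that share a custom seed value stay consistent with each other.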
New telemetry URL - Tonic telemetry now routes through https://telemetry.tonic.ai/ instead of https://api2.amplitude.com/. The following IP addresses must be allowed:
The following IP addresses no longer need to be allowed: 52.43.241.47, 54.186.140.101, 54.203.75.164, 44.236.122.176, 34.215.78.194, 54.149.61.206, 54.191.147.220, 52.24.22.222, 52.37.168.36, 54.213.191.53, 54.68.108.104, 52.10.121.164, 52.27.184.186, 44.239.225.209, 54.148.216.233, 52.88.224.247
Other updates
Fixed an issue in the Kubernetes client configuration that caused Tonic to reject the SSL certificate of a Kubernetes context.
Fixed an issue where, during subset configuration, an error was returned incorrectly if the user had permission to edit subsetting but not to view source data.
Improved the error message that displays when an uploaded virtual foreign key file is invalid.
Fixed an issue that prevented the Character Substitution generator from being used on primary and foreign key columns when subsetting.
When a workspace import encounters an unhandled exception, the error now displays correctly.
For the Address generator, the Country and Country Code options can now be linked. When linked, the country and country code are either “United States” or “US” to match the other linkable components, which are locations in the United States.
Fixed an issue where SSO login using Okta did not work when a custom authorization server is used.
Fixed an issue where, when using data science mode on Tonic Cloud, users could not download CSV files that contained synthetic data.
Data generation jobs that write to a container now include the datapacker logs, if the worker has permissions to read the pods and logs.
Fixed an issue with uploading container output to Harbor.
For string data type columns:
Databricks
Spark SDK
Spark with Livy
SQL Server
SQL_SERVER_SCRIPT_CROSS_DATABASE_REFERENCES. The default is true, which preserves the existing behavior. To prevent scripting of objects that are defined in other databases that the source database references, change the setting to false.
TONIC_SQL_COMMAND_TIMEOUT environment setting. The default is 0, which indicates an infinite timeout.
Configure Tonic environment settings from Tonic Settings - On the Tonic Settings view, a new Environment Settings tab allows you to configure a subset of Tonic environment settings (previously referred to as Tonic environment variables). To use the Environment Settings tab to configure settings, you must have the new Manage environment settings global permission. When you change the values from Tonic Settings, you do not need to restart Tonic.
Other updates
For the Regex Mask generator, fixed an issue where quotes in regular expressions were malformed.
Fixed an issue where the Test Webhook option failed when it should have succeeded.
Fixed an issue where the Confirm Data Generation panel incorrectly indicated that Tonic could not connect to the destination database when it was only unable to connect to the source database.
For the Address generator, removed an inappropriate value from the possible street names.
Improved error messages when Tonic containers fail to start because of missing or incorrect environment variable values.
Databricks
File connector
Google BigQuery
MongoDB
Snowflake on AWS and Azure
Logging and telemetry connection status - On the System Status tab of Tonic Settings, a new Data Sharing section provides a summary of the logging and telemetry connectivity to the Tonic backend. The new section indicates:
Other updates
Tonic now checks for invalid virtual foreign keys for all data generation. Previously it only ran the check for subsetting data generation.
Fixed an issue where the Continuous generator failed when a column contained only NULL values.
Improved performance for the Continuous generator.
Fixed an issue where the Cell Count in the Usage Report could throw an integer overflow error.
Fixed a subsetting issue where Tonic displayed the error “Error fetching Subset preview” when users navigated to a workspace.
For subsetting, updated the Graph View display to make it easier to see the connections between the tables.
Improved messaging when Tonic cannot reach the database when you test a database connection.
Databricks
File connector
PostgreSQL
SQL Server
The new TONIC_ENABLE_SECURE_COOKIES setting indicates whether to enable the "Secure" attribute on Tonic authorization and analytics cookies. The default value is false. Do not set this to true if you access Tonic over an HTTP connection. When TONIC_HTTPS_ONLY is set to true, the "Secure" attribute is always enabled on Tonic authorization and analytics cookies, and the value of TONIC_ENABLE_SECURE_COOKIES is ignored.
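The interaction between the two settings reduces to a simple precedence rule, sketched below. The function and parameter names are hypothetical; the logic follows the description above.

```python
def secure_attribute_enabled(https_only: bool,
                             enable_secure_cookies: bool = False) -> bool:
    """Decide whether cookies carry the "Secure" attribute.

    https_only models TONIC_HTTPS_ONLY and enable_secure_cookies models
    TONIC_ENABLE_SECURE_COOKIES (default false).
    """
    if https_only:
        # HTTPS-only mode always enables the attribute and ignores
        # the other setting.
        return True
    return enable_secure_cookies
```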
Updated to prevent simultaneous updates to the same workspace configuration.
For the Constant generator, fixed an issue for JSON columns where setting the constant to an empty string caused data generation to fail without setting the job status to failed.
The upsert pre-job check that validates the constraints on the intermediate and destination databases no longer fails when a database has constraints with duplicate signatures.
Fixed an issue where an empty upstream filter WHERE clause caused subsetting to fail if the schema changed so that the table was no longer upstream.
To use the API to obtain data encryption settings, the API user must now have the required global permission.
When users log out of Tonic, we now automatically invalidate any JSON Web Tokens (JWTs) that are not expired.
The API endpoint /api/permission-sets now requires the ManageUserAccess (Manage user access to Tonic and to any workspace) permission. Added a new endpoint, /api/permission-sets/public, which returns the subset of the data needed by users who do not have that permission.
Amazon EMR
Upsert data generation (beta feature)
Previously, the data generation process always replaced the entire destination database. The new upsert data generation option (currently in beta) allows you to add new records and update existing records without touching any of the other records in the destination database. For example, you might have a regular set of records that you use for testing that you want to maintain.
Upsert requires a connection to an intermediate database. When you run data generation with upsert, the initial data generation writes the transformed data to the intermediate database. It replaces the intermediate database, similar to regular data generation. After the generation to the intermediate database, the upsert process identifies the records to add to or update in the destination database. It ignores other records in the destination database.
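The add-or-update behavior described above can be modeled with dictionaries keyed by primary key. This is a conceptual sketch only. The names are hypothetical, and it ignores constraints, deletes, and everything else that a real upsert job handles.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def upsert(destination: Dict[int, Row],
           intermediate: Dict[int, Row]) -> Dict[int, Row]:
    """Merge intermediate rows into a copy of the destination.

    Rows whose key exists in the destination are updated, new keys are
    added, and destination rows that are absent from the intermediate
    data are left unchanged.
    """
    result = dict(destination)
    for key, row in intermediate.items():
        result[key] = row
    return result
```

For example, if the destination holds keys 1 and 2 and the intermediate database holds keys 2 and 3, the result keeps row 1 as it was, updates row 2, and adds row 3.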
Upsert requires a Professional or Enterprise license, and is only supported for the following data connectors:
New AI-enhanced documentation search option
The Tonic documentation now provides access to Lens, an AI-based search option. Instead of searching for specific words, you can ask questions such as "How do I create a workspace?". Lens searches the documentation for the answer. It generates a response that includes links to the topics that contain the information it used.
To use the Lens search, click the search field. At the top right of the search panel, click Lens. Then ask your question.
Other updates
The Custom Categorical generator now supports consistency with other columns. Previously, the generator only supported self-consistency.
Tonic now prevents you from starting data generation for a workspace that does not have a destination database specified.
Fixed a display issue where the generator preset details panel briefly showed the occurrences for the previously selected generator preset.
Tonic now suggests the Name generator for columns that Tonic detects as containing names, when the detection is based on the sampled data in the column. By default, the Name generator uses the First Last format.
A new configuration option allows webhooks to bypass SSL certificate validation and trust the server certificate.
File connector
MongoDB
PostgreSQL
NOTE: v980 through v982 were removed.
New monthly pay-as-you-go plan on Tonic Cloud - Tonic now offers a pay-as-you-go subscription plan for Tonic Cloud. Free trial users are offered the option to use a credit card to purchase a pay-as-you-go license. The monthly subscription grants a Professional level license. With the pay-as-you-go plan, you can configure generators for up to 20 tables across all of your workspaces. Tonic bills you separately for each additional table that you configure. The license renews automatically each month. On Tonic Settings, a new Billing tab displays the next renewal date.
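As a small worked example of the table allowance, the sketch below counts how many configured tables fall outside the 20 that the plan includes. The function name is hypothetical. The per-table price is not stated in the note, so the sketch stops at the table count.

```python
# Allowance stated in the release note above.
INCLUDED_TABLES = 20


def additional_billed_tables(configured_tables: int) -> int:
    """Return how many configured tables exceed the included allowance."""
    if configured_tables < 0:
        raise ValueError("configured_tables must not be negative")
    return max(0, configured_tables - INCLUDED_TABLES)
```

With 25 tables configured across all workspaces, 5 tables are billed separately; with 20 or fewer, none are.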
Data migration option for upsert - For upsert, Enterprise users can now connect a workspace to their own data migration script or tool to ensure that schema changes are automatically reflected in the intermediate database.
Other updates
Timestamp Shift is now the suggested generator for birthdate fields. Previously, the suggested generator was Random Timestamp.
Fixed an issue where editing workspace settings could cause you to be logged out.
File connector
Google BigQuery
MongoDB
Oracle
PostgreSQL
SQL Server
New global permission to view organization users - A new global permission, View organization users, determines whether a user is able to see the lists of users and groups in the organization. This permission is required in order to use the Tonic application to grant access to and transfer ownership of a workspace, and to grant access to global permission sets. It is not required when you use the Tonic API to perform these tasks. The permission is granted to the built-in Admin, Admin (Environment), and General User permission sets. When you upgrade, Tonic automatically grants this permission to your custom global permission sets.
Other updates
On the workspace details view, added a new upsert processing option, Warn on Mismatched Constraints. When this is enabled, Tonic treats mismatched foreign key and unique constraints between the source and destination databases as warnings instead of errors, so that the upsert job does not fail.
Tonic now accepts all AWS RDS certificate authorities. Previously, we only accepted rds-ca-2019. The accepted certificates include:
When job log recording (used to download job logs from the Tonic application) fails, it no longer creates a recording retry loop.
File connector
PostgreSQL
Create virtual foreign keys from Subsetting view - On Subsetting view, from a table details panel, you can now add a virtual foreign key to that table. To add a virtual foreign key, you select the foreign key column from the current table, then select the primary key column from the other table.
Other updates
Fixed an issue with TLSv1 and TLSv1.1 support in Tonic.
Improved performance of downstream processing during subsetting.
Fixed an issue where subsetting failed when a composite foreign key included a Boolean value.
On Table View:
File connector
MongoDB
MySQL
Oracle
PostgreSQL
Snowflake on Azure
In Table View, when a generator cannot be applied to a column in order to produce the preview data, the error message now includes the name of the column.
Expanded the table and collection dropdown lists to accommodate longer names.
The Privacy Report now marks a column as consistent when the generator is always consistent.
Fixed a migration issue with file connector files that were added before V920.
Fixed an issue where data generation could not run because of the permissions hierarchy.
Fixed a security issue related to JWT authentication.
Fixed an issue where webhooks sometimes did not start when a job was canceled.
Improved error message when Tonic cannot display a date value.
PostgreSQL
TONIC_PAGE_PARALLELISM and TONIC_PARALLEL_READ_RANGES_TABLES environment variables for parallel processing.
SQL Server