Tonic Structural release information

Learn about what’s in the latest Tonic.ai product releases.
v1027 - v1030
November 17, 2023

When an email notification contains a link to a comment, clicking the link now correctly navigates to and displays the comment.

Added a flag to mark log files that might contain sensitive information.

You can now assign the Timestamp Shift generator to primary key columns, including columns that are part of a composite primary key.

When writing destination data to a container repository, based on the size of the source database, Tonic attaches resource requests to the datapacker pod to verify that the cluster has sufficient resources for the data.

Tonic now supports writing destination data to container registries on Amazon Elastic Container Registry (Amazon ECR).

Fixed an issue where users could not assign access to permission sets when they did not have access to create and manage custom permission sets.

For foreign keys that have multiple primary keys, Tonic now prevents data generation if the assigned generation configuration for the primary key columns is inconsistent.

Standardized the display format of the heading area of the workspace management views.

File connector

  • Fixed an issue with deleting multiple files from a file group.

MongoDB

  • From Privacy Hub, you can now use the option to review and apply recommended generators to all of the detected sensitive fields.
  • When there are schema changes that are not yet scanned, such as a new collection or new fields in a collection, they are now handled as Passthrough. The schema changes no longer cause data generation to fail.

MySQL

  • You can now configure a MySQL workspace to write the destination data to a container repository.

PostgreSQL

  • Fixed an issue where the test connection option returned an error when an excluded schema did not exist in the source database.
  • Customers are now by default automatically enrolled in Data Pipeline V2 mode, and do not see the option to enable or disable it. The ability to enable or disable Data Pipeline V2 is granted to individual customers as needed.

Snowflake

  • Fixed an issue where the test connection option returned an error when an excluded schema did not exist in the source database.
v1011 - v1017
October 27, 2023

Apply recommended generators to multiple detected sensitive columns - On Privacy Hub, a new banner displays the number of detected (not manually designated) sensitive columns that are not protected. From the banner, you can display the list of columns, grouped by sensitivity type. For each sensitivity type, for the selected columns, you can:

  • Apply the recommended generator
  • Ignore the generator recommendation
  • Mark the columns as not sensitive

There is also an option to apply the recommended generators to all of the selected columns across all of the sensitivity types.

    Writing destination data to container registries (beta feature) - For data connectors that support it, you can now configure a workspace to write destination data to container artifacts on a container registry instead of to a database server. The Job History view provides access to the generated artifacts for each job.

    Other updates

    Fixed an issue where generated API documentation did not exclude endpoints and schemas from API versions other than the version being viewed.

    Fixed an issue where the Protection Audit Trail displayed the incorrect override status of columns in child workspaces.

    On the System Status tab of Tonic Settings, fixed an issue where the Data Sharing section incorrectly showed logging as unable to connect.

    Databricks

    • On the table mode selection panel, for workspaces that write to Databricks Delta tables, disabled the Error on Overwrite table configuration for tables. The setting is not used in this case.

    File connector

    • In the file group configuration, added options to skip lines from the beginning or end of CSV files.
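
The effect of skipping lines at the beginning or end of a CSV file can be sketched as follows. This is only an illustration of the idea, not how Tonic implements it; the function name and the skip_start and skip_end parameters are hypothetical, and the sketch assumes that no quoted field spans multiple lines.

```python
import csv
import io


def read_csv_skipping(text, skip_start=0, skip_end=0):
    """Parse CSV text after dropping lines from the start and the end.

    Hypothetical helper: assumes no quoted field contains a line break.
    """
    lines = text.splitlines()
    # Keep only the lines between the skipped head and the skipped tail.
    end = len(lines) - skip_end
    kept = lines[skip_start:end]
    return list(csv.reader(io.StringIO("\n".join(kept))))


# A file with a one-line preamble and a one-line footer around the real data.
sample = "Report generated 2023-10-27\nid,name\n1,Ada\n2,Grace\nTotal rows: 2\n"
rows = read_csv_skipping(sample, skip_start=1, skip_end=1)
print(rows)  # [['id', 'name'], ['1', 'Ada'], ['2', 'Grace']]
```

Without the skip options, the preamble and footer lines would be parsed as malformed data rows.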

    MongoDB

    • On Collection View, fixed an issue that prevented array fields from displaying in the Hybrid Document View of a collection.

    Oracle

    • For new workspaces, the Preserve source database file storage preferences setting is now off by default instead of on. Existing workspaces are not affected by this change.
    v1018 - v1023
    November 3, 2023

    Toggle between database server and container repository - On the workspace configuration view, if the data connector supports writing to a container repository, you can now switch between writing to a database server and writing to a container repository. Tonic saves the information you provide for each option.

    Workspace override for statistics seed - From the workspace configuration view, you can now override the Tonic-wide statistics seed that is set as the value of the TONIC_STATISTICS_SEED environment setting. You can either provide a custom seed value for the workspace, or disable consistency across data generation runs for the workspace. Consistency also applies across workspaces that have the same custom seed value.
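
To see why a shared seed gives the same output across runs and across workspaces, consider the following sketch. It is an assumption-laden illustration and not Tonic's algorithm: the keyed hash, the NAMES list, and both function names are hypothetical. Only the TONIC_STATISTICS_SEED setting and the three choices (instance seed, custom workspace seed, consistency disabled) come from the release note.

```python
import hashlib
import hmac
import secrets

# Hypothetical pool of replacement values.
NAMES = ["Alex", "Blake", "Casey", "Devon", "Emery"]


def effective_seed(instance_seed, workspace_seed=None, consistency_enabled=True):
    """Pick the seed for a run (hypothetical precedence rule)."""
    if not consistency_enabled:
        # A fresh random seed per run means outputs are not repeatable.
        return secrets.randbits(64)
    # A custom workspace seed overrides the instance-wide seed.
    return workspace_seed if workspace_seed is not None else instance_seed


def consistent_name(value, seed):
    """Map an input to a replacement that depends only on (seed, value)."""
    digest = hmac.new(str(seed).encode(), value.encode(), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(NAMES)
    return NAMES[index]


# Two workspaces that share custom seed 42 map "Maria" to the same output.
seed_a = effective_seed(instance_seed=7, workspace_seed=42)
seed_b = effective_seed(instance_seed=7, workspace_seed=42)
print(consistent_name("Maria", seed_a) == consistent_name("Maria", seed_b))  # True
```

With consistency disabled, each run draws a new seed, so the same input may map to a different replacement from one run to the next.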

    New telemetry URL - Tonic telemetry now routes through https://telemetry.tonic.ai/ instead of https://api2.amplitude.com/. The following IP addresses must be allowed:

    • 75.2.74.76
    • 99.83.246.105
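
If you maintain an outbound allowlist, a quick check such as the following can confirm that both new addresses are covered. This is a minimal sketch that uses Python's standard ipaddress module; the function name and the example allowlist entries are hypothetical.

```python
import ipaddress

# The two telemetry addresses listed above.
REQUIRED_TELEMETRY_IPS = ["75.2.74.76", "99.83.246.105"]


def missing_from_allowlist(allowed_cidrs):
    """Return the required addresses that no allowlist entry covers."""
    networks = [ipaddress.ip_network(cidr, strict=False) for cidr in allowed_cidrs]
    missing = []
    for ip_text in REQUIRED_TELEMETRY_IPS:
        ip = ipaddress.ip_address(ip_text)
        if not any(ip in network for network in networks):
            missing.append(ip_text)
    return missing


# Hypothetical allowlist that covers only one of the two addresses.
print(missing_from_allowlist(["75.2.74.76/32", "10.0.0.0/8"]))  # ['99.83.246.105']
```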

    The following IP addresses no longer need to be allowed: 52.43.241.47, 54.186.140.101, 54.203.75.164, 44.236.122.176, 34.215.78.194, 54.149.61.206, 54.191.147.220, 52.24.22.222, 52.37.168.36, 54.213.191.53, 54.68.108.104, 52.10.121.164, 52.27.184.186, 44.239.225.209, 54.148.216.233, 52.88.224.247

    Other updates

    Fixed an issue in the Kubernetes client configuration that caused Tonic to reject the SSL certificate of a Kubernetes context.

    Fixed an issue where, during subset configuration, an error was returned incorrectly if the user had permission to edit subsetting but not to view source data.

    Improved the error message that displays when an uploaded virtual foreign key file is invalid.

    Fixed an issue that prevented the Character Substitution generator from being used on primary and foreign key columns when subsetting.

    When a workspace import encounters an unhandled exception, the error now displays correctly.

    For the Address generator, the Country and Country Code options can now be linked. When linked, the country and country code are either “United States” or “US” to match the other linkable components, which are locations in the United States.

    Fixed an issue where SSO login using Okta did not work when a custom authorization server is used.

    Fixed an issue where, when using data science mode on Tonic Cloud, users could not download CSV files that contained synthetic data.

    Data generation jobs that write to a container now include the datapacker logs, if the worker has permissions to read the pods and logs.

    Fixed an issue with uploading container output to Harbor.

    For string data type columns:

    • Added yyyy/MM/dd as a valid format for the Timestamp Shift generator.
    • Added yyyy/MM/dd and MMddyyyy as available output formats for the Random Timestamp generator.
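
As a rough guide to what these patterns look like in practice, the following sketch maps them to Python strftime directives and shifts a date that is stored as a string. The mapping is an assumption based on the usual meaning of yyyy, MM, and dd; the function name is hypothetical, and this is not how the generators are implemented.

```python
from datetime import datetime, timedelta

# Assumed Python equivalents of the format patterns named above.
PYTHON_FORMATS = {
    "yyyy/MM/dd": "%Y/%m/%d",
    "MMddyyyy": "%m%d%Y",
}


def shift_date_string(text, pattern, days):
    """Parse a date string, shift it by whole days, and re-emit the same format."""
    fmt = PYTHON_FORMATS[pattern]
    shifted = datetime.strptime(text, fmt) + timedelta(days=days)
    return shifted.strftime(fmt)


print(shift_date_string("2023/11/03", "yyyy/MM/dd", 3))  # 2023/11/06
print(datetime(2023, 11, 3).strftime(PYTHON_FORMATS["MMddyyyy"]))  # 11032023
```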

    Databricks

    • For Databricks 11.3 and later, the Databricks data connector now supports the Regex Mask generator. The regular expressions might work slightly differently than on non-Spark data connectors.

    Spark SDK

    • The Spark SDK data connector now supports the Regex Mask generator. The regular expressions might work slightly differently than on non-Spark data connectors.

    Spark with Livy

    • For Spark 2.3.x and 2.4.2, the Spark with Livy data connector now supports the Regex Mask generator. The regular expressions might work slightly differently than on non-Spark data connectors.

    SQL Server

    • Added a new worker environment setting, SQL_SERVER_SCRIPT_CROSS_DATABASE_REFERENCES. The default is true, which preserves the existing behavior. To prevent scripting of objects that are defined in other databases that the source database references, change the setting to false.
    • Fixed issues where SQL Server connections did not honor the TONIC_SQL_COMMAND_TIMEOUT environment setting. The default is 0, which indicates an infinite timeout.
    v1004 - v1010
    October 20, 2023

    Configure Tonic environment settings from Tonic Settings - On the Tonic Settings view, a new Environment Settings tab allows you to configure a subset of Tonic environment settings (previously referred to as Tonic environment variables). To use the Environment Settings tab to configure settings, you must have the new Manage environment settings global permission. When you change the values from Tonic Settings, you do not need to restart Tonic.

    Other updates

    For the Regex Mask generator, fixed an issue where quotes in regular expressions were malformed.

    Fixed an issue where the Test Webhook option failed when it should have succeeded.

    Fixed an issue where the Confirm Data Generation panel erroneously indicated that Tonic could not connect to the destination database when it was only unable to connect to the source database.

    For the Address generator, removed an inappropriate value from the possible street names.

    Improved error messages when Tonic containers fail to start because of missing or incorrect environment variable values.

    Databricks

    • Fixed source catalog workspace handling when the Databricks Unity source catalog contains a table with a key constraint.
    • For Databricks 10.4 and earlier, generations to output Databricks tables now correctly create a new destination Databricks database if the specified one is not found.

    File connector

    • For file connector workspaces, the post-job scripts option is now hidden. The file connector does not support post-job scripts.
    • For Amazon S3, fixed an issue where Tonic did not delete a temporary file that was created to test permissions.

    Google BigQuery

    • External tables are now supported with some restrictions. Destination tables must be native BigQuery tables and cannot be external tables, whether masked or passthrough. Performance might be affected because of Google’s implementation of external tables.

    MongoDB

    • Fixed an issue where an error was thrown when generators were applied.

    Snowflake on AWS and Azure

    • Added an option to use key pair authentication to connect to the source and destination databases.
    v993 - v1003
    October 13, 2023

    Logging and telemetry connection status - On the System Status tab of Tonic Settings, a new Data Sharing section provides a summary of the logging and telemetry connectivity to the Tonic backend. The new section indicates:

    • Whether sending logs and telemetry to Tonic.ai is enabled
    • If they are enabled, whether Tonic is able to connect in order to send the logs and telemetry

    Other updates

    Tonic now checks for invalid virtual foreign keys for all data generation. Previously it only ran the check for subsetting data generation.

    Fixed an issue where the Continuous generator failed when a column contained only NULL values.

    Improved performance for the Continuous generator.

    Fixed an issue where the Cell Count in the Usage Report could throw an integer overflow error.

    Fixed a subsetting issue where Tonic displayed the error “Error fetching Subset preview” when users navigated to a workspace.

    For subsetting, updated the Graph View display to make it easier to see the connections between the tables.

    Improved messaging when Tonic cannot reach the database when you test a database connection.

    Databricks

    • On the workspace settings view, Test Cluster Connection no longer requires that you re-enter your API key.
    • On the workspace settings view, the default job cluster specification now recommends a Databricks 14.0 Spark version.
    • Databricks is now enabled for Professional and Enterprise users on Tonic Cloud, as well as for free trial users.
    • Fixed source catalog workspace handling when the source catalog contains a table with a key constraint.
    • Added a warning to the workspace settings view to prevent specifying the same source and destination locations.

    File connector

    • Fixed the validation used to prevent duplicate files in Amazon S3 file groups.

    PostgreSQL

    • Improved handling when the database does not contain a public schema.
    • Improved subsetting performance for the Data Pipeline V2 data generation process.
    • Improved performance for the Data Pipeline V2 data generation process.

    SQL Server

    • Fixed how Tonic handles system time periods.
    • Improved messaging when Tonic cannot access SQL objects.
    v985 - v992
    October 6, 2023

    The new TONIC_ENABLE_SECURE_COOKIES setting indicates whether to enable the "Secure" attribute on Tonic authorization and analytics cookies. The default value is false. Do not set this to true if you access Tonic over an HTTP connection. When TONIC_HTTPS_ONLY is set to true, the "Secure" attribute is always enabled on Tonic authorization and analytics cookies, and the value of TONIC_ENABLE_SECURE_COOKIES is ignored.
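
The interaction between the two settings can be summarized as a small truth function. This is a sketch of the rule as described above, not Tonic's implementation; the function and parameter names are hypothetical.

```python
def secure_attribute_enabled(https_only, enable_secure_cookies=False):
    """Return whether cookies carry the Secure attribute.

    https_only            -> value of TONIC_HTTPS_ONLY
    enable_secure_cookies -> value of TONIC_ENABLE_SECURE_COOKIES (default false)
    """
    # TONIC_HTTPS_ONLY=true always enables Secure; the other setting is ignored.
    if https_only:
        return True
    return enable_secure_cookies


for https_only in (False, True):
    for enable in (False, True):
        print(https_only, enable, secure_attribute_enabled(https_only, enable))
```

The only combination that leaves the attribute off is both settings set to false, which is also the only combination that is safe for plain HTTP access.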

    Updated to prevent simultaneous updates to the same workspace configuration.

    For the Constant generator, fixed an issue for JSON columns where setting the constant to an empty string caused data generation to fail without setting the job status to failed.

    The upsert pre-job check that validates the constraints on the intermediate and destination databases no longer fails when a database has constraints with duplicate signatures.

    Fixed an issue where an empty upstream filter WHERE clause caused subsetting to fail if the schema changed so that the table was no longer upstream.

    To use the API to obtain data encryption settings, the API user must now have the required global permission.

    When users log out of Tonic, we now automatically invalidate any JSON Web Tokens (JWTs) that are not expired.

    The API endpoint /api/permission-sets now requires the ManageUserAccess (Manage user access to Tonic and to any workspace) permission. Added a new endpoint /api/permission-sets/public, which returns the subset of the data needed by users who do not have that permission.
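
A client that might run with or without the ManageUserAccess permission could select the endpoint as in the following sketch. The two paths come from the note above. The host name, the API key placeholder, the Apikey authorization scheme, and the function name are assumptions for illustration; check them against your instance's API documentation. The request is only constructed, not sent.

```python
import urllib.request


def permission_sets_request(base_url, api_key, has_manage_user_access):
    """Build (but do not send) a GET request for permission sets."""
    # Users without ManageUserAccess use the reduced public endpoint.
    if has_manage_user_access:
        path = "/api/permission-sets"
    else:
        path = "/api/permission-sets/public"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        # Assumed header format; not confirmed by the release note.
        headers={"Authorization": f"Apikey {api_key}"},
        method="GET",
    )


request = permission_sets_request("https://tonic.example.com", "YOUR_API_KEY", False)
print(request.full_url)  # https://tonic.example.com/api/permission-sets/public
```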

    Amazon EMR

    • Fixed an issue where when data generation was run from the SDK, some generators, including the Categorical generator, would not work.

    Databricks

    • You can now configure a Databricks workspace to write destination data to Databricks Delta tables.

    File connector

    • On the file group details for CSV files, added configuration options to quote spaces and trim whitespace.
    • Fixed an issue where Tonic was not able to display a preview of extremely large files.

    MongoDB

    • On the Collection View, the preview icon is now hidden for types that cannot be previewed.

    PostgreSQL

    • For Data Pipeline V2 data generation, fixed an issue where we did not correctly truncate destination tables.
    • Fixed an upsert issue for tables that have generated identity columns.

    Snowflake

    • For a Snowflake on AWS workspace, you can now provide specific AWS credentials for the file storage locations (S3 buckets or external stages).
    • The Snowflake data connectors are now available to Tonic Cloud users with a Professional or Enterprise license. They are not available to free trial users.
    v942 - v946
    August 18, 2023

    Upsert data generation (beta feature)

    Previously, the data generation process always replaced the entire destination database. The new upsert data generation option (currently in beta) allows you to add new records and update existing records without touching any of the other records in the destination database. For example, you might have a regular set of records that you use for testing that you want to maintain.

    Upsert requires a connection to an intermediate database. When you run data generation with upsert, the initial data generation writes the transformed data to the intermediate database. It replaces the intermediate database, similar to regular data generation. After the generation to the intermediate database, the upsert process identifies the records to add to or update in the destination database. It ignores other records in the destination database.
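
The difference between replacing the destination and upserting into it can be sketched with plain dictionaries keyed by primary key. This is a conceptual illustration only, under the assumption that rows are matched by primary key; the function name and sample data are hypothetical, and the real process runs against databases.

```python
def upsert(destination, intermediate):
    """Add or update destination rows by primary key; leave other rows alone."""
    added, updated = [], []
    for key, row in intermediate.items():
        if key not in destination:
            added.append(key)
        elif destination[key] != row:
            updated.append(key)
        destination[key] = row
    return added, updated


# Row 2 exists only in the destination, for example a hand-maintained test record.
destination = {1: {"name": "old-a"}, 2: {"name": "keep-me"}}
# The intermediate database holds the freshly transformed data.
intermediate = {1: {"name": "new-a"}, 3: {"name": "new-c"}}

added, updated = upsert(destination, intermediate)
print(added, updated)   # [3] [1]
print(destination[2])   # {'name': 'keep-me'}
```

A regular data generation would instead leave the destination equal to the intermediate data, which drops row 2.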

    Upsert requires a Professional or Enterprise license, and is only supported for the following data connectors:

    • MySQL
    • Oracle
    • PostgreSQL
    • SQL Server

    New AI-enhanced documentation search option

    The Tonic documentation now provides access to Lens, an AI-based search option. Instead of searching for specific words, you can ask questions such as "How do I create a workspace?". Lens searches the documentation for the answer. It generates a response that includes links to the topics that contain the information it used.

    To use the Lens search, click the search field. At the top right of the search panel, click Lens. Then ask your question.

    Other updates

    The Custom Categorical generator now supports consistency with other columns. Previously, the generator only supported self-consistency.

    Tonic now prevents you from starting data generation for a workspace that does not have a destination database specified.

    Fixed a display issue where the generator preset details panel briefly showed the occurrences for the previously selected generator preset.

    Tonic now suggests the Name generator for columns that Tonic detects as containing names, when the detection is based on the sampled data in the column. By default, the Name generator uses the First Last format.

    A new configuration option allows webhooks to bypass SSL certificate validation and trust the server certificate.

    File connector

    • You cannot start data generation for a file connector workspace that has no source files specified.
    • The Table View data preview for file groups that contain JSON or XML files no longer displays an extra row that contains the value 0 above the real data records.
    • Fixed an issue where generators such as the Categorical generator unexpectedly could not be used as sub-generators.
    • Improved error handling when a file group is incorrectly configured as having a header row.
    • Fixed an issue that caused data generation to fail for XML files because of missing metadata.
    • Fixed an issue where Database View displayed duplicate columns.
    • Table View no longer displays an extra data row.

    MongoDB

    • Fixed an issue that caused unscanned collections to not display on Collection View.
    • The sensitivity scan no longer marks Null fields as sensitive.

    PostgreSQL

    • Improved performance when refreshing materialized views in PostgreSQL v11, v12 and v13.
    v977 - v984
    September 29, 2023

    NOTE: v980 through v982 were removed.

    New monthly pay-as-you-go plan on Tonic Cloud - Tonic now offers a pay-as-you-go subscription plan for Tonic Cloud. Free trial users are offered the option to use a credit card to purchase a pay-as-you-go license. The monthly subscription grants a Professional level license. With the pay-as-you-go plan, you can configure generators for up to 20 tables across all of your workspaces. Tonic bills you separately for each additional table that you configure. The license renews automatically each month. On Tonic Settings, a new Billing tab displays the next renewal date.

    Data migration option for upsert - For upsert, Enterprise users can now connect a workspace to their own data migration script or tool to ensure that schema changes are automatically reflected in the intermediate database.

    Other updates

    Timestamp Shift is now the suggested generator for birthdate fields. Previously, the suggested generator was Random Timestamp.

    Fixed an issue where editing workspace settings could cause you to be logged out.

    File connector

    • On Tonic Cloud, you can now use Amazon S3 as a source of files. Previously, Tonic Cloud only supported Google Cloud Storage and uploaded local files.
    • For the Categorical generator, linked columns are now in the correct order.

    Google BigQuery

    • Tonic now supports arrays of a supported type. The STRUCT and INTERVAL types are still not supported.

    MongoDB

    • Fixed an issue where a UI refresh was required in order to show automatic de-identification of foreign keys from de-identified primary keys.

    Oracle

    • The Download SqlLdr Files workspace permission is now assigned to the built-in Manager and Editor workspace permission sets.
    • Downloaded sqlldr files no longer include .bad files.

    PostgreSQL

    • Fixed an upsert issue for tables that have generated identity columns.

    SQL Server

    • The Constant generator can now handle bit values.
    v972 - v976
    September 22, 2023

    New global permission to view organization users - A new global permission, View organization users, determines whether a user is able to see the lists of users and groups in the organization. This permission is required in order to use the Tonic application to grant access to and transfer ownership of a workspace, and to grant access to global permission sets. It is not required when you use the Tonic API to perform these tasks. The permission is granted to the built-in Admin, Admin (Environment), and General User permission sets. When you upgrade, Tonic automatically grants this permission to your custom global permission sets.

    Other updates

    On the workspace details view, added a new upsert processing option, Warn on Mismatched Constraints. When this is enabled, Tonic treats mismatched foreign key and unique constraints between the source and destination databases as warnings instead of errors, so that the upsert job does not fail.

    Tonic now accepts all AWS RDS certificate authorities. Previously, we only accepted rds-ca-2019. The accepted certificates include:

    • rds-ca-rsa2048-g1
    • rds-ca-rsa4096-g1
    • rds-ca-ecc384-g1

    When job log recording (used to download job logs from the Tonic application) fails, it no longer creates a recording retry loop.

    File connector

    • Additional fixes to skip and log invalid rows instead of failing the data generation.
    • Fixed an issue where, when you added a file to an existing file group and any column name contained a leading or trailing space, Tonic incorrectly displayed a schema mismatch error.
    • You can now add .gzip files to a file group in a file connector workspace. The original file that was compressed must have the same format and structure as the other files in the file group. .gzip files are only supported for workspaces that use files from Amazon S3 or Google Cloud Storage. They are not supported in workspaces that use local files.

    PostgreSQL

    • During upsert, improved performance when de-conflicting unique constraints.
    v965 - v971
    September 15, 2023

    Create virtual foreign keys from Subsetting view - On Subsetting view, from a table details panel, you can now add a virtual foreign key to that table. To add a virtual foreign key, you select the foreign key column from the current table, then select the primary key column from the other table.

    Other updates

    Fixed an issue with TLSv1 and TLSv1.1 support in Tonic.

    Improved performance of downstream processing during subsetting.

    Fixed an issue where subsetting failed when a composite foreign key included a Boolean value.

    On Table View:

    • Added a column warning when the assigned generator fails to generate preview data.
    • Increased the default width of the preview data column.

    File connector

    • Tonic now skips invalid CSV file rows and logs a warning instead of failing the entire file.
    • Added a warning when the same source file is added to multiple file groups in the same workspace.
    • Added JSON Mask and XML Mask as the recommended generators for JSON and XML files.
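
The skip-and-warn behavior in the first item above can be sketched as follows. The release note does not say what makes a row invalid, so this sketch assumes that it means a field count that differs from the header; the function and logger names are hypothetical.

```python
import csv
import io
import logging

logger = logging.getLogger("file_connector_sketch")


def read_valid_rows(text):
    """Return (header, valid_rows, skipped_count), logging each skipped row."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    valid, skipped = [], 0
    for row in reader:
        # Assumed validity rule: the row must have as many fields as the header.
        if len(row) != len(header):
            logger.warning(
                "Skipping line %d: expected %d fields, found %d",
                reader.line_num, len(header), len(row),
            )
            skipped += 1
            continue
        valid.append(row)
    return header, valid, skipped


sample = "id,name\n1,Ada\n2\n3,Grace\n"
header, valid, skipped = read_valid_rows(sample)
print(valid, skipped)  # [['1', 'Ada'], ['3', 'Grace']] 1
```

Processing continues past the malformed line, so one bad row no longer fails the whole file.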

    MongoDB

    • You can now assign the Conditional and Null generators to binary fields.

    MySQL

    • Fixed an issue where Tonic did not clean up temporary file uploads after it wrote the data.

    Oracle

    • For a data generation job that ran SQL Loader (sqlldr), if sqlldr either failed or succeeded with errors, the job details now include an option to download the sqlldr log files.

    PostgreSQL

    • Fixed an upsert issue where jobs failed with a unique constraint violation if the table contained a unique index but not a unique constraint.
    • During upsert, improved performance when de-conflicting unique constraints.

    Snowflake on Azure

    • Fixed a bug that occurred when specifying the Azure Storage Account URL.
    v952 - v960
    September 1, 2023

    In Table View, when a generator cannot be applied to a column in order to produce the preview data, the error message now includes the name of the column.

    Expanded the table and collection dropdown lists to accommodate longer names.

    The Privacy Report now marks a column as consistent when the generator is always consistent.

    Fixed a migration issue with file connector files that were added before v920.

    Fixed an issue where data generation could not run because of the permissions hierarchy.

    Fixed a security issue related to JWT authentication.

    Fixed an issue where webhooks sometimes did not start when a job was canceled.

    Improved the error message that displays when Tonic cannot display a date value.

    PostgreSQL

    • For the Tonic Data Pipeline V2 processing, Tonic now stops job execution after the initial error.
    • Fixed an issue where check constraints failed to be applied in the destination database.
    • Fixed an issue where views that depend on both a table and a view at the same time were not created in the destination database.
    • Tonic no longer uses the TONIC_PAGE_PARALLELISM and TONIC_PARALLEL_READ_RANGES_TABLES environment variables for parallel processing.

    SQL Server

    • Fixed a caching issue that occurred when connecting to SQL Server.
    • Improved error messaging when a view cannot be created.
    • Improved the readability of SMO error messages.
    v961 - v964
    September 8, 2023

    Removed the requirement that the authentication cookie goes over HTTPS (Secure Cookie). This fixed an issue where users could no longer log into Tonic over HTTP, but they could still log in over HTTPS.

    Fixed an issue where users could not log out of Tonic from the email confirmation page.

    Fixed an issue where upsert failed because of foreign key violations. Also improved upsert performance.

    MongoDB

    • When generator configurations are updated in single document view, Tonic now generates the preview data without re-fetching the data and refreshing the page.
    • Fixed an issue that caused fields to disappear from hybrid view when the Null generator was applied.
    v936 - v941
    August 11, 2023

    On the generator configuration panel, changed the label of the Save As menu to Preset Options. The menu contains options related to configuring generator presets.

    Free trial users now have access to the file connector.

    Tonic now displays the error that occurs when an Algebraic generator configuration does not include any floating-point values.

    For composite generators, the generator preset details panel now provides a clearer explanation that presets for composite generators must be configured from within a workspace.

    Improved performance for the Address generator and the HIPAA Address generator.

    Amazon Redshift

    • Fixed an issue with clearing temporary tables.

    File connector

    • Fixed how Tonic handles EOF characters.

    MongoDB

    • Corrected the order of the available generator presets for a field.

    Snowflake

    • Fixed an issue where data generation returned the error The specified bucket does not exist.

    SQL Server

    • Database connections can now use the MultiSubnetFailover option.
    • Improved error messaging when a database cannot be created because of permissions issues.
    v947 - v951
    August 25, 2023

    You can now export individual topics from the Tonic documentation to PDF files. To export a topic to PDF, click the actions menu next to the topic title, then click Export as PDF.

    Fixed an issue with removing unique constraint conflicts in upsert where rows that didn’t have a conflict were excluded from the upsert process.

    File connector

    • For Amazon S3 and Google Cloud Storage, the permission to list all buckets is no longer required. However, if that permission is not present, users must manually type in the bucket name where the file is located.
    • When you copy a file connector workspace, Tonic now copies the file groups to the new workspace.
    v926 - v935
    August 4, 2023

    Removed support for TIM - The Tonic Installation Manager (TIM) command-line tool to install and configure Tonic is no longer available.

    Free trial users can now use a public email address to create the free trial account. Users with public email addresses cannot invite other users or share workspaces. Public accounts are only allowed for free trials.

    Users on a Professional instance can now share the Manager workspace permission set with users and groups.

    Improved error handling and validation messages for the foreign key file upload process.

    Counts of generator preset occurrences no longer include occurrences in deleted workspaces.

    On the bulk update panel in Database View, the consistency and differential privacy options now display correctly.

    Fixed an issue where you could not select Passthrough as a sub-generator in a composite generator.

    Fixed an issue where custom presets could not be deleted.

    Fixed a display issue where long post-job action names overflowed into the next column.

    Fixed an issue where you could not assign Random Timestamp as a sub-generator for the Conditional generator.

    Fixed an issue where the generator configuration panel displayed the generator preset options when the user did not have the Manage generator presets global permission.

    Fixed an issue where data generation failed when a constraint could not be applied.

    Improved display when users who do not have the Manage generator presets permission try to display the Occurrences tab on the preset details panel.

    Improved how we handle unavailable options for workspace actions in Workspaces view and in the Tonic navigation options.

    For the Conditional generator, Tonic now correctly compares MySQL date values.

    Databricks

    • Tonic cluster initialization scripts are now uploaded as workspace files instead of DBFS files. The new, optional Workspace Path setting for Databricks workspaces controls the parent directory where Tonic uploads initialization scripts. The default value is /Shared.

    File connector

    • The file connector can now support .txt files that contain CSV, XML, or JSON content.
    • Fixed an issue where Tonic incorrectly identified how a file connector file was encoded.
    • Improved error messages when uploading files for the file connector.
    • Fixed an issue when configuring a file group from Amazon S3 where users saw the error "Failed to fetch files from S3. The continuation token provided is incorrect." but could still see the list of files in the S3 bucket.
    • Tonic now correctly updates the file configuration for file groups. Previously, users could not add files that did not match the default configuration.
    • Tonic now displays an error when it is unable to read files from Amazon S3.
    • The file explorer for Amazon S3 can now list the files in folders that have names that contain special characters.
    • Improved encoding detection and file parsing.
    • Tonic now correctly handles EOF characters in .csv files.
    • Tonic now preserves the encoding of .csv files.

    MongoDB

    • Fixed an issue where the protection status information at the top of Privacy Hub did not update correctly after a new sensitivity scan.
    v923 - v925
    July 28, 2023

    Custom generator presets

    Earlier this year, for Enterprise instances, we introduced the concept of generator presets. A generator preset is a saved configuration of a generator. You can assign generator presets to columns.

    The initial release only included built-in generator presets, which allowed you to set the default configuration for Tonic generators.

    This update in v924 introduces custom generator presets, which allow you to set up multiple configurations of the same generator. You can create custom generator presets from Generator Presets view. From a generator configuration panel, you can also save the current configuration as a new custom generator preset.

    Generator preset occurrences

    From Generator Presets view, you can see how often each preset was used in a workspace configuration.

    The Occurrences column of the generator presets list shows:

    • The number of times the baseline configuration was used
    • The number of times the baseline configuration was overridden, meaning that a user selected the generator preset and then made a change to the generator configuration

    On the generator preset details panel, the Occurrences tab displays both the number of occurrences and the specific workspaces and columns where the generator preset was used. You cannot see workspace and column details for workspaces that you do not have access to.
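
    The two counts described above can be sketched as follows. This is only an illustration of the counting rule, not Tonic's implementation. The ColumnAssignment record, its field names, and both helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnAssignment:
    """One column that uses a generator preset (hypothetical record shape)."""
    workspace: str
    column: str
    preset: str
    overridden: bool  # True if the user changed the preset's baseline configuration


def count_occurrences(assignments):
    """Return {preset: {"baseline": n, "overridden": m}} for the given assignments."""
    counts = {}
    for a in assignments:
        entry = counts.setdefault(a.preset, {"baseline": 0, "overridden": 0})
        entry["overridden" if a.overridden else "baseline"] += 1
    return counts


def visible_details(assignments, accessible_workspaces):
    """Keep only the workspace and column details that the viewer has access to."""
    return [a for a in assignments if a.workspace in accessible_workspaces]
```

    In this sketch, the totals come from all assignments, while the per-column details are filtered by the workspaces that the viewer can access.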

    Other updates

    Tonic can now integrate with GitHub for SSO authentication.

    To manage generator presets, users must now have the Manage generator presets global permission. Previously, you could also manage generator presets if you had the Manager or Editor workspace permission set for any workspace.

    Fixed an issue where the table data in Table View was not updated correctly when switching the table mode to or from Scale mode.

    Improved performance for the Regex Mask and Conditional generators.

    MongoDB

    • Fixed an issue where the subsetting Graph View did not display virtual foreign key relationships.
    • You can now add collections to a subsetting rule before the sensitivity scan completes.

    PostgreSQL

    • Fixed an issue where certain database constraints were not handled correctly, which resulted in job warnings about the failure to add those constraints.
    v921 - v922
    July 26, 2023

    Permissions and permission sets

    As of v922, Tonic now uses permissions and permission sets to manage access to Tonic features and functions.

    A permission controls access to a single feature or function. A permission set is a saved collection of permissions. Tonic provides built-in global and workspace permission sets. You cannot change the configuration of built-in permission sets. Enterprise instances can create custom permission sets.

    Global permission sets contain global permissions, which control access to actions outside the context of a specific workspace. The built-in Admin global permission set grants access to all global permissions. Users and groups configured in the TONIC_ADMINISTRATORS environment variable are granted the Admin (Environment) global permission set, which also grants access to all global permissions. These global permission sets replace the previous admin user concept.

    The built-in General User global permission set is assigned to all Tonic users, and grants access to create workspaces. You can also designate a different global permission set to assign to all Tonic users.

    Workspace permission sets are assigned in the context of a specific workspace. They provide access to workspace permissions, which are associated with workspace management functions. The built-in workspace permission sets (Manager, Editor, Viewer, Auditor) mirror the previous workspace roles. Similar to the previous Owner role, the Manager workspace permission set is granted access to all workspace permissions. However, unlike the Owner role, the Manager workspace permission set can be assigned to any user or group. You use the workspace sharing function to assign workspace permission sets within a workspace.

    Each workspace has a single owner. The user who creates the workspace is the initial owner. All owners are by default granted the Manager workspace permission set. You can also designate a different workspace permission set to assign to workspace owners. You use the transfer ownership function to select a different owner for a workspace.
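
    As a rough mental model of workspace permission sets, sharing, and ownership, consider the sketch below. It is not Tonic's data model. The permission names and the PermissionSet and Workspace classes are hypothetical. The sketch also assumes that a previous owner keeps access only through an explicit share.

```python
from dataclasses import dataclass, field

# Hypothetical workspace permission names, for illustration only.
ALL_WORKSPACE_PERMISSIONS = frozenset(
    {"view_workspace", "configure_generators", "run_generation", "share_workspace"}
)


@dataclass(frozen=True)
class PermissionSet:
    """A saved collection of permissions."""
    name: str
    permissions: frozenset
    built_in: bool = False


# Manager is granted every workspace permission; Viewer is a narrower example.
MANAGER = PermissionSet("Manager", ALL_WORKSPACE_PERMISSIONS, built_in=True)
VIEWER = PermissionSet("Viewer", frozenset({"view_workspace"}), built_in=True)


@dataclass
class Workspace:
    name: str
    owner: str
    owner_permission_set: PermissionSet = MANAGER  # default set granted to owners
    shares: dict = field(default_factory=dict)     # user or group -> PermissionSet

    def permission_set_for(self, user):
        """The owner gets the owner permission set; others get what was shared."""
        if user == self.owner:
            return self.owner_permission_set
        return self.shares.get(user)

    def can(self, user, permission):
        assigned = self.permission_set_for(user)
        return assigned is not None and permission in assigned.permissions

    def transfer_ownership(self, new_owner):
        """Select a different single owner for the workspace."""
        self.owner = new_owner
```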

    On Tonic Settings view, the User Management tab is replaced by the Access Management tab. From the Access Management tab, you can:

    • View and manage the list of Tonic users
    • If you use SSO, view a list of SSO groups
    • View, configure, and manage access to global permission sets
    • View and configure workspace permission sets

    API endpoint to track user access and permissions

    A new API endpoint tracks the following events related to user access and permissions:

    • A user account is created.
    • A user account is removed.
    • A user logs in to Tonic.
    • A permission is added to or removed from a permission set.
    • A permission set is assigned to or removed from a user. This might be a global permission set, or a workspace permission set in the context of a specific workspace.
    • A generator preset is updated.

    The endpoint is:

    GET /api/audit-events/search
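
    A minimal sketch of preparing a call to this endpoint with the Python standard library is shown below. The host name is a placeholder, and the Authorization header format (the Apikey scheme) is an assumption to confirm against the Tonic API documentation. These notes do not list query parameters, so none are included.

```python
from urllib.request import Request


def build_audit_search_request(base_url, api_key):
    """Build (but do not send) a GET request for the audit events endpoint."""
    url = f"{base_url.rstrip('/')}/api/audit-events/search"
    headers = {
        # Assumption: the API key is sent in the Authorization header.
        "Authorization": f"Apikey {api_key}",
        "Accept": "application/json",
    }
    return Request(url, method="GET", headers=headers)


# Placeholder host and key; pass the request to urllib.request.urlopen to send it.
request = build_audit_search_request("https://tonic.example.com", "YOUR_API_KEY")
```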

    Other updates

    You can now assign the Business Name generator as a sub-generator for the Regex Mask generator.

    For subsetting, Graph View and the table details panel now display information about cycle breaks, when the subsetting process needs to break a circular dependency.

    Databricks

    • Fixed an issue that prevented the use of partition filters in Databricks Unity Catalog workspaces.

    File connector

    • When a file connector workspace is deleted, Tonic now deletes files that were uploaded from a local file system.

    Spark

    • Fixed an issue with data generation for workspaces that use Hive.
    v910 - v920
    July 21, 2023

    The Admin Panel is renamed to Tonic Settings.

    Tonic now provides a more meaningful error when Preserve Destination mode is assigned to a table in a workspace that does not have a defined destination database.

    Fixed an issue where Tonic opened too many connections to the application database.

    Fixed an issue with timestamps in the Tonic API specification.

    Fixed a Tonic Cloud issue where using a different email domain to update a Tonic license caused issues with Tonic logging.

    Enhanced the performance of the HIPAA Address generator.

    The Data Pipeline V2 data generation process now respects the TONIC_PROCESS_PARALLELISM environment variable.

    Improved performance for subsetting, especially for data that contains a large number of foreign key relationships.

    Made a small performance improvement to primary key generators.

    File connector

    • Improved error messages when uploading files for the file connector.
    • The file connector now supports extended ASCII-encoded files.
    • Fixed an issue where the file preview omitted the first row of the file.

    MongoDB

    • On the Jobs view, you can now filter for the Collection Schema Scan job type.
    • Fixed an issue where sensitivity scans failed on large collections.

    Oracle

    • Tonic now provides a more meaningful error when it loads Table View for a table that the database account does not have access to query data from.

    PostgreSQL

    • Corrected the job history entries for subsetting jobs that run using the Data Pipeline V2 process.

    Spark

    • Fixed an issue that caused jobs to fail when an invalid Repartition or Coalesce value was specified.
    v899 - v909
    July 14, 2023

    NOTE: Releases v899 through v901 were removed.

    Tonic Data Pipeline V2 for PostgreSQL ends beta

    During the first half of 2023, Tonic ran a beta program for PostgreSQL for our new Data Pipeline V2. The beta program is now ending. Thank you to all of those who provided feedback.

    Starting with version V905, Tonic.ai will progressively enable Data Pipeline V2 for all customers. To ensure a smooth transition for all our PostgreSQL customers, Tonic.ai controls the rollout remotely.

    The remote rollout mechanism is controlled by an HTTPS request from your instance to https://feature-flag.tonic.ai. A JSON payload with a unique identifier for your deployment is sent, and the status of Data Pipeline V2 is returned. This request happens at the start of each data generation. If your Tonic server cannot reach https://feature-flag.tonic.ai, then the check is skipped.
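
    The behavior described above, including the skipped check when the service is unreachable, can be sketched as follows. This is not Tonic's code. The HTTP method and the JSON field names deploymentId and dataPipelineV2Enabled are hypothetical, because these notes do not specify the payload format.

```python
import json
from urllib.request import Request, urlopen

FEATURE_FLAG_URL = "https://feature-flag.tonic.ai"


def post_json(url, payload, timeout=5.0):
    """Send a JSON payload over HTTPS and return the decoded JSON reply."""
    body = json.dumps(payload).encode("utf-8")
    req = Request(url, data=body, method="POST",
                  headers={"Content-Type": "application/json"})
    with urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def check_data_pipeline_v2(deployment_id, send=post_json):
    """Return True or False from the service, or None if the check is skipped."""
    try:
        # Hypothetical payload: a unique identifier for the deployment.
        reply = send(FEATURE_FLAG_URL, {"deploymentId": deployment_id})
    except OSError:
        # urllib's URLError and socket timeouts are OSError subclasses:
        # the service is unreachable, so the check is skipped.
        return None
    return bool(reply.get("dataPipelineV2Enabled"))
```

    The send parameter exists only so that the network call can be replaced in tests.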

    What to expect for the enrollment:

    • Before your instance is enrolled in Data Pipeline V2, Tonic Customer Success will contact you to confirm your enrollment.
    • After your instance is enrolled, the Data Pipeline V2 toggle on the Confirm Generation panel is removed. All PostgreSQL jobs run using Data Pipeline V2.

    For jobs that run on V2:

    • The job type is Data Pipeline Generation instead of Data Generation.
    • Jobs should run faster. Data Pipeline V2 has better resource handling, and can provide more parallel execution, especially for large tables and to apply constraints. Not all jobs will be faster, but for some jobs there should be a significant improvement.

    We will continue to improve Data Pipeline V2 as we expand coverage to other data connectors.

    Expanded Graph View for subsetting

    The subsetting Graph View is expanded to use more of the available screen space. The Configure Subset panel, which includes the Options and Latest Results tabs, no longer displays on Graph View. It only displays on Table View.

    Other updates

    Fixed an issue where when a data generation job failed, tables that used Preserve Destination table mode were not restored.

    The generated Tonic API documentation now includes the endpoints for managing file groups for file connector workspaces.

    Fixed an issue that caused jobs for some workspaces to fail with the exception "Cannot modify workspace whose schema is not the latest version.".

    Fixed an issue where the job details view displayed incorrect information.

    Updated the /api/DataSource endpoint to not contain secure data such as the API key.

    Updated our OpenAPI documents to ensure that all values of operationId are unique.

    Improved error messages for failed data generation.

    Databricks

    • The Test Connection button on the workspace details view now works correctly.

    MongoDB

    • Fixed an issue where schema scans timed out even when a timeout was not configured.
    • Fixed an issue where MongoDB workspaces did not support MongoDB versions below v4.4.

    MySQL

    • Tonic data generation now supports generated columns in de-identified tables.

    Oracle

    • To improve the resiliency of Oracle commands during data generation, fixed the retry logic.

    PostgreSQL

    • Tonic no longer waits to process a job cancellation until after it finishes the constraint application that was in progress when the job was canceled.
    • Fixed an issue where Tonic returned an "insufficient space" exception when writing numeric values to a destination database.

    Snowflake

    • Improved performance for Snowflake on Azure.

    Spark

    • Job tracking URLs now display correctly on the job details page.

    SQL Server

    • Fixed an issue where Tonic did not retrieve all of the compound keys from the source data.
    v895 - v898
    July 7, 2023

    NOTE: Releases v897 and v898 were removed.

    On Subsetting view, Graph View now displays a loading animation as new data is loaded.

    Improved performance for the UUID Key and Integer Key generators.

    File connector

    • Fixed an issue where files sometimes did not upload completely.

    Google BigQuery

    • Improved performance of destination database writes during data generation.

    MySQL

    • Improved performance for destination database writes during data generation.

    Oracle

    • Improved error handling when the tablespace in the source database is missing from the destination database.
    v877 - v886
    June 23, 2023

    The ASCII Key generator now includes an Exclude Lowercase Alphabet option to exclude lowercase letters from the destination values.

    Fixed an issue that prevented free trial signups.

    Data generation no longer fails when Tonic is unable to retrieve the destination database size.

    Updated the FNR generator to prevent a possible leakage of PII.

    Added a Date column to the usage report. The Date column provides the date and time when the data generation job was completed.

    Fixed an issue in the JSON Mask generator where it incorrectly changed the format of timestamps.

    Subsetting is no longer prevented when a table that is not in the subset is assigned Preserve Destination mode.

    Databricks

    • On the workspace details view, you can now optionally specify the catalog where the source database is located. If you do not specify a catalog, then the default catalog is used.

    MongoDB

    • Fixed an issue where the collection statistics failed because the statistics object became too large.
    • Tonic now writes collection records in batches.
    • Tonic now continues to retrieve documents after a failure.

    Oracle

    • Removed the uniqueness check for individual columns that are part of a composite index.

    PostgreSQL

    • Fixed an issue where there was duplicated data from parent tables of inherited tables.
    • Improved performance for the query that retrieves tables and columns.

    Snowflake

    • Improved performance for data generation.
    • Updated how Tonic uploads files to Amazon S3 to reduce memory usage.

    SQL Server

    • Fixed DNS resolution for Kerberos.
    • Fixed an issue where Kerberos authentication failed with an error that the destination array was not long enough.
    v887 - v894
    June 30, 2023

    File connector

    The new file connector data connector allows you to use files from either Amazon S3, Google Cloud Storage, or a local file system as the source data. The file connector supports .csv, .json, and .xml files. Within a file connector workspace, you create file groups. Each file group contains files that have an identical format and structure. A file group is treated as a table for the purposes of table mode and generator configuration. The file connector is available with the Professional and Enterprise license plans.

    Generic OIDC SSO Support

    Tonic now supports authentication using a generic OpenID Connect (OIDC) provider.

    Tonic API versioning

    We have introduced a versioning scheme for the Tonic API. API versions are released approximately quarterly, with the version identifier in the format vYYYY.MM.P (Year.Month.Patch). The current release candidate (v.RC) contains API updates in progress.

    You should now specify an API version in your API requests. The System Status tab of the Admin Panel lists the latest available version. You can also select the version to use when you do not provide a version in the request. If you do not provide an API version in a request or select a default API version, then until January 31, 2024, Tonic automatically uses the latest version. After January 31, 2024, Tonic will return an error from the request.
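
    To make the version format and the fallback rule concrete, here is a small sketch. It is not Tonic's implementation. The function names are hypothetical, and the sketch assumes that identifiers look like v2023.07.0 and that "latest" means the highest (year, month, patch) tuple.

```python
import re
from datetime import date

# vYYYY.MM.P, for example v2023.07.0 (assumed concrete shape of the identifier).
VERSION_RE = re.compile(r"^v(\d{4})\.(\d{1,2})\.(\d+)$")
FALLBACK_ENDS = date(2024, 1, 31)  # last day of the automatic "latest" fallback


def parse_version(text):
    """Return (year, month, patch) for a vYYYY.MM.P identifier."""
    match = VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"not a vYYYY.MM.P version: {text!r}")
    return tuple(int(part) for part in match.groups())


def resolve_version(requested, default, available, today):
    """Pick the API version for a request, following the rules described above."""
    if requested:
        return requested          # version provided in the request
    if default:
        return default            # default version selected by an administrator
    if today <= FALLBACK_ENDS:
        return max(available, key=parse_version)  # latest available version
    raise ValueError("no API version provided and no default selected")
```

    For example, with no requested or default version, resolve_version(None, None, ["v2023.04.0", "v2023.07.1"], date(2023, 12, 1)) falls back to the latest available version, and the same call with a date after January 31, 2024 raises an error.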

    Other updates

    Fixed an issue where Tonic incorrectly returned the error "No Destination DB has been configured for this Workspace" for workspaces that used Preserve Destination.

    For subsetting Graph View, updated the default zoom level to allow users to see more of the graph.

    The Keycloak SSO provider now supports PKCE challenge.

    Fixed an issue where deep links did not work for SAML SSO.

    MongoDB

    • For MongoDB queries, Tonic can now use disk as well as memory.

    Oracle

    • Tonic now refreshes materialized views even when SKIP_CREATE_DB is set to true.

    PostgreSQL

    • In the job progress steps, fixed an issue that caused the number of rows in a table to display as -1.
    • When the Data Pipeline V2 processing is enabled, tables are now processed by size, with larger tables processed first.

    SQL Server

    • Fixed support for Kerberos authentication.

    Spark

    • Added support for the UUID Key generator to Livy and Databricks workspaces that use Spark 2.3.x and above.
    • Added support for the UUID Key generator to the Tonic Java SDK.
    v818 - v826
    May 12, 2023

    Fixed an issue where users could only select generators that supported uniqueness constraints for columns that were not unique.

    Fixed an issue where admin users who did not have edit permissions on any workspaces could not edit presets from the Generator Presets view.

    Improved data generation resiliency against transient failures.

    Removed erroneous error messages.

    To add AWS credentials to containers, you can now mount to ~/.aws/credentials.

    Improved error messaging for Table View.

    Fixed a display issue where the column configuration panel was too narrow and required horizontal scrolling.

    Exporting or copying a workspace no longer requires the workspace to have a valid source database connection.

    Reduced the amount of memory needed to run the Tonic web server.

    MongoDB

    • Improved handling of errors that involve invalid UUIDs.

    Oracle

    • Updated the required permissions for destination database connections. If SELECT ANY DICTIONARY or SELECT_CATALOG_ROLE cannot be granted, then Tonic can use a selection of ALL_ views (not recommended).
    • If TONIC_ORACLE_SKIP_CREATE_DB=true, then external tables are now excluded from the table list in Tonic. Tonic does not process those tables.

    PostgreSQL

    • Fixed an issue where the Data Pipeline V2 flow would hang.
    • Fixed an issue where extensions such as pgcrypto were not transferred when data generation included schema filtering.
    • Improved performance when handling constraints.

    Snowflake on AWS

    • As of V823, you can choose whether to use the Lambda process for data generation, which was previously the only option. By default, Snowflake on AWS uses a new, more resilient data generation process. You only need to use the Lambda data generation process for extremely large volumes of data (hundreds of gigabytes to terabytes).

      For existing workspaces, for versions before V826, the new default process is used. To use the Lambda data generation process, you must update your workspace configuration. As of V826, existing workspaces use the Lambda data generation process.
    • For the temporary CSV files used to retrieve and write source and destination data, you can now specify to use an external stage instead of an S3 bucket. The option to use an external stage is not available when you use the Lambda data generation process.
    • You can now specify different file storage locations for the temporary source and destination data files. In other words, you can have different S3 buckets or different external stages. Note that this option is not available when you use the Lambda data generation process.
    • For the new data generation process, fixed an issue where data generation jobs would hang instead of failing.

    Snowflake on Azure

    • Before it runs a data generation, Tonic now verifies that there is a valid value for the Azure Blob Storage account key, which is set as the value of the environment variable TONIC_AZURE_BLOB_STORAGE_ACCOUNT_KEY.
    • Fixed an issue where data generation jobs would hang instead of failing.
    v867 - v876
    June 16, 2023

    The new usage report summarizes the data processed for each table for data generation jobs. The report is a .csv file that you download from Tonic. To download the report, on the Admin Panel, click Download Usage Report.

    When certain sensitive loggers are enabled, Tonic now disables log collection.

    Fixed an issue on Database View where a column configuration panel would close unexpectedly.

    For the TONIC_ADMINISTRATORS environment variable, you can now specify the names of SSO groups to grant administrator privileges to. Previously you could only specify user email addresses.

    The Tonic SDK Javadoc now displays correctly.

    Restored the ability to import a workspace configuration from Workspaces view.

    MongoDB

    • Added support for the DBRef datatype.
    • Improved performance for collection management.

    PostgreSQL

    • For the beta Data Pipeline V2 data generation, adjusted the logging level for telemetry-related log messages to DEBUG.
    • For the beta Data Pipeline V2 processing, improved parallel processing for constraints that cross tables.
    • For the beta Data Pipeline V2 process, reduced the default maximum number of source and destination connections to 8. These are set as the values of the TONIC_JOBFLOW_MAX_SOURCE_CONNECTIONS and TONIC_JOBFLOW_MAX_DESTINATION_CONNECTIONS environment variables. We recommend that you set each value to the number of CPUs on the corresponding database.
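
    As an illustration of the recommendation above, the following sketch derives the two environment variable values from the CPU counts of the source and destination database servers. The helper function is hypothetical. How you apply the resulting values depends on your deployment.

```python
def jobflow_connection_env(source_cpus, destination_cpus):
    """Map database CPU counts to the two connection-limit environment variables."""
    if source_cpus < 1 or destination_cpus < 1:
        raise ValueError("CPU counts must be at least 1")
    return {
        "TONIC_JOBFLOW_MAX_SOURCE_CONNECTIONS": str(source_cpus),
        "TONIC_JOBFLOW_MAX_DESTINATION_CONNECTIONS": str(destination_cpus),
    }


# Example: a 16-CPU source database and a 4-CPU destination database.
for name, value in jobflow_connection_env(16, 4).items():
    print(f"{name}={value}")
```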

    Snowflake

    • Fixed an issue where preserve destination tables were removed during data generation.
    v856 - v866
    June 9, 2023

    The new FNR generator transforms Norwegian national identity numbers. The FNR generator was added in V857. It included options to specify a range of birthdates and preserve the indicated gender. In V866, the date range configuration options were removed. The destination values are now always within the same date range as the source values. The FNR generator can now also be used for columns that have uniqueness constraints. The final digits in the destination value are not a valid checksum.

    For the beta Data Pipeline V2 processing, fixed an issue where jobs would hang if they were canceled before the job started.

    Fixed an issue where the Foreign Keys view would freeze.

    Fixed an issue where when you typed @ to add a user mention to a comment, suggestions for the user did not display.

    Upgraded to use .NET 7.

    "Data science modeling" is changed to "data science mode".

    When you configure the SSH tunnel settings for a workspace, Tonic now obscures the SSH passphrase.

    MongoDB

    • Added a configuration to prevent Tonic from retrieving other information about collections when it retrieves a collection list. This addresses an issue where retrieving the collection list took a very long time. To disable the retrieval of other collection information, set the environment variable TONIC_MONGO_DISABLE_COLLECTION_INFORMATION_FETCHING to false.
    • Fixed an issue with collection scanning that caused application pages to not load.
    • Collection scans are now able to continue when the scan for an individual collection fails. The job logs include warnings for each failed collection.
    • For collection scans, for each schema Tonic now limits the number of documents to scan and the length of time for the scan.

    PostgreSQL

    • Tonic now uses the estimated row count from PostgreSQL statistics to determine the parallelism for a table. Customers should ensure that they have up-to-date statistics for their source tables, especially for large ones.

    Snowflake

    • For source tables that are assigned Preserve Destination mode, Tonic no longer attempts to add existing constraints to the destination tables.
    • Fixed a syntax error in the post-generation processing.
    • Fixed an issue where data generation failed with the error "Unable to determine AWS Region".
    • Fixed an issue where preserve destination tables were removed during data generation.

    SQL Server

    • During data generation, Tonic now warns users when a filegroup that exists in the source database does not exist in the destination database.