Tonic Structural release information

Learn about what’s in the latest Tonic.ai product releases.

v1260
September 9, 2024

File connector - You can now specify a separate IAM role for the output location.

v1259
September 6, 2024

Create sensitivity rule from Database View - When you select the Database View bulk edit option for columns that have the same data type, do not have an assigned generator, and do not have a recommended generator, you now have the option to create a custom sensitivity rule. You can then immediately run a new sensitivity scan to catch matching columns.

Fixed an issue with the JSON Mask generator configuration panel where the example data did not update correctly.

Structural now displays a warning when the pre-job checks determine that the source database is on a newer major version than the destination database.
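
As a rough illustration of the kind of comparison such a pre-job check might make, the sketch below compares the leading major number of two version strings. The function names are hypothetical, and it assumes that version strings begin with an integer major component (for example "16.2"); it is not Structural's actual implementation.

```python
def major_version(version: str) -> int:
    """Return the leading integer of a dotted version string such as '16.2'."""
    return int(version.split(".")[0])


def source_is_newer_major(source_version: str, destination_version: str) -> bool:
    """True when the source database is on a newer major version than the destination."""
    return major_version(source_version) > major_version(destination_version)


# Hypothetical usage: warn before the job starts.
if source_is_newer_major("16.2", "15.4"):
    print("Warning: source database is on a newer major version than the destination.")
```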

Salesforce - Rewrote the connector algorithm to avoid using sentinels, and to improve subset creation.

v1253
August 27, 2024

The detector for city names now ignores misleading values that are not city names.

v1258
September 6, 2024

Snowflake - Fixed a regression introduced in v1213 that limited table parallelism for all data generations that use the V2 pipeline.

v1257
September 4, 2024

Oracle - Fixed an issue where for subsetting data generation, the Maximum Character Limit was not calculated properly.

v1255
August 29, 2024

Snowflake

  • Fixed an issue where after any row failed to parse, Structural did not process the rest of an unloaded table file.
  • For AWS, added retry support to address some transient issues with Amazon S3.

v1254
August 28, 2024

Bug fixes and other internal updates.

v1252
August 26, 2024

Bug fixes and other internal updates.

v1251
August 26, 2024

Bug fixes and other internal updates.

v1244 - v1250
August 23, 2024

Updated the application to reflect the rename to Tonic Structural. This includes renaming the Tonic Settings view to Structural Settings.

From the Access Management tab of Structural Settings, users with permission to manage Structural access can now restore deleted users.

For a column that is part of a unique compound index, Structural now only suggests generators that can be used for unique columns.

Structural now detects SWIFT codes based on the format of the data in addition to the column name.
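
To make format-based detection concrete, here is a minimal sketch that tests values against a commonly used pattern for the 8- or 11-character SWIFT/BIC layout (4 letters, a 2-letter country code, a 2-character location code, and an optional 3-character branch code). The pattern, the function names, and the 90% threshold are assumptions for illustration; the release note does not describe the detector that Structural actually uses.

```python
import re

# Commonly used pattern for the 8- or 11-character BIC layout (an assumption,
# not Structural's documented detector).
SWIFT_PATTERN = re.compile(r"[A-Z]{4}[A-Z]{2}[A-Z0-9]{2}(?:[A-Z0-9]{3})?")


def looks_like_swift_code(value: str) -> bool:
    """Check a single value against the BIC layout, ignoring case and padding."""
    return SWIFT_PATTERN.fullmatch(value.strip().upper()) is not None


def column_looks_like_swift(values, threshold=0.9):
    """Flag a column when most non-empty sampled values match the layout."""
    non_empty = [v for v in values if v and v.strip()]
    if not non_empty:
        return False
    hits = sum(looks_like_swift_code(v) for v in non_empty)
    return hits / len(non_empty) >= threshold
```

Checking the data as well as the column name lets a column with an uninformative name still be flagged when its values fit the layout.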

Fixed an issue where all subsetting WHERE clauses failed internally.

Databricks

  • On self-hosted instances, you can now configure whether Structural creates the destination database schema for Databricks tables. The environment setting TONIC_DATABRICKS_SKIP_CREATE_DB indicates whether to skip the schema creation. The default is false. The environment setting TONIC_DATABRICKS_ENABLE_WORKSPACE_SKIP_CREATE_DB indicates whether to include the option in the workspace configuration. When the option is included, TONIC_DATABRICKS_SKIP_CREATE_DB determines its default value. The default for TONIC_DATABRICKS_ENABLE_WORKSPACE_SKIP_CREATE_DB is true. You can add these settings to the Environment Settings list on Structural Settings.
  • When a cluster is resizing, and the workspace is configured with Use Databricks Job Cluster turned off, Structural data generation now waits for the resizing to complete.
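
One way to read how the two Databricks schema-creation settings interact is sketched below. The helper names, the string parsing of "true" and "false", and the fallback to the instance-wide setting when the workspace option is hidden are all assumptions made for illustration, not documented behavior.

```python
def parse_bool(value, default):
    """Interpret an environment setting value; fall back to the default when unset."""
    if value is None:
        return default
    return value.strip().lower() == "true"


def effective_skip_create_db(env, workspace_choice=None):
    """Decide whether schema creation is skipped for one workspace.

    env: mapping of environment setting names to string values.
    workspace_choice: the per-workspace option, or None if the user did not set it.
    """
    instance_skip = parse_bool(env.get("TONIC_DATABRICKS_SKIP_CREATE_DB"), default=False)
    workspace_option_enabled = parse_bool(
        env.get("TONIC_DATABRICKS_ENABLE_WORKSPACE_SKIP_CREATE_DB"), default=True
    )
    # Assumption: the workspace option overrides the instance setting only
    # when the option is exposed in the workspace configuration.
    if workspace_option_enabled and workspace_choice is not None:
        return workspace_choice
    return instance_skip
```

With neither setting configured, this sketch returns False, which matches the documented default of creating the schema.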

File connector

  • Fixed an issue where a copied local files workspace that contained Avro or Parquet files could not be loaded.
  • Structural now supports authentication to Amazon S3 buckets using assumed roles for cross-account access.
  • Fixed an issue where file connector workspaces did not load when they used an Amazon S3 source with environment-based authentication.

Snowflake

  • Fixed a rare subsetting issue where processing a large table could cause the job to hang.
  • Fixed an issue where Structural did not retry transient failures when it read unloaded files from a cloud storage provider.

v1223 - v1227
July 26, 2024

Self-hosted instances can now schedule sensitivity scans to run automatically on a weekly basis. By default, the weekly scans are enabled and run each Sunday at midnight.

Structural can now detect the following additional sensitivity types:

  • Money Amount
  • Usernames

File connector

  • Fixed a regression that made Table View unusable.

Oracle

  • Fixed an issue that caused data generation to fail when subsetting with a date-based upstream table filter.

Salesforce

  • Resolved issues where Salesforce workspaces became unrecoverable.
  • Fixed issues with the use of session refresh tokens.

PostgreSQL

  • Preserve Destination and Incremental tables no longer drop columns that reference user-defined types (UDTs) or extensions.
  • Added support for the pgvector vector data type as a non-replaceable type that is always passed through to the destination database.

Snowflake on AWS

  • Improved performance when copying passthrough tables when a workspace uses separate source and destination S3 buckets.

v1239 - v1243
August 16, 2024

The scheduled sensitivity scans are now daily instead of weekly. By default, the scans run every day at midnight. Structural scans the 10 workspaces that have the most recent activity. Activity is defined as either a user-initiated workspace event that is added to the Protection Audit Trail, or a data generation job.
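
The selection rule described above (the 10 workspaces with the most recent activity) can be sketched as a sort on a last-activity timestamp. The function name and the input shape, a mapping from workspace ID to the time of its latest qualifying event or data generation job, are hypothetical.

```python
from datetime import datetime, timedelta


def workspaces_to_scan(last_activity, limit=10):
    """Return the IDs of the most recently active workspaces, newest first."""
    ranked = sorted(last_activity.items(), key=lambda item: item[1], reverse=True)
    return [workspace_id for workspace_id, _ in ranked[:limit]]


# Hypothetical usage with 12 workspaces, each active one day earlier than the last.
now = datetime(2024, 8, 16)
activity = {f"workspace-{i}": now - timedelta(days=i) for i in range(12)}
selected = workspaces_to_scan(activity)
```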

On the details view for custom sensitivity rules, fixed an issue where the Edit Current Preset button was always disabled.

When a generation to Ephemeral fails, Structural job logs now include the Ephemeral logs and destination database pod logs.

For users who do not have permission to manage sensitivity rules, the Sensitivity Rules option now displays in a disabled state.

When you configure a workspace to write to a self-hosted Ephemeral instance, or to write to Ephemeral Cloud from a self-hosted Structural instance, the workspace configuration now includes an option to test the Ephemeral connection.

v1218 - v1222
July 19, 2024

Structural can now detect the following additional sensitivity types that are defined by the HIPAA Safe Harbor method:

  • Medical record numbers
  • Health plan beneficiary numbers
  • Account numbers
  • Certificate and license numbers
  • Web URLs
  • Full face photographic images and similar images
  • Biometric identifiers, including finger and voice prints

Removed the environment setting TONIC_SUBSETTING_CYCLE_BREAK_GREEDY_ALGORITHM. The greedy algorithm to compute the required cycle breaks for subsetting is no longer available.

Snowflake

  • A new environment setting allows you to control whether Structural creates the destination database schema before it populates the destination data. By default, TONIC_SNOWFLAKE_SKIP_CREATE_DB is false, meaning that Structural creates the destination database and schema. If you set this to true, then Structural does not create the schema. You must create the destination database with the full schema. You can add TONIC_SNOWFLAKE_SKIP_CREATE_DB to the Environment Settings list on Tonic Settings.
  • Improved performance when writing to Snowflake for de-identified tables and some Passthrough tables.

v1234 - v1238
August 9, 2024

Yugabyte data connector - Structural now allows you to connect to databases on Yugabyte version 2024.1 and above. The Yugabyte data connector is available with a Professional or Enterprise license. It only supports Yugabyte SQL (YSQL).

When you configure a custom sensitivity rule, you can now create or edit the assigned generator preset. You can also use a workspace to preview the sensitivity rule results. The preview displays the matching columns for the selected workspace.

Structural can now detect the following additional sensitivity types:

  • US driver’s license number
  • Passport number
  • Marital status
  • GPS coordinates
  • Non-birthday dates: admission date, discharge date, date of death
  • US license plate

MySQL

  • When the SQL mode ALLOW_INVALID_DATES is set, Structural now allows Passthrough for columns that contain invalid dates.

Snowflake

  • For the Data Pipeline V2 process, reduced the frequency of polling cloud platforms for new unloaded files.

v1228 - v1233
August 2, 2024

Fixed an issue on the webhook configuration panel where users could not click Save when the Message Body tab contained large property values.

Fixed an issue that caused the Notifications service to stop processing webhooks.

Improved the detection of name values to identify more specific types of names.

Amazon EMR

  • Fixed an issue with previewing data after applying the Timestamp Shift generator or Date Truncation generator to timestamp columns with timezones.

v1208 - v1210
July 5, 2024

When Structural detects a state abbreviation, it no longer identifies it as a full state name.

During a sensitivity scan, the value finders now look more holistically at both the data and the column name instead of assessing them individually.

v1211 - v1217
July 12, 2024

For post-job webhook URLs, you cannot use URLs that resolve to a private IPv4 range.
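
A minimal sketch of this kind of check, using Python's standard ipaddress module, follows. It assumes that "private IPv4 range" means the three RFC 1918 blocks; the function names and the injectable resolver are hypothetical, and the exact ranges that Structural rejects are not stated in this note.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Assumption: "private" means the RFC 1918 address blocks.
PRIVATE_IPV4_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_private_ipv4(address: str) -> bool:
    """True when the string is an IPv4 address inside an RFC 1918 block."""
    try:
        ip = ipaddress.IPv4Address(address)
    except ipaddress.AddressValueError:
        return False
    return any(ip in network for network in PRIVATE_IPV4_NETWORKS)


def resolve_ipv4(host: str):
    """Resolve a host name to its IPv4 addresses using the system resolver."""
    infos = socket.getaddrinfo(host, None, family=socket.AF_INET)
    return [info[4][0] for info in infos]


def webhook_url_allowed(url: str, resolve=resolve_ipv4) -> bool:
    """Reject URLs with no host or whose host resolves to a private IPv4 address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    return not any(is_private_ipv4(address) for address in resolve(host))
```

Checking the resolved addresses rather than the URL text matters because a public-looking host name can still point at an internal address.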

To provide the column name matching criteria for custom sensitivity rules, you can now use a regular expression.
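
For example, a regular expression can match several naming variants of the same column with one rule. The sketch below uses Python's re module to show the idea; the function name, the case-insensitive matching, and the sample pattern are assumptions, and the regular expression dialect that Structural supports is not specified here.

```python
import re


def matching_columns(column_names, pattern):
    """Return the column names that match a rule's regular expression."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [name for name in column_names if regex.search(name)]


# Hypothetical rule: match "customer_id", "cust_id", "custid", and so on,
# but not longer names that merely start the same way.
columns = ["customer_id", "CustID", "order_id", "customer_identifier"]
matches = matching_columns(columns, r"^(cust|customer)_?id$")
```

Anchoring the pattern with ^ and $ keeps the rule from matching on substrings of longer column names.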

The Structural sensitivity scan can now detect UK and Canada postal codes.

You can now use the Structural API to manage custom sensitivity rules.

When you configure a PostgreSQL or MySQL workspace to write the destination data to a container repository, you can now specify the name of the database.

Fixed an issue where the column sensitivity type was not updated when a later sensitivity scan detected a different type. Columns that are manually marked as sensitive are not affected.

Increased the number of column names that Structural uses to detect sensitivity types.

Amazon EMR

  • Iceberg now correctly handles schemas that contain capitalized column names.

Salesforce

  • The Continuous generator is now available.
  • The Algebraic generator is now available.
  • You can now use WHERE clauses in subsetting target table configuration.

v1202 - v1203
June 21, 2024

When configuring a workspace to write output to an Ephemeral snapshot, you can now optionally configure the compute resources. By default, the resources are based on the size of the source database.

v1204 - v1207
June 28, 2024

Custom sensitivity rules - On self-hosted Enterprise instances, you can now configure custom sensitivity rules, which allow you to create your own sensitivity types. For each rule, you configure the general data type, text matching rules for the column name, and the recommended generator. Structural uses these rules during the sensitivity scan. Matching columns are included on the Recommended Generators by Sensitivity Type panel.

Toleration configuration for output to container repositories - Self-hosted customers who write output to a container repository can now set pod tolerations to enable pods to be scheduled on nodes that have taints. The tolerations are configured in environment settings. You can add these settings to the Environment Settings list on Tonic Settings.

MySQL

  • Fixed an issue with subsetting for instances of MySQL that were deployed using Amazon Aurora where downstream tables were not populated properly.

v1193 - v1197
June 7, 2024

Fixed an issue where sensitivity scans suggested generators based on substrings within a column name.

HTML is now removed from text in comment fields.

Fixed an issue where the XML Path generator did not work correctly.

A new environment setting, TONIC_SUBSETTING_CYCLE_BREAK_GREEDY_ALGORITHM, indicates whether to use a new, faster greedy algorithm to compute the required cycle breaks for subsetting. By default, the setting is false.

File connector

  • Fixed an issue with uploading .txt files for local file workspaces.
  • For CSV file groups, added an option to specify the encoding format of all files. If not specified, Structural attempts to detect the encoding automatically. When encoding cannot be determined, the automatic encoding detection now defaults to UTF-8 instead of windows-1252.
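
The encoding behavior for CSV file groups can be pictured as: use the configured encoding when one is given, otherwise try to detect it, and fall back to UTF-8 when detection is inconclusive. The sketch below is only an illustration of that order. Its "detection" step looks solely at byte order marks, which is a simplifying assumption, and the function name is hypothetical.

```python
def decode_csv_bytes(data: bytes, encoding=None) -> str:
    """Decode CSV content using an explicit encoding or a UTF-8 fallback."""
    if encoding is not None:
        # An encoding configured for the file group always wins.
        return data.decode(encoding)
    # Simplified detection: only byte order marks are recognized here.
    boms = (
        (b"\xef\xbb\xbf", "utf-8-sig"),
        (b"\xff\xfe", "utf-16"),
        (b"\xfe\xff", "utf-16"),
    )
    for bom, codec in boms:
        if data.startswith(bom):
            return data.decode(codec)
    # Detection was inconclusive: default to UTF-8 rather than windows-1252.
    return data.decode("utf-8", errors="replace")
```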

MongoDB

  • Added an API endpoint to retrieve all of the field paths in a database.

MySQL

  • Writing output to a container repository now works with multiple database schemas.
  • Improved resilience to transient issues when copying tables.
  • Fixed an issue with delayed retries of failed file uploads during data generation.

Oracle

  • You can now write output data from an Oracle workspace to a Tonic Ephemeral snapshot.

Salesforce

  • You can now provide connected app credentials in the workspace configuration. These fields are only displayed if the credentials are not configured in the TONIC_SALESFORCE_CONSUMER_KEY and TONIC_SALESFORCE_CONSUMER_SECRET environment settings.

Snowflake

  • Fixed a regression where ALTER statements were inappropriately run through the GetDdl flow.

v1198 - v1201
June 14, 2024

Sensitivity scans now detect name values more accurately.

Fixed an issue with certificate uploads for database settings.

Fixed an issue where the Structural application would hang after you created a workspace.

Shared logs are now transferred to an HTTPS endpoint instead of an Amazon S3 endpoint.

Amazon EMR

  • Fixed an issue where Table View filters and table filters reported all WHERE clauses as invalid.

Amazon Redshift

  • You can now configure a workspace to include or exclude specific schemas.

PostgreSQL

  • When you choose to write output to an Ephemeral snapshot, you can now provide a custom configuration file.

v1190 - v1192
May 31, 2024

Improved the accuracy of name detection.

MongoDB

  • On Collection View, for hybrid view, added a Filters panel. For single view, you can now filter fields by value.

Oracle

  • The Data Pipeline V2 process for Oracle now supports subsetting.

v1183 - v1189
May 24, 2024

Salesforce data connector - The Salesforce data connector is now available for self-hosted instances that have a Professional or Enterprise license. It is currently only available by request. To request access to the Salesforce data connector, contact Tonic.ai support.

Linking address columns for recommended generators - The recommended generators panel in Privacy Hub now indicates when address columns should be linked. The columns are displayed in groups. You then apply the recommended generators to all of the columns in the group, and the columns are automatically linked.

Other updates

Fixed an issue with subsetting. When processing upstream tables with nullable foreign keys that had no referenced key values to process, upstream filters were not applied.

Improved performance of the Conditional generator when using the IS IN operator.

The upsert option for workspaces is now out of beta.

Fixed an issue where the number of generators that are slow to compute was calculated incorrectly, which affected how Structural parallelized the generator processing.

The default value for the environment setting TONIC_ORACLE_DBLINK_ENABLED is changed to false. The plan is to eventually remove the feature.

Fixed an issue where the TONIC_DISABLE_IPV6 setting did not completely prevent services from binding to IPv6 addresses.

When applied to a numeric type column, the SSN generator now by default generates values without hyphens.

Amazon EMR

  • Added support for the Iceberg framework in Glue. If you use the Iceberg framework, then in the Spark Configuration section of the workspace details, make sure to add the following configurations: spark.sql.catalog.glue_catalog, .warehouse, .catalog-impl, and .io-impl.
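
For reference, the four Spark properties named above, written out with their full keys, might look like the following. The class names are the values that the Apache Iceberg AWS integration documentation typically shows for a Glue catalog. The bucket path is a placeholder and the helper function is hypothetical, so confirm the values for your own environment.

```python
# Full property keys for a catalog named "glue_catalog".
ICEBERG_GLUE_SPARK_CONF = {
    "spark.sql.catalog.glue_catalog": "org.apache.iceberg.spark.SparkCatalog",
    # Placeholder location; replace with your own warehouse path.
    "spark.sql.catalog.glue_catalog.warehouse": "s3://example-bucket/warehouse/",
    "spark.sql.catalog.glue_catalog.catalog-impl": "org.apache.iceberg.aws.glue.GlueCatalog",
    "spark.sql.catalog.glue_catalog.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
}


def as_conf_lines(conf):
    """Render the properties as key=value lines for pasting into a configuration field."""
    return [f"{key}={value}" for key, value in conf.items()]


conf_lines = as_conf_lines(ICEBERG_GLUE_SPARK_CONF)
```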

File connector

  • Improved resilience to missing cloud files when configuring file groups, previewing data, and running generation jobs.

MongoDB

  • Fixed an issue with creating virtual foreign keys in MongoDB workspaces.

MySQL

  • Improved the performance of the destination database teardown step for databases that have many partitions.

Oracle

  • You can now use the Data Pipeline V2 process to run data generation for an Oracle workspace. The Confirm Generation panel includes a toggle to enable or disable the new process. Note that you cannot use the new process when subsetting is enabled.

Snowflake

  • Fixed a subsetting issue where extra downstream rows were included when no primary keys existed in the table relationship.

v1024 - v1026
November 10, 2023

API endpoints for subset configuration - The Tonic API now includes endpoints for subsetting configuration. You can use the endpoints to retrieve the subsetting configurations for a workspace, update subsetting configuration, and remove subsetting configuration. A subsetting configuration identifies a table as either a target table (percentage or WHERE clause) or a lookup table.

Improved how Tonic identifies values as names, to reduce false positives.

For upsert data generation, fixed an issue that caused failures on tables that contain foreign keys but no primary keys.

File connector

  • Sensitivity scans no longer discard the most recent set of downloadable output files that were generated from files uploaded from a local file system.

MongoDB

  • Fixed an issue that caused duplicate or inaccurate schema issues to be displayed on Schema Changes view.

Oracle

  • Tonic now validates destination database objects, such as views and packages, that were invalidated by the data generation. A warning is issued for objects that fail to validate. Validation failure can be caused by insufficient permissions for the Tonic user on the destination schema.

PostgreSQL

  • For Data Pipeline V2 data generation, fixed a race condition that could cause data generation to fail.

v1178 - v1182
May 17, 2024

Helm charts for deploying Structural to Kubernetes are now published at quay.io/tonicai/structural in addition to GitHub.

From the recommended generators panel on Privacy Hub, you can now enable or disable self-consistency for all columns within a sensitivity category.

Fixed an issue in Table View that sometimes caused the column order to be incorrect.

Added an environment variable TONIC_DISABLE_IPV6 to the PyML container. When set to true, the container no longer listens on IPv6 addresses.

File connector

  • Added support for Avro files.

MySQL

  • Fixed an issue with validating subsetting target table WHERE clauses when the table or schema name contained special characters.

PostgreSQL

  • Added limited support for ltree columns on versions older than 1.2. For tables where all columns are assigned the Passthrough generator, Structural copies the ltree data from the source database to the destination database. In tables that are de-identified, ltree columns are nullified in the destination database. If an ltree column is not nullable, then all of the columns in the table must be assigned the Passthrough generator.
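
The ltree rules above reduce to a small decision table, sketched here. The function name and the returned labels are hypothetical and only restate the behavior described in this note.

```python
def ltree_outcome(all_columns_passthrough: bool, ltree_nullable: bool) -> str:
    """Describe what happens to an ltree column under the limited support."""
    if all_columns_passthrough:
        # Every column in the table uses Passthrough: ltree data is copied as-is.
        return "copied"
    if ltree_nullable:
        # The table is de-identified and the ltree column accepts NULL.
        return "nullified"
    # A non-nullable ltree column requires Passthrough on every column in the table.
    return "requires passthrough on all columns"
```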

Snowflake

  • Added support for date and time columns that have a seconds precision of 0.
  • Fixed an issue where incorrect credentials were used for workspaces that use separate source and destination buckets and credentials for Amazon S3.

SQL Server

  • Fixed an issue introduced in v1178 where XML columns might be persisted as nvarchar columns in the destination database.