Tonic Textual release information

Learn about what’s in the latest Tonic.ai product releases.

v179
October 31, 2024

Textual can now redact images in .docx files.

v178
October 30, 2024

Fixed a rare issue where Azure OCR returned a 400 response when the file upload stream contained corrupted data.

Improved synthesis on days of the week and ordinal numbers that are flagged as DATE_TIME.

Textual now only disables a numeric span when it overlaps one of the following disabled types: DATE_TIME, DOB, LOCATION, LOCATION_ADDRESS, LOCATION_ZIP, MONEY, CREDIT_CARD, PHONE_NUMBER.
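
The overlap rule above can be illustrated with a minimal sketch. This is not Textual's implementation. The Span class, the half-open character offsets, the NUMERIC_VALUE label, and the keep_numeric_span helper are all hypothetical names introduced for illustration. Only the list of entity types comes from the release note.

```python
from dataclasses import dataclass
from typing import Iterable, Set

# Entity types listed in the release note. A numeric span is disabled only
# when it overlaps a disabled span of one of these types.
NUMERIC_BLOCKING_TYPES = {
    "DATE_TIME", "DOB", "LOCATION", "LOCATION_ADDRESS",
    "LOCATION_ZIP", "MONEY", "CREDIT_CARD", "PHONE_NUMBER",
}


@dataclass(frozen=True)
class Span:
    """Hypothetical detected span with half-open [start, end) offsets."""
    start: int
    end: int
    label: str


def overlaps(a: Span, b: Span) -> bool:
    """Two half-open ranges overlap when each starts before the other ends."""
    return a.start < b.end and b.start < a.end


def keep_numeric_span(numeric: Span, spans: Iterable[Span],
                      disabled_types: Set[str]) -> bool:
    """Return False only if `numeric` overlaps a span whose type is both
    disabled by the user and in the listed set of types."""
    for other in spans:
        if (other.label in disabled_types
                and other.label in NUMERIC_BLOCKING_TYPES
                and overlaps(numeric, other)):
            return False
    return True


# Text: "Paid 42 on 2024-10-30" (assumed offsets for illustration)
detected = [Span(11, 21, "DATE_TIME")]
year = Span(11, 15, "NUMERIC_VALUE")    # "2024", inside the date
amount = Span(5, 7, "NUMERIC_VALUE")    # "42", outside the date
print(keep_numeric_span(year, detected, {"DATE_TIME"}))
print(keep_numeric_span(amount, detected, {"DATE_TIME"}))
```

In this sketch, a numeric span that overlaps a disabled type outside the list, such as ORGANIZATION, is kept.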

v177
October 28, 2024

Textual now allows you to parse EML and MSG files.

v176
October 25, 2024

You can now use the Python SDK to configure Azure pipelines.

v175
October 25, 2024

Bug fixes and other internal updates.

v174
October 24, 2024

Bug fixes and other internal updates.

v173
October 24, 2024

You can now use the Python SDK to configure Amazon S3 pipelines.

v172
October 23, 2024

Amazon Textract can now be used to process dataset files.

v171
October 23, 2024

In the Python SDK, added parameters for pipeline creation, including the file location, the connection credentials, and whether to also generate redacted files.

v170
October 21, 2024

Improved the Textual NER model throughput on long strings that contain a large number of numeric characters.

Added the redact_html function to the SDK, which allows you to redact sensitive values from HTML strings.

v169
October 16, 2024

Improved detection of names and organizations.

Disabled auxiliary model detection of WORK_OF_ART.

v168
October 16, 2024

Improved the Textual NER model throughput on long strings that contain a large number of detected entities.

Added support for storing dataset files in a specified S3 bucket, instead of in the Textual application database.

When Textual replaces first name values, it now attempts to use a name with the same gender.

For the DOB (date of birth) entity type, you can now configure synthesis options. You can set how to shift the date.
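
As a generic illustration of date shifting, the following sketch moves a date of birth by a random number of days within a bounded range. The release note does not name the actual synthesis options, so the shift_dob function and its max_shift_days parameter are assumptions and do not reflect Textual's configuration.

```python
import random
from datetime import date, timedelta


def shift_dob(dob: date, max_shift_days: int, rng: random.Random) -> date:
    """Shift `dob` by a random whole number of days in
    [-max_shift_days, +max_shift_days]. Hypothetical helper."""
    if max_shift_days < 0:
        raise ValueError("max_shift_days must be non-negative")
    offset = rng.randint(-max_shift_days, max_shift_days)
    return dob + timedelta(days=offset)


# A seeded generator makes the shift repeatable across runs.
rng = random.Random(2024)
original = date(1985, 3, 14)
shifted = shift_dob(original, max_shift_days=30, rng=rng)
print(original, "->", shifted)
```

A bounded shift keeps the synthesized value close to the original age, which often matters more to downstream analysis than the exact day.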

v167
October 14, 2024

Bug fixes and other internal updates.

v166
October 11, 2024

Improved the synthesized values for the PERSON_AGE entity type.

v165
October 11, 2024

You can now configure the entity type handling for a dataset before you upload the dataset files.

You can now provide added and excluded entity values when you use the SDK to redact individual strings and files.

Added the redact_xml method to the SDK. It works similarly to redact_json: to produce redacted output, you pass in an XML string.

v164
October 9, 2024

Improved the pipeline UI to include better Python SDK code snippets.

v163
October 8, 2024

Improved the user experience when you load a large number of files to a dataset.

v162
October 7, 2024

Updated the UI for adding and excluding values for entity types. Changed the tab labels to Add to detection and Exclude from detection, and removed the requirement to click the edit icon for the entries.

v161
October 2, 2024

Added support in the SDK to create dataset include lists to define additional values for an entity type.

v160
September 29, 2024

Removed support for the en_core_web_trf and en_core_web_lg auxiliary models. Disabled model inference for the ORGANIZATION, PERSON, LOCATION, and MONEY entity types. Updated the auxiliary model configuration environment variables to have new default values:
TEXTUAL_AUX_MODEL_GPU: false
TEXTUAL_AUX_MODEL: en_core_web_sm

Fixed a redaction issue that was caused by a regression from v140.

Improved the Textual NER model, specifically for datetime values and electronic health records.

Fixed an issue so that files are correctly re-synthesized as part of pipelines.

When you call the dataset.add_file method in the Textual SDK, you can now pass in IO bytes.

You can now specify a list of additional values to include for each entity type in a dataset. This allows Textual to identify values that it might otherwise miss because they are specific to your organization or industry. The list can contain both specific values and regular expressions.
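
To show how a list that mixes specific values and regular expressions might be matched against text, here is a minimal sketch that uses only the Python standard library. It does not use the Textual SDK or reflect how Textual applies the list internally. The find_included_values function and the sample values are hypothetical.

```python
import re
from typing import Iterable, List, Tuple


def find_included_values(text: str, literals: Iterable[str],
                         patterns: Iterable[str]) -> List[Tuple[int, int, str]]:
    """Return sorted (start, end, matched_text) tuples for every literal
    value and every regular expression match found in `text`."""
    hits = set()
    for literal in literals:
        # Escape the literal so characters such as "." match themselves.
        for m in re.finditer(re.escape(literal), text):
            if m.end() > m.start():
                hits.add((m.start(), m.end(), m.group()))
    for pattern in patterns:
        for m in re.finditer(pattern, text):
            if m.end() > m.start():  # ignore zero-length matches
                hits.add((m.start(), m.end(), m.group()))
    return sorted(hits)


text = "Ticket ACME-1042 was opened by Project Falcon."
hits = find_included_values(
    text,
    literals=["Project Falcon"],   # organization-specific value
    patterns=[r"ACME-\d{4}"],      # internal ticket ID format
)
for start, end, value in hits:
    print(start, end, value)
```

Escaping the literal entries keeps a value such as "Dr. Smith" from being interpreted as a pattern, while the pattern entries are passed to the regular expression engine unchanged.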

Improved the file list display for datasets to better accommodate longer file names.

For an uploaded file pipeline, added a Back to Files breadcrumb to return the user to the main file list.

On the dataset details page, the bulk edit function for entity type handling is now a dropdown instead of separate buttons.

v159
September 25, 2024

You can now use the Python SDK to delete files from a dataset.

To improve performance, enabled date synthesis inference on GPU. Added the environment variable TEXTUAL_DATE_SYNTH_GPU to manage whether to use it.

Renamed the following environment variables:

  • SOLAR_PREFER_GPU to TEXTUAL_AUX_MODEL_GPU
  • SOLAR_AUX_MODEL to TEXTUAL_AUX_MODEL
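
If you maintain deployment configuration that still uses the old names, a small migration helper along these lines could rewrite it. This is a sketch under assumptions: the migrate_env function is hypothetical, and the choice to let an existing new-name value win over the old one is ours, not documented Textual behavior.

```python
from typing import Dict

# Old name -> new name, as listed in the release note.
RENAMED_VARS = {
    "SOLAR_PREFER_GPU": "TEXTUAL_AUX_MODEL_GPU",
    "SOLAR_AUX_MODEL": "TEXTUAL_AUX_MODEL",
}


def migrate_env(env: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `env` with the old variable names replaced by the
    new names. If the new name is already set, its value is kept."""
    migrated = dict(env)
    for old, new in RENAMED_VARS.items():
        if old in migrated:
            value = migrated.pop(old)
            migrated.setdefault(new, value)
    return migrated


old_config = {"SOLAR_PREFER_GPU": "true", "SOLAR_AUX_MODEL": "en_core_web_sm"}
print(migrate_env(old_config))
```

The helper returns a new dictionary and leaves its input unchanged, so you can compare the two before you write the result back.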

v158
September 24, 2024

Bug fixes and other internal updates.

v157
September 23, 2024

Bug fixes and other internal updates.

v156
September 23, 2024

Bug fixes and other internal updates.

v155
September 20, 2024

Bug fixes and other internal updates.