File statistics for pipelines - The pipeline details page now displays a summary of information about the pipeline's files, including the number of files, the number of words in the files, the number of detected entity types, and the number of detected topics. For entity types, the display includes the number of detected values for each type. For topics, the display includes the number of files that involve each topic.
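As a rough illustration of how the summary figures relate to one another, the following sketch computes them from a small in-memory sample. The record layout, the summarize function, and the sample data are hypothetical and are not part of Textual. The sketch also assumes that words are whitespace-separated tokens and that each occurrence of a value counts toward its entity type.

```python
from collections import Counter

# Hypothetical per-file records; this is not Textual's actual data model.
files = [
    {
        "text": "Call Ana Ruiz at Acme tomorrow.",
        "entities": [("NAME_GIVEN", "Ana"), ("NAME_FAMILY", "Ruiz"), ("ORGANIZATION", "Acme")],
        "topics": ["scheduling"],
    },
    {
        "text": "Ana approved the Acme invoice.",
        "entities": [("NAME_GIVEN", "Ana"), ("ORGANIZATION", "Acme")],
        "topics": ["billing", "scheduling"],
    },
]


def summarize(records):
    """Build a summary similar in shape to the one described above."""
    # Assumption: every detected occurrence counts toward its entity type.
    values_per_type = Counter(
        entity_type for record in records for entity_type, _ in record["entities"]
    )
    # A file counts once per topic, even if the topic is listed twice.
    files_per_topic = Counter(
        topic for record in records for topic in set(record["topics"])
    )
    return {
        "file_count": len(records),
        # Assumption: a word is any whitespace-separated token.
        "word_count": sum(len(record["text"].split()) for record in records),
        "entity_type_count": len(values_per_type),
        "values_per_type": dict(values_per_type),
        "topic_count": len(files_per_topic),
        "files_per_topic": dict(files_per_topic),
    }


summary = summarize(files)
```

The per-type and per-topic counts are kept as separate mappings because they answer different questions: the first counts values, the second counts files.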
On the dataset details page, the preview count for each entity type now reflects the count of values that are assigned that type in the output files. Previously, values that matched multiple entity types were included in the preview count for all of the matching types.
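The difference between the previous and current counting behavior can be sketched as follows. The detection records, the field names matched and assigned, and both functions are hypothetical and only illustrate the counting rule. They do not show how Textual stores or resolves detections.

```python
from collections import Counter

# Hypothetical detections. "matched" lists every entity type that the value
# matched. "assigned" is the single type used in the output files.
detections = [
    {"value": "Jordan", "matched": ["NAME_GIVEN", "LOCATION_CITY"], "assigned": "NAME_GIVEN"},
    {"value": "Paris", "matched": ["LOCATION_CITY", "NAME_GIVEN"], "assigned": "LOCATION_CITY"},
    {"value": "Acme", "matched": ["ORGANIZATION"], "assigned": "ORGANIZATION"},
]


def previous_counts(items):
    """Previous behavior: a value counts toward every type that it matched."""
    counts = Counter()
    for item in items:
        counts.update(item["matched"])
    return counts


def current_counts(items):
    """Current behavior: a value counts only toward its assigned type."""
    return Counter(item["assigned"] for item in items)


before = previous_counts(detections)
after = current_counts(detections)
```

In this sample, the previous rule reports five values across the three types even though only three values exist. The current rule reports exactly three, so the preview counts add up to the number of values in the output.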
New Textual Home page - The Textual Home page now contains an updated version of the Playground, where you can see how Textual detects and replaces entity values in text. There is no longer a separate Playground page, and there is no LLM Synthesis option. For each entity type, you can configure handling options and added or excluded values. Textual generates Python and cURL versions of the request that you can copy.
From the Request Explorer, in addition to testing added and excluded values, you can now also select the handling type for each entity type. The Unified toggle is replaced with options to view either the original values with their corresponding types (Identification) or the actual output with the replacement values (Replacement).
Edit and replay recorded requests - When you use the Request Explorer to preview a recorded redaction request from the SDK, you can now edit the request to add and exclude entity values. You can then re-run the redaction and view the differences between the original request and the edited request.
When you configure excluded values for an entity type, you can block detection of that type within a matching phrase. For example, if you add the phrase "one moment, please" as an excluded value for numeric values, the word "one" is not detected as a numeric value in that specific context.
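The effect of a phrase-level exclusion can be sketched with a toy detector. The regular expression, the detect_numeric function, and the sample text below are hypothetical stand-ins and do not represent Textual's detection models. The sketch only shows the idea that a match is dropped when it falls inside an excluded phrase and is kept everywhere else.

```python
import re

# Toy numeric detector: a few number words plus digit runs (an assumption,
# far simpler than a real NER model).
NUMBER_PATTERN = re.compile(
    r"\b(one|two|three|four|five|six|seven|eight|nine|ten|\d+)\b",
    re.IGNORECASE,
)


def detect_numeric(text, excluded_phrases=()):
    """Return numeric matches that do not fall inside an excluded phrase."""
    # Record the character span of every occurrence of every excluded phrase.
    blocked = []
    for phrase in excluded_phrases:
        for match in re.finditer(re.escape(phrase), text, re.IGNORECASE):
            blocked.append((match.start(), match.end()))

    kept = []
    for match in NUMBER_PATTERN.finditer(text):
        inside_blocked = any(
            start <= match.start() and match.end() <= end
            for start, end in blocked
        )
        if not inside_blocked:
            kept.append(match.group(0))
    return kept


text = "One moment, please. I need one ticket for 2 people."
without_exclusion = detect_numeric(text)
with_exclusion = detect_numeric(text, excluded_phrases=["one moment, please"])
```

With the exclusion in place, the "One" inside the phrase is skipped, while the later "one" and "2" are still reported because they appear outside the excluded phrase.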
Fixed an issue that prevented users from saving the dataset configuration for .docx comments.
Record and view redaction requests - When you make a redact call to redact a plain text string, you can now record the request. You specify the amount of time to keep the recording, and any tags to assign to the request. From the new API Explorer page, you can then view and analyze the recorded redaction requests, to assess the quality of the redaction.
New redaction options for datasets - The settings panel for datasets now includes additional configuration options:
Added the Tonic NER model version to the model information. The API endpoint /api/environment/models reports version strings for NER models.
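A minimal sketch of calling this endpoint with the Python standard library follows. Only the path /api/environment/models comes from the note above. The base URL, the use of an Authorization header that carries an API key, and the helper names build_models_request and fetch_models are assumptions. The response is returned as parsed JSON without assuming its shape.

```python
import json
import urllib.request

MODELS_PATH = "/api/environment/models"  # path taken from the release note


def build_models_request(base_url, api_key):
    """Build a GET request for the model information endpoint.

    Assumption: the API key is sent in the Authorization header.
    """
    url = base_url.rstrip("/") + MODELS_PATH
    return urllib.request.Request(
        url,
        headers={"Authorization": api_key},
        method="GET",
    )


def fetch_models(base_url, api_key):
    """Send the request and return the parsed JSON body (shape not assumed)."""
    request = build_models_request(base_url, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Building the request does not contact the server.
example_request = build_models_request("https://textual.example.com/", "YOUR_API_KEY")
```

To run it against a real instance, call fetch_models with your own base URL and API key, then inspect the returned structure for the NER model version strings.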
Entity manager for entity types - The new entity manager allows you to view all of the occurrences of each entity type in a dataset. It displays the original value, the context in the original file, and the context in the transformed file. To view the entity manager, from the entity value preview list, click Open Entities Manager. Note that by default, for the NUMERIC_VALUE entity type, Textual only provides context information for the first 20 occurrences. To change this, set the SOLAR_NER_OCCURRENCE_IGNORE_NUMERIC_VALUE environment variable to false.