Tonic Document Masker - Find and Replace PII in any document
Adam Kamor, PhD
February 11, 2019
A few days ago we made the Tonic Document Masker public, and it’s been fun watching people react to it. Let’s explain a little bit about how it works.
The Tonic Document Masker lets you find and replace PII in any document. Paste in a piece of text containing PII, and Tonic will parse it, find the PII, and replace each piece with random text, random names, or other context-specific ‘fake’ text.
For example, the following sentence will be mapped from:
Note: Detected PII is being shown as bolded text.
As you can see, the Document Masker correctly detected every piece of PII in the above sentence: a name, a birth date, and a location. It can also detect roughly 20 other types of PII, including medical codes (ICD-10), social security numbers, and ethnicities.
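To give a feel for what detecting structured PII can look like, here is a minimal sketch using regular expressions. It is not how the Document Masker works internally (this post doesn't describe that), and the `PII_PATTERNS` table, the `find_pii` helper, and the sample sentence are all made up for illustration. Names and locations generally need more than a regex, so this only covers two rigid formats.

```python
import re

# Hypothetical patterns for illustration only; a real detector
# would need to handle many more formats and PII types.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}


def find_pii(text):
    """Return (start, end, kind, matched_text) tuples sorted by position."""
    hits = []
    for kind, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), kind, match.group()))
    return sorted(hits)


if __name__ == "__main__":
    sample = "Patient born 3/14/1979, SSN 123-45-6789."
    for start, end, kind, value in find_pii(sample):
        print(f"{kind}: {value!r} at [{start}, {end})")
```

Returning character offsets along with the match makes it straightforward to bold or replace each hit in the original text afterwards.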
The Document Masker has a few tricks up its sleeve. If a document contains the same name in multiple places, its replacement is consistent throughout. For example, given the text
The Document Masker will create
You can see how Albert Einstein is consistently mapped to the fake name ‘Genny Jakobsen’ while Isaac Newton is mapped to a different name ‘Kirstin Hurd’.
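One simple way to get this kind of consistency is to remember each real name's replacement the first time it is seen and reuse it afterwards. The sketch below assumes the names have already been detected; the `ConsistentMasker` class, the `FAKE_NAMES` pool, and the sample sentence are hypothetical and are not Tonic's actual implementation.

```python
import random
import re

# Hypothetical pool of replacement names, for illustration only.
FAKE_NAMES = ["Genny Jakobsen", "Kirstin Hurd", "Alex Moreno", "Sam Whitfield"]


class ConsistentMasker:
    """Maps each real name to one fake name and reuses that mapping."""

    def __init__(self, fake_names, seed=None):
        self._pool = list(fake_names)
        random.Random(seed).shuffle(self._pool)
        self._mapping = {}

    def fake_for(self, real_name):
        # Assign a fake name on first sight; return the same one afterwards.
        if real_name not in self._mapping:
            if not self._pool:
                raise ValueError("ran out of fake names")
            self._mapping[real_name] = self._pool.pop()
        return self._mapping[real_name]

    def mask(self, text, detected_names):
        if not detected_names:
            return text
        # Replace in a single pass so already-inserted fake names are never
        # rescanned; longer names go first so they win over shorter overlaps.
        ordered = sorted(detected_names, key=len, reverse=True)
        pattern = re.compile("|".join(re.escape(name) for name in ordered))
        return pattern.sub(lambda m: self.fake_for(m.group()), text)


if __name__ == "__main__":
    masker = ConsistentMasker(FAKE_NAMES, seed=0)
    text = "Albert Einstein met Isaac Newton. Albert Einstein smiled."
    print(masker.mask(text, ["Albert Einstein", "Isaac Newton"]))
```

Because each fake name is popped from the pool when assigned, two different real names never share a replacement, and every occurrence of the same real name gets the same one.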
If you’re interested in using Tonic’s Document Masker on your data, give us a shout at email@example.com, or click the chat bubble below.