The Hone Challenge: Unreliable data and the risk of PII in demo and QA environments
Hone provides an industry-leading platform for leadership training, offering comprehensive online learning programs to companies and their employees. Both demoing the Hone platform and building out its features require access to realistic user data that is representative of their customers’ sensitive information, like PII and employee performance. Finding reliable data for sales demos and software QA had long been a critical challenge, exacerbated by the potential risk of exposing real-world PII along the way.
In both cases, Hone relied on data that they had manually created in production. The sales team demoed off of a single prod account, which hindered their ability to demo in tandem. Jason Lock, Senior Software Engineer and Tech Lead on Hone’s Platform team, explained the pains this created: “With multiple sales engineers demoing the product, they were constantly overriding each other's data, or working off of inaccurate data. But our main concern was the risk of exposing PII.”
This was a primary concern in their QA environments as well: the poor quality of their test data led engineers to pull from production instead. “For QA, we had generic fixture data that wasn't reliable at all,” Lock described. “It was either manually created, or we had some predefined fixtures that had dummy data in it, but it was really sparse. For complex features, it was really hard to test with." They also struggled with the data becoming stale, since it wasn’t regularly refreshed or kept in sync with schema changes when new features were added. “There was a lot of manual setup on QA’s part for every release. They would have to go through an entire regression and then set up all of the test data themselves, which took a long time."
The breaking point came when they embarked on a SOC 2 audit. From a productivity standpoint, they needed to ensure that all of their lower environments, from staging to local development, had data that was as close to production as possible to reliably test their features. From a compliance standpoint, they needed to guarantee that this realistic data didn’t contain so much as an iota of real-world PII.
Like many companies, they considered writing in-house scripts as a solution, but as Lock shared, “It's really hard to maintain. Our data set is very complex—the amount of effort that it would take to cover all of the fields that we need to de-identify is significant." The classic build vs. buy question arose: “Is there a third-party solution out there that we could use that does this way better than we can do it?”
The Hone Solution: Automated PII Detection, Generator Suggestions, and Targeted Subsetting
Lock searched online for a solution and found his way to Tonic.ai. Tonic’s all-in-one data masking, subsetting, and synthesis platform offered exactly what Hone needed and stood out as a unique developer-first solution in the space. “I don't really know of any platforms that do what you do,” Lock commented.
Among the features that made Tonic a fit were its automation capabilities, including PII detection and targeted generator suggestions, as well as the UI’s overall ease of use and ease of deployment. “One thing that I got excited about is the ease of setting up Tonic and being able to manage the service ourselves,” Lock explained. “Offering container images fit really well into our current infrastructure. We knew we could easily plug this into our infrastructure and set it up way easier than writing our own scripts."
The automations also provide peace of mind for Lock. Not only do they streamline and accelerate the de-identification process, they also minimize subjectivity and the risk of human error. “Going back to security and compliance, the ease of being able to identify sensitive fields and apply the appropriate generators to those fields so that we aren't missing anything was a big motivator for us."
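To make the idea of applying generators to sensitive fields concrete, here is a minimal Python sketch of deterministic pseudonymization. It is an illustration only and not Tonic's implementation or API: the field names in `SENSITIVE_FIELDS`, the helper names `pseudonymize` and `mask_row`, and the use of a keyed HMAC are all assumptions made for this example.

```python
import hashlib
import hmac

# Hypothetical list of sensitive columns; a real schema would differ.
SENSITIVE_FIELDS = {"email", "full_name"}


def pseudonymize(value: str, secret: bytes) -> str:
    """Return a stable token for a sensitive value using a keyed hash."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:12]


def mask_row(row: dict, secret: bytes) -> dict:
    """Replace sensitive fields with realistic-looking stand-ins.

    Non-sensitive fields are copied through unchanged, so the masked row
    keeps the same shape as the original.
    """
    masked = {}
    for key, value in row.items():
        if key in SENSITIVE_FIELDS and value is not None:
            token = pseudonymize(str(value), secret)
            if key == "email":
                masked[key] = f"{token}@example.com"
            else:
                masked[key] = f"user_{token}"
        else:
            masked[key] = value
    return masked
```

Because the same input and secret always produce the same token, a value that appears in several tables is masked consistently, which keeps joins intact. The weakness of a hand-maintained field list like this one is that a new sensitive column is easy to miss, which is the risk automated detection is meant to reduce.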
But it isn’t just data masking that’s driving Tonic’s value at Hone; they also rely on the platform for database subsetting. Tonic’s patented subsetter enables two distinct use cases for the company. They use it to curate targeted data for their sales demo environments, crafting demo data that best spotlights the capabilities of their platform. “We’ve landed some pretty big deals for us using our demo environments now,” Lock shared.
The second use case is specifically in support of SOC 2. Subsetting provides them with an effective process to securely off-board customers by pulling just the users in question and cleansing their data out of production. This enables Hone to meet the regulatory requirements of the right to be forgotten, fundamental to many compliance frameworks.
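The core idea behind subsetting, pulling a chosen set of users together with the rows that reference them, can be sketched in a few lines of Python. This is a simplified illustration using the standard-library `sqlite3` module and is not Tonic's patented subsetter: the `users` and `enrollments` tables and the helper names `make_demo_source` and `build_subset` are hypothetical.

```python
import sqlite3
from typing import Sequence

# Hypothetical two-table schema: each enrollment references one user.
SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE enrollments (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    course TEXT
);
"""


def make_demo_source() -> sqlite3.Connection:
    """Build a small in-memory database standing in for a source system."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [(1, "A"), (2, "B"), (3, "C")])
    conn.executemany("INSERT INTO enrollments VALUES (?, ?, ?)",
                     [(10, 1, "Coaching"), (11, 1, "Feedback"), (12, 2, "Coaching")])
    conn.commit()
    return conn


def build_subset(source: sqlite3.Connection, user_ids: Sequence[int]) -> sqlite3.Connection:
    """Copy the selected users and every row that references them."""
    subset = sqlite3.connect(":memory:")
    subset.executescript(SCHEMA)
    if not user_ids:
        return subset
    marks = ",".join("?" for _ in user_ids)
    ids = list(user_ids)
    users = source.execute(
        f"SELECT id, name FROM users WHERE id IN ({marks})", ids).fetchall()
    subset.executemany("INSERT INTO users VALUES (?, ?)", users)
    # Follow the foreign key so the subset stays referentially intact.
    rows = source.execute(
        f"SELECT id, user_id, course FROM enrollments WHERE user_id IN ({marks})",
        ids).fetchall()
    subset.executemany("INSERT INTO enrollments VALUES (?, ?, ?)", rows)
    subset.commit()
    return subset
```

For example, `build_subset(make_demo_source(), [1])` yields a database containing only user 1 and that user's enrollments. A real schema has many more tables and relationship paths to follow, which is where a purpose-built subsetter earns its keep.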
With the Tonic solution in place, what does data provisioning at Hone look like now? “Data provisioning before Tonic was, well, basically there wasn't any. Now it’s a really simple process for everybody,” said Lock. “It takes, like, five minutes for jobs to run.” For local development environments, their developers pull from a container image that is a de-identified version of production and refreshed after every release. For staging, “Our QA team can go in and say we need a refresh of this data, and it's as simple as going into Tonic and rerunning the generation job to update the data. It really gives our QA team and our engineers confidence in testing features.”
The Hone Results: Zero Critical Bugs, 8x Release Speed, and Larger Deal Sizes
The quality of Hone’s Tonic-generated test data has significantly improved the quality of their releases. Hone has seen a dramatic reduction in the number of critical bugs pushed to production. Lock quantified this specifically: “Before we had Tonic and the availability of production-quality data for our engineers and QA, we would see critical issues at least once a week that were tied to not being able to accurately test our features under real-world scenarios. Now, we haven't had a critical issue since we fully operationalized Tonic into our software development life cycle. That was nine months ago.” In other words, critical bugs have dropped from at least one a week to zero over the past nine months.
Thanks to the ease of provisioning data—from having no data provisioning process in place to getting fresh data on demand in five minutes flat—Hone has also been able to increase their release velocity. Prior to Tonic, they were pushing a release once every two weeks; today, they push releases twice a week or more.
The speed of their regression tests has also significantly accelerated. “Tonic drastically reduces the amount of time it takes for a full regression test for all of our core features. Before it was somewhere within a two-week time span for QA to get the data set up; now they are ready to go and have tested all of the core features manually within a half a day.” This accelerated work has freed up Hone’s QA team to spend less time on manual testing and more time and energy on automation, creating automated tests that run on Tonic-generated data and helping the engineering team come up with test plans.
Pairing improved and accelerated product development with an enhanced demo experience has also yielded impressive returns in terms of their sales. Thanks to the realism of Tonic-generated data in their sales demo environment, Hone has seen a 5% increase in their deal sizes.
These results all speak to one of the biggest benefits Lock sees: the ability to leverage Tonic as an essential component of Hone’s data platform. “Being able to extend Tonic via API, to use web hooks and trigger workflows in downstream services like GitHub, and to add post-job scripts, as well, is all really fantastic. We want to build an internal data platform that others can build on top of, and Tonic is a really crucial piece of that.”
Going forward, the Hone team plans to continue to refine and automate these processes and integrations. And for their sales demo environment, they’ll continue to create more interesting datasets, targeting different scenarios and customers to curate their demos to specific use cases. For Hone and Tonic, both the present and the future look bright.
“I can't emphasize it enough,” concluded Lock. “Tonic has been such a great product to use. From getting it set up, the initial configuration, to the support that we get, to operationalizing: it’s been huge for us for sure."