Test Data Management (TDM) solutions have been on the market for over a decade in various capacities, but until recently, the technology hadn’t seen much in the way of innovation. The advent of data synthesis and generative AI has changed this, offering organizations an opportunity to reconsider their TDM strategies and solutions, in order to assess whether their test data solutions are meeting the demands of today’s complex data, infrastructure, and CI/CD workflows.
In this article, we’ll compare three legacy TDM tools, Delphix, K2View, and IBM Optim, to the modern test data platform Tonic. We’ll compare how the solutions stack up in supporting key test data management features, and we’ll end with a summary of each comparison and a look at what customers have to say. Let’s get to it.
The below table rates four test data management platforms in nine key areas, in terms of capabilities, performance, and integrations. An in-depth analysis of each area follows in the section below.
The quality of your test data defines how useful it is. The higher quality the data, the more bugs you will find early on in testing and the more time and money you’ll save.
Tonic
Tonic’s focus has always been on making data useful, not just masked. We enable you to mimic the complexity of production, to achieve the same realism in your lower environments. Keep your data consistent by mapping a given input to the same output every time, across tables and environments, so you can generate quality, referentially intact data over time. Set virtual foreign keys to identify constraints that do not exist in your database but exist in the data logic. Track unexpected schema changes and ensure no data leaks through to staging environments. And of course, apply versatile data generators for a slew of complex data types such as JSON, regex, and HTML.
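To make the idea of input-to-output consistency concrete, here is a minimal sketch of one common way to get it: a keyed hash that deterministically picks a replacement value. This is an illustration of the general technique, not a description of how Tonic implements its generators. The key, the name pool, and the `consistent_mask` helper are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would come from a secrets manager.
SECRET_KEY = b"example-only-secret"

# Small illustrative pool of replacement values (an assumption, not a real dataset).
FIRST_NAMES = ["Avery", "Blake", "Casey", "Devon", "Emery", "Finley", "Gray", "Harper"]


def consistent_mask(value: str, pool: list[str], key: bytes = SECRET_KEY) -> str:
    """Map an input to a replacement from the pool; the same input always gets the same output."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]


# The same source value appears in two different tables.
users = [{"id": 1, "first_name": "Alice"}, {"id": 2, "first_name": "Bob"}]
orders = [{"order_id": 10, "customer_first_name": "Alice"}]

masked_users = [{**u, "first_name": consistent_mask(u["first_name"], FIRST_NAMES)} for u in users]
masked_orders = [
    {**o, "customer_first_name": consistent_mask(o["customer_first_name"], FIRST_NAMES)}
    for o in orders
]

# "Alice" maps to the same replacement in both tables, so the two tables still line up.
print(masked_users[0]["first_name"] == masked_orders[0]["customer_first_name"])  # True
```

Because the mapping depends only on the input and the key, it holds across tables, databases, and repeated runs. With a pool this small, different inputs can collide on the same output; a realistic setup would need a much larger pool or a different scheme where uniqueness matters.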
Delphix
Delphix provides generators for individual columns, but it has no feature for keeping data consistent across environments. The realism and performance of its output data suffer as a result.
K2View
K2View’s primary focus is on data integration—they use a “business entity” approach. The platform requires you to identify all of the relevant relationships up front. This enables you to link your data across data sources, but at a tremendous cost in terms of configuration time spent up front. What's more, K2View does not offer the ability to scan for sensitive information, meaning that the process of identifying what data requires masking is fully manual.
IBM Optim
IBM Optim offers a very small number of functions with which to mask your data, but these functions lack the ability to maintain the quality and relationships across all of your data (e.g. ensuring input-to-output consistency).
The goal of ephemeral databases is to enable you to spin up (and down) fully hydrated test and development databases instantaneously. They offer significant time savings in provisioning and maintaining test databases for your end-users to use.
Tonic
Tonic.ai's latest offering, Tonic Ephemeral, makes it easy to spin up (and down) fully hydrated test and development databases on demand. Users no longer lose time provisioning and maintaining databases themselves, as Tonic Ephemeral provisions and maintains databases on the user’s behalf. Users can make quick copies of existing data sets, eliminating lag and wait times for development and testing environments. The platform has built-in expiration times to remove the databases automatically, enabling you to save money on storage and compute by only keeping the databases available for as long as you need them.
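The expiration idea can be sketched in a few lines. The example below is a generic, in-memory illustration of time-to-live bookkeeping, not Tonic Ephemeral's API; the `EphemeralRegistry` class, its method names, and the database names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class EphemeralDatabase:
    name: str
    expires_at: datetime


@dataclass
class EphemeralRegistry:
    """Tracks temporary databases and drops the ones whose time-to-live has passed."""

    databases: dict[str, EphemeralDatabase] = field(default_factory=dict)

    def create(self, name: str, ttl: timedelta, now: datetime) -> EphemeralDatabase:
        db = EphemeralDatabase(name=name, expires_at=now + ttl)
        self.databases[name] = db
        return db

    def reap(self, now: datetime) -> list[str]:
        """Remove expired databases and return their names."""
        expired = [n for n, db in self.databases.items() if db.expires_at <= now]
        for n in expired:
            del self.databases[n]  # a real system would also release storage and compute here
        return expired


start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
registry = EphemeralRegistry()
registry.create("pr-123-test-db", ttl=timedelta(hours=2), now=start)
registry.create("nightly-regression-db", ttl=timedelta(hours=12), now=start)

# Three hours later, only the short-lived database has expired.
print(registry.reap(now=start + timedelta(hours=3)))  # ['pr-123-test-db']
print(sorted(registry.databases))                     # ['nightly-regression-db']
```

Passing `now` in explicitly keeps the sketch easy to test; a scheduler would typically call `reap` on an interval with the current time.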
Delphix
Delphix offers "data virtualization," which is actually just their own way of saying copy-on-write, for spinning up isolated test databases. But the functionality is limited when it comes to making those databases ephemeral.
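For readers unfamiliar with the term, copy-on-write means that clones share one read-only snapshot and store privately only the pieces they change. The sketch below illustrates the concept with plain Python mappings; it is a toy model under our own assumptions and says nothing about how any vendor implements it. The block contents and the `make_clone` helper are hypothetical.

```python
from collections import ChainMap
from types import MappingProxyType

# Shared, read-only snapshot of the source data (block number -> contents).
snapshot = MappingProxyType({0: "customers page", 1: "orders page", 2: "invoices page"})


def make_clone(base):
    """Create a writable clone that stores only the blocks it changes."""
    return ChainMap({}, base)


clone_a = make_clone(snapshot)
clone_b = make_clone(snapshot)

# Writing to clone A copies the change into A's private layer; the snapshot is untouched.
clone_a[1] = "orders page (modified by test run A)"

print(clone_a[1])            # reads clone A's own copy of the changed block
print(clone_b[1])            # still reads the shared snapshot
print(len(clone_a.maps[0]))  # only one block is stored privately by clone A
```

Creating a clone costs almost nothing because no data is copied up front, which is why the approach suits isolated test databases.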
K2View
K2View does not seem to have an offering in this space. They use a micro-database approach to make it easy to query individual records, but you have to use their UI for it.
IBM Optim
IBM InfoSphere Optim Virtual Data Pipeline offers some of the same benefits. However, the pipeline does not seem to have broad compatibility and is not known for its ease of use.
Subsetting is a key part of any test data workflow — make your data small enough so that you can save money on storage costs, share it with developers abroad, and allow your developers to load it locally.
Tonic
At Tonic, we have a patented approach to subsetting that allows you to maintain the overall data quality while shrinking your data. Our customers shrink their data by an average of 85% using the Tonic subsetter. eBay uses the Tonic platform to take 8 petabytes of data and turn it into 1 GB. Tonic's intuitive UI allows you to target a random percentage of data or target specific records with a WHERE clause. Virtual foreign keys can be created within the UI to enforce constraints that do not exist in your database. And the Graph View interface makes it easy to navigate your table structure and see what data is getting incorporated into your subset and why.
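As a rough illustration of what referentially intact subsetting involves, the sketch below targets records with a WHERE clause and then follows the foreign key upstream so that no selected row is left without its parent. It uses an in-memory SQLite database with a made-up two-table schema, and it is a simplified stand-in for the general idea, not Tonic's patented approach.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        total REAL
    );
    INSERT INTO customers VALUES (1, 'EU'), (2, 'US'), (3, 'EU');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 990.0), (12, 3, 1500.0), (13, 2, 40.0);
""")

# Step 1: target specific records with a WHERE clause.
subset_orders = conn.execute(
    "SELECT id, customer_id, total FROM orders WHERE total > ? ORDER BY id", (500,)
).fetchall()

# Step 2: follow the foreign key upstream so every selected order keeps its parent customer.
customer_ids = sorted({row[1] for row in subset_orders})
placeholders = ",".join("?" for _ in customer_ids)
subset_customers = conn.execute(
    f"SELECT id, region FROM customers WHERE id IN ({placeholders}) ORDER BY id",
    customer_ids,
).fetchall()

print(subset_orders)     # [(11, 2, 990.0), (12, 3, 1500.0)]
print(subset_customers)  # [(2, 'US'), (3, 'EU')]
```

A real schema has many tables and both upstream and downstream relationships, so the traversal has to walk the whole dependency graph rather than a single foreign key.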
Delphix
Delphix's subsetter only works at the table level, allowing you to shrink an individual table down, without any regard for maintaining referential integrity.
K2View
Within K2View's platform, you can subset based on the specific business entity you are interested in. This requires knowing the specific business entity or entities you need. As far as we can tell, there is no K2View feature that allows you to select a random set of records or easily link records in real-time to ensure referential integrity is maintained.
IBM Optim
IBM Optim only permits subsetting at a file or column level, and there do not seem to be options for producing high-quality, referentially intact subsets.
Relational databases are the traditional way to store data and native connectors are the best way to get high quality performance for your test data creation processes.
Tonic
Tonic offers native, performant, optimized connectors for PostgreSQL, MySQL, RDS Aurora (PostgreSQL Aurora and MySQL Aurora), MariaDB, SQL Server, DynamoDB, DB2, and Oracle. New native connectors are regularly added to the platform, based on market requests. Connect with our team to discuss your integration needs and learn more about what's planned next on the roadmap.
Delphix
As of the time of this writing, Delphix supports SQL Server, PostgreSQL, MySQL, MySQL Aurora, SQLite, Oracle, Sybase, and CockroachDB. Delphix also provides a proprietary interface that enables connections to other relational databases, but these connections are not optimized for the individual requirements of each database, which means slower performance and, often, no support for key data types.
K2View
K2View supports all the standard relational databases with their data integration tool. Note that you must connect the data into their integration tool and change it into their proprietary data format before enabling any masking, a process that can have a significant impact on implementation.
IBM Optim
IBM provides little publicly available information on what connectors they support. We expect that IBM Optim is optimized for DB2 and that it either works poorly or does not work at all on other data sources.
Data warehouses are rapidly expanding in use, for purposes ranging from traditional analytics to AI to real-time data streaming. To make the best use of these tools, you need to be able to de-identify sensitive data so that you can give your broader organization access.
Tonic
As with relational databases, Tonic provides performant and optimized native connectors for Snowflake, Databricks, Spark SDK, Spark EMR, Redshift, and BigQuery. The Tonic platform has been architected to match the scale of data stored in data warehouses, ensuring that de-identification processes are never a bottleneck to how fast your data needs to move.
Delphix
Delphix supports a standard set of data warehouse connectors, typically via a proxy that enables the connections but can result in increased time spent configuring the integration and poorer performance at scale.
K2View
K2View supports a standard set of data warehouse connectors with their integration tool. Note that you must connect the data into their integration tool and change it into their proprietary data format before enabling any masking.
IBM Optim
IBM provides little publicly available information on what connectors they support. We expect that IBM Optim is optimized for DB2 and either works poorly or does not work at all on other data sources, especially newer data warehouses.
NoSQL data stores such as DocumentDB, DynamoDB, and Mongo are relative newcomers but have become popular since their inception in the early 2000s. They provide more flexibility at the cost of structure. Organizations have to decide whether to use a NoSQL database or a relational database as their primary data store, and switching between the two types can be quite costly. Because of the flexibility of the data schema, masking for NoSQL data stores has a unique set of performance and UI requirements.
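To show why a flexible schema changes the masking problem, here is a minimal sketch that walks documents of differing shapes and replaces sensitive fields wherever they appear. The field names in `SENSITIVE_KEYS` and the `mask_document` helper are hypothetical, and the fixed `<masked>` placeholder stands in for whatever generator a real tool would apply.

```python
from typing import Any

SENSITIVE_KEYS = {"email", "ssn", "phone"}  # hypothetical list of field names to mask


def mask_document(doc: Any) -> Any:
    """Return a copy of a nested document with sensitive fields replaced, wherever they occur."""
    if isinstance(doc, dict):
        return {
            key: "<masked>" if key in SENSITIVE_KEYS else mask_document(value)
            for key, value in doc.items()
        }
    if isinstance(doc, list):
        return [mask_document(item) for item in doc]
    return doc


# Two documents from the same collection with different shapes.
doc_a = {"name": "Alice", "email": "alice@example.com"}
doc_b = {"name": "Bob", "contacts": [{"type": "work", "phone": "555-0100"}], "profile": {"ssn": "123-45-6789"}}

print(mask_document(doc_a))  # {'name': 'Alice', 'email': '<masked>'}
print(mask_document(doc_b))
```

Unlike a relational table, there is no fixed column list to configure against, so the masking logic has to discover sensitive fields at any depth in each document.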
Tonic
Tonic provides performant and optimized native connectors for Mongo, DocumentDB, and DynamoDB. For a real-world use case, check out how Flywire uses Tonic for de-identifying data across both relational and NoSQL connectors.
Delphix
At the time of this writing, Delphix does not appear to support NoSQL connectors of any type.
K2View
K2View supports a standard set of NoSQL connectors with their integration tool. Note that you must connect the data into their integration tool and change it into their proprietary data format before enabling any masking.
IBM Optim
IBM provides little publicly available information on what connectors they support. Based on website content from 2021, Optim did not appear to support Mongo; however, IBM does have a SaaS offering of Mongo.
Whenever you buy any software, you want to get value out of it as quickly as possible.
Tonic
Thanks to Tonic’s Cloud offering, users can get started exploring the full platform at once, with a free trial. If you like what you see, you can convert your trial to a monthly pay-as-you-go plan for as little as $199/month, or connect with our team to discuss our professional and enterprise plans, including our self-hosted offering. Either way, you’ll get Tonic’s modern UI, which is continuously updated to provide users with a state-of-the-art no-code experience. Features like bulk applying suggested generators to de-identify your data enable you to run a generation in just a few clicks. Constant performance improvements and native database connectors enable you to run your data generation quickly and easily. Additionally, the platform’s file connector allows you to de-identify locally stored files without needing to set up any databases.
Delphix
To start using Delphix, you need to schedule a call with a salesperson and then get set up on their platform. Having initially been built over a decade ago, Delphix’s outdated UI can be challenging to navigate.
K2View
K2View has a free trial and cloud offering with an onboarding flow. When it comes to deploying a self-hosted solution, users frequently encounter onboarding and usability issues, delaying their time to value. K2View started as a data integration tool, and most of their customers seem to be using it purely for data integration; the masking tool is a more recent addition. Customers can often take months, or even more than a year, to get up and running with the tool. Even then, the workflows for producing masked data can be complex enough that customers cannot refresh their data regularly.
IBM Optim
IBM Optim doesn’t offer a free trial without talking to a salesperson. IBM has a traditional sales organization that can get in the way of easy-to-start options, and their software typically requires longer implementation periods.
Effective test data management should accelerate engineering velocity, not slow down workflows by introducing bottlenecks in data provisioning. In order to be effective, TDM solutions must be built to meet the complexity and scale of today's data.
Tonic
Ensuring exceptional performance is one of the key driving factors behind Tonic’s native data source connectors. Tonic’s integrations have been built to take advantage of and account for each database's performance quirks and idiosyncrasies, to ensure that Tonic never slows down your data-dependent workflows. Dozens of Tonic’s enterprise customers process terabytes of data through Tonic, while accelerating their engineering velocity. To put a finer point on it, Tonic processes over 160 billion rows of data for customers per month. We also have customers that use Tonic as an ETL tool on top of the core data masking use cases, enabling them to move data around with Tonic and eliminate other ETL tools in their stack.
Delphix
Users have reported that multi-TB jobs can take days to run on Delphix. Challenges dealing with JSON and XML data, in particular, result in slower workflows. Some users have found they have to slow down their workflows in order to de-identify data outside of the platform.
K2View
Based on available research, K2View performance appears to be about average for data integration platforms.
IBM Optim
Little information is available related to IBM Optim’s performance, but given the age of the platform we expect that it is not optimized for performance within modern infrastructure.
Generative AI is taking the world by storm, and using it safely and securely is going to be important for all organizations going forward.
Tonic
As a data synthesis company, rather than solely being a data masking provider, Tonic.ai offers multiple solutions in the generative AI space to enable you to take advantage of the latest developments with LLMs and generative AI in a privacy-protecting way. The Tonic test data platform itself incorporates AI data synthesis among its data generators. Additionally, Tonic.ai offers two products adjacent to the Tonic test data platform and built specifically for LLM use cases; these products are Tonic Textual and Tonic Validate. Tonic Textual operates similarly to Tonic’s test data platform, except for unstructured, free-text data, rather than structured and semi-structured data. By redacting and synthesizing contextually relevant replacements for sensitive data found within free text, Tonic Textual enables you to train LLMs on your free-text data while ensuring compliance and keeping sensitive information safe. Tonic Validate, meanwhile, is a platform that streamlines evaluating and iterating on the performance of Retrieval-Augmented Generation (RAG) applications.
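To give a feel for what redacting free text means in practice, the sketch below replaces two kinds of sensitive values with labeled placeholders using regular expressions. This is a deliberately simple stand-in and an assumption on our part: it does not reflect how Tonic Textual detects entities, and pattern matching alone misses anything that lacks a predictable format, such as names. The `PATTERNS` table and `redact` helper are hypothetical.

```python
import re

# Hypothetical patterns for two well-structured kinds of sensitive values.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace each match with a placeholder naming the kind of value that was removed."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


note = "Customer jane.doe@example.com called about SSN 123-45-6789."
print(redact(note))  # Customer [EMAIL] called about SSN [SSN].
```

Synthesis goes one step further than this by swapping each placeholder for a realistic replacement value, so the text stays useful as training data.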
Delphix
At the time of this writing, Delphix has no offerings specific to the needs of generative AI tools and does not appear to have integrated AI within its TDM product.
K2View
At the time of this writing, K2View does not offer generative AI tooling or AI functionality within its TDM product.
IBM Optim
IBM is a leader in the AI space with Watson. However, the IBM Optim platform does not appear to offer features focused on enabling safe and secure generative AI usage.
When comparing test data management platforms, Tonic’s technology stands out above the legacy TDM players thanks to its modern, up-to-date approach to generating and integrating quality test data into test and development environments. From the realism of its output data to the broad coverage of its native integrations to its overall ease of use and rapid time to value, Tonic provides a number of advantages over other test data platforms.
At a high level, customers choose Tonic over Delphix for its ease of use, better performance at scale, higher quality output data, and more advanced capabilities in supporting complex data types and subsetting. The Tonic platform is specifically built to scale with your data while handling complex data types and ensuring consistent data masking across databases. For a more detailed summary of Tonic vs Delphix, consider:
Here, too, customers choose Tonic over K2View thanks to Tonic’s ease of use and faster time to value. Price is often a factor as well, given that the K2View platform offers less flexibility in acquiring specific capabilities; their Data Product Platform is built to be purchased all-in-one. Tonic, on the other hand, offers several tiers, with capabilities tailored to those tiers so customers can get the features they need without paying for features they don’t need. For a more detailed summary of Tonic vs K2View, consider:
In choosing Tonic over IBM Optim, customers get a modern UI with better ease of use and capabilities built for today’s workflows, stronger performance at scale, and comprehensive approaches to data masking that ensure higher quality output data. For a more detailed summary of Tonic vs IBM Optim, consider:
The above summarizes what we’ve heard and seen in the field in working with QA engineers and developers over the past five years. To put a finer point on it, we’ll pass the mic over to Tonic’s customers to describe the platform and the results they’ve achieved in their own words.
This article features several customer reviews of the Tonic test data platform in the in-depth test data management feature comparison section above. To read the complete stories behind these quotes, see our Customers page in which you’ll find case studies from a range of organizations and industries, from enterprises to startups, and from healthtech to fintech to e-commerce sites.
Among the stories worth highlighting is eBay’s use of Tonic to hydrate their staging environments with realistic, compliant test data. On eBay's technical blog, eBay VP of Engineering and Technical Fellow Senthil Padmanabhan details the challenges they previously faced in maintaining quality staging environments and how Tonic plays a key role in ensuring those environments are kept up to date and in sync with realistic test data generated from their production data. In particular, they rely on Tonic’s subsetter to shrink their 8 PB of production data down to 1 GB datasets for use by their developers across many targeted use cases. To safeguard their customers’ privacy, they make use of Tonic’s format-preserving encryption capabilities, which allow them to generate highly realistic data that looks, feels, and behaves just like their production data.
In Padmanabhan’s words, from our case study: “Tonic has an intuitive, powerful platform for generating realistic, safe data for development and testing. Tonic has helped eBay streamline the very challenging problem of representing the complexities contained within Petabytes of data distributed across many environments.”
Other noteworthy testimonials include:
In addition to our case studies, reviews of the Tonic test data platform can be found on G2.
To get started exploring Tonic’s features for test data management, sign up for a sandbox to take the platform for a spin using a sample dataset or by connecting directly to your own data, whether by uploading a file or connecting to a database. As you get familiar with the platform, our product docs are an excellent resource for finding information on the specific capabilities you need. At any time, you can also connect directly with our team to get a tailored live demo and all your questions answered.