
Tonic vs Delphix vs K2View vs IBM Optim. A full comparison.

Author
Yuri Shadunsky
February 2, 2024

State of the Test Data Management market

Test Data Management (TDM) solutions have been on the market for over a decade in various capacities, but until recently, the technology hadn’t seen much in the way of innovation. The advent of data synthesis and generative AI has changed this, offering organizations an opportunity to reconsider their TDM strategies and solutions, in order to assess whether their test data solutions are meeting the demands of today’s complex data, infrastructure, and CI/CD workflows.

In this article, we’ll compare three legacy TDM tools, Delphix, K2View, and IBM Optim, to the modern test data platform Tonic. We’ll compare how the solutions stack up in supporting key test data management features, and we’ll end with a summary of each comparison and a look at what customers have to say. Let’s get to it.

Comparing the features of Tonic, Delphix, K2View, and IBM Optim

The table below rates four test data management platforms in nine key areas, in terms of capabilities, performance, and integrations. An in-depth analysis of each area follows in the next section.

A table comparing TDM platforms Tonic, Delphix, K2View, and IBM Optim, across 9 key areas, including capabilities, performance, and integrations.

In-depth TDM feature comparison

1. Realistic de-identified data

The quality of your test data defines how useful it is. The higher quality the data, the more bugs you will find early on in testing and the more time and money you’ll save.

Tonic

Tonic’s focus has always been on making data useful, not just masked. We enable you to mimic the complexity of production, to achieve the same realism in your lower environments. Keep your data consistent by mapping a given input always to the same output across tables and environments so you can generate quality, referentially intact data over time. Set virtual foreign keys to identify constraints that do not exist in your database but exist in the data logic. Track unexpected schema changes and ensure no data leaks through to staging environments. And of course, apply versatile data generators for a slew of complex data types such as JSON data, regex, and html.
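To make the idea of input-to-output consistency concrete, here is a minimal Python sketch. It is an illustration only and not Tonic's implementation: the key, the pool of replacement names, and the function `mask_consistently` are all hypothetical. It derives the replacement from a keyed hash of the input, so the same source value maps to the same masked value wherever it appears.

```python
import hashlib
import hmac

# Hypothetical secret; in practice this would come from a secrets manager.
SECRET_KEY = b"example-masking-key"

# Small illustrative pool of replacement values (an assumption for this sketch).
FAKE_FIRST_NAMES = ["Alex", "Blake", "Casey", "Devon", "Emery", "Finley", "Harper", "Jordan"]


def mask_consistently(value: str, pool: list[str], key: bytes = SECRET_KEY) -> str:
    """Map the same input to the same replacement every time it is seen."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]


# The same source value appears in two different tables.
users = [{"id": 1, "first_name": "Maria"}, {"id": 2, "first_name": "Chen"}]
orders = [{"order_id": 10, "customer_first_name": "Maria"}]

masked_users = [
    {**u, "first_name": mask_consistently(u["first_name"], FAKE_FIRST_NAMES)} for u in users
]
masked_orders = [
    {**o, "customer_first_name": mask_consistently(o["customer_first_name"], FAKE_FIRST_NAMES)}
    for o in orders
]

# "Maria" receives the same replacement in both tables, so joins on the
# masked values still line up.
print(masked_users[0]["first_name"] == masked_orders[0]["customer_first_name"])
```

With a pool this small, different inputs can collide on the same replacement; a realistic generator would draw from a much larger space.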

Delphix

Delphix provides generators for individual columns, but it has no feature for keeping data consistent across environments. The realism and performance of its output data suffer as a result.

K2View

K2View’s primary focus is on data integration—they use a “business entity” approach. The platform requires you to identify all of the relevant relationships up front. This enables you to link your data across data sources, but at a tremendous cost in terms of configuration time spent up front. What's more, K2View does not offer the ability to scan for sensitive information, meaning that the process of identifying what data requires masking is fully manual.

IBM Optim

IBM Optim offers a very small number of functions with which to mask your data, but these functions lack the ability to maintain the quality and relationships across all of your data (e.g. ensuring input-to-output consistency).

2. Ephemeral databases

The goal of ephemeral databases is to enable you to spin up (and down) fully hydrated test and development databases instantaneously. They offer significant time savings in provisioning and maintaining test databases for your end-users to use.

Tonic

Tonic.ai's latest offering, Tonic Ephemeral, makes it easy to spin up (and down) fully hydrated test and development databases on demand. Users no longer lose time provisioning and maintaining databases themselves, as Tonic Ephemeral provisions and maintains databases on the user’s behalf. Users can make quick copies of existing data sets, eliminating lag and wait times for development and testing environments. The platform has built-in expiration times to remove the databases automatically, enabling you to save money on storage and compute by only keeping the databases available for as long as you need them.
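The expiration behavior described above can be pictured with a small Python sketch. This is a toy in-memory model under stated assumptions, not the Tonic Ephemeral API: the class names `EphemeralDatabase` and `EphemeralRegistry` and the database names are hypothetical, and a real system would create and drop actual databases rather than dictionary entries.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class EphemeralDatabase:
    name: str
    expires_at: datetime


class EphemeralRegistry:
    """Toy in-memory registry of short-lived databases (hypothetical)."""

    def __init__(self) -> None:
        self._databases: dict[str, EphemeralDatabase] = {}

    def spin_up(self, name: str, ttl: timedelta, now: datetime) -> EphemeralDatabase:
        """Register a database that should live for `ttl` starting at `now`."""
        db = EphemeralDatabase(name=name, expires_at=now + ttl)
        self._databases[name] = db
        return db

    def reap_expired(self, now: datetime) -> list[str]:
        """Remove every database whose expiration time has passed; return their names."""
        expired = [n for n, db in self._databases.items() if db.expires_at <= now]
        for n in expired:
            del self._databases[n]
        return expired

    def active(self) -> list[str]:
        return sorted(self._databases)


start = datetime(2024, 2, 2, 9, 0, tzinfo=timezone.utc)
registry = EphemeralRegistry()
registry.spin_up("pr-101-test", ttl=timedelta(hours=1), now=start)
registry.spin_up("nightly-regression", ttl=timedelta(hours=8), now=start)

# Two hours later, only the database with the one-hour lifetime is removed.
removed = registry.reap_expired(now=start + timedelta(hours=2))
print(removed, registry.active())
```

Passing `now` explicitly keeps the sketch deterministic; a scheduler calling `reap_expired` periodically would supply the current time.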

Delphix

Delphix offers "data virtualization," which is actually just their own way of saying copy-on-write, for spinning up isolated test databases. But the functionality is limited when it comes to making those databases ephemeral.
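For readers unfamiliar with the term, copy-on-write can be sketched in a few lines of Python. This is a generic illustration of the technique, not a description of how Delphix implements it; the block names and contents are made up. Each clone is an empty overlay on a shared baseline, reads fall through to the baseline, and writes land only in the clone's own overlay.

```python
from collections import ChainMap

# Shared, read-only baseline (stands in for the blocks of a source database).
baseline = {"block_0": "users v1", "block_1": "orders v1"}

# Each clone starts as an empty overlay on top of the shared baseline.
clone_a = ChainMap({}, baseline)
clone_b = ChainMap({}, baseline)

# Writing to a clone only touches its own overlay.
clone_a["block_1"] = "orders v2 (modified in clone A)"

print(clone_a["block_1"])   # the modified copy, visible only to clone A
print(clone_b["block_1"])   # still the shared baseline value
print(len(clone_a.maps[0])) # number of blocks clone A actually stores itself
```

Because unmodified blocks are never duplicated, creating a clone is cheap no matter how large the baseline is.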

K2View

K2View does not seem to have an offering in this space. They use a micro-database approach to make it easy to query individual records, but you have to use their UI for it. 

IBM Optim

IBM InfoSphere Optim Virtual Data Pipeline offers some of the same benefits. However, the pipeline does not seem to have broad compatibility and is not known for its ease of use.

3. Subsetting

Subsetting is a key part of any test data workflow — make your data small enough so that you can save money on storage costs, share it with developers abroad, and allow your developers to load it locally.

Tonic

At Tonic, we have a patented approach to subsetting that allows you to maintain the overall data quality while shrinking your data. Our customers shrink their data by an average of 85% using the Tonic subsetter. eBay uses the Tonic platform to take 8 petabytes of data and turn it into 1 GB. Tonic's intuitive UI allows you to target a random percentage of data or target specific records with a WHERE clause. Virtual foreign keys can be created within the UI to enforce constraints that do not exist in your database. And the Graph View interface makes it easy to navigate your table structure and see what data is getting incorporated into your subset and why.
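The general idea of a referentially intact subset can be sketched in Python as follows. This is a simplified illustration and not Tonic's patented algorithm: the tables, the `foreign_keys` list, and the `subset` function are hypothetical. It selects target rows with a predicate, then repeatedly pulls in any parent rows those rows reference, so no foreign key in the subset points at a missing record. A relationship that is not declared in the database (a "virtual" foreign key) is simply another entry in the list.

```python
# Toy source data: table name -> list of rows.
source = {
    "customers": [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}, {"id": 3, "name": "C"}],
    "orders": [
        {"id": 10, "customer_id": 1},
        {"id": 11, "customer_id": 2},
        {"id": 12, "customer_id": 2},
    ],
    "order_items": [
        {"id": 100, "order_id": 10},
        {"id": 101, "order_id": 11},
        {"id": 102, "order_id": 12},
    ],
}

# (child_table, child_column, parent_table, parent_column).
# The second entry stands in for a "virtual" foreign key: a relationship
# that exists in the data logic but is not declared in the database.
foreign_keys = [
    ("orders", "customer_id", "customers", "id"),
    ("order_items", "order_id", "orders", "id"),
]


def subset(source, foreign_keys, target_table, predicate):
    """Select target rows, then pull in every parent row they reference."""
    kept = {table: [] for table in source}
    kept[target_table] = [row for row in source[target_table] if predicate(row)]

    changed = True
    while changed:
        changed = False
        for child, child_col, parent, parent_col in foreign_keys:
            needed = {row[child_col] for row in kept[child]}
            have = {row[parent_col] for row in kept[parent]}
            added = [r for r in source[parent] if r[parent_col] in needed - have]
            if added:
                kept[parent].extend(added)
                changed = True
    return kept


# Equivalent in spirit to targeting records with a WHERE clause.
result = subset(source, foreign_keys, "order_items", lambda row: row["id"] in {101, 102})
for table, rows in result.items():
    print(table, [r["id"] for r in rows])
```

This sketch only walks from child rows up to their parents; a fuller subsetter would also decide whether and how to include child rows of the records it keeps.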

A summary of a case study with Tonic customer Pax8: Data that used to take weeks to source is now refreshed daily and subsetted in as little as 30 minutes, using Tonic's patented subsetter.

Delphix

Delphix's subsetter only works at the table level, allowing you to shrink an individual table down, without any regard for maintaining referential integrity.

K2View

Within K2View's platform, you can subset based on the specific business entity you are interested in. This requires knowing the specific business entity or entities you need. As far as we can tell, there is no K2View feature that allows you to select a random set of records or easily link records in real-time to ensure referential integrity is maintained.

IBM Optim

IBM Optim only permits subsetting at a file or column level, and it does not seem to offer options for producing high-quality, referentially intact subsets.

4. Relational database connectors

Relational databases are the traditional way to store data and native connectors are the best way to get high quality performance for your test data creation processes.

Tonic

Tonic offers native, performant, optimized connectors for PostgreSQL, MySQL, RDS Aurora (PostgreSQL Aurora and MySQL Aurora), MariaDB, SQL Server, DynamoDB, DB2 and Oracle. New native connectors are regularly added to the platform, based on market requests. Connect with our team to discuss your integration needs and learn more about what's planned next on the roadmap.

Delphix

As of the time of this writing, Delphix supports SQL Server, PostgreSQL, MySQL, MySQL Aurora, SQLite, Oracle, Sybase, and CockroachDB. Delphix also provides a proprietary interface that enables connections to other relational databases; however, these connections are not optimized for the individual requirements of each database, which means slower performance and key data types that are often unsupported.

K2View

K2View supports all the standard relational databases with their data integration tool. Note that you must connect the data into their integration tool and change it into their proprietary data format before enabling any masking, a process that can have a significant impact on implementation.

IBM Optim

IBM provides little publicly available information on what connectors they support. We expect that IBM Optim is optimized for DB2 and that it either works poorly or does not work at all on other data sources.

5. Data warehouse connectors

Data warehouses are rapidly expanding in use, for everything from traditional analytics to AI to real-time data streaming. To make the best use of these tools, you need to be able to de-identify the sensitive data they hold so that you can give your broader organization access.

Tonic

As with relational databases, Tonic provides performant and optimized native connectors for Snowflake, Databricks, Spark SDK, Spark EMR, Redshift, and BigQuery. The Tonic platform has been architected to match the scale of data stored in data warehouses, ensuring that de-identification processes are never a bottleneck to how fast your data needs to move.

Delphix

Delphix supports a standard set of data warehouse connectors, typically via a proxy that enables the connections but can result in increased time spent configuring the integration and poorer performance at scale.

K2View

K2View supports a standard set of data warehouse connectors with their integration tool. Note that you must connect the data into their integration tool and change it into their proprietary data format before enabling any masking.

IBM Optim

IBM provides little publicly available information on what connectors they support. We expect that IBM Optim is optimized for DB2 and either works poorly or does not work at all on other data sources, especially newer data warehouses.

6. NoSQL connectors

NoSQL databases such as DocumentDB, DynamoDB, and Mongo are relative newcomers that have grown popular since the category emerged in the late 2000s. They provide more flexibility at the cost of structure. Organizations have to decide whether to use a NoSQL database or a relational database as their primary datastore, and switching between the two types of data stores can be quite costly. Because of the flexibility of the data schema, masking for NoSQL data stores has a unique set of performance and UI requirements.

Tonic

Tonic provides performant and optimized native connectors for Mongo, DocumentDB, and DynamoDB. For a real-world use case, check out how Flywire uses Tonic for de-identifying data across both relational and NoSQL connectors.

Delphix

At the time of this writing, Delphix does not appear to support NoSQL connectors of any type.

K2View

K2View supports a standard set of NoSQL connectors with their integration tool. Note that you must connect the data into their integration tool and change it into their proprietary data format before enabling any masking.

IBM Optim

IBM provides little publicly available information on what connectors they support. Based on website content from 2021, Optim did not appear to support Mongo, although IBM does have a SaaS offering of Mongo.

7. Ease of use and time to value

Whenever you buy any software, you want to get value out of it as quickly as possible.

Tonic

Thanks to Tonic’s Cloud offering, users can get started exploring the full platform at once, with a free trial. If you like what you see, you can convert your trial to a monthly pay-as-you-go plan for as little as $199/month, or connect with our team to discuss our professional and enterprise plans, including our self-hosted offering. Either way, you’ll get Tonic’s modern UI, which is continuously updated to provide users with a state-of-the-art no-code experience. Features like bulk applying suggested generators to de-identify your data enable you to run a generation in just a few clicks. Constant performance improvements and native database connectors enable you to run your data generation quickly and easily. Additionally, the platform’s file connector allows you to de-identify locally stored files without needing to set up any databases.

A summary of a case study with Tonic customer Hone: Hone accelerated regression testing from two weeks down to half a day, boosted their release cycle by 8x, and reduced their critical bugs from weekly to zero critical bugs in the nine months since deploying Tonic.

Delphix

To start using Delphix, you need to schedule a call with a sales person and then get set up on their platform. Having initially been built over a decade ago, Delphix’s outdated UI can be challenging to navigate.

K2View

K2View has a free trial and cloud offering with an onboarding flow. When it comes to deploying a self-hosted solution, users frequently encounter onboarding and usability issues, delaying their time to value. K2View started as a data integration tool, and most of their customers seem to be using it purely for the data integration use case; the masking tool is a recent addition. Customers can often take months or even more than a year to get up and running with the tool. And even then, the workflows for producing masked data can be complex enough that customers cannot refresh that data regularly.

IBM Optim

IBM Optim doesn’t offer a free trial without talking to a salesperson. IBM has a traditional sales organization that can get in the way of easy-to-start options, and their software typically requires longer implementation periods.

8. Performance

Effective test data management should accelerate engineering velocity, not slow down workflows by introducing bottlenecks in data provisioning. In order to be effective, TDM solutions must be built to meet the complexity and scale of today's data.

Tonic

Ensuring exceptional performance is one of the key driving factors behind Tonic’s native data source connectors. Tonic’s integrations have been built to take advantage of and account for each database's performance quirks and idiosyncrasies, to ensure that Tonic never slows down your data-dependent workflows. Dozens of Tonic’s enterprise customers process terabytes of data through Tonic, while accelerating their engineering velocity. To put a finer point on it, Tonic processes over 160 billion rows of data for customers per month. We also have customers that use Tonic as an ETL tool on top of the core data masking use cases, enabling them to move data around with Tonic and eliminate other ETL tools in their stack.

Delphix

Users have reported that multi-TB jobs can take days to run on Delphix. Challenges dealing with JSON and XML data, in particular, result in slower workflows. Some users have found they have to slow down their workflows in order to de-identify data outside of the platform.

K2View

Based on available research, K2View performance appears to be about average for data integration platforms. 

IBM Optim

Little information is available related to IBM Optim’s performance, but given the age of the platform we expect that it is not optimized for performance within modern infrastructure.

9. Generative AI capabilities

Generative AI is taking the world by storm and taking advantage of generative AI safely and securely is going to be important for all organizations going forward.

Tonic

As a data synthesis company, rather than solely being a data masking provider, Tonic.ai offers multiple solutions in the generative AI space to enable you to take advantage of the latest developments with LLMs and generative AI in a privacy-protecting way. The Tonic test data platform itself incorporates AI data synthesis among its data generators. Additionally, Tonic.ai offers two products adjacent to the Tonic test data platform and built specifically for LLM use cases; these products are Tonic Textual and Tonic Validate. Tonic Textual operates similarly to Tonic’s test data platform, except for unstructured, free-text data, rather than structured and semi-structured data. By redacting and synthesizing contextually relevant replacements for sensitive data found within free text, Tonic Textual enables you to train LLMs on your free-text data while ensuring compliance and keeping sensitive information safe. Tonic Validate, meanwhile, is a platform that streamlines evaluating and iterating on the performance of Retrieval-Augmented Generation (RAG) applications.

Delphix

At the time of this writing, Delphix has no offerings specific to the needs of generative AI tools and does not appear to have integrated AI within its TDM product.

K2View

At the time of this writing, K2View does not offer generative AI tooling or AI functionality within its TDM product.

IBM Optim

IBM is a leader in the AI space with Watson. However, the IBM Optim platform does not appear to offer features focused on enabling safe and secure generative AI usage.

Which is the best test data platform?

When comparing test data management platforms, Tonic’s technology stands out above the legacy TDM players thanks to its modern, up-to-date approach to generating and integrating quality test data into test and development environments. From the realism of its output data to the broad coverage of its native integrations to its overall ease of use and rapid time to value, Tonic provides a number of advantages over other test data platforms.

Pros of Tonic over Delphix

At a high level, customers choose Tonic over Delphix for its ease of use, better performance at scale, higher quality output data, and more advanced capabilities in supporting complex data types and subsetting. The Tonic platform is specifically built to scale with your data while handling complex data types and ensuring consistent data masking across databases. For a more detailed summary of Tonic vs Delphix, consider:

  • Ease of use: Tonic provides rapid time-to-value by virtue of its modern UI, fully accessible API, native connectors to the leading relational databases, data warehouses, and NoSQL data stores, and JSON and XML support, with no need to integrate data sources in advance. Deployment and implementation of Tonic is far easier than Delphix thanks to our use of Docker containers and Kubernetes for self-hosted instances, and to the availability of Tonic Cloud, for those who don’t want to manage a self-hosted instance. By comparison, Delphix’s UI is outdated, due to its origins dating back to over a decade ago, and it lacks many features out of the box, leading to cumbersome workarounds.
  • Better performance at scale: Delphix users have reported that the platform runs into significant performance issues when processing larger datasets. These performance issues suggest that Delphix is unable to process data concurrently in an effective way, which may be a result of burdensome de-identification rules under the hood. Tonic, meanwhile, is built to work at the scale of today’s data, in part because we’ve architected our technology to match the scale and performance of data warehouses like Snowflake and Databricks. Our customers regularly process PBs of data using complex de-identification configurations. Additionally, while Delphix does not allow you to work with all schemas in a single database unless you purchase a non-standard add-on, Tonic works natively with all user-defined schemas in your database at once, as standard practice.
  • Higher quality output data: Delphix lacks critical data masking capabilities like ensuring input-to-output consistency and is unable to process complex data types like Regex data, indicating a lack of focus on providing realistic data as the output of their platform. At Tonic, our focus has always been on making data useful, not just masked. We enable you to mimic the complexity of production, to achieve the same realism in your lower environments. We do this by balancing data privacy and data utility, to ensure that each is safeguarded within our approaches to data masking. Other features that elevate the quality of our output data include: input-to-output consistency across tables and databases; column linking; complex generators like JSON mask and regex; and schema change alerts.
  • Subsetting with referential integrity: Delphix does not offer an effective database subsetter out of the box, as their subsetter only works at the table level and does not maintain referential integrity. The Tonic platform includes a patented, industry-leading subsetter that preserves referential integrity, thanks to capabilities like consistency and our virtual foreign key tool, which allows you to indicate foreign keys that aren’t expressed in your database. Subsetting is a core focus of our platform and is essential to many of our customers' workflows.

Pros of Tonic over K2View

Here, too, customers choose Tonic over K2View thanks to Tonic’s ease of use and faster time to value. Price is often a factor as well, given that the K2View platform offers less flexibility in acquiring specific capabilities; their Data Product Platform is built to be purchased all-in-one. Tonic, on the other hand, offers several tiers, with capabilities tailored to those tiers so customers can get the features they need without paying for features they don’t need. For a more detailed summary of Tonic vs K2View, consider:

  • Ease of implementation: Tonic is frequently cited as being very easy to get up and running. The platform’s native data source connectors streamline the process of connecting directly to your data, so you can rapidly start working with your data without getting slowed down by implementation overhead. K2View, meanwhile, takes a different approach to working with your data, requiring extensive upfront configuration that can block implementations from ever taking off. Tonic also offers a modern UI, fully accessible API, and JSON and XML support, with no need to integrate data sources in advance.
  • Lower costs: With flexible product tiers and pricing sized to data in scope, Tonic’s self-hosted offering allows teams to get the features and value they need at a price that’s suited to the scale of their projects and organizations. Tonic Cloud, meanwhile, makes the platform available to virtually any team in need of quality data masking, with its low monthly rate and pay-as-you-go approach. While K2View does offer a cloud service, it comes in at a higher price point, and the all-in-one nature of its self-hosted platform often prices out teams that don’t have a bottomless budget.

Pros of Tonic over IBM Optim

In choosing Tonic over IBM Optim, customers get a modern UI with better ease of use and capabilities built for today’s workflows, stronger performance at scale, and comprehensive approaches to data masking that ensure higher quality output data. For a more detailed summary of Tonic vs IBM Optim, consider:

  • Ease of accessibility and implementation: As a much more recently developed player in the test data space, Tonic offers a number of advantages over IBM Optim in the realm of ease of use. Its modern, no-code UI was built with today’s developers, workflows, and data ecosystems in mind. Capabilities like native data source connectors, a fully accessible API, a pay-as-you-go cloud offering, and on-premises deployment via Docker containers or Kubernetes streamline Tonic’s accessibility and implementation. IBM Optim’s outdated data masking platform, having been developed over fifteen years ago, is not known for its ease of use or broad compatibility with other technologies outside of the IBM ecosystem. Additionally, no free trial offering is available without speaking with their sales team.
  • Streamlined performance at scale: As mentioned above, Tonic is built to work at the scale of today’s data because we’ve architected our technology to match the scale and performance of data warehouses like Snowflake and Databricks. Our customers regularly process PBs of data using complex de-identification configurations. In addition to enabling customers to rapidly de-identify full production databases, this large-scale performance extends to our patented subsetter, which works across tables and databases, traversing foreign keys to shrink full databases down to manageable datasets for use in developer environments. When it comes to subsetting, IBM Optim only allows you to subset at a file or column level and does not appear to offer options to provide high-quality, referentially intact subsets across full databases. What’s more, given the platform’s age, we expect that it is not optimized for performance in modern data infrastructures.
  • Higher fidelity output data: Tonic’s focus on providing engineers with high-quality data that is compliant and safe to use across pre-production environments is evident throughout the platform’s capabilities. Our comprehensive library of data generators, which can be made consistent and can be linked to preserve relationships within your data, and our extensive output destination options combine to provide rapid access to quality data for all testing and development use cases. By comparison, IBM Optim only offers a small number of functions with which to mask your data, and these limited functions lack the ability to maintain consistency and relationships across your data. IBM Optim’s output data suffers in quality as a result.

The above summarizes what we’ve heard and seen in the field in working with QA engineers and developers over the past five years. To put a finer point on it, we’ll pass the mic over to Tonic’s customers to describe the platform and the results they’ve achieved in their own words.

What do real customers say?

This article features several customer reviews of the Tonic test data platform in the in-depth test data management feature comparison section above. To read the complete stories behind these quotes, see our Customers page in which you’ll find case studies from a range of organizations and industries, from enterprises to startups, and from healthtech to fintech to e-commerce sites.

Among the stories worth highlighting is eBay’s use of Tonic to hydrate their staging environments with realistic, compliant test data. On eBay's technical blog, eBay VP of Engineering and Technical Fellow Senthil Padmanabhan details the challenges they previously faced in maintaining quality staging environments and how Tonic plays a key role in ensuring those environments are kept up to date and in sync with realistic test data generated from their production data. In particular, they rely on Tonic’s subsetter to shrink their 8 PBs of production data down to 1 GB datasets for use by their developers across many targeted use cases. To safeguard their customers’ privacy, they make use of Tonic’s format-preserving encryption capabilities, which allow them to generate highly realistic data that looks, feels, and behaves just like their production data.

In Padmanabhan’s words, from our case study: “Tonic has an intuitive, powerful platform for generating realistic, safe data for development and testing. Tonic has helped eBay streamline the very challenging problem of representing the complexities contained within Petabytes of data distributed across many environments.”

Other noteworthy testimonials include:

  • Fintech company Paytient, which uses Tonic to streamline their engineering workflows around the world to achieve 3.7x ROI: “We're globally distributed, so there are limits on who can access data, what data they can access, and where data is transferred and stored. Having Tonic generate production-like data for us has simplified the compliance restrictions while allowing us to leverage our global workforce… Tonic's Cloud product easily saved our Engineering team hundreds of hours of development time over several months”
  • Professional training platform Hone, which accelerated their regression testing by 10x thanks to faster test data provisioning: “Tonic drastically reduces the amount of time it takes for a full regression test for all of our core features. Before it was somewhere within a two-week time span for QA to get the data set up; now they are ready to go and have tested all of the core features manually within a half a day.”
  • Healthtech leader Everlywell, which increased their number of daily releases by 3x thanks to the increased quality of their test data: “With Tonic, we’ve shortened our build process from 60 minutes down to 20. Their subsetting and de-identification tools are a critical part of Everlywell’s development cycle, making it easy for us to get data down to a useful size and giving me confidence it’s protected throughout.”

In addition to our case studies, reviews of the Tonic test data platform can be found on G2.

Explore Tonic’s features live 

To get started exploring Tonic’s features for test data management, sign up for a sandbox to take the platform for a spin using a sample dataset or by connecting directly to your own data, whether by uploading a file or connecting to a database. As you get familiar with the platform, our product docs are an excellent resource for finding information on the specific capabilities you need. At any time, you can also connect directly with our team to get a tailored live demo and all your questions answered.

Build better and faster with quality test data today.
Unblock data access, turbocharge development, and respect data privacy as a human right.
Yuri Shadunsky
Senior Product Manager
