A bilingual wordsmith dedicated to the art of engineering with words, Chiara has over a decade of experience supporting corporate communications at multinational companies. She once translated for the Pope; it has more overlap with translating for developers than you might think.
The Db2 family of products developed by IBM is one of the oldest lines of data management and relational database management system (RDBMS) products on the market, with development on IBM's first relational database beginning in the 1970s. Db2 began as a strictly relational system but has added object-relational and non-relational features over the years, such as XML, graph store, and JSON. In addition, it now offers AI capabilities and support for multi-cloud environments.
Db2 comes in three distinct flavors, each running on a different operating system:
Key features of Db2 include in-depth querying, accelerated hybrid transaction analytical processing (HTAP) performance, Oracle SQL compatibility, AI and ML capabilities, and actionable compression. Though it is one of the oldest RDBMSs on the market, it still ranks fifth on db-engines.com among relational databases, and seventh overall.
Companies using Db2 frequently need to create mock data that mimics production data for the purposes of QA, testing, and development. Creating homegrown de-identified data using scripts might seem like an inexpensive way to get the job done, but it poses a number of challenges, particularly when working with Db2.
First and foremost, hashing test data in-house is extremely time-consuming. Depending on the size of the database, it can involve weeks or months of monotonous work. And once the work is done...it isn't done. Data is a living organism, so it needs to be refreshed constantly. If not, you run the risk of missing bugs during testing, or presenting inaccurate software performance results to your executive team.
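To make the homegrown approach concrete, here is a minimal sketch of what script-based hashing often looks like, written in Python. It is an illustration under stated assumptions, not a recommended implementation: the key, the function names (`pseudonymize`, `mask_rows`), and the row format (a list of dictionaries) are all hypothetical, and a real script would also need to handle reading from and writing to Db2.

```python
import hashlib
import hmac

# Hypothetical secret for illustration only; a real script would load
# this from a secrets manager rather than hard-coding it.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonymize(value, key=SECRET_KEY, length=16):
    """Return a deterministic, keyed hash token for a sensitive value."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncate so the token is easier to fit into fixed-width columns.
    return digest[:length]


def mask_rows(rows, sensitive_columns):
    """Copy each row, replacing the listed columns with hashed tokens."""
    masked = []
    for row in rows:
        clean = dict(row)  # leave the original row untouched
        for col in sensitive_columns:
            if clean.get(col) is not None:
                clean[col] = pseudonymize(str(clean[col]))
        masked.append(clean)
    return masked
```

Because the same input always produces the same token, joins on a hashed column keep working. Even so, every new table, column, and data refresh means revisiting a script like this by hand, which is where the time goes.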
In addition, if you're running Db2 on the mainframe, your data types may not have equivalents in Db2 for Linux, UNIX, and Windows (LUW). This means you may need to convert the data in order to format it properly. You also cannot register an object in test data management (TDM), which introduces more complexity to the test data generation process.
Another challenge developers continually face is the difficulty of creating test data that accurately mimics production data. This is a huge ask of anyone attempting to anonymize data manually, because there are so many nuances, data links, and data types to identify and replicate. It is especially difficult in databases like Db2, which allow both structured and unstructured data. While it is a gargantuan task for a single developer, it is absolutely necessary in order to render synthetic data that will perform the same as your production data.
The most important driving factor for teams attempting to generate test data based on their Db2 databases is data privacy. Creating homegrown test data creates opportunities for personally identifiable information (PII) to leak into your test data. With unstructured data, strings can contain pieces of PII even if the field in which they are contained seems innocuous. Without a method for de-identifying personal info inside text strings or embedded tables, the risk of exposing data to groups that shouldn't have access is high.
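As a rough illustration of the problem, the Python sketch below scans free-text values for a few common PII shapes and redacts them. The patterns, labels, and function names (`find_pii`, `redact`) are assumptions chosen for this example; they cover only simple US-style formats and would miss names, addresses, and many other kinds of PII.

```python
import re

# Illustrative patterns only; real PII takes many more forms than these.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def find_pii(text):
    """Return (label, matched_text) pairs for every pattern hit in text."""
    hits = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return hits


def redact(text):
    """Replace every pattern hit with a bracketed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub("[" + label.upper() + "]", text)
    return text


note = "Customer jane.doe@example.com called from 555-867-5309 about SSN 123-45-6789."
print(find_pii(note))
print(redact(note))
```

A support note like the one above could sit in a column named something as harmless as `comments`, which is why scanning by column name alone is not enough.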
You can search and sort through thousands of records to identify potential PII by hand or using a script. Or you can hook your Db2 database up to Tonic.
Tonic eliminates the difficulties involved in creating test data in Db2 by sitting between your production database and lower environments to safely de-identify and generate your test data. Creating synthetic data in Db2 is simple with Tonic because it automates the most challenging tasks of the process for you:
Ready to equip your teams with a faster, safer way to generate test data for Db2? Get in touch with our team; we’re excited to show you what we’ve built.