Using Data Upsert to Optimize Test Data Management

Shannon Thompson
February 5, 2024

    Generating new test data is great and all, but what happens when you’ve already got a test dataset that you want to preserve in your environment and add new data to? We recently added upsert capabilities in our test data platform to enable you to do just that. In this article, we’ll explore the whats, hows, and whys of upserting data for optimized test data management, providing detailed insights into three specific use cases where upsert proves to be indispensable for fast-moving development teams. 

    Understanding Upsert

    Upsert, a portmanteau of "update" and "insert," is a standard industry process that empowers you to insert new data and update existing rows within a database while preserving the integrity of all other existing records. In Tonic, this functionality proves invaluable in scenarios where maintaining specific test data is essential, as it eliminates the need to replace or overwrite existing data every time you generate a new dataset.
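    To make the idea concrete, here is a minimal sketch of a generic SQL upsert, using SQLite's `ON CONFLICT ... DO UPDATE` clause through Python's built-in `sqlite3` module. It is not a description of how Tonic implements upsert internally; the `users` table, its columns, and the `upsert_users` helper are hypothetical names chosen for illustration.

```python
import sqlite3

def upsert_users(conn, rows):
    """Insert new rows; update rows whose primary key already exists."""
    conn.executemany(
        """
        INSERT INTO users (id, name, email) VALUES (?, ?, ?)
        ON CONFLICT(id) DO UPDATE SET
            name = excluded.name,
            email = excluded.email
        """,
        rows,
    )
    conn.commit()

# Hypothetical destination database that already holds test data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
    [
        (1, "Fixture Alice", "alice@test.example"),  # not in the incoming data
        (2, "Old Bob", "bob@old.example"),           # will be updated
    ],
)

# Freshly generated rows: id 2 already exists, id 3 is new.
incoming = [
    (2, "New Bob", "bob@new.example"),
    (3, "Carol", "carol@new.example"),
]
upsert_users(conn, incoming)

rows = conn.execute("SELECT id, name, email FROM users ORDER BY id").fetchall()
for row in rows:
    print(row)
# (1, 'Fixture Alice', 'alice@test.example')
# (2, 'New Bob', 'bob@new.example')
# (3, 'Carol', 'carol@new.example')
```

    Row 1 is untouched, row 2 is updated in place, and row 3 is inserted, which is the behavior the definition above describes.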

    A gif representing an upsert workflow in which data in an existing database is updated with freshly de-identified data.

    Curious to understand how it works on the backend and in the Tonic UI? Check out this video in which we explain the process step by step:

    Many teams face the challenge of managing test data from multiple sources within the same database. Here are three common kinds of records you may want to keep intact when generating data: 1. older, pre-existing test data from sources other than Tonic; 2. data from different subsets; and 3. mock data related to an unreleased feature. In each case, by enabling upsert, you can ensure that existing records remain intact, even as you introduce new data or updates into the database. 

    1. Preserving Existing Test Data

    Many organizations store essential data, such as test fixture data, in their test environments, and they want to retain this data while generating new test data. For example, some teams may opt to continue to utilize older test data because they have tests that rely on it or use it for demo purposes. Upsert is particularly useful in these scenarios, as it allows you to inject fresh data into the database without erasing the existing information. This flexibility ensures that everyone has access to the data they need, even when data generation responsibilities are distributed across different teams.

    2. Combining Data from Multiple Subsets

    In complex projects, different teams or entities may need to work with specific subsets of data. Each team requires only a portion of the data, but it is essential that all this data resides in the same test environments for a unified view. Upsert can facilitate this by allowing teams to run multiple subsets of data generation processes and merge the results into a single database. This ensures that each team has access to the data it needs, while centralizing the data into a single database.
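    As a rough illustration of merging subsets, the sketch below runs two hypothetical subset outputs into one SQLite database, one after the other. The `customers` table, the region values, and the `merge_subset` helper are assumptions made for the example, and the overlap between the two subsets stands in for records that more than one team needs.

```python
import sqlite3

def merge_subset(conn, rows):
    """Upsert one subset's rows into the shared customers table."""
    conn.executemany(
        """
        INSERT INTO customers (id, name, region) VALUES (?, ?, ?)
        ON CONFLICT(id) DO UPDATE SET
            name = excluded.name,
            region = excluded.region
        """,
        rows,
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT)"
)

# Hypothetical output of two separate subset runs; customer 2 appears in both.
subset_a = [(1, "Ana", "east"), (2, "Ben", "east")]
subset_b = [(2, "Ben", "east"), (3, "Cy", "west")]

merge_subset(conn, subset_a)
merge_subset(conn, subset_b)  # does not wipe what subset A wrote

customers = conn.execute(
    "SELECT id, name, region FROM customers ORDER BY id"
).fetchall()
print(customers)
# [(1, 'Ana', 'east'), (2, 'Ben', 'east'), (3, 'Cy', 'west')]
```

    Because each run only inserts or updates the rows it carries, the order of the runs does not remove rows written by the other subset. A plain overwrite of the table on each run would leave only the last subset behind.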

    3. Incorporating Mock Data for Unreleased Features

    When developing software for unreleased features, it’s common for teams to create mock data for their testing needs. In cases like these, the schema of the staging database is ahead of the production database. Upsert can be employed to preserve the mock data for these unreleased features and ensure that it isn’t overwritten by newly de-identified production data that Tonic generates. For more complex schema changes, Tonic can even facilitate the schema change migration between the source and staging databases. 
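    The sketch below shows one way this can play out in plain SQL, under stated assumptions: the staging table has a hypothetical `beta_tier` column that production does not have yet, plus a mock-only row. The table, column, and helper names are invented for illustration, and the code does not represent Tonic's own handling of schema differences.

```python
import sqlite3

def upsert_from_production(conn, rows):
    """Upsert only the columns that exist in the (older) production schema."""
    conn.executemany(
        """
        INSERT INTO accounts (id, email) VALUES (?, ?)
        ON CONFLICT(id) DO UPDATE SET email = excluded.email
        """,
        rows,
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
# Staging schema is ahead of production: beta_tier belongs to an unreleased feature.
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT, beta_tier TEXT)"
)
conn.executemany(
    "INSERT INTO accounts (id, email, beta_tier) VALUES (?, ?, ?)",
    [
        (1, "old1@staging.example", "gold"),       # existing row with mock value
        (900, "mock@staging.example", "platinum"),  # mock-only row
    ],
)

# De-identified production rows know nothing about beta_tier.
incoming = [(1, "fresh1@example.test"), (2, "fresh2@example.test")]
upsert_from_production(conn, incoming)

accounts = conn.execute(
    "SELECT id, email, beta_tier FROM accounts ORDER BY id"
).fetchall()
for row in accounts:
    print(row)
# (1, 'fresh1@example.test', 'gold')
# (2, 'fresh2@example.test', None)
# (900, 'mock@staging.example', 'platinum')
```

    The mock value on row 1 and the mock-only row 900 both survive, because the upsert statement never names the `beta_tier` column and never deletes rows.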

    The Takeaway

    In summary, upsert is a powerful feature that simplifies test data management across your team’s environments. Whether you need to combine data from multiple subsets, incorporate external test data, or insert mock data for unreleased features, upsert is an essential tool for efficient database management. By understanding and leveraging upsert, you can streamline your data operations and enhance the efficiency of your software development and testing processes. Questions about your specific test data use case? Connect with our team to learn more.

    Shannon Thompson
    Senior Product Manager
    Shannon is a product manager at
