How To Create Realistic Test Data For MySQL

Abigail Sims
December 19, 2022

    Database systems have come a long way from their origins. Since the first relational database systems of the 1970s, we’ve seen big jumps in what they can do and how accessible they are, to the point that some are designed for personal use rather than as huge, hulking systems for enterprises.

    And that’s precisely how MySQL was born.

    Two Swedes and a Finn came together to create a new database system, not to beat the big guys but to create something for personal use. They began working on MySQL, a relational database management system (RDBMS) inspired by mSQL. They felt that mSQL was too slow and rigid, so they created a system that would be faster and far more flexible.

    MySQL has gone on to become one of the most popular RDBMS solutions in the world, with a reported 38.9% share among database options. Why? Because it’s just so darn accessible. Not only is it free and open source, but it also comes with many of the features you’d expect from enterprise software.

    However, with great power comes great responsibility. (Cue Spider-Man finger guns.) How does MySQL deal with managing data for testing and analysis purposes?

    Struggling to Use MySQL With Test Data?

    Nothing in life is perfect except maybe the McRib and that feeling when you cross off your last to-do list item. MySQL is no different, especially when it comes to using data for testing and analysis.

    MySQL Doesn’t Scale With Big Data

    Since MySQL was primarily designed for personal use, it was never intended to work with massive data sets. So if you’re working with giant data sets, you will feel it. A typical MySQL deployment handles writes on a single node, so scaling to larger data sets usually relies on sharding, which means you’ll need to come up with your own solution to get it done.
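
    As a rough illustration of what a custom sharding layer might start from, here is a minimal sketch of hash-based routing in Python. The shard names and the `shard_for` helper are hypothetical, and nothing here is provided by MySQL or Tonic; it only shows the idea of mapping a key to one of several servers.

```python
import hashlib

# Hypothetical shard names; in practice each would map to a separate MySQL server.
SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2", "mysql-shard-3"]


def shard_for(customer_id: int, shards: list[str] = SHARDS) -> str:
    """Pick a shard deterministically from a hash of the key."""
    digest = hashlib.sha256(str(customer_id).encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    bucket = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[bucket]


if __name__ == "__main__":
    for cid in (1, 2, 42, 1001):
        print(cid, "->", shard_for(cid))
```

    The same id always lands on the same shard, but changing the number of shards remaps most keys, which is one reason a home-grown approach gets complicated quickly.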

    Making and Retaining Links Is Difficult

    Your data is already linked within your databases, so your test data should be too. However, people who build their own test data often forget those relationships. Without them, your test data doesn’t give you an accurate representation of your production database. And how are you going to test with an inaccurate model?
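
    To make the point concrete, here is a small sketch of hand-rolled test data that keeps its links intact. The `customers` and `orders` tables and their columns are made up for illustration: child rows only ever reference parent ids that exist, and a helper reports any orphans.

```python
import random


def make_linked_rows(n_customers: int, n_orders: int, seed: int = 0):
    """Generate customers first, then orders that only reference existing customer ids."""
    rng = random.Random(seed)
    customers = [{"id": i, "name": f"customer_{i}"} for i in range(1, n_customers + 1)]
    customer_ids = [c["id"] for c in customers]
    orders = [
        {
            "id": j,
            "customer_id": rng.choice(customer_ids),  # foreign key into customers
            "total_cents": rng.randint(100, 50_000),
        }
        for j in range(1, n_orders + 1)
    ]
    return customers, orders


def orphaned_orders(customers, orders):
    """Return orders whose customer_id does not match any customer row."""
    known = {c["id"] for c in customers}
    return [o for o in orders if o["customer_id"] not in known]


if __name__ == "__main__":
    customers, orders = make_linked_rows(n_customers=5, n_orders=20)
    print("orphans:", len(orphaned_orders(customers, orders)))
```

    Generating parents before children is the simple part; real schemas with many tables and circular references are where doing this by hand breaks down.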

    Keeping Sensitive Information Out of Test Data Is Tough

    You can create test data from an export of your production data in various ways, but not all of them are equally safe. If your tables store XML, JSON, or blobs, there is a genuine risk that sensitive values nested inside those columns get exported too, leaving that data exposed.
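
    As an illustration of why nested data is easy to miss, the sketch below walks a JSON document recursively and redacts values under sensitive keys. The key list and function names are assumptions for the example, not Tonic’s approach, and matching on key names alone will not catch PII hidden in free-text values.

```python
import json

# Assumed list of sensitive key names; a real schema would need its own list.
SENSITIVE_KEYS = {"email", "ssn", "phone"}


def redact(value):
    """Walk dicts and lists recursively, replacing values stored under sensitive keys."""
    if isinstance(value, dict):
        return {
            key: "REDACTED" if key.lower() in SENSITIVE_KEYS else redact(inner)
            for key, inner in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value  # scalars are returned unchanged


def redact_json_column(raw: str) -> str:
    """Parse the text of a JSON column, redact it, and serialize it back."""
    return json.dumps(redact(json.loads(raw)))


if __name__ == "__main__":
    row = '{"id": 1, "profile": {"email": "a@example.com", "contacts": [{"phone": "555-0100"}]}}'
    print(redact_json_column(row))
```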

    Using Synthetic Data For MySQL

    So how can we solve these problems?

    You could spend your precious time manually querying your data, scrambling it, and hoping that you didn’t pull too much and that you got rid of all the PII. But that can be incredibly time-consuming, and in the end, you still risk your data being reverse-engineered.

    Instead, you need a tool that can both maintain the integrity of your data’s structure and handle the scale of what you need. No two databases are alike, so why pick solutions that don’t adapt to your needs?

    How Tonic Can Help

    With Tonic, we’ve got your back, and we’ve got MySQL support baked in. With our generators, your test data will act and look exactly like production data but with none of the hassle. Plus, it’s mathematically guaranteed to be impossible to reverse.

    Oh, and those massive data sets? We’ve got you covered with data subsetting, so you can pull exactly what you need. No more, no less, no fuss, no mess.
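
    For a feel of what subsetting involves, here is a toy sketch that keeps a chosen set of orders plus the customer rows they reference, so the subset stays consistent. The tables and the `subset_with_parents` helper are hypothetical, and this is not a description of how Tonic’s subsetting works internally.

```python
def subset_with_parents(customers, orders, order_ids):
    """Keep the chosen orders plus every customer row they reference."""
    wanted = set(order_ids)
    kept_orders = [o for o in orders if o["id"] in wanted]
    needed_customers = {o["customer_id"] for o in kept_orders}
    kept_customers = [c for c in customers if c["id"] in needed_customers]
    return kept_customers, kept_orders


if __name__ == "__main__":
    customers = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
    orders = [
        {"id": 10, "customer_id": 1},
        {"id": 11, "customer_id": 3},
        {"id": 12, "customer_id": 3},
    ]
    kept_customers, kept_orders = subset_with_parents(customers, orders, [11, 12])
    print(kept_customers, kept_orders)
```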

    Using Tonic With MySQL and MariaDB

    Let’s take a look at how to create realistic test data for MySQL. Thanks to Tonic, getting your environment up and running is a breeze. Not only do we support MySQL out of the box, but we support MariaDB, too, for those who aren’t inclined to use Oracle products.

    Getting It Set Up

    Naturally, we need to set things up before making that delicious test data. To do so, we’ll need to create a source database user named “tonic” with access to schema information and to the tables in your preferred databases. Then, your destination database will also need a “tonic” user with all privileges.
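
    The user setup might look roughly like the statements this sketch prints. The `CREATE USER` and `GRANT` syntax is standard MySQL, but the specific privileges chosen for the source user (`SELECT` and `SHOW VIEW`), the `'%'` host, and the database names are assumptions for illustration; check Tonic’s documentation for the exact privileges it requires.

```python
import re


def _check_name(name: str) -> str:
    """Allow only simple identifiers so the generated SQL stays well-formed."""
    if not re.fullmatch(r"[A-Za-z0-9_]+", name):
        raise ValueError(f"unsupported database name: {name!r}")
    return name


def source_statements(source_db: str, host: str = "%") -> list[str]:
    """Statements to run on the source server (assumed read-only privileges)."""
    db = _check_name(source_db)
    user = f"'tonic'@'{host}'"
    return [
        f"CREATE USER IF NOT EXISTS {user} IDENTIFIED BY '<password>';",
        f"GRANT SELECT, SHOW VIEW ON `{db}`.* TO {user};",
    ]


def destination_statements(dest_db: str, host: str = "%") -> list[str]:
    """Statements to run on the destination server (all privileges on one database)."""
    db = _check_name(dest_db)
    user = f"'tonic'@'{host}'"
    return [
        f"CREATE USER IF NOT EXISTS {user} IDENTIFIED BY '<password>';",
        f"GRANT ALL PRIVILEGES ON `{db}`.* TO {user};",
    ]


if __name__ == "__main__":
    # "shop" and "shop_test" are placeholder database names.
    for stmt in source_statements("shop") + destination_statements("shop_test"):
        print(stmt)
```

    Replace `<password>` with a real secret before running the statements, and narrow the host from `'%'` if Tonic connects from a known address.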

    Configuring Your Workspace

    Once that’s set up, you’ll need to configure your workspace. In Tonic, specify the server and databases you want to connect to, as well as user authentication and networking info. By default, Tonic doesn’t block data generation as long as any schema changes don’t conflict with your workspace. If you need to change this, there is a setting in Tonic itself.

    How To Create Realistic Test Data For MySQL

    Once you have the information for your source and destination databases, you are free to generate data to your heart’s content! All of Tonic’s generators work with MySQL, so you can use them from the get-go. You also have access to data subsetting, so no workload is too big for your team.

    Yeah. It really is that easy.

    Need more help? Reach out to our team, or book a demo today to get a complete tour of Tonic!

    Abigail Sims
    As a reformed writer now deep in the marketing machine, Abigail can (and will) create narrative-driven content for any technical vertical. With five years of experience telling brand stories for tech startups and small businesses, she thrives at the intersection of complex data and creative communication.