The Role of Ephemeral Environments in QA

Author
Trey Briccetti
May 24, 2024

Let’s not beat around the bush: if you’re reading this, I don’t need to explain to you why testing is important for assuring the quality of your software. It goes without saying that testing your software is one of the single most important things you can do, other than writing it in the first place. 

Our mission at Tonic.ai has always been to make access to high-quality test data safer and easier than ever before. Since the launch of Tonic Ephemeral, we’ve also taken our expertise directly to the CI/CD pipeline, where our goal is to make using that test data as smooth as possible by spinning test databases up and down in the blink of an eye. These databases are a vital part of what we call ephemeral testing environments.

Defining ephemeral environments in QA

Ephemeral testing environments might sound familiar. Maybe you already know exactly what they are, or maybe you don’t; either way, you’re probably already using them. An ephemeral environment is just a short-lived place to run your software, where you can execute tests automatically, manually, or both, and hopefully catch any bugs you wrote before they make it to production. The key thing to keep in mind is that this environment, which may consist of any number of data sources, services, files, or pictures of cats wearing sunglasses, is spun up on demand and then promptly discarded when it is no longer needed. GitHub Actions is a fine example of a technology that leverages ephemeral environments. Every time you push or merge your code, GitHub creates an environment where it installs some dependencies, runs the code from your actions, and then gives you a shiny green checkmark before destroying the environment.

Ephemeral environments help us keep our software tests running in a controlled setting, where the only variable is the code we changed since the last run. In that regard, ephemeral environments let us test software in a manner that adheres more closely to the scientific method.

Integrating ephemeral environments into your QA strategy

There are plenty of different ways to set up ephemeral environments for your CI/CD pipelines; you and the rest of the citizens of the twenty-first century have already figured that out. There are not, however, many ways to set up a database full of safe, realistic test data in a matter of seconds every time you need to run your tests. That’s where Tonic Ephemeral comes in: import and snapshot your perfect set of test data, and then spin up independent, ephemeral databases from that snapshot at a moment’s notice with just a few API calls. Now, every ephemeral test environment can have its own pre-populated database ready to go when the time comes.

Advantages of ephemeral databases for testing

You might be asking yourself, “If I want to test my application every time I make a change, why don’t I just keep a static, long-lived environment around that I can run my code through whenever I need to?” Excellent question, I’m glad you asked. One reason that is not a good approach is that your application is rarely going to leave your environment the way it found it, whether it’s leaving behind a new record in a database table, some random log files written to disk, or something even worse, like a zombie process that eats up a little bit of your CPU time. Do you really want to spend your time writing code to reset your environment to its pre-test state every time? Of course not! Even if you had all of that code written, you would still have to run it after every pass through the testing pipeline.

Boosting efficiency with on-demand data

The better solution is to make a carbon copy of exactly the environment you need for testing, create it every time you test, and then throw it out when you’re done! Ephemeral is designed around a “configure once” approach to creating test databases: you specify what data you want in your test databases one time, and then every database environment you make for your tests has that data ready for your application to use. Even better, we store database data in its native format, which means you don’t have to wait for each database to import your data. Instead, databases come online with all of it baked in.
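To make the “configure once, copy every time” idea concrete, here is a minimal sketch that uses SQLite as a stand-in. It is not how Ephemeral works internally; the file layout, the `users` table, and the function names are all assumptions made for illustration. The point it shows is that each test run gets a file-level copy of a seed database and throws that copy away afterwards, so nothing one run writes can leak into the next.

```python
import shutil
import sqlite3
import tempfile
from contextlib import contextmanager
from pathlib import Path


def build_snapshot(path: Path) -> None:
    """Configure once: write the seed data to a file that acts as the snapshot."""
    conn = sqlite3.connect(path)
    try:
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("grace",)])
        conn.commit()
    finally:
        conn.close()


@contextmanager
def ephemeral_database(snapshot: Path):
    """Yield a connection to a throwaway copy of the snapshot, then discard it."""
    workdir = Path(tempfile.mkdtemp(prefix="ephemeral-db-"))
    conn = None
    try:
        copy = workdir / "test.db"
        shutil.copyfile(snapshot, copy)  # a file copy, not a row-by-row import
        conn = sqlite3.connect(copy)
        yield conn
    finally:
        if conn is not None:
            conn.close()
        shutil.rmtree(workdir, ignore_errors=True)  # the snapshot itself is never touched


def count_users(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        snapshot = Path(tmp) / "snapshot.db"
        build_snapshot(snapshot)
        with ephemeral_database(snapshot) as conn:
            conn.execute("INSERT INTO users (name) VALUES ('leftover')")
            conn.commit()
            print("run 1 sees", count_users(conn), "users")
        with ephemeral_database(snapshot) as conn:
            print("run 2 sees", count_users(conn), "users")
```

The first run sees three users because it inserted one; the second run starts from the untouched snapshot and sees two again, with no reset code involved.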

Cutting costs with temporary databases

By keeping your databases active only for as long as your test environment needs them, you avoid wasting time undoing changes or resetting some long-running database, and you don’t spend a dime keeping an idle database up while you’re not running tests. Ditch expensive RDS instances running 24/7, and use Tonic Ephemeral to run your test databases only for the time you need them.

Ephemeral databases and CI/CD integration

If there’s one thing I love more than spending 5 minutes completing a task, it’s spending 5 hours trying to automate it. Luckily, if you want to leverage the power of automating test databases for your CI/CD pipeline, Ephemeral can help you do that in far less time.

Automating database setups in Continuous Integration pipelines

First things first, you’ll need to set up a database snapshot. A snapshot is essentially an immutable copy of the data you want in your databases. To create one, either run a Tonic Structural generation directly into Ephemeral, or import data manually using the import data workflow in the UI. Next, create an API key on the user settings page and keep it somewhere safe. Now you can configure your CI/CD pipeline to run a step, before your application is set up, that calls the Ephemeral API and requests a new database created from that snapshot. Once the database has been created, configure your application’s database connection with the credentials you received from the Ephemeral API.
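The last step, handing the returned credentials to your application, might look something like the sketch below. The `DatabaseCredentials` fields, the `APP_DB` variable prefix, and the PostgreSQL-style URL are assumptions for illustration; the actual response shape is defined by the Ephemeral API documentation, and your application may expect different variable names.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class DatabaseCredentials:
    """Hypothetical shape of the connection details returned for a new database."""
    host: str
    port: int
    user: str
    password: str
    database: str


def connection_env(creds: DatabaseCredentials, prefix: str = "APP_DB") -> dict[str, str]:
    """Translate credentials into environment variables for the application under test."""
    # Percent-encode user and password so characters like '@' or '/' do not break the URL.
    user = quote(creds.user, safe="")
    password = quote(creds.password, safe="")
    url = f"postgresql://{user}:{password}@{creds.host}:{creds.port}/{creds.database}"
    return {
        f"{prefix}_HOST": creds.host,
        f"{prefix}_PORT": str(creds.port),
        f"{prefix}_USER": creds.user,
        f"{prefix}_PASSWORD": creds.password,
        f"{prefix}_NAME": creds.database,
        f"{prefix}_URL": url,
    }
```

In a pipeline step you could merge the result into the process environment before launching the test runner, for example `subprocess.run(["pytest"], env={**os.environ, **connection_env(creds)})`. Keeping the API key itself in your CI system's secret store, rather than in the repository, fits the advice above to keep it somewhere safe.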

Seamless teardowns for clean testing environments

Later on in your CI/CD pipeline, you can add another step that calls the Ephemeral API once more to delete the database you created for your test environment. If you’d prefer not to call the API again, there are a few other clever ways we enable developers to dispose of their databases:

  1. Inactivity: Ephemeral automatically cleans up the database after it has stopped receiving connections for a configured period of time.
  2. Timed expiration: The database is deactivated once a set amount of time has elapsed and it has no active connections.
  3. Business hours: Specify a schedule during which the database should be active; at all other hours, it is automatically deactivated.
A screenshot of Tonic Ephemeral's UI for setting database expiration timers.
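The three disposal rules above can be read as simple predicates over a database's state. The sketch below is one interpretation of those descriptions, not Ephemeral's implementation: the `DatabaseState` fields, the function names, and the Monday-to-Friday default are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta


@dataclass(frozen=True)
class DatabaseState:
    """Hypothetical bookkeeping for one test database."""
    created_at: datetime
    last_connection_at: datetime
    active_connections: int


def expired_by_inactivity(state: DatabaseState, now: datetime, idle_limit: timedelta) -> bool:
    """Rule 1: nobody is connected and the last connection is older than the limit."""
    return state.active_connections == 0 and now - state.last_connection_at >= idle_limit


def expired_by_timer(state: DatabaseState, now: datetime, lifetime: timedelta) -> bool:
    """Rule 2: the lifetime has elapsed and there are no active connections."""
    return now - state.created_at >= lifetime and state.active_connections == 0


def outside_business_hours(
    now: datetime,
    opens: time,
    closes: time,
    workdays: frozenset = frozenset(range(5)),  # Monday=0 .. Friday=4
) -> bool:
    """Rule 3: the current moment falls outside the configured schedule."""
    return now.weekday() not in workdays or not (opens <= now.time() < closes)
```

Note how the timed-expiration rule, as described, waits for connections to drain, so a long-running test that is still connected would not have its database pulled out from under it.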

Enhancing test accuracy with database subsets

Using data subsets for targeted testing

If you’re thinking about using de-identified data from Tonic Structural, but your source database is incredibly large, consider using Structural’s subsetting feature to shrink your database snapshot so that it includes only the rows you need for testing. There’s plenty of information available in our docs on how to configure subsetting to target the right data. The most important thing to keep in mind is that you want to maintain a balance between database size and code coverage. Having a billion rows in your test database is unlikely to add much more code coverage than a million, but it will take a toll on the speed of your database queries and tests. Conversely, subsetting down to just 10 rows is unlikely to test much at all, even if your queries do end up being much faster.

Ensuring data integrity with optimized subsets

One of the most important facets of subsetting is that your data must remain referentially intact: every foreign key in the subset has to point to a row that is also in the subset. This is mandatory to ensure that you don’t end up with a database full of broken references and missing data. For more information on how we do that (and why it’s so cool), check out our webinar, in which we take a deep dive into cross-database subsetting and the engineering behind it.
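A toy version of the referential-integrity requirement, using SQLite and a two-table schema, is sketched below. This is not Structural's subsetting algorithm; the `customers` and `orders` tables and the function names are made up for illustration. The sketch takes a slice of a child table, pulls in exactly the parent rows that slice references, and then uses SQLite's `PRAGMA foreign_key_check` to confirm that no reference is left dangling.

```python
import sqlite3

SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    total REAL NOT NULL
);
"""


def build_source() -> sqlite3.Connection:
    """Create a small 'production-like' source: 5 customers, 20 orders."""
    src = sqlite3.connect(":memory:")
    src.executescript(SCHEMA)
    src.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, f"customer-{i}") for i in range(1, 6)],
    )
    src.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(i, (i % 5) + 1, 10.0 * i) for i in range(1, 21)],
    )
    src.commit()
    return src


def subset(src: sqlite3.Connection, order_limit: int) -> sqlite3.Connection:
    """Copy the first `order_limit` orders plus every customer they reference."""
    dst = sqlite3.connect(":memory:")
    dst.executescript(SCHEMA)
    orders = src.execute(
        "SELECT id, customer_id, total FROM orders ORDER BY id LIMIT ?",
        (order_limit,),
    ).fetchall()
    customer_ids = sorted({row[1] for row in orders})
    if customer_ids:
        placeholders = ",".join("?" * len(customer_ids))
        customers = src.execute(
            f"SELECT id, name FROM customers WHERE id IN ({placeholders})",
            customer_ids,
        ).fetchall()
        dst.executemany("INSERT INTO customers VALUES (?, ?)", customers)
    dst.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    dst.commit()
    return dst


def dangling_references(conn: sqlite3.Connection) -> list:
    """Return one row per foreign key violation; an empty list means intact."""
    return conn.execute("PRAGMA foreign_key_check").fetchall()
```

Copying only the `orders` rows without their customers would make `dangling_references` report a violation for every copied order, which is the failure mode described above. Real schemas have many tables and chains of references, so the same idea has to be applied transitively.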

Cost-effective testing with ephemeral resources

Managing cloud expenses through smart resource allocation

When creating your databases in Ephemeral, you have several options for how many resources each database should have. Consider the purpose of the database: will it just serve an environment for 10 minutes of automated testing, or will you and 10 other developers be running compute-intensive performance tests on it? Whatever the case, Ephemeral provides enough flexibility to cover all of your testing scenarios without the overwhelming amount of configuration and esoteric network settings you might see on RDS. When choosing which resources to run your database with, keep in mind that you won’t need to import all of your test data manually. Since Ephemeral keeps your database snapshots in their native form, spinning up a fully populated test database is incredibly fast and requires no effort by the database to ingest your seed data. What this means for you is that your databases can be provisioned with very small resource groups and still be lightning fast for your testing scenarios, since they don’t need to clear that initial hurdle of importing all of the data from CSV files or SQL scripts. This optimization is one of the many reasons why Ephemeral is the definitive test database provisioning solution.

Utilizing expiration timers for database cost control

An ephemeral environment, as we have discussed, is meant to exist only for as long as you need it. In theory this is not hard to manage yourself, but I would be a liar if I said I’ve never spun up an over-provisioned RDS database to test something, forgotten about it for a full month, and cost myself $400 for no reason. Situations like this are not uncommon at all, and that’s exactly why we felt we needed to build a solution. With our expiration timers and inactivity timers, you can declare the lifespan of your database up front, so you never have to pay out of pocket for a forgotten database that is sitting idle.
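For a rough sense of scale, the back-of-the-envelope calculation below compares that forgotten always-on database with one that only runs during tests. Only the $400 monthly figure comes from the story above; the number of runs per day, the 10 minutes per run, the 22 working days, and the assumption that both databases cost the same per hour are hypothetical inputs, not pricing for any real service.

```python
HOURS_PER_MONTH = 730  # average month: 365 days * 24 hours / 12 months


def hourly_rate(monthly_cost: float) -> float:
    """Convert an always-on monthly bill into an equivalent hourly rate."""
    return monthly_cost / HOURS_PER_MONTH


def on_demand_cost(
    rate_per_hour: float,
    runs_per_day: int,
    minutes_per_run: float,
    workdays_per_month: int,
) -> float:
    """Cost of running a database only while tests are executing."""
    hours = runs_per_day * minutes_per_run / 60 * workdays_per_month
    return rate_per_hour * hours


if __name__ == "__main__":
    rate = hourly_rate(400.0)  # the forgotten database from the anecdote
    cost = on_demand_cost(rate, runs_per_day=20, minutes_per_run=10, workdays_per_month=22)
    print(f"always-on: $400.00, on-demand: ${cost:.2f}")
```

Under these assumptions the database is up for about 73 of the month's 730 hours, so the on-demand bill is roughly a tenth of the always-on one. Your own numbers will differ, but the shape of the comparison holds whenever tests occupy a small fraction of the day.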

Leveraging API automation for agile testing

Simplifying database provisioning with API calls

If hands-free automation of test environments is your goal, the Ephemeral API is here for you. The Ephemeral API provides programmatic access to all of the features required to stand up, snapshot, and tear down databases right from your code, making management of your test data environments fully automated. After logging into the Ephemeral UI, you can create an API key in the settings panel and use it to authorize your requests to the API. For more information on the Ephemeral API, see our documentation.
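One way to structure the stand-up and tear-down calls in test code is a context manager that guarantees cleanup. In the sketch below, the `DatabaseApi` interface and its method names are hypothetical placeholders for whatever client you write against the Ephemeral API, not its actual endpoints; `InMemoryApi` is a fake so the pattern can run without network access.

```python
from contextlib import contextmanager
from typing import Iterator, Protocol


class DatabaseApi(Protocol):
    """Hypothetical client interface; these method names are assumptions."""

    def create_database(self, snapshot_id: str) -> str:
        """Create a database from a snapshot and return its identifier."""
        ...

    def delete_database(self, database_id: str) -> None:
        """Delete the database with the given identifier."""
        ...


@contextmanager
def provisioned_database(api: DatabaseApi, snapshot_id: str) -> Iterator[str]:
    """Create a database for the duration of a `with` block, then delete it."""
    database_id = api.create_database(snapshot_id)
    try:
        yield database_id
    finally:
        # Runs whether the tests passed, failed, or were interrupted.
        api.delete_database(database_id)


class InMemoryApi:
    """Fake client used to exercise the pattern without calling a real service."""

    def __init__(self) -> None:
        self.live: set[str] = set()
        self._next = 1

    def create_database(self, snapshot_id: str) -> str:
        database_id = f"{snapshot_id}-{self._next}"
        self._next += 1
        self.live.add(database_id)
        return database_id

    def delete_database(self, database_id: str) -> None:
        self.live.discard(database_id)
```

Wrapping a test session as `with provisioned_database(api, "my-snapshot") as db_id: ...` means a failing test cannot leave a database behind. A real client implementing the same two methods would send the API key with each request, ideally read from a CI secret rather than hard-coded.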

Enhancing development workflows with automated data management

If you’re already a Tonic Structural user, you may have noticed the new workspace setting “Output to Ephemeral”, which enables generation of de-identified data directly into an Ephemeral snapshot. With the combination of Tonic Structural and Ephemeral, going from a production database full of sensitive user data to a fleet of highly available, lightning-fast, and cost-effective test databases is just a few clicks away. If you’re interested in trying out Ephemeral, sign up for a free trial or book a live demo with our team!

Trey Briccetti
Software Engineer