How to Diagnose Test Failures - And Why Your Test Data Matters

Madelyn Goodman
May 24, 2023

    TL;DR: Don’t let your error messages get you down - there are ways to find what’s blocking you! Good test failure diagnoses start with high-quality, secure test data. It’s just a fact.

    Don’t see red!

    There is nothing quite as euphoric as the feeling of testing your code and it running without any failures. At the same time, there is nothing quite as frustrating as testing your code for the 10th time and your test still failing. 

    We built Tonic because we know how essential testing is to the engineering workflow, and how frustrating it can be to hit roadblocks. We’re engineers too, after all, so we feel your pain. To help you at every step of your testing journey, we compiled our tried-and-tested tricks for diagnosing test failures. You can bet that safe and accurate test data from Tonic is step zero 😉.

    Step 0: Use high-quality and secure test data

    The better your test data, the more likely you’ll be able to reproduce your errors and get to the bottom of what’s causing your code to fail. 

    What does “quality test data” mean in the context of diagnosing test failures? Well, you want data that is both representative and specific. Representative means it resembles the real-world data your code will run on and covers the edge cases and common failure points. Specific means it exercises the exact error you are running into, so you can get to the root of the problem. The test dataset also needs to be reproducible, so that you can drill down to the root cause in a controlled environment and get the same result on every run.

    How do you find data that does it all? Fake is your friend. 

    Not only does fake, or synthetic, data ensure the security of your customers’ personally identifiable information (PII), but it can also do all of the above. Using Tonic, you can produce secure and accurate data based on your production data that’s scalable, representative, specific, and reliable. Because fake data is more flexible, you can test more edge cases than you might have been able to with your real customer data. Production databases are ever-changing, making it difficult to get consistent data to test your code on. Using a static fake dataset, you can be sure that whatever failures you are running into can be easily reproduced. 
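
    To make the reproducibility point concrete, here is a minimal sketch in Python of a static fake dataset. It is not Tonic's API and does not derive anything from production data; the function name, field names, and sample values are all made up for illustration. The only idea it shows is that a fixed seed gives you the same rows on every run, so a failure seen once can be seen again.

```python
import random

# Illustrative values only; a real dataset would mirror your own schema.
FIRST_NAMES = ["Ada", "Grace", "Linus", "Margaret", "Alan"]
DOMAINS = ["example.com", "example.org", "example.net"]


def make_fake_users(count, seed=1234):
    """Return `count` fake user records; the same seed always yields the same rows."""
    rng = random.Random(seed)  # private generator, unaffected by global random state
    users = []
    for user_id in range(1, count + 1):
        name = rng.choice(FIRST_NAMES)
        users.append({
            "id": user_id,
            "name": name,
            "email": f"{name.lower()}{user_id}@{rng.choice(DOMAINS)}",
            # Deliberately mix in edge cases: zero and very large balances.
            "balance_cents": rng.choice([0, 1, 999, 10**12]),
        })
    return users
```

    Calling make_fake_users(100) twice returns identical lists, which is the property that lets a teammate reproduce your failing test from the same inputs.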

    The old adage “garbage in, garbage out” applies to test data too: your tests can only tell you as much about your software as the data you feed them allows.

    Step 1: Read the error message

    It goes without saying, but we’re going to say it anyway - the true first step in diagnosing your test failures should always be to thoroughly read the error message. It’s easy to see red and go straight into code review mode, but those little messages hold a lot of wisdom and can often point you straight to your mistake. Easy. Fixed. Done.

    Step 2: Run your tests locally 

    The fun really begins when the error message’s explanation is insufficient. Running tests locally is your next step in investigating exactly what’s wrong. 

    Local testing gives you more control over the testing environment and removes many of the network or server issues that could be tripping you up. You want to make sure that the test failed due to your code and immediate dependencies and not because of external factors. Further, local tests allow you to set breakpoints and step through your code for a real-time investigation that can help pinpoint issues quickly.
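
    One way to rule out external factors locally is to swap the external call for a stand-in, so that only your own logic runs. The sketch below uses Python's standard unittest.mock for this. The function names and the exchange-rate scenario are hypothetical and exist only to illustrate the idea.

```python
from unittest import mock


def convert_to_usd(amount, currency, fetch_rate):
    """Code under test: convert an amount using a rate supplied by `fetch_rate`.

    `fetch_rate` is a hypothetical dependency that would call a remote service
    in production; passing it in lets a test replace it.
    """
    rate = fetch_rate(currency)
    return round(amount * rate, 2)


def test_convert_to_usd_without_network():
    # A Mock with a fixed return value stands in for the remote service,
    # so a failure here cannot be blamed on the network.
    fake_fetch_rate = mock.Mock(return_value=1.25)
    assert convert_to_usd(10, "EUR", fetch_rate=fake_fetch_rate) == 12.5
    fake_fetch_rate.assert_called_once_with("EUR")
```

    If the test still fails with the stand-in in place, the problem is in your code rather than the environment. From there, placing Python's built-in breakpoint() call inside convert_to_usd pauses execution in the debugger so you can inspect values step by step.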

    Step 3: Determine if anyone else has been impacted by the failure

    Sometimes code failures remain elusive even with local tests. At this point you want to investigate if the failure is due to your code, or a system or intermittent failure. Time to turn to your test logs. 

    Depending on the logging platform your team uses, this can be as easy as copying the most specific part of your error and pasting it into a search. This will bring up all related errors so you can see who else has been affected, how, and where. Your user feedback logs also hold clues as to whether the test failure is due to a persistent bug.
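
    If your logs are plain text rather than in a searchable platform, the same search can be sketched in a few lines of Python. The log format below, including the branch= field, is an assumption made for this example, and the sample lines are invented; adapt the parsing to whatever your own logs look like.

```python
from collections import Counter


def find_affected(log_lines, error_snippet):
    """Count log lines containing `error_snippet`, grouped by their branch= field."""
    hits = Counter()
    for line in log_lines:
        if error_snippet not in line:
            continue
        branch = "unknown"
        for token in line.split():
            if token.startswith("branch="):
                branch = token[len("branch="):]
                break
        hits[branch] += 1
    return hits


# Invented sample lines in an assumed format, for illustration only.
SAMPLE_LOG = [
    "2023-05-20 branch=main test_checkout FAILED KeyError: 'shipping_zone'",
    "2023-05-21 branch=feature/cart test_checkout FAILED KeyError: 'shipping_zone'",
    "2023-05-21 branch=feature/cart test_login PASSED",
    "2023-05-22 branch=main test_checkout FAILED KeyError: 'shipping_zone'",
]

affected = find_affected(SAMPLE_LOG, "KeyError: 'shipping_zone'")
```

    Here affected maps each branch to the number of matching failures. Seeing more than one branch in the result is the signal that the failure is probably not unique to your changes.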

    If you find that the test is failing across multiple branches and systems, you can pretty confidently say that the problem isn’t in your code, which gives you more direction in diagnosing it. Should you find that you are the only one experiencing this failure, I hate to break it to you, but the issue is most likely with your work.

    Step 4: Figure out the best approach to tackling the test failure

    If in the above step you determined that the problem is in your code - it’s time to comb through your work methodically to find the error. If you’re having trouble and have reached the point of desperation in tracking down the pesky error, you can run the test on the different commits on your branch to see which change or addition to your code caused the failure. git bisect is your friend here: it makes the process systematic by binary-searching the commits between a known good one and a known bad one.

    If you found your code was innocent and the real problem is systemic, don’t just walk away from the problem - be the hero and diagnose the failure for the team! Look into whether it’s an intermittent failure by isolating the different test cases and narrowing down the scope of the test suite.
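
    One simple way to check a single isolated test for intermittency is to run it many times and count how often it fails. The Python sketch below does that. The flaky test is simulated with a counter so the example is deterministic; the helper names are made up for illustration, and a real flaky test will not fail on such a tidy schedule.

```python
def failure_rate(test_fn, runs=20):
    """Run `test_fn` repeatedly and return the fraction of runs that failed."""
    failures = 0
    for _ in range(runs):
        try:
            test_fn()
        except AssertionError:
            failures += 1
    return failures / runs


def make_flaky_test(fail_every=4):
    """Build a simulated test that fails on every `fail_every`-th call."""
    calls = {"count": 0}

    def flaky_test():
        calls["count"] += 1
        assert calls["count"] % fail_every != 0, "simulated intermittent failure"

    return flaky_test


rate = failure_rate(make_flaky_test(fail_every=4), runs=20)
```

    A rate of 0.0 or 1.0 points to a consistent result, while anything in between, like the 0.25 here, suggests an intermittent failure that deserves a closer look at timing, shared state, or test ordering.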

    If you’ve finally found the culprit code - see if you can fix it yourself. This should always be your initial instinct - at this point you are the preeminent expert on the failure. Getting the fix done ASAP will keep it from being absorbed into the long list of things to fix and help get the highest quality product out to your customers.

    If it truly goes beyond your scope of work, then (finally) it’s time to pass the baton, file a ticket, and assign it to the appropriate team.

    Step 5: 🤷 Re-run

    Still finding yourself at a loss for why your tests are failing? It’s possible the problem isn’t in the code at all and is simply something weird going on with your CI system, in which case you should re-run your build. Treat this step as a last resort, as it is the most time- and labor-consuming way of diagnosing failures. Further, if the issue is actually an intermittent failure and the re-run happens to pass, you just lost a great opportunity to address it head-on.

    It all starts with the data

    The importance of software testing cannot be overstated. Test failures, however, should be celebrated, not dreaded, as you have just stumbled across an opportunity to make your code better. Rigorous tests lead to better products and higher customer retention, and all of that starts with your test data.

    Test with ease and confidence using Tonic to produce secure and accurate test data directly from your database or warehouse. Want to learn more? Book a demo with us and we can show you how Tonic can help you improve your code.

    Madelyn Goodman
    Data Science
    Driven by a passion for promoting game changing technologies, Madelyn creates mission-focused content as a Product Marketing Associate at Tonic. With a background in Data Science, she recognizes the importance of community for developers and creates content to galvanize that community.
