A lot goes into building meaningful synthetic datasets, and we’ll use this blog to explore the many topics involved. One piece of the puzzle is the ability to identify various types of data and then generate random data specific to each type. Today, we’re releasing our API for generating random things to help others in their efforts to create synthetic data.

Currently the API supports generating:

  1. Addresses
  2. Names
  3. Phone Numbers
  4. MAC and IP Addresses
  5. Social Security Numbers

Take it for a spin at https://randomthings.tonic.ai. Each API call is a GET request that can generate up to 10,000 items (set via the amount query parameter). If you’re more the command-line type, here are a few cURL requests to get you started:

curl -i "https://randomthings.tonic.ai/address?address_type=city_state_zip&amount=100"

This will generate 100 (city, state, zip) tuples. Locations are sampled from population distributions, so more populous locations are more likely to appear.

curl -i "https://randomthings.tonic.ai/ssn?amount=100"

This will generate 100 social security numbers.
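If you’d rather build these requests from a script, here is a minimal Python sketch that assembles the same URLs as the cURL calls above. The helper name build_url is ours for illustration, and the lower bound of 1 on amount is an assumption; only the 10,000 cap comes from the description above. It builds the URL and stops there, since the response format isn’t covered in this post.

```python
from urllib.parse import urlencode

BASE_URL = "https://randomthings.tonic.ai"
MAX_AMOUNT = 10_000  # per-request cap described in this post


def build_url(endpoint, amount=None, **params):
    """Return the GET URL for an endpoint (hypothetical helper, not part of the API)."""
    if amount is not None:
        # Assumption: at least 1 item; the post only states the upper limit.
        if not 1 <= amount <= MAX_AMOUNT:
            raise ValueError(f"amount must be between 1 and {MAX_AMOUNT}")
        params["amount"] = amount
    query = urlencode(params)  # percent-encodes values where needed
    url = f"{BASE_URL}/{endpoint}"
    return f"{url}?{query}" if query else url


print(build_url("address", address_type="city_state_zip", amount=100))
print(build_url("ssn", amount=100))
```

The resulting strings can be passed to cURL or to any HTTP client you prefer.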

curl -i "https://randomthings.tonic.ai/us_phone?mask=%28xxx%29-xxx-xxxx"

This will generate 10 US phone numbers of the format (xxx)-xxx-xxxx, e.g., (404)-555-5555. Note that the round brackets enclosing the area code are URL-encoded.
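Encoding the mask by hand gets tedious, so here is a small Python sketch that does it with the standard library’s urllib.parse.quote. The function name phone_url is hypothetical; the endpoint and the mask parameter are the ones from the cURL request above.

```python
from urllib.parse import quote


def phone_url(mask):
    """Build a us_phone request URL with a percent-encoded mask (illustrative helper)."""
    # safe="" makes quote() encode every reserved character, including ( and ).
    encoded = quote(mask, safe="")
    return f"https://randomthings.tonic.ai/us_phone?mask={encoded}"


print(phone_url("(xxx)-xxx-xxxx"))
# https://randomthings.tonic.ai/us_phone?mask=%28xxx%29-xxx-xxxx
```

Hyphens and the x placeholders pass through unchanged; only the round brackets become %28 and %29.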

As always, if you have any questions about synthetic data or its many use cases, don’t hesitate to reach out to us at hello@tonic.ai. Additionally, if there’s a particular data type that you’d like us to add to this API, let us know, and we’ll do our best to add it soon.

Adam Kamor, PhD
Co-Founder & Head of Engineering
