Unlocking Secure Data Utility: The Tonic.ai and Databricks Partnership and Integration

Author
Tomer Benami
November 27, 2023
    In the digital age, where data is the new currency, protecting sensitive information while ensuring its utility is paramount. The strategic partnership between Tonic.ai and Databricks marks a significant milestone in achieving this balance. Tonic's innovative approach to data synthesis is now seamlessly integrated with Databricks, offering a joint solution that is both powerful and privacy-enhancing.

    The Tonic + Databricks connector

    The Tonic-Databricks connector is a testament to the commitment of both companies to democratize data access without compromising on privacy. Tonic.ai's platform generates safe, high-fidelity versions of production data that are free of sensitive information, so organizations can maintain the value of their data across use cases from engineering to analytics.

    For Databricks customers, this partnership means that they can now harness the full potential of their data without leaving the Databricks ecosystem. The connector uses Databricks APIs, with support for Unity Catalog, Delta Tables, and Delta Sharing, to provide a comprehensive data protection and utility solution. With the connector, Tonic is helping make Databricks the Intelligent Data Platform by providing better and more efficient ways to share and test data as companies innovate to incorporate AI into their organizations.

    Why Tonic

    The Tonic platform enables organizations to provide their teams with high-quality, secure test data, accelerating development cycles and reducing time-to-market. Tonic.ai's additional offerings extend these capabilities: Tonic Textual redacts and synthesizes free-text data, and Tonic Validate monitors the performance of RAG applications.

    The effectiveness of Tonic’s native database connectors is underscored by success stories from top companies such as eBay, Walgreens, CVS Health, Autodesk, Philips, the NHL, and Plume Design. These organizations have experienced firsthand the benefits of integrating Tonic.ai's synthetic data capabilities into their existing testing and development workflows. 

    Why Tonic and Databricks

    The collaboration between Tonic.ai and Databricks is not just a technical integration; it is a strategic enhancement to the data ecosystem. Together, Tonic and Databricks pair data privacy with an open and unified platform for data and AI. Because data transformations run directly in Databricks clusters, the experience stays seamless and flexible for users. This combination enables Databricks customers to:

    • Securely retain data for extended periods.
    • Protect data at the earliest stage in the data lifecycle.
    • Utilize de-identified data for AI/ML development.
    • Integrate Tonic deeply into data applications with the Databricks SDK.
    • Share de-identified data with partners through Delta Sharing.
    • List de-identified data on the Databricks Marketplace.

    Step-by-step guide: Use the Tonic-Databricks connector

    So how do you take advantage of the connector? Here's an overview of the basic steps to connect Tonic to your Databricks catalog, configure and run data generation, and verify and use the resulting synthetic data.

    Before you begin, make sure that you have a Databricks workspace ready and that you possess the necessary permissions to manage data connections.

    Step 1: Establish the connection from Tonic to Databricks

    Log in to your Tonic instance and create a Tonic workspace.

    Under Connection Type, choose Databricks.

    Provide your Databricks API token and other connection details, starting with the connection information for your source data.

    Then indicate where Tonic writes the synthesized data.
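
    Before entering these details in the Tonic UI, it can help to sanity-check them. The following is a minimal Python sketch, not part of Tonic or Databricks: the `DatabricksConnection` class, its field names, and `check_pair` are hypothetical, and the exact fields that Tonic asks for may differ. It flags an empty token, a workspace URL that is not HTTPS, and a destination that points at the same catalog and schema as the source.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class DatabricksConnection:
    """Hypothetical container for the details entered in Step 1."""
    workspace_url: str  # assumed form: https://<workspace-host>
    api_token: str      # Databricks API token
    catalog: str        # catalog that holds the tables
    schema: str         # schema (database) inside the catalog

    def problems(self):
        """Return a list of problems; an empty list means the details look usable."""
        found = []
        parsed = urlparse(self.workspace_url)
        if parsed.scheme != "https" or not parsed.netloc:
            found.append("workspace_url must be an https:// URL")
        if not self.api_token.strip():
            found.append("api_token is empty")
        for label, value in (("catalog", self.catalog), ("schema", self.schema)):
            if not value.strip():
                found.append(f"{label} is empty")
        return found


def check_pair(source, destination):
    """Check both connections and make sure they do not overlap."""
    issues = [f"source: {p}" for p in source.problems()]
    issues += [f"destination: {p}" for p in destination.problems()]
    # Writing synthesized data into the source location would overwrite the originals.
    same_place = (source.workspace_url, source.catalog, source.schema) == (
        destination.workspace_url, destination.catalog, destination.schema
    )
    if same_place:
        issues.append("source and destination point at the same schema")
    return issues
```

    Calling `check_pair(source, destination)` returns an empty list when both sets of details pass these basic checks. It does not contact Databricks, so a passing result says nothing about whether the token is actually valid.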

    Step 2: Configure data mapping and generation

    Use Privacy Hub to view the fields where Tonic detected sensitive information.

    Indicate which tables in your Databricks database to include in the data generation. By default, the data generation includes all of the tables. Truncating a table excludes it from the data generation.

    Use the intuitive Tonic UI to assign data generators to table fields. Tonic data generators indicate how to transform the field values. A generator might scramble characters or produce a safe but realistic value that mimics the original data.
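
    To make the idea of a generator concrete, here is an illustrative character-scrambling function in Python. It is a sketch of the general technique, not Tonic's implementation: the `scramble` name, the `secret` parameter, and the hash-based seeding are assumptions chosen for this example.

```python
import hashlib
import random
import string


def scramble(value, secret="demo-secret"):
    """Replace each letter or digit with a random one of the same kind.

    The random generator is seeded from the input, so the same source
    value always produces the same scrambled output.
    """
    digest = hashlib.sha256((secret + value).encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha() and ch.isupper():
            out.append(rng.choice(string.ascii_uppercase))
        elif ch.isalpha():
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)  # keep punctuation and spacing so the format survives
    return "".join(out)
```

    For example, `scramble("Jane Doe, 555-0134")` keeps the length, capitalization pattern, comma, and hyphen of the input while replacing every letter and digit. Consistent output for a given input is one way to keep values that repeat across tables aligned after transformation.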

    Step 3: Generate synthetic data

    Click the green button to initiate a data generation job within Tonic and monitor its progress.

    After the job is complete, you can validate and use the synthetic data in Databricks.
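
    One way to approach validation is to compare a sample of the source rows with the corresponding output rows. The sketch below is an assumption about what such a check could look like, not a Tonic feature: `validate_synthetic` is a hypothetical helper that works on plain lists of dictionaries, which you might obtain by collecting a small sample from each table.

```python
def validate_synthetic(source_rows, synthetic_rows, sensitive_columns):
    """Compare a source sample with its synthetic counterpart.

    Returns a list of findings; an empty list means every check passed.
    """
    findings = []

    # 1. Without subsetting, the output should have as many rows as the source.
    if len(source_rows) != len(synthetic_rows):
        findings.append(
            f"row count differs: {len(source_rows)} vs {len(synthetic_rows)}"
        )

    # 2. The output should expose the same columns as the source.
    source_cols = set().union(*(row.keys() for row in source_rows))
    synthetic_cols = set().union(*(row.keys() for row in synthetic_rows))
    if source_cols != synthetic_cols:
        findings.append(f"columns differ: {sorted(source_cols ^ synthetic_cols)}")

    # 3. Original values in sensitive columns should not reappear in the output.
    for column in sensitive_columns:
        original = {
            row[column] for row in source_rows if row.get(column) is not None
        }
        leaked = {row.get(column) for row in synthetic_rows} & original
        if leaked:
            findings.append(
                f"{column}: {len(leaked)} original value(s) appear in the output"
            )
    return findings
```

    The third check suits high-cardinality fields such as email addresses. For low-cardinality fields such as a state code, some overlap between source and output values is expected and is not by itself a leak.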

    Use the generated synthetic data for development, testing, QA, and AI/ML model training. Share your data with Delta Sharing and consider listing it in the Marketplace. Integrate the data into your workflows to experience the power of Tonic + Databricks.
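
    As a rough illustration of the Delta Sharing step, the Python sketch below builds the Databricks SQL statements that create a share, add de-identified tables to it, and grant a recipient access. The helper `delta_sharing_statements` and the share, recipient, and table names are hypothetical, and running the statements assumes a Unity Catalog workspace and sufficient privileges.

```python
import re

# Conservative pattern for unquoted identifiers such as catalog.schema.table.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$")


def delta_sharing_statements(share, recipient, tables):
    """Build SQL statements that share the given tables with one recipient."""
    for name in (share, recipient, *tables):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    statements = [f"CREATE SHARE IF NOT EXISTS {share}"]
    statements += [f"ALTER SHARE {share} ADD TABLE {table}" for table in tables]
    statements.append(f"CREATE RECIPIENT IF NOT EXISTS {recipient}")
    statements.append(f"GRANT SELECT ON SHARE {share} TO RECIPIENT {recipient}")
    return statements
```

    In a Databricks notebook, each returned statement could be passed to `spark.sql(...)`. The function only builds strings, so you can review the statements before anything is executed.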

    Conclusion

    The Tonic.ai and Databricks partnership is a powerful combination that promises to revolutionize how organizations approach data privacy and utility. By providing a secure, efficient, and user-friendly platform for data masking, subsetting, and synthesis, this alliance empowers organizations to innovate with confidence.

    As we look to the future, the partnership between Tonic.ai and Databricks is poised to set new standards in data privacy and utility, enabling organizations to unlock strategic data assets safely and efficiently. It's an exciting time for data-driven companies, and the opportunities for growth and innovation are boundless. Try out the Tonic + Databricks connector for free today by starting a free trial.

    Tomer Benami
    VP of Finance and Bizops
    Tomer Benami is the VP of Finance and Bizops at Tonic.ai where he brings a blend of core finance expertise, operational savvy, and vision to go-to-market activities. With a proven track record of serving as the senior-most finance leader at companies such as VirtualHealth and Apploi, Tomer enjoys partnering with executive teams, steering organizations towards strategic goals and delivering meaningful results. Beginning his career at KPMG and holding a Master's Degree from the University of Washington, Foster School of Business, he is enthusiastic about the transformative potential of AI while advocating for its responsible and ethical utilization in shaping our future.