Hot on the heels of the Privacy Hub, we’re introducing more privacy-protecting features to Tonic! Today we’re excited to announce the introduction of differential privacy to Tonic. Differential privacy gives you confidence and visibility into the safety of the data Tonic produces. For the uninitiated, differential privacy is a mathematical guarantee about the privacy of a process. In this first release, only certain data types support differential privacy, but expect more in coming releases.
There’s an old adage that goes something like “if you can’t measure it, you can’t fix it.” Well, we’re giving you the tools to measure and fix at the same time. Our new Privacy Hub feature not only finds your sensitive data, but also secures it once detected.
Announcing Privacy Hub
Driven by dozens of customer requests to solve data governance and proliferation challenges, we’re launching capabilities to explicitly track and transform data as part of an automated process.
In the last decade, JSON has become the dominant data-interchange format and the backbone of most NoSQL databases. So it’s not surprising that the ISO SQL standard added JSON support to its specification in late 2016, and most popular SQL databases, like Postgres, one of our favorites, have upped their JSON game over the past few years. I could take a moment now to weigh in on the NoSQL vs. SQL debate, and these developments definitely make that choice more complex, but I’ll save that for another time.
The New York Privacy Act (NYPA) made quite a splash when introduced in the NY State Senate this past May. Like the summer’s first cannonball into the deep end, it sent waves through the data privacy world, with harbinger-of-doom headlines proclaiming it a “sweeping,” “even bolder,” “considerably tougher,” “stringent,” nay, the “strictest” data privacy bill to date. And the headlines weren’t wrong.
The bill stands to impact companies of any size (including non-profits) doing business in New York, or even so much as producing “products or services that are intentionally targeted to residents” of the state.
It’s hard to believe, but it’s been a year since we released the original version of Condenser, our open-source database subsetting tool. Since then, Condenser has been deployed in a variety of situations, and we’ve learned a lot along the way, specifically about what you need to make it work best for you. The culmination of all this is the release of Condenser 2.0! Rejoice 🎉!
In this post, we’ll explore what we’ve learned about subsetting and the challenges our new release helps you overcome.
For better or worse, sometimes we omit foreign key constraints on columns with a foreign key relationship. Sometimes we do this for performance reasons, sometimes it’s the behavior of a framework we’re using, and sometimes it’s desirable semantically. Whatever the case, it’s often handy to list the foreign keys in the database that aren’t backed by constraints, and that’s where this simple tool we built comes in handy.
Finding Foreign Keys
Without further ado, here’s a tool for finding implicit foreign keys in a Postgres or MySQL database.
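The core heuristic behind a tool like this can be sketched in a few lines: columns whose names follow the common `<table>_id` convention are flagged as candidate foreign keys to a matching table. This is a minimal illustration, not the tool’s actual implementation, and the function and schema names are hypothetical:

```python
def find_implicit_foreign_keys(schema):
    """Guess implicit foreign keys by naming convention.

    `schema` maps table name -> list of column names. A column named
    '<name>_id' is flagged as a candidate foreign key when a table named
    '<name>' or its naive plural '<name>s' exists.
    """
    candidates = []
    tables = set(schema)
    for table, columns in schema.items():
        for column in columns:
            if not column.endswith("_id"):
                continue
            target = column[:-3]  # strip the '_id' suffix
            for referenced in (target, target + "s"):
                if referenced in tables and referenced != table:
                    candidates.append((table, column, referenced))
    return candidates

schema = {
    "users": ["id", "name"],
    "orders": ["id", "user_id", "total"],
}
print(find_implicit_foreign_keys(schema))  # [('orders', 'user_id', 'users')]
```

A real implementation would read table and column names out of the database’s information schema and could additionally cross-check that the candidate column’s values actually appear in the referenced table’s primary key.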
Caption: Left side shows psql connected to the proxy while the right side shows psql connected directly to the DB.
Redact and Replace in Real Time with No Additional Infrastructure
Many of our customers have multiple databases, complex application logic, and limited time. One of the easiest ways to protect your data is to add a proxy between the consumer (analyst, application, developer, etc.) and the database. Since the proxy doesn’t clone the data, there are no additional infrastructure costs.
We introduced Masquerade as a way to redact data in real time. You can check out the motivation here. Briefly, adding a proxy between your database and your application is a great way to preserve privacy with very little friction. What follows are some of the implementation details.
Introduction to the Postgres Messaging Protocol
Postgres clients communicate with Postgres via a messaging protocol over TCP. The best place to learn the details of the protocol is in Postgres’s own documentation.
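To give a flavor of what the protocol looks like on the wire: after the startup handshake, every regular message carries a one-byte type tag (for example `Q` for a simple query), followed by a 32-bit big-endian length that includes itself but not the tag, followed by the payload. The sketch below frames and parses such a message; it’s a simplified illustration (the startup message itself has no type byte), not Masquerade’s actual parser:

```python
import struct

def build_query_message(sql):
    """Frame a simple-query ('Q') message: 1-byte tag, then an int32
    length counting itself and the payload (but not the tag), then the
    SQL text as a NUL-terminated string."""
    payload = sql.encode("utf-8") + b"\x00"
    return b"Q" + struct.pack("!I", 4 + len(payload)) + payload

def parse_message(buf):
    """Split one complete message off the front of `buf`.

    Returns (tag, payload, rest), or None if `buf` doesn't yet hold a
    full message -- a proxy keeps reading from the socket until it does.
    """
    if len(buf) < 5:
        return None
    tag = buf[0:1]
    (length,) = struct.unpack("!I", buf[1:5])  # includes its own 4 bytes
    if len(buf) < 1 + length:
        return None
    return tag, buf[5:1 + length], buf[1 + length:]

msg = build_query_message("SELECT 1")
tag, payload, rest = parse_message(msg)
print(tag, payload)  # b'Q' b'SELECT 1\x00'
```

A proxy sitting between client and server can use exactly this framing to inspect query and data-row messages in flight, rewrite payloads, and forward the rest untouched.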
Here’s a story from a friend…let’s call him Frank. Frank and his team had been building an analytics and reporting platform for hospitals for the past year, iterating with a small clinical practice. Frank knew they were on to something compelling: 100% of the physicians and administrators were using the platform daily and giving high feedback scores. Setting his sights on bigger fish, Frank hustled his way into a demo of the platform for the largest healthcare provider in California—but his team had only one week to prepare.
New regulations around data privacy and an increasing awareness of the importance of protecting sensitive data are pushing companies to lock down access to their production data. Restricting access to high-quality data for building and testing leads to a variety of issues, including making it more difficult to find bugs. In this article, we’ll look at a variety of ways to populate your dev/staging environments with high-quality synthetic data that is similar to your production data.
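To make the idea concrete, here is one of the simplest techniques in that family: format-preserving randomization, where each letter and digit is replaced with a random one of the same kind, so a phone number stays phone-shaped and a name stays name-shaped. This is a naive stdlib-only sketch for illustration, not Tonic’s algorithm:

```python
import random
import string

def synthesize(value, rng=random):
    """Naively synthesize a value while preserving its format: swap each
    digit for a random digit and each letter for a random letter of the
    same case, keeping punctuation, separators, and length intact."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isupper():
            out.append(rng.choice(string.ascii_uppercase))
        elif ch.islower():
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)  # keep separators like '-' and '@' as-is
    return "".join(out)

rng = random.Random(0)  # seeded for repeatability
print(synthesize("555-867-5309", rng))  # same shape: ddd-ddd-dddd
```

Real tools go much further than this, preserving statistical distributions and cross-column relationships rather than just character classes, but the sketch shows why synthetic data can stand in for production data in a test environment.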