For better or worse, sometimes we omit foreign key constraints on columns with a foreign key relationship. Sometimes we do this for performance reasons, sometimes it’s the behavior of a framework we’re using, and sometimes it’s desirable semantically. Whatever the case, often times its handy to list the foreign keys in the database that aren’t constraints; and that’s where this simple tool we built comes in handy.
Without further ado, here’s a tool for finding implicit foreign keys in a Postgres or MySQL database. It uses the Levenshtein distances of column names to find foreign key relationships. Obviously there are theoretically more sophisticated approaches to this problem (e.g. using the values of columns to match keys), but:
Full documentation is in the README.md file in the repo.
There are a handful of situations where knowing a database’s foreign keys can be helpful. Our compelling use case is database subsetting. We have written about database subsetting, and released an open source subsetter. The summary is that database subsetting creates smaller versions of a database that are referentially intact and useful (it’s surprisingly easy to create a database that’s referentially intact, but not useful!). To create a referentially intact subset, you need to know where the references lie. Sometimes our customers don’t have foreign keys, and that’s where this tool is used.
Other applications include:
Enable your developers, unblock your data scientists, and respect data privacy as a human right.