Data synthesis

Generate referentially intact synthetic data across your entire data ecosystem

Author
Mark Brocato
Author
May 14, 2026

Real-world software doesn't run on a single database. It runs on many, connected by shared keys, foreign key relationships, and business logic that spans systems. But when it comes to generating test data or populating training environments, most teams are still stuck doing it one database at a time, then stitching the results together by hand.

The problem gets worse as architectures grow more complex. Microservice teams need coherent data across dozens of services. Engineers building environments for training AI agents need interconnected structured data, free-text content, and live APIs that all reference the same underlying reality. And compliance requirements mean production data is off the table.

Tonic Fabricate has always made it simple to generate realistic synthetic data through a conversational AI agent, whether from scratch using LLM-powered domain expertise or modeled on an existing schema. But until now, each database lived in its own silo. If your system spanned multiple databases, you had to generate them separately and manage the relationships yourself.

That changes with two new capabilities: Projects and Live Connect.

Projects: a unified workspace for your entire data ecosystem

Projects introduce a new layer to Fabricate that mirrors how real software systems are actually built. A project is a workspace that contains multiple databases, database connections, mock APIs, and automated workflows, all managed through a single Fabricate agent conversation.

Within one project, you can generate multiple databases and file formats with referential integrity maintained throughout. The agent understands the relationships between your data sources and preserves shared keys, foreign key constraints, and cross-system dependencies automatically. Whether you generate every database from scratch or combine from-scratch generation with connections to existing data sources, the result is a coherent, interconnected data ecosystem, not a collection of isolated datasets.

Live Connect: model from your production data

Live Connect lets you connect Fabricate directly to your existing production databases, including PostgreSQL, MySQL, SQL Server, Oracle, Snowflake, and Databricks. Fabricate models your real schemas, patterns, and distributions to generate synthetic data that inherits real-world complexity, without exposing sensitive information.

On its own, Live Connect eliminates the guesswork of building synthetic data from scratch when a real reference exists. Inside a project, its value multiplies. You can connect to one production database via Live Connect, generate a second database entirely from scratch, and Fabricate keeps everything referentially intact across both. This means you can mirror the parts of your architecture that already exist while fabricating the parts that don't, all within a single conversation.

How a project comes together

  1. Create a project: Click the + button in the navigation panel to start a new project and define your first database.
  2. Add databases: Connect to existing databases via Live Connect, create new databases from scratch, or mix both approaches within the same project.
  3. Plan and generate: For complex schemas, Fabricate drafts a strategic generation plan, dividing work into logical groups and refining the approach with you before generating.

Operationalize: Push your data into automated workflows, stand up mock APIs, or export in the formats your pipelines need, all spanning the full project.

How an engineering team uses Projects to test across a multi-service architecture

A backend engineer is building a staging environment for an e-commerce platform. The system spans a PostgreSQL database for product catalog and customer data, a SQL Server database for inventory management, and a third-party payment API. In production, these systems share customer IDs, product IDs, and transaction references across every boundary.

Previously, the engineer would generate test data for each system separately, then spend hours manually aligning foreign keys and shared identifiers, or write brittle scripts to stitch the data together. With Projects, they create a single project, connect to the production PostgreSQL database via Live Connect, generate the inventory database from scratch, and stand up a mock payment API, all in one conversation. Fabricate maintains referential integrity across the entire ecosystem automatically. The staging environment mirrors production's interconnected complexity without a single line of hand-coded test data.

Ship faster with data that mirrors your real architecture

Projects and Live Connect together eliminate the manual, error-prone process of generating and maintaining test data across interconnected systems. Engineering teams get self-serve access to realistic, referentially intact data that spans their full architecture, on demand, without compliance risk, and without cross-team dependencies.

For teams building and testing multi-service applications, the payoff is immediate: coherent staging environments, reliable integration tests, and unblocked parallel development. For teams populating environments to train AI agents, it means interconnected, domain-specific data ecosystems that can be generated and iterated on in minutes instead of weeks.

The Tonic Fabricate team built Projects and Live Connect to close the gap between how real software systems are architected and how synthetic data is generated to support them. The result is a tool that finally operates at the scope and complexity of your actual production environment.

Create your free account and start generating across your full stack.

Mark Brocato
Head of Engineering for Fabricate
Mark founded Mockaroo in Maryland in 2014 while leading a software development team at a life sciences startup spun off from his alma mater, Johns Hopkins University. Mockaroo initially started out as a weekend project aimed to help the software testers on his team use more realistic data when testing their products. Once he made Mockaroo available to the general public it quickly grew into a business of it's own, garnering significant usage from software developers, testers, and sales engineers across a wide variety of industries. When he's not helping take care of his two young children, Mark enjoys playing guitar and writing songs.