Synth is a tool for generating realistic data using a declarative data model. Synth is database-agnostic and can scale to millions of rows of data.
Synth answers a simple question: there are so many ways to consume data, so why are there no frameworks for generating it?
Synth provides a robust, declarative framework for specifying constraint-based data generation, solving the following problems developers regularly face:
- You're creating an app from scratch and have no way to populate your fresh schema with correct, realistic data.
- You're doing integration testing or QA on production data, even though you know it is bad practice and you really should not be doing it.
- You want to see how your system will scale if your database suddenly has 10x the amount of data.
Synth solves exactly these problems with a flexible declarative data model that you can version control in Git, peer review, and automate.
The key features of Synth are:
- Data as Code: Data generation is described using a declarative configuration language, allowing you to specify your entire data model as code.
- Import from Existing Sources: Synth can import data from existing sources and automatically create data models. Synth currently has alpha support for Postgres!
- Data Inference: While ingesting data, Synth automatically infers the relations, distributions, and types of the dataset.
- Database Agnostic: Synth supports semi-structured data and is database-agnostic, playing nicely with both SQL and NoSQL databases.
- Semantic Data Types: Synth integrates with the (amazing) Python Faker library, supporting generation of thousands of semantic types (e.g. credit card numbers and email addresses) as well as locales.
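To make the "Data as Code" idea concrete, here is a minimal sketch in plain Python. It is not Synth's configuration language: the model layout, the field names, and the `generate_rows` helper are all hypothetical, chosen only to illustrate how a declarative description of fields and constraints can drive data generation.

```python
import random
import string

# Hypothetical declarative model, NOT Synth's actual configuration syntax.
# Each field maps to a generator type plus the constraints it must satisfy.
USER_MODEL = {
    "id": {"type": "sequence", "start": 1},
    "age": {"type": "int", "min": 18, "max": 90},
    "plan": {"type": "choice", "values": ["free", "pro", "enterprise"]},
    "username": {"type": "string", "length": 8},
}


def generate_rows(model, count, seed=0):
    """Interpret the declarative model and return `count` rows as dicts."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    rows = []
    for index in range(count):
        row = {}
        for field, spec in model.items():
            kind = spec["type"]
            if kind == "sequence":
                # Monotonically increasing value, e.g. a primary key.
                row[field] = spec["start"] + index
            elif kind == "int":
                row[field] = rng.randint(spec["min"], spec["max"])
            elif kind == "choice":
                row[field] = rng.choice(spec["values"])
            elif kind == "string":
                row[field] = "".join(
                    rng.choice(string.ascii_lowercase)
                    for _ in range(spec["length"])
                )
            else:
                raise ValueError(f"unknown generator type: {kind}")
        rows.append(row)
    return rows


rows = generate_rows(USER_MODEL, count=3)
for row in rows:
    print(row)
```

Because the model is ordinary data, it can live in a file under version control and be reviewed like any other change, and scaling the dataset up is a matter of changing `count`.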