Use case · Verified May 2026

Generate test data with foreign keys

If you have ever generated 50,000 customers and 200,000 orders only to discover that half the orders reference customer IDs that do not exist, this page is for you.

Short answer

Define your tables and FK relationships once. SynthForge generates parent rows first, then child rows that draw foreign-key values from the parent IDs that actually exist. Output is ready-to-load DDL and CSV across seven SQL dialects.

The situation

The painful version of this workflow goes: write a Python script with Faker, generate customers, write the customers to a CSV, then write a second script that loads the customer IDs into memory and generates orders by picking randomly from that list. Then you discover line_items also reference orders, so you do it a third time. Then someone changes the schema and you do it all over again.

Or: you use Mockaroo, generate customers, download the CSV, re-upload it as a Dataset, then go to your orders schema and use the Dataset Column type to reference customer IDs. Now do it again for line_items. The result is correct but the workflow is brittle and slow.

SynthForge collapses this into a single operation. You define the schema once. Foreign-key columns reference their parent table by name. The generator runs the tables in dependency order, draws child FK values from the parent IDs that were just generated, and emits the entire dataset in one pass.
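To make the idea concrete, here is a minimal sketch of "parents first, children draw from existing IDs" in plain Python. It is not SynthForge's implementation; the function name `generate_dataset`, the column names, and the row counts are all assumptions chosen for illustration.

```python
import random

def generate_dataset(n_customers, n_orders, n_line_items, seed=0):
    """Generate three related tables in one pass, parents before children."""
    rng = random.Random(seed)

    # Parent table: IDs are assigned here and reused below.
    customers = [{"id": i, "name": f"customer_{i}"} for i in range(1, n_customers + 1)]
    customer_ids = [c["id"] for c in customers]

    # Child table: every customer_id is drawn from IDs that already exist.
    orders = [
        {"id": i, "customer_id": rng.choice(customer_ids)}
        for i in range(1, n_orders + 1)
    ]
    order_ids = [o["id"] for o in orders]

    # Grandchild table: the same idea, one level further down.
    line_items = [
        {"id": i, "order_id": rng.choice(order_ids), "quantity": rng.randint(1, 5)}
        for i in range(1, n_line_items + 1)
    ]
    return {"customers": customers, "orders": orders, "line_items": line_items}

data = generate_dataset(n_customers=100, n_orders=400, n_line_items=1200)
```

Because each child list is built only after its parent list exists, no order can reference a customer that was never generated.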

How to do it in SynthForge

1. Describe the schema

Type a natural-language description into the AI schema designer (Claude by default), paste a CREATE TABLE script and let the SQL importer pick out tables and columns, or build the schema field by field in the visual editor.

2. Mark foreign-key columns

Click a column, set its type to 'foreign key', and point it at the parent table.column. Then pick a sampling strategy: random (default), uniform across parent rows, sequential (1:N), or weighted.
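The four strategy names come from the product UI; their exact semantics are not spelled out here, so the sketch below is one plausible reading rather than SynthForge's actual behavior. It assumes "random" means independent draws with replacement, "uniform" means an even spread of children across parents, "sequential" means walking the parent IDs in order and wrapping around, and "weighted" means draws proportional to caller-supplied weights. The function `sample_fk` and its parameters are hypothetical.

```python
import random
from itertools import cycle, islice

def sample_fk(parent_ids, n_children, strategy="random", weights=None, seed=0):
    """Return n_children FK values drawn from parent_ids (assumed semantics)."""
    rng = random.Random(seed)
    if strategy == "random":
        # Independent draws: some parents may get many children, some none.
        return [rng.choice(parent_ids) for _ in range(n_children)]
    if strategy == "uniform":
        # Even spread: child counts per parent differ by at most one.
        values = list(islice(cycle(parent_ids), n_children))
        rng.shuffle(values)
        return values
    if strategy == "sequential":
        # Walk the parents in order, wrapping around when the list runs out.
        return list(islice(cycle(parent_ids), n_children))
    if strategy == "weighted":
        if weights is None:
            raise ValueError("weighted sampling needs one weight per parent ID")
        return rng.choices(parent_ids, weights=weights, k=n_children)
    raise ValueError(f"unknown strategy: {strategy}")

fk_values = sample_fk([1, 2, 3], n_children=7, strategy="sequential")
```

Whichever strategy is used, every value comes from `parent_ids`, so the choice affects the distribution of children per parent, not referential integrity.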

3. Generate

SynthForge resolves the dependency graph (parents before children) and generates referentially intact rows in one pass. A typical table of ~1 million rows generates in minutes; the hard cap is 10,000,000 rows per table.
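Resolving "parents before children" is a topological sort of the foreign-key graph. The sketch below shows the technique with Python's standard-library `graphlib` (Python 3.9+); it is an illustration, not SynthForge's code, and the `schema` mapping and `generation_order` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

def generation_order(foreign_keys):
    """Order tables so every parent is generated before its children.

    foreign_keys maps each table name to the set of parent tables it references.
    """
    try:
        return list(TopologicalSorter(foreign_keys).static_order())
    except CycleError as exc:
        # Tables that reference each other in a loop have no valid order.
        raise ValueError(f"circular foreign-key dependency: {exc.args[1]}") from exc

schema = {
    "customers": set(),
    "orders": {"customers"},
    "line_items": {"orders"},
}
order = generation_order(schema)
```

For this three-table chain the only valid order is customers, then orders, then line_items. A cycle between tables has no valid order, which is why the helper turns it into an explicit error.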

4. Export

Choose your dialect: PostgreSQL (with \copy), MySQL (with LOAD DATA INFILE), SQL Server (with bcp), SQLite, MariaDB, DuckDB, or CockroachDB. Or download CSV / JSON / JSONL / Parquet directly.

```sql
-- Example output for PostgreSQL
CREATE TABLE customers (
    id      INTEGER PRIMARY KEY,
    name    VARCHAR(120) NOT NULL,
    email   VARCHAR(255) UNIQUE NOT NULL
);

CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    total       NUMERIC(10,2) NOT NULL
);

\copy customers FROM '/tmp/customers.csv' WITH (FORMAT csv, HEADER);
\copy orders    FROM '/tmp/orders.csv'    WITH (FORMAT csv, HEADER);
```
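After loading, you may want to confirm that no child row points at a missing parent. The sketch below runs an orphan check with a LEFT JOIN, using an in-memory SQLite database as a stand-in for your real target. The helper `count_orphans` and the sample rows are assumptions for illustration, and one deliberately broken order is included so the check has something to find.

```python
import sqlite3

def count_orphans(conn, child, fk_column, parent, pk_column="id"):
    """Count child rows whose FK value has no matching parent row.

    Table and column names are interpolated into the SQL, so pass only
    trusted identifiers.
    """
    query = (
        f"SELECT COUNT(*) FROM {child} AS c "
        f"LEFT JOIN {parent} AS p ON c.{fk_column} = p.{pk_column} "
        f"WHERE p.{pk_column} IS NULL"
    )
    return conn.execute(query).fetchone()[0]

conn = sqlite3.connect(":memory:")
# No REFERENCES clause here, so the bad row below can be inserted for the demo.
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    total       NUMERIC NOT NULL
);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders VALUES (10, 1, 19.99), (11, 2, 5.00), (12, 99, 42.00);
""")
orphans = count_orphans(conn, "orders", "customer_id", "customers")
```

Here order 12 references customer 99, which does not exist, so the check reports one orphan. On a referentially intact dataset the count is zero.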

When something else is the right call

Honest alternatives in case SynthForge is not the best fit for your specific situation.

Mockaroo

Pick Mockaroo if you only need a single table at a time. Multi-table FK requires a chained workflow (generate parent, download, re-upload as Dataset, reference from child).

Faker (Python or Faker.js)

Pick Faker if you are writing inline test fixtures inside a unit test and you only need one or two values per record. Faker has no native concept of foreign keys; you write the FK logic yourself.

Tonic Fabricate

Pick Fabricate if you want a chat-based AI agent and your usage fits a credit-metered pricing model ($0, $29, or Enterprise tiers). It also preserves multi-table relational integrity.

Frequently asked questions

Does SynthForge support composite foreign keys?

Not today. Each foreign-key column references a single parent column. If your schema requires a composite FK, plan to model it as a single surrogate key plus a separate uniqueness constraint, or generate the parent IDs and bind them yourself.
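The surrogate-key workaround can be sketched as follows, again using SQLite from Python as a stand-in database. The `products` and `order_lines` tables, their columns, and the `try_insert` helper are hypothetical. The natural composite key (tenant_id, sku) is kept unique by a constraint, while children reference the single-column surrogate `id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE products (
    id        INTEGER PRIMARY KEY,       -- surrogate key that children reference
    tenant_id INTEGER NOT NULL,
    sku       TEXT    NOT NULL,
    UNIQUE (tenant_id, sku)              -- keeps the composite key unique
);
CREATE TABLE order_lines (
    id         INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES products(id)
);
INSERT INTO products VALUES (1, 100, 'A-1'), (2, 100, 'B-2');
INSERT INTO order_lines VALUES (1, 1), (2, 2);
""")

def try_insert(conn, sql, params):
    """Return True if the row was inserted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

# A duplicate (tenant_id, sku) pair and a dangling product_id are both rejected.
duplicate_ok = try_insert(conn, "INSERT INTO products VALUES (?, ?, ?)", (3, 100, "A-1"))
dangling_ok = try_insert(conn, "INSERT INTO order_lines VALUES (?, ?)", (3, 42))
```

The child table now needs only one FK column, which fits the single-column model, and the uniqueness constraint preserves the rule the composite key was enforcing.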

What about self-referential foreign keys (e.g., employees.manager_id -> employees.id)?

Possible but not first-class. The generator does not prohibit self-references; check the docs for the latest sampling-strategy guidance.
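If you end up generating a self-referential table yourself, one common approach is to let each row pick its parent only from rows generated before it, which keeps every reference valid and avoids cycles. The sketch below illustrates that approach; it is not a description of how SynthForge samples self-references, and `generate_employees` is a hypothetical helper.

```python
import random

def generate_employees(n, seed=0):
    """Generate employees whose manager_id always points at an earlier row."""
    rng = random.Random(seed)
    employees = []
    for emp_id in range(1, n + 1):
        # The first employee is the root and has no manager; everyone else
        # reports to someone with a smaller ID, so no cycles can form.
        manager_id = rng.randrange(1, emp_id) if emp_id > 1 else None
        employees.append({"id": emp_id, "manager_id": manager_id})
    return employees

employees = generate_employees(50)
```

This requires the manager_id column to be nullable for the root row, and rows must be loaded in ID order (or with the constraint deferred) so each manager exists before the employees who reference it.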

How does SynthForge handle many-to-many relationships?

Model the junction table explicitly: create a join table with two FK columns, one to each parent. SynthForge generates the parents, then generates the join table by sampling FK pairs.
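As a rough illustration of sampling FK pairs for a join table, the sketch below draws distinct (left, right) pairs without building the full Cartesian product. It assumes the join table has a composite primary key on the pair, so duplicates are not allowed; SynthForge's own pair sampling may differ. The name `sample_junction` is hypothetical.

```python
import random

def sample_junction(left_ids, right_ids, n_pairs, seed=0):
    """Sample n_pairs distinct (left_id, right_id) pairs for a join table."""
    total = len(left_ids) * len(right_ids)
    if n_pairs > total:
        raise ValueError(f"only {total} distinct pairs exist, asked for {n_pairs}")
    rng = random.Random(seed)
    # Each integer in range(total) encodes one pair, so sampling integers
    # without replacement yields pairs without replacement.
    picks = rng.sample(range(total), n_pairs)
    width = len(right_ids)
    return [(left_ids[i // width], right_ids[i % width]) for i in picks]

# Example: 5 distinct (student_id, course_id) enrollments.
pairs = sample_junction([1, 2, 3], [10, 20], n_pairs=5)
```

If your join table allows repeated pairs, independent draws from each parent list are enough and the uniqueness bookkeeping can be dropped.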

Are foreign keys guaranteed to resolve, or can children point to non-existent parents?

Guaranteed. Child FK values are sampled from the parent IDs that were just generated, so every reference resolves by construction.

What's the row cap?

Currently a 10,000,000-row hard cap per table, plus per-account rate limits. No daily or monthly request cap. In practice ~1 million rows per table generates in minutes; larger or wider tables take proportionally longer.

Ready to get started?

Design a multi-table schema, generate referentially intact data, and export to your database. No credit card.
