Use case · Verified May 2026
Generate test data with foreign keys
If you have ever generated 50,000 customers and 200,000 orders only to discover that half the orders reference customer IDs that do not exist, this page is for you.
Short answer
Define your tables and FK relationships once. SynthForge generates parent rows first, then child rows that draw foreign-key values from the parent IDs that actually exist. Output is ready-to-load DDL and CSV across seven SQL dialects.
The situation
The painful version of this workflow goes: write a Python script with Faker, generate customers, write the customers to a CSV, then write a second script that loads the customer IDs into memory and generates orders by picking randomly from that list. Then you discover line_items also reference orders, so you do it a third time. Then someone changes the schema and you do it all over again.
Or: you use Mockaroo, generate customers, download the CSV, re-upload it as a Dataset, then go to your orders schema and use the Dataset Column type to reference customer IDs. Now do it again for line_items. The result is correct but the workflow is brittle and slow.
SynthForge collapses this into a single operation. You define the schema once. Foreign-key columns reference their parent table by name. The generator runs the tables in dependency order, draws child FK values from the parent IDs that were just generated, and emits the entire dataset in one pass.
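The dependency-ordered pass described above can be sketched in a few lines of Python. This illustrates the general technique and is not SynthForge's implementation; the `SCHEMA` dictionary, the `generation_order` and `generate_dataset` functions, and the row counts are all hypothetical names and values chosen for the example.

```python
import random
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical schema: table -> {"rows": count, "fks": {column: parent_table}}
SCHEMA = {
    "customers": {"rows": 5, "fks": {}},
    "orders": {"rows": 20, "fks": {"customer_id": "customers"}},
    "line_items": {"rows": 60, "fks": {"order_id": "orders"}},
}


def generation_order(schema):
    """Return table names with every parent ahead of its children."""
    graph = {table: set(spec["fks"].values()) for table, spec in schema.items()}
    return list(TopologicalSorter(graph).static_order())


def generate_dataset(schema, seed=0):
    """Generate rows table by table, drawing FK values from existing parent IDs."""
    rng = random.Random(seed)
    data = {}
    for table in generation_order(schema):
        spec = schema[table]
        rows = []
        for row_id in range(1, spec["rows"] + 1):
            row = {"id": row_id}
            for column, parent in spec["fks"].items():
                # The parent table is already generated, so every ID picked here exists.
                row[column] = rng.choice(data[parent])["id"]
            rows.append(row)
        data[table] = rows
    return data


dataset = generate_dataset(SCHEMA)
print(generation_order(SCHEMA))  # ['customers', 'orders', 'line_items']
```

Because each child table is generated only after its parents, an orphaned foreign key cannot occur by construction. A self-referencing table would make `TopologicalSorter` raise `CycleError` in this sketch, so that case would need separate handling.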
How to do it in SynthForge
1. Describe the schema
Type a natural-language description into the AI schema designer (Claude by default), or paste a CREATE TABLE script and let the SQL importer pick out tables and columns. Or build it field-by-field in the visual editor.
2. Mark foreign-key columns
Click a column, set its type to 'foreign key', and point it at the parent table.column. Pick a sampling strategy: random (default), uniform across parent rows, sequential (1:N), or weighted.
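The four strategy names come from SynthForge, but this page does not spell out exactly how each one distributes children across parents. The Python sketch below shows one plausible interpretation of each name. The function `sample_fk_values` and its parameters are hypothetical, and the product's actual behavior may differ.

```python
import random


def sample_fk_values(parent_ids, n_children, strategy="random", weights=None, seed=0):
    """Return n_children FK values drawn from parent_ids (hypothetical helper)."""
    rng = random.Random(seed)
    if strategy == "random":
        # Independent draws with replacement; some parents may get no children.
        return [rng.choice(parent_ids) for _ in range(n_children)]
    if strategy == "uniform":
        # Round-robin, then shuffle: per-parent counts differ by at most one.
        values = [parent_ids[i % len(parent_ids)] for i in range(n_children)]
        rng.shuffle(values)
        return values
    if strategy == "sequential":
        # Contiguous blocks: the first parent's children come first, and so on.
        return [parent_ids[i * len(parent_ids) // n_children] for i in range(n_children)]
    if strategy == "weighted":
        # Draws with replacement, biased by one weight per parent.
        return rng.choices(parent_ids, weights=weights, k=n_children)
    raise ValueError(f"unknown strategy: {strategy}")


parents = [1, 2, 3]
print(sample_fk_values(parents, 6, "sequential"))  # [1, 1, 2, 2, 3, 3]
```

Under this reading, the practical difference is in the spread: "random" can leave some parents childless, while "uniform" and "sequential" give every parent a near-equal share whenever there are at least as many children as parents.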
3. Generate
SynthForge resolves the dependency graph (parents before children) and generates referentially intact rows in one pass. Up to 10,000,000 rows per request.
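Whatever tool produces the data, referential integrity is cheap to verify before loading. Here is a minimal Python sketch, assuming CSV exports with a header row and the column names used in the example schema on this page. `find_orphans` is a hypothetical helper and is not part of SynthForge.

```python
import csv
import io


def find_orphans(parent_csv, child_csv, parent_key, child_fk):
    """Return child FK values that have no matching parent key.

    Both arguments are open text files (or any iterable of CSV lines)
    whose first line is a header row.
    """
    parent_ids = {row[parent_key] for row in csv.DictReader(parent_csv)}
    return [
        row[child_fk]
        for row in csv.DictReader(child_csv)
        if row[child_fk] not in parent_ids
    ]


# In-memory stand-ins for customers.csv and orders.csv.
customers = io.StringIO("id,name\n1,Ada\n2,Grace\n")
orders = io.StringIO("id,customer_id,total\n10,1,19.99\n11,2,5.00\n12,7,42.50\n")

print(find_orphans(customers, orders, "id", "customer_id"))  # ['7']
```

With real files, pass handles opened with `open(path, newline="")`. An empty list means every child row points at an existing parent.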
4. Export
Choose your dialect: PostgreSQL (with \copy), MySQL (with LOAD DATA INFILE), SQL Server (with bcp), SQLite, MariaDB, DuckDB, or CockroachDB. Or download CSV / JSON / JSONL / Parquet directly.
-- Example output for PostgreSQL
CREATE TABLE customers (
    id INTEGER PRIMARY KEY,
    name VARCHAR(120) NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL
);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    total NUMERIC(10,2) NOT NULL
);
\copy customers FROM '/tmp/customers.csv' WITH (FORMAT csv, HEADER);
\copy orders FROM '/tmp/orders.csv' WITH (FORMAT csv, HEADER);
When something else is the right call
Honest alternatives in case SynthForge is not the best fit for your specific situation.
Mockaroo
Pick Mockaroo if you only need a single table at a time. Multi-table FK requires a chained workflow (generate parent, download, re-upload as Dataset, reference from child).
Faker (Python or Faker.js)
Pick Faker if you are writing inline test fixtures inside a unit test and you only need one or two values per record. Faker has no native concept of foreign keys; you write the FK logic yourself.
Tonic Fabricate
Pick Fabricate if you want a chat-based AI agent and you can fit inside a credit-metered pricing model ($0, $29, or Enterprise). It also preserves multi-table relational integrity.
Frequently asked questions
Does SynthForge support composite foreign keys?
What about self-referential foreign keys (e.g., employees.manager_id -> employees.id)?
How does SynthForge handle many-to-many relationships?
Are foreign keys guaranteed to resolve, or can children point to non-existent parents?
What's the row cap?
Try SynthForge for free
Design a multi-table schema, generate referentially intact data, and export to your database. No credit card.