Use case · Verified May 2026
Load testing data, generated correctly
k6 and Locust drive request load. Your database needs to look like prod, or the test only proves your dev DB is small. SynthForge fills the gap.
Short answer
Generate a multi-table dataset shaped like your prod database (right row counts, right distributions, FK-respecting), bulk-load it into your load-test database, then drive traffic against it. The results are honest because the database under test is honest.
The situation
A k6 script that exercises /api/orders/{id} is meaningless if the orders table has 200 rows. The query planner picks a different plan, the cache is hot for everything, and your test passes for the wrong reason.
What you want is a load-test database with the rough shape of prod: enough rows that index selectivity matters, distributions that match real traffic (a few customers with thousands of orders, most with one or two), and full FK integrity so cascading queries do not 404 mid-test.
The hand-rolled version: a Python script with Faker plus a parent/child loop plus a deduplication helper plus a 'why does this take six hours' moment. The SynthForge version: define the schema once, generate the dataset, bulk-load.
How to do it
1. Mirror your prod schema
Paste a CREATE TABLE script into SynthForge. Mark the FK columns. Save.
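A minimal example of the kind of DDL you might paste, with the FK column called out (table and column names here are illustrative, not SynthForge requirements):

```sql
CREATE TABLE customers (
    id         BIGSERIAL PRIMARY KEY,
    email      TEXT NOT NULL UNIQUE,
    created_at TIMESTAMPTZ NOT NULL
);

CREATE TABLE orders (
    id          BIGSERIAL PRIMARY KEY,
    customer_id BIGINT NOT NULL REFERENCES customers(id),  -- mark this as an FK column
    total_cents BIGINT NOT NULL,
    placed_at   TIMESTAMPTZ NOT NULL
);
```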
2. Set distributions that look like prod
Per-customer order counts often follow a power-law: most customers have 1-3 orders, a few have 1000+. SynthForge supports weighted FK sampling for exactly this shape. For numeric columns: LogNormal for prices, Exponential for time-between-events, Normal for ages.
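For intuition, here is a hand-rolled sketch of those shapes in stdlib Python. The 1/k weighting and all parameters are illustrative, not SynthForge internals:

```python
import random
from collections import Counter

random.seed(42)

# Power-law weights: customer k gets weight 1/k, so a handful of
# customers absorb most orders (most get 1-3, a few get 1000+).
customer_ids = list(range(1, 10_001))
weights = [1 / k for k in customer_ids]

# Weighted FK sampling: pick the customer_id for each of 100k orders.
owners = random.choices(customer_ids, weights=weights, k=100_000)
counts = Counter(owners)

# Column distributions: LogNormal for prices, Exponential for gaps.
price_cents = round(random.lognormvariate(3.5, 1.0) * 100)
gap_seconds = random.expovariate(1 / 300)   # mean ~300 s between events

print("max orders for one customer:", max(counts.values()))
print("median orders per sampled customer:", sorted(counts.values())[len(counts) // 2])
```

The point of the sketch: with uniform FK sampling every customer would get ~10 orders, which is nothing like prod; the weighted version gives one customer thousands while the median customer gets a couple.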
3. Set realistic per-table row counts
The only numbers that matter are the ones that match prod's order of magnitude: 1M customers, 30M orders, 100M line_items is typical for a mid-stage SaaS. Use multiple SynthForge jobs to exceed the 10M-per-request cap.
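Splitting a target table across jobs is simple arithmetic. A sketch, assuming the 10M-per-request cap mentioned above (the cap value and function name are illustrative):

```python
import math

CAP = 10_000_000  # per-request row cap from the text above

def split_jobs(total_rows: int, cap: int = CAP) -> list[int]:
    """Split a target row count into per-job counts, each under the cap."""
    jobs = math.ceil(total_rows / cap)
    base = total_rows // jobs
    # Spread the remainder so the per-job counts sum exactly to total_rows.
    return [base + (1 if i < total_rows % jobs else 0) for i in range(jobs)]

for table, rows in {"customers": 1_000_000, "orders": 30_000_000, "line_items": 100_000_000}.items():
    print(table, split_jobs(rows))
```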
4. Bulk-load before the load test
Use the dialect-specific loader included in the export: \copy for PostgreSQL, LOAD DATA INFILE for MySQL, bcp for SQL Server. All of them are orders of magnitude faster than row-at-a-time INSERTs.
```shell
# Postgres bulk load (fastest)
psql $LOADTEST_DB -f synthforge-ddl.sql
for tbl in customers orders line_items; do
  psql $LOADTEST_DB -c "\copy $tbl FROM '$tbl.csv' CSV HEADER"
done

# Then run k6
k6 run loadtest.js
```

5. Vary the dataset between runs
Change row counts. Change distributions. Test the case where 99% of customers are new vs the case where 99% have order history. SynthForge regenerates in seconds.
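The "99% new vs 99% returning" split is just one parameter in the generator. A hand-rolled sketch of the idea (names, the Pareto tail, and the 5000-order cap are all illustrative):

```python
import random

random.seed(7)

def order_count(p_returning: float) -> int:
    """Orders per customer: new customers get 0, returning get a heavy tail."""
    if random.random() >= p_returning:
        return 0                                          # new: no order history
    return min(int(random.paretovariate(1.2)), 5000)      # returning: heavy tail

mostly_new = [order_count(0.01) for _ in range(100_000)]
mostly_returning = [order_count(0.99) for _ in range(100_000)]

print("total orders, 99% new:", sum(mostly_new))
print("total orders, 99% returning:", sum(mostly_returning))
```

Run the same k6 script against both datasets: the cache hit rate, join fan-out, and index selectivity all change, which is exactly the variation a single fixed dataset hides.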
When something else is the right call
Honest alternatives, for the cases where SynthForge is not the best fit.
k6 / Locust / JMeter / Gatling
These are the load drivers, not the data layer. They sit on top of whatever data you have. SynthForge fills the data layer.
Replaying production traffic against a sanitized prod copy
Highest fidelity, highest cost: it requires a full de-identification pipeline (e.g. Tonic Structural). Worth it for serious capacity-planning work.
pgbench / sysbench
Synthetic micro-benchmarks for the database itself, not your app. Good for raw DB performance, not realistic application load tests.
Frequently asked questions
How big a dataset do I really need for a load test?
Can SynthForge generate distributions that look like real traffic?
How do I handle a 100M-row table?
Does the SynthForge data hit the same query plans as my real data?
Related
Other use cases
Try SynthForge for free
Design a multi-table schema, generate referentially-intact data, and export to your database. No credit card.