SynthForge IO

Generate E-commerce Churn ML Training Data

Generate production-ready e-commerce churn datasets with behavioral engagement signals, purchase value patterns, and configurable churn rates, ready for retention modeling.

Binary classification · 5,000 rows · 6 features · 25/75 imbalance · Behavioral metrics · Noise 0.1

E-commerce Churn template configuration

Here's the pre-built template configuration. Customize everything after loading.

ecommerce-churn.json
{
  "templateName": "E-commerce Churn",
  "taskType": "classification",
  "numSamples": 5000,
  "features": [
    { "name": "months_active", "type": "numeric", "distribution": "uniform", "min": 1, "max": 48 },
    { "name": "order_frequency", "type": "numeric", "distribution": "exponential", "mean": 2.5 },
    { "name": "avg_order_value", "type": "numeric", "distribution": "log-normal", "mean": 75, "std": 40 },
    { "name": "support_tickets", "type": "numeric", "distribution": "poisson", "mean": 1.5 },
    { "name": "product_category", "type": "categorical", "categories": ["electronics", "clothing", "home", "food", "beauty"] },
    { "name": "used_promo", "type": "boolean", "trueRatio": 0.4 }
  ],
  "target": { "labels": ["retained", "churned"], "weights": [75, 25] },
  "noise": 0.1
}
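
As a rough illustration of what this configuration describes, the sketch below draws rows from the listed distributions using only the Python standard library. It is not SynthForge IO's generator. It assumes that the log-normal "mean" and "std" describe the order values themselves (not their logarithms), that categories are equally likely, and that labels are drawn from the target weights independently of the features. The "noise" setting is left out because its exact meaning is not specified here. The helper names `sample_poisson`, `sample_feature`, and `generate_rows` are hypothetical.

```python
import json
import math
import random

# The template shown above, trimmed to the fields this sketch reads.
CONFIG = json.loads("""
{
  "numSamples": 5000,
  "features": [
    {"name": "months_active", "type": "numeric", "distribution": "uniform", "min": 1, "max": 48},
    {"name": "order_frequency", "type": "numeric", "distribution": "exponential", "mean": 2.5},
    {"name": "avg_order_value", "type": "numeric", "distribution": "log-normal", "mean": 75, "std": 40},
    {"name": "support_tickets", "type": "numeric", "distribution": "poisson", "mean": 1.5},
    {"name": "product_category", "type": "categorical",
     "categories": ["electronics", "clothing", "home", "food", "beauty"]},
    {"name": "used_promo", "type": "boolean", "trueRatio": 0.4}
  ],
  "target": {"labels": ["retained", "churned"], "weights": [75, 25]}
}
""")


def sample_poisson(rng, lam):
    """Knuth's multiplication method; adequate for small means such as 1.5."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def sample_feature(rng, spec):
    """Draw one value for a single feature spec."""
    if spec["type"] == "categorical":
        return rng.choice(spec["categories"])  # assumption: equal weights
    if spec["type"] == "boolean":
        return rng.random() < spec["trueRatio"]
    dist = spec["distribution"]
    if dist == "uniform":
        return rng.uniform(spec["min"], spec["max"])
    if dist == "exponential":
        return rng.expovariate(1.0 / spec["mean"])
    if dist == "log-normal":
        # Assumption: mean/std are on the original scale, so convert them
        # to the mu/sigma of the underlying normal distribution.
        sigma2 = math.log(1.0 + (spec["std"] / spec["mean"]) ** 2)
        mu = math.log(spec["mean"]) - sigma2 / 2.0
        return rng.lognormvariate(mu, math.sqrt(sigma2))
    if dist == "poisson":
        return sample_poisson(rng, spec["mean"])
    raise ValueError(f"unsupported distribution: {dist}")


def generate_rows(config, seed=0):
    """Sample numSamples rows; labels are independent of features here."""
    rng = random.Random(seed)
    target = config["target"]
    rows = []
    for _ in range(config["numSamples"]):
        row = {spec["name"]: sample_feature(rng, spec) for spec in config["features"]}
        row["label"] = rng.choices(target["labels"], weights=target["weights"])[0]
        rows.append(row)
    return rows


rows = generate_rows(CONFIG)
churn_share = sum(r["label"] == "churned" for r in rows) / len(rows)
print(len(rows), round(churn_share, 3))
```

Because the labels in this sketch ignore the features, a model trained on its output would have nothing to learn. It only shows the marginal shape of each column.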

Built for E-commerce

Every feature is configured with domain-appropriate distributions and realistic parameters.

Behavioral Engagement Metrics

Months active, order frequency, and promo usage capture customer engagement patterns. They are drawn from uniform, exponential, and boolean distributions respectively.

Purchase Value Patterns

Average order value follows a log-normal distribution. Most orders cluster around a typical value with a long tail of high-value purchases, matching real e-commerce patterns.

Support Interaction Signals

Support ticket counts follow a Poisson distribution. High ticket volumes correlate with churn risk, giving your model a realistic behavioral signal to learn from.

Promo Response Tracking

A boolean promo usage feature with a configurable adoption rate (trueRatio, 0.4 by default). Test how promotional engagement affects churn prediction accuracy.

Who uses E-commerce Churn training data?

Retention & CRM Teams

Build churn prediction models to identify at-risk customers. Test retention intervention strategies with controlled synthetic data before deploying to production.

E-commerce Product Teams

Prototype recommendation and engagement features using realistic customer behavior data. No need to wait for production data pipelines.

Data Science Bootcamps

Teach churn modeling with realistic e-commerce datasets. Students get hands-on experience with behavioral features, class imbalance, and real-world data patterns.

E-commerce Data Realism

SynthForge IO generates e-commerce datasets that mirror the statistical patterns of real customer behavior data.

Log-Normal Order Values

Purchase amounts follow a log-normal distribution. Most orders are moderate, with a long tail of high-value transactions like the one seen in real e-commerce data.
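
To make the long tail concrete, here is a small sketch that converts the template's mean of 75 and standard deviation of 40 into log-normal parameters and then reads off the median and 95th percentile. It assumes those two numbers describe the order values themselves rather than their logarithms, which the template does not state explicitly.

```python
import math
from statistics import NormalDist

mean, std = 75.0, 40.0  # template values for avg_order_value

# Moment matching: choose mu and sigma of log(X) so that X has this mean and std.
sigma2 = math.log(1.0 + (std / mean) ** 2)
sigma = math.sqrt(sigma2)
mu = math.log(mean) - sigma2 / 2.0

# For a log-normal variable the median is exp(mu), and quantiles follow
# from the quantiles of the underlying normal distribution.
median = math.exp(mu)
p95 = math.exp(mu + sigma * NormalDist().inv_cdf(0.95))

print(f"median order value: {median:.2f}")
print(f"95th percentile:    {p95:.2f}")
```

Under this reading the median falls below the mean, which is the right-skew described above: a minority of large orders pulls the average up.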

Poisson Support Tickets

Support interactions follow a Poisson distribution, matching real customer service patterns. A configurable mean rate lets you simulate different support load scenarios.
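
One way to compare support load scenarios before generating anything is to compute how often a customer would file many tickets at a given mean rate. The sketch below uses the Poisson probability mass function directly. The mean of 1.5 is the template default, while 0.5 and 3.0 are hypothetical alternatives chosen only for illustration, as is the threshold of four tickets.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean rate is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def prob_at_least(k, lam):
    """Probability of k or more events when the mean rate is lam."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))


for lam in (0.5, 1.5, 3.0):  # 1.5 is the template default; the rest are hypothetical
    print(f"mean={lam}: P(tickets >= 4) = {prob_at_least(4, lam):.3f}")
```

Raising the mean shifts more customers into the heavy-ticket tail, so the share of rows that look like high-support accounts grows with it.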

Configurable Churn Rate

The default 25/75 churned/retained split is intended to match typical e-commerce churn rates. Adjust the weights to simulate high-churn markets or test extreme imbalance scenarios.
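
When training on an imbalanced split like this, two numbers are worth knowing up front: the accuracy of a model that always predicts the majority class, and the per-class weights that would rebalance the loss. The sketch below computes both for a 75/25 label list. The weighting formula, n / (k * count), is the common "balanced" heuristic (the same one scikit-learn uses for `class_weight="balanced"`); it is one option, not something the template prescribes, and `balanced_class_weights` is a hypothetical helper.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n / (k * count), so rarer classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * count) for label, count in counts.items()}


# A label list with the template's default 75/25 split.
labels = ["retained"] * 75 + ["churned"] * 25

weights = balanced_class_weights(labels)
majority_baseline = max(Counter(labels).values()) / len(labels)

print(weights)
print(f"majority-class accuracy: {majority_baseline:.2f}")
```

Any churn model trained on the default split should beat the majority-class baseline on accuracy, and is usually better judged by precision and recall on the churned class.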

Multi-Category Products

Five product categories with configurable weights let you model category-specific churn patterns and test how product mix affects retention.
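
As an illustration of what category weights do to the product mix, the sketch below samples the five default categories with unequal weights and reports the resulting shares. The weights are hypothetical, since the template as shown lists the categories without any.

```python
import random
from collections import Counter

categories = ["electronics", "clothing", "home", "food", "beauty"]
weights = [30, 25, 20, 15, 10]  # hypothetical weights, not template defaults

rng = random.Random(42)
sample = rng.choices(categories, weights=weights, k=5000)

# Observed share of each category in the sample.
shares = {name: count / len(sample) for name, count in Counter(sample).items()}
for name in categories:
    print(f"{name:12s} {shares[name]:.3f}")
```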

Frequently asked questions

What churn rate does this template use?
The default is 25% churned / 75% retained, which matches typical e-commerce churn rates. You can adjust the class weights to any ratio, from 5/95 for low-churn businesses to 50/50 for balanced training sets.

Can I customize the product categories?
Yes. The template includes 5 default categories (electronics, clothing, home, food, beauty), but you can modify the categories and their weights to match your specific product catalog.

How are behavioral features distributed?
Months active uses a uniform distribution (1 to 48 months), order frequency uses an exponential distribution (modeling sporadic purchasing), average order value uses a log-normal distribution (right-skewed purchase amounts), and support tickets use a Poisson distribution (discrete event counts).

What export formats are available?
Datasets export as a ZIP containing CSV and Parquet files with separate train/test/validation splits, a data quality report, and an auto-generated Jupyter notebook for immediate exploration and modeling.

Can I add more features to the template?
Yes. Templates are starting points. You can add, remove, or modify any feature after loading the template. Add features like last_login_days_ago, review_count, or wishlist_size to increase model complexity.
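
For instance, assuming the template can be edited as JSON in the same shape as ecommerce-churn.json, extra features could be appended as sketched below. The three feature names come from the answer above; their distributions and means are assumptions chosen for illustration, and `add_features` is a hypothetical helper rather than part of SynthForge IO.

```python
import json

# Trimmed copy of the template; only the fields needed for this illustration.
config = {
    "templateName": "E-commerce Churn",
    "features": [
        {"name": "support_tickets", "type": "numeric", "distribution": "poisson", "mean": 1.5},
        {"name": "used_promo", "type": "boolean", "trueRatio": 0.4},
    ],
}

# Names are from the FAQ; the distributions and means are assumed.
extra_features = [
    {"name": "last_login_days_ago", "type": "numeric", "distribution": "exponential", "mean": 14},
    {"name": "review_count", "type": "numeric", "distribution": "poisson", "mean": 2},
    {"name": "wishlist_size", "type": "numeric", "distribution": "poisson", "mean": 4},
]


def add_features(config, new_specs):
    """Append feature specs to the config, rejecting duplicate names."""
    names = {feature["name"] for feature in config["features"]}
    for spec in new_specs:
        if spec["name"] in names:
            raise ValueError(f"duplicate feature name: {spec['name']}")
        config["features"].append(spec)
        names.add(spec["name"])
    return config


add_features(config, extra_features)
print(json.dumps([feature["name"] for feature in config["features"]], indent=2))
```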

Start Generating E-commerce Churn Training Data

Load the E-commerce Churn template, customize features and parameters, and export publication-ready datasets in seconds.