SynthForge IO

Generate Sensor Anomaly ML Training Data

Generate production-ready sensor anomaly datasets with correlated temperature/pressure readings, vibration signals, and configurable anomaly rates, built for predictive maintenance.

Binary classification · 5,000 rows · 6 features · 10/90 imbalance · Correlated features · Noise 0.1

Sensor Anomaly template configuration

Here's the pre-built template configuration. Customize everything after loading.

sensor-anomaly.json
{
  "templateName": "Sensor Anomaly",
  "taskType": "classification",
  "numSamples": 5000,
  "features": [
    { "name": "temperature", "type": "numeric", "distribution": "normal", "mean": 85, "std": 10, "predictiveStrength": 0.6 },
    { "name": "pressure", "type": "numeric", "distribution": "normal", "mean": 30, "std": 5, "predictiveStrength": 0.5 },
    { "name": "vibration", "type": "numeric", "distribution": "exponential", "mean": 2.5, "predictiveStrength": 0.8 },
    { "name": "humidity", "type": "numeric", "distribution": "uniform", "min": 30, "max": 80 },
    { "name": "sensor_type", "type": "categorical", "categories": ["type_A", "type_B", "type_C"] },
    { "name": "is_maintenance_due", "type": "boolean", "trueRatio": 0.15 }
  ],
  "correlations": [
    { "feature1": "temperature", "feature2": "pressure", "coefficient": 0.6 }
  ],
  "target": { "labels": ["normal", "anomaly"], "weights": [90, 10] },
  "noise": 0.1
}
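
Before customizing the template, it can help to sanity-check an edited copy. The sketch below is only an illustration: the `check_config` helper and its rules (correlations must reference numeric features, coefficients must lie in [-1, 1], labels and weights must pair up) are assumptions made for this example, not SynthForge IO's own validation logic. The embedded JSON is a trimmed copy of the template with only the fields the check reads.

```python
import json

# Trimmed copy of the template above (only the fields this check needs).
CONFIG_JSON = """
{
  "numSamples": 5000,
  "features": [
    {"name": "temperature", "type": "numeric"},
    {"name": "pressure", "type": "numeric"},
    {"name": "vibration", "type": "numeric"},
    {"name": "humidity", "type": "numeric"},
    {"name": "sensor_type", "type": "categorical"},
    {"name": "is_maintenance_due", "type": "boolean"}
  ],
  "correlations": [
    {"feature1": "temperature", "feature2": "pressure", "coefficient": 0.6}
  ],
  "target": {"labels": ["normal", "anomaly"], "weights": [90, 10]},
  "noise": 0.1
}
"""


def check_config(config):
    """Hypothetical helper: return a list of problems (empty means OK)."""
    problems = []
    numeric = {f["name"] for f in config["features"] if f["type"] == "numeric"}
    for corr in config.get("correlations", []):
        # Assumed rule: only numeric features can be correlated.
        for key in ("feature1", "feature2"):
            if corr[key] not in numeric:
                problems.append(f"{corr[key]} is not a numeric feature")
        if not -1.0 <= corr["coefficient"] <= 1.0:
            problems.append(f"coefficient {corr['coefficient']} outside [-1, 1]")
    target = config["target"]
    if len(target["labels"]) != len(target["weights"]):
        problems.append("labels and weights differ in length")
    return problems


config = json.loads(CONFIG_JSON)
problems = check_config(config)
print(problems or "config looks consistent")
```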

Built for Sensor Anomaly

Every feature is configured with domain-appropriate distributions and realistic parameters.

Correlated Sensor Readings

Temperature and pressure are correlated at 0.6, reflecting real industrial physics where rising temperature increases pressure. The correlation matrix is verified in the quality report.

Multi-Sensor Feature Types

Numeric readings (temperature, pressure, vibration, humidity), categorical sensor types, and boolean maintenance flags, covering the full range of IoT data types.

Anomaly Rate Configuration

Default 10/90 anomaly/normal split. Adjust to simulate different equipment reliability levels, from 1% (highly reliable) to 30% (aging equipment).
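
To see what a given anomaly rate means in rows, the arithmetic can be sketched as follows. The `class_counts` function is a hypothetical helper written for this example. It assumes the `weights` in the template are relative proportions, and how SynthForge IO itself rounds, or how the `noise` setting affects the final label counts, is not covered here.

```python
def class_counts(num_samples, labels, weights):
    """Split num_samples across labels in proportion to weights."""
    total = sum(weights)
    counts = {label: num_samples * w // total for label, w in zip(labels, weights)}
    # Give any rounding remainder to the first label (an arbitrary choice).
    counts[labels[0]] += num_samples - sum(counts.values())
    return counts


for anomaly_pct in (1, 10, 30):
    weights = [100 - anomaly_pct, anomaly_pct]
    print(anomaly_pct, class_counts(5000, ["normal", "anomaly"], weights))
```

At the default 5,000 rows, a 1% rate leaves only 50 anomaly rows, compared with 500 at 10% and 1,500 at 30%, so very low rates may call for a larger `numSamples`.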

Vibration as Primary Signal

Vibration has the highest predictive strength (0.8), matching real-world predictive maintenance where vibration is the strongest indicator of impending equipment failure.

Who uses Sensor Anomaly training data?

Predictive Maintenance Teams

Build anomaly detection models for manufacturing equipment. Test alert thresholds and maintenance scheduling algorithms with controlled synthetic sensor data.

IoT Platform Engineers

Develop and test sensor data pipelines with realistic multi-sensor datasets. Validate ingestion, feature engineering, and real-time scoring workflows.

Industrial Data Scientists

Prototype and benchmark anomaly detection models with known ground truth. Compare isolation forests, autoencoders, and ensemble methods on controlled data.

Industrial Sensor Realism

SynthForge IO generates sensor data that mirrors the statistical properties of real industrial monitoring systems.

Physics-Based Correlations

Temperature and pressure correlate at 0.6, reflecting real thermodynamic relationships. The correlation matrix ensures multi-sensor readings behave like real equipment data.
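
The FAQ mentions Cholesky decomposition as the mechanism behind correlated features. A minimal sketch of that general technique, using only the Python standard library, might look like the code below. It is not SynthForge IO's implementation: the function names are invented for this example, and it covers only the two normally distributed features (temperature and pressure) with the template's means, standard deviations, and 0.6 coefficient.

```python
import math
import random


def cholesky(matrix):
    """Return lower-triangular L such that L times its transpose equals matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def correlated_normals(n, means, stds, corr, seed=0):
    """Draw n rows of normal features whose correlation matrix is corr."""
    rng = random.Random(seed)
    lower = cholesky(corr)
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in means]
        # Mix the independent draws so they acquire the target correlation.
        mixed = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(len(z))]
        rows.append([m + s * x for m, s, x in zip(means, stds, mixed)])
    return rows


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rows = correlated_normals(5000, means=[85, 30], stds=[10, 5],
                          corr=[[1.0, 0.6], [0.6, 1.0]])
temperature = [r[0] for r in rows]
pressure = [r[1] for r in rows]
print(f"sample correlation: {pearson(temperature, pressure):.2f}")
```

The printed sample correlation should land close to 0.6, which is the same kind of check the quality report is described as performing.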

Exponential Vibration

Vibration follows an exponential distribution: mostly low values with occasional spikes. This matches real equipment, where a rise in vibration signals impending failure.
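
To get a feel for that shape, the following sketch draws 5,000 values from an exponential distribution with the template's mean of 2.5 using Python's standard library. The "spike" threshold of three times the mean is an arbitrary choice for illustration, not a SynthForge IO setting.

```python
import random

rng = random.Random(42)
MEAN = 2.5

# random.expovariate takes the rate (lambda), which is 1 / mean.
vibration = [rng.expovariate(1.0 / MEAN) for _ in range(5000)]

sample_mean = sum(vibration) / len(vibration)
below_mean = sum(v < MEAN for v in vibration) / len(vibration)
spikes = sum(v > 3 * MEAN for v in vibration) / len(vibration)

print(f"mean={sample_mean:.2f} below_mean={below_mean:.0%} spikes={spikes:.1%}")
```

In theory about 63% of exponential draws fall below the mean (1 - e^-1) and about 5% exceed three times the mean (e^-3), which is the "mostly low, occasionally high" pattern described above.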

Sensor Type Segmentation

Three sensor types allow you to model equipment-specific anomaly patterns. Test whether your model generalizes across sensor types or needs per-type calibration.

Maintenance Flag Signal

Boolean maintenance-due indicator with 15% true rate. This auxiliary feature tests whether your model can combine scheduled maintenance context with real-time sensor readings.

Frequently asked questions

How do feature correlations work in this template?
Temperature and pressure are correlated with a coefficient of 0.6, meaning that when temperature rises, pressure tends to rise too. SynthForge IO uses a Cholesky decomposition to generate correlated features while preserving individual distributions. The quality report verifies that the actual correlation matches your configuration.
Can I add more sensor types?
Yes. The template is a starting point. You can add features like rotational_speed, current_draw, acoustic_level, or oil_temperature. You can also add correlations between any pair of numeric features.
What anomaly rate should I use?
The default is 10% anomalies, which is a good starting point for model development. For realistic production scenarios, use 1-5%. For balanced training, use 50%. Test multiple rates to understand how your model's precision-recall trade-off changes.
What export formats are available?
Datasets export as a ZIP containing CSV and Parquet files with separate train/test/validation splits, a data quality report, and an auto-generated Jupyter notebook for immediate exploration and modeling.
Why is vibration the strongest predictor?
Vibration has a predictive strength of 0.8 (the highest), matching real-world predictive maintenance where vibration analysis is the most reliable indicator of bearing wear, misalignment, and impending mechanical failure.

Start Generating Sensor Anomaly Training Data

Load the Sensor Anomaly template, customize features and parameters, and export publication-ready datasets in seconds.