Quick Start

Get started with Data Caterer in minutes. Choose your preferred approach:

Java/Scala API (Recommended): Full programmatic control for complex scenarios and test integration.
YAML: Configuration-based approach. Great for CI/CD pipelines.
UI: Point-and-click interface. No coding required.

Java/Scala API

The recommended approach for full control over data generation. Write your data generation logic in Scala or Java.

Run

git clone git@github.com:data-catering/data-caterer.git
cd data-caterer/example
./run.sh

Press Enter to run the default example, or enter a class name (e.g., CsvPlan).

What Happens

Builds your Scala/Java code into a JAR
Runs it via Docker with the Data Caterer engine
Generates data and reports to docker/sample/

Example Code

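import io.github.datacatering.datacaterer.api.PlanRun
import io.github.datacatering.datacaterer.api.model.DoubleType

class CsvPlan extends PlanRun {
  // Write CSV files with a header row to /opt/app/data/accounts
  val accountTask = csv("accounts", "/opt/app/data/accounts", Map("header" -> "true"))
    .fields(
      field.name("account_id").regex("ACC[0-9]{8}").unique(true), // e.g. ACC12345678, no duplicates
      field.name("name").expression("#{Name.name}"),              // generated full name
      field.name("balance").`type`(DoubleType).min(10).max(1000), // double between 10 and 1000
      field.name("status").oneOf("open", "closed", "pending")     // one of the listed values
    )
    .count(count.records(100)) // generate 100 records

  execute(accountTask)
}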

More Examples

Class	Description
`DocumentationPlanRun`	JSON + CSV with foreign keys (default)
`CsvPlan`	CSV files with relationships
`PostgresPlanRun`	PostgreSQL tables
`KafkaPlanRun`	Kafka messages
`ValidationPlanRun`	Generate and validate data

Run any example: ./run.sh <ClassName>

All example classes are in src/main/scala/io/github/datacatering/plan/.
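
To run several of the example classes listed above back to back, a small wrapper can call run.sh once per class. The following is a minimal sketch, not part of Data Caterer: `run_examples` and the `RUN_SCRIPT` variable are hypothetical names, and it assumes that run.sh exits with a non-zero status when an example fails.

```shell
#!/bin/sh
# Hypothetical helper, not part of Data Caterer: run example classes one after another.
# RUN_SCRIPT (hypothetical) lets you point at run.sh from another directory; default is ./run.sh.
run_examples() {
  script="${RUN_SCRIPT:-./run.sh}"
  for class in "$@"; do
    echo "Running example: $class"
    # Assumption: run.sh exits non-zero when the example fails.
    if ! "$script" "$class"; then
      echo "Example failed: $class" >&2
      return 1
    fi
  done
}
```

From data-caterer/example, calling `run_examples DocumentationPlanRun CsvPlan` would run both classes in order and stop at the first one that fails.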

YAML

Define data generation using a single unified YAML configuration file.

Run

git clone git@github.com:data-catering/data-caterer.git
cd data-caterer/example
./run.sh unified/ecommerce-unified.yaml

What Happens

Builds the example JAR
Runs the unified YAML plan via Docker
Generates data and reports to docker/data/custom/

Example YAML

The unified format combines connections, data sources, and generation config in a single file:

name: "csv_example"
description: "Create transaction data in CSV file"

dataSources:
  - name: "csv_transactions"
    connection:
      type: "csv"
      options:
        path: "/opt/app/data/transactions"
        header: "true"
    steps:
      - name: "transactions"
        count:
          records: 1000
        fields:
          - name: "account_id"
            options:
              regex: "ACC[0-9]{8}"
          - name: "amount"
            type: "double"
            options:
              min: 10
              max: 1000

More Examples

YAML File	Description
`unified/ecommerce-unified.yaml`	Complete e-commerce with foreign keys

Run any example by passing its path as listed in the table, e.g. ./run.sh unified/ecommerce-unified.yaml

All unified YAML files are self-contained, so no separate task files are needed.

UI

A web interface for creating and running data generation plans.

Run

docker run -d -p 9898:9898 -e DEPLOY_MODE=standalone --name datacaterer datacatering/data-caterer:0.19.1

Open http://localhost:9898 in your browser.
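
Because the container starts in the background (`-d`), the UI may take a moment before it accepts requests. The sketch below polls the address until it responds. `wait_for_ui` is a hypothetical helper, not part of Data Caterer, and it assumes that curl is installed and that the UI answers a plain HTTP GET on the root path with a success status.

```shell
#!/bin/sh
# Hypothetical helper, not part of Data Caterer: poll a URL until it answers.
# Usage: wait_for_ui [url] [attempts] [delay_seconds]
wait_for_ui() {
  url="${1:-http://localhost:9898}"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: no progress output, -f: fail on HTTP error status, -o /dev/null: discard the body
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      echo "UI is responding at $url"
      return 0
    fi
    # Pause between attempts, but not after the last one
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "No response from $url after $attempts attempts" >&2
  return 1
}
```

Calling `wait_for_ui` with no arguments checks http://localhost:9898 up to 30 times, two seconds apart. The defaults are arbitrary choices for illustration.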

What You Can Do

Create connections to databases, files, Kafka, and more
Define data schemas with field types and constraints
Generate test data with a single click
View results and reports in the browser

View Results

After running, check the generated report:

Java/Scala examples: docker/sample/report/index.html
YAML examples: docker/data/custom/report/index.html
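
A quick way to confirm that a run produced its report is to test for the file at the matching path above. This is a minimal sketch with hypothetical helper names (`report_path`, `check_report`). It assumes the paths are relative to the directory you ran run.sh from.

```shell
#!/bin/sh
# Hypothetical helper: map an example type to the report path listed above.
report_path() {
  case "$1" in
    scala|java) echo "docker/sample/report/index.html" ;;
    yaml)       echo "docker/data/custom/report/index.html" ;;
    *)          echo "unknown example type: $1" >&2; return 1 ;;
  esac
}

# Print the report location if the file exists, otherwise say what is missing.
# Assumption: paths are relative to the current working directory.
check_report() {
  path="$(report_path "$1")" || return 1
  if [ -f "$path" ]; then
    echo "Report ready: $path"
  else
    echo "No report at $path; has the example finished running?" >&2
    return 1
  fi
}
```

For example, `check_report yaml` prints the report location after a YAML run, or returns a non-zero status if the file is not there yet.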

Next Steps

Step-by-Step Guide: First data generation guide. Learn Data Caterer's full capabilities.
All Guides: Browse all guides for specific use cases and data sources.
Data Sources: Supported connections, including databases, files, messaging, and HTTP.