Quick Start
Get started with Data Caterer in minutes. Choose your preferred approach:
- Java/Scala API: Full programmatic control for complex scenarios and test integration.
- YAML: Configuration-based approach. Great for CI/CD pipelines.
- UI: Point-and-click interface. No coding required.
Java/Scala API
The recommended approach for full control over data generation. Write your data generation logic in Scala or Java.
Run
Press Enter to run the default example, or enter a class name (e.g., CsvPlan).
What Happens
- Builds your Scala/Java code into a JAR
- Runs it via Docker with the Data Caterer engine
- Generates data and reports to docker/sample/
Example Code
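import io.github.datacatering.datacaterer.api.PlanRun
import io.github.datacatering.datacaterer.api.model.DoubleType

class CsvPlan extends PlanRun {
  // CSV output written to /opt/app/data/accounts with a header row
  val accountTask = csv("accounts", "/opt/app/data/accounts", Map("header" -> "true"))
    .fields(
      field.name("account_id").regex("ACC[0-9]{8}").unique(true), // "ACC" followed by 8 digits, no duplicates
      field.name("name").expression("#{Name.name}"),              // generated person name
      field.name("balance").`type`(DoubleType).min(10).max(1000), // double between 10 and 1000
      field.name("status").oneOf("open", "closed", "pending")     // one of the listed values
    )
    .count(count.records(100)) // generate 100 records

  execute(accountTask)
}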
More Examples
| Class | Description |
|---|---|
| DocumentationPlanRun | JSON + CSV with foreign keys (default) |
| CsvPlan | CSV files with relationships |
| PostgresPlanRun | PostgreSQL tables |
| KafkaPlanRun | Kafka messages |
| ValidationPlanRun | Generate and validate data |
Run any example: ./run.sh <ClassName>
All example classes are in src/main/scala/io/github/datacatering/plan/.
YAML
Define data generation using a single unified YAML configuration file.
Run
git clone git@github.com:data-catering/data-caterer.git
cd data-caterer/example
./run.sh unified/ecommerce-unified.yaml
What Happens
- Builds the example JAR
- Runs the unified YAML plan via Docker
- Generates data and reports to docker/data/custom/
Example YAML
The unified format combines connections, data sources, and generation config in a single file:
name: "csv_example"
description: "Create transaction data in CSV file"

dataSources:
  - name: "csv_transactions"
    connection:
      type: "csv"
      options:
        path: "/opt/app/data/transactions"
        header: "true"
    steps:
      - name: "transactions"
        count:
          records: 1000
        fields:
          - name: "account_id"
            options:
              regex: "ACC[0-9]{8}"
          - name: "amount"
            type: "double"
            options:
              min: 10
              max: 1000
More Examples
| YAML File | Description |
|---|---|
| unified/ecommerce-unified.yaml | Complete e-commerce with foreign keys |
Run any example: ./run.sh <filename>.yaml
All unified YAML files are self-contained - no separate task files needed.
UI
A web interface for creating and running data generation plans.
Run
docker run -d -p 9898:9898 -e DEPLOY_MODE=standalone --name datacaterer datacatering/data-caterer:0.19.0
Open http://localhost:9898 in your browser.
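The container may take a moment to start serving requests after `docker run` returns. The sketch below polls the UI address until it responds. The function name `wait_for_ui` and its defaults are hypothetical and not part of Data Caterer, and it assumes `curl` is installed and that the root URL returns a successful HTTP status once the UI is ready.

```shell
# Hypothetical helper: poll a URL until it responds or the attempts run out.
# Usage: wait_for_ui [url] [attempts] [delay_seconds]
wait_for_ui() {
  local url="${1:-http://localhost:9898}"
  local attempts="${2:-30}"
  local delay="${3:-1}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s: silent, -f: treat HTTP errors as failure, -o: discard the body
    if curl -sf --max-time 5 -o /dev/null "$url"; then
      echo "UI is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "UI did not respond at $url after $attempts attempts" >&2
  return 1
}
```

Calling `wait_for_ui` with no arguments checks http://localhost:9898 for up to 30 attempts, one second apart.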
What You Can Do
- Create connections to databases, files, Kafka, and more
- Define data schemas with field types and constraints
- Generate test data with a single click
- View results and reports in the browser
View Results
After running, check the generated report:
- Java/Scala examples: docker/sample/report/index.html
- YAML examples: docker/data/custom/report/index.html
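To confirm that a run actually produced a report before opening it, a small check along these lines may help. This is a sketch: the function name `report_path` and its `scala`/`yaml` arguments are hypothetical, and it assumes the report paths above are relative to the directory you pass in (the current directory by default).

```shell
# Hypothetical helper: print the report path for a run type, or fail if the
# report has not been generated yet.
# Usage: report_path <scala|yaml> [base_dir]
report_path() {
  local kind="$1"
  local base="${2:-.}"
  local dir
  case "$kind" in
    scala) dir="docker/sample/report" ;;       # Java/Scala examples
    yaml)  dir="docker/data/custom/report" ;;  # YAML examples
    *)
      echo "unknown run type: $kind" >&2
      return 2
      ;;
  esac
  local report="$base/$dir/index.html"
  if [ -f "$report" ]; then
    echo "$report"
  else
    echo "report not found: $report" >&2
    return 1
  fi
}
```

For example, `report_path yaml` prints the YAML report path if the file exists and returns a non-zero status otherwise.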
Next Steps
- Step-by-Step Guide: First data generation guide - learn Data Caterer's full capabilities.
- All Guides: Browse all guides for specific use cases and data sources.
- Data Sources: Supported connections - databases, files, messaging, and HTTP.