# Quick Start
Get started with Data Caterer in minutes. Choose your preferred approach:
- Java/Scala API: full programmatic control for complex scenarios and test integration.
- YAML: configuration-based approach, well suited to CI/CD pipelines.
- UI: point-and-click interface, no coding required.
## Java/Scala API
The recommended approach for full control over data generation. Write your data generation logic in Scala or Java.
### Run
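Assuming you are in the example project root, start the interactive runner with the same `./run.sh` script referenced under More Examples below:

```bash
./run.sh
```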
Press Enter to run the default example, or enter a class name (e.g., `CsvPlan`).
### What Happens

- Builds your Scala/Java code into a JAR
- Runs it via Docker with the Data Caterer engine
- Generates data and reports to `docker/sample/`
### Example Code

```scala
import io.github.datacatering.datacaterer.api.PlanRun
import io.github.datacatering.datacaterer.api.model.DoubleType

class CsvPlan extends PlanRun {
  // Write 100 account records to a headered CSV file
  val accountTask = csv("accounts", "/opt/app/data/accounts", Map("header" -> "true"))
    .fields(
      field.name("account_id").regex("ACC[0-9]{8}").unique(true),  // unique IDs matching ACC########
      field.name("name").expression("#{Name.name}"),               // Faker expression for realistic names
      field.name("balance").`type`(DoubleType).min(10).max(1000),
      field.name("status").oneOf("open", "closed", "pending")
    )
    .count(count.records(100))

  execute(accountTask)
}
```
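The same fluent builder works for other connectors. As a rough sketch, assuming the `json` helper accepts a name and path like `csv` does (which the JSON-based `DocumentationPlanRun` example suggests), the same plan class could also write JSON:

```scala
// Hypothetical variant inside the same PlanRun class: write a subset of the
// account fields to JSON instead of CSV (json(...) signature assumed to mirror csv(...))
val accountJsonTask = json("accounts_json", "/opt/app/data/accounts_json")
  .fields(
    field.name("account_id").regex("ACC[0-9]{8}"),
    field.name("status").oneOf("open", "closed", "pending")
  )
  .count(count.records(100))
```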
### More Examples
| Class | Description |
|---|---|
| `DocumentationPlanRun` | JSON + CSV with foreign keys (default) |
| `CsvPlan` | CSV files with relationships |
| `PostgresPlanRun` | PostgreSQL tables |
| `KafkaPlanRun` | Kafka messages |
| `ValidationPlanRun` | Generate and validate data |
Run any example: `./run.sh <ClassName>`

All example classes are in `src/main/scala/io/github/datacatering/plan/`.
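For example, to run the PostgreSQL example from the table above:

```bash
./run.sh PostgresPlanRun
```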
## YAML
Define data generation using YAML configuration files.
### Run
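The YAML examples use the same `./run.sh` helper, passing the plan file name (see the note under More Examples below). For instance, for the CSV plan shown next:

```bash
./run.sh csv.yaml
```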
### What Happens

- Builds the example JAR
- Runs the YAML plan via Docker
- Generates data and reports to `docker/data/custom/`
### Example YAML

Plan file (`docker/data/custom/plan/csv.yaml`):

```yaml
name: "csv_example_plan"
description: "Create transaction data in CSV file"
tasks:
  - name: "csv_transaction_file"
    dataSourceName: "csv"
    enabled: true
```
Task file (`docker/data/custom/task/file/csv/`):

```yaml
name: "csv_transaction_file"
steps:
  - name: "transactions"
    type: "csv"
    options:
      path: "/opt/app/data/transactions"
      header: "true"
    count:
      records: 1000
    fields:
      - name: "account_id"
        options:
          regex: "ACC[0-9]{8}"
      - name: "amount"
        type: "double"
        options:
          min: 10
          max: 1000
```
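Additional fields follow the same pattern. As a sketch, assuming the YAML option keys mirror the Scala builder methods shown earlier (e.g. `oneOf`), a categorical field could be appended to the `fields` list like this:

```yaml
fields:
  # hypothetical extra field; option name assumed to match the Scala API's oneOf
  - name: "status"
    options:
      oneOf: ["open", "closed", "pending"]
```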
### More Examples
| Plan File | Description |
|---|---|
| `csv.yaml` | CSV files |
| `parquet.yaml` | Parquet files |
| `postgres.yaml` | PostgreSQL tables |
| `kafka.yaml` | Kafka messages |
| `foreign-key.yaml` | Data with relationships |
| `validation.yaml` | Generate and validate |
Run any example: `./run.sh <filename>.yaml`

All plan files are in `docker/data/custom/plan/`. Task definitions are in `docker/data/custom/task/`.
## UI
A web interface for creating and running data generation plans.
### Run

```bash
docker run -d -p 9898:9898 -e DEPLOY_MODE=standalone --name datacaterer datacatering/data-caterer:0.18.0
```

Open http://localhost:9898 in your browser.
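If the page does not load, checking the container logs is a quick first step (a standard Docker command, not specific to Data Caterer):

```bash
docker logs datacaterer
```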
### What You Can Do
- Create connections to databases, files, Kafka, and more
- Define data schemas with field types and constraints
- Generate test data with a single click
- View results and reports in the browser
## View Results

After running, check the generated report:

- Java/Scala examples: `docker/sample/report/index.html`
- YAML examples: `docker/data/custom/report/index.html`
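The report is a static HTML page, so it can be opened straight from the command line; for example, on macOS:

```bash
# on Linux, use xdg-open instead of open
open docker/sample/report/index.html
```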
## Next Steps

- Step-by-Step Guide: a first data generation walkthrough that covers Data Caterer's full capabilities.
- All Guides: browse all guides for specific use cases and data sources.
- Data Sources: supported connections, including databases, files, messaging, and HTTP.