External Source Validations

Use validations that are defined in external sources such as Great Expectations or OpenMetadata. This allows you to generate data for your upstream data sources and validate your pipelines based on the same rules that would be applied in production.

Example flow with validations from external source

Supported Sources

Source              | Support
OpenMetadata        | ✅
Great Expectations  | ✅
DBT Constraints     | ❌
SodaCL              | ❌
MonteCarlo          | ❌

OpenMetadata

Use data quality rules defined in OpenMetadata to validate your dataset.

Java

var jsonTask = json("my_json", "/opt/app/data/json")
    .validations(metadataSource().openMetadata(
        "http://host.docker.internal:8585/api",
        Constants.OPEN_METADATA_AUTH_TYPE_OPEN_METADATA(),
        Map.of(
                Constants.OPEN_METADATA_JWT_TOKEN(), "abc123", // find under settings/bots/ingestion-bot/token
                Constants.OPEN_METADATA_TABLE_FQN(), "sample_data.ecommerce_db.shopify.raw_customer"
        )
    ));

var conf = configuration().enableGenerateValidations(true);

Scala

val jsonTask = json("my_json", "/opt/app/data/json")
  .validations(metadataSource.openMetadata(
    "http://host.docker.internal:8585/api",
    OPEN_METADATA_AUTH_TYPE_OPEN_METADATA,
    Map(
      OPEN_METADATA_JWT_TOKEN -> "abc123", // find under settings/bots/ingestion-bot/token
      OPEN_METADATA_TABLE_FQN -> "sample_data.ecommerce_db.shopify.raw_customer"
    )
  ))

val conf = configuration.enableGenerateValidations(true)

YAML

name: "account_checks"
dataSources:
  my_json:
    - options:
        metadataSourceType: "openMetadata"
        authType: "openMetadataJwtToken"
        openMetadataJwtToken: "abc123"
        tableFqn: "sample_data.ecommerce_db.shopify.raw_customer"

Great Expectations

Use data quality rules defined in a Great Expectations expectations file to validate your dataset.

Java

var jsonTask = json("my_json", "/opt/app/data/json")
    .validations(metadataSource().greatExpectations("great-expectations/taxi-expectations.json"));

var conf = configuration().enableGenerateValidations(true);

Scala

val jsonTask = json("my_json", "/opt/app/data/json")
  .validations(metadataSource.greatExpectations("great-expectations/taxi-expectations.json"))

val conf = configuration.enableGenerateValidations(true)

YAML

name: "account_checks"
dataSources:
  my_json:
    - options:
        metadataSourceType: "greatExpectations"
        expectationsFile: "great-expectations/taxi-expectations.json"
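
The examples above point at an expectations file but do not show its contents. As a rough sketch, a file in the Great Expectations expectation suite JSON layout might look like the following. The suite name, column names, and value bounds are hypothetical placeholders, and which expectation types are translated into validations is not stated here, so treat this as an illustration of the file shape only.

```json
{
  "expectation_suite_name": "taxi.demo",
  "expectations": [
    {
      "expectation_type": "expect_column_values_to_not_be_null",
      "kwargs": { "column": "vendor_id" },
      "meta": {}
    },
    {
      "expectation_type": "expect_column_values_to_be_between",
      "kwargs": { "column": "passenger_count", "min_value": 1, "max_value": 6 },
      "meta": {}
    }
  ],
  "meta": {}
}
```

Each entry in `expectations` names a rule (`expectation_type`) and its parameters (`kwargs`). The path passed to `greatExpectations(...)` or set in `expectationsFile` would then refer to a file of this kind.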