Skip to content

HTTP Source

Creating a data generator based on an OpenAPI/Swagger document.

Generate HTTP requests

Requirements

  • 10 minutes
  • Git
  • Gradle
  • Docker

Get Started

First, we will clone the data-caterer-example repo which will already have the base project setup required.

git clone git@github.com:data-catering/data-caterer-example.git
git clone git@github.com:data-catering/data-caterer-example.git
git clone git@github.com:data-catering/data-caterer-example.git

HTTP Setup

We will be using the http-bin docker image to help simulate a service with HTTP endpoints.

Start it via:

cd docker
docker-compose up -d http
docker ps

Plan Setup

Create a new Java or Scala class.

  • Java: src/main/java/io/github/datacatering/plan/MyAdvancedHttpJavaPlanRun.java
  • Scala: src/main/scala/io/github/datacatering/plan/MyAdvancedHttpPlanRun.scala

Make sure your class extends PlanRun.

import io.github.datacatering.datacaterer.java.api.PlanRun;
...

public class MyAdvancedHttpJavaPlanRun extends PlanRun {
    {
        var conf = configuration().enableGeneratePlanAndTasks(true)
            .generatedReportsFolderPath("/opt/app/data/report");
    }
}
import io.github.datacatering.datacaterer.api.PlanRun
...

class MyAdvancedHttpPlanRun extends PlanRun {
  val conf = configuration.enableGeneratePlanAndTasks(true)
    .generatedReportsFolderPath("/opt/app/data/report")
}

We will enable generate plan and tasks so that we can read from external sources for metadata and save the reports under a folder we can easily access.

Schema

We can point the schema of a data source to a OpenAPI/Swagger document or URL. For this example, we will use the OpenAPI document found under docker/mount/http/petstore.json in the data-caterer-example repo. This is a simplified version of the original OpenAPI spec that can be found here.

We have kept the following endpoints to test out:

  • GET /pets - get all pets
  • POST /pets - create a new pet
  • GET /pets/{id} - get a pet by id
  • DELETE /pets/{id} - delete a pet by id
var httpTask = http("my_http")
        .schema(metadataSource().openApi("/opt/app/mount/http/petstore.json"))
        .count(count().records(2));
val httpTask = http("my_http")
  .schema(metadataSource.openApi("/opt/app/mount/http/petstore.json"))
  .count(count.records(2))

The above defines that the schema will come from an OpenAPI document found on the pathway defined. It will then generate 2 requests per request method and endpoint combination.

Run

Let's try run and see what happens.

cd ..
./run.sh
#input class MyAdvancedHttpJavaPlanRun or MyAdvancedHttpPlanRun
#after completing
docker logs -f docker-http-1

It should look something like this.

172.21.0.1 [06/Nov/2023:01:06:53 +0000] GET /anything/pets?tags%3DeXQxFUHVja+EYm%26limit%3D33895 HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:06:53 +0000] GET /anything/pets?tags%3DSXaFvAqwYGF%26tags%3DjdNRFONA%26limit%3D40975 HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:06:56 +0000] POST /anything/pets HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:06:56 +0000] POST /anything/pets HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:07:00 +0000] GET /anything/pets/kbH8D7rDuq HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:07:00 +0000] GET /anything/pets/REsa0tnu7dvekGDvxR HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:07:03 +0000] DELETE /anything/pets/EqrOr1dHFfKUjWb HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:07:03 +0000] DELETE /anything/pets/7WG7JHPaNxP HTTP/1.1 200 Host: host.docker.internal}

Looks like we have some data now. But we can do better and add some enhancements to it.

Foreign keys

The four different requests that get sent could have the same id passed across to each of them if we define a foreign key relationship. This will make it more realistic to a real life scenario as pets get created and queried by a particular id value. We note that the id value is first used when a pet is created in the body of the POST request. Then it gets used as a path parameter in the DELETE and GET requests.

To link them all together, we must follow a particular pattern when referring to request body, query parameter or path parameter columns.

HTTP Type Column Prefix Example
Request Body bodyContent bodyContent.id
Path Parameter pathParam pathParamid
Query Parameter queryParam queryParamid
Header header headerContent_Type

Also note, that when creating a foreign field definition for a HTTP data source, to refer to a specific endpoint and method, we have to follow the pattern of {http method}{http path}. For example, POST/pets. Let's apply this knowledge to link all the id values together.

var myPlan = plan().addForeignKeyRelationship(
        foreignField("my_http", "POST/pets", "bodyContent.id"),     //source of foreign key value
        foreignField("my_http", "DELETE/pets/{id}", "pathParamid"),
        foreignField("my_http", "GET/pets/{id}", "pathParamid")
);

execute(myPlan, conf, httpTask);
val myPlan = plan.addForeignKeyRelationship(
  foreignField("my_http", "POST/pets", "bodyContent.id"),     //source of foreign key value
  foreignField("my_http", "DELETE/pets/{id}", "pathParamid"),
  foreignField("my_http", "GET/pets/{id}", "pathParamid")
)

execute(myPlan, conf, httpTask)

Let's test it out by running it again

./run.sh
#input class MyAdvancedHttpJavaPlanRun or MyAdvancedHttpPlanRun
docker logs -f docker-http-1
172.21.0.1 [06/Nov/2023:01:33:59 +0000] GET /anything/pets?limit%3D45971 HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:34:00 +0000] GET /anything/pets?limit%3D62015 HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:34:04 +0000] POST /anything/pets HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:34:05 +0000] POST /anything/pets HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:34:09 +0000] DELETE /anything/pets/5e HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:34:09 +0000] DELETE /anything/pets/IHPm2 HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:34:14 +0000] GET /anything/pets/IHPm2 HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:34:14 +0000] GET /anything/pets/5e HTTP/1.1 200 Host: host.docker.internal}

Now we have the same id values being produced across the POST, DELETE and GET requests! What if we knew that the id values should follow a particular pattern?

Custom metadata

So given that we have defined a foreign key where the root of the foreign key values is from the POST request, we can update the metadata of the id column for the POST request and it will proliferate to the other endpoints as well. Given the id column is a nested column as noted in the foreign key, we can alter its metadata via the following:

var httpTask = http("my_http")
        .schema(metadataSource().openApi("/opt/app/mount/http/petstore.json"))
        .schema(field().name("bodyContent").schema(field().name("id").regex("ID[0-9]{8}")))
        .count(count().records(2));
val httpTask = http("my_http")
  .schema(metadataSource.openApi("/opt/app/mount/http/petstore.json"))
  .schema(field.name("bodyContent").schema(field.name("id").regex("ID[0-9]{8}")))
  .count(count.records(2))

We first get the column bodyContent, then get the nested schema and get the column id and add metadata stating that id should follow the patter ID[0-9]{8}.

Let's try run again, and hopefully we should see some proper ID values.

./run.sh
#input class MyAdvancedHttpJavaPlanRun or MyAdvancedHttpPlanRun
docker logs -f docker-http-1
172.21.0.1 [06/Nov/2023:01:45:45 +0000] GET /anything/pets?tags%3D10fWnNoDz%26limit%3D66804 HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:45:46 +0000] GET /anything/pets?tags%3DhyO6mI8LZUUpS HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:45:50 +0000] POST /anything/pets HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:45:51 +0000] POST /anything/pets HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:45:52 +0000] DELETE /anything/pets/ID55185420 HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:45:52 +0000] DELETE /anything/pets/ID20618951 HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:45:57 +0000] GET /anything/pets/ID55185420 HTTP/1.1 200 Host: host.docker.internal}
172.21.0.1 [06/Nov/2023:01:45:57 +0000] GET /anything/pets/ID20618951 HTTP/1.1 200 Host: host.docker.internal}

Great! Now we have replicated a production-like flow of HTTP requests.

Ordering

If you wanted to change the ordering of the requests, you can alter the order from within the OpenAPI/Swagger document. This is particularly useful when you want to simulate the same flow that users would take when utilising your application (i.e. create account, query account, update account).

Rows per second

By default, Data Caterer will push requests per method and endpoint at a rate of around 5 requests per second. If you want to alter this value, you can do so via the below configuration. The lowest supported requests per second is 1.

import io.github.datacatering.datacaterer.api.model.Constants;

...
var httpTask = http("my_http", Map.of(Constants.ROWS_PER_SECOND(), "1"))
        ...
import io.github.datacatering.datacaterer.api.model.Constants.ROWS_PER_SECOND

...
val httpTask = http("my_http", options = Map(ROWS_PER_SECOND -> "1"))
  ...

Check out the full example under AdvancedHttpPlanRun in the example repo.