Overview

Load processors in ONE DATA bring data into a workflow as input for further processing. This data can be either internal (Custom Input Table, Data Tables, etc.) or external (databases, APIs, etc.).

This section introduces the various load processors used in ONE DATA. Links to the detailed explanation of each processor are also provided.


Multi URL API Load

The Multi URL API Load processor fetches data from REST APIs that can be targeted with GET requests. A path through the retrieved JSON can be provided to further specify the content of the created rows. 

The processor operates on any input data containing valid URLs that return JSON and has two output nodes.

  • The first output node returns the result JSON Object.
  • The second output node returns the failing URLs along with the corresponding error messages.
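
The split between the two output nodes can be sketched in plain Python. This is only an illustration under assumptions, not the processor's actual implementation: the helper names extract_path, fetch_json and load_urls are hypothetical, and the JSON path is assumed to be a sequence of keys and list indices.

```python
import json
from urllib.request import urlopen


def extract_path(document, path):
    """Walk a parsed JSON document along a sequence of keys / list indices."""
    current = document
    for step in path:
        current = current[step]
    return current


def fetch_json(url):
    """Send a GET request and parse the response body as JSON."""
    with urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def load_urls(urls, path=(), fetch=fetch_json):
    """Return (results, failures), mirroring the two output nodes.

    `fetch` is injectable so the logic can be tried without network access.
    """
    results, failures = [], []
    for url in urls:
        try:
            results.append({"url": url, "result": extract_path(fetch(url), path)})
        except Exception as error:  # any failure is routed to the second output
            failures.append({"url": url, "error": str(error)})
    return results, failures
```

Calling load_urls(urls, path=("data", "items")) would return one list with the extracted JSON fragments and a second list pairing each failing URL with its error message.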


Custom Input Table


The Custom Input Table Processor allows the user to set up a new customizable data table with various options. 

The processor needs no input and has one output node that forwards the created table.


Data Table Load

The Data Table Load processor is used to fetch the selected Data Table into the Workflow. The processor needs no predecessor and has two output nodes:

  • Left Node: includes the successfully loaded entries from the Data Table
  • Right Node: contains the set of invalid rows
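
As a rough illustration of the two nodes, the sketch below separates rows that can be converted to an expected schema from rows that cannot. What ONE DATA actually treats as an invalid row is not specified here, so the validation rule, the function name split_rows and the sample schema are all assumptions.

```python
def split_rows(rows, schema):
    """Split rows into (valid, invalid) by trying to cast each column.

    `schema` maps column names to cast functions (an assumed validation rule).
    """
    valid, invalid = [], []
    for row in rows:
        try:
            valid.append({col: cast(row[col]) for col, cast in schema.items()})
        except (KeyError, TypeError, ValueError):
            # Missing columns or values that cannot be cast count as invalid.
            invalid.append(row)
    return valid, invalid


schema = {"id": int, "amount": float}
rows = [
    {"id": "1", "amount": "9.99"},
    {"id": "x", "amount": "3"},  # "x" is not an integer
    {"id": "3"},                 # the "amount" column is missing
]
valid, invalid = split_rows(rows, schema)
```

Here valid plays the role of the left node and invalid the role of the right node.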

Dataset Load

The Dataset Load Processor is a deprecated version of the Data Table Load Processor. 



Flexible ETL Load

The Flexible ETL Load Processor (DEPRECATED) operates on a special type of dataset, namely a pre-defined ETL source. 

The processor needs no predecessor and has as output the configured data table to be forwarded for further use.


Database Connection Load

The Database Connection Load processor operates on database connections. The processor needs no predecessor and provides the configured SELECT query result as output.
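
The idea of forwarding a SELECT query result as a table can be sketched with Python's built-in sqlite3 module. The in-memory SQLite database merely stands in for a configured database connection; the table name sensors, its columns and the helper load_query are hypothetical.

```python
import sqlite3


def load_query(connection, select_query):
    """Run a SELECT query and return the result as a list of row dicts."""
    cursor = connection.execute(select_query)
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# An in-memory database stands in for the configured connection.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE sensors (id INTEGER, reading REAL)")
connection.executemany(
    "INSERT INTO sensors VALUES (?, ?)", [(1, 20.5), (2, 21.0)]
)

rows = load_query(connection, "SELECT id, reading FROM sensors WHERE reading > 20.7")
connection.close()
```

Only the rows selected by the query are passed on, which is why no predecessor is required.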

 

Flexible REST API

The Flexible REST API processor uses REST connections to execute REST API GET, POST and DELETE requests.

The processor needs a table with additional URLs as input and has two output nodes:

  • The first output node (left node) returns the result JSON Object.
  • The second output node (right node) returns the failing URLs along with the corresponding error messages.


Important: This processor needs an existing ONE DATA REST connection.


Random Number Generator

The Random Number Generator Processor creates a dataset filled with random values in one or more columns.

This processor can be useful for generating a large dataset for testing in a small amount of time.

The processor needs no input. The output is a dataset with random values according to the configuration. The number of rows depends on the number of partitions and rows per partition chosen. The number of columns depends on the number of distributions added to the configuration.
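
The relationship between the configuration and the size of the output can be sketched as follows. This is a minimal illustration, not the processor itself: the function generate_random_table, the column names and the two sample distributions are assumptions.

```python
import random


def generate_random_table(partitions, rows_per_partition, distributions, seed=None):
    """Build a table with partitions * rows_per_partition rows.

    Each entry in `distributions` (column name -> sampling function)
    contributes exactly one column.
    """
    rng = random.Random(seed)
    total_rows = partitions * rows_per_partition
    return [
        {name: draw(rng) for name, draw in distributions.items()}
        for _ in range(total_rows)
    ]


distributions = {
    "uniform_col": lambda rng: rng.uniform(0.0, 1.0),
    "gauss_col": lambda rng: rng.gauss(0.0, 1.0),
}
table = generate_random_table(
    partitions=4, rows_per_partition=250, distributions=distributions, seed=42
)
print(len(table), len(table[0]))  # 4 * 250 = 1000 rows, 2 columns
```

With 4 partitions of 250 rows each and two distributions, the result has 1000 rows and 2 columns.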

Cassandra Load

The Cassandra Load processor loads a dataset from a Key-Space of an Apache Cassandra database. The loaded data can be further pre-processed using a custom SQL query.

After locating the dataset in the given Key-Space and applying the specified preprocessing, the processor outputs the resulting dataset for use by other processors.


Data Type Recommendation

The Data Type Recommendation Processor generates a JSON result containing type recommendations for each column of a selected dataset. The processor does not accept direct input, nor does it produce direct output.

Type recommendations are shown in a JSON format under "Result" and "JSON Result" in the processor.
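
To give an idea of what a per-column type recommendation might look like, the sketch below infers the narrowest type that fits every value of a column and prints the result as JSON. The inference rule, the type names (int, double, string) and the JSON layout are assumptions for illustration; the actual result format of the processor may differ.

```python
import json


def recommend_type(values):
    """Return the narrowest assumed type that parses every non-empty value."""
    present = [value for value in values if value not in ("", None)]
    if not present:
        return "string"  # nothing to infer from, fall back to the widest type
    for name, parser in (("int", int), ("double", float)):
        try:
            for value in present:
                parser(value)
            return name
        except (TypeError, ValueError):
            continue  # at least one value does not fit, try the next type
    return "string"


columns = {
    "id": ["1", "2", "3"],
    "price": ["1.5", "2", ""],
    "name": ["a", "b", "c"],
}
recommendations = {column: recommend_type(values) for column, values in columns.items()}
print(json.dumps(recommendations, indent=2))
```

For the sample columns, id is recommended as int, price as double and name as string.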


Microservice Input

The Microservice Input processor is used to import data from exposed Workflows that serve as Microservices.

It can also be used to "push" data into ONE DATA and trigger a Workflow execution.

This processor needs no predecessor. After receiving the data coming from the HTTP call of the Microservice, the processor will forward it to other processors for further use.