Overview

The Query processor executes a Spark SQL query on the input data set.

Spark SQL brings native support for SQL to Spark and streamlines the process of querying data stored both in RDDs (Spark’s distributed datasets) and in external sources. 

More information about Spark SQL can be found in the Spark SQL Programming Guide and Spark SQL Syntax Description.


Input

The Query processor operates on a single input data set containing any type of data.


Configuration

Note: the input data set must be referenced as inputTable in the SQL statement.


Supported SQL features can be found in the Spark SQL documentation.


Output

Once the query is executed, the result can be visualised in the output table.


Example

Workflow

Input data


Example Configuration

We use the following SQL statement to query our data set.

SELECT i.ContactName, i.Address
FROM inputTable i
WHERE i.Country = 'Mexico'
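To try the statement outside the processor, the following minimal sketch runs the same query against a table named inputTable. It uses Python's built-in sqlite3 module as a stand-in for Spark SQL, so it only illustrates how the inputTable name and the filter behave. It does not show how the processor itself is implemented. The sample rows and the run_query helper are hypothetical, and the column names are taken from the statement above.

```python
import sqlite3

# Hypothetical sample rows (ContactName, Address, Country);
# the article's actual input data set is not reproduced here.
rows = [
    ("Ana Trujillo", "Avda. de la Constitucion 2222", "Mexico"),
    ("Thomas Hardy", "120 Hanover Sq.", "UK"),
    ("Antonio Moreno", "Mataderos 2312", "Mexico"),
]

# The statement from the example configuration; the input is addressed as inputTable.
QUERY = """
SELECT i.ContactName, i.Address
FROM inputTable i
WHERE i.Country = 'Mexico'
"""


def run_query(data):
    """Load the rows into an in-memory table called inputTable and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE inputTable (ContactName TEXT, Address TEXT, Country TEXT)"
        )
        conn.executemany("INSERT INTO inputTable VALUES (?, ?, ?)", data)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = run_query(rows)
for contact_name, address in result:
    print(contact_name, "|", address)
```

Only the rows whose Country is 'Mexico' are returned, and only the two selected columns appear in the result.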

Result


Relevant articles

Double Input Query Processor

Query Helper Processor

Multi-Input Query Processor