# Overview

This processor calculates statistical measures (e.g. mean, median) and visualizes the Data in a __boxplot__.

# Motivation

To gain insights from Data and extract the useful information embedded in it, it is first necessary to understand the Data. This can be achieved by applying __Descriptive Analysis__ to the Data of interest.

The Heuristic Summaries Processor helps generate significant statistics from the input Data.

# Configuration

The processor requires a Dataset containing at least one numeric, ratio-scaled column (variable).

This processor is generally connected to a __Load Processor__ to interpret the input Data, or placed after some __transformations__ have been applied to that Data.

The configuration menu of the processor is the following:

__Compression Size__: the higher the value, the higher the precision, but execution time and memory consumption increase as well.

__Merge Interval__: defines the interval for merging tree centroids.

NOTE THAT:

- These two configuration fields are experimental.

- The processor does not infer column types: if a column is declared as type "String" but contains only numbers, the processor will NOT take it into account.
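The note on declared column types can be illustrated with a minimal Python sketch. It is not the processor's code: the schema dictionary, the helper name `columns_to_summarize`, and the type names in `NUMERIC_TYPES` are hypothetical and only show that selection depends on the declared type, not on the values stored in the column.

```python
# Hypothetical set of declared type names treated as numeric (assumption).
NUMERIC_TYPES = {"Int", "Double", "Numeric"}


def columns_to_summarize(schema):
    """Return the columns whose DECLARED type is numeric.

    `schema` maps column name -> declared type name. The cell values are
    never inspected, so a "String" column holding only digits is skipped.
    """
    return [name for name, declared in schema.items() if declared in NUMERIC_TYPES]


# "zip_code" contains only numbers but is declared as String, so it is ignored.
schema = {"age": "Int", "zip_code": "String", "income": "Double"}
selected = columns_to_summarize(schema)
print(selected)
```

In practice this means a column such as `zip_code` above has to be converted to a numeric type in an upstream step before the processor will include it.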

# Output

This processor provides two outputs:

- Within the processor: a boxplot accompanied by several statistical measures (min, max, median, etc.) for each numeric Column of the input Dataset.

- At the output node of the processor: a table with 12 columns: ColumnName, min, max, sum, __median__, __firstQuartile__, thirdQuartile, __arithmeticMean__, __geometricMean__, __lowerWhisker__, upperWhisker, numberOfRows.
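The 12 output columns can be illustrated with a minimal Python sketch that uses only the standard library. It is not the processor's implementation: the helper name `summarize_column`, the sample values, the inclusive quartile method, and the Tukey-style whisker rule (the most extreme values within 1.5 × IQR of the quartiles) are assumptions made for illustration, and the processor's own results may differ, for example if it approximates quantiles.

```python
import statistics


def summarize_column(name, values):
    """Build one summary row for a numeric column (hypothetical helper)."""
    # Quartiles via the "inclusive" method (assumption; other methods differ slightly).
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: Tukey-style whiskers, i.e. the most extreme observed values
    # that still lie within 1.5 * IQR of the first and third quartile.
    lower_whisker = min(v for v in values if v >= q1 - 1.5 * iqr)
    upper_whisker = max(v for v in values if v <= q3 + 1.5 * iqr)
    # The geometric mean is only defined for strictly positive values.
    geo_mean = statistics.geometric_mean(values) if all(v > 0 for v in values) else None
    return {
        "ColumnName": name,
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "median": median,
        "firstQuartile": q1,
        "thirdQuartile": q3,
        "arithmeticMean": statistics.fmean(values),
        "geometricMean": geo_mean,
        "lowerWhisker": lower_whisker,
        "upperWhisker": upper_whisker,
        "numberOfRows": len(values),
    }


# Sample column with one outlier (100).
row = summarize_column("price", [1, 2, 3, 4, 100])
for key, value in row.items():
    print(f"{key}: {value}")
```

With this sample, the value 100 lies beyond the upper fence, so `max` is 100 while `upperWhisker` is 4. This is why the table reports the whiskers separately from min and max.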

# Example

In this example, the Heuristic Summary Processor is applied to a simple Dataset to extract statistical measures from its Data:

# Related Articles

__Distinct Summary__

__Distinct Textual Summary__