This processor provides descriptive summaries (mean, variance, number of non-zero values, total number of values) for all numeric columns of the input data.
Workflows in ONE DATA let the user work with large datasets, but extracting value from such data requires a general overview of it first.
This processor saves time by presenting that overview as a compact set of descriptive summaries.
This processor does not need any configuration.
The only requirement is that the input data contains at least one numeric column.
This processor generates a short summary for each numeric column in the dataset, consisting of the column name, the mean, the variance, the number of non-zero values and the number of rows in the dataset.
The output contains the following information:
The names of all numeric columns in the input data
The mean value of each column
The variance of each column, i.e. how its values spread around the mean
The number of non-zero values in each column
The total number of values in each column
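To make the output fields above concrete, the following Python sketch computes the same five values for a small in-memory table. It is only an illustration, not the processor's actual implementation: the function name `column_summary`, the output keys and the row format are hypothetical, and the use of the sample variance (division by n - 1) is an assumption, since this page does not state which variance definition the processor applies.

```python
from statistics import mean, variance


def column_summary(rows):
    """Summarise every numeric column of a list of row dictionaries.

    Hypothetical helper for illustration only; it is not part of ONE DATA.
    """
    if not rows:
        return []
    summaries = []
    for name in rows[0]:
        values = [row[name] for row in rows]
        # Skip any column that contains a non-numeric value (bool is excluded too).
        if not all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        ):
            continue
        summaries.append({
            "column": name,
            "mean": mean(values),
            # Assumption: sample variance; it is undefined for a single value,
            # so this sketch falls back to 0.0 in that case.
            "variance": variance(values) if len(values) > 1 else 0.0,
            "non_zero": sum(1 for v in values if v != 0),
            "count": len(values),
        })
    return summaries


# Made-up toy rows: "name" is not numeric, so it is left out of the result.
toy_rows = [
    {"id": 1, "name": "a", "score": 0.0},
    {"id": 2, "name": "b", "score": 4.0},
    {"id": 3, "name": "c", "score": 2.0},
]
for summary in column_summary(toy_rows):
    print(summary)
```

In this sketch, non-numeric columns are simply dropped from the result, which mirrors the description above that only numeric columns are summarised.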
In this example, the Column Summary Processor is applied to a toy dataset generated by a Custom Input Table: