The Venn Diagram Processor extracts information from a dataset and creates a new dataset that can be used to draw a Venn diagram.
The processor requires a dataset suitable for a Venn visualization. Typically these are columns that share some items, so that there is an intersection to visualize.
See the example below to get an impression of possible inputs.
In the configuration, you can select the columns used to calculate the information for the Venn diagram. Between 2 and 5 columns may be selected. If no column is selected, all suitable (text) columns are used. In that case the dataset must also contain between 2 and 5 suitable columns, otherwise the processor does not work.
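The selection rule can be sketched in a few lines of Python. This is only an illustration of the rule as described, not the processor's implementation: the function name `resolve_columns`, its parameters, and the choice to raise a `ValueError` are assumptions made for this example.

```python
def resolve_columns(selected, text_columns):
    """Return the columns to use for the Venn calculation.

    selected:     columns chosen in the configuration (may be empty)
    text_columns: all suitable (text) columns of the input dataset
    Both names are hypothetical and only used for this sketch.
    """
    # Fall back to all suitable text columns if nothing was selected.
    columns = list(selected) if selected else list(text_columns)
    # Assumption: an invalid column count is reported as an error.
    if not 2 <= len(columns) <= 5:
        raise ValueError(
            f"expected between 2 and 5 columns, got {len(columns)}"
        )
    return columns


# No explicit selection: all three text columns are used.
print(resolve_columns([], ["col_a", "col_b", "col_c"]))
```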
The Processor has two output ports.
The left one returns a dataset that contains the columns of the original input dataset and one additional column:
- The values in the columns of the original dataset are replaced by 1 or 0, depending on whether the column contained the distinct value given in the additional column or not.
- The additional column has the name of the column specified in the configuration and contains all the distinct strings that could be extracted from the selected columns.
The number of rows of this dataset corresponds to the number of distinct strings found across all selected columns.
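The structure of the left output can be illustrated with a minimal Python sketch. It assumes the input is given as a dictionary that maps column names to lists of strings. The function name `left_output`, the default column name `value`, the sample data, and the row order (first appearance) are assumptions for this example and are not taken from the processor.

```python
def left_output(columns, value_column="value"):
    """Build one row per distinct string with a 1/0 flag per input column."""
    # Collect the distinct strings across all columns, in order of appearance.
    distinct = []
    for values in columns.values():
        for v in values:
            if v not in distinct:
                distinct.append(v)

    rows = []
    for v in distinct:
        # 1 if the column contained the string, 0 otherwise.
        row = {name: int(v in values) for name, values in columns.items()}
        row[value_column] = v
        rows.append(row)
    return rows


# Hypothetical sample data with two columns.
sample = {"col_a": ["x", "y", "y"], "col_b": ["y", "z"]}
for row in left_output(sample):
    print(row)
```

With this sample the result has three rows, one for each of the distinct strings x, y and z.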
The right port delivers a dataset that contains columns named after the columns of the original input dataset and two additional columns:
- The original columns contain the value 1 or 0, depending on whether the counts given in the two additional columns refer to that column or not.
- One additional column is called "aggregated". It counts the number of distinct strings that exist in every column that contains a 1 in the corresponding row.
- The other additional column is called "non-aggregated". It counts the number of distinct strings that exist in every column that contains a 1 in the corresponding row, but do not exist in any of the columns that contain a 0 in that row.
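The difference between the two counts can be made concrete with a short Python sketch based on set operations. It assumes that one row is produced for every non-empty combination of columns, which the description above does not state explicitly. The function name `right_output`, the sample data, and the row order are likewise assumptions for this example.

```python
from itertools import product


def right_output(columns):
    """Count shared strings for every non-empty combination of columns."""
    names = list(columns)
    sets = {name: set(values) for name, values in columns.items()}

    rows = []
    for flags in product((0, 1), repeat=len(names)):
        if not any(flags):
            continue  # skip the combination that refers to no column
        included = [sets[n] for n, f in zip(names, flags) if f]
        excluded = [sets[n] for n, f in zip(names, flags) if not f]

        # "aggregated": strings present in every column flagged with 1.
        common = set.intersection(*included)
        # "non-aggregated": of those, the strings absent from all 0 columns.
        exclusive = common.difference(*excluded)

        row = dict(zip(names, flags))
        row["aggregated"] = len(common)
        row["non-aggregated"] = len(exclusive)
        rows.append(row)
    return rows


# Hypothetical sample data with two columns.
sample = {"col_a": ["x", "y", "y"], "col_b": ["y", "z"]}
for row in right_output(sample):
    print(row)
```

In this sample, the row for col_a alone has an "aggregated" count of 2 (x and y) but a "non-aggregated" count of 1, because y also occurs in col_b.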
In this example, we will create a Venn diagram for the appearance of letters in words.
The input dataset contains a name in each column.
In the example workflow, we use a Custom Input Table to generate our test dataset by hand. The results of the Venn Diagram Processor are saved to a Result Table.
In the configuration, we select all columns and keep the default value for the report output name.
Left output node
This output can be used to generate a venn diagram.
The following video shows how the result can be used to create a report with a respective diagram.
Right output node
This is a snippet of the right output:
For example, the value 5 in the "aggregated" column of the third row refers to SANDR (A is contained two times in the string and is therefore counted only once).
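The count of 5 can be reproduced with a small Python check. It assumes that the column in question holds the individual letters of the name SANDRA, which is how the example reads. The variable names are chosen for illustration only.

```python
# Letters of the name as they would appear in the column (assumption).
letters = list("SANDRA")

# Duplicates are dropped, so the second A is not counted again.
distinct_letters = sorted(set(letters))

print(len(letters), len(distinct_letters))  # 6 5
```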