The Network Rule Generation Processor can be used to compute recommendations for items from a list of known user transactions. It generates association rules with a single item as precondition and a single item as conclusion by constructing a similarity network of the items using the edges as rules. In order to produce recommendations the generated rules can be applied to a list of transactions using the Association Rule Application Processor.
As input, a longlist containing single-item transactions (one item to one user in each row) is required. In this example, we use a Custom Input Table Processor to generate a small dataset by hand. The column "Customer" holds the IDs of the users and the column "Product" contains the IDs of the items of the transactions.
Value For Missing Transactions
Define the value with which missing transactions are weighted when computing similarity (existing transactions are weighted 1.0).
To generate rules that recommend users for items instead, swap the roles of the two columns: select the item column for the customers parameter of the processor and the user column for the item parameter.
While the processor logic allows any value for missing transactions, a value <= 0 is normally used. Setting the value to 0 means that the algorithm focuses only on positive matches. A negative value reduces the similarity if a user bought one item but not the other, and increases it if a user bought neither of the items. If the value is set to -1, a "no transaction" match has the same value as a "transaction" match. With values between -1 and 0, one can tune down the importance of "no transaction" matches compared to "transaction" matches.
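The effect of the value can be illustrated per user. The following is a minimal sketch, not the processor's actual code: the function name pair_contribution is hypothetical, and it only assumes what is described here, namely that an existing transaction counts as 1.0, a missing one as the configured value, and that the two entries are multiplied.

```python
def pair_contribution(bought_a: bool, bought_b: bool, missing_value: float) -> float:
    """Contribution of a single user to the similarity of items A and B.

    Hypothetical helper: existing transactions are weighted 1.0,
    missing transactions are weighted with missing_value.
    """
    a = 1.0 if bought_a else missing_value
    b = 1.0 if bought_b else missing_value
    return a * b


for v in (0.0, -0.5, -1.0):
    print(
        f"value={v}:",
        "both bought ->", pair_contribution(True, True, v),
        "| only one bought ->", pair_contribution(True, False, v),
        "| neither bought ->", pair_contribution(False, False, v),
    )
```

With -1, a user who bought neither item contributes as much as a user who bought both. With -0.5, the "no transaction" match contributes only 0.25.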
Sensible values for the minimal similarity and confidence vary widely depending on the properties of the input data and on the chosen value for missing transactions. For the minimal similarity, the allowed range also depends on that choice: valid values lie between min(1, Value for missing transactions, Value for missing transactions^2) and max(1, Value for missing transactions^2). There is a check in place to catch invalid configurations.
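The allowed range for the minimal similarity can be computed directly from the formula above. This is a small sketch for checking a configuration by hand. The function name similarity_bounds is hypothetical and not part of the processor.

```python
def similarity_bounds(missing_value: float) -> tuple[float, float]:
    """Return (lower, upper) bound for the minimal similarity.

    Implements min(1, v, v^2) and max(1, v^2), where v is the
    value for missing transactions.
    """
    squared = missing_value * missing_value
    lower = min(1.0, missing_value, squared)
    upper = max(1.0, squared)
    return lower, upper


for v in (0.0, -0.5, -1.0):
    print(v, similarity_bounds(v))
```

For the commonly used values between -1 and 0, the lower bound is the value itself and the upper bound is 1.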
If there are (almost) no edges in the output, lower the minimal similarity. If nearly all edges are present, raise it. Once there is a sensible number of edges, tune the minimal confidence for the rules in the same way. As the rules are derived from the edges, there cannot be rules as long as there are no edges.
The main output of the processor is the list of association rules which indicate an item to be recommended on their right hand side (RHS) if the item on the left hand side (LHS) is present. Both sides only hold a single item per rule. Additionally, the confidence for each rule (regarding the input) is given.
As an additional output the processor also provides the edges of the similarity network, which lead to the association rules, with their similarity.
Each edge is listed twice - once with one item as Item1 and once with the other as Item1. This symmetry is meant to support potential postprocessing steps like the search for all edges of a given item.
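Because every edge appears in both directions, finding all neighbours of an item only requires a filter on Item1. The sketch below uses made-up rows for illustration. The column names Item2 and Similarity and the helper neighbours are assumptions, since only Item1 is named here.

```python
# Illustrative edge rows, each edge listed once per direction.
edges = [
    {"Item1": "A", "Item2": "B", "Similarity": 0.5},
    {"Item1": "B", "Item2": "A", "Similarity": 0.5},
    {"Item1": "B", "Item2": "C", "Similarity": 0.4},
    {"Item1": "C", "Item2": "B", "Similarity": 0.4},
]


def neighbours(edge_rows, item):
    """Map each neighbour of `item` to the similarity of the connecting edge."""
    return {
        row["Item2"]: row["Similarity"]
        for row in edge_rows
        if row["Item1"] == item  # one filter suffices thanks to the symmetry
    }


print(neighbours(edges, "B"))
```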
To view the results, add a Result Table Processor.
The processor also outputs rules with confidence 1, that is, rules which always hold in the training data and thus do not lead to recommendations of new items there. They are kept because the rules can also be applied to new data and to users who are not in the training set, for whom they might produce valid recommendations.
Mathematically, the algorithm builds a user vector for each item, with an entry of 1 if the user bought the item and the configured value for missing transactions otherwise. Similarity between two items is then determined by multiplying their vectors. All pairs with enough similarity form edges.
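The computation can be sketched as follows. This is an illustration under stated assumptions, not the processor's implementation: the vector product is divided by the number of users (an assumption, chosen because it keeps the result within the documented range for the minimal similarity), "enough similarity" is read as greater than or equal to the minimal similarity, and all function names are hypothetical.

```python
from itertools import combinations


def item_vectors(transactions, missing_value):
    """Build one user vector per item: 1.0 if bought, missing_value otherwise."""
    users = sorted({user for user, _ in transactions})
    items = sorted({item for _, item in transactions})
    bought = set(transactions)  # duplicates collapse here
    return {
        item: [1.0 if (user, item) in bought else missing_value for user in users]
        for item in items
    }


def similarity(vec_a, vec_b):
    """Product of two user vectors, averaged over users (assumed normalization)."""
    return sum(a * b for a, b in zip(vec_a, vec_b)) / len(vec_a)


def similarity_edges(transactions, missing_value, min_similarity):
    """Return {(item1, item2): similarity} for all pairs reaching the threshold."""
    vectors = item_vectors(transactions, missing_value)
    result = {}
    for a, b in combinations(sorted(vectors), 2):
        s = similarity(vectors[a], vectors[b])
        if s >= min_similarity:  # assumed reading of "enough similarity"
            result[(a, b)] = s
    return result


# (Customer, Product) rows of a small hand-made dataset.
transactions = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"),
    ("u3", "A"), ("u3", "C"),
]
print(similarity_edges(transactions, missing_value=-0.5, min_similarity=0.25))
```

In this toy dataset only the pair A and B reaches the threshold. The sketch lists each edge once, whereas the processor output lists it in both directions.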
From each edge the algorithm deduces two rules - one in each direction. It then computes confidence by comparing the cases where the rule holds with all cases where the left hand side is present in a user's item set.
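The confidence described above can be sketched in a few lines. This is an illustrative reading of the description, with the hypothetical function name rule_confidence: the number of users whose item set contains both sides of the rule is divided by the number of users whose item set contains the left hand side.

```python
def rule_confidence(transactions, lhs, rhs):
    """Confidence of the rule lhs -> rhs over (user, item) transactions.

    Returns None if no user has the left hand side item.
    """
    # Collect each user's item set; using sets ignores repeated purchases.
    baskets = {}
    for user, item in transactions:
        baskets.setdefault(user, set()).add(item)

    with_lhs = [items for items in baskets.values() if lhs in items]
    if not with_lhs:
        return None
    holds = sum(1 for items in with_lhs if rhs in items)
    return holds / len(with_lhs)


# (Customer, Product) rows of a small hand-made dataset.
transactions = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"),
    ("u3", "A"), ("u3", "C"),
]
print(rule_confidence(transactions, "A", "B"))
print(rule_confidence(transactions, "B", "A"))
```

The two rules derived from one edge can have different confidence: here every buyer of B also bought A, but only two of the three buyers of A bought B.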
The processor handles multiple transactions of the same item by the same user by counting them only once.
Example Usage in Workflow
The following example shows how to use the Network Rule Generation Processor in combination with other processors to generate recommendations for customers depending on the items they already bought.
As a starting point, we use a Custom Input Table Processor to generate a small dataset by hand. The dataset is used by the Network Rule Generation Processor to generate rules and edges. We want to look at both outputs to check whether we selected a good configuration for the processor.
In order to apply the rules, we need an Association Rule Application Processor. However, this processor requires each rule to have an ID that it can refer to in the output. Thus, we use an Indexing Processor to give each rule a unique ID. Now we can apply the rules to the original dataset via the Association Rule Application Processor.
As a final result, we receive the recommendations together with the ID of the applied rule and its confidence in the original dataset (which generated the rule).