The ALS Recommender (Alternating Least Squares Recommender) computes item recommendations from a list of item ratings given by users. The recommendations are generated with an alternating least squares matrix factorization algorithm.
Internally, the processor uses the ALS implementation from Apache Spark's spark.mllib recommendation package (org.apache.spark.mllib.recommendation). All tuning parameters are exposed to ONE DATA users as configuration options.
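To illustrate what alternating least squares does, the following is a minimal pure-Python sketch, not the Spark implementation the processor actually uses. It alternates between fixing the item vectors and solving a regularized least-squares problem for every user, then fixing the user vectors and doing the same for every item. The names `als`, `update_factors`, `solve`, and the parameters `rank`, `iterations`, `reg` and `seed` are chosen for this illustration only, and the plain regularization term is an assumption that may differ from Spark's scheme.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left unchanged.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def update_factors(fixed, ratings_by_row, rank, reg):
    """Recompute one side's vectors while the other side (`fixed`) stays constant."""
    updated = {}
    for row_id, rated in ratings_by_row.items():
        # Normal equations: (sum f f^T + reg * I) x = sum r * f
        a = [[reg if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for other_id, r in rated:
            f = fixed[other_id]
            for i in range(rank):
                b[i] += r * f[i]
                for j in range(rank):
                    a[i][j] += f[i] * f[j]
        updated[row_id] = solve(a, b)
    return updated

def als(ratings, rank=2, iterations=10, reg=0.1, seed=0):
    """Factorize (user, item, rating) triples into user and item feature vectors."""
    rng = random.Random(seed)
    by_user, by_item = {}, {}
    for user, item, r in ratings:
        by_user.setdefault(user, []).append((item, r))
        by_item.setdefault(item, []).append((user, r))
    # Item vectors start random; user vectors are computed in the first half-step.
    item_f = {i: [rng.random() for _ in range(rank)] for i in by_item}
    user_f = {}
    for _ in range(iterations):
        user_f = update_factors(item_f, by_user, rank, reg)
        item_f = update_factors(user_f, by_item, rank, reg)
    return user_f, item_f

# Hypothetical ratings: (user ID, item ID, rating)
ratings = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0),
           (1, 2, 2.0), (2, 1, 5.0), (2, 2, 4.0)]
user_f, item_f = als(ratings)
```

Each half-step solves its least-squares problems exactly, so the regularized squared error cannot increase from one iteration to the next. The cost of each solve grows with the rank, which is one reason large rank values are expensive.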
As input, a table of ratings with at least three columns (user ID, item ID and rating) is expected. Each row should describe a distinct combination of customer and rated item, so that a given customer rates a given item at most once. The values must not be of type string.
If the user IDs and/or item IDs are not numeric, the Alphanumeric to Numeric ID Processor can be used to generate numeric substitute IDs.
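The idea behind numeric substitution IDs can be sketched in a few lines of Python. This is only an illustration of the concept, not the behavior of the Alphanumeric to Numeric ID Processor itself; the helper `build_id_map` and the sample rows are hypothetical.

```python
def build_id_map(values):
    """Assign consecutive integer IDs to distinct values, in order of first appearance."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
    return mapping

# Hypothetical input rows: (user ID, item ID, rating) with alphanumeric IDs
rows = [("alice", "sku-9", 4.0), ("bob", "sku-9", 2.0), ("alice", "sku-3", 5.0)]

user_ids = build_id_map(user for user, _, _ in rows)
item_ids = build_id_map(item for _, item, _ in rows)

# Replace the alphanumeric IDs with their numeric substitutes.
numeric_rows = [(user_ids[u], item_ids[i], r) for u, i, r in rows]
# numeric_rows == [(0, 0, 4.0), (1, 0, 2.0), (0, 1, 5.0)]
```

Keeping the two mappings makes it possible to translate the numeric IDs in the recommendations back to the original identifiers afterwards.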
Warning: High values for the rank and/or the number of iterations can lead to poor performance and even out-of-memory errors.
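As a rough illustration of why the rank matters for memory, the feature vectors alone hold one value per user and per item for every rank dimension. The sketch below computes that lower bound. The function name `factor_memory_bytes`, the 8 bytes per value, and the user and item counts are assumptions made for this example; the real memory footprint of a Spark job is considerably larger.

```python
def factor_memory_bytes(num_users, num_items, rank, bytes_per_value=8):
    """Lower bound on the memory needed to hold all user and item feature vectors."""
    return (num_users + num_items) * rank * bytes_per_value

# Hypothetical data set: one million users, one hundred thousand items
small_rank = factor_memory_bytes(1_000_000, 100_000, rank=10)
large_rank = factor_memory_bytes(1_000_000, 100_000, rank=200)
# small_rank == 88_000_000 bytes, large_rank == 1_760_000_000 bytes
```

The number of iterations does not change this bound, but every additional iteration repeats the full computation over all users and items.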
The processor provides three output tables. It recommends items for users based on their previous item ratings, using the alternating least squares approach. It requires numeric user and item IDs; to convert non-numeric IDs, use the Alphanumeric to Numeric ID Processor.
In this example, the ALS Recommender is used to provide recommendations from customers, items and ratings.
The column customerID holds the IDs of the users, the column productID contains the IDs of the items, and the column Rating holds the weight of the respective product.
Tip: To apply the processor to transactions without ratings, add a column with a constant value not equal to 0, so that every transaction receives the same rating. This can be done with an Extended Mathematical Operation Processor.
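The tip above amounts to the following transformation, shown here as a plain Python sketch rather than as a processor configuration. The sample transactions and the constant `CONSTANT_RATING` are hypothetical. Dropping repeated customer and product pairs is an assumption based on the requirement that each rating refers to a distinct combination of customer and item.

```python
# Hypothetical transactions: (customerID, productID), without any rating
transactions = [(1, 10), (1, 11), (2, 10), (1, 10)]

# Any constant value other than 0 works; 1.0 is an arbitrary choice.
CONSTANT_RATING = 1.0

# Keep each (customer, product) pair once, preserving the original order.
unique_pairs = list(dict.fromkeys(transactions))

# Append the constant rating as a third column.
ratings = [(customer, product, CONSTANT_RATING) for customer, product in unique_pairs]
# ratings == [(1, 10, 1.0), (1, 11, 1.0), (2, 10, 1.0)]
```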
Configuration of the ALS Recommender processor using the data from the input table:
The first of the three processor outputs provides the list of all recommendations together with a predicted rating. For each user, the items with the highest predicted ratings are listed, one item per row. The number of items per user is set by the Number of Recommendations option.
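The shape of this output can be sketched as a top-N selection per user. The function `top_n_per_user` and the sample predictions are hypothetical and only illustrate the one-item-per-row layout; the processor's actual column names and ordering are not specified here.

```python
import heapq
from collections import defaultdict

def top_n_per_user(predictions, n):
    """Return the n highest-scored items per user as (user, item, score) rows."""
    by_user = defaultdict(list)
    for user, item, score in predictions:
        by_user[user].append((item, score))
    rows = []
    for user in sorted(by_user):
        # nlargest returns the best entries in descending order of score.
        best = heapq.nlargest(n, by_user[user], key=lambda pair: pair[1])
        rows.extend((user, item, score) for item, score in best)
    return rows

# Hypothetical predicted ratings: (user ID, item ID, predicted rating)
predictions = [(1, 10, 3.5), (1, 11, 4.8), (1, 12, 1.0),
               (2, 10, 2.2), (2, 12, 4.1)]

recommendations = top_n_per_user(predictions, n=2)
# recommendations == [(1, 11, 4.8), (1, 10, 3.5), (2, 12, 4.1), (2, 10, 2.2)]
```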
As additional information, the processor provides the feature vectors describing the users and the items in two separate outputs:
- Output Customer Features
- Output Item Features
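In a matrix factorization model, the predicted rating for a user and an item is the dot product of their feature vectors, so the two feature outputs can in principle be combined to score any customer and item pair. The sketch below assumes the vectors have already been read into dictionaries keyed by ID; the dictionary layout, the sample values and the helper `predicted_rating` are hypothetical.

```python
# Hypothetical feature vectors of rank 2, keyed by customer ID and item ID
customer_features = {1: [0.5, 1.0], 2: [1.5, 0.0]}
item_features = {10: [2.0, 1.0], 11: [0.0, 3.0]}

def predicted_rating(customer_id, item_id):
    """Dot product of the customer's and the item's feature vectors."""
    customer_vec = customer_features[customer_id]
    item_vec = item_features[item_id]
    return sum(c * i for c, i in zip(customer_vec, item_vec))

score = predicted_rating(1, 10)
# score == 2.0
```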