[Research result] Timeliness in recommender systems

Published: 2017-10-27

Publication: Fuguo Zhang, Qihua Liu, An Zeng. Expert Systems With Applications 85 (2017) 270-278. doi.org/10.1016/j.eswa.2017.05.038

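Summary: Recommender systems can efficiently pick out, from massive volumes of online data, the products most relevant to a user's behavior, which is why they are widely deployed on major commercial websites. Recommendation algorithms have accordingly continued to evolve, and the accuracy and diversity of their results have kept improving. However, the timeliness of the recommended results has not been well addressed by existing algorithms. To investigate this problem, we adopt a temporal data division, splitting the links into a probe set and a training set according to their time information. We find that many recommendation algorithms are less accurate under the temporal data division than under the random data division. By analyzing a timeliness metric, we find that the accuracy drops because these algorithms do not use the time information of products and are therefore more prone to recommending out-of-date products. To solve this problem, we improve existing recommendation algorithms by taking time information into account. The results show that our algorithms greatly reduce the likelihood of recommending out-of-date products while also improving recommendation accuracy.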

Abstract

Due to their high efficiency in finding the most relevant online products for users from the information ocean, recommender systems have now been applied to many commercial web sites. Meanwhile, many recommendation algorithms have been developed to improve recommendation accuracy and diversity. However, whether the items recommended by these algorithms are timely has not yet been well understood. To investigate this problem, we consider a temporal data division, which divides the links into a probe set and a training set strictly according to the time stamps on the links. We find that the recommendation accuracy of many algorithms is much lower under the temporal data division than under the random data division. With a timeliness metric, we find that the low accuracy is caused by the tendency of these algorithms to recommend out-of-date items, which cannot be detected with the random data division. To solve this problem, we improve the considered recommendation algorithms with a timeliness factor. The resulting algorithms strongly suppress the probability of recommending obsolete items. Meanwhile, the recommendation accuracy is substantially enhanced.
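
The difference between the two data divisions named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the link layout (user, item, timestamp), the function names, and the 10% probe ratio used as a default are assumptions made only for illustration.

```python
import random
from typing import List, Tuple

# Hypothetical link layout: (user, item, timestamp).
Link = Tuple[str, str, int]


def random_division(links: List[Link], probe_ratio: float = 0.1,
                    seed: int = 0) -> Tuple[List[Link], List[Link]]:
    """Split links into (training, probe) sets, ignoring time stamps."""
    rng = random.Random(seed)
    shuffled = list(links)
    rng.shuffle(shuffled)
    n_probe = int(len(shuffled) * probe_ratio)
    return shuffled[n_probe:], shuffled[:n_probe]


def temporal_division(links: List[Link],
                      probe_ratio: float = 0.1) -> Tuple[List[Link], List[Link]]:
    """Split links into (training, probe) sets by time stamp.

    The most recent links form the probe set, so no training link is
    newer than any probe link. Links sharing a time stamp at the cut
    may fall on either side; this sketch does not treat ties specially.
    """
    ordered = sorted(links, key=lambda link: link[2])
    n_probe = int(len(ordered) * probe_ratio)
    cut = len(ordered) - n_probe
    return ordered[:cut], ordered[cut:]
```

Under the random division, the training set can contain links that are newer than links in the probe set, so a tendency to recommend out-of-date items goes unnoticed; under the temporal division, the algorithm has to predict links that are strictly in the future.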

Original article: http://www.sciencedirect.com/science/article/pii/S0957417417303603#abs0001

Fig. 1. Illustration of why the timeliness metric is used in recommendation.

Fig. 2. (Color online) The recommendation performance ((a) ranking score; (b) precision; (c) personalization; (d) surprisal) of the mass diffusion and item-based collaborative filtering algorithms under the random data division and temporal data division. The network used in this figure is Movielens.

Fig. 3. (Color online) The recommendation performance ((a) ranking score; (b) precision; (c) personalization; (d) surprisal) of the mass diffusion and item-based collaborative filtering algorithms under the random data division and temporal data division. The network used in this figure is Netflix.

Fig. 4. (Color online) The distribution of Tα in users’ recommendation lists when different recommendation algorithms are applied. The method marked with “T” is the timeliness-based version of the method. The network data used in this figure is Movielens.

Fig. 5. (Color online) The distribution of Tα in users’ recommendation lists when different recommendation algorithms are applied. The method marked with “T” is the timeliness-based version of the method. The network data used in this figure is Netflix.

Fig. 6. (Color online) The dependence of ranking score and precision on the training set size when the traditional and timeliness-based methods are applied. The size of the training set is measured by the number of days’ data in the training set. The probe set consists of 10% (this ratio is with respect to the total number of links in the data set) future links after the testing time. (a)(b) are the results of the Movielens data set. (c)(d) are the results of the Netflix data set.

Fig. 7. (Color online) (a)(b) show the dependence of ranking score on the training set size when the traditional Hybrid method and the timeliness-based Hybrid method are used in Movielens and Netflix, respectively. In these two figures, the parameters θ and λ are set to 1 and to the optimal value for ranking score, respectively. (c)(d) show the effect of θ on the ranking score in Movielens and Netflix. In these two figures, the parameter λ is set to the optimal value for ranking score. (e)(f) show the dependence of the ranking score on parameter λ in Movielens and Netflix. In these two figures, the parameter θ is set to 1.

Table 1. Comparison of the traditional and timeliness-based recommendation algorithms on the Movielens and Netflix data sets. For the rank-based metrics, smaller is better, whereas for the timeliness metric, higher is better. The algorithm with better performance is highlighted in bold font.