On Evaluation Metrics for Diversity-enhanced Recommendations
多样性(政治)
计算机科学
社会学
人类学
作者
Xueqi Li,Gao Cong,Guoqing Xiao,Yang Xu,Wenjun Jiang,Kenli Li
标识
DOI:10.1145/3627673.3679629
摘要
Diversity is increasingly recognized as a crucial factor in recommendation systems for enhancing user satisfaction. However, existing studies on diversity-enhanced recommendation systems primarily focus on designing recommendation strategies, often overlooking the development of evaluation metrics. Widely used diversity metrics such as CC, ILAD, and ILMD are typically assessed independently of accuracy. This separation leads to a critical limitation: existing diversity measures are unable to distinguish between diversity improvements from effective recommendations and those from in effective recommendations. Our evaluations reveal that the diversity improvements are primarily contributed by ineffective recommendations, which often do not positively contribute to user satisfaction. Furthermore, existing diversity metrics disregard the feature distribution of ground-truth items, potentially skewing the assessment of diversity performance. To address these limitations, we design three new accuracy-aware metrics: DCC, FDCC, and DILAD, and conduct a re-evaluation using these metrics. Surprisingly, our results illustrate that the diversity improvements of existing diversity-enhanced approaches are limited and even negative compared to those of accurate recommendations. This finding underscores the need to explore more sophisticated diversity-enhanced techniques for improving the diversity within effective recommendations.