Live Session
Session 6: Graphs
Reproducibility
Challenging the Myth of Graph Collaborative Filtering: a Reasoned and Reproducibility-driven Analysis
Vito Walter Anelli (Politecnico di Bari), Daniele Malitesta (Politecnico di Bari), Claudio Pomo (Politecnico di Bari), Alejandro Bellogín (Universidad Autónoma de Madrid), Eugenio Di Sciascio (Politecnico di Bari) and Tommaso Di Noia (Politecnico di Bari)
Abstract
Graph neural network-based models (GNNs) are undoubtedly among the most successful research directions in recommender systems. By naturally modeling users and items as a bipartite, undirected graph, GNNs have raised the performance bar for modern recommenders.
Unfortunately, most of the original graph-based works cherry-pick results from previous baseline papers without checking whether those results are valid for the configuration under analysis. Our work therefore stands first and foremost as a study of the replicability of results. We provide code that successfully replicates the results reported in the articles introducing six of the most popular and recent graph recommendation models (i.e., NGCF, DGCF, LightGCN, SGL, UltraGCN, and GFCF). In our experimental setup, we test these six models on three common benchmarking datasets (i.e., Gowalla, Yelp 2018, and Amazon Book). In addition, to understand how these models perform with respect to traditional collaborative filtering models, we compare the graph models under analysis with some models that have historically emerged as the best performers in offline evaluation. We then extend the study to two new datasets (i.e., Allrecipes and BookCrossing) for which no established setup exists in the literature. Since the performance on these datasets is not entirely aligned with that on the previous benchmarks, we further analyze the possible impact of specific dataset characteristics on recommendation accuracy. By investigating the information flow to users from their neighborhoods, the analysis aims to identify the models for which these intrinsic features of the dataset structure affect accuracy. The code to reproduce the experiments is available at: https://split.to/Graph-Reproducibility.
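As an illustration of the bipartite, undirected user-item graph mentioned in the abstract, the following is a minimal sketch and is not taken from the authors' repository. The function names (build_bipartite_graph, two_hop_users, density) and the toy interactions are hypothetical, chosen only to show how a user's neighborhood and a simple dataset characteristic such as density could be derived from interaction pairs.

```python
from collections import defaultdict


def build_bipartite_graph(interactions):
    """Build an undirected user-item graph as two adjacency maps.

    `interactions` is an iterable of (user, item) pairs (assumed format).
    """
    user_to_items = defaultdict(set)
    item_to_users = defaultdict(set)
    for user, item in interactions:
        # Each interaction is one undirected edge between a user and an item.
        user_to_items[user].add(item)
        item_to_users[item].add(user)
    return dict(user_to_items), dict(item_to_users)


def two_hop_users(user, user_to_items, item_to_users):
    """Users reachable via user -> item -> user, excluding the user itself."""
    neighbors = set()
    for item in user_to_items.get(user, ()):
        neighbors |= item_to_users[item]
    neighbors.discard(user)
    return neighbors


def density(user_to_items, item_to_users):
    """Fraction of possible user-item edges that are actually observed."""
    n_edges = sum(len(items) for items in user_to_items.values())
    return n_edges / (len(user_to_items) * len(item_to_users))


# Toy data for illustration only; not drawn from any of the datasets above.
toy_interactions = [("u1", "i1"), ("u1", "i2"), ("u2", "i2"), ("u3", "i3")]
user_to_items, item_to_users = build_bipartite_graph(toy_interactions)
```

In this toy graph, u1 and u2 are two-hop neighbors because they share item i2, while u3 has no user neighbors at all. Graph-based recommenders propagate information along such paths, which is one reason structural properties of a dataset could plausibly influence their accuracy.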