Data and algorithm sharing is an essential part of data and AI-driven economies. The efficient sharing of data and algorithms relies on the active interplay between users, data providers, and algorithm providers. Although recommender systems are known to effectively interconnect users and items in e-commerce settings, there is a lack of research on the applicability of recommender systems for data and algorithm sharing.
To fill this gap, we identify six recommendation scenarios for supporting data and algorithm sharing, where four of these scenarios substantially differ from the traditional recommendation scenarios in e-commerce applications. Specifically, we identify two traditional scenarios of recommending datasets to users (SC1) and algorithms to users (SC2), and four novel scenarios that are unique to data and algorithm sharing platforms, i.e., the recommendation of datasets to algorithms (SC3), algorithms to datasets (SC4), datasets to datasets (SC5), and algorithms to algorithms (SC6).
We evaluate these recommendation scenarios using a novel dataset based on interaction data of the OpenML data and algorithm sharing platform, which we also provide for the scientific community to foster future research on this important topic. Specifically, we investigate three types of recommendation approaches, namely popularity-, collaboration-, and content-based recommendations.
We find that collaboration-based recommendations provide the most accurate recommendations in all scenarios. Moreover, the recommendation accuracy strongly depends on the specific scenario, e.g., algorithm recommendations for users are a more difficult problem than algorithm recommendations for datasets. This indicates that each recommendation scenario that can occur in a data and algorithm sharing platform can be viewed as a separate recommendation problem. Finally, the content-based approach generates the least popularity-biased recommendations that cover the most datasets and algorithms. Overall, our work showcases how recommender systems can be helpful tools in data and algorithm sharing platforms.
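To make the difference between popularity-based and collaboration-based recommendations concrete, the following minimal sketch recommends algorithms to a dataset (scenario SC4). It is an illustration under stated assumptions, not the method evaluated in the paper: the toy interaction data, the function names, and the choice of cosine similarity over sets of algorithms are all hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy interactions: dataset -> set of algorithms that were run on it.
# These identifiers are made up and do not come from the OpenML-based dataset.
INTERACTIONS = {
    "d1": {"a1", "a2"},
    "d2": {"a1", "a2", "a3"},
    "d3": {"a1", "a4"},
    "d4": {"a2", "a3"},
}


def recommend_popular(target, interactions, k=2):
    """Rank unseen algorithms by how many datasets they were used on."""
    counts = Counter(a for algos in interactions.values() for a in algos)
    seen = interactions.get(target, set())
    # Sort by descending count; break ties alphabetically for determinism.
    ranked = sorted((a for a in counts if a not in seen),
                    key=lambda a: (-counts[a], a))
    return ranked[:k]


def cosine(s, t):
    """Cosine similarity between two sets treated as binary vectors."""
    if not s or not t:
        return 0.0
    return len(s & t) / sqrt(len(s) * len(t))


def recommend_collaborative(target, interactions, k=2):
    """Score unseen algorithms by the similarity of the datasets that used them."""
    seen = interactions.get(target, set())
    scores = Counter()
    for other, algos in interactions.items():
        if other == target:
            continue
        sim = cosine(seen, algos)
        for a in algos - seen:
            scores[a] += sim
    # Keep only algorithms supported by at least one similar dataset.
    ranked = sorted((a for a in scores if scores[a] > 0),
                    key=lambda a: (-scores[a], a))
    return ranked[:k]


if __name__ == "__main__":
    print(recommend_popular("d1", INTERACTIONS))
    print(recommend_collaborative("d1", INTERACTIONS))
```

The popularity baseline ignores the target dataset apart from filtering out algorithms it already uses, whereas the collaborative variant weights each candidate by how similar its source datasets are to the target. Swapping the roles of datasets and algorithms in the interaction table would give an analogous sketch for the other scenarios.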
About the Data Economy Workshop
The Data Economy Workshop will be co-located with the 2022 International Conference on emerging Networking EXperiments and Technologies (CoNEXT 2022). The workshop and the conference will be held in the splendid setting of Rome, at the Sapienza University of Rome. Nine papers were accepted for the workshop.