COSMO: A Large-Scale E-commerce Common Sense Knowledge Generation and Serving System at Amazon
Amazon rainforest
Scale (ratio)
Computer science
E-commerce
Data science
World Wide Web
Geography
Cartography
Ecology
Biology
Authors
Changlong Yu, Xin Liu, Jefferson de Carvalho Maia, Yang Li, Tianyu Cao, Yifan Gao, Yangqiu Song, Rahul Goutam, Haiyang Zhang, Bing Yin, Zheng Li
Identifier
DOI:10.1145/3626246.3653398
Abstract
Applications of large-scale knowledge graphs in e-commerce platforms can improve the shopping experience for their customers. While existing e-commerce knowledge graphs (KGs) integrate a large volume of concepts or product attributes, they fail to discover user intentions, leaving a gap with how people think, behave, and interact with the surrounding world. In this work, we present COSMO, a scalable system to mine user-centric commonsense knowledge from massive user behaviors and construct industry-scale knowledge graphs to empower diverse online services. In particular, we describe a pipeline for collecting high-quality seed knowledge assertions that are distilled from large language models (LLMs) and further refined by critic classifiers trained over human-in-the-loop annotated data. Since those generations may not always align with human preferences and may contain noise, we then describe how we adopt instruction tuning to finetune an efficient language model (COSMO-LM) for faithful e-commerce commonsense knowledge generation at scale. COSMO-LM effectively expands our knowledge graph to 18 major categories at Amazon, producing millions of high-quality knowledge assertions with only 30k annotated instructions. Finally, COSMO has been deployed in Amazon search applications such as search navigation. Both offline and online A/B experiments demonstrate that our proposed system achieves significant improvement. Furthermore, these experiments highlight the immense potential of commonsense knowledge extracted from instruction-finetuned large language models.