Computer science
Transfer of learning
Transfer (computing)
Artificial intelligence
Machine learning
Parallel computing
Authors
Xiangli Yang,Qing Liu,Rong Su,Ruiming Tang,Zhirong Liu,Xiuqiang He,Jianxi Yang
Identifiers
DOI:10.1016/j.ins.2022.08.009
Abstract
The figure presents, from left to right, the pre-trained CTR model, the fine-tuned CTR model, and our proposed AutoFT framework. The pre-trained CTR model starts from randomly initialized parameters and is optimized on data from the source domains, or on data from all domains. The fine-tuned CTR model is initialized with the parameters trained in the source scenarios and optimized with data from the new target domain. AutoFT involves three sets of parameters: the light-blue boxes contain parameters from the pre-trained CTR model; the dark-blue boxes contain parameters initialized by the traditional fine-tuning strategy, which are re-optimized during training; and the red boxes contain the parameters of the policy networks.

• The proposed AutoFT automatically finds a route between the pre-trained network and the siamese fine-tuned network for each instance in the target domain, and it is compatible with any deep CTR model.
• The transfer policies for the embedding layer and the feature interaction layers can be trained synchronously with the deep CTR model.
• The results show that lower layers tend to represent more general features, while higher layers need more fine-tuning to fit a specific target domain.

In real business platforms, recommender systems usually need to predict the CTR for multiple businesses. Since different scenarios may share common feature interactions, knowledge-transfer methods are often used, re-optimizing a CTR model pre-trained on source scenarios for a new target domain. In addition to transferring knowledge, it is also important that the model generalizes accurately to target-domain data unseen by the CTR model when all of the fine-tuned parameters are re-trained. Generally, a model pre-trained on large source domains can represent the characteristics of different instances and capture typical feature interactions, so it would be useful to directly reuse fine-tuned parameters from the source domains to serve the target domain. However, different instances in the target domain may need different amounts of source information to fine-tune the model parameters, and the decisions to freeze or re-optimize model parameters, which depend heavily on the fine-tuned model and the target instances, may require considerable manual effort. In this paper, we propose an end-to-end transfer learning framework with fine-tuned parameters for CTR prediction, called Automatic Fine-Tuning (AutoFT). The principal component of AutoFT is a set of learnable transfer policies that independently determine how the instance-specific fine-tuning should be carried out, deciding the routing through the embedding representations and the high-order feature representations, layer by layer, in a deep CTR model. Extensive experiments on two benchmarks and one real commercial recommender system deployed in Huawei's App Store show that AutoFT can substantially improve CTR prediction performance compared with existing transfer methodologies.
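To make the layer-by-layer routing concrete, below is a minimal PyTorch sketch of one plausible reading of the framework: each layer keeps a frozen pre-trained copy and a trainable siamese copy, and a small per-layer policy network chooses between the two paths per instance. The class name `RoutedLayer`, the linear policy network, and the Gumbel-softmax gate are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of AutoFT-style per-instance layer routing (PyTorch).
# The abstract only states that a learnable policy picks, per instance and
# per layer, between a frozen pre-trained path and a fine-tuned path; the
# specific gating mechanism below is an assumption.
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F


class RoutedLayer(nn.Module):
    """Wraps one CTR-model layer with a two-way instance-wise transfer policy."""

    def __init__(self, pretrained_layer: nn.Module, in_dim: int):
        super().__init__()
        # Siamese copy that keeps being re-optimized in the target domain.
        self.tuned = copy.deepcopy(pretrained_layer)
        # Pre-trained copy, kept frozen.
        self.frozen = pretrained_layer
        for p in self.frozen.parameters():
            p.requires_grad = False
        # Tiny per-layer policy network producing two routing logits.
        self.policy = nn.Linear(in_dim, 2)

    def forward(self, x: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
        logits = self.policy(x)  # (batch, 2): instance-wise routing logits
        # Hard Gumbel-softmax: discrete per-instance choice in the forward
        # pass, straight-through gradients for the policy in the backward pass.
        gate = F.gumbel_softmax(logits, tau=tau, hard=True)  # (batch, 2)
        out_frozen = self.frozen(x)
        out_tuned = self.tuned(x)
        return gate[:, :1] * out_frozen + gate[:, 1:] * out_tuned


# Usage: route every hidden layer of a toy MLP-style CTR tower.
layers = [RoutedLayer(nn.Linear(16, 16), in_dim=16) for _ in range(3)]
x = torch.randn(4, 16)
for layer in layers:
    x = torch.relu(layer(x))
print(x.shape)  # torch.Size([4, 16])
```

The hard Gumbel-softmax is chosen here so that each instance makes a discrete routing decision while gradients still flow to the policy parameters, which is consistent with the highlight that the transfer policies train synchronously with the deep CTR model.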