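Computer science
Testbed
Generative grammar
Inference
Personalization
Generative model
Mobile device
Cloud computing
Artificial intelligence
Computer security
Computer network
World Wide Web
Operating system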
Authors
Ye Zhang, Jinrui Zhang, Sheng Yue, Wei Lu, Ju Ren, Xuemin Shen
Identifier
DOI:10.1109/mwc.006.2300576
Abstract
Recently, generative artificial intelligence (GenAI) has gained significant interest on a global scale, particularly with the explosion of killer GenAI applications such as ChatGPT. However, because generative models are excessively large, most current GenAI applications are deployed in the cloud, which easily leads to high cost, long delay, and a potential risk of privacy leakage, thereby greatly impeding GenAI's further expansion and development. In this article, we explore mobile GenAI, that is, deploying large generative models on mobile devices, aiming to bring the GenAI capability into physical proximity to users. First, we analyze the benefits and opportunities of mobile GenAI in terms of cost, delay, privacy, personalization, and application. Then, we test various large generative models on a mobile testbed and reveal mobile GenAI's key bottlenecks in inference latency and memory consumption. Accordingly, we propose a weight occupancy strategy for model compression during inference and discuss its pros and cons. Finally, future directions are pointed out to foster continued research efforts.