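Computer science
Terminology
Coding (set theory)
Software
Source code
Exploratory research
Data science
Software engineering
Programming language
Anthropology
Linguistics
Philosophy
Sociology
Set (abstract data type)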
Authors
Haoquan Zhou, Jingbo Li
Identifier
DOI:10.1145/3544549.3583943
Abstract
Recent advances in automatic code generation have made tools like GitHub Copilot attractive to programmers, as they allow code blocks to be created simply by providing descriptive prompts to the AI. While researchers have studied the performance of these AI-based tools in general-purpose programming, their effectiveness in data analysis is understudied. Unlike general-purpose programming, which focuses more on algorithm-driven tasks like building novel software, data analysis requires a data-driven approach to actually gain insights. It remains unclear how these tools could be used to help data scientists analyze real-world problems. In this paper, we conducted a qualitative user study with five participants to understand how data scientists use GitHub Copilot to solve problems by scaffolding prompts at different levels of specificity. We discovered that effective prompts require carefully selected terminology, properly arranged word order, and sufficiently established interaction between humans and GitHub Copilot. We also identified some potential flaws in GitHub Copilot that hinder data scientists from efficiently scaffolding prompts. Our work points out directions for future improvement for both data scientists and GitHub Copilot.