Authors
Gisele Sampaio Silva,Rohan Khera,Lee H. Schwamm,Maurizio Acampa,Eric E. Adelman,Johannes Boltze,Joseph P. Broderick,Amy Brodtmann,Hanne Christensen,Lachlan L. Dalli,Kelsey Rose Duncan,Islam Y. Elgendy,Adviye Ergul,Larry B. Goldstein,Janice L. Hinkle,Michelle C. Johansen,Katarina Jood,Scott E. Kasner,Steven R. Levine,Zixiao Li,Gregory Lip,Elisabeth B. Marsh,Keith W. Muir,Johanna M. Ospel,Joanna Pera,Terence J. Quinn,Silja Räty,Annemarei Ranta,Lorie Richards,José R. Romero,Joshua Z. Willey,Argye E. Hillis,Janne M. Veerbeek
Abstract
Artificial intelligence (AI) large language models (LLMs) now produce human-like general text and images. However, the ability of LLMs to generate persuasive scientific essays that undergo evaluation under traditional peer review has not been systematically studied. To measure perceptions of quality and the nature of authorship, we conducted a competitive essay contest in 2024 with both human and AI participants. Human authors and 4 distinct LLMs generated essays on controversial topics in stroke care and outcomes research. A panel of