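Robustness (evolution)
Codebase
Computer science
Transparency (behavior)
Image quality
Originality
Image (mathematics)
Benchmark (surveying)
Artificial intelligence
Source code
Qualitative research
Cartography
Computer security
Sociology
Programming language
Biochemistry
Gene
Social science
Chemistry
Geography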
Authors
Tony Lee, Michihiro Yasunaga, Chenlin Meng, Yifan Mai, Joon-Sung Park, Agrim Gupta, Yunzhi Zhang, Deepak Narayanan, Hannah Teufel, Marco Bellagente, Minguk Kang, Taesung Park, Jure Leskovec, Jun-Yan Zhu, Li Fei-Fei, Jiajun Wu, Stefano Ermon, Percy Liang
Source
Journal: Cornell University - arXiv
Date: 2023-01-01
Citations: 6
Identifier
DOI:10.48550/arxiv.2311.04287
Abstract
The stunning qualitative improvement of recent text-to-image models has led to their widespread attention and adoption. However, we lack a comprehensive quantitative understanding of their capabilities and risks. To fill this gap, we introduce a new benchmark, Holistic Evaluation of Text-to-Image Models (HEIM). Whereas previous evaluations focus mostly on text-image alignment and image quality, we identify 12 aspects, including text-image alignment, image quality, aesthetics, originality, reasoning, knowledge, bias, toxicity, fairness, robustness, multilinguality, and efficiency. We curate 62 scenarios encompassing these aspects and evaluate 26 state-of-the-art text-to-image models on this benchmark. Our results reveal that no single model excels in all aspects, with different models demonstrating different strengths. We release the generated images and human evaluation results for full transparency at https://crfm.stanford.edu/heim/v1.1.0 and the code at https://github.com/stanford-crfm/helm, which is integrated with the HELM codebase.