Computer science
Simplicity (philosophy)
Artificial intelligence
Feature (linguistics)
Trustworthiness
Transformer
Task (project management)
Natural language processing
Machine learning
Linguistics
Philosophy
Physics
Computer security
Management
Epistemology
Quantum mechanics
Voltage
Economics
Authors
Munkhtulga Battogtokh,G. Flucke,Cosmin Davidescu,Rita Borgo
Source
Journal: Communications in Computer and Information Science
Date: 2024-01-01
Pages: 398-425
Identifier
DOI:10.1007/978-3-031-50396-2_23
Abstract
Fine-grained text classification with many similar labels is a challenge in practical applications, and interpreting predictions in this context is particularly difficult. To address this, we propose a simple framework that disentangles feature importance into more fine-grained input-label links. We demonstrate our framework on the task of intent recognition, which is widely used in real-life applications where trustworthiness is important, for state-of-the-art Transformer language models using their attention mechanism. Our human and semi-automated evaluations show that our approach explains fine-grained input-label relations better than the popular feature importance estimation methods LIME and Integrated Gradients, and that it allows faithful interpretations through simple rules, especially when model confidence is high.
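The abstract describes deriving fine-grained input-label links from a Transformer's attention mechanism rather than a single per-token importance score. The sketch below illustrates one way such a disentanglement could look: head-averaged attention from the [CLS] position weights each token, and the classifier's label embeddings split that weight into per-label links. This is a minimal illustration under assumed shapes and names (`attn_cls`, `hidden`, `label_emb` are hypothetical), not the authors' actual method.

```python
import numpy as np

def token_label_links(attn_cls, hidden, label_emb):
    """Sketch: disentangle token importance into per-label links.

    attn_cls:  (heads, tokens)  attention weights from [CLS] to each token
    hidden:    (tokens, dim)    final-layer token representations
    label_emb: (labels, dim)    classifier weight rows, one per intent label
    Returns a (labels, tokens) matrix of input-label importance links.
    """
    importance = attn_cls.mean(axis=0)      # (tokens,) head-averaged attention
    affinity = label_emb @ hidden.T         # (labels, tokens) token-label affinity
    return affinity * importance            # weight each token column by its attention

# Toy example with random tensors standing in for a real model's outputs.
rng = np.random.default_rng(0)
attn = np.abs(rng.normal(size=(4, 6)))
attn /= attn.sum(axis=1, keepdims=True)     # normalize like attention rows
links = token_label_links(attn, rng.normal(size=(6, 8)), rng.normal(size=(3, 8)))
print(links.shape)                          # one row of token links per label
```

Summing the link matrix over tokens recovers an ordinary per-label score, while each row shows which tokens support which label, which is the kind of fine-grained relation the abstract contrasts with LIME and Integrated Gradients.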