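Computer science
Preprocessor
Question answering
Artificial intelligence
Random forest
Naive Bayes classifier
Support vector machine
Machine learning
Word (group theory)
Representation (politics)
Domain (mathematical analysis)
Data preprocessing
Process (computing)
Natural language processing
Data mining
Mathematics
Law
Mathematical analysis
Geometry
Politics
Political science
Operating system
Authors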
Oussama Tahtah,Yassine Akhiat,Ahmed Zinedine,Khalid Fardousse
Identifier
DOI:10.1109/cist56084.2023.10409875
Abstract
This paper is part of a larger project to build a question-answering system for the Moroccan legal domain. It focuses on the data preparation and question classification phases. Classifying legal questions is a crucial step in a legal question-answering system, since it can help locate the relevant documents that contain the answers. As the volume of legal data grows, legal question classification can help legal practitioners efficiently identify the laws, precedents, and regulations that apply to a specific case. Machine learning algorithms and Natural Language Processing (NLP) techniques can effectively process large amounts of unstructured legal data that would be difficult to process manually. In this work, we investigate the use of different word representation techniques and different stemmers in the preprocessing stage, with the aim of classifying legal questions using three machine learning algorithms: Support Vector Machine, Naïve Bayes, and Random Forest. The experimental results show that the Random Forest algorithm achieved the highest accuracy and precision. They also confirm that question preprocessing significantly improves the classification results.
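The preprocess-then-classify idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' setup: the toy English questions, the two labels, the crude suffix-stripping `stem` function (a stand-in for a real stemmer), and the bag-of-words representation. Of the three classifiers the paper compares, only a small multinomial Naïve Bayes is shown, because it fits in a few lines of standard-library Python.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: the paper's corpus and label set are not given in the abstract.
TRAIN = [
    ("how is property divided after divorce", "family"),
    ("who gets custody of children after divorcing", "family"),
    ("what are the conditions for marrying", "family"),
    ("what is the penalty for stealing", "criminal"),
    ("how are thefts punished", "criminal"),
    ("what penalties apply to fraud", "criminal"),
]


def stem(token):
    """Crude suffix stripper, a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "ies", "es", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text, use_stemmer=True):
    """Lowercase, split on whitespace, and optionally stem each token."""
    tokens = text.lower().split()
    return [stem(t) for t in tokens] if use_stemmer else tokens


class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts, with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for t in tokens:
                if t in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best


def train(use_stemmer=True):
    docs = [preprocess(q, use_stemmer) for q, _ in TRAIN]
    labels = [label for _, label in TRAIN]
    return NaiveBayes().fit(docs, labels)


def classify(model, question, use_stemmer=True):
    return model.predict(preprocess(question, use_stemmer))
```

With stemming enabled, a question such as "someone steals" shares the stem "steal" with the training question about "stealing", so the model has evidence for the criminal label. Without stemming, neither word appears in the training vocabulary and the prediction falls back to the class priors. This is one simple way preprocessing can affect classification, in line with the effect the abstract reports.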