Gabriel Theron,Timothy Liundi,Andry Chowanda,Anderies
Source
Journal: 2021 7th International Conference on Electrical, Electronics and Information Engineering (ICEEIE)
Date: 2023-09-28
Volume/Issue: 1-6
Identifier
DOI:10.1109/iceeie59078.2023.10334798
Abstract
CAPTCHA is a digital security system that prevents automated bots from accessing an online website or service. With advances in Convolutional Neural Networks (CNNs), CAPTCHAs can be easily bypassed given a suitable model and dataset. Using deep learning methods to bypass various types of CAPTCHAs has been researched in previous related papers, while our paper focuses specifically on the image CAPTCHA called reCAPTCHA. This research addresses a gap in the existing literature on the effects of using multi-labeled images compared to single-labeled images. To this end, this research provides a new Multi-Label Image CAPTCHA dataset, collected and labeled manually by the researchers. Using a public dataset alongside the newly collected dataset, this paper compares the success rates of CNN models trained using two different approaches. The first approach trains the model on a public dataset of single-labeled images acquired from Kaggle. The second approach uses the new dataset, in which each image can have multiple labels. The researchers deployed the resulting models on a reCAPTCHA demonstration website and documented the results.
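
The difference between the two training approaches can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the model architecture, label set, losses, or thresholds, so the class names, logit values, and the 0.5 threshold below are all assumptions made for illustration. The sketch assumes the common convention that a single-label classifier uses a softmax output with cross-entropy loss (classes compete, one label per image), while a multi-label classifier uses independent sigmoid outputs with binary cross-entropy (any number of labels per image).

```python
import math

# Hypothetical label set; the paper's actual classes are not listed in the abstract.
CLASSES = ["bicycle", "bus", "car", "crosswalk", "hydrant"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Map one raw score to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def single_label_predict(logits):
    """Single-label head: classes compete and exactly one label is returned."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {CLASSES[best]}


def multi_label_predict(logits, threshold=0.5):
    """Multi-label head: every class is scored independently (assumed threshold)."""
    return {c for c, x in zip(CLASSES, logits) if sigmoid(x) >= threshold}


def cross_entropy(logits, target_index):
    """Training loss for the single-label setting (one correct class)."""
    return -math.log(softmax(logits)[target_index])


def binary_cross_entropy(logits, targets):
    """Training loss for the multi-label setting (targets is a multi-hot list)."""
    losses = []
    for x, t in zip(logits, targets):
        p = sigmoid(x)
        losses.append(-(t * math.log(p) + (1 - t) * math.log(1 - p)))
    return sum(losses) / len(losses)


def tile_matches(predicted_labels, challenge):
    """Decide whether a tile should be selected for a given challenge word."""
    return challenge in predicted_labels


# Made-up logits for one tile that contains both a bus and a car.
logits = [-2.0, 1.5, 2.0, -1.0, -3.0]
single = single_label_predict(logits)
multi = multi_label_predict(logits)
```

With these made-up scores, the single-label head reports only the top class, so a challenge asking for the second object in the tile would be missed, whereas the multi-label head can report both. Whether this translates into a higher success rate on real challenges is the empirical question the paper investigates.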