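Artificial intelligence
Classifier (UML)
Computer science
Pattern recognition (psychology)
Scale-invariant feature transform
Mel-frequency cepstrum
Feature extraction
Support vector machine
Computer vision
Speech recognition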
Authors
Andréia Marini,A. J. Turatti,Alceu S. Britto,Alessandro L. Koerich
Identifier
DOI:10.1109/icassp.2015.7178383
Abstract
This paper presents a novel approach to bird species identification that relies on both visual features extracted from unconstrained bird images and acoustic features extracted from bird vocalizations. The Scale Invariant Feature Transform (SIFT) detects local features in bird images, which are then used to train a support vector machine classifier. Instances that are not classified with a sufficient degree of certainty are rejected and reclassified using Mel-frequency cepstral coefficients (MFCCs) extracted from the bird songs, if available. Experiments on a dataset of 50 bird species, comprising images from CUB200-2011 and audio samples from Xeno-Canto, show that using an acoustic classifier to re-process the instances rejected by the visual classifier yields improvements of between 1.2 and 15.7 percentage points, depending on the rejection level.
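
The reject-and-reclassify step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `cascade_predict`, the use of the top class score as the certainty measure, the `reject_threshold` parameter, and the fallback to the visual decision when no recording exists are all assumptions made for the example. The per-class scores are taken as given, standing in for the outputs of the SIFT-based and MFCC-based classifiers.

```python
from typing import List, Optional, Sequence, Tuple


def argmax(scores: Sequence[float]) -> int:
    """Return the index of the highest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def cascade_predict(
    visual_scores: List[Sequence[float]],
    acoustic_scores: List[Optional[Sequence[float]]],
    reject_threshold: float,
) -> List[Tuple[int, str]]:
    """Hypothetical two-stage decision rule.

    For each instance, keep the visual decision when its top score reaches
    reject_threshold. Otherwise the instance is rejected and, if acoustic
    scores are available, the acoustic decision is used instead.
    Assumption: with no audio, the visual decision is kept as a fallback.
    """
    decisions = []
    for vis, aud in zip(visual_scores, acoustic_scores):
        if max(vis) >= reject_threshold or aud is None:
            decisions.append((argmax(vis), "visual"))
        else:
            decisions.append((argmax(aud), "acoustic"))
    return decisions


# Illustrative scores for three instances over three species (made-up values).
visual = [[0.90, 0.05, 0.05], [0.40, 0.35, 0.25], [0.40, 0.35, 0.25]]
acoustic = [[0.10, 0.80, 0.10], [0.10, 0.20, 0.70], None]
result = cascade_predict(visual, acoustic, reject_threshold=0.6)
```

Raising `reject_threshold` sends more instances to the acoustic stage, which corresponds to the "rejection level" on which the reported improvement depends.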