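Codebook
Artificial intelligence
Pattern recognition (psychology)
Histogram
Computer science
Euclidean distance
Kernel (algebra)
Bag-of-words model in computer vision
Histogram matching
Cluster analysis
Mathematics
Visual word
Image (mathematics)
Image retrieval
Combinatorics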
Authors
Jianxin Wu, James M. Rehg
Identifier
DOI:10.1109/iccv.2009.5459178
Abstract
Common visual codebook generation methods used in a Bag of Visual Words model, e.g., k-means or Gaussian mixture models, use the Euclidean distance to cluster features into visual code words. However, most popular visual descriptors are histograms of image measurements. It has been shown that the Histogram Intersection Kernel (HIK) is more effective than the Euclidean distance in supervised learning tasks with histogram features. In this paper, we demonstrate that HIK can also be used in an unsupervised manner to significantly improve the generation of visual codebooks. We propose a histogram kernel k-means algorithm which is easy to implement and runs almost as fast as k-means. The HIK codebook consistently achieves 2-4% higher recognition accuracy than k-means codebooks. In addition, we propose a one-class SVM formulation to create more effective visual code words, which can achieve even higher accuracy. The proposed method establishes new state-of-the-art performance numbers on 3 popular benchmark datasets for object and scene recognition. We also show that the standard k-median clustering method can be used for visual codebook generation and can act as a compromise between the HIK and k-means approaches.
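
To make the idea of clustering with HIK instead of the Euclidean distance concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the usual definition of the histogram intersection kernel, K(x, y) = sum_d min(x_d, y_d), and runs a naive kernel k-means on the full Gram matrix, which costs O(n^2) memory and does not include the speed-ups that let the paper's method run almost as fast as k-means. The function names hik and kernel_kmeans and all parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def hik(X, Y):
    """Histogram intersection kernel: K[i, j] = sum_d min(X[i, d], Y[j, d])."""
    return np.minimum(X[:, None, :], Y[None, :, :]).sum(axis=2)

def kernel_kmeans(K, k, n_iter=20, seed=0):
    """Naive kernel k-means on a precomputed n x n Gram matrix K.

    Illustrative only: this is the textbook kernel k-means update,
    not the fast HIK-specific procedure described in the paper.
    """
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=n)  # random initial assignment
    self_sim = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            members = labels == c
            m = members.sum()
            if m == 0:
                continue  # empty cluster: its distance stays at infinity
            # Squared feature-space distance to the cluster mean:
            # K(x,x) - (2/m) sum_j K(x,j) + (1/m^2) sum_{j,l} K(j,l)
            dist[:, c] = (self_sim
                          - 2.0 * K[:, members].sum(axis=1) / m
                          + K[np.ix_(members, members)].sum() / m**2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
    return labels

# Usage: cluster random non-negative "histograms" into 3 code words.
histograms = np.random.default_rng(1).random((30, 8))
assignments = kernel_kmeans(hik(histograms, histograms), k=3)
```

Each resulting cluster plays the role of one visual code word. Because the cluster mean lives in the kernel's feature space, it is represented implicitly through its members rather than as an explicit histogram.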