Hierarchical Equalization Loss for Long-Tailed Instance Segmentation

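Computer science; Segmentation; Equalization (audio); Artificial intelligence; Image segmentation; Pattern recognition (psychology); Algorithm; Decoding methods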

Authors

Yaochi Zhao,Sen Chen,Shiguang Liu,Zhuhua Hu,Jingquan Xia

Source

Journal: IEEE Transactions on Multimedia [Institute of Electrical and Electronics Engineers]
Date: 2024-01-01 Volume: 26: 6943-6955 Cited by: 37

Identifier

DOI: 10.1109/tmm.2024.3358080

Abstract

Multimedia data is large in scale and follows a skewed, long-tailed distribution, which poses a challenging imbalance problem for deep learning. In long-tailed image instance segmentation, existing methods address this imbalance from a single perspective and ignore the presence of multiple imbalance factors, which limits performance. Considering that imbalances exist not only between positive and negative classes, but also between foreground and background subclasses and between hard and easy examples, we argue that sample losses should be hierarchically equalized at multiple levels (HEL). In line with this idea, we first propose a focus-based hierarchical-equalization loss (FHEL), which employs a reweighting mechanism based on the class gradient ratio to balance classes, and uses a subclass-balance term and a sample-balance term to deal separately with the inter-subclass and inter-sample imbalances. FHEL improves the performance of long-tailed instance segmentation in an end-to-end manner, avoiding the overfitting risk and manual hard division of traditional methods. Building on FHEL, we further explore the relationship between inter-subclass imbalance and inter-sample imbalance, and propose a constrained-focus-based hierarchical-equalization loss (CFHEL) that handles the imbalances at multiple levels simultaneously with fewer hyperparameters. CFHEL is effective, and its hyperparameters are easy to tune. We conduct extensive experiments on the LVIS v1.0 and COCO-LT datasets with different benchmarks. Both FHEL and CFHEL outperform existing methods. On LVIS v1.0, with ResNet50 Mask R-CNN, ResNet101 Mask R-CNN, ResNeXt101 Mask R-CNN and ResNet101 Cascade Mask R-CNN, CFHEL outperforms its baselines respectively with 19.8%, 18.5%, 21.6% and 21.2% AP% gains, and with 6.7%, 6.6% and 6.5% AP gains, achieving a new state of the art. On COCO-LT, CFHEL outperforms the baseline with 13.2% tail AP gains and 3.3% overall AP gains, also achieving new best performance.
