Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection

Abstract

In this paper, we propose a novel edge-preserving and multi-scale contextual neural network for salient object detection. The proposed framework aims to address two limitations of existing CNN-based methods. First, region-based CNN methods lack sufficient context to accurately locate salient objects, since they deal with each region independently. Second, pixel-based CNN methods suffer from blurry boundaries due to the presence of convolutional and pooling layers. Motivated by these observations, we first propose an end-to-end edge-preserving neural network based on the Fast R-CNN framework (named RegionNet) to efficiently generate saliency maps with sharp object boundaries. We then further improve it by attaching multi-scale spatial context to RegionNet, capturing the relationship between regions and the global scene. Furthermore, our method can be generally applied to RGB-D saliency detection via depth refinement. The proposed framework achieves both clear detection boundaries and multi-scale contextual robustness simultaneously for the first time, and thus achieves an optimized performance. Experiments on six RGB and two RGB-D benchmark datasets demonstrate that the proposed method outperforms previous methods by a large margin; in particular, we achieve relative F-measure improvements of 6.1% and 10.1% on the ECSSD and DUT-OMRON datasets, respectively.
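
For intuition, the snippet below sketches one way to attach multi-scale spatial context to a convolutional feature map: the features are average-pooled at several scales, projected, upsampled back, and concatenated with the original features, so each location is scored with awareness of both local detail and the global scene. This is a minimal PyTorch sketch under our own assumptions (the module name MultiScaleContext, the scales, and the fusion scheme are illustrative), not the paper's RegionNet implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleContext(nn.Module):
    """Illustrative multi-scale context head (not the authors' code).

    Pools a feature map at several spatial scales, upsamples the pooled
    maps back to the input resolution, and concatenates them with the
    original features before predicting a per-pixel saliency score.
    """
    def __init__(self, in_ch, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        # 1x1 convs to reduce the channel count of each pooled context map
        self.reduce = nn.ModuleList(
            [nn.Conv2d(in_ch, in_ch // len(scales), 1) for _ in scales]
        )
        fused_ch = in_ch + (in_ch // len(scales)) * len(scales)
        self.predict = nn.Conv2d(fused_ch, 1, 1)  # per-pixel saliency score

    def forward(self, feat):
        h, w = feat.shape[2:]
        ctx = [feat]
        for s, conv in zip(self.scales, self.reduce):
            pooled = F.adaptive_avg_pool2d(feat, s)  # scene context at scale s
            ctx.append(F.interpolate(conv(pooled), size=(h, w),
                                     mode='bilinear', align_corners=False))
        return self.predict(torch.cat(ctx, dim=1))

In such a setup, feat would come from a backbone (e.g., VGG or ResNet features), and the predicted map would be upsampled to image resolution and supervised with a per-pixel loss.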

Paper

arXiv 1608.08029 (PDF)
@article{wang2016edge,
  title={Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection},
  author={Wang, Xiang and Ma, Huimin and You, Shaodi and Chen, Xiaozhi},
  journal={arXiv preprint arXiv:1608.08029},
  year={2016}
}

Results

We evaluate our method on six standard RGB benchmark datasets (ECSSD, DUT-OMRON, JuddDB, SED2, THUR15K, and PASCAL-S) and two RGB-D datasets (NJU2000 and RGBD1000).

Comparison with state-of-the-art methods on the six benchmark datasets. For each dataset, the first row shows the PR curves and the second row shows the F-measure and MAE; the numbers in the PR curves denote the AUC. Our method outperforms other methods by a large margin on PR curves and F-measure, and achieves comparable performance on MAE. In particular, the PR curves show that our method achieves much higher precision at high recall, which demonstrates that it locates salient objects more accurately and uniformly.
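
The metrics above are the standard ones in salient object detection. As a reference, the sketch below evaluates a single predicted saliency map against its binary ground truth, reporting the PR curve, its AUC, the maximum F-measure with the widely used weighting beta^2 = 0.3, and the MAE. The function name and the uniform thresholding scheme are our own choices for illustration, not the authors' evaluation code (papers differ, e.g., on max versus adaptive-threshold F-measure).

import numpy as np

def evaluate_saliency(pred, gt, beta2=0.3, n_thresh=255):
    """Evaluate a saliency map against a binary ground-truth mask.

    pred: float array with values in [0, 1]; gt: binary array, same shape.
    Returns (precision, recall, auc, max F-measure, MAE).
    """
    pred = pred.astype(np.float64)
    gt = gt.astype(bool)

    precisions, recalls = [], []
    for t in np.linspace(0.0, 1.0, n_thresh):
        binary = pred >= t
        tp = np.logical_and(binary, gt).sum()
        precisions.append(tp / max(binary.sum(), 1))
        recalls.append(tp / max(gt.sum(), 1))

    p, r = np.array(precisions), np.array(recalls)
    # Recall decreases as the threshold grows; reverse for integration.
    auc = np.trapz(p[::-1], r[::-1])          # area under the PR curve
    f = (1 + beta2) * p * r / np.maximum(beta2 * p + r, 1e-8)
    mae = np.abs(pred - gt.astype(np.float64)).mean()
    return p, r, auc, f.max(), mae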

Code

  • will be released