Journal of Geodesy and Geoinformation Science ›› 2020, Vol. 3 ›› Issue (3): 39-49. doi: 10.11947/j.JGGS.2020.0304


A Remote Sensing Image Semantic Segmentation Method by Combining Deformable Convolution with Conditional Random Fields

Zongcheng ZUO1, Wen ZHANG2, Dongying ZHANG3   

    1. School of Aeronautics and Astronautics, Shanghai Jiao Tong University, Shanghai 200240, China
    2. School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
    3. School of Hydropower and Information Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
  • Received:2019-11-15 Accepted:2020-05-15 Online:2020-09-20 Published:2020-09-30
  • About author: Zongcheng ZUO (1988-), male, PhD candidate and engineer, specializing in high-resolution remote sensing image processing and information extraction, pattern recognition, and machine learning. E-mail: jason.zuo@sjtu.edu.com
  • Supported by:
    National Key Research and Development Program of China(2017YFC0405806)

Abstract:

Deep convolutional neural networks have made great progress in semantic segmentation. Because the geometry of the convolution kernel is fixed, however, standard convolutional neural networks have limited ability to model geometric transformations. A deformable convolution is therefore introduced to improve the adaptability of convolutional networks to spatial transformations. In addition, the pooling layers in the network architecture prevent deep convolutional neural networks from adequately segmenting local objects at the output layer. To overcome this shortcoming, the coarse segmentation predicted by the output layer is refined by a fully connected conditional random field. The proposed method can easily be trained end-to-end with standard backpropagation. It is evaluated on the ISPRS dataset, and the results show that it effectively handles the complex structure of the objects to be segmented and achieves state-of-the-art accuracy on the ISPRS Vaihingen 2D semantic labeling dataset.
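
The idea of a deformable convolution can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `bilinear` and `deformable_conv_at`, the 3x3 kernel size, and the hand-supplied offsets are all assumptions made for illustration. In a trained network the offsets would be predicted by an additional convolution layer rather than given by hand.

```python
import math

def bilinear(img, y, x):
    """Sample a 2D list `img` at fractional (y, x); points outside count as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    # Blend the four integer neighbours of (y, x) by their distance to it.
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * img[yy][xx]
    return value

def deformable_conv_at(img, kernel, offsets, cy, cx):
    """Response of a 3x3 deformable convolution at one location (cy, cx).

    offsets[i][j] = (dy, dx) shifts the sampling point of kernel tap (i, j)
    away from the regular grid. Hypothetical helper, offsets supplied by hand.
    """
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            out += kernel[i][j] * bilinear(img, cy + i - 1 + dy, cx + j - 1 + dx)
    return out

# Toy input: a 5x5 ramp image and an averaging kernel (illustrative values).
img = [[float(5 * y + x) for x in range(5)] for y in range(5)]
kernel = [[1 / 9] * 3 for _ in range(3)]
regular = [[(0.0, 0.0)] * 3 for _ in range(3)]   # zero offsets: standard convolution
shifted = [[(0.0, 0.5)] * 3 for _ in range(3)]   # every tap moved half a pixel right

plain = deformable_conv_at(img, kernel, regular, 2, 2)
moved = deformable_conv_at(img, kernel, shifted, 2, 2)
```

With zero offsets the function reduces to an ordinary 3x3 convolution, so `plain` is the mean of the central 3x3 patch. Non-zero offsets move the sampling points off the regular grid, which is what lets the receptive field adapt to object shape.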

Key words: high-resolution remote sensing image; semantic segmentation; deformable convolution network; conditional random fields