Attention-Based Semantic Segmentation Networks for Forest Applications
Published in: | FORESTS |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www-webofscience-com.uitm.idm.oclc.org/wos/woscc/full-record/WOS:001131189000001 |
Summary: | Deforestation remains a major concern around the world due to commodity-driven extraction, agricultural land expansion, and urbanization. Effective and efficient monitoring of national forests using remote sensing technology is important for the early detection and mitigation of deforestation activities. Deep learning techniques have been extensively researched and applied to various remote sensing tasks. Fully convolutional neural networks have commonly been studied with various input band combinations for satellite imagery applications, but very little research has focused on deep networks with high-resolution representations, such as HRNet. In this study, an optimal semantic segmentation architecture based on high-resolution feature maps and an attention mechanism is proposed to label each pixel of the satellite imagery input for forest identification. The selected study areas are located in Malaysian rainforests; the images were sampled from 2016, 2018, and 2020 and downloaded using Google Earth Pro. The study considers only a two-class problem: classifying each pixel as either forest or non-forest. HRNet is chosen as the baseline architecture; its hyperparameters are optimized before an attention mechanism is embedded to help the model focus on the more critical features related to the forest. Several variants of the proposed method are validated on 6120 sliced images, with the best performance reaching 85.58% mean intersection over union and 92.24% accuracy. The benchmarking analysis also shows that the attention-embedded high-resolution architecture outperforms U-Net, SegNet, and FC-DenseNet on both performance metrics. A qualitative comparison of the baseline and attention-based models also shows fewer false classifications and cleaner prediction outputs with the attention-based models when identifying forest areas. |
ISSN: | 1999-4907 |
DOI: | 10.3390/f14122437 |
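
The summary reports mean intersection over union (mIoU) and accuracy for a two-class (forest / non-forest) labeling problem. As a rough illustration of how these two metrics are commonly computed from a predicted mask and a reference mask, the sketch below uses plain Python on a tiny toy example. It is not the authors' evaluation code: the function names, the label encoding (0 = non-forest, 1 = forest), and the toy masks are all assumptions made for illustration.

```python
def confusion_counts(pred, truth, num_classes=2):
    """Count pixels per (truth, prediction) pair.

    Rows index the reference class, columns the predicted class.
    Assumed encoding (hypothetical): 0 = non-forest, 1 = forest.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of pixels whose predicted class matches the reference."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def mean_iou(matrix):
    """Average of per-class IoU = TP / (TP + FP + FN).

    Classes absent from both masks are skipped.
    """
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # reference c, predicted other
        fp = sum(matrix[r][c] for r in range(n)) - tp  # predicted c, reference other
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x4 masks, invented for this example only.
truth = [[1, 1, 0, 0],
         [1, 1, 0, 0]]
pred = [[1, 1, 1, 0],
        [1, 0, 0, 0]]

matrix = confusion_counts(pred, truth)
print("confusion matrix:", matrix)
print("accuracy:", pixel_accuracy(matrix))
print("mean IoU:", mean_iou(matrix))
```

In this toy case one forest pixel is missed and one non-forest pixel is mislabeled, so accuracy (6 of 8 pixels correct) is higher than mIoU (3/5 for each class). The same ordering appears in the reported 92.24% accuracy and 85.58% mIoU, since IoU penalizes both false positives and false negatives for each class.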