Forest Sound Event Detection with Convolutional Recurrent Neural Network-Long Short-Term Memory

Sound event detection tackles the problem of analyzing and recognizing complex sounds in an audio environment. The process involves localizing and classifying sounds, mainly to estimate the start and end points of each separate sound and to describe it. Sound event detection capability relies o...


Bibliographic Details
Published in:	Journal of Advanced Research in Applied Sciences and Engineering Technology
Main Author:	Zaini M.N.; Yusoff M.; Sadikin M.A.
Format:	Article
Language:	English
Published:	Penerbit Akademia Baru 2023
Online Access:	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174960423&doi=10.37934%2fARASET.32.2.242254&partnerID=40&md5=bdcb4261a7403ba5cdc4f93bec86f159

id	2-s2.0-85174960423
author	Zaini M.N.; Yusoff M.; Sadikin M.A.
title	Forest Sound Event Detection with Convolutional Recurrent Neural Network-Long Short-Term Memory
publishDate	2023
container_title	Journal of Advanced Research in Applied Sciences and Engineering Technology
container_volume	32
container_issue	2
doi_str_mv	10.37934/ARASET.32.2.242254
url	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174960423&doi=10.37934%2fARASET.32.2.242254&partnerID=40&md5=bdcb4261a7403ba5cdc4f93bec86f159
description	Sound event detection tackles the problem of analyzing and recognizing complex sounds in an audio environment. The process involves localizing and classifying sounds, mainly to estimate the start and end points of each separate sound and to describe it. Detection capability depends on the type of sound. Although detecting sequences of temporally distinct sounds is straightforward, the task becomes complex when the audio consists of many overlapping individual sounds. This situation usually occurs in the forest environment. Therefore, the aim of this paper is to propose a Convolutional Recurrent Neural Network-Long Short-Term Memory algorithm to detect the audio signatures of intruders in the forest environment. Mel-frequency cepstral coefficients are extracted from the audio and fed into the algorithm as input. The six sound categories are chainsaw, machete, car, hatchet, ambiance, and bike. The model was tested with several settings for the number of epochs, the batch size, and the filter settings of the model's layers. With a suitable parameter selection, the proposed model achieves an accuracy of 98.52 percent in detecting the audio signatures. In the future, additional types of intruder audio signatures and further aspects of evaluation can be added to make the algorithm better at detecting intruders in the forest environment. © 2023, Penerbit Akademia Baru. All rights reserved.
publisher	Penerbit Akademia Baru
issn	24621943
language	English
format	Article
accesstype	All Open Access; Hybrid Gold Open Access
record_format	scopus
collection	Scopus
_version_	1809677580720144384
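
The abstract in this record says the audio is represented as Mel-frequency cepstral coefficients before being passed to the network. The record does not give the extraction settings, so the following is only a minimal sketch of a textbook MFCC computation with NumPy. The sample rate, frame length, hop size, FFT size, filter count, and coefficient count are all assumed values for illustration, not the authors' configuration, and the function names are hypothetical.

```python
import numpy as np

# The six categories named in the abstract; the order is an assumption.
CLASSES = ["chainsaw", "machete", "car", "hatchet", "ambiance", "bike"]


def hz_to_mel(hz):
    """Convert frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced in mel, shape (n_filters, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            bank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            bank[i - 1, k] = (right - k) / (right - center)
    return bank


def dct_matrix(n_out, n_in):
    """Unnormalized DCT-II basis, shape (n_out, n_in)."""
    n = np.arange(n_in)
    k = np.arange(n_out)[:, None]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))


def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) MFCC matrix for a 1-D NumPy signal.

    All default parameter values are assumptions for this sketch.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Row i of idx holds the sample indices of frame i.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum of each frame (rfft zero-pads frames to n_fft).
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)  # small offset avoids log(0)
    return log_energies @ dct_matrix(n_coeffs, n_filters).T


# Usage: one second of a synthetic 440 Hz tone stands in for a forest recording.
t = np.arange(16000) / 16000.0
tone = np.sin(2 * np.pi * 440.0 * t)
features = mfcc(tone)
print(features.shape)  # (98, 13)
```

A matrix of this shape (time frames by coefficients) is the kind of input a convolutional-recurrent classifier over the six categories would consume; the network itself is not sketched here because the record gives no architectural details.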


Similar Items