Forest Sound Event Detection with Convolutional Recurrent Neural Network-Long Short-Term Memory
Sound event detection addresses the complex problem of analyzing and recognizing sounds in an audio environment. The process involves localizing and classifying sounds, mainly to estimate the start and end points of each separate sound and to describe it. Sound event detection capability relies o...
Published in: | Journal of Advanced Research in Applied Sciences and Engineering Technology |
---|---|
Main Author: | Zaini M.N.; Yusoff M.; Sadikin M.A. |
Format: | Article |
Language: | English |
Published: | Penerbit Akademia Baru 2023 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174960423&doi=10.37934%2fARASET.32.2.242254&partnerID=40&md5=bdcb4261a7403ba5cdc4f93bec86f159 |
id |
2-s2.0-85174960423 |
---|---|
author |
Zaini M.N.; Yusoff M.; Sadikin M.A. |
title |
Forest Sound Event Detection with Convolutional Recurrent Neural Network-Long Short-Term Memory |
publishDate |
2023 |
container_title |
Journal of Advanced Research in Applied Sciences and Engineering Technology |
container_volume |
32 |
container_issue |
2 |
doi_str_mv |
10.37934/ARASET.32.2.242254 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85174960423&doi=10.37934%2fARASET.32.2.242254&partnerID=40&md5=bdcb4261a7403ba5cdc4f93bec86f159 |
description |
Sound event detection addresses the complex problem of analyzing and recognizing sounds in an audio environment. The process involves localizing and classifying sounds, mainly to estimate the start and end points of each separate sound and to describe it. Detection capability depends on the type of sound. Although detecting sequences of temporally distinct sounds is straightforward, the task becomes complex when multiple individual sounds overlap. This situation commonly occurs in the forest environment. Therefore, the aim of this paper is to propose a Convolutional Recurrent Neural Network-Long Short-Term Memory algorithm to detect the audio signatures of intruders in the forest environment. Mel-frequency cepstral coefficients are extracted from the audio and fed into the algorithm as input. The six sound categories are chainsaw, machete, car, hatchet, ambiance, and bike. The model was tested with several settings for the number of epochs, the batch size, and the filters of the model's layers. With suitable parameter selection, the proposed model achieves an accuracy of 98.52 percent in detecting the audio signatures. In the future, additional types of intruder audio signatures and further evaluation aspects can be added to improve the algorithm's detection of intruders in the forest environment. © 2023, Penerbit Akademia Baru. All rights reserved. |
publisher |
Penerbit Akademia Baru |
issn |
24621943 |
language |
English |
format |
Article |
accesstype |
All Open Access; Hybrid Gold Open Access |
record_format |
scopus |
collection |
Scopus |
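
The description above states that Mel-frequency cepstral coefficients are extracted from the audio and used as the model input, but this record gives no extraction parameters. As a rough illustration only, the sketch below computes textbook MFCCs for a single audio frame in plain Python. The sample rate, frame length, number of mel filters, and number of coefficients are assumptions chosen for the example, not values from the paper, and the code does not reproduce the authors' pipeline or the CRNN-LSTM model itself.

```python
import math


def hz_to_mel(f):
    """Convert frequency in Hz to the mel scale (common 2595/700 formula)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window the frame and return its one-sided power spectrum.

    Uses a naive DFT for clarity; a real pipeline would use an FFT.
    """
    n = len(frame)
    windowed = [
        x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Build triangular filters whose centers are evenly spaced on the mel scale."""
    mel_max = hz_to_mel(sample_rate / 2)
    # n_filters triangles need n_filters + 2 edge frequencies.
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))  # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))  # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, n_filters=20, n_mfcc=13):
    """Return n_mfcc cepstral coefficients for one frame (assumed defaults)."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Log of each filter's energy; the small constant avoids log(0).
    log_energy = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]
    # DCT-II decorrelates the log energies into cepstral coefficients.
    return [
        sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters) for j, e in enumerate(log_energy))
        for c in range(n_mfcc)
    ]
```

Calling `mfcc_frame(frame, 8000)` on a 256-sample frame returns a list of 13 coefficients. Stacking such vectors over consecutive frames gives the time-by-coefficient matrix that a convolutional-recurrent classifier for the six sound categories would typically consume. In practice an audio library would replace this hand-written version.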