Skeleton-based action recognition with joint coordinates as feature using neural oblivious decision ensembles
Recognition of human behavior is critical in video monitoring, human-computer interaction, video comprehension, and virtual reality. The key problem with behaviour recognition in video surveillance is the high degree of variation between and within subjects. Numerous studies have suggested backgroun...
Published in: | Frontiers in Artificial Intelligence and Applications |
---|---|
Main Author: | Nasrualam F.A.H. |
Format: | Conference paper |
Language: | English |
Published: | IOS Press BV, 2021 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85116447202&doi=10.3233%2fFAIA210037&partnerID=40&md5=6e206eba5ca482f2222fa55657bd3752 |
id |
2-s2.0-85116447202 |
---|---|
author |
Nasrualam F.A.H.; Shapiai M.I.; Batool U.; Ramli A.K.; Elias K.A. |
title |
Skeleton-based action recognition with joint coordinates as feature using neural oblivious decision ensembles |
publishDate |
2021 |
container_title |
Frontiers in Artificial Intelligence and Applications |
container_volume |
337 |
container_issue |
|
doi_str_mv |
10.3233/FAIA210037 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85116447202&doi=10.3233%2fFAIA210037&partnerID=40&md5=6e206eba5ca482f2222fa55657bd3752 |
description |
Recognition of human behavior is critical in video monitoring, human-computer interaction, video comprehension, and virtual reality. The key problem with behavior recognition in video surveillance is the high degree of variation between and within subjects. Numerous studies have suggested background-insensitive skeleton-based recognition as a proven technique. The present state-of-the-art approaches to skeleton-based action recognition rely primarily on Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN). Both methods take the dynamic human skeleton as the input to the network. We chose to handle skeleton data differently, relying solely on the skeleton joint coordinates as the input. The skeleton joints' positions are defined in (x, y) coordinates. In this paper, we investigate the incorporation of the Neural Oblivious Decision Ensemble (NODE) into our proposed action classifier network. The skeleton is extracted using a pose estimation technique based on the Residual Network (ResNet), which extracts the 2D skeleton of 18 joints for each detected body. The joint coordinates of the skeleton are stored in a table of rows and columns, where each row represents the positions of the joints. The structured data are fed into NODE for label prediction. With the proposed network, we obtain 97.5% accuracy on the RealWorld (HAR) dataset. Experimental results show that the proposed network outperforms one of the state-of-the-art approaches by 1.3%. In conclusion, NODE is a promising deep learning technique for structured data analysis compared with its machine learning counterparts such as the GBDT packages CatBoost and XGBoost. © 2021 The authors and IOS Press. All rights reserved. |
publisher |
IOS Press BV |
issn |
0922-6389 |
language |
English |
format |
Conference paper |
accesstype |
|
record_format |
scopus |
collection |
Scopus |
_version_ |
1809678481249796096 |
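
The abstract describes storing the 18 (x, y) joint coordinates of each detected body as one table row and feeding that table to NODE. The sketch below illustrates only that data layout, plus a simplified hard oblivious decision tree in which every node at a given depth shares one split. It is not the authors' implementation: the column names, the helper functions `skeleton_to_row`, `column_names`, and `oblivious_leaf_index`, and the example split values are all hypothetical, and actual NODE layers use differentiable (soft) feature choices and thresholds rather than the hard comparisons shown here.

```python
from typing import List, Sequence, Tuple

NUM_JOINTS = 18  # from the abstract: 18 joints per detected body

Joint = Tuple[float, float]  # one joint position as (x, y)


def skeleton_to_row(joints: Sequence[Joint]) -> List[float]:
    """Flatten 18 (x, y) joints into one 36-value table row (assumed layout)."""
    if len(joints) != NUM_JOINTS:
        raise ValueError(f"expected {NUM_JOINTS} joints, got {len(joints)}")
    row: List[float] = []
    for x, y in joints:
        row.extend((float(x), float(y)))
    return row


def column_names() -> List[str]:
    """Hypothetical column names matching the row layout above."""
    names: List[str] = []
    for j in range(NUM_JOINTS):
        names.extend((f"joint{j}_x", f"joint{j}_y"))
    return names


def oblivious_leaf_index(row: Sequence[float],
                         splits: Sequence[Tuple[int, float]]) -> int:
    """Hard oblivious tree: each depth level applies one shared
    (feature index, threshold) test, and the resulting bits form the leaf index."""
    index = 0
    for feature, threshold in splits:
        bit = 1 if row[feature] > threshold else 0
        index = (index << 1) | bit
    return index


# Synthetic skeleton for illustration only (not data from the paper).
skeleton = [(float(j), float(j) + 0.5) for j in range(NUM_JOINTS)]
row = skeleton_to_row(skeleton)

# Depth-3 tree with made-up splits, so it has 2**3 = 8 leaves.
splits = [(0, 1.0), (3, 1.0), (35, 10.0)]
leaf = oblivious_leaf_index(row, splits)
print(len(row), leaf)
```

Because every level shares a single test, a depth-d oblivious tree is fully described by d (feature, threshold) pairs and a table of 2^d leaf values, which is what makes this tree family convenient to batch and, in NODE, to relax into a differentiable form.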