COMMON-SENSICAL INCENTIVE REWARD IN DEEP ACTOR-CRITIC REINFORCEMENT LEARNING FOR MOBILE ROBOT NAVIGATION

Recently, various Deep Actor-Critic Reinforcement Learning (DAC-RL) algorithms have been widely used to train mobile robots to acquire navigational policies. However, they usually need a prohibitively long learning time to achieve good policies. This research proposes a two-stage training m...


Bibliographic Details
Published in:	International Journal of Innovative Computing, Information and Control
Main Author:	Sendari S.; Muladi; Ardiyansyah F.; Setumin S.; Mokhtar N.B.; Lin H.-I.; Hartono P.
Format:	Article
Language:	English
Published:	ICIC International 2024
Online Access:	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85186911357&doi=10.24507%2fijicic.20.02.373&partnerID=40&md5=1b41795e03a07d1af74236f5854d811a

id	2-s2.0-85186911357
author	Sendari S.; Muladi; Ardiyansyah F.; Setumin S.; Mokhtar N.B.; Lin H.-I.; Hartono P.
title	COMMON-SENSICAL INCENTIVE REWARD IN DEEP ACTOR-CRITIC REINFORCEMENT LEARNING FOR MOBILE ROBOT NAVIGATION
publishDate	2024
container_title	International Journal of Innovative Computing, Information and Control
container_volume	20
container_issue	2
doi_str_mv	10.24507/ijicic.20.02.373
url	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85186911357&doi=10.24507%2fijicic.20.02.373&partnerID=40&md5=1b41795e03a07d1af74236f5854d811a
description	Recently, various Deep Actor-Critic Reinforcement Learning (DAC-RL) algorithms have been widely used to train mobile robots to acquire navigational policies. However, they usually need a prohibitively long learning time to achieve good policies. This research proposes a two-stage training mechanism infused with human common-sensical prior knowledge, named Two Stages DAC-RL with incentive reward, to alleviate this problem. The actor-critic networks were pre-trained in a simple environment to acquire a basic policy. Afterward, the basic policy was transferred to initialize the training process of a new navigational policy in more complex environments. This study also infused humans’ common-sensical prior knowledge to further mitigate the RL learning burden by giving incentive rewards in situations beneficial to the navigation task. The experiments tested the proposed algorithms on navigation tasks in which the robot should efficiently reach designated goals. The tasks were made more challenging by requiring the robot to cross some corridors to reach the goal while avoiding obstacles. The results showed that the proposed algorithm worked efficiently for various start-goal positions across the corridors. © 2024, Int. J. Innov. Comput. Inf. Control. All rights reserved.
publisher	ICIC International
issn	13494198
language	English
format	Article
accesstype
record_format	scopus
collection	Scopus
_version_	1809677675013341184


Similar Items