University of Worcester Worcester Research and Publications
 

Reward shaping with hierarchical graph topology

Sang, J., Wang, Y., Ding, W., Ahmed Khan, Zaki and Xu, L. (2023) Reward shaping with hierarchical graph topology. Pattern Recognition, 143 (109746). Print ISSN: 0031-3203. Online ISSN: 1873-5142

Full text not available from this repository.

Abstract

Reward shaping using graph convolutional networks (GCNs) is a popular research area in reinforcement learning. However, it is difficult to shape potential functions for complicated tasks. In this paper, we develop Reward Shaping with Hierarchical Graph Topology (HGT). HGT propagates information about the rewards through a message passing mechanism, and the propagated messages can be used as potential functions for reward shaping. We describe reinforcement learning by a probability graph model. We then generate an underlying graph in which each state is a node and the edges represent transition probabilities between states. To shape potential functions effectively for complex environments, HGT divides the underlying graph constructed from states into multiple subgraphs. Since these subgraphs represent multiple logical relationships between states in the Markov decision process, the aggregation process incorporates rich correlation information between nodes, which makes the propagated messages more powerful. Compared with cutting-edge RL techniques, HGT learns faster in experiments on Atari and MuJoCo tasks.
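
The general idea of using propagated reward information as a potential function can be illustrated with a small sketch. This is not the HGT method from the paper (the full text is not available here, and no GCN or subgraph hierarchy is involved). It assumes a toy four-state chain, a hypothetical `propagate_potentials` helper that repeatedly aggregates neighbour values over the transition graph, and the standard potential-based shaping form r + gamma * phi(s') - phi(s). The names `decay`, `n_iters` and `shaped_reward` are likewise illustrative assumptions.

```python
import numpy as np


def propagate_potentials(P, rewards, decay=0.5, n_iters=20):
    """Spread reward information backwards over the state graph.

    P[i, j] is the transition probability from state i to state j.
    Each iteration a node adds the decayed, probability-weighted
    values of its successors to its own reward (a simple linear
    stand-in for message passing, not a GCN).
    """
    phi = np.zeros(len(rewards))
    for _ in range(n_iters):
        phi = rewards + decay * (P @ phi)
    return phi


def shaped_reward(r, s, s_next, phi, gamma=0.9):
    """Potential-based shaping: r + gamma * phi(s') - phi(s)."""
    return r + gamma * phi[s_next] - phi[s]


# Toy chain 0 -> 1 -> 2 -> 3; state 3 is terminal and carries the reward.
P = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
])
rewards = np.array([0.0, 0.0, 0.0, 1.0])

phi = propagate_potentials(P, rewards)
forward_bonus = shaped_reward(0.0, 0, 1, phi)   # step towards the reward
backward_bonus = shaped_reward(0.0, 1, 0, phi)  # step away from the reward
print(phi, forward_bonus, backward_bonus)
```

In this toy setting the potential grows towards the rewarding state, so a step towards it earns a positive shaping term and a step away earns a negative one, even though the environment reward is zero in both cases.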

Item Type: Article
Additional Information: The full text of the published version cannot be supplied for this item. Please check availability with your local library or Interlibrary Requests Service.

Uncontrolled Discrete Keywords: Reinforcement learning, Reward shaping, Probability graph, Markov decision process
Divisions: College of Business, Psychology and Sport > Worcester Business School
Copyright Info: © 2024 Elsevier B.V., its licensors, and contributors. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
Depositing User: Katherine Small
Date Deposited: 29 Feb 2024 14:59
Last Modified: 29 Feb 2024 14:59
URI: https://eprints.worc.ac.uk/id/eprint/13654
