Transportation Research Procedia, Volume 62 / 11 March 2022
Investigating Successor Features in the Domain of Autonomous Vehicle Control
In this article, basic Reinforcement Learning (RL) concepts are discussed, followed by a brief explanation of Markov Decision Processes (MDPs). The reasoning for applying RL in the autonomous vehicle control domain is accompanied by a basic environment developed for simulation-based training of agents. Furthermore, we review the available literature on successor features and the recent achievements of their utilization. Our motivation is to tackle the credit assignment problem through reward decomposition using successor features, since the complex, multi-objective nature of driving tasks can make training challenging or even unsuccessful. After explaining the applied methodology and showing how it works, state-of-the-art ideas are investigated and transferred into the vehicle control realm. Moreover, the paper details how these features can be tailored to highway driving scenarios and why they are capable of boosting the performance of the RL agent. In order to investigate the proposed problems in a credible way, we applied a high-fidelity traffic simulator (SUMO) as our environment and conducted trainings based on various scenarios. We present successor features applied to an autonomous vehicle control setting, namely highway driving. Our results imply that learned skills can help with the multi-objective reward problem, and that agents facing changing reward systems can adapt quickly to new tasks. The remaining challenge is to find the correct reward decomposition and selection of successor features.
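For readers unfamiliar with the formalism, the following is a minimal sketch of the standard successor feature decomposition in the sense of Barreto et al.; we assume this is the formulation underlying the approach, as the abstract itself does not spell it out. The one-step reward is written as an inner product of task-independent features \(\phi\) and task-specific weights \(w\), and the successor features \(\psi^{\pi}\) accumulate the expected discounted features under a policy \(\pi\):

\[
r(s, a, s') = \phi(s, a, s')^{\top} w,
\]
\[
\psi^{\pi}(s, a) = \mathbb{E}_{\pi}\!\left[\, \sum_{t=0}^{\infty} \gamma^{t}\, \phi(s_t, a_t, s_{t+1}) \;\middle|\; s_0 = s,\ a_0 = a \right],
\]
\[
Q^{\pi}(s, a) = \psi^{\pi}(s, a)^{\top} w.
\]

Because \(\psi^{\pi}\) captures the expected discounted feature occupancies independently of \(w\), a change in the reward trade-off (for example, re-weighting hypothetical speed, safety, and comfort terms in a highway scenario) only requires re-estimating \(w\) while the learned \(\psi^{\pi}\) is reused, which is what makes the quick adaptation to new tasks claimed above plausible.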