Online trajectory planning with reinforcement learning for pedestrian avoidance
Planning optimal trajectories for emergency avoidance maneuvers in highly automated vehicles is a complex task with many challenges. The algorithm must reduce accident risk by lowering collision severity while keeping the vehicle in a controllable state. Optimal trajectory generation that accounts for all aspects of vehicle and environment dynamics is numerically demanding, especially when the object to be avoided is moving. This paper presents a hierarchical method for avoiding moving objects with an autonomous vehicle, in which a reinforcement learning agent performs local planning, while longitudinal and lateral control are handled by a low-level model-predictive controller and a Stanley controller, respectively. In the proposed architecture, the agent is responsible for the optimization: it is trained in various scenarios and outputs, via a neural network, the parameters of a polynomial-based path and a velocity profile. The vehicle executes only the first step of each planned trajectory, which the planner then redesigns from the new state. During training, the vehicle executes the entire trajectory via the low-level controllers to determine the reward, which serves as a prediction of future behavior; the agent receives this feedback and can further improve its performance. Finally, the proposed framework was tested in a simulation environment and compared with the abilities of human drivers.
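The receding-horizon scheme described above, where the agent proposes a polynomial path, the vehicle executes only its first step, and the planner then replans from the new state, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the quintic boundary conditions, the `agent` interface, and the state representation are all assumptions introduced here for illustration.

```python
import numpy as np

def quintic_path(y0, yf, length, n=50):
    """Lateral offset along a quintic polynomial from y0 to yf over `length` metres.
    Zero slope and curvature are imposed at both ends (a common choice for
    polynomial-based avoidance paths; the paper's exact parameterization may differ)."""
    x = np.linspace(0.0, length, n)
    s = x / length
    # Quintic blend satisfying y(0)=y0, y(L)=yf, y'=y''=0 at both endpoints
    y = y0 + (yf - y0) * (10 * s**3 - 15 * s**4 + 6 * s**5)
    return x, y

def replan_loop(state, agent, steps=3):
    """Receding-horizon execution: at each step the agent (a stand-in for the
    trained neural network) proposes path parameters, the vehicle executes only
    the first step of the resulting path, and planning restarts from the new state."""
    history = [dict(state)]
    for _ in range(steps):
        yf, length = agent(state)              # hypothetical agent output
        xs, ys = quintic_path(state["y"], yf, length)
        # execute only the first increment of the freshly planned path
        state = {"x": state["x"] + xs[1], "y": ys[1]}
        history.append(dict(state))
    return history
```

In a training rollout, by contrast, the entire path returned by `quintic_path` would be tracked by the low-level controllers so that the reward over the full maneuver can be evaluated.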