
Table of Contents
Last update: October 2023. All opinions are my own.
1. Overview
This project focuses on crafting a reward function to maximize speed while keeping the agent near the center line of an AWS DeepRacer track.
2. Reward Function Sketch
def reward_function(params):
import math
track_width = params['track_width']
distance_from_center = params['distance_from_center']
reward = (1 / (math.sqrt(2 * math.pi * (track_width * 2 / 15) ** 2)))
reward *= math.exp(-((distance_from_center + track_width / 10) ** 2
/ (4 * (track_width * 2 / 15) ** 2)))
return float(reward)3. Training Strategy
- Iterated on reward shaping to balance speed and stability.
- Tested variations in simulation to reduce off-track behavior.
- Focused on consistent lap completion rather than single-run peaks.
4. Outcomes
The work centers on creating a reward function that generalizes across tracks while maintaining competitive lap times.
5. Skills and Tools
- Reinforcement learning
- Game theory
- Hyperparameter tuning
