Last update: October 2023. All opinions are my own.

1. Overview

This project focuses on crafting a reward function to maximize speed while keeping the agent near the center line of an AWS DeepRacer track.

2. Reward Function Sketch

def reward_function(params):
    import math
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']
    reward = (1 / (math.sqrt(2 * math.pi * (track_width * 2 / 15) ** 2)))
    reward *= math.exp(-((distance_from_center + track_width / 10) ** 2
                  / (4 * (track_width * 2 / 15) ** 2)))
    return float(reward)

3. Training Strategy

  • Iterated on reward shaping to balance speed and stability.
  • Tested variations in simulation to reduce off-track behavior.
  • Focused on consistent lap completion rather than single-run peaks.

4. Outcomes

The work centers on creating a reward function that generalizes across tracks while maintaining competitive lap times.

5. Skills and Tools

  • Reinforcement learning
  • Game theory
  • Hyperparameter tuning