Prioritized Experience Replay in DRQN

10 minute read

Q-learning is a classic and well-studied reinforcement learning (RL) algorithm. Adding neural network Q-functions led to the milestone Deep Q-Network (DQN) algorithm that surpassed human performance on a suite of Atari games (Mnih et al. 2013). DQN is attractive due to its simplicity, but the most successful DQN-based algorithms tend to rely on many tweaks and improvements to achieve stability and good performance.

DQN and DRQN in partially observable gridworlds

11 minute read

RL agents whose policies use only feedforward neural networks have a limited capacity to accomplish tasks in partially observable environments. For such tasks, an agent may need to account for past observations or previous actions to implement a successful strategy.

Multi-agent gridworlds

7 minute read

Gridworlds are popular environments for RL experiments. Agents in gridworlds can move between adjacent tiles in a rectangular grid, and are typically trained to pursue rewards by solving simple puzzles in the grid. MiniGrid is a popular and flexible gridworld implementation that has been used in more than 20 publications.

Why I’m excited about MARL

10 minute read

I’m excited to be participating in the 2020 cohort of the OpenAI Scholars program. With the mentorship of Natasha Jaques, I’ll be spending the next few months studying multi-agent reinforcement learning (MARL) and periodically writing blog posts to document my progress. In this first post, I’ll discuss the reasons I’m excited about MARL and my plan for the Scholars program.


MineRL: Recurrent replay

4 minute read

I spent some time recently exploring reinforcement learning in the excellent MineRL Minecraft environments. I haven’t played much Minecraft, and I haven’t personally accomplished the holy grail objective of mining a diamond. The prospect of building a bot that can learn to accomplish a task that I haven’t completed – one that is as human-accessible as this – is incredibly exciting!