Below is a detailed description of some of the projects that I have worked on.

MuZero-CPP

This is the first fully fledged, pure C++ implementation of the MuZero algorithm. While other open source projects exist, they are written in Python and don’t handle multiple devices, batched inference on the GPU, complex action representations, or fast multi-threaded inference well. The goal of this project was to address those shortcomings while staying entirely inside the C++ runtime, so that it can be used in environments that are restricted to C++.

ptutil

For my thesis research and personal projects, I end up writing a lot of PyTorch boilerplate code: configuration files, training loops, metric logging, checkpointing, and so on. Over time I’ve developed patterns and practices that I personally like, and I have structured many of my projects similarly. To allow for faster prototyping and experimentation in my thesis work, I’ve combined the common PyTorch boilerplate into a library I call ptu (pytorch-util).

Inspiration is taken from libraries such as PyTorch Lightning. While those libraries already exist and provide more features than I am willing to write myself, I have a preferred way of setting up my experiments, and I like having both control and the ease of adding personalized features. Features supported by the framework include:

  • Highly modular training, validation, and testing boilerplate code, allowing for personalized subroutines during any stage of training/testing
  • Multi-experiment configurations using gin-config
  • Custom callbacks that are quick to add; the ones implemented thus far are checkpoint loading/saving, early stopping, gradient clipping, logging, and metric tracking (TensorBoard)
  • A separate but similarly structured module for RL training
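The callback idea in the list above can be sketched in a few lines. This is a minimal illustration of the general pattern in plain Python, not ptu’s actual API: the names `Callback`, `MetricTracker`, `EarlyStopping`, `Trainer`, and the `on_epoch_end` hook are all hypothetical, and the "training" step is a stand-in that returns made-up validation losses.

```python
class Callback:
    """Hypothetical base class: hooks are no-ops, subclasses override what they need."""

    def on_epoch_end(self, trainer, epoch, metrics):
        pass


class MetricTracker(Callback):
    """Records the metrics reported at the end of each epoch."""

    def __init__(self):
        self.history = []

    def on_epoch_end(self, trainer, epoch, metrics):
        self.history.append(dict(metrics))


class EarlyStopping(Callback):
    """Requests a stop once the monitored metric fails to improve for `patience` epochs."""

    def __init__(self, monitor="val_loss", patience=2):
        self.monitor = monitor
        self.patience = patience
        self.best = float("inf")
        self.stale_epochs = 0

    def on_epoch_end(self, trainer, epoch, metrics):
        value = metrics[self.monitor]
        if value < self.best:
            self.best = value
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
            if self.stale_epochs >= self.patience:
                trainer.should_stop = True


class Trainer:
    """Hypothetical loop: runs one epoch, then hands the metrics to every callback."""

    def __init__(self, run_epoch, callbacks=()):
        self.run_epoch = run_epoch  # callable: epoch index -> dict of metrics
        self.callbacks = list(callbacks)
        self.should_stop = False

    def fit(self, max_epochs):
        epochs_run = 0
        for epoch in range(max_epochs):
            metrics = self.run_epoch(epoch)
            epochs_run += 1
            for callback in self.callbacks:
                callback.on_epoch_end(self, epoch, metrics)
            if self.should_stop:
                break
        return epochs_run


# Made-up validation losses standing in for a real training/validation step.
fake_val_losses = [1.0, 0.8, 0.9, 0.85, 0.7]
tracker = MetricTracker()
trainer = Trainer(
    run_epoch=lambda epoch: {"val_loss": fake_val_losses[epoch]},
    callbacks=[tracker, EarlyStopping(monitor="val_loss", patience=2)],
)
epochs_run = trainer.fit(max_epochs=len(fake_val_losses))
```

Because the loop only talks to callbacks through a hook, adding a new behaviour (say, gradient clipping or checkpointing) means writing one small class rather than editing the loop itself.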

As a warning, this has not been tested to the standard required for production use, but I have used it on many personal projects and for my thesis work. Feel free to play around with it or use it as a starting point for your own PyTorch framework!

Stones n Gems

I was given the opportunity to work with the DeepMind team responsible for the OpenSpiel framework to implement the environment I have been using for my thesis research. Previously, no suitable RL framework had the type of environment I was planning to use, and at the time I was using Rocks’n’Diamonds along with extensions that I had to write and maintain myself.

OpenSpiel is a collection of games and algorithms intended for research in reinforcement learning, search, and planning. It has an easy-to-use API for both Python and C++. The game I wrote, Stones n Gems, is a simplified mixture of common stone and gem games, such as Boulder Dash and Emerald Mine. It offers simple mechanics and interactions between objects, which are often chained together in clever ways to solve complex problems. The environment is also destructive, meaning that deadlocks can occur as a result of the agent’s actions. This is the environment I am currently using in my research.

Rocks n Diamonds

My thesis research deals with games like Boulder Dash and Emerald Mine. Since there were no implementations in RL frameworks, and Rocks’n’Diamonds was the only open source version with active support and many user-made levels, much of my early thesis work went into writing an extension to Rocks’n’Diamonds.

The goal of this project was to create an easy-to-use framework, primarily for myself but also for other researchers, for adding custom controllers that play the game. A lot of work also went into creating a headless version (no graphics), along with performance enhancements so that the engine could be used in simulations (for instance, by controllers based on MCTS).


Rocks’n’Diamonds is an open source arcade-style game written in C, based on Boulder Dash (Commodore 64), Emerald Mine (Amiga), Supaplex (Amiga/PC) and Sokoban (PC). This project uses the open source engine provided by the folks at Artsoft Entertainment and extends it with features such as:

  • Easily add your own AI controller to play the levels
  • Library functions to help access the engine state and query item locations/properties
  • Ability to replay the levels using the actions the AI agent made
  • Comprehensive logging

While the game engine is written in C, all of the wrapping code (including user-defined controllers) is written in C++.