Tags / AI
Projects
2019
- Spieeltjie: experiment with multi-agent RL on zero-sum differentiable games. Spieeltjie is a single-file package for doing simple experiments with multi-agent reinforcement learning on symmetric zero-sum games. For more information see “Open-ended learning in Symmetric Zero-Sum Games” and “A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning”. The name “spieeltjie” comes from the Afrikaans word for “tournament”. […] This first set of images shows trajectories when starting from a set of random …
- Blinker: OpenAI Gym wrapper for active observation. The blinker wrapper can wrap any Gym environment, adding a parallel observe action alongside the environment's own actions. Choosing observe costs a configurable amount of reward (i.e. it produces a negative reward), but is required to obtain a fresh observation. If observe is not chosen, the observation remains stale. This forces agents to choose the best times to observe, and to avoid observing when they can predict the relevant world state. Additionally, the render(human=true) method will show a visual indication when an …
- Wasserstein GAN DFL: contributed to Depth First Learning course on Wasserstein GANs. While tutoring at the 2019 Deep Learning Indaba, I got to know the multi-talented Cinjon Resnik, who is currently doing his PhD with Kyunghyun Cho at NYU. After the Indaba, Cinjon invited me to join Depth First Learning, an experiment he is running in distributed teaching and learning. One innovation that particularly resonated with me was the effort DFL makes to plot a path through “paper space”, as a way to explain a core idea or story. The story we chose as the backbone for …
- SOAP: feedback alignment & activity propagation in PyTorch. SOAP (Second Order Activity Propagation) is a package for experimenting with feedback alignment and activity propagation in PyTorch. It formed my project for the 3-week IBRO-Simons Computational Neuroscience Summer School in 2019. I’d like to share my journey over the few days I spent on this project, because it took me far afield and led to some strong opinions (always a good outcome). […] TODO: Describe it For more information on Feedback Alignment, see Lillicrap’s original …
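
To make the Blinker idea above concrete, here is a minimal sketch of an "active observation" wrapper. It is not the Blinker package's actual API: the class names BlinkerSketch and CountingEnv, the observe_cost parameter, and the (env_action, observe) tuple format are all hypothetical, chosen for illustration. It also assumes the classic Gym step signature that returns (observation, reward, done, info), and uses a toy environment so that it runs without Gym installed.

```python
class CountingEnv:
    """Toy stand-in for a Gym-style environment (hypothetical, for illustration)."""

    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # The state simply accumulates the action; every step pays reward 1.
        self.state += action
        done = self.state >= 10
        return self.state, 1.0, done, {}


class BlinkerSketch:
    """Adds a parallel boolean `observe` action to a Gym-style environment.

    Assumption: actions are (env_action, observe) tuples. This is a sketch of
    the idea, not the real Blinker interface.
    """

    def __init__(self, env, observe_cost=0.1):
        self.env = env
        self.observe_cost = observe_cost
        self.last_obs = None

    def reset(self):
        # The agent always gets a fresh observation at the start of an episode.
        self.last_obs = self.env.reset()
        return self.last_obs

    def step(self, action):
        env_action, observe = action
        obs, reward, done, info = self.env.step(env_action)
        if observe:
            # Pay for a fresh observation.
            self.last_obs = obs
            reward -= self.observe_cost
        # Without `observe`, the agent keeps seeing the stale observation.
        return self.last_obs, reward, done, info
```

With this sketch, stepping with (1, False) after a reset returns the stale observation 0 at full reward, while stepping with (1, True) returns the current state and deducts observe_cost from the reward.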