I present a derivation of the policy gradient theorem for homogeneous multiagent RL systems. The results are... boring.
An ongoing record of my implementations of Advent of Code 2020.
I describe the process of making this exact website, to help others avoid some of the frustrations I encountered along the way.