I'm currently going through David Silver's Reinforcement Learning course.
I'm 10 years too late starting this, and it still feels as axiomatic as ever. I've written down most of the explanations he goes through and expanded on the sections I personally felt needed more time to think through and understand. I'll link the notes below as I progress through the lectures, in case anyone wants to skim them alongside his lectures and slides, which are entirely self-contained, by the way.
There might be a few notational inconsistencies, owing to the fact that I'm human, so just remember that the source material supersedes everything. If there's a conceptual flaw in my writing, feel free to reach out.