The RL Probabilist

A blog by Dibya Ghosh on RL, ML, and probability.

Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability

Plenty of empirical evidence shows that generalization in RL is hard in practice, but is this an issue with our implementations, or something more fundamental? This blog post explores one reason why generalization in RL is fundamentally hard: it turns fully-observed RL problems into more challenging partially-observed ones.

Learning to Reach Goals via Iterated Supervised Learning

This post provides a simple introduction to my recent paper and the algorithm we propose: Goal-conditioned Supervised Learning (GCSL).

Trouble in High-Dimensional Land

Most of the intuitions we build in 2D and 3D break down in higher dimensions, a core challenge for machine learning. So where do they break?
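One concrete way to see an intuition break (a quick sketch, not taken from the post): in low dimensions, a ball inscribed in a cube fills most of it, but in high dimensions almost none of the cube's volume lies inside the ball. The function name and sample counts below are illustrative choices.

```python
import random

random.seed(0)

def frac_inside_ball(d, n=20000):
    """Estimate the fraction of points drawn uniformly from the cube
    [-1, 1]^d that land inside the inscribed unit ball."""
    hits = 0
    for _ in range(n):
        if sum(random.uniform(-1, 1) ** 2 for _ in range(d)) <= 1.0:
            hits += 1
    return hits / n

for d in (2, 5, 10):
    print(d, frac_inside_ball(d))
```

In 2D the estimate is near pi/4 (about 0.79), but by d = 10 it has collapsed to roughly 0.002: nearly all of the cube's volume sits in its "corners", outside the ball.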

KL Divergence for Machine Learning

A writeup introducing KL divergence in the context of machine learning, its key properties, and an interpretation of reinforcement learning and machine learning as minimizing KL divergence.
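As a small accompanying sketch (my own illustration, not code from the writeup), KL divergence between two discrete distributions can be computed directly from its definition, which makes two of its basic properties easy to check: it is zero iff the distributions match, and it is asymmetric.

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) = sum_x p(x) * log(p(x) / q(x)).
    Assumes q(x) > 0 wherever p(x) > 0; terms with p(x) = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, p))  # 0.0: KL of a distribution with itself vanishes
print(kl_divergence(p, q), kl_divergence(q, p))  # two different values: KL is not symmetric
```

The asymmetry is exactly why "forward" and "reverse" KL objectives behave so differently in machine learning.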

An Introduction to Control as Inference

We introduce control-as-inference, an interpretation of reinforcement learning as inference in a probabilistic graphical model, which has driven many recent advances in deep RL.