---
title: "Generative / Causal / Hierarchical Model-Based Reinforcement Learning"
visibility: public
---

# Generative / Causal / Hierarchical Model-Based Reinforcement Learning

Category: [[machine-intelligence|Machine Intelligence]]

[Read the original document](https://docs.google.com/document/d/1xBImt4PZUeEra2By4JaRLw3dg8sqJ3znscd2sfFsjZk/edit?usp=drivesdk&sa=D&ust=1596495076463000&usg=AOvVaw1CzZCc1vBXi6XxsLu-o5Km)

---

1. Core Curriculum
   1. Learning pathway that leads to an understanding of the major approaches
2. Experiment Ideas
   1. Experiments I should run to improve capabilities in, and understanding of, the space
3. Philosophy
   1. The reasoning behind the relative importance of this approach
4. Papers & Books worth Reading
   1. Papers, organized by lab or by sub-topic or whatever.

Core Curriculum

Reinforcement Learning
1. Sutton & Barto. Ch. 1, 2, 3, 6 and 9.
2. Bertsekas. Dynamic Programming and Optimal Control.
3. Reinforcement Learning of motor skills with Policy Gradients

Model-Based Reinforcement Learning
1. Value Iteration Networks
2. World Models
3. On Learning to Think
4. Imagination-Augmented Agents for Deep Reinforcement Learning [Also, Planning]
5. Unsupervised Predictive Memory in a Goal-Directed Agent

Hierarchical Reinforcement Learning
1. FeUdal Networks for Hierarchical Reinforcement Learning

Generative Modeling
1. Tutorial on Variational Autoencoders

Causality
1. Pearl. Causality. Ch. 3, 4, 7 and 8.
2. Theoretical Impediments to Machine Learning with Seven Sparks from the Causal Revolution
3. Reinforcement Learning and Causal Models
4. Learning Graphs
   1. Learning Deep Generative Models of Graphs
   2. Grammar VAE
5. Woulda, Shoulda, Coulda: Counterfactually-Guided Policy Search

* Honglak’s Talk
* Generative World Models
* http://www.unofficialgoogledatascience.com/2017/01/causality-in-machine-learning.html
* NIPS Causality Workshop

Experiment Ideas

1. Use counterfactuals to learn causal relationships in a world-models style simulation of the environment.
   1. Potential Collaborators:
      1. David Ha
      2. Juergen Schmidhuber
      3. Honglak Lee
      4. Ashish Vaswani?
      5. Daniel Galvez (Wrote this)
      6. Imaginative Agents Sync
         1. Danijar Hafner
         2. Jacob Buckman
         3. Eugene Brevdo
         4. Jakob Uszkoreit
2. Use Grammar VAE (or another generative graph model) to generate a causal graph over a latent representation of the interactions between actions and the environment. Iteratively update the causal graph, along with the decision making and learning that operate over it.

Philosophy

This is a path to general problem solving. Simulation-based planning, especially once causality is integrated, uses a model of the world to predict which set of actions will lead to a desired outcome. Taking those actions then provides feedback on the quality of the world model.
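
The loop described above (plan inside the model, act, then score the model against what actually happened) can be sketched in a few lines. This is a minimal illustration, not a method from any of the papers listed here: the 1-D environment `true_step`, the single-parameter `LinearModel`, and the exhaustive-search `plan` function are all assumptions made up for the example.

```python
import itertools


def true_step(x, a):
    """Hypothetical real environment: each action moves the state by 0.5 * a."""
    return x + 0.5 * a


class LinearModel:
    """Assumed world model x' = x + gain * a, with one learnable gain."""

    def __init__(self, gain=0.1, lr=0.5):
        self.gain = gain  # deliberately wrong at the start
        self.lr = lr

    def predict(self, x, a):
        return x + self.gain * a

    def update(self, x, a, x_next):
        """One gradient step on squared prediction error; returns |error|."""
        error = self.predict(x, a) - x_next
        self.gain -= self.lr * error * a
        return abs(error)


def plan(model, x, goal, horizon=3, actions=(-1, 0, 1)):
    """Search all action sequences in the model; return the best first action."""

    def cost(sequence):
        state, total = x, 0.0
        for a in sequence:
            state = model.predict(state, a)
            total += abs(state - goal)  # distance to goal over imagined steps
        return total

    return min(itertools.product(actions, repeat=horizon), key=cost)[0]


def run(goal=2.0, steps=12):
    model, x, errors = LinearModel(), 0.0, []
    for _ in range(steps):
        a = plan(model, x, goal)                   # predict which actions reach the goal
        x_next = true_step(x, a)                   # act in the real environment
        errors.append(model.update(x, a, x_next))  # feedback on model quality
        x = x_next
    return x, model.gain, errors


final_state, learned_gain, errors = run()
```

In this toy setup the prediction error shrinks while the agent is moving, and the gain stops improving once the agent sits at the goal and picks the null action, since a zero action carries no information about the gain. That is one small reason exploration matters even when the plan already works.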

Hierarchical, model-based planning
Counterfactuals

Notes

Ways to represent a world model:
1. Latent Variable State Space Model
2. Next Frame / Continuous Control prediction (network)
3. RNN Cell / Hidden State as Model
4. Input Embedding
5. VAE Hidden State
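
As one concrete reading of the first option, a latent variable state-space model keeps a hidden state, advances it with the action, and decodes predicted observations from it. The sketch below is a scalar linear-Gaussian version; the class name `LatentSSM` and every coefficient are illustrative assumptions rather than anything taken from the papers in this list.

```python
import random
from dataclasses import dataclass


@dataclass
class LatentSSM:
    """Scalar linear-Gaussian state-space model (all parameters are made up)."""

    a: float = 0.9            # latent decay: z' = a * z + b * action + noise
    b: float = 0.5            # effect of the action on the latent state
    c: float = 2.0            # emission: observation = c * z + noise
    process_std: float = 0.0  # set above zero for a stochastic transition
    obs_std: float = 0.0      # set above zero for noisy observations

    def transition(self, z, action, rng):
        return self.a * z + self.b * action + rng.gauss(0.0, self.process_std)

    def emit(self, z, rng):
        return self.c * z + rng.gauss(0.0, self.obs_std)

    def imagine(self, z0, actions, seed=0):
        """Roll forward in latent space and decode the predicted observations."""
        rng = random.Random(seed)
        z, observations = z0, []
        for action in actions:
            z = self.transition(z, action, rng)
            observations.append(self.emit(z, rng))
        return observations


model = LatentSSM()
predicted = model.imagine(z0=1.0, actions=[0.0, 1.0])
```

A planner can compare candidate action sequences by calling `imagine` on each, without touching the real environment. The other options in the list differ mainly in where the latent state comes from, for example an RNN hidden state or a VAE encoding instead of a hand-specified `z`.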

At Brain, talk to:
1. Aurko Roy
2. Arvind Neelakantan
3. Ashish Vaswani
4. David Ha

Papers, By Lab:

Goal: Turn papers on the research frontier into a shortlist of methods for building up a model of the environment in reinforcement learning.

Papers

Brain

1. Unsupervised Learning for Physical Interaction through Video Prediction [Also, Robotics]
   1. https://arxiv.org/pdf/1605.07157.pdf
2. Continuous Deep Q-Learning with Model-based Acceleration
   1. http://proceedings.mlr.press/v48/gu16.pdf
3. Value Prediction Network
   1. https://arxiv.org/pdf/1707.03497.pdf
4. Learning to Generate Long-term Future via Hierarchical Prediction
   1. https://arxiv.org/pdf/1704.05831.pdf
5. Discrete Sequential Prediction of Continuous Actions for Deep RL
   1. https://arxiv.org/pdf/1705.05035.pdf
6. Deep Visual Foresight for Planning Robot Motion [Also, Robotics]
   1. https://arxiv.org/pdf/1610.00696.p
7. Stochastic Variational Video Prediction
   1. https://arxiv.org/pdf/1710.11252.pdf
8. Geometry-Based Next Frame Prediction from Monocular Video
   1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45984.pdf
9. Decomposing Motion and Content for Natural Video Sequence Prediction
   1. https://sites.google.com/a/umich.edu/rubenevillegas/iclr2017
10. Action-Conditional Video Prediction using Deep Networks in Atari Games
   1. http://papers.nips.cc/paper/5859-action-conditional-video-prediction-using-deep-networks-in-atari-games.pdf
11. World Models
   1. https://arxiv.org/pdf/1803.10122.pdf

DeepMind

1. Learning Model-Based Planning from Scratch [Also, Planning]
   1. https://arxiv.org/pdf/1707.06170.pdf
2. Recurrent Environment Simulators
   1. https://arxiv.org/pdf/1704.02254.pdf
3. Structure Learning in Motor Control: A Deep Reinforcement Learning Model [Also Transfer, Intuitive Physics]
   1. https://arxiv.org/pdf/1706.06827.pdf
4. Imagination-Augmented Agents for Deep Reinforcement Learning [Also, Planning]
   1. https://arxiv.org/abs/1707.06203
5. Continuous Deep Q-Learning with Model-based Acceleration
   1. https://arxiv.org/abs/1603.00748
6. Skip Context Tree Switching
   1. http://proceedings.mlr.press/v32/bellemare14.pdf
7. Bayes-Adaptive Simulation-Based Search with Value Function Approximation
   1. http://www0.cs.ucl.ac.uk/staff/d.silver/web/Publications_files/bafa.pdf
8. Learning and Querying Fast Generative Models for Reinforcement Learning
   1. https://arxiv.org/abs/1802.03006
9. Learning Model-Based Planning from Scratch
   1. https://arxiv.org/abs/1707.06170
10. Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images
   1. https://arxiv.org/pdf/1506.07365.pdf

Berkeley
1. Neural Network Dynamics for Model-based Deep Reinforcement Learning with Model-Free Tuning
   1. https://arxiv.org/pdf/1708.02596.pdf
2. Model-Based Reinforcement Learning with Neural Network Dynamics
   1. http://bair.berkeley.edu/blog/2017/11/30/model-based-rl/
3. Self-Supervised Visual Planning with Temporal Skip Connections
   1. https://arxiv.org/abs/1710.05268
4. Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning
   1. https://arxiv.org/abs/1703.03078
5. Deep Spatial Autoencoders for Visuomotor Learning
   1. http://rll.berkeley.edu/dsae/dsae.pdf
6. End-to-End Training of Deep Visuomotor Policies
   1. http://jmlr.org/papers/v17/15-522.html

Other
1. Bayesian Model-Based RL
   1. https://arxiv.org/pdf/1609.04436.pdf

Causality

Types of Causality

1. Counterfactual Simulation
   1. Hierarchical Forward Prediction
2. Time Series Relation + Relationships relative to trends / controls
3. Randomized Controlled Trial
   1. Pseudo-experiments
   2. Differences between groups that can be controlled for
4. Attribution
5. Probabilistic, Manipulative, Counterfactual and Structural Approaches
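
To make the contrast between a randomized controlled trial and purely observational group differences concrete, here is a small simulation. It is a sketch under invented assumptions: a hidden confounder `u` raises both the chance of treatment and the outcome, the true treatment effect is fixed at 1.0, and the function name `difference_in_means` is hypothetical.

```python
import random


def difference_in_means(n, randomized, seed=0):
    """Estimate the treatment effect as mean(treated) - mean(control)."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)  # hidden confounder
        if randomized:
            t = rng.random() < 0.5             # coin flip, independent of u
        else:
            t = u + rng.gauss(0.0, 1.0) > 0.0  # confounder drives treatment
        y = 1.0 * t + 2.0 * u + rng.gauss(0.0, 1.0)  # true effect of t is 1.0
        (treated if t else control).append(y)
    return sum(treated) / len(treated) - sum(control) / len(control)


observational = difference_in_means(20000, randomized=False)
experimental = difference_in_means(20000, randomized=True)
```

Under these assumed coefficients the observational estimate comes out well above the true effect, because treated units also tend to have a larger `u`, while the randomized estimate lands close to 1.0. An agent that chooses its own actions in a simulator is in the randomized position, which is part of the appeal of learning causal structure through a world model.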

Papers
1. On Causal and Anticausal Learning
   1. Schölkopf.
2. Imagination-Augmented Agents for Deep Reinforcement Learning [Also, Planning]
   1. https://arxiv.org/abs/1707.06203
3. Bandits with Unobserved Confounders: A Causal Approach
   1. http://ftp.cs.ucla.edu/pub/stat_ser/r460.pdf
4. Markov Decision Processes with Unobserved Confounders: A Causal Approach
   1. https://www.cs.purdue.edu/homes/eb/mdp-causal.pdf
5. Recurrent Environment Simulators
   1. https://arxiv.org/pdf/1704.02254.pdf
6. Learning Model-Based Planning from Scratch [Also, Planning]
   1. https://arxiv.org/pdf/1707.06170.pdf
7. Learning Plannable Representations with Causal InfoGAN
   1. https://arxiv.org/pdf/1807.09341.pdf
8. Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution
   1. https://arxiv.org/pdf/1801.04016.pdf

Recent Papers (Non-RL)
* Learning Representations for Counterfactual Inference
* Deep IV: A Flexible Approach for Counterfactual Prediction
* Causal Learning and Explanations of Deep Neural Networks via Autoencoded Activations
* Structure Agnostic Model, Causal Discovery, and Penalized Adversarial Learning
* CausalGAN: Learning Implicit Causal Generative Models with Adversarial Training
* Discovering Causal Signals in Images
* Causal Generative Neural Networks
* Transfer Learning for Estimating Causal Effects using Neural Networks
* Discovering Context Specific Causal Relationships
* The Deconfounded Recommender: A Causal Inference Approach to Recommendation
* The Blessings of Multiple Causes
* Theoretical Impediments to Machine Learning
* An Algorithmic Information Calculus for Causal Discovery and Reprogramming Systems
* Topological Causality in Dynamical Systems
* Recognising Top-Down Causation
* Scalable Linear Causal Inference for Irregularly Sampled Time Series with Long Range Dependencies

Sites and Presentations
1. Causal Inference in Statistics
   1. Judea Pearl
2. Causal Inference in Machine Learning
   1. Ricardo Silva
3. Stanford Encyclopedia of Philosophy - Causal Models
4. ICML Causal Inference Tutorial
   1. Uri Shalit & David Sontag
5. Causality in Machine Learning
   1. Omkar Muralidharan, Niall Cardin
6. Deep Learning Patterns

Books
1. The Direction of Time
   1. Reichenbach.
2. Causality: Models, Reasoning and Inference. 
   1. Judea Pearl

Thoughts
1. Simulating the world by predicting the next input over a time series across a hierarchy provides tremendous amounts of supervised data for building a causal world model.
   1. The question of whether an action causes an outcome can be answered by consulting the world model with and without the action.
   2. The hierarchical structure allows you to deal with the instability that comes from treating each incremental prediction as true and conditioning further predictions on it.
   3. This will also solve the common-sense knowledge problem.
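
The "with and without the action" query from the first thought can be sketched as two rollouts of the same model that share their noise and differ in a single action. Everything here is an assumption for illustration: `world_model` stands in for a learned model, and the dynamics, the null action of 0.0, and the function names are invented.

```python
import random


def world_model(state, action, noise):
    """Hypothetical one-step model: velocity decays, actions and noise add to it."""
    position, velocity = state
    velocity = 0.9 * velocity + action + noise
    return (position + velocity, velocity)


def rollout(state, actions, noises):
    for action, noise in zip(actions, noises):
        state = world_model(state, action, noise)
    return state


def action_effect(state, actions, t, null_action=0.0, seed=0):
    """Final-position difference between rollouts with and without actions[t]."""
    rng = random.Random(seed)
    noises = [rng.gauss(0.0, 0.1) for _ in actions]  # same noise in both rollouts
    without = list(actions)
    without[t] = null_action
    factual = rollout(state, actions, noises)
    counterfactual = rollout(state, without, noises)
    return factual[0] - counterfactual[0]


effect_of_push = action_effect((0.0, 0.0), [1.0, 0.0, 0.0], t=0)
effect_of_noop = action_effect((0.0, 0.0), [1.0, 0.0, 0.0], t=1)
```

Sharing the noise between the two rollouts is the design choice that makes this a counterfactual comparison rather than a comparison of two unrelated samples: any difference in the outcome is attributable to the one action that was changed. The answer is only as trustworthy as the model, which is where the concern about compounding prediction error comes in.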

---

*Source: [Original Google Doc](https://docs.google.com/document/d/1xBImt4PZUeEra2By4JaRLw3dg8sqJ3znscd2sfFsjZk/edit?usp=drivesdk&sa=D&ust=1596495076463000&usg=AOvVaw1CzZCc1vBXi6XxsLu-o5Km)*