---
title: "OpenAI Research Overview"
visibility: public
---

# OpenAI Research Overview

Category: [[technical|Technical]]

[Read the original document](https://docs.google.com/document/d/1BbHW2B6bygh8dYxrvYhGuH6x-eFgdDpdez9MZkrQntU/edit?usp=sharing&sa=D&ust=1596495076376000&usg=AOvVaw0tE_VOgmBbIzO92RfAwBvB)

<!-- gdoc-inlined -->

---

By Jeremy Nixon [jnixon2@gmail.com]. Nov 2017.

Categories: the domain in which each paper’s innovation is novel.

1. Reinforcement Learning
    1. Multi-Agent
    2. Exploration
    3. Imitation Learning
2. Deep Learning
3. Memory
4. Program Learning
5. Representation Learning
6. Variational Inference
7. Generative Models
8. Evolution
9. Applications
    1. Security / Safety
    2. Robotics
10. Environments

Papers by category:

1. Reinforcement Learning
    1. Multi-Agent
        1. Learning with Opponent-Learning Awareness
            1. https://arxiv.org/abs/1709.04326
        2. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
            1. https://arxiv.org/abs/1706.02275
        3. Emergence of Grounded Compositional Language in Multi-Agent Populations
            1. https://arxiv.org/abs/1703.04908
    2. Exploration
        1. Parameter Space Noise for Exploration
            1. https://arxiv.org/abs/1706.01905
        2. UCB and InfoGain Exploration via Q-Ensembles
            1. https://arxiv.org/abs/1706.01502
        3. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning
            1. https://arxiv.org/abs/1611.04717
        4. VIME: Variational Information Maximizing Exploration
            1. https://arxiv.org/abs/1605.09674
    3. Imitation Learning
        1. Third-Person Imitation Learning
            1. https://arxiv.org/abs/1703.01703
        2. One-Shot Imitation Learning
            1. https://arxiv.org/abs/1703.07326
    4. RL2: Fast Reinforcement Learning via Slow Reinforcement Learning
        1. https://arxiv.org/abs/1611.02779
    5. Teacher-Student Curriculum Learning
        1. https://arxiv.org/abs/1707.00183
    6. Equivalence Between Policy Gradients and Soft Q-Learning
        1. https://arxiv.org/abs/1704.06440
    7. Prediction and Control with Temporal Segment Models
        1. https://arxiv.org/abs/1703.04070
2. Deep Learning
    1. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
        1. https://arxiv.org/abs/1602.07868
3. Memory
    1. Hindsight Experience Replay [Also, Reinforcement Learning]
        1. https://arxiv.org/pdf/1707.01495.pdf
4. Program Learning
    1. Extensions and Limitations of the Neural GPU
        1. https://arxiv.org/abs/1611.00736
5. Representation Learning
    1. Variational Lossy Autoencoder
        1. https://arxiv.org/abs/1611.02731
6. Variational Inference
    1. Improving Variational Inference with Inverse Autoregressive Flow
        1. https://arxiv.org/abs/1606.04934
7. Generative Models
    1. Generative Adversarial Networks
        1. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets [Also, Representation Learning]
            1. https://arxiv.org/abs/1606.03657
        2. Improved Techniques for Training GANs
            1. https://arxiv.org/abs/1606.03498
    2. On the Quantitative Analysis of Decoder-Based Generative Models
        1. https://arxiv.org/abs/1611.04273
    3. A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models [Also, Reinforcement Learning]
        1. https://arxiv.org/pdf/1611.03852.pdf
    4. PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications
        1. https://arxiv.org/abs/1701.05517
    5. Learning to Generate Reviews and Discovering Sentiment
        1. https://arxiv.org/abs/1704.01444
8. Evolution
    1. Evolution Strategies as a Scalable Alternative to Reinforcement Learning
        1. https://arxiv.org/abs/1703.03864
9. Applications
    1. Security / Safety
        1. Deep Reinforcement Learning from Human Preferences
            1. https://arxiv.org/abs/1706.03741
        2. Concrete Problems in AI Safety
            1. https://arxiv.org/abs/1606.06565
        3. Adversarial Attacks on Neural Network Policies
            1. https://arxiv.org/abs/1702.02284
        4. Adversarial Training Methods for Semi-Supervised Text Classification
            1. https://arxiv.org/abs/1605.07725
        5. Semi-Supervised Knowledge Transfer for Deep Learning from Private Training Data
            1. https://arxiv.org/abs/1610.05755
        6. Debate Amplification
            1. https://arxiv.org/pdf/1805.00899.pdf
    2. Robotics
        1. Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World
            1. https://arxiv.org/abs/1703.06907
        2. Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model
            1. https://arxiv.org/abs/1610.03518
10. Environments
    1. Infrastructure for Deep Learning
        1. https://blog.openai.com/infrastructure-for-deep-learning/
    2. Universe
        1. https://blog.openai.com/universe/
    3. OpenAI Gym
        1. https://arxiv.org/abs/1606.01540

OpenAI Researchers

1. Paul Christiano
2. Ryan Lowe
3. Jean Harb
4. Pieter Abbeel
5. Igor Mordatch x
6. Matthias Plappert
7. Rein Houthooft x
8. Prafulla Dhariwal
9. Szymon Sidor
10. Richard Y. Chen
11. Xi Chen
12. Marcin Andrychowicz x
13. John Schulman
14. Alec Radford
15. Rafal Jozefowicz
16. Yan Duan
17. Bradly C. Stadie
18. Jonathan Ho
19. Jonas Schneider
20. Ilya Sutskever
21. Wojciech Zaremba
22. Rachel Fong
23. Josh Tobin
24. Alex Ray
25. Nikhil Mishra
26. Ian Goodfellow
27. Tim Salimans
28. Diederik P. Kingma
29. Andrej Karpathy
30. Yuri Burda
31. Zain Shah
32. Trevor Blackwell
33. Vicki Cheung

Salaries of top employees [Pg. 28]

Hours & Salaries of top employees [Pg. 7]

OpenAI spent 11 million in 2016, 7 million of it on salaries. For comparison, DeepMind spent 138 million in 2016.

---

*Source: [Original Google Doc](https://docs.google.com/document/d/1BbHW2B6bygh8dYxrvYhGuH6x-eFgdDpdez9MZkrQntU/edit?usp=sharing&sa=D&ust=1596495076376000&usg=AOvVaw0tE_VOgmBbIzO92RfAwBvB)*