Archive for February, 2018

Title: Modeling Others using Oneself in Multi-Agent Reinforcement Learning Authors: Roberta Raileanu, Emily Denton, Arthur Szlam, Rob Fergus Categories: cs.AI cs.LG Comments: 9 pages, 9 figures, submitted to ICML 2018 We consider the multi-agent reinforcement learning setting with imperfect information in which each agent is trying to maximize its own utility. The reward function depends […]


Title: Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration Authors: Evan Zheran Liu, Kelvin Guu, Panupong Pasupat, Tianlin Shi, Percy Liang Categories: cs.AI Comments: International Conference on Learning Representations (ICLR), 2018 Reinforcement learning (RL) agents improve through trial-and-error, but when reward is sparse and the agent cannot discover successful action sequences, learning stagnates. This […]


Title: Budget Constrained Bidding by Model-free Reinforcement Learning in Display Advertising Authors: Di Wu, Xiujun Chen, Xun Yang, Hao Wang, Qing Tan, Xiaoxun Zhang, Kun Gai Categories: cs.AI Real-time bidding (RTB) is almost the most important mechanism in online display advertising, where proper bid for each page view plays a vital and essential role […]


Title: Convergent Actor-Critic Algorithms Under Off-Policy Training and Function Approximation Authors: Hamid Reza Maei Categories: cs.AI We present the first class of policy-gradient algorithms that work with both state-value and policy function-approximation, and are guaranteed to converge under off-policy training. Our solution targets problems in reinforcement learning where the action representation adds to the-curse-of-dimensionality; […]


Title: Clipped Action Policy Gradient Authors: Yasuhiro Fujita and Shin-ichi Maeda Categories: cs.LG cs.AI stat.ML Many continuous control tasks have bounded action spaces and clip out-of-bound actions before execution. Policy gradient methods often optimize policies as if actions were not clipped. We propose clipped action policy gradient (CAPG) as an alternative policy gradient estimator that exploits the […]


Title: Continual Reinforcement Learning with Complex Synapses Authors: Christos Kaplanis, Murray Shanahan, Claudia Clopath Categories: cs.AI cs.LG cs.NE Unlike humans, who are capable of continual learning over their lifetimes, artificial neural networks have long been known to suffer from a phenomenon known as catastrophic forgetting, whereby new learning can lead to abrupt erasure of previously […]


Title: Reactive Reinforcement Learning in Asynchronous Environments Authors: Jaden B. Travnik, Kory W. Mathewson, Richard S. Sutton, Patrick M. Pilarski Categories: cs.AI cs.LG Comments: 11 pages, 7 figures, currently under journal peer review The relationship between a reinforcement learning (RL) agent and an asynchronous environment is often ignored. Frequently used models of the interaction between […]


Title: Monte Carlo Q-learning for General Game Playing Authors: Hui Wang, Michael Emmerich, Aske Plaat Categories: cs.AI Comments: 15 pages 6 figures Recently, the interest in reinforcement learning in game playing has been renewed. This is evidenced by the groundbreaking results achieved by AlphaGo. General Game Playing (GGP) provides a good testbed for reinforcement learning, currently […]


Title: Reinforcement Learning from Imperfect Demonstrations Authors: Yang Gao, Huazhe (Harry) Xu, Ji Lin, Fisher Yu, Sergey Levine, Trevor Darrell Categories: cs.AI cs.LG stat.ML Robust real-world learning should benefit from both demonstrations and interactions with the environment. Current approaches to learning from demonstration and reward perform supervised learning on expert demonstration data and use reinforcement […]


No paper today in the digest about Deep Reinforcement Learning.