Archive for the ‘Reinforcement Learning’ Category

Title: Towards Cooperation in Sequential Prisoner’s Dilemmas: a Deep Multiagent Reinforcement Learning Approach Authors: Weixun Wang, Jianye Hao, Yixi Wang, Matthew Taylor Categories: cs.AI cs.GT cs.LG cs.MA Comments: 13 pages, 21 figures The Iterated Prisoner’s Dilemma has guided research on social dilemmas for decades. However, it distinguishes between only two atomic actions: cooperate and […]

Title: Selective Experience Replay for Lifelong Learning Authors: David Isele, Akansel Cosgun Categories: cs.AI Comments: Presented in 32nd Conference on Artificial Intelligence (AAAI 2018) Deep reinforcement learning has emerged as a powerful tool for a variety of learning tasks, however deep nets typically exhibit forgetting when learning multiple tasks in sequence. To mitigate forgetting, we […]

Title: Modeling Others using Oneself in Multi-Agent Reinforcement Learning Authors: Roberta Raileanu, Emily Denton, Arthur Szlam, Rob Fergus Categories: cs.AI cs.LG Comments: 9 pages, 9 figures, submitted to ICML 2018 We consider the multi-agent reinforcement learning setting with imperfect information in which each agent is trying to maximize its own utility. The reward function depends […]

Title: Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration Authors: Evan Zheran Liu, Kelvin Guu, Panupong Pasupat, Tianlin Shi, Percy Liang Categories: cs.AI Comments: International Conference on Learning Representations (ICLR), 2018 Reinforcement learning (RL) agents improve through trial-and-error, but when reward is sparse and the agent cannot discover successful action sequences, learning stagnates. This […]

Title: Budget Constrained Bidding by Model-free Reinforcement Learning in Display Advertising Authors: Di Wu, Xiujun Chen, Xun Yang, Hao Wang, Qing Tan, Xiaoxun Zhang, Kun Gai Categories: cs.AI Real-time bidding (RTB) is almost the most important mechanism in online display advertising, where proper bid for each page view plays a vital and essential role […]

Title: Convergent Actor-Critic Algorithms Under Off-Policy Training and Function Approximation Authors: Hamid Reza Maei Categories: cs.AI We present the first class of policy-gradient algorithms that work with both state-value and policy function-approximation, and are guaranteed to converge under off-policy training. Our solution targets problems in reinforcement learning where the action representation adds to the-curse-of-dimensionality; […]

Title: Clipped Action Policy Gradient Authors: Yasuhiro Fujita and Shin-ichi Maeda Categories: cs.LG cs.AI stat.ML Many continuous control tasks have bounded action spaces and clip out-of-bound actions before execution. Policy gradient methods often optimize policies as if actions were not clipped. We propose clipped action policy gradient (CAPG) as an alternative policy gradient estimator that exploits the […]

Title: Continual Reinforcement Learning with Complex Synapses Authors: Christos Kaplanis, Murray Shanahan, Claudia Clopath Categories: cs.AI cs.LG cs.NE Unlike humans, who are capable of continual learning over their lifetimes, artificial neural networks have long been known to suffer from a phenomenon known as catastrophic forgetting, whereby new learning can lead to abrupt erasure of previously […]

Title: Reactive Reinforcement Learning in Asynchronous Environments Authors: Jaden B. Travnik, Kory W. Mathewson, Richard S. Sutton, Patrick M. Pilarski Categories: cs.AI cs.LG Comments: 11 pages, 7 figures, currently under journal peer review The relationship between a reinforcement learning (RL) agent and an asynchronous environment is often ignored. Frequently used models of the interaction between […]

Title: Monte Carlo Q-learning for General Game Playing Authors: Hui Wang, Michael Emmerich, Aske Plaat Categories: cs.AI Comments: 15 pages 6 figures Recently, the interest in reinforcement learning in game playing has been renewed. This is evidenced by the groundbreaking results achieved by AlphaGo. General Game Playing (GGP) provides a good testbed for reinforcement learning, currently […]