Arxiv on Feb. 28th

28Feb18

Title: Modeling Others using Oneself in Multi-Agent Reinforcement Learning
Authors: Roberta Raileanu, Emily Denton, Arthur Szlam, Rob Fergus
Categories: cs.AI cs.LG
Comments: 9 pages, 9 figures, submitted to ICML 2018

We consider the multi-agent reinforcement learning setting with imperfect
information in which each agent is trying to maximize its own utility. The
reward function depends on the hidden state (or goal) of both agents, so the
agents must infer the other players’ hidden goals from their observed behavior
in order to solve the tasks. We propose a new approach for learning in these
domains: Self Other-Modeling (SOM), in which an agent uses its own policy to
predict the other agent’s actions and update its belief of their hidden state
in an online manner. We evaluate this approach on three different tasks and
show that the agents are able to learn better policies using their estimate of
the other players’ hidden states, in both cooperative and adversarial settings.
https://arxiv.org/abs/1802.09640

Title: Reinforcement and Imitation Learning for Diverse Visuomotor Skills
Authors: Yuke Zhu, Ziyu Wang, Josh Merel, Andrei Rusu, Tom Erez, Serkan Cabi,
Saran Tunyasuvunakool, János Kramár, Raia Hadsell, Nando de Freitas,
Nicolas Heess
Categories: cs.RO cs.AI cs.LG
Comments: 13 pages, 6 figures

We propose a model-free deep reinforcement learning method that leverages a
small amount of demonstration data to assist a reinforcement learning agent. We
apply this approach to robotic manipulation tasks and train end-to-end
visuomotor policies that map directly from RGB camera inputs to joint
velocities. We demonstrate that our approach can solve a wide variety of
visuomotor tasks, for which engineering a scripted controller would be
laborious. Our experiments indicate that our reinforcement and imitation agent
achieves significantly better performance than agents trained with
reinforcement learning or imitation learning alone. We also illustrate that
these policies, trained with large visual and dynamics variations, can achieve
preliminary successes in zero-shot sim2real transfer. A brief visual
description of this work can be viewed at https://youtu.be/EDl8SQUNjj0
https://arxiv.org/abs/1802.09564

Title: Real-Time Bidding with Multi-Agent Reinforcement Learning in Display
  Advertising
Authors: Junqi Jin, Chengru Song, Han Li, Kun Gai, Jun Wang, Weinan Zhang
Categories: stat.ML cs.AI cs.LG

Real-time advertising allows advertisers to bid for each impression for a
visiting user. To optimize a specific goal such as maximizing the revenue led
by ad placements, advertisers not only need to estimate the relevance between
the ads and user’s interests, but most importantly require a strategic response
with respect to other advertisers bidding in the market. In this paper, we
formulate bidding optimization with multi-agent reinforcement learning. To deal
with a large number of advertisers, we propose a clustering method and assign
a strategic bidding agent to each cluster. A practical Distributed
Coordinated Multi-Agent Bidding (DCMAB) has been proposed and implemented to
balance the tradeoff between the competition and cooperation among advertisers.
The empirical study on our industry-scaled real-world data has demonstrated the
effectiveness of our modeling methods. Our results show that a cluster-based
bidding would largely outperform single-agent and bandit approaches, and the
coordinated bidding achieves better overall objectives than the purely
self-interested bidding agents.
https://arxiv.org/abs/1802.09756


