Arxiv on Feb. 16th

16Feb18

Title: Reinforcement Learning from Imperfect Demonstrations
Authors: Yang Gao, Huazhe (Harry) Xu, Ji Lin, Fisher Yu, Sergey Levine, Trevor
Darrell
Categories: cs.AI cs.LG stat.ML

Robust real-world learning should benefit from both demonstrations and
interactions with the environment. Current approaches to learning from
demonstration and reward perform supervised learning on expert demonstration
data and use reinforcement learning to further improve performance based on the
reward received from the environment. These tasks have divergent losses which
are difficult to jointly optimize and such methods can be very sensitive to
noisy demonstrations. We propose a unified reinforcement learning algorithm,
Normalized Actor-Critic (NAC), that effectively normalizes the Q-function,
reducing the Q-values of actions unseen in the demonstration data. NAC learns
an initial policy network from demonstrations and refines the policy in the
environment, surpassing the demonstrator’s performance. Crucially, both
learning from demonstration and interactive refinement use the same objective,
unlike prior approaches that combine distinct supervised and reinforcement
losses. This makes NAC robust to suboptimal demonstration data since the method
is not forced to mimic all of the examples in the dataset. We show that our
unified reinforcement learning algorithm can learn robustly and outperform
existing baselines when evaluated on several realistic driving games.
https://arxiv.org/abs/1802.05313 (7631kb)
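
The abstract does not spell out what "normalizing the Q-function" looks like. One plausible reading, assuming a maximum-entropy (soft Q-learning) style formulation, is that the policy is the Q-function normalized by a soft state value. The sketch below only illustrates that assumption; the function names and the temperature parameter `alpha` are hypothetical and not taken from the paper.

```python
import math

def soft_value(q_values, alpha=1.0):
    """Soft state value alpha * log(sum(exp(q / alpha))), computed stably."""
    m = max(q_values)  # subtract the max before exponentiating to avoid overflow
    return m + alpha * math.log(sum(math.exp((q - m) / alpha) for q in q_values))

def normalized_policy(q_values, alpha=1.0):
    """Action probabilities exp((Q - V) / alpha); they sum to 1 by construction."""
    v = soft_value(q_values, alpha)
    return [math.exp((q - v) / alpha) for q in q_values]

# Hypothetical Q-values for three actions in one state.
probs = normalized_policy([1.0, 2.0, 0.5])
```

Under this reading, raising the probability of a demonstrated action necessarily lowers the probability of the others, which matches the abstract's remark about reducing the Q-values of unseen actions.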

Title: From Gameplay to Symbolic Reasoning: Learning SAT Solver Heuristics in
  the Style of Alpha(Go) Zero
Authors: Fei Wang, Tiark Rompf
Categories: cs.AI

Despite the recent successes of deep neural networks in various fields such
as image and speech recognition, natural language processing, and reinforcement
learning, we still face big challenges in bringing the power of numeric
optimization to symbolic reasoning. Researchers have proposed different avenues
such as neural machine translation for proof synthesis, vectorization of
symbols and expressions for representing symbolic patterns, and coupling of
neural back-ends for dimensionality reduction with symbolic front-ends for
decision making. However, these initial explorations are still only point
solutions, and bear other shortcomings such as lack of correctness guarantees.
In this paper, we present our approach of casting symbolic reasoning as games,
and directly harnessing the power of deep reinforcement learning in the style
of Alpha(Go) Zero on symbolic problems. Using the Boolean Satisfiability (SAT)
problem as showcase, we demonstrate the feasibility of our method, and the
advantages of modularity, efficiency, and correctness guarantees.
https://arxiv.org/abs/1802.05340 (52kb)
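
To make "casting symbolic reasoning as games" concrete, here is a minimal sketch of how SAT could be framed as a single-player game: the state is a CNF formula, a move assigns one literal, and the game ends when every clause is satisfied or some clause becomes empty. The encoding (clauses as lists of signed integers) and the helper names are assumptions for illustration, not the authors' implementation.

```python
def simplify(clauses, literal):
    """Play one move: set `literal` to true and return the reduced formula."""
    result = []
    for clause in clauses:
        if literal in clause:
            continue  # clause is satisfied, so it leaves the game state
        # The opposite literal is now false and can no longer satisfy the clause.
        result.append([l for l in clause if l != -literal])
    return result

def status(clauses):
    """Classify a state as 'sat' (win), 'conflict' (loss), or 'open'."""
    if not clauses:
        return "sat"
    if any(len(clause) == 0 for clause in clauses):
        return "conflict"
    return "open"

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
formula = [[1, -2], [2, 3], [-1, -3]]
state = simplify(formula, 1)    # move: x1 = true
state = simplify(state, -3)     # move: x3 = false
state = simplify(state, 2)      # move: x2 = true
```

A learned heuristic would choose which literal to assign at each step; the game rules themselves stay exact, which is one way to read the abstract's claim about correctness guarantees.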

Title: Mean Field Multi-Agent Reinforcement Learning
Authors: Yaodong Yang, Rui Luo, Minne Li, Ming Zhou, Weinan Zhang, Jun Wang
Categories: cs.MA cs.AI cs.LG

Existing multi-agent reinforcement learning methods are limited typically to
a small number of agents. When the agent number increases largely, the learning
becomes intractable due to the curse of the dimensionality and the exponential
growth of user interactions. In this paper, we present Mean Field Reinforcement
Learning where the interactions within the population of agents are
approximated by those between a single agent and the average effect from the
overall population or neighboring agents; the interplay between the two
entities is mutually reinforced: the learning of the individual agent’s optimal
policy depends on the dynamics of the population, while the dynamics of the
population change according to the collective patterns of the individual
policies. We develop practical mean field Q-learning and mean field
Actor-Critic algorithms and analyze the convergence of the solution.
Experiments on resource allocation, Ising model estimation, and battle game
tasks verify the learning effectiveness of our mean field approaches in
handling many-agent interactions in population.
https://arxiv.org/abs/1802.05438 (729kb)
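
The core approximation described above, replacing many pairwise interactions with one interaction against an average effect, can be sketched as follows. This assumes discrete actions that are one-hot encoded and averaged over an agent's neighbors; the function name and the example data are hypothetical.

```python
def mean_action(neighbor_actions, num_actions):
    """Average of the neighbors' one-hot actions: an empirical action distribution."""
    if not neighbor_actions:
        raise ValueError("at least one neighbor is required")
    counts = [0.0] * num_actions
    for action in neighbor_actions:
        counts[action] += 1.0
    total = len(neighbor_actions)
    return [c / total for c in counts]

# Four neighbors choosing among three actions.
avg = mean_action([0, 1, 1, 2], num_actions=3)
```

An agent's Q-function can then take its own action and this fixed-size average as input, so the input size no longer grows with the number of agents.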

 


