Arxiv on Feb. 6th

06Feb18

Title: Coordinated Exploration in Concurrent Reinforcement Learning
Authors: Maria Dimakopoulou, Benjamin Van Roy
Categories: cs.AI

We consider a team of reinforcement learning agents that concurrently learn
to operate in a common environment. We identify three properties – adaptivity,
commitment, and diversity – which are necessary for efficient coordinated
exploration and demonstrate that straightforward extensions to single-agent
optimistic and posterior sampling approaches fail to satisfy them. As an
alternative, we propose seed sampling, which extends posterior sampling in a
manner that meets these requirements. Simulation results investigate how
per-agent regret decreases as the number of agents grows, establishing
substantial advantages of seed sampling over alternative exploration schemes.
https://arxiv.org/abs/1802.01282

Title: IMPALA: Scalable Distributed Deep-RL with Importance Weighted
  Actor-Learner Architectures
Authors: Lasse Espeholt, Hubert Soyer, Remi Munos, Karen Simonyan, Volodymir
Mnih, Tom Ward, Yotam Doron, Vlad Firoiu, Tim Harley, Iain Dunning, Shane
Legg, Koray Kavukcuoglu
Categories: cs.LG cs.AI

In this work we aim to solve a large collection of tasks using a single
reinforcement learning agent with a single set of parameters. A key challenge
is to handle the increased amount of data and extended training time, which is
already a problem in single task learning. We have developed a new distributed
agent IMPALA (Importance-Weighted Actor Learner Architecture) that can scale to
thousands of machines and achieve a throughput rate of 250,000 frames per
second. We achieve stable learning at high throughput by combining decoupled
acting and learning with a novel off-policy correction method called V-trace,
which was critical for achieving learning stability. We demonstrate the
effectiveness of IMPALA for multi-task reinforcement learning on DMLab-30 (a
set of 30 tasks from the DeepMind Lab environment (Beattie et al., 2016)) and
Atari-57 (all available Atari games in Arcade Learning Environment (Bellemare
et al., 2013a)). Our results show that IMPALA achieves better performance
than previous agents, uses less data, and crucially exhibits positive
transfer between tasks as a result of its multi-task approach.
https://arxiv.org/abs/1802.01561
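
The abstract only names V-trace, so here is a minimal sketch of how the value targets can be computed for one trajectory, assuming the truncated importance-weighting form described in the IMPALA paper. The function name `vtrace_targets`, its argument names, and the default thresholds are illustrative assumptions, not the authors' code.

```python
import math


def vtrace_targets(values, bootstrap_value, rewards, log_rhos,
                   gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Sketch of V-trace value targets for a single trajectory.

    values          -- V(x_s) for s = 0..n-1 (list of floats)
    bootstrap_value -- V(x_n), the value estimate after the last step
    rewards         -- r_s for s = 0..n-1
    log_rhos        -- log(pi(a_s|x_s) / mu(a_s|x_s)), target vs. behaviour policy
    All names and defaults here are hypothetical, chosen for illustration.
    """
    n = len(rewards)
    next_values = list(values[1:]) + [bootstrap_value]
    targets = [0.0] * n
    acc = 0.0  # holds v_{s+1} - V(x_{s+1}) while walking backwards in time
    for s in reversed(range(n)):
        ratio = math.exp(log_rhos[s])
        rho = min(rho_bar, ratio)  # truncated weight on the TD error
        c = min(c_bar, ratio)      # truncated weight on the propagated trace
        delta = rho * (rewards[s] + gamma * next_values[s] - values[s])
        acc = delta + gamma * c * acc
        targets[s] = values[s] + acc
    return targets


# On-policy case (all log ratios zero): targets reduce to n-step returns.
example = vtrace_targets([0.5, 0.2], 1.0, [1.0, 2.0], [0.0, 0.0], gamma=0.5)
```

When the behaviour and target policies agree, every ratio is 1 and the targets are ordinary bootstrapped n-step returns. When the target policy assigns almost no probability to the actions taken, the truncated weights go to zero and the targets fall back to the current value estimates.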

Title: Multimodal Sentiment Analysis with Word-Level Fusion and Reinforcement
  Learning
Authors: Minghai Chen, Sen Wang, Paul Pu Liang, Tadas Baltrušaitis, Amir
Zadeh, Louis-Philippe Morency
Categories: cs.LG cs.AI cs.CL
Comments: ICMI 2017 Oral Presentation, Honorable Mention Award

With the increasing popularity of video sharing websites such as YouTube and
Facebook, multimodal sentiment analysis has received increasing attention from
the scientific community. Contrary to previous works in multimodal sentiment
analysis which focus on holistic information in speech segments such as bag of
words representations and average facial expression intensity, we develop a
novel deep architecture for multimodal sentiment analysis that performs
modality fusion at the word level. In this paper, we propose the Gated
Multimodal Embedding LSTM with Temporal Attention (GME-LSTM(A)) model that is
composed of two modules. The Gated Multimodal Embedding alleviates the
difficulties of fusion when there are noisy modalities. The LSTM with Temporal
Attention performs word level fusion at a finer fusion resolution between input
modalities and attends to the most important time steps. As a result, the
GME-LSTM(A) is able to better model the multimodal structure of speech through
time and perform better sentiment comprehension. We demonstrate the
effectiveness of this approach on the publicly-available Multimodal Corpus of
Sentiment Intensity and Subjectivity Analysis (CMU-MOSI) dataset by achieving
state-of-the-art sentiment classification and regression results. Qualitative
analysis of our model emphasizes the importance of the Temporal Attention Layer
in sentiment prediction because the additional acoustic and visual modalities
are noisy. We also demonstrate the effectiveness of the Gated Multimodal
Embedding in selectively filtering these noisy modalities out. Our results and
analysis open new areas in the study of sentiment analysis in human
communication and provide new models for multimodal fusion.
https://arxiv.org/abs/1802.00924
