arXiv on Feb. 21st


Title: Continual Reinforcement Learning with Complex Synapses
Authors: Christos Kaplanis, Murray Shanahan, Claudia Clopath
Categories: cs.AI cs.LG cs.NE

Unlike humans, who are capable of continual learning over their lifetimes,
artificial neural networks have long been known to suffer from a phenomenon
known as catastrophic forgetting, whereby new learning can lead to abrupt
erasure of previously acquired knowledge. Whereas in a neural network the
parameters are typically modelled as scalar values, an individual synapse in
the brain comprises a complex network of interacting biochemical components
that evolve at different timescales. In this paper, we show that by equipping
tabular and deep reinforcement learning agents with a synaptic model that
incorporates this biological complexity (Benna & Fusi, 2016), catastrophic
forgetting can be mitigated at multiple timescales. In particular, we find that
as well as enabling continual learning across sequential training of two simple
tasks, it can also be used to overcome within-task forgetting by reducing the
need for an experience replay database.
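
To make the idea of a synapse whose components "evolve at different timescales" concrete, here is a minimal sketch of a linear chain of coupled variables, loosely in the spirit of the Benna & Fusi (2016) model cited above. The function name `step_chain`, the geometric scaling of capacities and conductances, and the explicit Euler update are illustrative assumptions, not details taken from the abstract or the authors' implementation.

```python
def step_chain(u, external_input, dt=0.1):
    """One explicit Euler step of a chain of coupled hidden variables.

    u[0] plays the role of the visible weight; deeper variables change
    more slowly. The scaling constants below are assumed for illustration.
    """
    n = len(u)
    # Assumed: capacities grow and couplings shrink geometrically with depth,
    # so each successive variable responds on a longer timescale.
    capacity = [2.0 ** k for k in range(n)]
    conductance = [2.0 ** (-k - 2) for k in range(n - 1)]

    new_u = list(u)
    for k in range(n):
        flow = 0.0
        if k > 0:  # exchange with the faster neighbour
            flow += conductance[k - 1] * (u[k - 1] - u[k])
        if k < n - 1:  # exchange with the slower neighbour
            flow += conductance[k] * (u[k + 1] - u[k])
        if k == 0:  # learning updates enter only at the visible variable
            flow += external_input
        new_u[k] = u[k] + dt * flow / capacity[k]
    return new_u


# Usage: perturb the visible weight, then let the chain relax with no input.
u = [1.0, 0.0, 0.0, 0.0]
for _ in range(100):
    u = step_chain(u, external_input=0.0)
```

In this toy version a change to the visible weight leaks gradually into the slower variables, which then pull the weight back toward its older value when later updates push it elsewhere. That is one intuitive reading of how such a model could soften forgetting at several timescales.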

Title: Meta-Reinforcement Learning of Structured Exploration Strategies
Authors: Abhishek Gupta, Russell Mendonca, YuXuan Liu, Pieter Abbeel, Sergey
Categories: cs.LG cs.AI cs.NE

Exploration is a fundamental challenge in reinforcement learning (RL). Many
of the current exploration methods for deep RL use task-agnostic objectives,
such as information gain or bonuses based on state visitation. However, many
practical applications of RL involve learning more than a single task, and
prior tasks can be used to inform how exploration should be performed in new
tasks. In this work, we explore how prior tasks can inform an agent about how
to explore effectively in new situations. We introduce a novel gradient-based
fast adaptation algorithm — model agnostic exploration with structured noise
(MAESN) — to learn exploration strategies from prior experience. The prior
experience is used both to initialize a policy and to acquire a latent
exploration space that can inject structured stochasticity into a policy,
producing exploration strategies that are informed by prior knowledge and are
more effective than random action-space noise. We show that MAESN is more
effective at learning exploration strategies when compared to prior meta-RL
methods, RL without learned exploration strategies, and task-agnostic
exploration methods. We evaluate our method on a variety of simulated tasks:
locomotion with a wheeled robot, locomotion with a quadrupedal walker, and
object manipulation.
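
The contrast between "structured stochasticity" from a latent space and "random action-space noise" can be illustrated with a toy sketch. Everything below is hypothetical: the linear policy, the function names, and the choice to sample the latent variable once per episode are assumptions made for illustration, not the MAESN algorithm itself, and no meta-learning of the latent distribution is shown.

```python
import random


def linear_policy(state, latent, w_state=0.5, w_latent=1.0):
    """Toy deterministic policy conditioned on a state and a latent variable."""
    return w_state * state + w_latent * latent


def rollout_action_noise(states, rng, sigma=1.0):
    """Baseline: independent Gaussian noise added to every action."""
    return [linear_policy(s, 0.0) + rng.gauss(0.0, sigma) for s in states]


def rollout_latent_noise(states, rng, mu=0.0, sigma=1.0):
    """Assumed variant: sample one latent per episode and hold it fixed.

    In a meta-learning setup, mu and sigma would be learned from prior
    tasks; here they are plain constants.
    """
    z = rng.gauss(mu, sigma)
    return z, [linear_policy(s, z) for s in states]


# Usage: the same states produce jittery actions under action noise,
# but a consistently shifted trajectory under latent noise.
rng = random.Random(0)
states = [0.0, 1.0, 2.0]
noisy_actions = rollout_action_noise(states, rng)
z, latent_actions = rollout_latent_noise(states, rng)
```

Because the latent value is shared across all steps of an episode, the randomness shifts the whole trajectory coherently rather than perturbing each step independently.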

