<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet href="client.xsl" type="text/xsl"?>
<article article-type="other">
<front>
<journal-meta>
<journal-id/>
<issn/>
<banner>
<!--<href>banner.jpg</href>-->
<size width="100%"/>
</banner>
</journal-meta>
<article-meta>
<title-group>
<article-title>Transfer via Inter-Task Mappings in Policy Search Reinforcement Learning</article-title></title-group>

<author><a href="mailto:mtaylor@cs.utexas.edu"><name>Matthew E. Taylor</name></a></author>
<aff>Department of Computer Sciences <br/>The University of Texas at Austin <br/>Austin, Texas 78712-1188</aff>

<author><a href="mailto:shimon@cs.utexas.edu"><name>Shimon Whiteson</name></a></author>
<aff>Department of Computer Sciences <br/>The University of Texas at Austin <br/>Austin, Texas 78712-1188</aff>

<author><a href="mailto:pstone@cs.utexas.edu"><name>Peter Stone</name></a></author>
<aff>Department of Computer Sciences <br/>The University of Texas at Austin <br/>Austin, Texas 78712-1188</aff>
</article-meta></front>
<body>
<abstract>
<title>ABSTRACT</title>
<p>The ambitious goal of transfer learning is to accelerate learning
on a target task after training on a different, but related, source
task. While many past transfer methods have focused on transferring
value-functions, this paper presents a method for transferring
policies across tasks with different state and action spaces. In particular,
this paper utilizes transfer via inter-task mappings for policy
search methods (TVITM-PS) to construct a transfer functional that
translates a population of neural network policies trained via policy
search from a source task to a target task. Empirical results
in robot soccer Keepaway and Server Job Scheduling show that
TVITM-PS can markedly reduce learning time when full inter-task
mappings are available. The results also demonstrate that TVITM-PS
still succeeds when given only incomplete inter-task mappings.
Furthermore, we present a novel method for <italic>learning</italic> such mappings
when they are not available, and give results showing that the learned mappings
perform comparably to hand-coded mappings.</p>
</abstract>
<fpdf>
<href>pdflogo.jpg</href>
<hpdf>AAMAS07_0485_a595a73f6a0e209a7510782e0213cacf</hpdf>
</fpdf>
</body>
</article>

