<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet href="client.xsl" type="text/xsl"?>
<article article-type="other">
<front>
<journal-meta>
<journal-id/>
<issn/>
<banner>
<!--<href>banner.jpg</href>-->
<size width="100%"/>
</banner>
</journal-meta>
<article-meta>
<title-group>
<article-title>Confidence-Based Policy Learning from Demonstration Using Gaussian Mixture Models</article-title></title-group>

<author><a href="mailto:soniac@cs.cmu.edu"><name>Sonia Chernova</name></a></author>
<aff>Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA</aff>

<author><a href="mailto:veloso@cs.cmu.edu"><name>Manuela Veloso</name></a></author>
<aff>Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA</aff>
</article-meta></front>
<body>
<abstract>
<title>ABSTRACT</title>
<p>We contribute an approach for interactive policy learning through expert demonstration that allows an agent to actively request and effectively represent demonstration examples.
To address the inherent uncertainty of human demonstration, we represent the policy as <italic>a set of Gaussian mixture models</italic> (GMMs), where each model, with multiple Gaussian components, corresponds to a single action.
Demonstration examples, received incrementally, serve as training data for the GMM set.
We then introduce our <italic>confident execution</italic> approach, which focuses learning on relevant parts of the domain by enabling the agent to identify the need for, and request, demonstrations for specific parts of the state space.
The agent selects between demonstration and autonomous execution based on statistical analysis of the uncertainty of the learned Gaussian mixture set.
As it achieves proficiency at its task and gains confidence in its actions, the agent operates with increasing autonomy, eliminating the need for demonstrations of already acquired behavior and reducing both the training time and the demonstration workload of the expert.
We validate our approach with experiments in simulated and real robot domains.</p>
</abstract>
<fpdf>
<href>pdflogo.jpg</href>
<hpdf>AAMAS07_0558_07ea9f233b9f8e79e0c146b5c644d76f</hpdf>
</fpdf>
</body>
</article>

