<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet href="client.xsl" type="text/xsl"?>
<article article-type="other">
<front>
<journal-meta>
<journal-id/>
<issn/>
<banner>
<!--<href>banner.jpg</href>-->
<size width="100%"/>
</banner>
</journal-meta>
<article-meta>
<title-group>
<article-title>Reinforcement Learning in Extensive Form Games with Incomplete Information:<br/> the Bargaining Case Study</article-title>
</title-group>

<author><a href="mailto:lazaric@elet.polimi.it"><name>Alessandro Lazaric</name></a></author>
<aff>Politecnico di Milano, DEI, piazza Leonardo da Vinci 32, I20133, Milan, Italy</aff>

<author><a href="mailto:munoz@elet.polimi.it"><name>Enrique Munoz de Cote</name></a></author>
<aff>Politecnico di Milano, DEI, piazza Leonardo da Vinci 32, I20133, Milan, Italy</aff>

<author><a href="mailto:ngatti@elet.polimi.it"><name>Nicola Gatti</name></a></author>
<aff>Politecnico di Milano, DEI, piazza Leonardo da Vinci 32, I20133, Milan, Italy</aff>

</article-meta></front>
<body>
<abstract>
<title>ABSTRACT</title>
<p>We consider the problem of playing repeated extensive-form games in which agents do not have any prior. In this situation, classic game-theoretic tools are inapplicable, and it is common to resort to learning techniques. In this paper, we present a novel learning principle that aims to avoid the oscillations in agents' strategies induced by the presence of concurrent learners. We apply our algorithm to bargaining and evaluate it experimentally, showing that reinforcement learning algorithms using this principle can improve their convergence time.</p>
</abstract>
<fpdf>
<href>pdflogo.jpg</href>
<hpdf>AAMAS07_0338_3d4eb9f8ecf2a558586005eedc3f254a</hpdf>
</fpdf>
</body>
</article>

