<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet href="client.xsl" type="text/xsl"?>
<article article-type="other">
<front>
<journal-meta>
<journal-id/>
<issn/>
<banner>
<!--<href>banner.jpg</href>-->
<size width="100%"/>
</banner>
</journal-meta>
<article-meta>
<title-group>
<article-title>Q-Value Functions for Decentralized POMDPs</article-title>
</title-group>

<author><a href="mailto:faolieho@science.uva.nl"><name>Frans A. Oliehoek</name></a></author>
<aff>Informatics Institute, University of Amsterdam, Kruislaan 403<br/> 1098 SJ Amsterdam, The Netherlands</aff>

<author><a href="mailto:vlassis@science.uva.nl"><name>Nikos Vlassis</name></a></author>
<aff>Informatics Institute, University of Amsterdam, Kruislaan 403<br/> 1098 SJ Amsterdam, The Netherlands</aff>
</article-meta></front>
<body>
<abstract>
<title>ABSTRACT</title>
<p>Planning in single-agent models such as MDPs and POMDPs
can be carried out using Q-value functions: a
(near-)optimal Q-value function is computed recursively by dynamic programming, and then a policy is
extracted from this value function. In this paper we study
whether similar Q-value functions can be defined in decentralized POMDP models (Dec-POMDPs), what the cost of
computing such value functions is, and how policies can be
extracted from such value functions. Using the framework
of Bayesian games, we argue that searching for the optimal Q-value function may be as costly as exhaustive policy
search. Then we analyze various approximate Q-value functions that allow efficient computation. Finally, we describe
a family of algorithms for extracting policies from such Q-value functions.</p>
</abstract>
<fpdf>
<href>pdflogo.jpg</href>
<hpdf>AAMAS07_0148_b86e432dcdf71f75f5c8edada8a5ae70</hpdf>
</fpdf>
</body>
</article>

