AAMAS 2011 - May 2-6, 2011 - Taipei, Taiwan

Full Papers

Session BP1 – Best Papers Session I


Building on research previously reported at AAMAS conferences, this paper describes an innovative application of a novel game-theoretic approach for a national-scale security deployment. Working with the United States Transportation Security Administration (TSA), we have developed a new application called GUARDS to assist in resource allocation tasks for airport protection at over 400 United States airports. In contrast with previous efforts such as ARMOR and IRIS, which focused on one-off tailored applications and one security activity (e.g., canine patrols or checkpoints) per application, GUARDS faces three key issues: (i) reasoning about hundreds of heterogeneous security activities; (ii) reasoning over diverse potential threats; (iii) developing a system designed for hundreds of end-users. Since a national deployment precludes tailoring to specific airports, our key ideas are: (i) creating a new game-theoretic framework that allows for heterogeneous defender activities and compact modeling of a large number of threats; (ii) developing an efficient solution technique based on general-purpose Stackelberg game solvers; (iii) taking a partially centralized approach for knowledge acquisition and development of the system. In doing so we develop a software scheduling assistant, GUARDS, designed to reason over two agents – the TSA and a potential adversary – and allocate the TSA's limited resources across hundreds of security activities in order to provide protection within airports. The scheduling assistant has been delivered to the TSA and is currently under evaluation and testing for scheduling practices at an undisclosed airport. If successful, the TSA intends to incorporate the system into their unpredictable scheduling practices nationwide. In this paper we discuss the design choices and challenges encountered during the implementation of GUARDS. GUARDS represents promising potential for transitioning years of academic research into a nationally deployed system.
GUARDS - Game Theoretic Security Allocation on a National Scale James Pita, Milind Tambe, Christopher Kiekintveld, Shane Cullen, Erin Steigerwald

Session BP2 – Best Papers Session II


Overlapping Coalition Formation (OCF) games are cooperative games where the players can simultaneously participate in several coalitions. Capturing the notion of stability in OCF games is a difficult task: a player may deviate by abandoning some, but not all, of the coalitions he is involved in, and the crucial question is whether he then gets to keep his payoff from the unaffected coalitions. In related work the authors introduce three stability concepts for OCF games – the conservative, refined, and optimistic core – that are based on different answers to this question. In this paper, we propose a unified framework for the study of stability in the OCF setting, which encompasses the concepts considered previously as well as a wide variety of alternative stability concepts. Our approach is based on the notion of an arbitrator, which can be thought of as an external party that determines payoffs to deviators. We give a complete characterization of outcomes that are stable under arbitration. In particular, our results provide a criterion for an outcome to be in the refined or optimistic core, thus complementing previous results for the conservative core, and answering questions left open previously. We also introduce a notion of the nucleolus for arbitrated OCF games, and argue that it is non-empty. Finally, we extend the definition of the Shapley value to the OCF setting, and provide an axiomatic characterization for it. Arbitrators in Overlapping Coalition Formation Games Yair Zick, Edith Elkind

Session A1 – Robotics

Session B1 – Distributed Problem Solving I


Distributed systems can often be modeled as a collection of distributed (system) variables whose values are constrained by a set of constraints. In distributed multi-agent systems, the set of variables occurring at a site (subsystem) is usually viewed as controllable by a local agent. This agent assigns values to the variables, and the aim is to provide distributed methods enabling a set of agents to come up with a global assignment (solution) that satisfies all the constraints. Alternatively, the system might be understood as a distributed database. Here, the focus is on ensuring consistency of the global system if local constraints (the distributed parts of the database) change. In this setting, the aim is to determine whether the existence of a global solution can be guaranteed. In other settings (e.g., P2P systems, sensor networks), the values of the variables might be completely out of control of the individual systems, and the constraints only characterize globally normal states or behavior of the system. In order to detect anomalies, one specifies distributed methods that can efficiently indicate violations of such constraints. The aim of this paper is to show that the following three main problems identified in these research areas are in fact identical: (i) the problem of ensuring that independent agents come up with a global solution; (ii) the problem of ensuring that global consistency is maintained if local constraint stores change; and (iii) the problem of ensuring that global violations can be detected by local nodes. This claim is made precise by developing a decomposition framework for distributed constraint systems and then extracting preservation properties that must be satisfied in order to solve the above-mentioned problems.
Although satisfying the preservation properties seems to require different decomposition modes, our results demonstrate that in fact these decomposition properties are equivalent, thereby showing that the three main problems identified above are identical. We then show that the complexity of finding such decompositions is polynomially related to finding solutions for the original constraint system, which explains the popularity of decomposition applied to tractable constraint systems. Finally, we address the problem of finding optimal decompositions and show that even for tractable constraint systems, this problem is hard. Decomposing Constraint Systems: Equivalences and Computational Properties Wiebe van der Hoek, Cees Witteveen, Michael Wooldridge


We consider the fundamental problem of reaching consensus in multiagent systems; an operation required in many applications such as, among others, vehicle formation and coordination, shape formation in modular robotics, distributed target tracking, and environmental modeling. To date, the consensus problem (the problem where agents have to agree on their reported values) has been typically solved with iterative decentralized algorithms based on graph Laplacians. However, the convergence of these existing consensus algorithms is often too slow for many important multiagent applications, and thus they are increasingly being combined with acceleration methods. Unfortunately, state-of-the-art acceleration techniques require parameters that can be optimally selected only if complete information about the network topology is available, which is rarely the case in practice. We address this limitation by deriving two novel acceleration methods that can deliver good performance even if little information about the network is available. The first proposed algorithm is based on the Chebyshev semi-iterative method and is optimal in a well-defined sense; it maximizes the worst-case convergence speed (in the mean sense) given that only rough bounds on the extremal eigenvalues of the network matrix are available. It can be applied to systems where agents use unreliable communication links, and its computational complexity is similar to that of simple Laplacian-based methods. This algorithm requires synchronization among agents, so we also propose an asynchronous version that approximates the output of the synchronous algorithm. Mathematical analysis and numerical simulations show that the convergence speed of the proposed acceleration methods decreases gracefully in scenarios where the sole use of Laplacian-based methods is known to be impractical. Consensus Acceleration in Multiagent Systems with the Chebyshev Semi-Iterative Method Renato L.G. Cavalcante, Alex Rogers, Nicholas R. Jennings
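The idea behind Chebyshev acceleration of Laplacian consensus can be illustrated with a minimal sketch (this is not the authors' implementation; the graph, step size, and eigenvalue bound are illustrative). Plain consensus iterates x ← Wx with W = I − εL; the Chebyshev semi-iteration reuses the same matrix-vector product but only needs a rough bound rho on the magnitude of the non-unit eigenvalues of W:

```python
# Consensus on a 5-node path graph: plain Laplacian iteration vs. a
# Chebyshev semi-iteration that assumes only a loose spectral bound rho.
n = 5
neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]
eps = 0.3  # step size; spectrum of W = I - eps*L lies within (-1, 1]

def apply_W(x):
    # One consensus step: x <- (I - eps*L) x, using only local neighbor sums.
    return [x[i] - eps * (len(neighbors[i]) * x[i] - sum(x[j] for j in neighbors[i]))
            for i in range(n)]

x0 = [5.0, 1.0, 3.0, 9.0, 2.0]
avg = sum(x0) / n
err = lambda x: max(abs(v - avg) for v in x)  # distance from consensus

K = 30
x = x0[:]
for _ in range(K):  # plain Laplacian iteration
    x = apply_W(x)

rho = 0.9  # rough bound on |non-unit eigenvalues| of W; exact values unknown
y_prev, y = x0[:], apply_W(x0)
omega = 2.0
for _ in range(K - 1):  # Chebyshev semi-iteration (three-term recurrence)
    omega = 1.0 / (1.0 - 0.25 * rho * rho * omega)
    Wy = apply_W(y)
    y_prev, y = y, [omega * (Wy[i] - y_prev[i]) + y_prev[i] for i in range(n)]
```

Both loops preserve the average, but after the same number of matrix-vector products the Chebyshev iterate is far closer to consensus, which is the effect the abstract describes.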

Session C1 – Game Theory I

Session D1 – Multiagent Learning

Session A2 – Logic-Based Approaches I


An abstract argumentation framework and its semantics, often called Dungean semantics, give a general framework for nonmonotonic logics. In the last fifteen years, a great number of papers in computational argumentation have adopted Dungean semantics as a fundamental principle for evaluating various kinds of defeasible consequences. Recently, many papers address problems not only of theoretical reasoning, i.e., reasoning about what to believe, but also of practical reasoning, i.e., reasoning about what to do. This paper proposes a practical argumentation semantics specific to practical argumentation. This is motivated by our hypothesis that consequences of such argumentation should satisfy Pareto optimality because the consequences strongly depend on the desires, aims, or values an individual agent or a group of agents has. We define a practical argumentation framework and two kinds of extensions, preferred and grounded extensions, with respect to each group of agents. We show that evaluating Pareto optimality can be translated to evaluating preferred extensions of a particular practical argumentation framework. Furthermore, we show that our semantics is a natural extension of Dungean semantics in terms of considering more than one defeat relation. We give a generality order of four practical argumentation frameworks specified by taking into account Dungean semantics and Pareto optimality. We show that a member of the preferred extensions of the most specific one is not just Pareto optimal but also theoretically justified. Practical Argumentation Semantics for Socially Efficient Defeasible Consequence Hiroyuki Kido, Katsumi Nitta

Session B2 – Agent-Based System Development I

Session C2 – Social Choice Theory


Strategyproof (SP) classification considers situations in which a decision-maker must classify a set of input points with binary labels, minimizing expected error. Labels of input points are reported by self-interested agents, who may lie so as to obtain a classifier more closely matching their own labels. These lies would create a bias in the data, and thus motivate the design of truthful mechanisms that discourage false reporting. We here answer questions left open by previous research on strategyproof classification, in particular regarding the best approximation ratio (in terms of social welfare) that an SP mechanism can guarantee for n agents. Our primary result is a lower bound of 3 - 2/n on the approximation ratio of SP mechanisms under the shared inputs assumption; this shows that the previously known upper bound (for uniform weights) is tight. The proof relies on a result from Social Choice theory, showing that any SP mechanism must select a dictator at random, according to some fixed distribution. We then show how different randomizations can improve the best known mechanism when agents are weighted, matching the lower bound with a tight upper bound. These results contribute both to a better understanding of the limits of SP classification and to the development of similar tools in other, related domains such as SP facility location. Tight Bounds for Strategyproof Classification Reshef Meir, Shaull Almagor, Assaf Michaely, Jeffrey S. Rosenschein

Session D2 – Preferences and Strategies

Session A3 – Distributed Problem Solving II


In this paper we address efficient decentralised coordination of cooperative multi-agent systems by taking into account the actual computation and communication capabilities of the agents. We consider coordination problems that can be framed as Distributed Constraint Optimisation Problems, and as such, are suitable to be deployed on large scale multi-agent systems such as sensor networks or multiple unmanned aerial vehicles. Specifically, we focus on techniques that exploit structural independence among agents' actions to provide optimal solutions to the coordination problem, and, in particular, we use the Generalized Distributive Law (GDL) algorithm. In this setting, we propose a novel resource-aware heuristic to build junction trees and to schedule GDL computations across the agents. Our goal is to minimise the total running time of the coordination process, rather than the theoretical complexity of the computation, by explicitly considering the computation and communication capabilities of agents. We evaluate our proposed approach against DPOP, RDPI and a centralized solver on a number of benchmark coordination problems, and show that our approach is able to provide optimal solutions for DCOPs faster than previous approaches. Specifically, in the settings considered, when resources are scarce our approach is up to three times faster than DPOP (which proved to be the best among the competitors in our settings). Resource-Aware Junction Trees for Efficient Multi-Agent Coordination N. Stefanovitch, A. Farinelli, Alex Rogers, Nicholas R. Jennings

Session B3 – Agent-Based System Development II


Multi-agent systems form the basis of many innovative large-scale distributed applications. The development of such applications requires a careful balance of a wide range of concerns: a detailed understanding of the behaviour of the abstract algorithms being employed, a knowledge of the effects and costs of operating in a distributed environment, and an expertise in the performance requirements of the application itself. Experimental work plays a key role in the process of designing such systems. This paper examines the multi-agent systems development cycle from a distributed systems perspective. A survey of recent experimental studies finds that a large proportion of work on the design of multi-agent systems is focused on the analytical and simulation phases of development. This paper advocates an alternative, more comprehensive development cycle, which extends from theoretical studies to simulations, emulations, demonstrators and finally staged deployment. AgentScope, a tool that supports the experimental stages of multi-agent systems development and facilitates long-term dispersed research efforts, is introduced. AgentScope consists of a small set of interfaces on which experimental work can be built independently of a particular type of platform. The aim is to make not only agent code but also experimental scenarios and metrics reusable, both between projects and over simulation, emulation and demonstration platforms. An example gossip-based sampling experiment demonstrates reusability, showing the ease with which an experiment can be defined, modified into a comparison study, and ported between a simulator and an actual agent-operating system. AgentScope: Multi-Agent Systems Development in Focus Elth Ogston, Frances Brazier

Session C3 – Bounded Rationality


In many settings and for various reasons, people fail to make optimal decisions. These factors also influence the agents people design to act on their behalf in such virtual environments as eCommerce and distributed operating systems, so that the agents also act sub-optimally despite their greater computational capabilities. In some decision-making situations it is theoretically possible to supply the optimal strategy to people or their agents, but this optimal strategy may be non-intuitive, and providing a convincing explanation of optimality may be complex. This paper explores an alternative approach to improving the performance of a decision-maker in such settings: the data on choices is manipulated to guide searchers to a strategy that is closer to optimal. This approach was tested for sequential search, which is a classical sequential decision-making problem with broad areas of applicability (e.g., product search, partnership search). The paper introduces three heuristics for manipulating choices, including one for settings in which repeated interaction or access to a decision-maker's past history is available. The heuristics were evaluated on a large population of computer agents, each of which embodies a search strategy programmed by a different person. Extensive tests on thousands of search settings demonstrate the promise of the problem-restructuring approach: despite a minor degradation in performance for a small portion of the population, the overall and average individual performance improve substantially. The heuristic that adapts based on a decision-maker's history achieved the best results. Less Is More: Restructuring Decisions to Improve Agent Search David Sarne, Avshalom Elmalech, Barbara J. Grosz, Moti Geva

Session D3 – Virtual Agents I


Narrative time has an important role to play in Interactive Storytelling (IS). The prevailing approach to controlling narrative time has been to use implicit models that allow only limited temporal reasoning about virtual agent behaviour. In contrast, this paper proposes the use of an explicit model of narrative time which provides a control mechanism that enhances narrative generation, orchestration of virtual agents, and the number of possibilities for the staging of agent actions. This approach can help address a number of problems experienced in IS systems both at the level of execution staging and at the level of narrative generation. Consequently, it has a number of advantages: it is more flexible with respect to the staging of virtual agent actions; it reduces the possibility of timing problems in the coordination of virtual agents; and it enables more expressive representation of narrative worlds and greater narrative generative power. Overall it provides a uniform, consistent, principled and rigorous approach to the problem of time in agent-based storytelling. In the paper we demonstrate how this approach to controlling narrative time can be implemented within an IS system and illustrate this using our fully implemented IS system that features virtual agents inspired by Shakespeare's The Merchant of Venice. The paper presents results of an experimental evaluation with the system that demonstrates the use of this approach to co-ordinate the actions of virtual agents and to increase narrative generative power. Controlling Narrative Time in Interactive Storytelling Julie Porteous, Jonathan Teutenberg, Fred Charles, Marc Cavazza

Session A4 – Agent Communication


Commitments provide a flexible means for specifying the business relationships among autonomous and heterogeneous agents, and lead to a natural way of enacting such relationships. However, current formalizations of commitments incorporate conditions expressed as propositions, but disregard (1) temporal regulations and (2) an agent's control over such regulations. Thus, they cannot handle realistic application scenarios where time and control are often central because of domain conventions or other requirements. We propose a new formalization of commitments that builds on an existing representation of events in which we can naturally express temporal regulations as well as what an agent can control, including indirectly as based on the commitments and capabilities of other agents. Our formalization supports a notion of commitment safety. A benefit of our consolidated approach is that by incorporating these considerations into commitments we enable agents to reason about and flexibly enact the regulations. The main contributions of this paper include (1) a formal semantics of commitments that accommodates temporal regulations; (2) a formal semantics of the notions of innate and social control; and (3) a formalization of when a temporal commitment is safe for its debtor. We evaluate our contributions using an extensive case study. Commitments with Regulations: Reasoning about Safety and Control in Regula Elisa Marengo, Matteo Baldoni, Cristina Baroglio, Amit K. Chopra, Viviana Patti, Munindar P. Singh


Communication is a key capability of autonomous agents in a multiagent system to exchange information about their environment. It requires a naming convention that typically involves a set of predefined names for all objects in the environment, which the agents share and understand. However, when the agents are heterogeneous, highly distributed, and situated in an unknown environment, it is very unrealistic to assume that all the objects can be foreseen in advance, and therefore their names cannot be defined beforehand. In such a case, each individual agent needs to be able to introduce new names for the objects it encounters and align them with the naming convention used by the other agents. A language game is a prospective mechanism for the agents to learn and align the naming conventions between them. In this paper we extend the language game model by proposing novel strategies for selecting topics, i.e. attracting agents' attention to different objects during the learning process. Using a simulated multi-agent system we evaluate the process of name alignment in the case of the least restrictive type of language game, the naming game without feedback. Utilising the proposed strategies we study the dynamic character of the formation of coherent naming conventions and compare it with the behaviour of the commonly used random selection strategy. The experimental results demonstrate that the new strategies improve the overall convergence of the alignment process, limit agents' overall demand on memory, and scale with the increasing number of interacting agents. On Topic Selection Strategies in Multi-Agent Naming Game Wojciech Lorkiewicz, Ryszard Kowalczyk, Radoslaw Katarzyniak, Quoc Bao Vo
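The baseline the abstract compares against, a naming game without feedback for a single object, can be sketched in a few lines (a toy model, not the paper's topic-selection strategies; agent count, round count, and random pairing are illustrative assumptions):

```python
import random

random.seed(0)  # fixed seed so the toy run is reproducible

n_agents, rounds = 10, 2000
vocab = [set() for _ in range(n_agents)]  # each agent's names for one object
fresh = iter(range(10**6))                # supply of newly invented names

for _ in range(rounds):
    s, h = random.sample(range(n_agents), 2)  # random speaker and hearer
    if not vocab[s]:
        vocab[s].add(next(fresh))  # speaker invents a name if it has none
    name = random.choice(sorted(vocab[s]))
    vocab[h].add(name)  # no feedback: the hearer simply adopts the heard name

names = set().union(*vocab)  # distinct names circulating in the population
```

Since an agent invents only when its vocabulary is empty, at most one name per agent ever enters circulation; without feedback, vocabularies accumulate rather than converge to a single shared name, which is why topic-selection strategies matter for alignment.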

Session B4 – Game Theory and Learning


False-name bids are bids submitted by a single agent under multiple fictitious names, such as multiple e-mail addresses. False-name bidding can be a serious fraud in Internet auctions since identifying each participant is virtually impossible. It has been shown that even the theoretically well-founded Vickrey-Clarke-Groves auction (VCG) is vulnerable to false-name bidding. Thus, several auction mechanisms that cannot be manipulated by false-name bids, i.e., false-name-proof mechanisms, have been developed. This paper investigates a slightly different question: how do false-name bids affect (perfect) Bayesian Nash equilibria of first-price combinatorial auctions? The importance of this question is that first-price combinatorial auctions are far more widely used in practice than VCG, and can serve as a benchmark for evaluating alternative mechanisms. In an environment where false-name bids are possible, analytically investigating bidders' behaviors is very complicated, since nobody knows the number of real bidders. As a first step, we consider a minimal setting where false-name bids become effective, i.e., an auction with two goods where one naive bidder competes with one shill bidder who may pretend to be two distinct bidders. We model this auction as a simple dynamic game and examine approximate Bayesian Nash equilibria by utilizing a numerical technique. Our analysis revealed that false-name bidding significantly affects first-price auctions. Furthermore, the shill bidder has a clear advantage over the naive bidder. False-name Bidding in First-price Combinatorial Auctions with Incomplete Information Atsushi Iwasaki, Atsushi Katsuragi, Makoto Yokoo

Session C4 – Teamwork


We study the computational complexity of finding stable outcomes in hedonic games, which are a class of coalition formation games. We restrict our attention to a nontrivial subclass of such games, which are guaranteed to possess stable outcomes, i.e., the set of symmetric additively-separable hedonic games. These games are specified by an undirected edge-weighted graph: nodes are players, an outcome of the game is a partition of the nodes into coalitions, and the utility of a node is the sum of incident edge weights in the same coalition. We consider several stability requirements defined in the literature. These are based on restricting feasible player deviations, for example, by giving existing coalition members veto power. We extend these restrictions by considering more general forms of preference aggregation for coalition members. In particular, we consider voting schemes to decide if coalition members will allow a player to enter or leave their coalition. For all of the stability requirements we consider, the existence of a stable outcome is guaranteed by a potential function argument, and local improvements will converge to a stable outcome. We provide an almost complete characterization of these games in terms of the tractability of computing such stable outcomes. Our findings comprise positive results in the form of polynomial-time algorithms, and negative (PLS-completeness) results. The negative results extend to more general hedonic games. Computing Stable Outcomes in Hedonic Games with Voting-Based Deviations Martin Gairing, Rahul Savani
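The potential-function argument mentioned in the abstract can be made concrete with a small sketch (toy weights and simple Nash-style deviations, not the paper's voting-based variants): in a symmetric additively-separable game, each strict improvement move raises the total intra-coalition weight, so local improvement must terminate in a stable partition.

```python
# Symmetric additively-separable hedonic game on 4 players (toy weights).
w = {(0, 1): 3.0, (0, 2): 1.0, (1, 2): 2.0, (2, 3): -5.0}

def weight(i, j):
    return w.get((min(i, j), max(i, j)), 0.0)

def nash_stable(players):
    # label[i] identifies i's coalition; start from singletons.  Each strict
    # improvement raises the potential Phi = total intra-coalition weight,
    # so the loop terminates in a Nash-stable partition.
    label = {i: i for i in players}
    def payoff(i, l):
        return sum(weight(i, j) for j in players if j != i and label[j] == l)
    improved = True
    while improved:
        improved = False
        for i in players:
            cur = payoff(i, label[i])
            # candidate moves: every existing coalition, plus a fresh singleton
            options = set(label.values()) | {max(label.values()) + 1}
            for l in options:
                if l != label[i] and payoff(i, l) > cur:
                    label[i], cur, improved = l, payoff(i, l), True
    return label

label = nash_stable([0, 1, 2, 3])
```

With these weights the dynamics gather players 0, 1, 2 (all mutually positive edges) into one coalition and leave player 3 alone, since joining would cost it the -5 edge to player 2.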


An interesting problem of multi-agent systems is that of voting, in which the preferences of autonomous agents are to be combined. Applications of voting include modeling social structures, search engine ranking, and choosing a leader among computational agents. In the setting of voting, it is very important that each agent presents truthful information about his or her preferences and does not manipulate. The choice of election system may encourage or discourage voters from manipulating. Because manipulation often results in undesirable consequences, making the determination of such manipulations intractable is an important goal. An interesting metric on the robustness of an election system concerns the frequency with which opportunities for manipulation occur in a given election system. Previous work by Walsh evaluated the frequency of manipulation in the context of very specific election systems, particularly veto, when the number of candidates is limited to at most three, by showing that manipulation problems in these systems can be directly viewed as problems of (Two-Way) Partition, and then using the best known heuristics for Partition. Walsh also claimed that similar results hold for k-candidate veto elections by way of problems involving multi-way partitions. We show that the results for k-candidate veto elections do not follow directly from common versions of partition problems and require non-trivial modifications to Multi-Way Partition. With these modifications, we confirm Walsh's claim that these elections are also vulnerable to manipulation. Our new computational problems also allow one to evaluate manipulation in the general case of k-candidate scoring protocols. We investigate the complexity of manipulating scoring protocols using new algorithms we derive by extending the known algorithms for Multi-Way Partition.
It is our conclusion that more general scoring protocols with four or more candidates are not vulnerable to manipulation using extensions of the currently known algorithms for Multi-Way Partition. This may be due to weaknesses in these algorithms or to the complexity of manipulating general scoring protocols. Solving Election Manipulation Using Integer Partitioning Problems Andrew Lin

Session A5 – Learning Agents


The field of multiagent decision making is extending its tools from classical game theory by embracing reinforcement learning, statistical analysis, and opponent modeling. For example, behavioral economists conclude from experimental results that people act according to levels of reasoning that form a "cognitive hierarchy" of strategies, rather than merely following the hyper-rational Nash equilibrium solution concept. This paper expands this model of the iterative reasoning process by widening the notion of a level within the hierarchy from one single strategy to a distribution over strategies, leading to a more general framework of multiagent decision making. It provides a measure of sophistication for strategies and can serve as a guide for designing good strategies for multiagent games, drawing its main strength from predicting opponent strategies. We apply these lessons to the recently introduced Lemonade Stand Game, a simple setting that includes both collaborative and competitive elements, where an agent's score is critically dependent on its responsiveness to opponent behavior. The opening moves are significant to the end result and simple heuristics have achieved faster cooperation than intricate learning schemes. Using results from the past two real-world tournaments, we show how the submitted entries fit naturally into our model and explain why the top agents were successful. Using Iterated Reasoning to Predict Opponent Strategies Michael Wunder, John Robert Yaros, Michael Littman, Michael Kaisers


In continuous learning settings, stochastic stable policies are often necessary to ensure that agents continuously adapt to dynamic environments. The choice of the decentralised learning system and the employed policy plays an important role in the optimisation task. For example, a policy that exhibits fluctuations may also introduce non-linear effects which other agents in the environment may not be able to cope with, and may even amplify. In dynamic and unpredictable multiagent environments these oscillations may introduce instabilities. In this paper, we take inspiration from the limbic system to introduce an extension to the weighted policy learner, where agents evaluate rewards as either positive or negative feedback, depending on how they deviate from average expected rewards. Agents have positive and negative biases, where a bias either magnifies or depresses a positive or negative feedback signal. To contain the non-linear effects of biased rewards, we incorporate a decaying memory of past positive and negative feedback signals to provide a smoother gradient update on the probability simplex, spreading out the effect of the feedback signal over time. By splitting the feedback signal, more leverage on the win or learn fast (WoLF) principle is possible. The cognitive policy learner is evaluated using a small queueing network and compared with the fair action and weighted policy learners. Emphasis is placed on analysing the dynamics of the learning algorithms with respect to the stability of the queueing network and the overall queueing performance. Cognitive Policy Learner: Biasing Winning or Losing Strategies Dominik Dahlem, Jim Dowling, William Harrison


Distributed collaborative adaptive sensing (DCAS) of the atmosphere is a new paradigm for detecting and predicting hazardous weather using a large dense network of short-range, low-powered radars to sense the lowest few kilometers of the earth's atmosphere. In DCAS, radars are controlled by a collection of Meteorological Command and Control (MC&C) agents that instruct them where to scan based on emerging weather conditions. Within this context, this work concentrates on designing efficient approaches for allocating sensing resources to cope with restricted real-time requirements and limited computational resources. We have developed a new approach based on explicit goals that can span multiple system heartbeats. This allows us to reason ahead about sensor allocations based on expected requirements of goals as they project forward in time. Each goal explicitly specifies end-users' preferences as well as a prediction of how a phenomenon will move. We use a genetic algorithm to generate scanning strategies for each single MC&C and a distributed negotiation model to coordinate multiple MC&Cs' scanning strategies over multiple heartbeats. Simulation results show that, compared to simpler variants of our approach, the proposed distributed model achieved the highest social welfare. Our approach has also exhibited similarly good performance in an operational radar testbed deployed in Oklahoma to observe severe weather events. Agent-Mediated Multi-Step Optimization for Resource Allocation in Distributed Sensor Networks Bo An, Victor Lesser, David Westbrook, Michael Zink

Session B5 – Auction and Incentive Design


Mechanism design studies how to design mechanisms that result in good outcomes even when agents strategically report their preferences. In traditional settings, it is assumed that a mechanism can enforce payments to give agents an incentive to act honestly. However, in many Internet application domains, introducing monetary transfers is impossible or undesirable. Also, in such highly anonymous settings as the Internet, declaring preferences dishonestly is not the only way to manipulate the mechanism. Often, it is possible for an agent to pretend to be multiple agents and submit multiple reports under different identifiers, e.g., by creating different e-mail addresses. The effect of such false-name manipulations can be more serious in a mechanism without monetary transfers, since submitting multiple reports carries no risk. In this paper, we present a case study in false-name-proof mechanism design without money. In our basic setting, agents are located on a real line, and the mechanism must select the location of a public facility; the cost of an agent is its distance to the facility. This setting is called the facility location problem and can represent various situations where an agent's preference is single-peaked. First, we fully characterize the deterministic false-name-proof facility location mechanisms in this basic setting. Utilizing this characterization, we show tight bounds on the approximation ratios for two objective functions: social cost and maximum cost. We then extend the results in two natural directions: a domain where a mechanism can be randomized and a domain where agents are located in a tree. Furthermore, we clarify the connections between false-name-proofness and other related properties. False-name-proof Mechanism Design without Money Taiki Todo, Atsushi Iwasaki, Makoto Yokoo
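A useful point of reference for this setting is the classic median mechanism for facility location on a line: it is strategyproof under single-peaked preferences, though strategyproofness alone does not rule out false-name manipulation (an agent can still shift the median by submitting clones of its report), which is exactly the gap the paper's characterization addresses. A minimal sketch, not the authors' mechanism; the function name is illustrative:

```python
def locate_facility(reports):
    """Place the facility at the (left) median of the reported peaks.

    With single-peaked preferences on a line, truthful reporting is a
    dominant strategy under this rule: moving a single report cannot
    pull the median toward the misreporting agent.
    """
    xs = sorted(reports)
    return xs[(len(xs) - 1) // 2]  # left median when n is even

truth = [0.1, 0.4, 0.9]
assert locate_facility(truth) == 0.4
# A unilateral misreport by the agent at 0.9 cannot move the facility closer:
assert locate_facility([0.1, 0.4, 0.5]) == 0.4
```

Note, however, that submitting two extra false-name reports at 0.9 would drag the median to 0.9, illustrating why false-name-proofness is a strictly stronger requirement.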

Session C5 – Simulation and Emergence


Large heterogeneous teams in a variety of applications must make joint decisions using large volumes of noisy and uncertain data. Often not all team members have access to a sensor, relying instead on information shared by peers to make decisions. These sensors can become permanently corrupted through hardware failure or as a result of the actions of a malicious adversary. Previous work showed that when the trust between agents was tuned to a specific value, the resulting dynamics of the system had a property called scale invariance, which led to agents reaching highly accurate conclusions with little communication. In this paper we show that these dynamics also leave the system vulnerable to most agents coming to incorrect conclusions as a result of small amounts of anomalous information maliciously injected into the system. We conduct an analysis showing that the efficiency of scale invariant dynamics is due to the fact that a large number of agents can come to correct conclusions when the difference between the percentages of agents holding conflicting opinions is relatively small. Although this allows the system to come to correct conclusions quickly, it also means that it would be easy for an attacker with specific knowledge to tip the balance. Informed by this analysis, we explore different methods for selecting which agents are Byzantine and when attacks are launched. Our study reveals global system properties that can be used to predict when and where in the network the system is most vulnerable to attack. We use the results of this study to design an algorithm that agents can use to effectively attack the network, informed by local estimates of the global properties revealed by our investigation. An Investigation of the Vulnerabilities of Scale Invariant Dynamics in Large Teams Robin Glinton, Paul Scerri, Katia Sycara


We study the phenomenon of evolution of cooperation in a society of self-interested agents using repeated games on graphs. A repeated game on a graph is a multiple round game, where, in each round, an agent gains payoff by playing a game with its neighbors and updates its action (state) using the actions and/or payoffs of its neighbors. The interaction model between the agents is a two-player, two-action (cooperate and defect) Prisoner's Dilemma (PD) game (a prototypical model for interaction between self-interested agents). The conventional wisdom is that the presence of network structure enhances cooperation, and current models use multiagent simulation to show evolution of cooperation. However, these results are based on particular combinations of interaction game, network model and state update rules (e.g., the PD game on a grid with the “imitate your best neighbor” rule leads to evolution of cooperation). The state of the art lacks a comprehensive picture of the dependence of the emergence of cooperation on model parameters like network topology, interaction game, state update rules and the initial fraction of cooperators. We perform a thorough study of the phenomenon of evolution of cooperation using (a) a set of popular categories of networks, namely grid, random networks, scale-free networks, and small-world networks, and (b) a set of cognitively motivated update rules. Our simulation results show that the evolution of cooperation in networked systems is quite nuanced and depends on the combination of network type, update rules and the initial fraction of cooperating agents. We also provide an analysis to support our simulation results. The Evolution of Cooperation in Self-Interested Agent Societies: A Critical Study Lisa-Maria Hofmann, Nilanjan Chakraborty, Katia Sycara
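As one illustration of the kind of model varied in this study, here is a minimal sketch of the PD game on a toroidal grid with the “imitate your best neighbor” update rule; the payoff values and grid size are illustrative choices, not taken from the paper:

```python
import random

# Prisoner's Dilemma payoffs: (my action, neighbor's action) -> my payoff
PD = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def step(grid):
    """One synchronous round on an n-by-n toroidal grid: each agent plays
    the PD with its four neighbors, then copies the action of whichever
    agent in its neighborhood (itself included) earned the highest payoff."""
    n = len(grid)
    nbrs = lambda i, j: [((i - 1) % n, j), ((i + 1) % n, j),
                         (i, (j - 1) % n), (i, (j + 1) % n)]
    payoff = {}
    for i in range(n):
        for j in range(n):
            payoff[i, j] = sum(PD[grid[i][j], grid[a][b]] for a, b in nbrs(i, j))
    new = [[grid[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(n):
            best = max(nbrs(i, j) + [(i, j)], key=lambda c: payoff[c])
            new[i][j] = grid[best[0]][best[1]]
    return new

random.seed(0)
n = 10
grid = [[random.choice('CD') for _ in range(n)] for _ in range(n)]
for _ in range(20):
    grid = step(grid)
frac_coop = sum(row.count('C') for row in grid) / n ** 2
```

Swapping in a different update rule or network topology is a one-line change in this loop, which is precisely the parameter space the paper's critical study sweeps.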

Session D5 – Logic-Based Approaches II


A number of popular logical formalisms for representing and reasoning about the abilities of teams or coalitions of agents have been proposed, beginning with the Coalition Logic (CL) of Pauly. Ågotnes et al. introduced a means of succinctly expressing quantification over coalitions, without compromising the computational complexity of model checking in CL, by introducing Quantified Coalition Logic (QCL). QCL introduces a separate logical language for characterizing coalitions in its modal operators. Boella et al. increased the representational expressiveness of such formalisms by introducing Higher-Order Coalition Logic (HCL), a monadic second-order logic with special set grouping operators. Tractable fragments of HCL suitable for efficient model checking have yet to be identified. In this paper, we relax the monadic restriction used in HCL and restrict ourselves to the diamond operator. We show how formulas using the diamond operator are logically equivalent to second-order formulas. This permits us to isolate and define well-behaved expressive fragments of second-order logic amenable to model checking in PTIME. To do this, we appeal to techniques used in deductive databases and quantifier elimination. In addition, we take advantage of the monotonicity of the effectivity function, resulting in an exponentially more succinct representation of models. The net result is the identification of highly expressive fragments of a generalized HCL where model checking can be done efficiently in PTIME. Tractable Model Checking for Fragments of Higher-Order Coalition Logic Patrick Doherty, Barbara Dunin-Kęplicz, Andrzej Szałas

Session A6 – Robotics and Learning


Recent research in multi-robot exploration and mapping has focused on sampling environmental fields, which are typically modeled using the Gaussian process (GP). Existing information-theoretic exploration strategies for learning GP-based environmental field maps adopt the non-Markovian problem structure and consequently scale poorly with the length of history of observations. Hence, it becomes computationally impractical to use these strategies for in situ, real-time active sampling. To ease this computational burden, this paper presents a Markov-based approach to efficient information-theoretic path planning for active sampling of GP-based fields. We analyze the time complexity of solving the Markov-based path planning problem, and demonstrate analytically that it scales better than that of deriving the non-Markovian strategies with increasing length of planning horizon. For a class of exploration tasks called the transect sampling task, we provide theoretical guarantees on the active sampling performance of our Markov-based policy, from which ideal environmental field conditions and sampling task settings can be established to limit its performance degradation due to violation of the Markov assumption. Empirical evaluation on real-world temperature and plankton density field data shows that our Markov-based policy can generally achieve active sampling performance comparable to that of the widely-used non-Markovian greedy policies under less favorable realistic field conditions and task settings while enjoying significant computational gain over them. Active Markov Information-Theoretic Path Planning for Robotic Environmental Sensing Kian Hsiang Low, John M. Dolan, Pradeep Khosla


Maintaining accurate world knowledge in a complex and changing environment is a perennial problem for robots and other artificial intelligence systems. Our architecture for addressing this problem, called Horde, consists of a large number of independent reinforcement learning sub-agents, or demons. Each demon is responsible for answering a single predictive or goal-oriented question about the world, thereby contributing in a factored, modular way to the system's overall knowledge. The questions are in the form of a value function, but each demon has its own policy, reward function, termination function, and terminal-reward function unrelated to those of the base problem. Learning proceeds in parallel by all demons simultaneously so as to extract the maximal training information from whatever actions are taken by the system as a whole. Gradient-based temporal-difference learning methods are used to learn efficiently and reliably with function approximation in this off-policy setting. Horde runs in constant time and memory per time step, and is thus suitable for learning online in real-time applications such as robotics. We present results using Horde on a multi-sensored mobile robot to successfully learn goal-oriented behaviors and long-term predictions from off-policy experience. Horde is a significant incremental step towards a real-time architecture for efficient learning of general knowledge from unsupervised sensorimotor interaction. Horde: A Scalable Real-time Architecture for Learning Knowledge from Unsupervised Sensorimotor Interaction Richard S. Sutton, Joseph Modayil, Michael Delp, Thomas Degris, Patrick M. Pilarski, Adam White, Doina Precup


In several realistic domains an agent's behavior is composed of multiple interdependent skills. For example, consider a humanoid robot that must play soccer, as is the focus of this paper. In order to succeed, it is clear that the robot needs to walk quickly, turn sharply, and kick the ball far. However, these individual skills are ineffective if the robot falls down when switching from walking to turning, or if it cannot position itself behind the ball for a kick. This paper presents a learning architecture for a humanoid robot soccer agent that has been fully deployed and tested within the RoboCup 3D simulation environment. First, we demonstrate that individual skills such as walking and turning can be parameterized and optimized to match the best performance statistics reported in the literature. These results are achieved through effective use of the CMA-ES optimization algorithm. Next, we describe a framework for optimizing skills in conjunction with one another, a little-understood problem with substantial practical significance. Over several phases of learning, a total of roughly 100-150 parameters are optimized. Detailed experiments show that an agent thus optimized performs comparably with the top teams from the RoboCup 2010 competitions, while taking relatively few man-hours for development. On Optimizing Interdependent Skills: A Case Study in Simulated 3D Humanoid Robot Soccer Daniel Urieli, Patrick MacAlpine, Shivaram Kalyanakrishnan, Yinon Bentor, Peter Stone
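The optimization pattern the abstract describes, treating a parameterized skill as a black-box score to maximize, can be illustrated with a (1+1)-ES with a 1/5th-success step-size rule, a far simpler relative of the CMA-ES used in the paper; the objective and all parameter values are toy stand-ins:

```python
import random

def one_plus_one_es(f, x0, sigma=0.5, iters=500, seed=0):
    """Minimal (1+1)-ES minimizing f: mutate the incumbent with Gaussian
    noise, keep the candidate if it is no worse, and adapt the step size
    (widen on success, narrow on failure)."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    for _ in range(iters):
        cand = [xi + rng.gauss(0.0, sigma) for xi in x]
        fc = f(cand)
        if fc <= fx:
            x, fx = cand, fc
            sigma *= 1.22   # success: widen the search
        else:
            sigma *= 0.83   # failure: narrow the search
    return x, fx

# Toy "skill score": squared distance of walk parameters from an
# (unknown to the optimizer) optimum; real rollouts would replace this.
target = [0.3, -1.2, 2.0]
score = lambda p: sum((a - b) ** 2 for a, b in zip(p, target))
best, best_score = one_plus_one_es(score, [0.0, 0.0, 0.0])
```

CMA-ES additionally adapts a full covariance matrix over the search distribution, which matters at the 100-150 parameter scale reported in the paper, but the evaluate-mutate-select loop is the same.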

Session B6 – Energy Applications


The creation of Virtual Power Plants (VPPs) has been suggested in recent years as the means for achieving the cost-efficient integration of the many distributed energy resources (DERs) that are starting to emerge in the electricity network. In this work, we contribute to the development of VPPs by offering a game-theoretic perspective to the problem. Specifically, we design cooperatives (or “cooperative VPPs”—CVPPs) of rational autonomous DER-agents representing small-to-medium size renewable electricity producers, which coalesce to profitably sell their energy to the electricity grid. By so doing, we help to counter the fact that individual DERs are often excluded from the wholesale energy market due to their perceived inefficiency and unreliability. We discuss the issues surrounding the emergence of such cooperatives, and propose a pricing mechanism with certain desirable properties. Specifically, our mechanism guarantees that CVPPs have the incentive to truthfully report to the grid accurate estimates of their electricity production, and that larger rather than smaller CVPPs form; this promotes CVPP efficiency and reliability. In addition, we propose a scheme to allocate payments within the cooperative, and show that, given this scheme and the pricing mechanism, the allocation is in the core and, as such, no subset of members has a financial incentive to break away from the CVPP. Moreover, we develop an analytical tool for quantifying the uncertainty about DER production estimates, and distinguishing among different types of errors regarding such estimates. We then utilize this tool to devise protocols to manage CVPP membership. Finally, we demonstrate these ideas through a simulation that uses real-world data. Cooperatives of Distributed Energy Resources for Efficient Virtual Power Plants Georgios Chalkiadakis, Valentin Robu, Ramachandra Kota, Alex Rogers, Nicholas R. Jennings

Session C6 – Voting Protocols


Electoral control models ways of changing the outcome of an election via such actions as adding/deleting/partitioning either candidates or voters. These actions modify an election's participation structure and aim at either making a favorite candidate win (“constructive control”) or preventing a despised candidate from winning (“destructive control”). To protect elections from such control attempts, computational complexity has been used to show that electoral control, though not impossible, is computationally prohibitive. Recently, Erdélyi and Rothe proved that Brams and Sanver's fallback voting, a hybrid voting system that combines Bucklin with approval voting, is resistant to each of the standard types of control except five types of voter control. They proved that fallback voting is vulnerable to two of those control types, leaving the other three cases open. We solve these three open problems, thus showing that fallback voting is resistant to all standard types of control by partition of voters, a particularly important and well-motivated control type, as it models “two-district gerrymandering.” Hence, fallback voting is not only fully resistant to candidate control but also fully resistant to constructive control, and it displays the broadest resistance to control currently known to hold among natural voting systems with a polynomial-time winner problem. We also show that Bucklin voting behaves almost as well in terms of control resistance. Each resistance result for Bucklin voting strengthens the corresponding control resistance for fallback voting. The Complexity of Voter Partition in Bucklin and Fallback Voting: Solving Three Open Problems Gábor Erdélyi, Lena Piras, Jörg Rothe
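For reference, Bucklin winner determination, the polynomial-time winner problem both systems build on, fits in a few lines (plain Bucklin only; fallback voting additionally restricts counting to each voter's approved candidates):

```python
def bucklin_winner(profile):
    """Return the set of Bucklin winners for a list of strict rankings.

    Widen the counted prefix of each ranking one position at a time; at
    the first level where some candidate is ranked by a strict majority,
    the winners are the candidates with the most such rankings."""
    n, m = len(profile), len(profile[0])
    for level in range(1, m + 1):
        scores = {}
        for ranking in profile:
            for c in ranking[:level]:
                scores[c] = scores.get(c, 0) + 1
        best = max(scores.values())
        if best > n / 2:
            return {c for c, s in scores.items() if s == best}
    return set()

# 3 voters over candidates a, b, c
profile = [('a', 'b', 'c'), ('a', 'c', 'b'), ('b', 'c', 'a')]
assert bucklin_winner(profile) == {'a'}  # 'a' has a level-1 majority (2 of 3)
```

The control problems studied in the paper ask whether an action such as partitioning the voter list can change what this procedure outputs, and it is deciding that question, not winner determination itself, that is computationally prohibitive.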


A possible winner of an election is a candidate that has, in some kind of incomplete-information election, the possibility to win in a complete extension of the election. The first type of problem we study is the Possible co-Winner with respect to the Addition of New Candidates (PcWNA) problem, which asks, given an election with strict preferences over the candidates, whether it is possible to make a designated candidate win the election by adding a limited number of new candidates to the election. In the case of unweighted voters we show NP-completeness of PcWNA for a broad class of pure scoring rules. We also briefly study the case of weighted voters. The second type of possible winner problem we study is Possible Winner/co-Winner under Uncertain Voting System (PWUVS and PcWUVS). Here, uncertainty is present not in the votes but in the election rule itself. For example, PcWUVS is the problem of whether, given a set C of candidates, a list of votes over C, a distinguished candidate c ∈ C, and a class of election rules, there is at least one election rule from this class under which c wins the election. We study these two problems for a class of systems based on approval voting, the family of Copeland elections, and a certain class of scoring rules. Our main result is that it is NP-complete to determine whether there is a scoring vector that makes c win the election, if we restrict the set of possible scoring vectors for an m-candidate election to those of the form (α_1, …, α_{m-4}, x_1, x_2, x_3, 0), with x_i = 1 for at least one i ∈ {1, 2, 3}. Computational Complexity of Two Variants of the Possible Winner Problem Dorothea Baumeister, Magnus Roos, Jörg Rothe

Session D6 – Trust and Organisational Structure


In the absence of legal enforcement procedures for the participants of an open e-marketplace, trust and reputation systems are central to resisting threats from malicious agents. Such systems provide mechanisms for identifying the participants who disseminate unfair ratings. However, it is possible that some of the honest participants are also victimized as a consequence of the poor judgement of these systems. In this paper, we propose a two-layer filtering algorithm that cognitively elicits the behavioral characteristics of the participating agents in an e-marketplace. We argue that the notion of unfairness does not exclusively refer to deception but can also imply differences in dispositions. The proposed filtering approach aims to go beyond inflexible judgements on the quality of participants and instead allows the human dispositions that we call optimism, pessimism and realism to be incorporated into our trustworthiness evaluations. Our proposed filtering algorithm consists of two layers. In the first layer, a consumer agent measures the competency of its neighbors as potentially helpful advisers, automatically disqualifying deceptive agents and/or newcomers that lack the required experience. The second layer then measures the credibility of the agents surviving the first layer on the basis of their behavioral models. This tangible view of trustworthiness evaluation boosts the confidence of human users in using a web-based agent-oriented e-commerce application. Multi-Layer Cognitive Filtering by Behavioral Modeling Zeinab Noorian, Stephen Marsh, Michael Fleming

Session A7 – Argumentation and Negotiation

Session B7 – Planning


The use of distributed POMDPs for cooperative teams has been severely limited by the incredibly large joint policy space that results from combining the policy spaces of the individual agents. However, much of the computational cost of exploring the entire joint policy space can be avoided by observing that in many domains important interactions between agents occur in a relatively small set of scenarios, previously defined as coordination locales (CLs). Moreover, even when numerous interactions might occur, given a set of individual policies there are relatively few actual interactions. Exploiting this observation and building on an existing model shaping algorithm, this paper presents D-TREMOR, an algorithm in which cooperative agents iteratively generate individual policies, identify and communicate possible interactions between their policies, shape their models based on this information and generate new policies. D-TREMOR has three properties that jointly distinguish it from previous DEC-POMDP work: (1) it is completely distributed; (2) it is scalable (allowing 100 agents to compute a “good” joint policy in under 6 hours); and (3) it has low communication overhead. D-TREMOR complements these traits with the following key contributions, which ensure improved scalability and solution quality: (a) techniques to ensure convergence; (b) faster approaches to detect and evaluate CLs; (c) heuristics to capture dependencies between CLs; and (d) novel shaping heuristics to aggregate the effects of CLs. While the resulting policies are not globally optimal, empirical results show that agents have policies that effectively manage uncertainty and that the joint policy is better than policies generated by independent solvers. Distributed Model Shaping for Scaling to Decentralized POMDPs with Hundreds of Agents Prasanna Velagapudi, Pradeep Varakantham, Katia Sycara, Paul Scerri

Session C7 – Game Theory II


The fastest known algorithm for solving general Bayesian Stackelberg games with a finite set of follower (adversary) types has seen direct practical use at the LAX airport for over 3 years; currently, an (albeit non-Bayesian) algorithm for solving these games is also being used for scheduling air marshals on limited sectors of international flights by the US Federal Air Marshals Service. These algorithms find optimal randomized security schedules to allocate limited security resources to protect targets. As we scale up to larger domains, including the full set of flights covered by the Federal Air Marshals, it is critical to develop newer algorithms that scale up significantly beyond the limits of the current state of the art of Bayesian Stackelberg solvers. In this paper, we present a novel technique based on a hierarchical decomposition and branch-and-bound search over the follower type space, which may be applied to different Stackelberg game solvers. We have applied this technique to different solvers, resulting in: (i) a new exact algorithm called HBGS that is orders of magnitude faster than the best previously known Bayesian solver for general Stackelberg games; (ii) a new exact algorithm called HBSA which extends the fastest known previous security game solver to the Bayesian case; and (iii) approximation versions of HBGS and HBSA that show significant further improvements over these algorithms with only a 1-2% sacrifice in practical solution quality. Quality-bounded Solutions for Finite Bayesian Stackelberg Games: Scaling up Manish Jain, Christopher Kiekintveld, Milind Tambe
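The underlying solution concept is easy to state for a single follower type: the leader commits to a mixed strategy and the follower best-responds, breaking ties in the leader's favor. A brute-force sketch for a 2x2 game, where grid search stands in for the LP/MILP formulations that real Stackelberg solvers use and the payoff matrices are illustrative:

```python
def best_commitment(L, F, grid=1000):
    """Leader's optimal mixed-strategy commitment in a 2x2 Stackelberg
    game: search leader mixtures p = P(row 0); for each, the follower
    picks a best response, breaking ties in the leader's favor (the
    standard convention). L and F are row-player-indexed payoff tables."""
    best = (float('-inf'), None)
    for i in range(grid + 1):
        p = i / grid
        # Follower's expected utility for each of its two actions.
        fu = [p * F[0][j] + (1 - p) * F[1][j] for j in (0, 1)]
        responses = [j for j in (0, 1) if fu[j] == max(fu)]
        # Leader's utility under the most favorable best response.
        lu = max(p * L[0][j] + (1 - p) * L[1][j] for j in responses)
        best = max(best, (lu, p))
    return best  # (leader value, mixture probability)

# Committing to a mixture can beat any pure strategy: here the leader
# mixes 50/50 to make the follower indifferent and play column 1.
L = [[2, 4], [1, 3]]
F = [[1, 0], [0, 1]]
value, p = best_commitment(L, F)  # value 3.5 at p = 0.5
```

A Bayesian solver must do this against a distribution over follower types at once; the hierarchical branch-and-bound in the paper prunes that type space rather than enumerating it.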

Session D7 – Virtual Agents II


This paper introduces a model which connects representations of the space surrounding a virtual humanoid's body with the space it shares with several interaction partners. This work intends to support virtual humans (or humanoid robots) in near space interaction and is inspired by studies from the cognitive neurosciences on the one hand and social interaction studies on the other. We present our work on learning the body structure of an articulated virtual human by using data from virtual touch and proprioception sensors. The results are utilized for a representation of its reaching space, the so-called peripersonal space. In interpersonal interaction involving several partners, their peripersonal spaces may overlap and establish a shared reaching space. We define it as their interaction space, where cooperation takes place and where actions to claim or release spatial areas have to be adapted to avoid obstructing the other's movements. Our model of interaction space is developed as an extension of Kendon's F-formation system, a foundational theory of how humans orient themselves in space when communicating. Thus, interaction space allows for analyzing the spatial arrangement (i.e., body posture and orientation) between multiple interaction partners and the extent of space they share. Peripersonal and interaction space are modeled as potential fields to control the virtual human's behavior strategy. As an example we show how the virtual human can relocate objects toward or away from locations reachable by all partners, and thus influence the degree of cooperation in an interaction task. From Body Space to Interaction Space - Modeling Spatial Cooperation for Virtual Humans Nhung Nguyen, Ipke Wachsmuth


Psychological studies of social interaction stress the crucial role of temporal correlations between interactants' behaviors: to give their partners a feeling of natural interaction, interactants, be they human, robotic or virtual, must be able to react at the appropriate time. Recent approaches consider autonomous agents as dynamical systems and the interaction as a coupling between these systems. These approaches solve the issue of time handling and make it possible to model synchronization and turn-taking as phenomena emerging from the coupling. But when complex computations are added to the architecture, such as processing of video and audio signals, delays appear within the interaction loop and disrupt this coupling. We model here a dyad of agents where processing delays are controlled. These agents, driven by oscillators, synchronize and take turns when there is no delay. We describe the methodology used to evaluate the emergence of synchrony and turn-taking. We test the oscillators' coupling properties when there is no delay: coupling occurs if the coupling strength is smaller than the parameter controlling the oscillators' natural period and if the ratio between the oscillators' periods is smaller than 1/2. We quantify the maximal delays between agents which do not disrupt the interaction: the maximal delay tolerated by the agents is proportional to the natural period of the coupled system and to the strength of the coupling. These results are put in perspective with the different time constraints of human-human and human-agent interactions. Effect of Time Delays on Agents' Interaction Dynamics Ken Prepin, Catherine Pelachaud
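The dyad-of-coupled-oscillators setup can be sketched with Kuramoto-style phase dynamics, where each agent sees its partner's phase only after a fixed processing delay. This is a generic illustration of delayed phase coupling, not the paper's architecture, and all parameter values are illustrative:

```python
import math

def simulate(delay_steps, k=0.5, w1=1.0, w2=1.1, dt=0.01, steps=20000):
    """Two phase oscillators with natural frequencies w1, w2, each
    coupled (strength k) to the other's phase as it was `delay_steps`
    Euler steps ago. Returns the final phase gap modulo 2*pi."""
    h1 = [0.0] * (delay_steps + 1)        # phase history of agent 1
    h2 = [math.pi] * (delay_steps + 1)    # agent 2 starts in anti-phase
    for _ in range(steps):
        p1, p2 = h1[-1], h2[-1]
        d1, d2 = h1[0], h2[0]             # delayed view of the partner
        h1.append(p1 + dt * (w1 + k * math.sin(d2 - p1)))
        h2.append(p2 + dt * (w2 + k * math.sin(d1 - p2)))
        h1.pop(0)
        h2.pop(0)
    return (h2[-1] - h1[-1]) % (2 * math.pi)

# With no delay the oscillators phase-lock despite different natural
# frequencies: the residual gap settles near arcsin((w2 - w1) / (2k)).
gap = simulate(0)
```

Increasing `delay_steps` eventually destroys the lock, which is the qualitative effect the paper quantifies in terms of the coupled system's natural period and coupling strength.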


Copyright © 2011 IFAAMAS