We consider peer-to-peer (P2P) networks in which multiple peers are interested in sharing multimedia content. In such P2P networks, the shared resources are the peers' contributed content and their upload bandwidth. While sharing resources, autonomous and self-interested peers need to decide how much of their resources to reciprocate (these decisions are their actions) so that their individual utilities are maximized. We model the resource reciprocation among the peers as a stochastic game and show how the peers can determine optimal strategies for resource reciprocation using a Markov Decision Process (MDP) framework. Unlike existing resource reciprocation strategies, which focus on myopic decisions of peers, the optimal strategies determined based on the MDP enable the peers to make foresighted decisions about resource reciprocation, such that they explicitly consider both their immediate and their future expected utilities. To formulate the MDP framework, we propose a novel algorithm that identifies the state transition probabilities using representative resource reciprocation models of peers. These models express the peers' different attitudes toward resource reciprocation. We analytically investigate how the error between the true and estimated state transition probabilities impacts each peer's action selection as well as the resulting utilities. We also analytically study how bounded rationality (e.g., limited memory for reciprocation history and a limited number of state descriptions) can impact the interactions among the peers and the resulting resource reciprocation. Simulation results show that the proposed approach based on reciprocation models can effectively cope with a dynamically changing environment, such as peers joining or leaving the P2P network. Finally, we show that the proposed foresighted decisions lead to the best performance in terms of the cumulative expected utilities.
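The contrast between myopic and foresighted decisions can be illustrated with a minimal sketch. It is not the authors' algorithm: it uses standard value iteration on a toy MDP, and every number in it (the three states, three upload levels, transition probabilities `P`, utilities, costs, and the discount factor `gamma`) is a hypothetical assumption chosen only for illustration. The state stands for the level of resources a peer currently receives, and the action stands for the level of resources it uploads in return.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Standard value iteration.

    P[a, s, t] is the probability of moving from state s to state t under
    action a; R[s, a] is the immediate utility of action a in state s.
    """
    V = np.zeros(R.shape[0])
    for _ in range(max_iter):
        # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return V, Q.argmax(axis=1)

def evaluate_policy(P, R, policy, gamma=0.9):
    """Discounted cumulative expected utility of a fixed policy, per state."""
    n = R.shape[0]
    P_pi = np.array([P[policy[s], s] for s in range(n)])
    r_pi = np.array([R[s, policy[s]] for s in range(n)])
    return np.linalg.solve(np.eye(n) - gamma * P_pi, r_pi)

# Hypothetical toy model (assumed values, not taken from the paper).
# Uploading more makes it likelier that the other peer reciprocates more.
P = np.array([
    [[0.90, 0.10, 0.00], [0.70, 0.30, 0.00], [0.50, 0.40, 0.10]],  # upload 0
    [[0.30, 0.60, 0.10], [0.20, 0.60, 0.20], [0.10, 0.60, 0.30]],  # upload 1
    [[0.10, 0.30, 0.60], [0.05, 0.25, 0.70], [0.00, 0.20, 0.80]],  # upload 2
])
download_utility = np.array([0.0, 2.0, 4.0])  # utility of received resources
upload_cost = np.array([0.0, 0.5, 1.0])       # cost of each upload level
R = download_utility[:, None] - upload_cost[None, :]

# Myopic peer: maximizes immediate utility only.
myopic_policy = R.argmax(axis=1)
# Foresighted peer: maximizes immediate plus discounted future utility.
V_foresighted, foresighted_policy = value_iteration(P, R, gamma=0.9)
V_myopic = evaluate_policy(P, R, myopic_policy, gamma=0.9)

print("myopic policy:     ", myopic_policy, V_myopic)
print("foresighted policy:", foresighted_policy, V_foresighted)
```

In this toy setting the myopic peer never uploads, because uploading only costs it utility in the current step, while the foresighted peer accepts the upload cost in exchange for a higher chance of being reciprocated later. The sketch assumes the transition probabilities are known; the paper's contribution of estimating them from reciprocation models is not reproduced here.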
Bibliographical note
Funding Information:
Manuscript received March 16, 2008; revised September 16, 2008. First published December 16, 2008; current version published January 09, 2009. This work was supported by NSF CAREER Award CCF-0541867, and grants from Microsoft Research. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Ling Guan.
- Bounded rationality
- Markov decision process
- foresighted decision
- peer-to-peer (P2P) network
- resource reciprocation game