TY - GEN
T1 - Reinforcement learning in BitTorrent systems
AU - Izhak-Ratzin, Rafit
AU - Park, Hyunggon
AU - Van Der Schaar, Mihaela
PY - 2011
Y1 - 2011
N2 - In this paper, we propose a BitTorrent-like protocol that replaces the peer selection mechanisms in the regular BitTorrent protocol with a novel reinforcement learning based mechanism. The inherent operation of P2P systems, which involves repeated interactions among peers over a long time period, allows peers to efficiently identify free-riders as well as desirable collaborators by learning the behavior of their associated peers. Thus, it can help peers improve their download rates and discourage free-riding (FR), while improving fairness. We model the peers' interactions in the BitTorrent-like network as a repeated interaction game, where we explicitly consider the strategic behavior of the peers. A peer that applies the reinforcement learning based mechanism uses a partial history of the observations on associated peers' statistical reciprocal behaviors to determine its best responses and estimate the corresponding impact on its expected utility. The policy determines the peer's resource reciprocations with other peers, which would maximize the peer's long-term performance.
KW - BitTorrent
KW - P2P
KW - reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=79960889230&partnerID=8YFLogxK
U2 - 10.1109/INFCOM.2011.5935192
DO - 10.1109/INFCOM.2011.5935192
M3 - Conference contribution
AN - SCOPUS:79960889230
SN - 9781424499212
T3 - Proceedings - IEEE INFOCOM
SP - 406
EP - 410
BT - 2011 Proceedings IEEE INFOCOM
T2 - IEEE INFOCOM 2011
Y2 - 10 April 2011 through 15 April 2011
ER -