We consider peer-to-peer (P2P) networks in which multiple peers are interested in sharing content. While sharing resources, autonomous and self-interested peers must decide how much of their resources to reciprocate (these amounts are their actions) so that their individual rewards are maximized. We model the resource reciprocation among the peers as a stochastic game and show how the peers can determine their optimal strategies using a Markov Decision Process (MDP) framework. The optimal strategies determined from the MDP enable the peers to make foresighted decisions about resource reciprocation, explicitly considering both their immediate and their expected future rewards. To formulate the MDP, we propose a novel algorithm that efficiently identifies the state transition probabilities using representative resource reciprocation models of peers. Simulation results show that the proposed approach based on the reciprocation models can effectively cope with the dynamically changing environment of P2P networks. Moreover, we show that the foresighted decisions lead to the best performance in terms of the cumulative expected rewards.
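
The difference between a myopic and a foresighted decision can be illustrated with a toy MDP solved by standard value iteration. The following is only a minimal sketch under stated assumptions, not the algorithm proposed here: the three resource levels, the cost `COST`, the reward function, and the table `TRANSITION` (standing in for a peer's reciprocation model) are all hypothetical values chosen for illustration. The state is the resource level received from the other peer, and the action is the resource level given in return.

```python
# Toy illustration only: every number below is an assumed, hypothetical value.
STATES = [0, 1, 2]    # resource level received from the other peer
ACTIONS = [0, 1, 2]   # resource level this peer provides in return
COST = 0.4            # assumed cost per unit of resource provided

# TRANSITION[a][s_next]: assumed probability that the other peer provides
# level s_next in the next round, given that we provided level a.
TRANSITION = {
    0: [0.8, 0.2, 0.0],
    1: [0.2, 0.6, 0.2],
    2: [0.0, 0.2, 0.8],
}


def reward(state, action):
    """Immediate reward: resources received minus the cost of giving."""
    return state - COST * action


def q_value(state, action, values, gamma):
    """Immediate reward plus discounted expected value of the next state."""
    expected_next = sum(p * values[s2] for s2, p in enumerate(TRANSITION[action]))
    return reward(state, action) + gamma * expected_next


def value_iteration(gamma, tol=1e-9, max_iter=10000):
    """Return (state values, greedy policy) for discount factor gamma."""
    values = [0.0 for _ in STATES]
    for _ in range(max_iter):
        new_values = [
            max(q_value(s, a, values, gamma) for a in ACTIONS) for s in STATES
        ]
        delta = max(abs(n - v) for n, v in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    policy = [
        max(ACTIONS, key=lambda a, s=s: q_value(s, a, values, gamma))
        for s in STATES
    ]
    return values, policy


# gamma = 0 ignores the future (myopic); gamma = 0.9 is foresighted.
myopic_values, myopic_policy = value_iteration(gamma=0.0)
foresighted_values, foresighted_policy = value_iteration(gamma=0.9)
print("myopic policy:", myopic_policy)            # [0, 0, 0]
print("foresighted policy:", foresighted_policy)  # [2, 2, 2]
```

In this toy setting, the myopic peer never provides resources because giving only incurs an immediate cost, whereas the foresighted peer provides the highest level because doing so makes future reciprocation by the other peer more likely under the assumed transition table.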