A quantile estimation for massive data with generalized Pareto distribution

Jongwoo Song, Seongjoo Song

Research output: Contribution to journal › Article › peer-review

22 Scopus citations


This paper proposes a new method for estimating extreme quantiles of heavy-tailed distributions from massive data. The method combines the peak over threshold (POT) approach with the generalized Pareto distribution (GPD), which is commonly used to estimate extreme quantiles, and estimates the GPD parameters using the empirical distribution function (EDF) and nonlinear least squares (NLS). We first estimate the GPD parameters using the EDF and NLS, and then estimate multiple high quantiles for massive data from the observations above a chosen threshold using the conventional POT method. Simulation results show that our parameter estimation method has a smaller mean square error (MSE) than other common methods when the GPD shape parameter is at least 0. The estimated quantiles also perform best in terms of root MSE (RMSE) and absolute relative bias (ARB) for heavy-tailed distributions.
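The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the plotting-position EDF, the crude grid search standing in for a real NLS optimiser, the grid ranges, and all function names are assumptions made for illustration. Only the GPD distribution function and the standard POT quantile formula are textbook results.

```python
import math
import random
import statistics


def gpd_cdf(y, xi, sigma):
    """GPD distribution function for an excess y >= 0 (shape xi, scale sigma)."""
    if abs(xi) < 1e-12:                      # xi = 0 is the exponential limit
        return 1.0 - math.exp(-y / sigma)
    base = 1.0 + xi * y / sigma
    if base <= 0.0:                          # past the upper endpoint when xi < 0
        return 1.0
    return 1.0 - base ** (-1.0 / xi)


def fit_gpd_nls(excesses, xi_grid, sigma_grid):
    """Pick (xi, sigma) minimising the squared distance between GPD CDF and EDF.

    Assumption: a grid search replaces a proper nonlinear least squares solver.
    """
    ys = sorted(excesses)
    m = len(ys)
    edf = [(i + 0.5) / m for i in range(m)]  # plotting-position EDF (assumed form)
    best = None
    for xi in xi_grid:
        for sigma in sigma_grid:
            sse = sum((gpd_cdf(y, xi, sigma) - f) ** 2 for y, f in zip(ys, edf))
            if best is None or sse < best[0]:
                best = (sse, xi, sigma)
    return best[1], best[2]


def pot_quantile(p, u, xi, sigma, n, n_exceed):
    """Standard POT estimate of the p-quantile from a GPD fitted above threshold u."""
    tail = (n / n_exceed) * (1.0 - p)
    if abs(xi) < 1e-12:
        return u - sigma * math.log(tail)
    return u + (sigma / xi) * (tail ** (-xi) - 1.0)


def demo(seed=0, n=20000, n_exceed=1000, alpha=2.0):
    """Simulate Pareto(alpha) data, fit the tail, and estimate two high quantiles."""
    rng = random.Random(seed)
    data = sorted((1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n))
    u = data[n - n_exceed - 1]               # threshold: largest non-exceedance
    excesses = [x - u for x in data[n - n_exceed:]]
    med = statistics.median(excesses)
    xi_grid = [0.05 * k for k in range(21)]                  # shape in [0, 1]
    sigma_grid = [med * (0.8 + 0.02 * k) for k in range(41)]  # scale near the median excess
    xi, sigma = fit_gpd_nls(excesses, xi_grid, sigma_grid)
    q99 = pot_quantile(0.99, u, xi, sigma, n, n_exceed)
    q999 = pot_quantile(0.999, u, xi, sigma, n, n_exceed)
    return xi, sigma, q99, q999


if __name__ == "__main__":
    xi, sigma, q99, q999 = demo()
    print(f"xi={xi:.2f} sigma={sigma:.2f} q0.99={q99:.2f} q0.999={q999:.2f}")
```

For the simulated Pareto data with alpha = 2, the true shape parameter is 0.5 and the true 0.99 and 0.999 quantiles are 10 and about 31.6, so the printed estimates can be compared against those values. In practice a dedicated least squares routine would replace the grid search, and the threshold would be chosen with more care than a fixed number of exceedances.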

Original language: English
Pages (from-to): 143-150
Number of pages: 8
Journal: Computational Statistics and Data Analysis
Issue number: 1
State: Published - 1 Jan 2012

Bibliographical note

Funding Information:
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (No. 2010-0004196 (J. Song) and No. 2010-0017185 (S. Song)).


  • Generalized Pareto distribution
  • Massive data
  • Nonlinear least squares
  • Parameter estimation
  • Peak over threshold
  • Quantile estimation

