Bandwidth selection for density estimation by mixture with varying concentrations

Authors

DOI:

https://doi.org/10.17721/1812-5409.2025/2.6

Keywords:

finite mixture model, varying concentrations, kernel density estimator, bandwidth selection, Silverman's rule of thumb, leave-one-out cross-validation

Abstract

Finite mixture models arise in the statistics of biological and medical data when the investigated subjects belong to sub-populations with different distributions of the observed variable. In the model of a mixture with varying concentrations (MVC), the concentrations of the mixture components can vary from observation to observation. We consider estimation of the probability density of a mixture component in the MVC model by a modification of the kernel density estimator (KDE). To apply a KDE, one needs to select a tuning parameter called the bandwidth. Two approaches to bandwidth selection are considered. The first is a modification of Silverman's rule of thumb. The second is a version of the leave-one-out cross-validation algorithm. We present simulation results showing that both algorithms behave similarly for nearly Gaussian densities, while cross-validation outperforms Silverman's rule of thumb on highly non-Gaussian densities.
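The abstract contrasts Silverman's rule of thumb with leave-one-out cross-validation for choosing the KDE bandwidth. A minimal sketch of the two classical selectors is given below; note this is the plain i.i.d. Gaussian-KDE setting, not the paper's MVC-weighted modifications, and the function names, the search grid, and the likelihood-based cross-validation criterion are illustrative assumptions rather than the authors' exact algorithms.

```python
import numpy as np

def silverman_bandwidth(x):
    """Silverman's rule of thumb for a Gaussian kernel:
    h = 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    n = len(x)
    sd = np.std(x, ddof=1)
    iqr = np.subtract(*np.percentile(x, [75, 25]))  # p75 - p25
    return 0.9 * min(sd, iqr / 1.34) * n ** (-0.2)

def loo_log_likelihood(x, h):
    """Leave-one-out log-likelihood of a Gaussian KDE with bandwidth h."""
    n = len(x)
    d = (x[:, None] - x[None, :]) / h                # pairwise scaled differences
    k = np.exp(-0.5 * d ** 2) / np.sqrt(2 * np.pi)   # Gaussian kernel values
    np.fill_diagonal(k, 0.0)                         # leave the i-th point out
    f_loo = k.sum(axis=1) / ((n - 1) * h)            # LOO density estimate at x_i
    return np.sum(np.log(f_loo + 1e-300))            # guard against log(0)

def cv_bandwidth(x, grid=None):
    """Pick the bandwidth maximizing the leave-one-out log-likelihood."""
    if grid is None:
        h0 = silverman_bandwidth(x)
        grid = h0 * np.logspace(-1, 1, 50)           # search around Silverman's h
    scores = [loo_log_likelihood(x, h) for h in grid]
    return grid[int(np.argmax(scores))]

rng = np.random.default_rng(1)
x = rng.normal(size=300)
print("Silverman:", silverman_bandwidth(x), "CV:", cv_bandwidth(x))
```

For a nearly Gaussian sample like the one above, the two bandwidths come out close, consistent with the simulation findings reported in the abstract; for strongly non-Gaussian densities the cross-validated bandwidth adapts while the rule of thumb does not.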

Pages of the article in the issue: 41 - 46

Language of the article: English

Author Biography

  • Olena Sugakova, Taras Shevchenko National University of Kyiv

    Dr. of Sci., Assoc. Prof.

References

Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction. Springer.

Maiboroda, R., Miroshnichenko, V., & Sugakova, O. (2022). Jackknife for nonlinear estimating equations. Modern Stochastics: Theory and Applications, 9(4), 377–399. https://doi.org/10.15559/22-VMSTA208

Maiboroda, R., Miroshnichenko, V., & Sugakova, O. (2024). Quantile estimators for regression errors in mixture models with varying concentrations. Bulletin of Taras Shevchenko National University of Kyiv. Physical and Mathematical Sciences, 1(78), 45–50. https://doi.org/10.17721/1812-5409.2024/1.8

Maiboroda, R., & Sugakova, O. (2012). Statistics of mixtures with varying concentrations with application to DNA microarray data analysis. Journal of Nonparametric Statistics, 24(1), 201–215. https://doi.org/10.1080/10485252.2011.630076

Maiboroda, R., & Sugakova, O. (2020). Tests of hypotheses on quantiles of distributions of components in a mixture. Theory of Probability and Mathematical Statistics, 101, 179–191. https://doi.org/10.1090/tpms/1120

McLachlan, G., & Peel, D. (2000). Finite mixture models. Wiley. https://doi.org/10.1002/0471721182

Pidnebesna, A., Fajnerová, I., Horáček, J., & Hlinka, J. (2023). Mixture components inference for sparse regression: Introduction and application for estimation of neuronal signal from fMRI BOLD. Applied Mathematical Modelling, 116, 735–748. https://doi.org/10.1016/j.apm.2022.11.034

Silverman, B. W. (2018). Density estimation for statistics and data analysis. Routledge.

Stone, C. (1984). An asymptotically optimal window selection rule for kernel density estimates. The Annals of Statistics, 12(4), 1285–1297. https://doi.org/10.1214/aos/1176346792

Sugakova, O. (1999). Asymptotics of a kernel estimate for the density of a distribution constructed from observations of a mixture with varying concentration. Theory of Probability and Mathematical Statistics, 59, 161–171.

Titterington, D., Smith, A., & Makov, U. (1985). Statistical analysis of finite mixture distributions. Wiley.

Published

2025-12-23

Section

Algebra, Geometry and Probability Theory

How to Cite

Maiboroda, R., & Sugakova, O. (2025). Bandwidth selection for density estimation by mixture with varying concentrations. Bulletin of Taras Shevchenko National University of Kyiv. Physics and Mathematics, 81(2), 41-46. https://doi.org/10.17721/1812-5409.2025/2.6