Stochastic Complexity, Histograms and Hypothesis Testing of Homogeneity

Authors

  • Guoqi Qian, The University of Melbourne

Keywords:

Histogram density estimation, Minimum description length, Model selection, Quantization, Stochastic complexity, Test of homogeneity.

Abstract

Information contained in a sample of quantitative data may be summarized or described by a nonparametric histogram density function. An interesting question is how to construct such a histogram density so that it expresses the data information with minimum stochastic complexity. Stochastic complexity is another name for Rissanen's minimum description length (MDL), which gives the length of the decodable binary code sequence that results from optimally encoding the data information using a code-book based on a probability distribution. Here we derive an optimal generalized histogram density estimator that provides both predictive and non-predictive coding descriptions of a data sample. We also obtain uniform and almost sure asymptotic approximations for the lengths of both descriptions. As an application of this result to statistical inference, a new procedure for testing the homogeneity of distributions is proposed and is proved to have an asymptotic power of 1.
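
To illustrate the idea of choosing a histogram by description length, the following is a minimal sketch and not the generalized estimator derived in the paper. It assumes equal-width bins on a known interval, data recorded to a fixed precision, and the standard asymptotic cost of about (1/2) log2 n bits per free parameter. The function names and the `precision` and `max_bins` parameters are hypothetical, introduced only for this example.

```python
import math

def histogram_code_length(data, m, lo, hi, precision=0.01):
    """Approximate two-part code length (bits) of `data` under an
    m-bin equal-width histogram on [lo, hi]. Illustrative only."""
    n = len(data)
    width = (hi - lo) / m
    counts = [0] * m
    for x in data:
        # Clamp so that x == hi falls in the last bin.
        k = min(int((x - lo) / width), m - 1)
        counts[k] += 1
    # Data cost: -log2 of the maximum-likelihood histogram density,
    # with each point quantized to a cell of size `precision`
    # (assumed smaller than the bin width).
    data_bits = 0.0
    for c in counts:
        if c > 0:
            density = c / (n * width)
            data_bits -= c * math.log2(density * precision)
    # Model cost: m - 1 free bin probabilities, each charged
    # roughly (1/2) log2 n bits (an asymptotic approximation).
    model_bits = 0.5 * (m - 1) * math.log2(n)
    return data_bits + model_bits

def best_bin_count(data, lo, hi, max_bins=50):
    """Return the bin count in 1..max_bins with the shortest code length."""
    return min(range(1, max_bins + 1),
               key=lambda m: histogram_code_length(data, m, lo, hi))
```

Under these assumptions, evenly spread data are described most cheaply by a single bin, because extra bins add parameter cost without improving the fit, while data concentrated in separate clusters justify more bins.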

Published

2009-12-16

Section

Mathematical Statistics

How to Cite

Stochastic Complexity, Histograms and Hypothesis Testing of Homogeneity. (2009). European Journal of Pure and Applied Mathematics, 3(1), 51-80. https://www.ejpam.com/index.php/ejpam/article/view/521