Asymptotic distribution of the Pearson chi-square statistic

Image taken from ResearchGate.

Statistical Odds & Ends

I recently learned of a fairly succinct proof for the asymptotic distribution of the Pearson chi-square statistic (from Chapter 9 of Reference 1), which I share below.

First, the set-up: Assume that we have $latex n$ independent trials, and each trial ends in one of $latex J$ possible outcomes, which we label (without loss of generality) as $latex 1, 2, \dots, J$. Assume that for each trial, the probability of the outcome being $latex j$ is $latex p_j > 0$. Let $latex n_j$ denote the number of trials that result in outcome $latex j$, so that $latex \sum_{j=1}^J n_j = n$. Pearson’s $latex \chi^2$-statistic is defined as

$latex \begin{aligned} \chi^2 = \sum_{\text{cells}} \dfrac{(\text{obs} - \text{exp})^2}{\text{exp}} = \sum_{j=1}^J \dfrac{(n_j - np_j)^2}{np_j}. \end{aligned}$
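As a quick illustration of the definition, here is a minimal Python sketch that computes the statistic from observed counts and hypothesized cell probabilities. The function name `pearson_chi2` and the die-roll counts are made up for this example; they do not come from the original post.

```python
def pearson_chi2(counts, probs):
    """Pearson's chi-square statistic: sum over cells of (obs - exp)^2 / exp."""
    n = sum(counts)  # total number of trials
    return sum(
        (n_j - n * p_j) ** 2 / (n * p_j)  # (n_j - n p_j)^2 / (n p_j)
        for n_j, p_j in zip(counts, probs)
    )


# Hypothetical data: 60 rolls of a die, with all faces equally likely
# under the null, so the expected count in every cell is 10.
counts = [8, 12, 9, 11, 10, 10]
probs = [1 / 6] * 6

stat = pearson_chi2(counts, probs)
print(stat)  # approximately 1.0, i.e. (4 + 4 + 1 + 1 + 0 + 0) / 10
```

Here the cells with counts 8 and 12 each contribute 4/10, the cells with counts 9 and 11 each contribute 1/10, and the remaining two cells contribute nothing.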

Theorem. As $latex n \rightarrow \infty$, $latex \chi^2 \stackrel{d}{\rightarrow} \chi_{J-1}^2$, where $latex \stackrel{d}{\rightarrow}$ denotes convergence in distribution.
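The theorem can be sanity-checked by simulation. The sketch below is not part of the original post: the probabilities, sample size, number of replications and seed are arbitrary choices for illustration. It draws many multinomial samples with $latex J = 3$, computes the statistic for each, and compares the empirical mean, variance and upper tail with those of a $latex \chi_2^2$ distribution (mean 2, variance 4, and $latex P(\chi_2^2 > x) = e^{-x/2}$).

```python
import math
import random


def pearson_chi2(counts, probs):
    """Pearson's chi-square statistic: sum over cells of (obs - exp)^2 / exp."""
    n = sum(counts)
    return sum((n_j - n * p_j) ** 2 / (n * p_j) for n_j, p_j in zip(counts, probs))


def simulate_chi2(n, probs, reps, seed=0):
    """Return `reps` simulated values of the statistic, each from n trials."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    outcomes = range(len(probs))
    stats = []
    for _ in range(reps):
        counts = [0] * len(probs)
        # Draw n independent trials and tally how often each outcome occurs.
        for j in rng.choices(outcomes, weights=probs, k=n):
            counts[j] += 1
        stats.append(pearson_chi2(counts, probs))
    return stats


probs = [0.2, 0.3, 0.5]  # illustrative probabilities, J = 3 outcomes
stats = simulate_chi2(n=500, probs=probs, reps=2000)

mean = sum(stats) / len(stats)
var = sum((s - mean) ** 2 for s in stats) / (len(stats) - 1)

# For 2 degrees of freedom the chi-square tail is exp(-x / 2),
# so the 95th percentile is -2 * log(0.05), roughly 5.99.
cutoff = -2 * math.log(0.05)
tail = sum(s > cutoff for s in stats) / len(stats)

print(f"mean {mean:.3f} (limit: 2), variance {var:.3f} (limit: 4)")
print(f"P(stat > {cutoff:.2f}) is about {tail:.3f} (limit: 0.05)")
```

With these settings the three empirical quantities should land close to 2, 4 and 0.05 respectively, up to Monte Carlo error and the discreteness of the counts.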

Before proving the theorem, we prove a lemma that we will…


General chi-square tests

Image taken from Lifeder.

Statistical Odds & Ends

In this previous post, I wrote about the asymptotic distribution of the Pearson $latex \chi^2$ statistic. Did you know that the Pearson $latex \chi^2$ statistic (and the related hypothesis test) is actually a special case of a general class of $latex \chi^2$ tests? In this post we describe the general $latex \chi^2$ test. The presentation follows that in Chapters 23 and 24 of Ferguson (1996) (Reference 1). I’m leaving out the proofs, which can be found in the reference.

(Warning: This post is going to be pretty abstract! Nevertheless, I think it’s worth a post since I don’t think the idea is well-known.)

Let’s define some quantities. Let $latex Z_1, Z_2, \dots \in \mathbb{R}^d$ be a sequence of random vectors whose distribution depends on a $latex k$-dimensional parameter $latex \theta$ which lies in a parameter space $latex \Theta$. $latex \Theta$ is assumed to be a non-empty open subset…


GENERAL REMARKS ON THE STATISTICAL THEORY OF SAMPLE SURVEYS. PART II

ISADORE NABI

The header image was taken from QuestionPro.

The wealth of nations

Marx’s first sentence in Capital Volume One is: “The wealth of those societies in which the capitalist mode of production prevails, presents itself as an ‘immense accumulation of commodities’, its unit being a single commodity.” (Moore and Aveling translation). So, from the beginning, Marx makes a distinction between wealth in societies and how it appears […]


Kolmogorov’s strong law of large numbers

The strong law of large numbers (SLLN) is usually stated in the following way: Theorem: For $latex X_1, X_2, \dots$ such that the $latex X_i$’s are independent and identically distributed (i.i.d.) with finite mean $latex \mu$, as $latex n \rightarrow \infty$, $latex \frac{1}{n} \sum_{i=1}^n X_i \rightarrow \mu$ almost surely. What if the $latex X_i$’s are independent but not identically distributed? Can we say anything in that setting? We can if we add a […]
