Central Limit Theorem

Consider a sequence ${\displaystyle x_{1},x_{2},\ldots }$ of independent and identically distributed random variables with finite mean ${\displaystyle \mu }$ and finite variance ${\displaystyle \sigma ^{2}}$.

Consider also the sample mean

${\displaystyle M_{n}={1 \over {n}}\sum _{i=1}^{n}{x_{i}}}$
the sum
${\displaystyle S_{n}=x_{1}+x_{2}+\cdots +x_{n}}$
and the sequence of variables
${\displaystyle Z_{n}={{M_{n}-\mu } \over {\sigma /{\sqrt {n}}}}={{S_{n}-n\mu } \over {\sigma {\sqrt {n}}}}}$

Then ${\displaystyle Z_{n}{\xrightarrow {D}}Z}$, where ${\displaystyle Z\sim {N}(0;1)}$.

Alternatively:

• ${\displaystyle \lim _{n\to \infty }P(Z_{n}\leq {x})=\Phi (x)}$
• ${\displaystyle \lim _{n\to \infty }P(M_{n}\leq \mu +x{\sigma \over {\sqrt {n}}})=\Phi (x)}$
• ${\displaystyle \lim _{n\to \infty }P(S_{n}\leq {n}\mu +x{\sigma {\sqrt {n}}})=\Phi (x)}$

For ${\displaystyle n}$ sufficiently large (in practice ${\displaystyle n\geq 30}$), ${\displaystyle P(Z_{n}\leq {x})\approx \Phi (x)}$, and the following approximations hold:

• Standardization: ${\displaystyle Z_{n}\approx {N}(0;1)}$ for ${\displaystyle n\geq 30}$
• Mean: ${\displaystyle M_{n}\approx {N}(\mu ;{\sigma ^{2} \over {n}})}$
• Sum sequence: ${\displaystyle S_{n}\approx {N}(n\mu ;n\sigma ^{2})}$
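The three approximations above can be checked numerically. The following sketch (illustrative, not part of the original text; the distribution, seed, and sample sizes are arbitrary choices) draws many i.i.d. samples from a Uniform(0, 1) distribution, standardizes each sample mean, and verifies that the resulting ${\displaystyle Z_{n}}$ values behave like ${\displaystyle N(0;1)}$:

```python
import math
import random
import statistics

# Illustrative simulation: draw i.i.d. samples from Uniform(0, 1),
# standardize each sample mean, and check Z_n against N(0; 1).
random.seed(0)

n = 50          # sample size (satisfies the n >= 30 rule of thumb)
trials = 20000  # number of simulated samples

mu = 0.5                   # mean of Uniform(0, 1)
sigma = math.sqrt(1 / 12)  # standard deviation of Uniform(0, 1)

z_values = []
for _ in range(trials):
    m_n = statistics.fmean(random.random() for _ in range(n))  # M_n
    z_values.append((m_n - mu) / (sigma / math.sqrt(n)))       # Z_n

# If Z_n is approximately N(0; 1): empirical mean near 0,
# standard deviation near 1, and P(Z_n <= 1) near Phi(1) = 0.8413.
print(statistics.fmean(z_values))
print(statistics.stdev(z_values))
print(sum(z <= 1 for z in z_values) / trials)
```

Any distribution with finite variance would serve here; the uniform is convenient because its mean and variance are known in closed form.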

Corollary:

Consider the sequence

${\displaystyle Z_{n}^{*}={{M_{n}-\mu } \over {\sigma _{n}/{\sqrt {n}}}}={{S_{n}-n\mu } \over {\sigma _{n}{\sqrt {n}}}}}$

where ${\displaystyle \sigma _{n}}$ is a sample function (statistic) such that ${\displaystyle \sigma _{n}{\xrightarrow {a.s.}}\sigma }$ as ${\displaystyle n\rightarrow \infty }$. Then ${\displaystyle Z_{n}^{*}{\xrightarrow {D}}Z}$, where ${\displaystyle Z\sim {N}(0;1)}$.

Applications of the Central Limit Theorem

Approximation of binomial distribution

To approximate a binomial distribution with a normal one, we apply a continuity correction of ${\displaystyle 0.5}$, since a continuous distribution is being used to approximate a discrete one.

Let ${\displaystyle Y_{n}\sim \mathrm {Bin} (n,\theta )}$, so that ${\displaystyle E(Y_{n})=n\theta }$ and ${\displaystyle var(Y_{n})=n\theta (1-\theta )}$. By the CLT, ${\displaystyle P(Z_{n}\leq {x})\approx \Phi (x)}$, hence:

{\displaystyle {\begin{aligned}P(Y_{n}\leq x)&=P{\Bigl (}{Y_{n}-n\theta \over {\sqrt {n\theta (1-\theta )}}}\leq {x-n\theta \over {\sqrt {n\theta (1-\theta )}}}{\Bigr )}\\&=P{\Bigl (}Z_{n}\leq {x-n\theta \over {\sqrt {n\theta (1-\theta )}}}{\Bigr )}\\&\approx \Phi {\Bigl (}{x-n\theta +\overbrace {0.5} ^{\text{correction of continuity}} \over {\sqrt {n\theta (1-\theta )}}}{\Bigr )}\end{aligned}}}
For a single value, the normal approximation applied directly would give ${\displaystyle P(Y_{n}=y)=0}$, since a continuous distribution assigns zero probability to any single point. Instead:

{\displaystyle {\begin{aligned}P(Y_{n}=y)&\simeq P(y-0.5<Y_{n}\leq y+0.5)\\&\approx \Phi {\Bigl (}{y+0.5-n\theta  \over {\sqrt {n\theta (1-\theta )}}}{\Bigr )}-\Phi {\Bigl (}{y-0.5-n\theta  \over {\sqrt {n\theta (1-\theta )}}}{\Bigr )}\end{aligned}}}
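A small numerical sketch makes the effect of the continuity correction concrete. The parameter values below are arbitrary illustrative choices; the exact binomial CDF is computed by direct summation and ${\displaystyle \Phi }$ via the error function:

```python
import math

# Compare the exact binomial CDF with the normal approximation,
# with and without the 0.5 continuity correction.

def phi(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def binom_cdf(x, n, theta):
    """Exact P(Y_n <= x) for Y_n ~ Bin(n, theta)."""
    return sum(math.comb(n, k) * theta**k * (1 - theta)**(n - k)
               for k in range(x + 1))

n, theta, x = 40, 0.3, 14          # illustrative values
mean = n * theta                    # n*theta
sd = math.sqrt(n * theta * (1 - theta))

exact = binom_cdf(x, n, theta)
plain = phi((x - mean) / sd)        # no correction
corrected = phi((x + 0.5 - mean) / sd)  # with continuity correction

print(exact, plain, corrected)
```

The corrected value tracks the exact CDF noticeably better than the uncorrected one, which is the point of adding the ${\displaystyle 0.5}$.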

Assessment of the error of the sample mean

For assessing the error of the sample mean we consider the strong law of large numbers, ${\displaystyle M_{n}{\xrightarrow {a.s.}}\mu }$, and the CLT, ${\displaystyle M_{n}\approx N(\mu ;{\sigma ^{2} \over n})}$, as well as:

• Margin of error:
{\displaystyle {\begin{aligned}0.9973&=\Phi (3)-\Phi (-3)\\&=P{\Bigl (}-3\leq {M_{n}-\mu \over \sigma /{\sqrt {n}}}\leq +3{\Bigr )}\\&=P{\Bigl (}-3{\sigma \over {\sqrt {n}}}\leq M_{n}-\mu \leq +3{\sigma \over {\sqrt {n}}}{\Bigr )}\\&=P{\Bigl (}|M_{n}-\mu |\leq \overbrace {3{\sigma \over {\sqrt {n}}}} ^{\text{margin of error}}{\Bigr )}\\&=P{\Bigl (}M_{n}-3{\sigma \over {\sqrt {n}}}\leq \mu \leq M_{n}+3{\sigma \over {\sqrt {n}}}{\Bigr )}\\\end{aligned}}}
• Sample variance ${\displaystyle (s^{2}={\sigma _{n}}^{2})}$: we show that ${\displaystyle \sigma _{n}{\xrightarrow {a.s.}}\sigma }$.
{\displaystyle {\begin{aligned}{\sigma _{n}}^{2}&={1 \over {n-1}}\sum _{i=1}^{n}(x_{i}-M_{n})^{2}\\&={n \over {n-1}}{\Bigl (}{1 \over n}\sum _{i=1}^{n}(x_{i}-M_{n})^{2}{\Bigr )}\\&={n \over {n-1}}{\Bigl (}{1 \over n}\sum _{i=1}^{n}{x_{i}}^{2}-{M_{n}}^{2}{\Bigr )}\end{aligned}}}

• The ${\displaystyle {x_{i}}^{2}}$ are also independent and identically distributed, with

${\displaystyle var(x_{i})=E({x_{i}}^{2})-[E(x_{i})]^{2}}$
${\displaystyle \sigma ^{2}=E({x_{i}}^{2})-\mu ^{2}}$

${\displaystyle \Longrightarrow E({x_{i}}^{2})=\sigma ^{2}+\mu ^{2}}$

so, by the strong law of large numbers,

${\displaystyle {1 \over n}\sum _{i=1}^{n}{x_{i}}^{2}{\xrightarrow {a.s.}}E({x_{i}}^{2})=\sigma ^{2}+\mu ^{2}}$

• ${\displaystyle {M_{n}}^{2}{\xrightarrow {a.s.}}\mu ^{2}}$ (by the strong law of large numbers and continuity of ${\displaystyle x\mapsto x^{2}}$)

Hence:

• ${\displaystyle {\sigma _{n}}^{2}={n \over n-1}{\Bigl (}{1 \over n}\sum _{i=1}^{n}{x_{i}}^{2}-{M_{n}}^{2}{\Bigr )}}$
• ${\displaystyle {1 \over n}\sum _{i=1}^{n}{x_{i}}^{2}{\xrightarrow {a.s.}}\sigma ^{2}+\mu ^{2}}$
• ${\displaystyle {M_{n}}^{2}{\xrightarrow {a.s.}}\mu ^{2}}$
• ${\displaystyle {n \over n-1}\rightarrow 1}$
• ${\displaystyle {\sigma _{n}}^{2}{\xrightarrow {a.s.}}1\cdot (\sigma ^{2}+\mu ^{2}-\mu ^{2})=\sigma ^{2}}$
• ${\displaystyle \sigma _{n}{\xrightarrow {a.s.}}{\sqrt {\sigma ^{2}}}=\sigma }$
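The algebraic identity driving this derivation can be checked numerically. The sketch below (illustrative; the distribution, seed, and sample size are arbitrary) confirms that ${\displaystyle {n \over n-1}{\bigl (}{1 \over n}\sum {x_{i}}^{2}-{M_{n}}^{2}{\bigr )}}$ matches the usual sample variance and is close to ${\displaystyle \sigma ^{2}}$ for large ${\displaystyle n}$:

```python
import random
import statistics

# Numerical check of the identity used above:
# sigma_n^2 = n/(n-1) * ( (1/n) * sum(x_i^2) - M_n^2 )
# agrees with the usual sample variance and approaches sigma^2.
random.seed(1)

n = 100_000
xs = [random.gauss(2.0, 3.0) for _ in range(n)]  # mu = 2, sigma^2 = 9

m_n = statistics.fmean(xs)
identity_var = n / (n - 1) * (statistics.fmean(x * x for x in xs) - m_n ** 2)
sample_var = statistics.variance(xs)  # also uses the 1/(n - 1) divisor

print(identity_var, sample_var)  # both close to sigma^2 = 9
```

The two estimates coincide up to floating-point error, and both land near the true variance, as the almost-sure convergence above predicts.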

In conclusion, to assess the error one can use the following interval:

${\displaystyle {\Bigl (}M_{n}-3{s \over {\sqrt {n}}};M_{n}+3{s \over {\sqrt {n}}}{\Bigr )}}$

where

${\displaystyle s^{2}={1 \over n-1}\sum _{i=1}^{n}(x_{i}-M_{n})^{2}}$
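As a final check, the interval can be validated by simulation. The sketch below (illustrative; the true parameters, seed, and sample sizes are arbitrary choices) replaces ${\displaystyle \sigma }$ with the sample standard deviation ${\displaystyle s}$, as the corollary permits, and verifies that the interval covers ${\displaystyle \mu }$ in roughly 99.73% of repeated samples:

```python
import math
import random
import statistics

# Coverage check for the interval
# ( M_n - 3 s / sqrt(n) ; M_n + 3 s / sqrt(n) ),
# where s is the sample standard deviation.
random.seed(2)

mu, sigma = 5.0, 2.0  # true (in practice unknown) parameters
n = 200               # sample size
trials = 20000        # number of repeated samples

covered = 0
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    m_n = statistics.fmean(xs)
    s = statistics.stdev(xs)     # uses the 1/(n - 1) divisor
    half = 3 * s / math.sqrt(n)  # margin of error
    if m_n - half <= mu <= m_n + half:
        covered += 1

coverage = covered / trials
print(coverage)  # close to 0.9973
```

Estimating ${\displaystyle \sigma }$ by ${\displaystyle s}$ costs essentially nothing here: by the corollary, the standardized statistic still converges in distribution to ${\displaystyle N(0;1)}$.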