Basic Statistical Inference

Related modules

mqr.inference
mqr.inference.nonparametric
mqr.plot.confint

Detailed examples

nklsxn/mqr-guide

The inference module is largely an interface to other libraries. It contains functions that calculate sample sizes, construct confidence intervals and perform hypothesis tests. Where a function calls another library to calculate a result, the function’s docstring names the library and the function that is called.

Structure of the inference module

The inference module, like the others in MQR, is structured to read like a menu. The basic parametric routines (i.e. excluding ANOVA and regression) are organised like this.

mqr.inference
├── dist
│   └── test_1sample
├── correlation
│   ├── confint
│   ├── test
│   └── test_diff
├── mean
│   ├── size_1sample
│   ├── size_2sample
│   ├── size_paired
│   ├── confint_1sample
│   ...
│   └── test_paired
└── stddev, variance, proportion, rate
    ├── size_1sample
    ...
    └── test_2sample

The non-parametric routines are organised like this.

mqr.inference.nonparametric
├── dist
│   ├── test_1sample
│   └── test_2sample
├── correlation
│   └── test
├── median
│   ├── test_1sample
│   └── test_nsample
├── quantile
│   ├── confint_1sample
│   └── test_1sample
└── variance
    └── test_nsample

Result types

All sample-size, confidence interval and hypothesis test results are shown as tables in notebooks.

Sample size

Sample-size calculations return a result of type mqr.inference.power.TestPower, which is shown in a notebook as:

import mqr

test_size = mqr.inference.proportion.size_2sample(
    p1=0.2,
    p2=0.4,
    alpha=0.05,
    beta=0.2,
    alternative='less')
test_size
Test Power
proportion
alpha 0.05
beta 0.2
effect 0.2 - 0.4 = -0.2
alternative less
method norm-approx
sample size 63.8621

The TestPower result has fields that correspond to the values in the table:

print(test_size)
print(test_size.sample_size)
TestPower(name='proportion', alpha=0.05, beta=0.2, effect='0.2 - 0.4 = -0.2', alternative='less', method='norm-approx', sample_size=63.862073540422806)
63.862073540422806
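
The reported sample size can be cross-checked by hand. The sketch below is not MQR's code. It assumes that the `norm-approx` method corresponds to the textbook normal-approximation formula for comparing two proportions, with a pooled variance under the null hypothesis; that reading reproduces the value above but is not confirmed by the guide. It uses only the Python standard library, and all variable names are illustrative.

```python
import math
from statistics import NormalDist

p1, p2 = 0.2, 0.4
alpha, beta = 0.05, 0.2

# One-sided alternative ('less'), so alpha is not halved.
z_alpha = NormalDist().inv_cdf(1 - alpha)
z_beta = NormalDist().inv_cdf(1 - beta)

# Assumed formula: pooled proportion under H0, separate variances under H1.
p_bar = (p1 + p2) / 2
null_sd = math.sqrt(2 * p_bar * (1 - p_bar))
alt_sd = math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))

n = (z_alpha * null_sd + z_beta * alt_sd) ** 2 / (p1 - p2) ** 2
n_whole = math.ceil(n)  # round up to a whole number of observations

print(n)        # approximately 63.862
print(n_whole)  # 64
```

The result is fractional, so in practice it is rounded up to the next whole number of observations.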

Confidence intervals

Confidence interval calculations return a result of type mqr.inference.confint.ConfidenceInterval, which is shown as:

ci = mqr.inference.rate.confint_1sample(
    count=80,
    n=100,
    meas=1.0,
    conf=0.98)
ci
Confidence Interval
rate of events
98% (wald-cc)
value lower upper
0.8 0.587576 1.01372

Like a TestPower result, a confidence interval is an object with fields corresponding to the tabulated values. Confidence intervals are also iterable, so the bounds can be unpacked directly.

lower, upper = ci
print(f'lower={lower}, upper={upper}')
print(f'bounded={ci.bounded}')
print(f'value={ci.value}')
lower=0.5875763737495605, upper=1.0137241005969062
bounded=both
value=0.8
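
The bounds above can be reproduced by hand. The sketch below assumes that `wald-cc` denotes a Wald (normal-approximation) interval on the event count with a continuity correction of 0.5, scaled by `n * meas`. That reading matches the printed bounds, but it is an assumption rather than something taken from MQR's source; the variable names are illustrative.

```python
import math
from statistics import NormalDist

count, n, meas, conf = 80, 100, 1.0, 0.98

# Two-sided interval: split the remaining probability between both tails.
z = NormalDist().inv_cdf(1 - (1 - conf) / 2)

exposure = n * meas
value = count / exposure

# Assumed continuity correction: move the count 0.5 away from the estimate
# before applying the Wald bound on each side.
lo_count = count - 0.5
hi_count = count + 0.5
lower = (lo_count - z * math.sqrt(lo_count)) / exposure
upper = (hi_count + z * math.sqrt(hi_count)) / exposure

print(value, lower, upper)  # approximately 0.8 0.587576 1.013724
```

Because the correction is applied separately on each side, the interval is not symmetric about the point estimate.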

Hypothesis tests

Hypothesis tests have type mqr.inference.hyptest.HypothesisTest, and they are shown like this.

import numpy as np
import scipy.stats as st

np.random.seed(0)

test = mqr.inference.nonparametric.median.test_nsample(
    st.uniform().rvs(20) - 0.4,
    st.uniform().rvs(20) + 0.2,
    st.uniform().rvs(20) * 2,
    method='kruskal-wallis',
    alternative='two-sided')
test
Hypothesis Test
equality of medians
method kruskal-wallis
H0 median(x_i) == median(x_j)
H1 median(x_i) != median(x_j)
statistic 25.2987
p-value 3.20966e-06

All the table rows are available as fields in the hypothesis test objects.

print(f'null-hypothesis: {test.null}')
print(f'statistic={test.stat}')
print(f'p-value={test.pvalue}')
null-hypothesis: median(x_i) == median(x_j)
statistic=25.298688524590176
p-value=3.2096641184463826e-06
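
The p-value can be checked against the statistic. Under the null hypothesis, the Kruskal-Wallis statistic is approximately chi-squared distributed with k - 1 degrees of freedom for k samples. That MQR relies on this approximation is an assumption, although it reproduces the value above. With three samples there are 2 degrees of freedom, for which the chi-squared upper tail has the closed form exp(-H/2), so the standard library is enough for this sketch.

```python
import math

stat = 25.298688524590176  # Kruskal-Wallis statistic reported above
k = 3                      # number of samples passed to the test
dof = k - 1

# The closed form below holds only for 2 degrees of freedom.
assert dof == 2
pvalue = math.exp(-stat / 2)

print(pvalue)  # approximately 3.2097e-06
```

For other numbers of samples the closed form does not apply, and a general chi-squared survival function would be needed instead.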