Basic Statistical Inference

Related modules: mqr.inference, mqr.inference.nonparametric, mqr.plot.confint
Detailed examples: nklsxn/mqr-guide

The inference module is largely an interface to other libraries. It contains functions that calculate sample sizes, confidence intervals and hypothesis tests. Where a function calls another library to calculate a result, its docstring names the library and the function that is called.

Structure of the `inference` module

The inference module, like the others in MQR, is structured to read a bit like a menu: choose the quantity of interest first, then the calculation. The basic parametric routines (i.e. everything except ANOVA and regression) are organised like this.

mqr.inference
├── dist
│   └── test_1sample
├── correlation
│   ├── confint
│   ├── test
│   └── test_diff
├── mean
│   ├── size_1sample
│   ├── size_2sample
│   ├── size_paired
│   ├── confint_1sample
│   ...
│   └── test_paired
└── stddev, variance, proportion, rate
    ├── size_1sample
    ...
    └── test_2sample

The non-parametric routines are organised like this.

mqr.inference.nonparametric
├── dist
│   ├── test_1sample
│   └── test_2sample
├── correlation
│   └── test
├── median
│   ├── test_1sample
│   └── test_nsample
├── quantile
│   ├── confint_1sample
│   └── test_1sample
└── variance
    └── test_nsample

Result types

All sample-size, confidence interval and hypothesis test results are shown as tables in notebooks.

Sample size

Sample-size calculations return a result of type mqr.inference.power.TestPower, which is shown in a notebook as:

import mqr

# Sample size for a one-sided, two-sample test of proportions
test_size = mqr.inference.proportion.size_2sample(
    p1=0.2,
    p2=0.4,
    alpha=0.05,
    beta=0.2,
    alternative='less')
test_size

Test Power
proportion
alpha	0.05
beta	0.2
effect	0.2 - 0.4 = -0.2
alternative	less
method	norm-approx
sample size	63.8621

The TestPower result has fields that correspond to the values in the table:

print(test_size)
print(test_size.sample_size)

TestPower(name='proportion', alpha=0.05, beta=0.2, effect='0.2 - 0.4 = -0.2', alternative='less', method='norm-approx', sample_size=np.float64(63.86207354042282))
63.86207354042282
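
The tabulated sample size can be checked by hand. The sketch below is not MQR's implementation. It assumes the common normal-approximation formula for a one-sided, two-sample test of proportions, with a pooled variance under the null hypothesis, and the function name `size_2sample_proportion` is hypothetical. Under those assumptions it reproduces the value in the table using only the Python standard library.

```python
import math
from statistics import NormalDist


def size_2sample_proportion(p1, p2, alpha, beta):
    """Per-sample size for a one-sided test of p1 vs p2 (illustrative only)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # critical value of the one-sided test
    z_beta = NormalDist().inv_cdf(1 - beta)    # quantile for the target power, 1 - beta
    p_bar = (p1 + p2) / 2                      # pooled proportion under the null
    null_sd = math.sqrt(2 * p_bar * (1 - p_bar))        # spread assumed under H0
    alt_sd = math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))   # spread assumed under H1
    return ((z_alpha * null_sd + z_beta * alt_sd) / (p1 - p2)) ** 2


n = size_2sample_proportion(p1=0.2, p2=0.4, alpha=0.05, beta=0.2)
print(round(n, 4))   # 63.8621
print(math.ceil(n))  # 64
```

The result is fractional, so in practice it would normally be rounded up to the next whole observation, here 64.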

Confidence intervals

Confidence interval calculations return a result of type mqr.inference.confint.ConfidenceInterval, which is shown as:

ci = mqr.inference.rate.confint_1sample(
    count=80,
    n=100,
    meas=1.0,
    conf=0.98)
ci

Confidence Interval
rate of events
98% (wald-cc)
value	lower	upper
0.8	0.587576	1.01372

Like a TestPower result, a confidence interval is an object with fields corresponding to the tabulated values. A confidence interval is also iterable, so its bounds can be unpacked directly.

lower, upper = ci
print(f'lower={lower}, upper={upper}')
print(f'bounded={ci.bounded}')
print(f'value={ci.value}')

lower=0.5875763737495605, upper=1.0137241005969062
bounded=both
value=0.8
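
The bounds above are consistent with a Wald interval that applies a continuity correction of half a count on each side. The following sketch is written under that assumption and is not MQR's implementation. The function name `rate_confint_wald_cc` is hypothetical, and the sketch covers only the case `meas=1.0` used in the example. It returns a plain tuple, so the bounds unpack in the same way as `lower, upper = ci`.

```python
import math
from statistics import NormalDist


def rate_confint_wald_cc(count, n, conf=0.95):
    """Two-sided Wald interval for an event rate, continuity corrected (illustrative only)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # two-sided critical value
    rate_lo = (count - 0.5) / n                   # rate shifted down by half a count
    rate_hi = (count + 0.5) / n                   # rate shifted up by half a count
    lower = rate_lo - z * math.sqrt(rate_lo / n)
    upper = rate_hi + z * math.sqrt(rate_hi / n)
    return lower, upper


lower, upper = rate_confint_wald_cc(count=80, n=100, conf=0.98)
print(f'lower={lower:.6f}, upper={upper:.6f}')
```

Rounded to six decimal places, the result agrees with the bounds printed above.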

Hypothesis tests

Hypothesis tests have type mqr.inference.hyptest.HypothesisTest, and they are shown like this.

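import numpy as np
import scipy.stats as st

np.random.seed(0)  # fix the seed so the samples are reproducible

# Three samples with different locations and scales
test = mqr.inference.nonparametric.median.test_nsample(
    st.uniform().rvs(20) - 0.4,
    st.uniform().rvs(20) + 0.2,
    st.uniform().rvs(20) * 2,
    method='kruskal-wallis',
    alternative='two-sided')
test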

Hypothesis Test
equality of medians
method	kruskal-wallis
H₀	median(x_i) == median(x_j)
H₁	median(x_i) != median(x_j)
statistic	25.2987
p-value	3.20966e-06

All the table rows are available as fields in the hypothesis test objects.

print(f'null-hypothesis: {test.null}')
print(f'statistic={test.stat}')
print(f'p-value={test.pvalue}')

null-hypothesis: median(x_i) == median(x_j)
statistic=25.298688524590176
p-value=3.2096641184463826e-06
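
Because the inference module is largely an interface to other libraries, a result like this can be cross-checked against a library directly. The sketch below assumes, for illustration only, that SciPy's `scipy.stats.kruskal` is a suitable reference for the Kruskal-Wallis test. The docstring of `test_nsample` states which library and function MQR actually calls. The samples are drawn with the same seed and in the same order as in the example above.

```python
import numpy as np
import scipy.stats as st

np.random.seed(0)  # same seed as the MQR example

# Draw the three samples in the same order as the arguments of the MQR call
x = st.uniform().rvs(20) - 0.4
y = st.uniform().rvs(20) + 0.2
z = st.uniform().rvs(20) * 2

# Kruskal-Wallis H-test computed directly with SciPy
result = st.kruskal(x, y, z)
print(f'statistic={result.statistic}')
print(f'p-value={result.pvalue}')
```

If the two calculations agree, the printed statistic and p-value should match the fields of the `test` object.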