Basic Statistical Inference
- Related modules
- Detailed examples
The inference module is largely an interface to other libraries. It contains functions that calculate sample sizes, construct confidence intervals and perform hypothesis tests. Where a function calls another library to calculate a result, the function’s docstring names the library and the function that is called.
Structure of the inference module
The structure of the inference module (like others in MQR) is meant to read a bit like a menu. The basic (i.e. not ANOVA/regression) parametric routines are organised like this.
mqr.inference
├── dist
│   └── test_1sample
├── correlation
│   ├── confint
│   ├── test
│   └── test_diff
├── mean
│   ├── size_1sample
│   ├── size_2sample
│   ├── size_paired
│   ├── confint_1sample
│   ...
│   └── test_paired
└── stddev, variance, proportion, rate
    ├── size_1sample
    ...
    └── test_2sample
The non-parametric routines are organised like this.
mqr.inference.nonparametric
├── dist
│   ├── test_1sample
│   └── test_2sample
├── correlation
│   └── test
├── median
│   ├── test_1sample
│   └── test_nsample
├── quantile
│   ├── confint_1sample
│   └── test_1sample
└── variance
    └── test_nsample
Result types
All sample-size, confidence interval and hypothesis test results are shown as tables in notebooks.
Sample size
Sample-size calculations return a result of type mqr.inference.power.TestPower, which is shown in a notebook as:
import mqr

test_size = mqr.inference.proportion.size_2sample(
    p1=0.2,
    p2=0.4,
    alpha=0.05,
    beta=0.2,
    alternative='less')
test_size
Test Power | |
---|---|
proportion | |
alpha | 0.05 |
beta | 0.2 |
effect | 0.2 - 0.4 = -0.2 |
alternative | less |
method | norm-approx |
sample size | 63.8621 |
The TestPower result has fields that correspond to the values in the table:
print(test_size)
print(test_size.sample_size)
TestPower(name='proportion', alpha=0.05, beta=0.2, effect='0.2 - 0.4 = -0.2', alternative='less', method='norm-approx', sample_size=63.862073540422806)
63.862073540422806
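The tabulated sample size can be cross-checked by hand. The sketch below uses one common normal-approximation formula for comparing two proportions (pooled standard deviation under the null hypothesis, unpooled under the alternative) and only the Python standard library. The helper name `size_2sample_norm_approx` is hypothetical, and whether MQR evaluates exactly this expression internally is an assumption; for the inputs above it gives approximately 63.86 per group, in line with the table.

```python
from math import ceil, sqrt
from statistics import NormalDist


def size_2sample_norm_approx(p1, p2, alpha, beta, one_sided=True):
    """Per-group sample size for comparing two proportions (hypothetical helper)."""
    z = NormalDist().inv_cdf
    # One-sided tests put all of alpha in one tail; two-sided tests split it.
    z_alpha = z(1 - alpha) if one_sided else z(1 - alpha / 2)
    z_beta = z(1 - beta)
    p_bar = (p1 + p2) / 2
    # Standard deviation terms under H0 (pooled) and under H1 (unpooled).
    sd_null = sqrt(2 * p_bar * (1 - p_bar))
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return ((z_alpha * sd_null + z_beta * sd_alt) / (p1 - p2)) ** 2


n = size_2sample_norm_approx(p1=0.2, p2=0.4, alpha=0.05, beta=0.2)
print(n)        # fractional sample size per group
print(ceil(n))  # rounded up to a whole number of observations
```

The result is fractional, so in practice it would be rounded up before planning an experiment.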
Confidence intervals
Confidence interval calculations return a result of type mqr.inference.confint.ConfidenceInterval, which is shown as:
ci = mqr.inference.rate.confint_1sample(
    count=80,
    n=100,
    meas=1.0,
    conf=0.98)
ci
Confidence Interval | | |
---|---|---|
rate of events | | |
98% (wald-cc) | | |
value | lower | upper |
0.8 | 0.587576 | 1.01372 |
Like TestPower results, confidence intervals are objects with fields corresponding to the tabulated values.
Confidence intervals are also iterable, so the bounds can be unpacked easily.
lower, upper = ci
print(f'lower={lower}, upper={upper}')
print(f'bounded={ci.bounded}')
print(f'value={ci.value}')
lower=0.5875763737495605, upper=1.0137241005969062
bounded=both
value=0.8
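The bounds above can be reproduced by hand, which also suggests what "wald-cc" means here. The sketch assumes a Wald interval for a Poisson count with a continuity correction of 0.5 applied to the count on each side; that reading is inferred from the numbers in the table, not taken from MQR's source. The `RateInterval` class and `wald_cc_rate` function are hypothetical, standard-library stand-ins, and the class also illustrates how an `__iter__` method makes `lower, upper = ci` possible.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist


@dataclass
class RateInterval:
    """Hypothetical stand-in for a confidence interval result."""
    value: float
    lower: float
    upper: float

    def __iter__(self):
        # Yielding the bounds is what allows `lower, upper = interval`.
        return iter((self.lower, self.upper))


def wald_cc_rate(count, n, conf):
    """Two-sided Wald interval for an event rate, continuity corrected (assumed form)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    lo_count = count - 0.5  # shift the count down for the lower bound
    hi_count = count + 0.5  # shift the count up for the upper bound
    return RateInterval(
        value=count / n,
        lower=(lo_count - z * sqrt(lo_count)) / n,
        upper=(hi_count + z * sqrt(hi_count)) / n,
    )


interval = wald_cc_rate(count=80, n=100, conf=0.98)
lower, upper = interval
print(f'lower={lower}, upper={upper}, value={interval.value}')
```

The example above uses meas=1.0, so the measurement period is left out of this sketch.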
Hypothesis tests
Hypothesis tests have type mqr.inference.hyptest.HypothesisTest, and they are shown like this.
import numpy as np
import scipy.stats as st

np.random.seed(0)
test = mqr.inference.nonparametric.median.test_nsample(
    st.uniform().rvs(20) - 0.4,
    st.uniform().rvs(20) + 0.2,
    st.uniform().rvs(20) * 2,
    method='kruskal-wallis',
    alternative='two-sided')
test
Hypothesis Test | |
---|---|
equality of medians | |
method | kruskal-wallis |
H0 | median(x_i) == median(x_j) |
H1 | median(x_i) != median(x_j) |
statistic | 25.2987 |
p-value | 3.20966e-06 |
All the table rows are available as fields in the hypothesis test objects.
print(f'null-hypothesis: {test.null}')
print(f'statistic={test.stat}')
print(f'p-value={test.pvalue}')
null-hypothesis: median(x_i) == median(x_j)
statistic=25.298688524590176
p-value=3.2096641184463826e-06
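For reference, the Kruskal-Wallis statistic reported above can be sketched in a few lines. This is a minimal standard-library version that assumes no tied observations (so no tie correction) and exactly three groups, in which case the chi-squared reference distribution has two degrees of freedom and its upper tail is exp(-H/2). The function name `kruskal_wallis_3groups` is hypothetical; it is not part of MQR, which may delegate this calculation to another library.

```python
from math import exp


def kruskal_wallis_3groups(a, b, c):
    """Kruskal-Wallis H and p-value for three samples with no ties (sketch)."""
    groups = [list(a), list(b), list(c)]
    pooled = sorted(x for g in groups for x in g)
    # With no ties, each value's rank is its 1-based position in the pooled sort.
    rank = {x: i + 1 for i, x in enumerate(pooled)}
    n_total = len(pooled)
    # H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
    sum_term = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * sum_term - 3 * (n_total + 1)
    # Chi-squared survival function with 2 degrees of freedom.
    p = exp(-h / 2)
    return h, p


h, p = kruskal_wallis_3groups([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(h, p)

# Consistency check against the table above: with two degrees of freedom,
# the reported statistic maps to the reported p-value.
p_from_table = exp(-25.298688524590176 / 2)
print(p_from_table)
```

The last line shows that the statistic and p-value in the table are consistent with a chi-squared distribution with two degrees of freedom, as expected for three groups.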