Sample
- class mqr.process.Sample(data, conf=0.95, ddof=1, name=None, num_display_fmt='#.5g')
Data and descriptive statistics for a single sample from a process.
Construct using a pandas Series (i.e., a column from a DataFrame). Intended for use by a Study object.
- Attributes:
- name : str
Name of the KPI or measurement.
- conf : float
Confidence level to use in confidence intervals.
- data : pd.Series
Sample measurements.
- ad_stat : float
Anderson-Darling normality test statistic.
- ad_pvalue : float
p-value associated with ad_stat.
- ks_stat : float
Kolmogorov-Smirnov goodness-of-fit (against a normal distribution) test statistic.
- ks_pvalue : float
p-value associated with ks_stat.
- nobs : int
Number of measurements in the sample.
- mean : float
Sample mean.
- sem : float
Standard error of the mean.
- std : float
Sample standard deviation.
- var : float
Sample variance.
- skewness : float
Skewness.
- kurtosis : float
Kurtosis.
- minimum : float
Smallest observation.
- quartile1 : float
25th percentile observation.
- median : float
Median observation.
- quartile3 : float
75th percentile observation.
- maximum : float
Largest observation.
- iqr : float
Inter-quartile range.
- conf_mean : ConfidenceInterval
Confidence interval on the mean.
- conf_var : ConfidenceInterval
Confidence interval on the variance.
- conf_quartile1 : ConfidenceInterval
Confidence interval on the 25th percentile.
- conf_median : ConfidenceInterval
Confidence interval on the median.
- conf_quartile3 : ConfidenceInterval
Confidence interval on the 75th percentile.
- outliers : array_like
List of points falling further from a quartile than 1.5 * iqr.
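To make the relationships between several of these attributes concrete, the following is a minimal sketch that uses only the Python standard library. It is not the mqr implementation: the helper name `describe`, the choice of linear interpolation for the quartiles, and the formula `sem = std / sqrt(nobs)` are assumptions made for illustration, and mqr may compute these quantities differently.

```python
import math
import statistics


def describe(values, ddof=1):
    """Hypothetical helper: a few descriptive statistics for a sequence of numbers."""
    xs = sorted(float(v) for v in values)
    n = len(xs)
    mean = statistics.fmean(xs)

    # Variance with "delta degrees of freedom": ddof=1 gives the usual
    # unbiased sample variance, ddof=0 the population variance.
    var = sum((x - mean) ** 2 for x in xs) / (n - ddof)
    std = math.sqrt(var)

    # Quartiles by linear interpolation between order statistics
    # (an assumed method; other quantile conventions give slightly
    # different values).
    q1, median, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1

    # Points further than 1.5 * iqr below the first quartile or above the
    # third quartile are treated as outliers.
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    return {
        "nobs": n,
        "mean": mean,
        "var": var,
        "std": std,
        "sem": std / math.sqrt(n),
        "minimum": xs[0],
        "quartile1": q1,
        "median": median,
        "quartile3": q3,
        "maximum": xs[-1],
        "iqr": iqr,
        "outliers": [x for x in xs if x < lower or x > upper],
    }


# Illustrative data only: six readings near 10 and one far away.
stats = describe([9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 13.5])
print(stats["outliers"])  # the 13.5 reading is the only point outside the fences
```

The dictionary keys mirror the attribute names listed above so the sketch can be read side by side with the class, but the values are only expected to agree with a Sample object to the extent that the assumptions above hold.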
Examples
In a Jupyter notebook, sample summaries are shown as HTML tables:
>>> import pandas as pd
>>> import mqr
>>> data = pd.read_csv(mqr.sample_data('study-random-5x5.csv'))
>>> mqr.process.Sample(data['KPI1'])
produces
KPI1
Normality (Anderson-Darling).
Stat            0.34261
P-value         0.48588
N               120
Mean            149.97
StdDev          1.1734
Variance        1.3768
Skewness        0.23653
Kurtosis        0.34012
Minimum         147.03
1st Quartile    149.22
Median          149.97
3rd Quartile    150.56
Maximum         153.27
N Outliers.     5
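As a rough cross-check of the summary above, the 1.5 * iqr outlier rule can be applied by hand to the displayed quartiles. The sketch below uses the rounded values exactly as printed, so it only approximates what the library computes from the full-precision data; the variable names are illustrative.

```python
# Values copied from the displayed summary (rounded to 5 significant figures).
minimum, quartile1, quartile3, maximum = 147.03, 149.22, 150.56, 153.27

# Inter-quartile range and the fences 1.5 * iqr beyond each quartile.
iqr = quartile3 - quartile1
lower_fence = quartile1 - 1.5 * iqr
upper_fence = quartile3 + 1.5 * iqr

# An observation outside the fences counts as an outlier under this rule.
has_low_outlier = minimum < lower_fence
has_high_outlier = maximum > upper_fence

print(f"iqr={iqr:.2f}, fences=({lower_fence:.2f}, {upper_fence:.2f})")
# prints: iqr=1.34, fences=(147.21, 152.57)
print(has_low_outlier, has_high_outlier)
# prints: True True
```

Both the minimum and the maximum fall outside the fences, which is consistent with the summary reporting a non-zero outlier count. The count itself cannot be recovered from the summary alone.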