1. (Frequentist) Statistics in Python
So far, we’ve dealt with data processing, handling, and visualisation. These steps are sometimes done ‘behind the scenes’ for early career researchers, who may be presented with a final dataset and asked to run statistical tests. But this hides a wealth of decision making and understanding of data from you. There is no substitute for working with data yourself - and once you’re done with that, it’s time to actually do some statistical inference.
Python is fully capable of running a range of statistical analyses on data. It is perhaps missing some of the nuanced statistical functionality that R has; but this is the price you pay for using the second-best language for everything.
There are two ways of approaching statistics in Python, and the approach will dictate the kinds of packages that are used. But much like our experience of plotting, these two approaches can work directly with one another.
The first approach (and the most low-level) is the use of the `scipy` and `statsmodels` packages, which provide a dizzying array of statistical functionality - they are very general, suited to statistics across many domains, and so may be confusing for psychologists. We will start with another package that aims to make traditional frequentist statistical analysis straightforward: `pingouin`.
`pingouin` does not come with Anaconda, so you will need to install it. If you are working in a Jupyter notebook, in a fresh cell, you can use a magic command to `pip install` it, like so:
%pip install pingouin
Check out the documentation of the `pingouin` package.
# Import numpy, pandas, pingouin, matplotlib, and seaborn
import numpy as np
import pandas as pd
import pingouin as pg
import matplotlib.pyplot as plt
import seaborn as sns
1.1. A t-test in Python
The most basic kind of analysis we can run is a t-test comparing two groups of participants on a single dependent variable. We have used the `tips` dataset before for data visualisation, and observed some differences - but we have not tested whether those differences are statistically significant. `pingouin` contains a function that allows us to answer this question - `pg.ttest()`.
# First load tips
tips = sns.load_dataset('tips')
display(tips.head())
| | total_bill | tip | sex | smoker | day | time | size |
|---|---|---|---|---|---|---|---|
0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 |
3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |
Do male and female customers differ in the amount of tips they give? Conducting a t-test on this is straightforward. We just need to pass the scores of the first group as the first argument to the function, and the scores of the second group as the second.
If we look at `tips` we can see our data requires subsetting for that to happen, as it is in long format.
# Subset the data and store before passing, using loc
females = tips.loc[tips['sex'] == 'Female', 'tip']
males = tips.loc[tips['sex'] == 'Male', 'tip']
# Conduct t test
pg.ttest(females, males)
| | T | dof | alternative | p-val | CI95% | cohen-d | BF10 | power |
|---|---|---|---|---|---|---|---|---|
T-test | -1.489536 | 215.707021 | two-sided | 0.137807 | [-0.6, 0.08] | 0.185494 | 0.414 | 0.282179 |
`pingouin` returns output in a DataFrame, including the T value, the degrees of freedom, and the p-value. It also reports an effect size (Cohen’s d), achieved power, and a Bayes Factor assessing the strength of evidence for the alternative hypothesis - more than enough information about your statistical test for reporting! Here we see a non-significant effect, with low power.
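A t-test also comes with assumptions - normality within each group, and homogeneity of variance. `pingouin` ships helper functions for these checks; a minimal sketch on the same data:

# Check t-test assumptions - Shapiro-Wilk per group, then Levene's test
pg.normality(data=tips, dv='tip', group='sex')
pg.homoscedasticity(data=tips, dv='tip', group='sex')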
1.2. A repeated-measures t-test in Python
By default, `pg.ttest()` assumes an independent samples test. A repeated measures comparison is achieved by setting the keyword argument `paired` to `True`. As an example, we could ask whether, for each meal, the cost of the meal was higher than the tip - you would hope that this is the case!
results = pg.ttest(tips['total_bill'], tips['tip'], paired=True)
display(results)
| | T | dof | alternative | p-val | CI95% | cohen-d | BF10 | power |
|---|---|---|---|---|---|---|---|---|
T-test | 32.646505 | 243 | two-sided | 8.020019e-91 | [15.77, 17.8] | 2.635205 | 1.222e+87 | 1.0 |
A very large difference, as you would expect.
1.3. Correlations with Python
Computing a correlation between a pair of variables is very straightforward. Simply pass the two variables to the `pg.corr()` function. For example, is there a correlation between the total bill and the tip?
corr_results = pg.corr(tips['total_bill'], tips['tip'])
display(corr_results)
| | n | r | CI95% | p-val | BF10 | power |
|---|---|---|---|---|---|---|
pearson | 244 | 0.675734 | [0.6, 0.74] | 6.692471e-34 | 4.952e+30 | 1.0 |
The output again provides a range of information - the n (from which your degrees of freedom are n - 2), the r value itself, a 95% confidence interval around it, the p value, power, and again a Bayes Factor. Squaring r gives the variance explained, r².
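By default `pg.corr()` computes a Pearson correlation, but the `method` keyword switches to rank-based alternatives, useful for ordinal data or data with outliers. A quick sketch:

# Spearman's rank correlation via the method keyword
pg.corr(tips['total_bill'], tips['tip'], method='spearman')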
It’s also worth noting that `pingouin` supports partial correlations, repeated measures correlations, and other convenience functions that make testing correlational hypotheses straightforward.
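A partial correlation, for example, is a single function call. A minimal sketch, controlling for the size of the dining party:

# Partial correlation between bill and tip, controlling for party size
pg.partial_corr(data=tips, x='total_bill', y='tip', covar='size')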
1.4. One way ANOVA
Things get more complicated once we move past simple tests and correlations, but thankfully `pingouin` makes things simple. Remember that a one way ANOVA examines differences between 3 or more groups on a single dependent variable. This is implemented in the `pg.anova` function.
Using the `tips` dataset, we can examine whether there are significant differences in the amount of tips received across the four different days the study was conducted (Thursday - Sunday).
# Conduct a one way ANOVA
one_way = pg.anova(data=tips, dv='tip', between='day', detailed=True)
display(one_way)
| | Source | SS | DF | MS | F | p-unc | np2 |
|---|---|---|---|---|---|---|---|
0 | day | 9.525873 | 3 | 3.175291 | 1.672355 | 0.173589 | 0.020476 |
1 | Within | 455.686604 | 240 | 1.898694 | NaN | NaN | NaN |
You specify the dataset, the dependent variable, and the between-measures (or grouping) factor. The `detailed` keyword gives more information back. The output should look familiar from software like SPSS, including df, MS, F, the p value, and an effect size of partial eta squared.
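If the groups have clearly unequal variances, a Welch ANOVA - which relaxes the equal-variances assumption - is available as a drop-in alternative. A sketch:

# Welch's ANOVA - does not assume equal variances across days
pg.welch_anova(data=tips, dv='tip', between='day')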
1.5. Two way ANOVA
If we want to add an additional factor, that is straightforward - pass a list of columns to the `between` keyword.
We can examine the amount of tips given as a consequence of the day, and the sex of the customer.
# Conduct the two way ANOVA
two_way = pg.anova(data=tips, dv='tip', between=['day', 'sex'])
display(two_way)
| | Source | SS | DF | MS | F | p-unc | np2 |
|---|---|---|---|---|---|---|---|
0 | day | 7.446900 | 3.0 | 2.482300 | 1.298061 | 0.275785 | 0.016233 |
1 | sex | 1.594561 | 1.0 | 1.594561 | 0.833839 | 0.362097 | 0.003521 |
2 | day * sex | 2.785891 | 3.0 | 0.928630 | 0.485606 | 0.692600 | 0.006135 |
3 | Residual | 451.306151 | 236.0 | 1.912314 | NaN | NaN | NaN |
1.6. Repeated measures designs - one and two way ANOVAs
Handling repeated measures in an ANOVA context is dealt with by the `pg.rm_anova` function.
The `bugs` dataset, where participants rate how much they want to kill an insect based on how frightening and disgusting they perceive it to be, is a good example of fully repeated measures data:
# Read in bugs from the OSF
bugs = pd.read_csv('https://osf.io/mrhjn/download')
display(bugs.head())
| | Subject | Gender | Region | Education | Lo D, Lo F | Lo D, Hi F | Hi D, Lo F | Hi D, Hi F |
|---|---|---|---|---|---|---|---|---|
0 | 1 | Female | North | some | 6.0 | 6.0 | 9.0 | 10.0 |
1 | 2 | Female | North | advance | 10.0 | NaN | 10.0 | 10.0 |
2 | 3 | Female | Europe | college | 5.0 | 10.0 | 10.0 | 10.0 |
3 | 4 | Female | North | college | 6.0 | 9.0 | 6.0 | 9.0 |
4 | 5 | Female | North | some | 3.0 | 6.5 | 5.5 | 8.5 |
Unfortunately, the data is in the wrong format for analysis - remember that repeated measures data often needs to be put into the ‘long’ format for analysis. The first step is to represent the data correctly.
# Melt the data
bugs_long = bugs.melt(id_vars=['Subject', 'Gender', 'Region', 'Education'],
value_vars=['Lo D, Lo F', 'Lo D, Hi F', 'Hi D, Lo F', 'Hi D, Hi F'],
var_name='Condition', value_name='Rating')
# Split the condition column into two, so as to have two variables
bugs_long[['Disgust_Level', 'Fright_Level']] = bugs_long['Condition'].str.split(', ', expand=True)
# Finally replace the text values
bugs_long.replace({'Disgust_Level': {'Lo D': 'Low', 'Hi D': 'High'}, 'Fright_Level': {'Lo F': 'Low', 'Hi F': 'High'}}, inplace=True)
display(bugs_long.head(), bugs_long.tail())
| | Subject | Gender | Region | Education | Condition | Rating | Disgust_Level | Fright_Level |
|---|---|---|---|---|---|---|---|---|
0 | 1 | Female | North | some | Lo D, Lo F | 6.0 | Low | Low |
1 | 2 | Female | North | advance | Lo D, Lo F | 10.0 | Low | Low |
2 | 3 | Female | Europe | college | Lo D, Lo F | 5.0 | Low | Low |
3 | 4 | Female | North | college | Lo D, Lo F | 6.0 | Low | Low |
4 | 5 | Female | North | some | Lo D, Lo F | 3.0 | Low | Low |
| | Subject | Gender | Region | Education | Condition | Rating | Disgust_Level | Fright_Level |
|---|---|---|---|---|---|---|---|---|
367 | 96 | Male | North | high | Hi D, Hi F | 10.0 | High | High |
368 | 97 | Female | North | NaN | Hi D, Hi F | 10.0 | High | High |
369 | 98 | Female | North | some | Hi D, Hi F | 10.0 | High | High |
370 | 99 | Female | North | some | Hi D, Hi F | 10.0 | High | High |
371 | 100 | Female | Europe | some | Hi D, Hi F | 3.0 | High | High |
The first step is to run a simple one way repeated measures ANOVA examining differences between disgust levels. Note that for a repeated measures ANOVA, a column denoting the participant number has to be specified.
rma = pg.rm_anova(data=bugs_long, dv='Rating', within='Disgust_Level', subject='Subject', detailed=True)
display(rma)
| | Source | SS | DF | MS | F | p-unc | np2 | eps |
|---|---|---|---|---|---|---|---|---|
0 | Disgust_Level | 27.485215 | 1 | 27.485215 | 12.043878 | 0.000793 | 0.115758 | 1.0 |
1 | Error | 209.952285 | 92 | 2.282090 | NaN | NaN | NaN | NaN |
This produces some familiar output - the numerator and denominator degrees of freedom, mean squares, F, the p value, an effect size, and epsilon, an index of sphericity.
Extending this to the two way case is straightforward - pass a list of factors to the `within` keyword.
rma2 = pg.rm_anova(data=bugs_long, dv='Rating', within=['Disgust_Level', 'Fright_Level'], subject='Subject')
display(rma2)
| | Source | SS | ddof1 | ddof2 | MS | F | p-unc | p-GG-corr | np2 | eps |
|---|---|---|---|---|---|---|---|---|---|---|
0 | Disgust_Level | 48.752841 | 1 | 87 | 48.752841 | 12.175190 | 7.623808e-04 | 7.623808e-04 | 0.122764 | 1.0 |
1 | Fright_Level | 177.556818 | 1 | 87 | 177.556818 | 41.629663 | 6.011447e-09 | 6.011447e-09 | 0.323640 | 1.0 |
2 | Disgust_Level * Fright_Level | 6.545455 | 1 | 87 | 6.545455 | 2.152300 | 1.459622e-01 | 1.459622e-01 | 0.024142 | 1.0 |
This function is capable of removing data with missing values automatically, and provides a similar set of outcomes, including a p value corrected for violations of sphericity.
1.7. Mixed ANOVA
Often, researchers will blend repeated measures and between-groups variables. For example, how do depression scores change between time one and time two (a repeated measure), and between people who received a new drug and those who received a placebo (a between-groups measure)? The `pg.mixed_anova()` function handles cases where a dataset contains a single repeated measures factor and a single between groups factor.
The `exercise` dataset used to illustrate `seaborn` plots in the last chapter has this property - individuals are on a low fat or no fat diet, and their pulse is measured after 1, 15, and 30 minutes of exercise. Are there significant differences, or an interaction, in the effects of these variables on heart rate?
# Load dataset straight from the internet!
exercise = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/exercise.csv')
display(exercise.head())
| | Unnamed: 0 | id | diet | pulse | time | kind |
|---|---|---|---|---|---|---|
0 | 0 | 1 | low fat | 85 | 1 min | rest |
1 | 1 | 1 | low fat | 85 | 15 min | rest |
2 | 2 | 1 | low fat | 88 | 30 min | rest |
3 | 3 | 2 | low fat | 90 | 1 min | rest |
4 | 4 | 2 | low fat | 92 | 15 min | rest |
# Submit to the mixed ANOVA function
mix = pg.mixed_anova(data=exercise, dv='pulse', within='time', between='diet', subject='id')
display(mix)
| | Source | SS | DF1 | DF2 | MS | F | p-unc | p-GG-corr | np2 | eps | sphericity | W-spher | p-spher |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | diet | 1261.877778 | 1 | 28 | 1261.877778 | 3.147101 | 0.086939 | NaN | 0.101040 | NaN | NaN | NaN | NaN |
1 | time | 2066.600000 | 2 | 56 | 1033.300000 | 11.807751 | 0.000053 | 0.000312 | 0.296619 | 0.746425 | False | 0.660281 | 0.002994 |
2 | Interaction | 192.822222 | 2 | 56 | 96.411111 | 1.101711 | 0.339395 | NaN | 0.037857 | NaN | NaN | NaN | NaN |
The results are a merger of the standard ANOVA and the repeated measures ANOVA, providing sphericity measures for the repeated measures factor.
1.8. ANCOVA
The final ANOVA design worth discussing is the ANCOVA. The ANCOVA allows for the addition of a covariate - a continuous variable whose variance the researcher wants to remove from the model before examining differences between the groups. This is handled by the `pg.ancova` function.
Currently, the function supports only between group designs, and not repeated measures.
As a simple example, we can compare whether males and females offer different tip amounts after controlling for the total cost of the bill.
# ANCOVA example
pg.ancova(data=tips, dv='tip', between='sex', covar='total_bill')
| | Source | SS | DF | F | p-unc | np2 |
|---|---|---|---|---|---|---|
0 | sex | 0.038803 | 1 | 0.036999 | 8.476290e-01 | 0.000153 |
1 | total_bill | 208.789002 | 1 | 199.082735 | 2.332173e-33 | 0.452376 |
2 | Residual | 252.749941 | 241 | NaN | NaN | NaN |
Additional covariates can be specified as a list.
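For example, a sketch that also controls for the size of the dining party:

# Two covariates passed as a list - total bill and party size
pg.ancova(data=tips, dv='tip', between='sex', covar=['total_bill', 'size'])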
1.9. Post hoc multiple comparisons
Now that you know how to run almost any kind of ANOVA design in Python, the next step is to explore follow-up tests. After all, while an ANOVA tells you there are significant differences, it offers no information on what drives them.
`pingouin` offers a range of follow up tests to explore your data, but the focus here is on `pg.pairwise_ttests`, which computes t-tests between all pairwise combinations of the levels of each variable in your data.
Let’s take a look at a dataset with an interaction. In the `exercise` dataset, there is an interaction between `diet` and the `kind` of exercise:
# Examine interaction in exercise
interaction = pg.anova(data=exercise, dv='pulse', between=['kind', 'diet'], detailed=True)
display(interaction)
| | Source | SS | DF | MS | F | p-unc | np2 |
|---|---|---|---|---|---|---|---|
0 | kind | 8326.066667 | 2 | 4163.033333 | 37.824471 | 1.935335e-12 | 0.473846 |
1 | diet | 1261.877778 | 1 | 1261.877778 | 11.465164 | 1.080516e-03 | 0.120098 |
2 | kind * diet | 815.755556 | 2 | 407.877778 | 3.705894 | 2.868384e-02 | 0.081081 |
3 | Residual | 9245.200000 | 84 | 110.061905 | NaN | NaN | NaN |
We can see the interaction is significant. How do we follow this up with `pingouin` and see what is causing it? We use `pg.pairwise_ttests` like so:
# Follow up interaction using comparisons at all levels
comparisons = pg.pairwise_ttests(data=exercise, dv='pulse', between=['kind', 'diet'])
display(comparisons)
| | Contrast | kind | A | B | Paired | Parametric | T | dof | alternative | p-unc | BF10 | hedges |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | kind | - | rest | running | False | True | -6.561162 | 58.0 | two-sided | 1.593561e-08 | 5.95e+05 | -1.672084 |
1 | kind | - | rest | walking | False | True | -2.674622 | 58.0 | two-sided | 9.704606e-03 | 4.824 | -0.681616 |
2 | kind | - | running | walking | False | True | 5.183378 | 58.0 | two-sided | 2.881236e-06 | 5136.555 | 1.320961 |
3 | diet | - | low fat | no fat | False | True | -2.457504 | 88.0 | two-sided | 1.595161e-02 | 3.0 | -0.513659 |
4 | kind * diet | rest | low fat | no fat | False | True | -1.434339 | 28.0 | two-sided | 1.625505e-01 | 0.742 | -0.509591 |
5 | kind * diet | running | low fat | no fat | False | True | -2.754828 | 28.0 | two-sided | 1.020405e-02 | 5.01 | -0.978734 |
6 | kind * diet | walking | low fat | no fat | False | True | -1.425097 | 28.0 | two-sided | 1.651816e-01 | 0.735 | -0.506308 |
We can change the keyword arguments `effsize` and `padjust` to control the returned effect size and the type of p-value adjustment, respectively - sometimes you need to adjust your alpha level when running lots of tests:
# More flexible use of function
pairs_bonf_cohen = pg.pairwise_ttests(data=exercise, dv='pulse', between=['kind', 'diet'],
padjust='bonf', effsize='cohen')
display(pairs_bonf_cohen)
| | Contrast | kind | A | B | Paired | Parametric | T | dof | alternative | p-unc | p-corr | p-adjust | BF10 | cohen |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | kind | - | rest | running | False | True | -6.561162 | 58.0 | two-sided | 1.593561e-08 | 4.780682e-08 | bonf | 5.95e+05 | -1.694085 |
1 | kind | - | rest | walking | False | True | -2.674622 | 58.0 | two-sided | 9.704606e-03 | 2.911382e-02 | bonf | 4.824 | -0.690584 |
2 | kind | - | running | walking | False | True | 5.183378 | 58.0 | two-sided | 2.881236e-06 | 8.643708e-06 | bonf | 5136.555 | 1.338343 |
3 | diet | - | low fat | no fat | False | True | -2.457504 | 88.0 | two-sided | 1.595161e-02 | NaN | NaN | 3.0 | -0.518087 |
4 | kind * diet | rest | low fat | no fat | False | True | -1.434339 | 28.0 | two-sided | 1.625505e-01 | 4.876516e-01 | bonf | 0.742 | -0.523747 |
5 | kind * diet | running | low fat | no fat | False | True | -2.754828 | 28.0 | two-sided | 1.020405e-02 | 3.061214e-02 | bonf | 5.01 | -1.005921 |
6 | kind * diet | walking | low fat | no fat | False | True | -1.425097 | 28.0 | two-sided | 1.651816e-01 | 4.955448e-01 | bonf | 0.735 | -0.520372 |
There are lots of additional, flexible ways the function works - it can handle one way, two way, and mixed ANOVA designs, just by specifying which variables are `between` and which are `within` (see the sketch below). Remember to choose the level of adjustment and the effect size that suit your needs - but with this function, you can explore interactions in sufficient detail.
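As a sketch of the mixed case, here are the pairwise tests for the earlier `exercise` design - `time` within, `diet` between, with Holm-corrected p values:

# Pairwise tests for a mixed design - specify within, between, and subject
pg.pairwise_ttests(data=exercise, dv='pulse', within='time', between='diet',
                   subject='id', padjust='holm')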
1.10. Linear regression
The final statistical test that is common in psychology is linear regression, where a continuous outcome variable (such as age or height) is predicted by other continuous variables (e.g. weight). Regression is among the most flexible of all analysis types - in fact, ANOVA is just a special case of regression.
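To see that last point concretely, here is a minimal sketch: dummy-code a two-group factor and regress on it, and the squared t value from the regression equals the one way ANOVA F (both assume equal variances, so this matches the classic, non-Welch test):

# ANOVA as regression: dummy-code 'sex' and regress tip on it
is_male = (tips['sex'] == 'Male').astype(int)
reg = pg.linear_regression(is_male, tips['tip'])
aov = pg.anova(data=tips, dv='tip', between='sex')
# the t value for the dummy predictor, squared, equals the ANOVA F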
Linear regression is handled by the `pg.linear_regression()` function. It returns somewhat different output to the ANOVA functions, containing the beta values of the regression (how much the outcome variable changes if the associated predictor increases by one unit), their p-values, and some information about the overall model fit.
In the following example, we attempt to build a linear model to predict the amount of loss suffered by an insurance company in a given car crash, based on the insurance premium of the driver, their alcohol consumption, and their speed at the time of the crash. This data is stored in the `car_crashes` dataset of `seaborn`.
# Load dataset
crash = sns.load_dataset('car_crashes')
display(crash.head())
| | total | speeding | alcohol | not_distracted | no_previous | ins_premium | ins_losses | abbrev |
|---|---|---|---|---|---|---|---|---|
0 | 18.8 | 7.332 | 5.640 | 18.048 | 15.040 | 784.55 | 145.08 | AL |
1 | 18.1 | 7.421 | 4.525 | 16.290 | 17.014 | 1053.48 | 133.93 | AK |
2 | 18.6 | 6.510 | 5.208 | 15.624 | 17.856 | 899.47 | 110.35 | AZ |
3 | 22.4 | 4.032 | 5.824 | 21.056 | 21.280 | 827.34 | 142.39 | AR |
4 | 12.0 | 4.200 | 3.360 | 10.920 | 10.680 | 878.41 | 165.63 | CA |
# Perform linear regression
lin_reg = pg.linear_regression(crash[['speeding', 'alcohol', 'ins_premium']], crash['ins_losses'])
display(lin_reg)
| | names | coef | se | T | pval | r2 | adj_r2 | CI[2.5%] | CI[97.5%] |
|---|---|---|---|---|---|---|---|---|---|
0 | Intercept | 58.321239 | 17.891849 | 3.259654 | 0.002078 | 0.388636 | 0.349613 | 22.327480 | 94.314997 |
1 | speeding | -0.297846 | 1.892755 | -0.157361 | 0.875634 | 0.388636 | 0.349613 | -4.105578 | 3.509885 |
2 | alcohol | 0.142754 | 2.234772 | 0.063878 | 0.949338 | 0.388636 | 0.349613 | -4.353028 | 4.638535 |
3 | ins_premium | 0.086772 | 0.016143 | 5.375042 | 0.000002 | 0.388636 | 0.349613 | 0.054295 | 0.119248 |
Unlike SPSS, `pingouin` does not return a significance test of the overall model fit. However, each predictor is returned with an associated coefficient, t value, and p value, and the overall \(R^2\) of the model is also reported. By default the function adds an intercept to the model via the `add_intercept` keyword - if you centre your variables by subtracting the mean from each (easily achieved in Pandas!) then you can set that to `False`.
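As a sketch of that workflow, centring both the predictors and the outcome (so the true intercept is exactly zero) before fitting without one:

# Centre each predictor and the outcome, then fit with no intercept term
predictors = crash[['speeding', 'alcohol', 'ins_premium']]
X_centred = predictors - predictors.mean()
y_centred = crash['ins_losses'] - crash['ins_losses'].mean()
pg.linear_regression(X_centred, y_centred, add_intercept=False)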
1.11. Power Analysis
Power - the probability that a statistical test will detect an effect when there is indeed an effect to find - can be calculated in a closed-form way (i.e. without the use of simulations) for some basic tests. `pingouin` has some great functionality for this. For example, calculating the sample size needed to detect a correlation of .10, with an alpha of .025 and power of .90, is easy:
# Correlation power
pg.power_corr(r=.10, alpha=.025, power=.90)
1235.0573893376597
Ouch.
There are a number of other power functions, as well as effect size estimator functions that can convert existing effect sizes (e.g. Cohen’s d) into others, or compute effect sizes from summary statistics in published papers (such as t or F values).
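For example, two quick sketches: leaving `n` unset so `pg.power_ttest` solves for the per-group sample size of an independent t-test, and converting a Cohen’s d into an r with `pg.convert_effsize`:

# Sample size per group for d = 0.5, alpha = .05, power = .80
pg.power_ttest(d=0.5, power=0.80, alpha=0.05)
# Convert a Cohen's d into a correlation coefficient
pg.convert_effsize(0.5, input_type='cohen', output_type='r')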
1.12. Reliability and beyond
There is also the capability to estimate statistics of measurement reliability, such as the intra-class correlation or Cronbach’s alpha. The intraclass correlation is shown below, using the `wine` dataset from `pingouin`:
# Read wine from pingouin
wine = pg.read_dataset('icc')
display(wine.head())
# Compute intra-class correlation within judges on their wine scores
pg.intraclass_corr(data=wine, targets='Wine',
raters='Judge',
ratings='Scores')
| | Wine | Judge | Scores |
|---|---|---|---|
0 | 1 | A | 1 |
1 | 2 | A | 1 |
2 | 3 | A | 3 |
3 | 4 | A | 6 |
4 | 5 | A | 6 |
| | Type | Description | ICC | F | df1 | df2 | pval | CI95% |
|---|---|---|---|---|---|---|---|---|
0 | ICC1 | Single raters absolute | 0.727521 | 11.680026 | 7 | 24 | 0.000002 | [0.43, 0.93] |
1 | ICC2 | Single random raters | 0.727689 | 11.786693 | 7 | 21 | 0.000005 | [0.43, 0.93] |
2 | ICC3 | Single fixed raters | 0.729487 | 11.786693 | 7 | 21 | 0.000005 | [0.43, 0.93] |
3 | ICC1k | Average raters absolute | 0.914384 | 11.680026 | 7 | 24 | 0.000002 | [0.75, 0.98] |
4 | ICC2k | Average random raters | 0.914450 | 11.786693 | 7 | 21 | 0.000005 | [0.75, 0.98] |
5 | ICC3k | Average fixed raters | 0.915159 | 11.786693 | 7 | 21 | 0.000005 | [0.75, 0.98] |
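Cronbach’s alpha, mentioned above, expects wide-format data with one column per item. As a purely illustrative sketch, treating the four rating conditions of the `bugs` dataset as ‘items’:

# Cronbach's alpha on wide data - each column is treated as an item
items = bugs[['Lo D, Lo F', 'Lo D, Hi F', 'Hi D, Lo F', 'Hi D, Hi F']]
pg.cronbach_alpha(data=items)  # returns (alpha, 95% CI)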
In most cases, classical statistical techniques are handled exceptionally well by `pingouin`, and there is a huge range of functionality for more advanced tasks (power, repeated measures correlations, etc.), and even statistical plotting functions. However, when things are limited, there are other options…