Advanced Modeling & AI

Data set

# 181 rows: 95 with trained = 0 followed by 86 with trained = 1; injured is coded 0/1
d <- data.frame(
    id = 1:181,
    trained = c(rep(0, 95), rep(1, 86)),
    injured = c(rep(1, 12), rep(0, 83), rep(1, 22), rep(0, 64))
)
head(d, 5) |> knitr::kable()
id trained injured
1 0 1
2 0 1
3 0 1
4 0 1
5 0 1

Data set

xtabs(~ d$trained + d$injured)
         d$injured
d$trained  0  1
        0 83 12
        1 64 22
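
The counts in the table can also be read as injury proportions per training group. The following is a minimal sketch; the names `tab`, `props`, and `risk_diff` are introduced here for illustration only.

```r
# Rebuild the data set used on these slides
d <- data.frame(
    id = 1:181,
    trained = c(rep(0, 95), rep(1, 86)),
    injured = c(rep(1, 12), rep(0, 83), rep(1, 22), rep(0, 64))
)

# Cross-tabulation with trained in rows and injured in columns
tab <- xtabs(~ trained + injured, data = d)

# Row-wise proportions: share of injured / not injured within each group
props <- prop.table(tab, margin = 1)
print(round(props, 3))

# Difference in the proportion injured (trained = 1 minus trained = 0)
risk_diff <- props["1", "1"] - props["0", "1"]
print(risk_diff)
```

The same difference reappears further down as the difference in group means in the t-test and as the slope in the linear regression.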

Chi-squared test

chisq.test(x = d$trained, y = d$injured, correct = FALSE)

    Pearson's Chi-squared test

data:  d$trained and d$injured
X-squared = 4.9617, df = 1, p-value = 0.02591
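
To make the statistic less of a black box, it can be recomputed by hand from observed and expected counts. This is a sketch, not part of the original slides; `observed`, `expected`, `x2`, and `p` are illustrative names.

```r
# Observed counts from the cross-tabulation (rows: trained, columns: injured)
observed <- matrix(c(83, 12, 64, 22), nrow = 2, byrow = TRUE,
                   dimnames = list(trained = c("0", "1"), injured = c("0", "1")))

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson statistic without continuity correction
x2 <- sum((observed - expected)^2 / expected)

# Upper-tail p-value of the chi-squared distribution with 1 degree of freedom
p <- pchisq(x2, df = 1, lower.tail = FALSE)

print(c(X_squared = x2, p_value = p))
```

Both values should agree with the `chisq.test()` output above, since `correct = FALSE` switches off the Yates continuity correction.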

Spearman correlation

cor.test(x = d$trained, y = d$injured, method = "spearman")

    Spearman's rank correlation rho

data:  d$trained and d$injured
S = 824636, p-value = 0.02592
alternative hypothesis: true rho is not equal to 0
sample estimates:
     rho 
0.165568 

Pearson correlation

cor.test(x = d$trained, y = d$injured, method = "pearson")

    Pearson's product-moment correlation

data:  d$trained and d$injured
t = 2.2461, df = 179, p-value = 0.02592
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.02019806 0.30408239
sample estimates:
     cor 
0.165568 
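
For two 0/1 variables, the Spearman and Pearson coefficients coincide, and both equal the phi coefficient derived from the chi-squared statistic, sqrt(X²/n), up to sign. A short check, with `r`, `rho`, `x2`, and `phi` as illustrative names:

```r
# Rebuild the data set used on these slides
d <- data.frame(
    id = 1:181,
    trained = c(rep(0, 95), rep(1, 86)),
    injured = c(rep(1, 12), rep(0, 83), rep(1, 22), rep(0, 64))
)

# Pearson and Spearman correlation between the two binary variables
r   <- cor(d$trained, d$injured, method = "pearson")
rho <- cor(d$trained, d$injured, method = "spearman")

# Phi coefficient from the uncorrected chi-squared statistic
x2  <- unname(chisq.test(d$trained, d$injured, correct = FALSE)$statistic)
phi <- sqrt(x2 / nrow(d))

print(c(pearson = r, spearman = rho, phi = phi))
```

With only two distinct values per variable, ranking is just a linear rescaling, which is why the rank-based coefficient cannot differ from the Pearson one here.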

t-test

t.test(injured ~ trained, data = d, var.equal = TRUE)

    Two Sample t-test

data:  injured by trained
t = -2.2461, df = 179, p-value = 0.02592
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 -0.24326592 -0.01573041
sample estimates:
mean in group 0 mean in group 1 
      0.1263158       0.2558140 

ANOVA

aov(injured ~ trained, data = d) |> summary()
             Df Sum Sq Mean Sq F value Pr(>F)  
trained       1  0.757   0.757   5.045 0.0259 *
Residuals   179 26.856   0.150                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
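
With only two groups, the one-way ANOVA is the equal-variance t-test in disguise: the F value is the squared t statistic. A sketch that checks this on the slide data (`tt` and `f_val` are illustrative names):

```r
# Rebuild the data set used on these slides
d <- data.frame(
    id = 1:181,
    trained = c(rep(0, 95), rep(1, 86)),
    injured = c(rep(1, 12), rep(0, 83), rep(1, 22), rep(0, 64))
)

# Two-sample t-test with pooled variance
tt <- t.test(injured ~ trained, data = d, var.equal = TRUE)

# F value for the 'trained' term from the ANOVA table
f_val <- summary(aov(injured ~ trained, data = d))[[1]][["F value"]][1]

print(c(t_squared = unname(tt$statistic)^2, F = f_val))
```

The sign of t depends only on which group is subtracted from which, so it drops out when squaring.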

Linear regression

lm(injured ~ trained, data = d) |> summary()

Call:
lm(formula = injured ~ trained, data = d)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.2558 -0.2558 -0.1263 -0.1263  0.8737 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)  0.12632    0.03974   3.179  0.00174 **
trained      0.12950    0.05765   2.246  0.02592 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3873 on 179 degrees of freedom
Multiple R-squared:  0.02741,   Adjusted R-squared:  0.02198 
F-statistic: 5.045 on 1 and 179 DF,  p-value: 0.02592
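
The regression output ties the previous slides together: with a 0/1 predictor, the intercept is the mean of group 0, the slope is the difference in group means, and R² is the squared correlation. A minimal check, using illustrative names `fit`, `m0`, and `m1`:

```r
# Rebuild the data set used on these slides
d <- data.frame(
    id = 1:181,
    trained = c(rep(0, 95), rep(1, 86)),
    injured = c(rep(1, 12), rep(0, 83), rep(1, 22), rep(0, 64))
)

fit <- lm(injured ~ trained, data = d)

# Proportion injured in each group (mean of a 0/1 variable)
group_means <- tapply(d$injured, d$trained, mean)
m0 <- unname(group_means["0"])
m1 <- unname(group_means["1"])

# Coefficients next to the group mean and the difference in means
print(coef(fit))
print(c(mean_group_0 = m0, difference = m1 - m0))

# R-squared next to the squared Pearson correlation
print(c(r_squared = summary(fit)$r.squared,
        cor_squared = cor(d$trained, d$injured)^2))
```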

Plan for the coming weeks

Week         | Tuesday                      | Wednesday                       | Thursday
12-14 May    | L1: Linear model             | Ex1: Linear model               | no class (Ascension Day)
19-21 May    | Project work (independent)   | L2: Extended linear model       | Ex2: Extended linear model
26-28 May    | no class (Pentecost break)   | no class (Pentecost break)      | no class (Pentecost break)
2-4 Jun      | Hackathon: reproduce a paper | L3: Logistic regression         | no class (Corpus Christi)
9-11 Jun     | Ex3: Logistic regression     | L4: Machine learning            | Ex4: Machine learning
Visualization with Björn (part 1)
30 Jun-2 Jul | L5: AI & AI ethics           | Hackathon: prediction challenge | tba…