Advanced Modeling & AI

Dataset

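# 181 observations: 95 untrained (12 injured), 86 trained (22 injured)
d <- data.frame(
  id = 1:181,
  trained = c(rep(0, 95), rep(1, 86)),
  injured = c(rep(1, 12), rep(0, 83), rep(1, 22), rep(0, 64))
)
# Show the first five rows as a table
head(d, 5) |> knitr::kable()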

id	trained	injured
1	0	1
2	0	1
3	0	1
4	0	1
5	0	1

Dataset

xtabs(~ d$trained + d$injured)

         d$injured
d$trained  0  1
        0 83 12
        1 64 22
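
The cross-tabulation above can also be read as injury rates per group. A minimal sketch, assuming the same counts as in `d` (rebuilt here without `id` so the block runs on its own); `tab` and `rates` are illustrative names that do not appear in the original slides:

```r
# Rebuild the data so this block is self-contained (same counts as above)
d <- data.frame(
  trained = c(rep(0, 95), rep(1, 86)),
  injured = c(rep(1, 12), rep(0, 83), rep(1, 22), rep(0, 64))
)

# Same cross-tabulation, using the data argument for shorter labels
tab <- xtabs(~ trained + injured, data = d)

# Row-wise proportions: share of not injured / injured within each group
rates <- prop.table(tab, margin = 1)
round(rates, 4)
```

The `injured = 1` column should give 12/95 for the untrained group and 22/86 for the trained group, which are the group means reported later in the t-test output.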

Chi-squared test

chisq.test(x = d$trained, y = d$injured, correct = FALSE)


    Pearson's Chi-squared test

data:  d$trained and d$injured
X-squared = 4.9617, df = 1, p-value = 0.02591
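
To see where this statistic comes from, the Pearson chi-squared value can be recomputed by hand from the 2x2 table. This is a sketch under the assumption that the counts are typed in directly from the cross-tabulation; `tab`, `expected`, `x2`, and `p` are illustrative names:

```r
# Observed counts (rows: trained 0/1, columns: injured 0/1);
# matrix() fills column by column
tab <- matrix(
  c(83, 64, 12, 22),
  nrow = 2,
  dimnames = list(trained = c("0", "1"), injured = c("0", "1"))
)

# Expected counts under independence: row total * column total / n
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Pearson statistic without continuity correction
x2 <- sum((tab - expected)^2 / expected)

# Upper-tail p-value from the chi-squared distribution with 1 df
p <- pchisq(x2, df = 1, lower.tail = FALSE)

c(X_squared = x2, p_value = p)
```

Up to rounding, `x2` and `p` should agree with the `chisq.test()` output above, which was run with `correct = FALSE`, that is, without Yates' continuity correction.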

Spearman correlation

cor.test(x = d$trained, y = d$injured, method = "spearman", exact = FALSE)  # ties rule out an exact p-value


    Spearman's rank correlation rho

data:  d$trained and d$injured
S = 824636, p-value = 0.02592
alternative hypothesis: true rho is not equal to 0
sample estimates:
     rho 
0.165568

Pearson correlation

cor.test(x = d$trained, y = d$injured, method = "pearson")


    Pearson's product-moment correlation

data:  d$trained and d$injured
t = 2.2461, df = 179, p-value = 0.02592
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.02019806 0.30408239
sample estimates:
     cor 
0.165568
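
For two binary variables the rank transformation is only a linear rescaling, so Spearman's rho and Pearson's r coincide (both equal the phi coefficient). The sketch below checks this and links r to the chi-squared and t statistics shown in the outputs. It assumes the same counts as in `d`, rebuilt here so the block runs on its own; `r_pearson`, `r_spearman`, `x2_from_r`, and `t_from_r` are illustrative names:

```r
# Rebuild the data so this block is self-contained (same counts as above)
d <- data.frame(
  trained = c(rep(0, 95), rep(1, 86)),
  injured = c(rep(1, 12), rep(0, 83), rep(1, 22), rep(0, 64))
)
n <- nrow(d)

r_pearson  <- cor(d$trained, d$injured, method = "pearson")
r_spearman <- cor(d$trained, d$injured, method = "spearman")

# Identical up to floating-point tolerance
all.equal(r_pearson, r_spearman)

# Link to the chi-squared statistic of a 2x2 table: X^2 = n * r^2
x2_from_r <- n * r_pearson^2

# Link to the t statistic: t = r * sqrt((n - 2) / (1 - r^2))
t_from_r <- r_pearson * sqrt((n - 2) / (1 - r_pearson^2))

c(r = r_pearson, X_squared = x2_from_r, t = t_from_r)
```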

t-test

t.test(injured ~ trained, data = d, var.equal = TRUE)


    Two Sample t-test

data:  injured by trained
t = -2.2461, df = 179, p-value = 0.02592
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 -0.24326592 -0.01573041
sample estimates:
mean in group 0 mean in group 1 
      0.1263158       0.2558140

ANOVA

aov(injured ~ trained, data = d) |> summary()

             Df Sum Sq Mean Sq F value Pr(>F)  
trained       1  0.757   0.757   5.045 0.0259 *
Residuals   179 26.856   0.150                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
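
With only two groups, a one-way ANOVA and the equal-variance t-test are the same test: the F value is the square of the t statistic, and the p-values match. A minimal sketch to check this, assuming the same counts as in `d` (rebuilt here so the block runs on its own); `tt`, `aov_tab`, `t_stat`, and `f_stat` are illustrative names:

```r
# Rebuild the data so this block is self-contained (same counts as above)
d <- data.frame(
  trained = c(rep(0, 95), rep(1, 86)),
  injured = c(rep(1, 12), rep(0, 83), rep(1, 22), rep(0, 64))
)

# Two-sample t-test with pooled variance
tt <- t.test(injured ~ trained, data = d, var.equal = TRUE)
t_stat <- unname(tt$statistic)

# ANOVA table; the first row belongs to the 'trained' term
aov_tab <- summary(aov(injured ~ trained, data = d))[[1]]
f_stat <- aov_tab[["F value"]][1]

# F should equal t^2, and both p-values should coincide
c(t_squared = t_stat^2, F = f_stat)
c(p_t = tt$p.value, p_F = aov_tab[["Pr(>F)"]][1])
```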

Linear regression

lm(injured ~ trained, data = d) |> summary()


Call:
lm(formula = injured ~ trained, data = d)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.2558 -0.2558 -0.1263 -0.1263  0.8737 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)  0.12632    0.03974   3.179  0.00174 **
trained      0.12950    0.05765   2.246  0.02592 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3873 on 179 degrees of freedom
Multiple R-squared:  0.02741,   Adjusted R-squared:  0.02198 
F-statistic: 5.045 on 1 and 179 DF,  p-value: 0.02592
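
With a single 0/1 predictor, the regression coefficients have a direct reading: the intercept is the mean of `injured` in the untrained group, and the slope is the difference between the two group means. R-squared is the squared Pearson correlation. A sketch to verify this, assuming the same counts as in `d` (rebuilt here so the block runs on its own); `fit`, `mean0`, and `mean1` are illustrative names:

```r
# Rebuild the data so this block is self-contained (same counts as above)
d <- data.frame(
  trained = c(rep(0, 95), rep(1, 86)),
  injured = c(rep(1, 12), rep(0, 83), rep(1, 22), rep(0, 64))
)

fit <- lm(injured ~ trained, data = d)

# Group means of the outcome
mean0 <- mean(d$injured[d$trained == 0])
mean1 <- mean(d$injured[d$trained == 1])

# Intercept vs. mean of group 0, slope vs. difference in means
rbind(
  lm_coefficients = unname(coef(fit)),
  from_group_means = c(mean0, mean1 - mean0)
)

# R-squared vs. squared Pearson correlation
c(r_squared = summary(fit)$r.squared,
  cor_squared = cor(d$trained, d$injured)^2)
```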

Plan for the coming weeks

	Tuesday	Wednesday	Thursday
12.-14.05	L1: Linear model	Ex1: Linear model	no class (Ascension Day)
19.-21.05	Project work (independent)	L2: Extended linear model	Ex2: Extended linear model
26.-28.05	no class (Pentecost break)	no class (Pentecost break)	no class (Pentecost break)
02.-04.06	Hackathon: Reproducing a paper	L3: Logistic regression	no class (Corpus Christi)
09.-11.06	Ex3: Logistic regression	L4: Machine learning	Ex4: Machine learning
…	Visualization with Björn (Part 1)	…	…
30.-02.07	L5: AI & AI ethics	Hackathon: Prediction challenge	tba…