09 - Statistics and Reproducibility Flashcards

1
Q

Scientific studies and statistics

A
  • correlation = statistical relationship between A & B, easy to show
  • causation = implies an existing mechanism between A & B, very difficult to prove
  • observational study = large datasets, random population, not necessarily hypothesis-driven, rarely establishes causation
  • controlled study = most studies, treated groups vs control group, can establish causation
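A minimal sketch of why correlation alone does not show causation. It assumes a hypothetical hidden variable Z that drives both A and B; the names `pearson`, `z`, `a`, `b` and the noise levels are illustrative choices, not part of the card.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]      # hypothetical hidden confounder
a = [zi + rng.gauss(0, 0.5) for zi in z]     # A depends on Z only, not on B
b = [zi + rng.gauss(0, 0.5) for zi in z]     # B depends on Z only, not on A

r_raw = pearson(a, b)
# Subtract the confounder: what remains of A and B is independent noise.
r_adjusted = pearson(
    [ai - zi for ai, zi in zip(a, z)],
    [bi - zi for bi, zi in zip(b, z)],
)
print(f"r(A, B) = {r_raw:.2f}, after removing Z = {r_adjusted:.2f}")
```

A and B correlate strongly although neither acts on the other; once Z is accounted for, the correlation largely disappears.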
2
Q

Comparing groups

A
- intraclass correlation coefficient (ICC)
  ICC = σ²(among groups) / (σ²(among groups) + σ²(within groups))
- Student's t-distribution = Gaussian-like, used when the standard deviation is estimated from the sample
  t-test assumes unbiased observations + normally distributed data
- testing too many variables increases the number that come out significant by chance
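The ICC formula above can be sketched as a simple plug-in estimate. This assumes equal-sized groups and uses population variances; the function name `icc` and the toy data are illustrative, and ANOVA-based estimators used in practice differ in detail.

```python
import statistics


def icc(groups):
    """Plug-in ICC: var(among groups) / (var(among) + var(within groups))."""
    group_means = [statistics.mean(g) for g in groups]
    var_among = statistics.pvariance(group_means)
    # Average of the per-group variances (assumes equal group sizes).
    var_within = statistics.mean(statistics.pvariance(g) for g in groups)
    return var_among / (var_among + var_within)


well_separated = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
identical = [[1, 2, 3], [1, 2, 3]]

print(icc(well_separated))  # close to 1: most variance is between groups
print(icc(identical))       # 0: all variance is within groups
```

For the first dataset the variance among group means is 6 and the mean within-group variance is 2/3, which gives an ICC of 0.9.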
3
Q

Predicting and validating

A
  • a model maps inputs to outputs
  • /!\ overfitting
  • dataset divided into training + validation + testing sets
  • quantitative assessment:
    – parameter sweep (vary a parameter within reasonable bounds)
    – sensitivity S = ∆metric/∆param
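A minimal sketch of the split, sweep and sensitivity steps listed above. The 60/20/20 proportions, the function names and the toy `metric` (a stand-in for a validation error) are assumptions made for illustration.

```python
import random


def split_dataset(items, train=0.6, validation=0.2, seed=0):
    """Shuffle, then split into training, validation and testing subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


def metric(param):
    """Hypothetical validation error as a function of one parameter."""
    return (param - 3) ** 2


train_set, val_set, test_set = split_dataset(range(10))

# Parameter sweep: vary the parameter within reasonable bounds.
params = list(range(0, 7))
scores = [metric(p) for p in params]
best_param = params[scores.index(min(scores))]

# Sensitivity S = Δmetric / Δparam between consecutive sweep points.
sensitivities = [
    (scores[i + 1] - scores[i]) / (params[i + 1] - params[i])
    for i in range(len(params) - 1)
]

print(best_param, sensitivities)
```

The parameter is chosen on the validation set; the testing set stays untouched until the final assessment, which is what guards against overfitting to the sweep itself.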
4
Q

Reproducibility

A
  • measurement: do the results change when the measurement is repeated?
  • analysis: another person in another place gets the same results

5
Q

Unit testing

A
  • individual units of source code are tested to assess whether they are fit for use
  • test the happy path (valid input) and provoke failures (invalid input)
  • test-driven development (tests written before the code) or continuous integration (tests run every time something changes)
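A minimal sketch of a unit test covering both cases, using Python's standard `unittest` module. The function `mean` and the test names are hypothetical examples of a "unit"; they do not come from the card.

```python
import unittest


def mean(values):
    """Unit under test: arithmetic mean of a non-empty sequence."""
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


class MeanTest(unittest.TestCase):
    def test_happy_path(self):
        # Valid input should give the expected result.
        self.assertEqual(mean([1, 2, 3]), 2)

    def test_provoked_failure(self):
        # Invalid input should fail loudly rather than return nonsense.
        with self.assertRaises(ValueError):
            mean([])


# Run the tests without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(MeanTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a continuous integration setup, the same tests would be run automatically on every change.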
6
Q

Presenting the results

A
  • visualization is important
  • grammar of graphics
