Statistics Flashcards
Can you ever be sure about disproving a hypothesis?
No, you cannot be completely sure. However, you can be arbitrarily confident if the results are statistically significant.
What does it mean for a result to be statistically significant?
A result is called statistically significant if it is unlikely to have occurred by chance alone. Conventionally this means the p-value is below 0.05 (5%), although the threshold can be changed.
What is the p-value?
The p-value is the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true.
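As a worked illustration, here is a minimal sketch of an exact p-value for a coin-flip experiment. The null hypothesis is that the coin is fair, and "at least as extreme" is taken to mean "at least as many heads as observed" (a one-sided test). The function name `upper_tail_p_value` and the 9-out-of-10 scenario are made up for this example.

```python
from math import comb


def upper_tail_p_value(n_flips: int, n_heads: int, p_null: float = 0.5) -> float:
    """Probability of seeing n_heads or more heads in n_flips flips,
    assuming the null hypothesis (heads probability = p_null) is true."""
    return sum(
        comb(n_flips, k) * p_null**k * (1 - p_null) ** (n_flips - k)
        for k in range(n_heads, n_flips + 1)
    )


if __name__ == "__main__":
    # Hypothetical observation: 9 heads in 10 flips of a supposedly fair coin.
    p = upper_tail_p_value(10, 9)
    print(f"p-value = {p:.4f}")  # 11/1024, i.e. about 0.0107
```

Because 0.0107 is below the usual 0.05 threshold, this hypothetical result would count as statistically significant.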
What is a null hypothesis?
A null hypothesis is the default assumption that the test tries to reject. In practice it usually states that both data sets come from the same mechanism, whereas the alternative hypothesis, the one we are trying to support, states that they differ.
How do you prove or disprove something with stats?
- It is not possible to prove or disprove something with stats
- You can only reject the null hypothesis, and only when the result is statistically significant
- Otherwise the test “didn’t find a statistically significant difference” and “fails to reject the null hypothesis”
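The reject / fail-to-reject decision can be sketched with a permutation test, which is one of several possible tests and is used here only because it needs nothing beyond the standard library. The function names, the default threshold `alpha=0.05`, and the scores below are all illustrative assumptions, not real data.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Under the null hypothesis both groups come from the same mechanism,
    so the group labels are interchangeable and can be shuffled.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # The +1 correction keeps the estimate from ever being exactly zero.
    return (hits + 1) / (n_permutations + 1)


def decide(p_value, alpha=0.05):
    """Phrase the outcome the way the flashcard does."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


if __name__ == "__main__":
    # Made-up memorization scores, for illustration only.
    chocolate = [14, 16, 15, 18, 17, 19]
    no_reward = [11, 12, 13, 10, 14, 12]
    p = permutation_p_value(chocolate, no_reward)
    print(p, decide(p))
```

Note that neither branch of `decide` says anything was proven. A large p-value only means the test found no statistically significant difference.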
What is a research question?
A statement that identifies a phenomenon to be studied.
Ex: I believe that rewards improve memorization skills.
What is a hypothesis?
- A statement of the predicted relationship between at least two experimental variables.
- A provisional answer to a research question
- Ex: The group rewarded with chocolate will have a higher memorization score than the group with no reward
Independent vs dependent variable
The independent variable is the one the experimenter deliberately changes. The dependent variable is the outcome being studied, which is expected to change whenever the independent variable is altered.
What is a controlled variable?
The variables that are **kept constant** to prevent them from influencing the effect of the independent variable on the dependent variable. Ideally everything besides the dependent and independent variables is controlled.
What is a confounding variable?
An extraneous variable that correlates with both the dependent variable and the independent variable.
Example: Weather temperature correlates with both ice-cream sales and murders.
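The ice-cream example can be made concrete with a small simulation. This is only a sketch on synthetic data: the coefficients, noise levels, and function names are invented, and by construction ice-cream sales have no direct effect on murders. Both depend only on temperature, yet they still correlate strongly until temperature is accounted for.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


def simulate(n_days=2000, seed=0):
    """Synthetic data: temperature drives both outcomes independently."""
    rng = random.Random(seed)
    temperature = [rng.uniform(0, 30) for _ in range(n_days)]
    ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
    murders = [0.5 * t + rng.gauss(0, 2) for t in temperature]
    return temperature, ice_cream, murders


if __name__ == "__main__":
    temperature, ice_cream, murders = simulate()
    print("raw correlation:", pearson(ice_cream, murders))
    print(
        "after removing temperature:",
        pearson(residuals(ice_cream, temperature), residuals(murders, temperature)),
    )
```

The raw correlation is high, while the correlation between the residuals is close to zero, which is the signature of a confounder in this toy setup.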
The goal of experimental design
Experimental design aims to maximize your chances of finding the signal rather than the noise (noise being randomness, confounding variables, etc., which may show correlation but not causation).
Within vs. between subjects
Within = Every participant does every condition (everyone does A and B)
Between = Each participant does only one condition (some people do A, others do B)
Comparison of within vs. between experiments
Within pros:
+ Less user variation (between groups)
+ Statistical power with fewer participants
Between pros:
+ No biases from other conditions (e.g. transfer of learning from doing A before B)
What is counterbalancing?
A method of avoiding confounding among variables by presenting the conditions in a different order to different participants.
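One simple scheme is full counterbalancing, where every possible order is used equally often. The sketch below assumes participants are assigned orders round-robin. The function name `full_counterbalance` is hypothetical.

```python
from itertools import permutations


def full_counterbalance(conditions, n_participants):
    """Assign each participant one ordering of the conditions,
    cycling through every possible ordering in turn."""
    orders = list(permutations(conditions))
    return [orders[i % len(orders)] for i in range(n_participants)]


if __name__ == "__main__":
    for participant, order in enumerate(full_counterbalance(["A", "B"], 4), start=1):
        print(participant, order)
    # 1 ('A', 'B')
    # 2 ('B', 'A')
    # 3 ('A', 'B')
    # 4 ('B', 'A')
```

The orders are only perfectly balanced when the number of participants is a multiple of the number of orderings, which is n! for n conditions and grows quickly.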
How is a latin square used for counterbalancing?
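A Latin square is an n × n array filled with n different Latin letters, each occurring exactly once in each row and exactly once in each column, where each letter corresponds to a treatment/condition. Each row gives the order of conditions for one participant (or group of participants), so every condition appears in every position equally often. Varying the order in this way guards against order-related confounds such as transfer of learning.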
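A minimal sketch of one way to build such a square is to shift the list of conditions cyclically by one position per row. The function name `latin_square` is an assumption for this example, and cyclic shifting is only one of many valid constructions.

```python
def latin_square(conditions):
    """Build an n x n Latin square by cyclically shifting the conditions.

    Row i is the presentation order for participant (or group) i.
    """
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


if __name__ == "__main__":
    for row in latin_square(["A", "B", "C"]):
        print(" ".join(row))
    # A B C
    # B C A
    # C A B
```

A cyclic square balances the position of each condition, but not which condition comes immediately before which: in the square above, B always directly follows A except where A comes last. If such carry-over effects matter, a different construction may be needed.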