Lecture 1: Introduction to Statistics (Chapter 1) Flashcards
Why are Statistics needed?
Statistics gives us the best set of tools for deciding whether a statement is supported by data
What is an inductive statement?
A statement whose truth can be assessed by collecting and analyzing data
Types of Statistical Analysis
- Descriptive Statistics
- Inferential Statistics
Descriptive Statistics?
- Numbers that are used to summarize and describe data
- Good at telling us what our data looks like
- NOT ABLE TO GENERALIZE
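A minimal sketch of descriptive statistics, assuming a made-up list of five exam scores (the values are hypothetical, not from the lecture). It summarizes the sample only and says nothing about a wider population.

```python
import statistics

# Hypothetical exam scores for a small sample (made-up values).
scores = [62, 70, 70, 75, 83]

mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle value once sorted
sd_score = statistics.stdev(scores)       # sample standard deviation (divides by n - 1)

print(mean_score, median_score, round(sd_score, 2))
```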
Data?
Information collected from a survey etc.
Inferential Statistics?
- Helps us generalize from our sample back up to our population
2 TYPES
1. T-stats
2. F-stats
T-stats?
used to determine if there is a difference between the means of 2 groups (or 2 time points)
F-stats?
used to determine if there is a difference among the means of more than 2 groups (or time points)
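A rough sketch of how the two statistics can be computed by hand, assuming the pooled-variance (Student) t for two independent groups and a one-way between-groups F. The group names and scores are made up, and the helper names `independent_t` and `one_way_f` are hypothetical.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error

def one_way_f(*groups):
    """One-way F statistic: between-group variance over within-group variance."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k, n_total = len(groups), len(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Made-up scores for three hypothetical groups.
control = [4, 5, 6, 5, 5]
placebo = [5, 6, 7, 6, 6]
treated = [6, 7, 8, 7, 7]

t_value = independent_t(treated, control)        # two groups
f_value = one_way_f(control, placebo, treated)   # three groups
print(round(t_value, 3), round(f_value, 3))
```

With only two groups the two statistics agree, since F equals t squared in that case.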
Difference between Population and a Sample
Population:
- All members of the group you want to draw conclusions about
e.g. All of York University
Sample:
- Subset of a population
e.g. Psychology students at York University
Sampling Bias:
The sample systematically differs from the population, so conclusions aren’t generalizable
e.g. taking only male psychology students at York may give a biased result
Sampling Error:
Discrepancy between what the sample tells us and what is true of the population
i.e. how “off” the sample’s estimate is
Random Sampling:
Every member of the population has an equal chance of being selected
Sample Size:
How big the sample is
LARGER RANDOM SAMPLE = LESS SAMPLING ERROR = MORE REPRESENTATIVE
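A small simulation sketch tying together random sampling, sampling error, and sample size. It assumes a made-up population of 10,000 normally distributed scores; the helper `average_sampling_error` and the sizes 10 and 400 are hypothetical choices for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up scores.
population = [random.gauss(100, 15) for _ in range(10_000)]
population_mean = statistics.mean(population)

def average_sampling_error(sample_size, repetitions=200):
    """Average absolute gap between a sample mean and the population mean."""
    errors = []
    for _ in range(repetitions):
        # random.sample gives every member an equal chance of selection.
        sample = random.sample(population, sample_size)
        errors.append(abs(statistics.mean(sample) - population_mean))
    return statistics.mean(errors)

small_error = average_sampling_error(10)
large_error = average_sampling_error(400)
print(small_error, large_error)
```

On average the larger samples land closer to the population mean, which is the sense in which they are more representative.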
Why would you use a more complex sampling?
Because you aren’t able to build the sample randomly
Stratified sampling:
- Dividing the population into subsets (strata) and selecting randomly from within each one
- NOT A SIMPLE RANDOM SAMPLE
e.g. splitting all York students into males and females, then randomly selecting from each group
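A minimal sketch of stratified sampling, assuming a made-up population of 1,000 students tagged by program. The helper `stratified_sample` and the choice of 20 per stratum are hypothetical.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical population: (student_id, program) pairs, made up for illustration.
population = [(i, "psychology" if i % 4 == 0 else "other") for i in range(1000)]

def stratified_sample(units, stratum_of, per_stratum):
    """Group units into strata, then draw a random sample within each stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(random.sample(members, per_stratum))
    return sample

sample = stratified_sample(population, lambda unit: unit[1], per_stratum=20)
psychology_count = sum(1 for _, program in sample if program == "psychology")
print(len(sample), psychology_count)
```

Selection is random inside each subset, but the number taken from each subset is fixed by the researcher.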
Convenience sampling:
Finding the easiest/most accessible participants, usually a follow-up survey
e.g. URPP
Different scales of measurement
- Nominal/Categorical Variables
- Ordinal Scale
- Interval Scale
- Ratio Scale
Differences between all scales of measurement
- Nominal: can be categorized
- Ordinal: AND ranked
- Interval: AND evenly spaced
- Ratio: AND has a natural zero
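One way to see the interval vs ratio difference is temperature, used here as an assumed example that is not from the lecture. Celsius is evenly spaced but its zero is arbitrary, while Kelvin has a natural zero, so only Kelvin ratios are meaningful.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15

cold_c, warm_c = 10.0, 20.0  # made-up temperatures

# Differences are meaningful on both scales (evenly spaced units).
celsius_gap = warm_c - cold_c
kelvin_gap = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cold_c)

# Ratios only make sense with a natural zero: 20 C is not "twice as hot" as 10 C.
celsius_ratio = warm_c / cold_c
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)

print(celsius_gap, kelvin_gap, celsius_ratio, round(kelvin_ratio, 3))
```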
Continuous vs Discrete Variables
Continuous variables can take any value within a range (e.g. height)
Discrete variables can only take separate, countable values (e.g. number of siblings)
Independent variable (IV)
The variable that explains outcomes
e.g. “x” in y=mx+b
Dependent variable (DV)
The variable that is being explained
e.g. “y” in y=mx+b
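A short sketch of the y=mx+b idea, assuming made-up data where hours studied is the IV (x) and exam score is the DV (y). The slope and intercept are found with the ordinary least-squares formulas; all variable names and values are hypothetical.

```python
import statistics

hours_studied = [1, 2, 3, 4, 5]    # IV (x), made-up values
exam_score = [52, 54, 56, 58, 60]  # DV (y), made-up values

mean_x = statistics.mean(hours_studied)
mean_y = statistics.mean(exam_score)

# Least-squares slope m and intercept b for y = m*x + b.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(hours_studied, exam_score))
         / sum((x - mean_x) ** 2 for x in hours_studied))
intercept = mean_y - slope * mean_x

print(slope, intercept)
```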
Confounding variable
Variables other than the IV that can affect the DV; you try to control them or randomize them away
e.g. factors other than “x” that can affect “y”
Reliability:
Tells us how consistent the measurements are
e.g. a weight scale that shows the same number each time you step on it
Validity:
Tells us whether the measurement accurately captures what it is supposed to measure
e.g. if we are holding a bag of potatoes when we step on a scale, the reading is not valid because it is not our real weight
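The scale example can be sketched with made-up numbers: readings that barely vary (reliable) but sit about 5 kg above an assumed true weight (not valid). The true weight and the readings are hypothetical.

```python
import statistics

true_weight = 70.0  # assumed real weight in kg (made up)

# Five made-up readings taken while holding a bag of potatoes.
readings = [75.1, 74.9, 75.0, 75.0, 75.0]

spread = statistics.stdev(readings)           # small spread: consistent, so reliable
bias = statistics.mean(readings) - true_weight  # large offset: not the real weight, so not valid

print(round(spread, 3), round(bias, 3))
```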
Types of Research Design
- Non-Experimental Designs
- Experimental Designs
Non-Experimental Designs
Correlational research
- Measuring the relationship between 2 variables
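As a sketch, the strength of a relationship between 2 variables is often summarized with Pearson's correlation coefficient r (an assumption here, since the card does not name a specific measure). The two variables and their values are made up.

```python
import math
import statistics

sleep_hours = [1, 2, 3, 4, 5]  # made-up variable 1
mood_rating = [2, 1, 4, 3, 5]  # made-up variable 2

mean_x = statistics.mean(sleep_hours)
mean_y = statistics.mean(mood_rating)

# Pearson r: shared variation divided by the product of each variable's variation.
sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sleep_hours, mood_rating))
sum_xx = sum((x - mean_x) ** 2 for x in sleep_hours)
sum_yy = sum((y - mean_y) ** 2 for y in mood_rating)
r = sum_xy / math.sqrt(sum_xx * sum_yy)

print(r)
```

A value near +1 or -1 indicates a strong relationship, but on its own it does not show that one variable causes the other.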
Experimental Design
When we want to know if one variable causes a change in another
- the IV is manipulated and the resulting outcomes are observed
Replication:
Experiments that use the same procedure as a previous one but with a new sample from the same population
Type 1 error:
When you conclude there is an effect but there is not
- False Positive
Type 2 error:
When you conclude there is no effect but there is one
- False Negative
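A simulation sketch of false positives, under assumptions not stated in the cards: two groups of 10 drawn from the same made-up population, a pooled-variance t-test, and the conventional two-tailed .05 cutoff (critical t of 2.101 for 18 degrees of freedom). Because there is no real difference, every "significant" result is a Type 1 error.

```python
import math
import random
import statistics

random.seed(3)  # fixed seed so the sketch is repeatable

CRITICAL_T = 2.101  # two-tailed .05 cutoff for df = 18 (assumed threshold)

def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error

def false_positive_rate(simulations=2000, group_size=10):
    """Share of simulated studies that wrongly find an effect."""
    false_positives = 0
    for _ in range(simulations):
        # Both groups come from the same population, so no true effect exists.
        group_a = [random.gauss(0, 1) for _ in range(group_size)]
        group_b = [random.gauss(0, 1) for _ in range(group_size)]
        if abs(independent_t(group_a, group_b)) > CRITICAL_T:
            false_positives += 1
    return false_positives / simulations

rate = false_positive_rate()
print(rate)
```

The printed rate should sit near 5%, the share of false positives this cutoff is designed to allow.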
Variables:
Factors that change
Constant:
Factors that don’t change
Methods of Data Collection:
- Independent Design
- Repeated-measures Design
Independent Design:
Manipulate the IV using different participants, with a different group taking part in each condition
Repeated-measures design:
Manipulate the IV using the same participants
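The two designs differ mainly in how the data are laid out. A minimal sketch, assuming made-up scores for a hypothetical caffeine vs placebo study with participant labels p1 to p6:

```python
import statistics

# Independent design: each made-up participant appears in exactly one condition.
independent_design = {
    "caffeine": {"p1": 12, "p2": 15, "p3": 14},
    "placebo": {"p4": 10, "p5": 13, "p6": 11},
}

# Repeated-measures design: every made-up participant appears in both conditions.
repeated_measures = {
    "p1": {"caffeine": 12, "placebo": 10},
    "p2": {"caffeine": 15, "placebo": 13},
    "p3": {"caffeine": 14, "placebo": 11},
}

# Independent design: compare the two group means.
group_gap = (statistics.mean(list(independent_design["caffeine"].values()))
             - statistics.mean(list(independent_design["placebo"].values())))

# Repeated measures: compare each participant with themselves.
differences = [s["caffeine"] - s["placebo"] for s in repeated_measures.values()]
mean_difference = statistics.mean(differences)

shared = set(independent_design["caffeine"]) & set(independent_design["placebo"])
print(group_gap, mean_difference, shared)
```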
Data Ethics:
Principles relating to all stages of working with data
Open Science:
Research that encourages collaboration and the sharing of methodologies and data
Data-Related Problems:
Replication Failures:
- Researchers are unable to reproduce/replicate findings
Problems with Data Collection:
- Researchers design studies and collect data in ways that help them get the results they want
(may not be the most accurate, but gives the results that they want)
Old-Fashioned Statistics:
- Traditional ways of analyzing data that can lead to inaccurate outcomes
HARKing:
Hypothesizing After the Results are Known
Preregistration:
Recommended open-science practice where researchers outline their design and analysis plan before conducting the study