Week 2 Flashcards
Define population in terms of data? Example?
The complete set of objects or individuals of interest, e.g. all undergraduate students
Define sample in terms of data? Example?
A subgroup of a given population, e.g. undergraduate students in a specific module
What makes a good sample?
No cherry picking (do not hand-select only the data that suits the expected result)
No modifications to the data
Define variables in terms of data? Example?
Characteristics or properties that can take on more than one value and can change.
They can be numbers (e.g. age, height) or categories (e.g. colours). Qualitative variables are descriptive; quantitative variables are measurable numbers.
Two types of variables in experiments?
Independent variables - predictors
Dependent variables - outcomes
What does the independent variable show?
The value being changed or manipulated. It is controlled to see its effect on the observed outcome.
What does the dependent variable show?
The observed result, which may depend on the IV
What is a control variable?
A variable that is kept constant so that it does not influence the relationship between the IV and the DV
Four representative data types in statistics?
Nominal/Categorical
Ordinal
Interval
Ratio
Which data types are qualitative and which are quantitative?
Nominal and ordinal - qualitative
Interval and ratio - quantitative
What do levels mean for a variable? Example?
The number of distinct conditions an independent variable takes, e.g. a test of VO2 max using bikes, treadmills and rowing machines has one IV (machine type) with 3 levels
Define frequency in data?
How often a value appears
What is a histogram in data?
A visualisation of how data is distributed: bars show how often values fall within each range (bin)
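A quick way to see frequency and a histogram together is to count values and print one mark per occurrence. This is only a sketch: the heart-rate numbers are made up for illustration, and each distinct value is treated as its own bin, which a real histogram would usually widen into ranges.

```python
from collections import Counter

# Hypothetical data: resting heart rates (bpm) for 10 participants
heart_rates = [62, 65, 65, 70, 70, 70, 72, 75, 75, 80]

# Frequency: how often each value appears
frequency = Counter(heart_rates)

# Text histogram: one '#' per occurrence, values in ascending order
for value in sorted(frequency):
    print(f"{value:>3} | {'#' * frequency[value]}")
```

The longest row of marks is the value with the highest frequency.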
What is the mode?
The most frequently occurring value. It can be used for all variable types but is mainly used for nominal and ordinal data.
What is the median?
The middle value that divides the ordered data into 2 equal halves. It can only be found for ordered variables, so it excludes nominal data.
What is the mean? With what variables is it used?
The arithmetic average of the data set: the sum of all values divided by the number of values
Can only be defined for interval and ratio variables, as they are numerical
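The three measures of central tendency can be checked with Python's standard `statistics` module. A minimal sketch, assuming made-up test scores and made-up blood types; the variable names are illustrative only.

```python
import statistics

# Hypothetical numeric (ratio) data: test scores
scores = [3, 5, 5, 6, 8, 9, 13]

mode_score = statistics.mode(scores)      # most frequent value
median_score = statistics.median(scores)  # middle value of the ordered data
mean_score = statistics.mean(scores)      # sum of values / number of values

# Hypothetical nominal data: only the mode is meaningful here
blood_types = ["A", "O", "O", "B", "AB", "O"]
mode_blood_type = statistics.mode(blood_types)

print(mode_score, median_score, mean_score, mode_blood_type)
```

Note how the single large score (13) pulls the mean above the median.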
Define spread in data? Example?
How widely the values of a group are distributed, e.g. 60 coins in 3 stacks of 20 have a narrower spread than 60 coins in 6 stacks of 10
Types of spread and what they mean?
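Quantile - cut-off points that divide the ordered data into equal-sized sections
Quartile - quantiles when there are 4 sections (3 cut-off points)
Percentile - quantiles when there are 100 sections (99 cut-off points)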
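Quartiles and percentiles can be sketched with `statistics.quantiles`, which returns the cut-off points rather than the sections. The data below is made up, and the default `exclusive` method is assumed; other methods and other tools can give slightly different cut-offs for the same data.

```python
import statistics

# Hypothetical ordered data: the integers 1 to 11
data = list(range(1, 12))

# n=4 gives 3 cut-off points: 1st quartile, median, 3rd quartile
quartiles = statistics.quantiles(data, n=4)

# n=100 gives 99 cut-off points, one per percentile boundary
percentiles = statistics.quantiles(data, n=100)

print(quartiles)
print(len(percentiles))
```

The 50th percentile, the 2nd quartile and the median are all the same cut-off point.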
Define standard deviation?
The typical distance of data points from the mean (the square root of the variance). It shows how spread out the data is around the centre.
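Standard deviation can be worked through step by step and compared against the library result. This sketch assumes the population version (dividing by the number of data points); the sample version divides by one less. The data set is made up.

```python
import math
import statistics

# Hypothetical data set with mean 5
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)

# Variance: average squared distance from the mean
variance = sum((x - mean) ** 2 for x in data) / len(data)

# Standard deviation: square root of the variance
sd_manual = math.sqrt(variance)

# Same calculation using the standard library (population SD)
sd_library = statistics.pstdev(data)

print(mean, variance, sd_manual, sd_library)
```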
Define skewness?
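Measures the degree of asymmetry of a distribution. Positive skew: the peak sits to the left with a longer tail to the right. Negative skew: the peak sits to the right with a longer tail to the left. 0 means the distribution is symmetric.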
Define Kurtosis?
Measures the sharpness of the peak of a distribution and how heavy its tails are
What are the 2nd, 3rd and 4th moments?
2nd = variance
3rd = Skew
4th = Kurtosis
Define outliers? How did they occur?
Extreme values compared to the rest of the data. They usually occur from inaccuracies with the participant, the measurements or the processing.
How to determine outliers?
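- Based on z-score (distance of a value from the mean, divided by the SD): an outlier if it is more than 3 SD away from the mean
- Based on interquartile range (IQR) = width between 1st and 3rd quartile: an outlier if it is more than 1.5 x IQR above the 3rd quartile or more than 1.5 x IQR below the 1st quartile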
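Both outlier checks can be sketched in a few lines. The function names, the made-up data and the use of the population SD are assumptions for illustration; the quartiles use the default method of `statistics.quantiles`, so other software may place the fences slightly differently.

```python
import statistics

def z_score_outliers(values, threshold=3.0):
    """Values whose distance from the mean exceeds `threshold` SDs."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population SD; assumes sd > 0
    return [x for x in values if abs(x - mean) / sd > threshold]

def iqr_outliers(values, k=1.5):
    """Values more than k x IQR above Q3 or below Q1."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < lower or x > upper]

# Hypothetical data: 19 values near 50 and one extreme value
values = [48, 49, 50, 51, 52] * 3 + [50, 50, 50, 50] + [100]

z_result = z_score_outliers(values)
iqr_result = iqr_outliers(values)
print(z_result, iqr_result)
```

With very small data sets the z-score rule can never flag anything, because a single value cannot sit more than 3 SD from a mean it helps to define; the IQR rule does not have that limitation.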
What is a box plot? What does it show?
Summarises quartile-based statistics and displays the distribution of a data set:
Location of quartiles/data points - the box usually covers the main 50% (1st to 3rd quartiles)
Range/spread of data
Outliers detected by quartiles
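The numbers a box plot draws can be computed directly. This is a sketch under assumptions: the function name and data are made up, the whiskers are taken to end at the most extreme values that are not outliers (a common convention, not the only one), and outliers use the 1.5 x IQR rule.

```python
import statistics

def box_plot_summary(values):
    """Return the quantities a box plot displays, as a dictionary."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in values if lower_fence <= x <= upper_fence]
    return {
        "q1": q1,                    # bottom of the box
        "median": median,            # line inside the box
        "q3": q3,                    # top of the box
        "whisker_low": min(inside),  # lowest non-outlier value
        "whisker_high": max(inside), # highest non-outlier value
        "outliers": [x for x in values if x not in inside],
    }

# Hypothetical data with one extreme value
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30])
print(summary)
```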
Nominal Data? Example?
Categorical data with no order or ranking
E.g. blood type, gender, ethnicity.
Ordinal Data? Example?
Categorical data with clear, ordered ranking but intervals/difference between values are not necessarily equal
E.g. customer satisfaction (satisfied, very satisfied, etc.), education levels.
Interval Data? Example?
Numeric data with equal intervals/differences between values, but no true zero point (0 is still a meaningful value, not an absence of the quantity)
E.g. IQ, temperature in °C or °F
Ratio Data? Example?
Numeric data with both equal intervals/differences and a zero point (0 is nothing, empty value) allowing for ratios
E.g. height, weight, income, age.
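The link between the four data types and the measures of central tendency from the earlier cards can be written as a small lookup table. The names `VALID_MEASURES` and `is_valid` are made up for this sketch.

```python
# Which measures of central tendency are meaningful for each data type,
# following the mode/median/mean cards in this set
VALID_MEASURES = {
    "nominal": {"mode"},
    "ordinal": {"mode", "median"},
    "interval": {"mode", "median", "mean"},
    "ratio": {"mode", "median", "mean"},
}

def is_valid(measure, data_type):
    """True if `measure` is meaningful for the given data type."""
    return measure in VALID_MEASURES[data_type]

print(is_valid("mean", "ordinal"))    # ordinal data has no equal intervals
print(is_valid("median", "ordinal"))  # ordered, so a middle value exists
```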
What is the sum for the 2nd/3rd/4th moment? How to make it dimensionless?
Sum of (distance from mean)^2/3/4 over every data point, divided by the number of data points = 2nd/3rd/4th moment (the 2nd moment is the variance)
To make it dimensionless, divide the 3rd moment by SD^3 to get skewness and the 4th moment by SD^4 to get kurtosis.
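The moment calculation can be sketched as two small functions: one for the raw sum and one that divides by the matching power of the SD. The function names and the data are made up, and the population form (dividing by the number of data points) is assumed.

```python
def central_moment(values, k):
    """Average of (distance from mean)^k over all data points."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)

def standardised_moment(values, k):
    """Central moment divided by SD^k, which removes the units."""
    sd = central_moment(values, 2) ** 0.5  # SD is the root of the variance
    return central_moment(values, k) / sd ** k

# Hypothetical data set with mean 5 and a longer tail to the right
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = central_moment(data, 2)
skewness = standardised_moment(data, 3)
kurtosis = standardised_moment(data, 4)
print(variance, skewness, kurtosis)
```

The skewness comes out positive here because the largest value (9) sits further from the mean than the smallest (2). Dividing the 2nd moment by SD^2 always gives 1, which is why only skewness and kurtosis are usually reported in standardised form.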