Stats Flashcards
Process of Research in Conducting Statistics
- First determine average results
- Then individual variations
- Then ethical reporting - full disclosure is crucial for accurate interpretation, giving other researchers the chance to replicate the study
The Ethical Imperative: Why Understanding Stats Matters
- Transparency and accountability
- Advancing the field
- Ethical data practices - crucial for maintaining public trust and avoiding misrepresentation and unintended bias in the measurement of psychological traits
Implications of Misreporting
- Overgeneralisation - misleading one-size-fits-all impression of therapy effectiveness
- Patient harm - wasted time on ineffective treatments
- Research mistrust - damages credibility of psychological studies
- Ethical responsibility - researchers must present complete picture, including limitations
Measures of Central Tendency
- mean - the most common, though the others may sometimes be more appropriate (see the sketch after this list)
- median
- mode
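A minimal Python sketch of the three measures, using invented scores; it also shows why the median can be more appropriate than the mean when an outlier is present.

```python
# Invented scores; 21 is an outlier.
import statistics

scores = [2, 3, 3, 4, 5, 5, 5, 7, 21]

print(statistics.mean(scores))    # 6.11... - pulled upward by the outlier
print(statistics.median(scores))  # 5 - robust to the outlier
print(statistics.mode(scores))    # 5 - the most frequent score
```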
Measures of Dispersion/Variability
- Range
- Variance
- Standard deviation
- Interquartile range
Histograms and Bar Charts
- Graphs for understanding data
- How often the data appears - histogram
- Compare the magnitude of different categories - bar charts
Boxplots
Give the median value, the interquartile range and the spread of the data
- Can reveal key characteristics such as presence of skewness, extent of variability
Scatterplots and Correlations
- Relationships between two variables
- Trends, clusters and outliers
Importance of Data Cleaning and Preparation
- Identifying errors - mistakes that might skew the data
- Handling missing data - choosing appropriate methods to handle it
- Standardised formats - ensuring all data is in the same format so it can be compared
- Transforming variables - apply necessary transformations to meet stat assumptions
Strategies for Handling Missing Data
- Imputation - replace missing values with estimates based on patterns in the existing data (see the sketch after this list)
- Listwise deletion - remove any cases with missing data (this can reduce statistical power and introduce bias if the missingness is not random)
- Multiple imputation - generate multiple plausible values for each missing data point to account for uncertainty, then pool the results
- Analysis of missingness - investigate the patterns and mechanisms behind missing data to select the most appropriate handling method
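A rough sketch contrasting the first two strategies; the participant IDs and scores below are invented for illustration.

```python
import statistics

# None marks a missing score.
data = {"p1": 10, "p2": None, "p3": 14, "p4": 12, "p5": None}

# Listwise deletion: drop any case with missing data.
complete = {pid: s for pid, s in data.items() if s is not None}

# Simple (mean) imputation: replace missing values with the mean
# of the observed scores.
m = statistics.mean(complete.values())
imputed = {pid: (s if s is not None else m) for pid, s in data.items()}

print(complete)  # {'p1': 10, 'p3': 14, 'p4': 12}
print(imputed)   # p2 and p5 filled in with the mean (12)
```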
Interpreting Descriptive Stats
- Visualising the data - through patterns, outliers and relationships in graphs etc
- Contextual interpretation - understanding real-world implications of descriptive statistics
- Practical significance - evaluating the magnitude of the effects
Ethical Considerations in Data Presentation
- Transparency
- Avoiding bias
- Context matters
- Responsible reporting
Avoiding Common Pitfalls in Descriptive Stats
- Misinterpreting visualisations
- Choosing inappropriate analyses
- Data entry errors
Practical Applications of Descriptive Stats
- Research design
- Psychological assessment
- Intervention evaluation
- Data visualisation
The Data Analysis Process
- Collect
- Organise
- Analyse
- Interpret
- Collect
- Experimental measurements
- Behavioural observations
- Psychological test scores
- Survey responses
- End up with spreadsheets
- Each row is one participant (row one is participant 1, and so forth)
- Each column is a different variable
- Organise
- Median, mean, minimum and maximum
- Summarise data to find averages → the first step
- Not interested in individual data points, but in summary data
- Box plots, histograms, relationships of variables (scatter plots)
- Analyse
- Descriptive statistics and inferential statistics - putting these into words for specific variables, making sense of the spreadsheets
- Interpret
- What do these numbers mean for the research, what does it suggest
- Must be done accurately
- “This suggests…”
Quantitative Variables
measurable quantities like age, height, test scores (anywhere within a range)
Qualitative Variables
descriptive categories such as gender, eye colour, mood
Types of Data
- Numerical
- Categorical (grouping data)
- Ordinal (ranked data like Likert scales)
- Continuous (infinitely divisible data like reaction time)
Nominal Scale - Identity
- Used for categorical variables
- Numbers are arbitrary, acting as labels rather than quantities; they indicate difference, not size or order
Ordinal Scale - Identity + Order
- Scores can be ranked/ordered
- Indicate difference and order, but nothing more than rank order
- No objective distance between any two points on the scale
- The intervals between ranks are not measurable
Interval Scale - Identity + Order + Equal Unit Size
- Allow us to separate objects or events into mutually exclusive categories, in an order, and with specific distances
- Indicate differences, scale, interval length and size
Ratio Scale
identity + order + equal unit size + true zero point
Discrete Variables
Data are composed of indivisible units, represented by whole numbers
- number of children
- errors on a true/false test
Continuous Variables
Data involve numbers that can be divided
Measure of Variability
indicates the degree to which scores are either clustered or spread out in a distribution
Range
difference between lowest and the highest score
SD
Average deviation from the mean of the distribution
- Most commonly used measure
- How different from the mean the individual scores may be
- Average of these deviations
Steps to Calculate the SD
- Step 1: calculate the mean
- Step 2: find each score's deviation from the mean, (X - M); the mean of these deviations is always zero
- Step 3: square each deviation, (X - M)²
- Step 4: find the mean of the squared deviations (known as the variance)
- Step 5: take the square root of the variance
- The square root is taken because the variance is not in the same units as the scores (it is much larger), so the standard deviation compares with the scores much better (see the worked sketch below)
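A worked sketch of the five steps on four invented scores, using the population formula (dividing by n, as in Step 4 above); a sample SD would divide by n - 1 instead.

```python
scores = [2, 4, 6, 8]
n = len(scores)

mean = sum(scores) / n                   # Step 1: mean = 5.0
deviations = [x - mean for x in scores]  # Step 2: these sum to zero
squared = [d ** 2 for d in deviations]   # Step 3: squared deviations
variance = sum(squared) / n              # Step 4: variance = 5.0
sd = variance ** 0.5                     # Step 5: square root ~= 2.24

print(mean, variance, sd)
```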
How are SD and Variance Different
- Both measures of variability
- Both used in inferential statistics
- Similar formula
- Standard deviation: presents measure in original units
- Variance: presents measure in squared units
Data Collection Commandments
- Think about the type of data required to answer the question
- Where will you be collecting the data
- Make sure that the data collection form you are using is clear and easy to use
- Make a duplicate of the data files and keep it in a separate location
- Do not rely on other people to collect or transfer your data unless you have personally trained them and are confident that they understand the data collection process as well as you do
- Plan a detailed schedule of when and where you will be collecting your data
- As soon as possible cultivate possible sources of your participant pool
- Try to follow up on subjects who missed their testing session
- Never discard original data
Self-report Measures
- Administered as questionnaires or interviews
Behavioural self-report measures
- Unreliable
- How often they may do something
Cognitive measures
- What people think
- Unreliable
Affective measures
- How people feel
- Unreliable
Types of Tests
- Assess individual differences in various content areas
Personality tests
- Often self-reported affective tests
Ability tests
- Aptitude tests - measure an individual’s potential to do something
- Achievement tests - measure an individual’s competence in an area
Behavioural Measures
- Observational measures
- Involve some sort of coding system - a means of converting the observations to numerical data
Descriptive Statistics
- Average score (central tendency)
- Shape of the distribution
- Width of the distribution
- Organise data in tables and graphs
The Median
- Mid-point or central value
- Divides the scores in half
- Not sensitive to outliers
- Requires all scores to be placed in rank order
Mode
- Most frequently occurring category or score
- Can be determined on all scales of measurement (nominal, ordinal, ratio, interval)
- It is the only measure of central tendency that can be used for data measured on a nominal scale
When to use the Different Measures of Central Tendency
Mode
- When the data are categorical in nature and values can fit into only one class (religion, hair colour)
Median
- When there are extreme scores that would distort the mean
Mean
- When the data are not extreme and not categorical
What Do Central Tendencies Look Like in a Symmetrical Unimodal Distribution
mode = median = mean
Positive and Negative Skews
- Positive skew: long tail to the right; mean > median, so more than 50% of scores fall below the mean
- Negative skew: long tail to the left; mean < median, so more than 50% of scores fall above the mean
Why Care About Variable Types
- Different measurement approaches for different variables
- Different statistical tests most appropriate for analysis
- Different interpretation methods for correctly interpreting results and drawing accurate conclusions
Nominal Variables
categories with no natural order
Important for
- Understanding patient choices
- Analysing demographic patterns
- Cultural differences in mental health
Ordinal Variables
Ordered categories
Important for
- Better understanding of patients' subjective experience
- May be useful in developing individualised treatments
- Informs decision-making and further research
Interval Scales
- Equal distances between points
- No true zero
Ratio Scale
- has true zero
Memory Study Example to show the use of nominal variables and ratio scales
Independent variable: study method (nominal)
- Visual learning
- Auditory learning
- Combined method
Dependent variable: recall score (ratio)
- Number of words remembered
- Response time in milliseconds
Common Mistakes to Avoid with Variables and Scales
Treating ordinal as interval
- Don't write "depression increased by 2 points on a mild/moderate scale"; instead write "depression severity increased from mild to moderate"
Inappropriate averages
- Can’t average nominal data
Misleading comparisons
- “Twice as anxious” only works with ratio scales
Understanding Likert Scales
- Fixed-choice rating scale designed to measure attitudes, opinions (subjective measure)
- Consists of a statement followed by varying degrees of response to that statement
- Ratings given as numbered points, commonly a 5-point scale
- Different response anchors, like frequency, satisfaction or quality
- Ordered responses
- Balanced positive and negative options
- Clear midpoint
- Equal apparent intervals between options
Part 1 of Analysis: Categorical Analysis
- Create frequency table
- Look at frequencies
Part 2 of Analysis: Numerical Analysis
- Calculate the mean satisfaction score
- Calculate the standard deviation (see the sketch below, which covers both parts)
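A sketch of both analysis parts on invented 5-point Likert responses. Note that Part 2 treats ordinal ratings as numeric, a common but debated convention.

```python
import statistics
from collections import Counter

responses = [4, 5, 3, 4, 2, 5, 4, 3, 4, 1]  # invented ratings

# Part 1: frequency table of the categorical responses.
freq = Counter(responses)
for rating in sorted(freq):
    print(rating, freq[rating])

# Part 2: mean satisfaction score and standard deviation.
print(statistics.mean(responses))   # 3.5
print(statistics.stdev(responses))  # sample SD
```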
Frequency Distribution Graphs
→ show the relationship between score and frequency
Bar graphs
- Categorical data, nominal and ordinal scale
Histograms
- Numerical data, interval and ratio scale
- Bar width: for continuous variables the bar extends to the real limits of the category; for discrete variables it extends exactly half the distance to the adjacent category
Frequency polygons
- Numerical data, interval and ratio scale
- Large numbers
- Compare sets of data with this
- Cumulative frequency distribution
- Changes over time
Scatterplot
- Bivariate numerical data
- x,y pair
- Negative, positive, no linear relationship
Disadvantage of Stem and Leaf Plot
Not the best for presenting large data sets
- Too many leaves for each stem
- Create groupings that may affect clarity
Characteristics of a Bar Graph
- Categorical
- Do not touch
- Use to display differences in mean
Characteristics of a Histogram
- Numerical
- Bars can touch
- Displays a frequency distribution
Key Differences between COUNT and COUNTA functions
- COUNT: counts only cells with numerical values
- COUNTA: counts all non-empty cells, including those with text, numbers or any other data
- Note: COUNTA counts a column title too if it is included in the range (see the sketch below)
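The functions above are Excel's; the following is only a rough Python analogue of the numeric-only versus non-empty distinction, using an invented cell range that includes a column title.

```python
cells = ["Score", 10, 12, "", None, "absent", 15]  # invented cell range

count = sum(isinstance(c, (int, float)) for c in cells)  # numbers only -> 3
counta = sum(c not in (None, "") for c in cells)         # non-empty -> 5
# Note the title "Score" is included in counta, as described above.

print(count, counta)
```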
Percentile
- Location of scores relative to the rest of the scores in the distribution
- Your percentile in the distribution represents the position of your measurement in comparison with everyone else’s
- It gives the percentage of the population that falls below you
- 50th percentile, 50% of population falls below you
- Percentile = (cf / n) x 100, where cf is the cumulative frequency and n is the number of individual scores (see the sketch below)
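A small sketch of the cf/n formula on invented scores. Conventions differ on whether "below" means strictly below or at-or-below; this sketch counts scores at or below yours.

```python
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # invented, sorted
your_score = 80
n = len(scores)

cf = sum(s <= your_score for s in scores)  # cumulative frequency = 6
percentile = cf / n * 100
print(percentile)  # 60.0 -> 60th percentile
```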
Percentile Rank
relative position of a given person in the group in reference to the trait being measured
Percentile Score
score corresponding to a particular percentile rank
Limitations of Percentile Measurement
equal percentile differences do not reflect equal differences in actual scores
- IQ 101 - IQ 100 → 52nd - 50th percentile
- IQ 135 - IQ 128 → 99th - 97th percentile
- distance between scores is not specified
Z-score
- A raw score or x value provides very little information about how that score compares with other values in the distribution
- Z score transformation: the value of a z-score tells exactly where the score is located relative to all the other scores in the distribution
Transforms the X score into a new number so that
- The sign (+) or (-) tells us if the score is located above (+) or below (-) the mean, and
- The number tells the distance between the score and the mean in terms of the number of standard deviations
- Specifies the precise location of each raw score within the distribution
- z = (X - M) / s
- The deviation from the mean divided by the standard deviation
- When two measures have different means and standard deviations, their raw scores can't be compared directly
- Z-scores fix this problem (see the sketch below)
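A minimal sketch of the transformation; the two tests' means and SDs are invented.

```python
def z_score(x, mean, sd):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean) / sd

# A 70 on test A beats an 80 on test B once both are standardised.
print(z_score(70, mean=60, sd=5))   #  2.0 -> 2 SDs above A's mean
print(z_score(80, mean=85, sd=10))  # -0.5 -> half an SD below B's mean
```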
Properties of Normal Distribution
- Bell-shaped
- Symmetrical
- Mode, median and mean are the same value
- 50% below and above the mean
- Unimodal, one peak, one mode
- Most of the observations are clustered around the centre of the distribution
- When scores are expressed as standard deviations along the x-axis, the percentage of scores falling between the mean and any given point is the same for every normal distribution
Kurtosis
how flat or peaked a normal distribution is; a measure of the degree of dispersion among the scores
- A higher peak means there are more scores close to the mean
- Mean and standard deviation describe these peaks
Z-scores
- transforming ANY DISTRIBUTION of raw scores into Z-scores results in a distribution with a MEAN of 0 and a SD of 1
- z-score quantifies the original score in terms of the number of standard deviations that the original raw score is from the mean of the distribution
- a negative z-score means that the original score was below the mean. A positive z-score means that the original score was above the mean
The Total Area Under the Curve Representing 100% of the Scores
- z = -1 to z = +1 (within 1 SD) covers approx 68% of scores
- z = -2 to z = +2 (within 2 SD) covers approx 95% of scores
- z = -3 to z = +3 (within 3 SD) covers approx 99.7% of scores (see the sketch below)
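These coverage figures can be checked against the standard normal CDF; a sketch using NormalDist from Python's standard library (3.8+).

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1
for k in (1, 2, 3):
    coverage = z.cdf(k) - z.cdf(-k)
    print(f"within {k} SD: {coverage:.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```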
Common Mistakes to Avoid When Interpreting the Percentile Rank
- confusing percentile with percentage correct
- thinking percentile tells us actual score
- misunderstanding whether higher or lower percentiles are better
- thinking 50th percentile means “halfway to maximum”
- assuming percentile indicates absolute rather than relative measurement
Real World Applications of Percentile Rank
- standardised test scores (NAPLAN)
- clinical assessments (IQ)
- medical assessments
- growth monitoring
Probability
Defined as the expected relative frequency of a particular outcome
- by knowing the makeup of a population, we can determine the probability of obtaining specific samples
- definition is accurate only for random samples
Q1
25% of data falls below this point
Q2
median, 50% of data falls below this point
Q3
75% of data falls below this point
IQR
= Q3 - Q1 (see the sketch below)
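A sketch computing the quartiles and IQR on invented data. Quantile conventions vary between packages; statistics.quantiles defaults to the "exclusive" method, so results can differ slightly from other software.

```python
import statistics

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # invented scores
q1, q2, q3 = statistics.quantiles(data, n=4)

print(q1, q2, q3)  # q2 is the median
print(q3 - q1)     # the IQR
```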
Interpreting Box Plot Characteristics
- Symmetry
- the symmetry of the box plot indicates the distribution's skewness. A symmetric box plot suggests a symmetric (possibly normal) distribution
- IQR
- the size of the box represents the spread of the middle 50% of the data, providing insights into the data’s variability
- Whisker Length
- the length of the whiskers indicates the range of the data, excluding outliers
Comparing Data Sets Using Box Plots
→ side-by-side
- allows for easy comparison of the distribution, median, and spread of multiple data sets
→ overlaid
- Multiple overlapping on the same plot can highlight similarities and subtle differences in the data distribution
→ stacked
- Stacked vertically can help visualise the relative positions and differences between the data sets for larger numbers of groups
Bar Graphs - Advantages and Disadvantages
- show the mean or total data
- better for comparing categorical data or discrete counts
- simple to understand for general audiences
- cannot show outliers or data spread
Boxplots - Advantages and Disadvantages
- show median, quartiles (box edge), range (whiskers), outliers (individual data points)
- better for comparing distributions
- show data spread and skewness
- excellent for spotting unusual patterns
- more complex to interpret for general audiences
Should you ever use multiple figures for the same data?
No; presenting the same data in multiple figures makes the report less concise and clear
Q-Q Plot:
- the dotted lines mark SDs from the mean, showing how far the normal range extends
Mixture of Normal Distributions
- Multiple separate normal distributions placed together
- Once these are combined, the apparent outliers are no longer outliers
- Points that deviate from normality might not be true outliers
- They could be valid data points from a different component of the mixture
- E.g. points around -2 and +2 SD are not true outliers - they are the centres of their respective distributions
Importance of Outliers
Impact on analysis
- Can influence the mean and SD, making them unreliable
Model performance
- Causes models to overfit or perform poorly, leading to inaccurate predictions
Data quality
- Can help detect errors, inconsistencies, etc
Tools for Detecting Outliers
- Visual inspection - through figures
- Statistical methods - z-scores, the IQR rule, etc (see the sketch after this list)
- Domain expertise - understanding the content and identifying outliers that are unrealistic or unexpected based on domain knowledge
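A sketch of the two statistical rules on invented data. Note that the z-score rule misses the outlier here because the outlier itself inflates the SD in a small sample, which is one reason the IQR rule is often preferred.

```python
import statistics

data = [10, 12, 11, 13, 12, 11, 40]  # 40 looks suspicious

# z-score rule: flag scores more than ~3 SDs from the mean.
m, s = statistics.mean(data), statistics.stdev(data)
z_outliers = [x for x in data if abs((x - m) / s) > 3]

# IQR rule: flag scores beyond 1.5 * IQR outside the quartiles.
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
iqr_outliers = [x for x in data if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]

print(z_outliers)    # [] - masked by the inflated SD
print(iqr_outliers)  # [40]
```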
Common Causes of Outliers
- Measurement errors (equipment malfunction, faulty sensors)
- Data entry errors (incorrect formatting, typographical)
- Unusual events (unexpected occurrences)
Strategies for Treating Outliers
- Removal - delete outliers if they are considered to be errors
- Replacement - substitute more representative values
- Transformation - apply mathematical transformations to reduce the impact of outliers
Clinical Responsibility
- Might indicate persons needing immediate help
- Removing data means removing important information
- Balance statistical cleanliness with clinical reality
Research Integrity
- Document all decisions
- Be transparent with outlier handling
- Consider impact on conclusions
- Report results with and without outliers
Sampling Theory: Population and Sample
Sample - a portion of population that is actually measured
- Summary properties or measures of sample values are called statistics
- Concrete
- Finite
- Incomplete (set of people or entities)
Population - all items of interest
- Summary properties or measures of population values are called parameters
- Abstract
- Complete (all people or entities)
The Law of Large Numbers
- Large samples generally give better information
- More data = better information
- Larger samples have means (M) closer to the true population mean
The Central Limit Theorem:
If you take sufficiently large samples from a population, the samples' means will be normally distributed, even if the population isn't normally distributed
Ensuring that:
- The distribution of sample means is normal
- The mean of all the sample means equals the population mean
- The standard deviation of the sampling distribution (the standard error) gets smaller as the sample size increases
- The shape of the sampling distribution becomes normal as the sample size increases (see the simulation sketch below)
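A simulation sketch: sample means drawn from a skewed (exponential) population still cluster near the population mean of 1.0, with a spread (the standard error) that shrinks as n grows.

```python
import random
import statistics

random.seed(1)  # reproducible

def sample_mean(n):
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

for n in (5, 50, 500):
    means = [sample_mean(n) for _ in range(2000)]
    print(n, round(statistics.mean(means), 2), round(statistics.stdev(means), 3))
# The mean of the sample means stays near 1.0, and their SD
# (the standard error) shrinks roughly as 1 / sqrt(n).
```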
Frequency Distribution of Raw Scores
- Is based on a real set of data
- Each point on the x-axis represents a raw score and the height of the line represents how frequently that score occurred
- The shape of the distribution can be normal but is often skewed or irregular
Frequency Distribution of Sample Means
- Based on hypothetical set of sample means
- Each point on the x-axis represents a sample mean and the height of the line represents how frequently they are expected to occur
- The shape of the distribution tends to be normal regardless of the distribution of the raw scores
- The standard deviation of these means is called standard error
Sampling Error
Occurs when a sample that is not representative of the population being studied is selected
- sample typically doesn’t provide a perfectly accurate representation of its population
- there is some discrepancy (or error) between a statistic computed from the sample and the corresponding population parameter
Standard Error
- in reference to the distribution of sample means
- provides a measure of how much difference is expected from one sample to another
- measures how well an individual sample mean represents the population mean
Small VS Large Standard Error
small = the sample means are close together and have similar values
large = the sample means are distributed over wider range and there are large differences from one sample mean to another
Hypothesis Testing
- Data-Driven Decision Making
- Statistical Inference
- Evidence-based Conclusions – determining validity
Null Hypothesis Testing
- Null hypothesis (H0) is that there is no effect or difference between groups being compared
- The H0 is assumed to be true at the beginning of a null hypothesis test, but the goal is to provide evidence against it
- If we assume that the null hypothesis is true, what is the likelihood of our data turning out the way it has?
Alternative Hypothesis (H1 or Ha)
- A statement that there is an effect or difference between the groups compared
- Supported when H0 is rejected
- Can't be statistically tested directly, so testing against H0 is used instead
Types of Errors in Hypothesis Testing
- Type 1 (false positive) - rejecting the null hypothesis when it is actually true
- Type 2 (false negative) - failing to reject the null hypothesis when it is actually false
Decision Rule
Where the line is drawn in terms of there being sufficient evidence from the data to reject the H0
- a decision rule quantifies when we can say “it is unlikely for us to obtain this data if the null hypothesis is true, therefore it would be more reasonable to assert that the null hypothesis is false”
- the decision rule is chosen by the experimenter (but guided by convention)
Rejecting the H0 as a consequence of applying a decision rule is known as a significance test
- the test statistic is calculated differently depending on what kind of NHST is being carried out
The Test Statistic
→ takes into account differences in scores due to the manipulation or factor of interest
→ considers differences in scores due to extraneous factors, that should have nothing to do with the factor of interest
One and Two Tailed Tests
- One-tailed: only sensitive to a difference in one direction
- Two-tailed: sensitive to differences in either direction
- One-tailed are more limited in the question they are asking but more sensitive to the presence of a difference (more statistical power)
P-Values and Effect Sizes
- A lower p-value is desirable because it implies that a conclusion rejecting the null hypothesis is less likely to be an error
- This is what is meant when papers refer to a difference or effect that is “highly significant”
- It does not necessarily imply a large effect size
- Effect size measures how big or important the difference is
Confidence Intervals
Estimate the range within which the true population parameter is likely to fall. They provide a measure of uncertainty around our sample estimate
- Set of values that range between an upper and lower limit
- We have a certain level of confidence that the interval contains the population parameter of interest
- Unlike significance tests, confidence intervals can tell us something about the size of the effect in the population
- The mean might equal 71 but the confidence interval ranges from 69-73; this is where the researchers are most confident the true mean lies
Calculating Confidence Intervals
- SE (standard error) = SD / √(sample size)
- SE tells how precise our sample mean is - how much the sample mean is expected to differ from the population mean (see the sketch below)
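A sketch of a 95% CI for a mean, using the normal approximation (1.96 SEs either side; a t-multiplier is more accurate for small samples). The scores are invented so that M = 71, matching the example above.

```python
import statistics

scores = [68, 70, 72, 69, 75, 71, 70, 73]  # invented sample, M = 71
n = len(scores)

m = statistics.mean(scores)
se = statistics.stdev(scores) / n ** 0.5  # SE = SD / sqrt(n)

lower, upper = m - 1.96 * se, m + 1.96 * se
print(f"M = {m:.1f}, 95% CI [{lower:.1f}, {upper:.1f}]")
```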
Small Sample
- Less reliable estimate
- Larger standard error
Large Sample
- More reliable estimate
- Smaller standard error
Difference between Confidence Intervals and p-values
- P-values indicate the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true
- Confidence intervals provide a range of plausible values for the population parameter, offering a more informative picture (across repeated sampling, intervals built this way would contain the true population value 95% of the time)
- If a confidence interval for a difference contains 0, the difference is not significant
Interpreting Confidence Intervals
A 95% confidence level means that, across repeated samples, 95% of intervals constructed this way would contain the true population mean
- Overlap - suggests that you cannot confidently conclude a statistically significant difference
- Non-overlap - suggests that you can conclude a statistically significant difference between groups
Effect of sample size
larger samples lead to narrower intervals, providing more precise estimates
Effect of confidence level
higher confidence levels result in wider intervals
Effect of population variability
higher variability leads to wider intervals, indicating greater uncertainty
Type 1 Inferential Error
Rejecting null hypothesis when the null hypothesis is in fact true
Type 2 Inferential Error
Retaining the null hypothesis when it is in fact false
Imputation
- Replace missing values with estimates based on patterns in the existing data
Listwise Deletion
- Remove any cases with missing data (this can reduce statistical power and introduce bias if the missingness is not random)
Multiple Imputation
- Generate multiple plausible values for each missing data point to account for uncertainty, then pool the results
Analysis of Missingness
- Investigate the patterns and mechanisms behind missing data to select the most appropriate handling method