Stats Flashcards
Process of Research in Conducting Statistics
- First determine average results
- Then individual variations
- Then ethical reporting - full disclosure is crucial for accurate interpretation, giving other researchers the chance to replicate the study
The Ethical Imperative: Why Understanding Stats Matters
- Transparency and accountability
- Advancing the field
Ethical Data Practices
- Crucial for maintaining public trust and for avoiding misrepresentation of, and unintended bias about, psychological traits
Implications of Misreporting
- Overgeneralisation - misleading one-size-fits-all impression of therapy effectiveness
- Patient harm - wasted time on ineffective treatments
- Research mistrust - damages credibility of psychological studies
- Ethical responsibility - researchers must present complete picture, including limitations
Measures of Central Tendency
- mean - the most commonly used, although another measure may be more appropriate for some data
- median
- mode
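A minimal sketch of the three measures, using Python's standard `statistics` module. The scores are made up for illustration; the last one is deliberately extreme to show why the mean is not always the most appropriate choice.

```python
from statistics import mean, median, mode

# Hypothetical test scores; the final value is an extreme score (outlier).
scores = [4, 5, 5, 6, 7, 8, 40]

avg = mean(scores)    # pulled upward by the outlier
mid = median(scores)  # middle value once the scores are ranked
top = mode(scores)    # most frequently occurring score

print(f"mean={avg:.2f}, median={mid}, mode={top}")
```

Here the single extreme score drags the mean well above the median, which is the situation where the median tends to describe the typical score better.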
Measures of Dispersion/Variability
- Range
- Variance
- Standard deviation
- Interquartile range
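The four measures can be sketched with the standard `statistics` module. The data are hypothetical. `pvariance` and `pstdev` divide by N (the population formulas); the sample versions, `variance` and `stdev`, divide by N - 1 instead. Quartile conventions differ between textbooks and software, so the interquartile range below reflects Python's default method rather than a universal definition.

```python
from statistics import pstdev, pvariance, quantiles

# Hypothetical scores for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

score_range = max(scores) - min(scores)  # highest minus lowest score
variance = pvariance(scores)             # mean of the squared deviations
sd = pstdev(scores)                      # square root of the variance

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = quantiles(scores, n=4)
iqr = q3 - q1                            # spread of the middle half of the data

print(score_range, variance, sd, iqr)
```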
Histograms and Bar Charts
- Graphs for understanding data
- How often the data appears - histogram
- Compare the magnitude of different categories - bar charts
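Both chart types start from counting how often each value or category occurs. As a rough sketch, the counts behind a bar chart of a categorical variable can be produced with `collections.Counter`. The eye-colour responses are invented for illustration, and the text bars stand in for a real plotting library.

```python
from collections import Counter

# Hypothetical responses on a categorical variable (eye colour).
eye_colours = ["brown", "blue", "brown", "green", "blue", "brown"]

# Count how often each category appears.
counts = Counter(eye_colours)

# Simple text bar chart: one '#' per observation, most frequent category first.
for category, count in counts.most_common():
    print(f"{category:<6} {'#' * count}")
```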
Boxplots
Show the median value, the interquartile range and the spread of the data
- Can reveal key characteristics such as presence of skewness, extent of variability
Scatterplots and Correlations
- Relationships between two variables
- Trends, clusters and outliers
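The strength of a linear relationship between two variables is often summarised with Pearson's correlation coefficient r. The notes do not name a specific coefficient, so using Pearson's r here is an assumption, and the sleep and test-score data are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of paired deviations from each mean.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / sqrt(ss_x * ss_y)

# Hypothetical data: hours slept and test score for five participants.
hours_slept = [5, 6, 7, 8, 9]
test_score = [52, 60, 61, 70, 77]

r = pearson_r(hours_slept, test_score)
print(f"r = {r:.2f}")
```

Values of r near +1 or -1 indicate a strong linear trend, and values near 0 a weak one. A scatterplot is still needed to spot clusters and outliers, which a single coefficient can hide.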
Importance of Data Cleaning and Preparation
- Identifying error - mistakes that might skew the data
- Handling missing data - appropriate methods to handle them
- Standardised formats - all data is the same format so it can be compared
- Transforming variables - apply necessary transformations to meet stat assumptions
Strategies for Handling Missing Data
- Imputation - replace missing values with estimate based on patterns in the existing data
- Listwise deletion - remove any cases with missing data (this can reduce statistical power and introduce bias if the missingness is not random)
- Multiple imputation - generate multiple plausible values for each missing data point to account for uncertainty, then pool the results
- Analysis of missingness - investigate the patterns and mechanisms behind missing data to select the most appropriate handling method
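Two of the strategies above can be sketched in plain Python. The participant records are hypothetical and `None` marks a missing value. Mean imputation is used here only as the simplest example of single imputation; it is not multiple imputation, which needs dedicated statistical software.

```python
from statistics import mean

# Hypothetical participants; None marks a missing value.
rows = [
    {"id": 1, "age": 21, "score": 14},
    {"id": 2, "age": None, "score": 18},
    {"id": 3, "age": 25, "score": None},
    {"id": 4, "age": 30, "score": 11},
]

# Listwise deletion: keep only cases with no missing values at all.
complete_cases = [row for row in rows if None not in row.values()]

def impute_mean(records, column):
    """Return copies of the records with missing values in `column`
    replaced by the mean of the observed values in that column."""
    observed = [r[column] for r in records if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]}
            for r in records]

# Simple (single) imputation applied to each column in turn.
imputed = impute_mean(impute_mean(rows, "age"), "score")

print(len(complete_cases), "complete cases out of", len(rows))
```

Listwise deletion halves this small sample, which illustrates the loss of statistical power noted above, while imputation keeps every participant at the cost of relying on an estimate.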
Interpreting Descriptive Stats
- Visualising the data - through patterns, outliers and relationships in graphs etc
- Contextual interpretation - understanding real-world implications of descriptive statistics
- Practical significance - evaluating the magnitude of its effects
Ethical Considerations in Data Presentation
- Transparency
- Avoiding bias
- Context matters
- Responsible reporting
Avoiding Common Pitfalls in Descriptive Stats
- Misinterpreting visualisations
- Choosing inappropriate analyses
- Data entry errors
Practical Applications of Descriptive Stats
- Research design
- Psychological assessment
- Intervention evaluation
- Data visualisation
The Data Analysis Process
- Collect
- Organise
- Analyse
- Interpret
Collect
- Experimental measurements
- Behavioural observations
- Psychological test scores
- Survey responses
- Data end up in spreadsheets
- Each row holds one participant (row one is participant 1, and so forth)
- Columns hold the different variables
Organise
- Median, mean, minimum and maximum
- Summarise data to find averages → the first step
- Interested in summary data rather than individual data
- Box plots, histograms, relationships between variables (scatter plots)
Analyse
- Descriptive statistics and inferential statistics: put these into words for specific variables, making sense of the spreadsheets
Interpret
- What do these numbers mean for the research, and what do they suggest
- Must be done accurately
- “This suggests…”
Quantitative Variables
measurable quantities like age, height, test scores (anywhere within a range)
Qualitative Variables
descriptive categories such as gender, eye colour, mood
Types of Data
- Numerical
- Categorical (grouping data)
- Ordinal (ranked data like Likert scales)
- Continuous (infinitely divisible data like reaction time)
Nominal Scale - Identity
- Used for categorical variables
- Numbers are arbitrary, acting as labels instead of names; they indicate difference, not size or order
Ordinal Scale - Identity + Order
- Scores can be ranked/ordered
- Indicate differences and scale
- Nothing more than rank order
- No objective distance between any two points on the scale
- Distances between ranks are not measurable
Interval Scale - Identity + Order + Equal Unit Size
- Allow us to separate objects or events into mutually exclusive categories, in an order, and with specific distances
- Indicate differences, scale, interval length and size
Ratio Scale
identity + order + equal unit size + true zero point
Discrete Variables
Data consist of indivisible units, represented by whole numbers
- number of children
- errors on a true/false test
Continuous Variables
Data involve numbers that can be divided
Measure of Variability
indicates the degree to which scores are either clustered or spread out in a distribution
Range
difference between lowest and the highest score
SD
Average distance of scores from the middle of the distribution
- Most commonly used measure
- How different from the mean the individual scores may be
- Average of these deviations
Steps to Calculate the SD
- Step 1: calculate the mean
- Step 2: find the difference between each individual score and the mean (x - M) (this is the deviation)
- The mean of the deviation scores is always zero, which is why they are squared next
- Step 3: square each deviation (x - M)^2
- Step 4: find the mean of the squared deviations (known as the variance)
- Step 5: take the square root of the variance
- This is done because the variance is in squared units rather than the original units of the scores, so the standard deviation is much easier to compare with the scores themselves
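The five steps can be followed line by line in a short sketch. The scores are hypothetical. Dividing by N in Step 4 follows the steps as listed; the sample formula divides by N - 1 instead.

```python
from math import sqrt

# Hypothetical scores for illustration.
scores = [3, 5, 7, 9, 11]

# Step 1: calculate the mean.
m = sum(scores) / len(scores)

# Step 2: deviation of each score from the mean.
deviations = [x - m for x in scores]
# The deviations always sum (and so average) to zero.
deviation_total = sum(deviations)

# Step 3: square each deviation.
squared_deviations = [d ** 2 for d in deviations]

# Step 4: mean of the squared deviations (the variance).
variance = sum(squared_deviations) / len(squared_deviations)

# Step 5: square root of the variance (the standard deviation).
sd = sqrt(variance)

print(f"mean={m}, variance={variance}, sd={sd:.2f}")
```

With these scores the variance is 8 in squared units, while the standard deviation of about 2.83 is back in the same units as the scores.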
How are SD and Variance Different
- Both measures of variability
- Both used in inferential statistics
- Similar formula
- Standard deviation: presents measure in original units
- Variance: presents measure in squared units
Data Collection Commandments
- Think about the type of data required to answer the question
- Where will you be collecting the data
- Make sure that the data collection form you are using is clear and easy to use
- Make a duplicate of the data files and keep it in a separate location
- Do not rely on other people to collect or transfer your data unless you have personally trained them and are confident that they understand the data collection process as well as you do
- Plan a detailed schedule of when and where you will be collecting your data
- Cultivate possible sources for your participant pool as early as possible
- Try to follow up on subjects who missed their testing session
- Never discard original data
Self-report Measures
- Administered as questionnaires or interviews
Behavioural self-report measures
- Unreliable
- How often they may do something
Cognitive measures
- What people think
- Unreliable
Affective measures
- How people feel
- Unreliable
Types of Tests
- Assess individual differences in various content areas
Personality tests
- Often self-reported affective tests
Ability tests
- Aptitude tests - measure an individual’s potential to do something
- Achievement tests - measure an individual’s competence in an area
Behavioural Measures
- Observational measures
- Involve some sort of coding system - a means of converting the observations to numerical data
Descriptive Statistics
- Average score (central tendency)
- Shape of the distribution
- Width of the distribution
- Organise data in tables and graphs
The Median
- Mid-point or central value
- Divides the scores in half
- Not sensitive to outliers
- Requires all scores to be placed in rank order
Mode
- Most frequently occurring category or score
- Can be determined on all scales of measurement (nominal, ordinal, ratio, interval)
- It is the only measure of central tendency that can be used for data measured on a nominal scale
When to use the Different Measures of Central Tendency
Mode
- When the data are categorical in nature and values can fit into only one class (religion, hair colour)
Median
- When there are extreme scores that would distort the mean
Mean
- When the data are not categorical and contain no extreme scores
What Do Central Tendencies Look Like in a Symmetrical Unimodal Distribution
mode=median=mean
Positive and Negative Skews
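- Negative skew: typically more than 50% of scores fall above the mean and fewer than 50% below it (mean < median)
- Positive skew: typically more than 50% of scores fall below the mean and fewer than 50% above it (mean > median)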
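A small sketch of how skew shifts the mean relative to the median. Both data sets are invented for illustration, and `share_above_mean` is a hypothetical helper written for this example.

```python
from statistics import mean, median

# Hypothetical scores: one long tail to the right, one to the left.
positively_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
negatively_skewed = [1, 17, 18, 18, 18, 19, 19, 20]

def share_above_mean(scores):
    """Proportion of scores that are strictly greater than the mean."""
    m = mean(scores)
    return sum(x > m for x in scores) / len(scores)

for label, data in [("positive", positively_skewed),
                    ("negative", negatively_skewed)]:
    print(label, mean(data), median(data), share_above_mean(data))
```

In the positively skewed set the extreme high score pulls the mean above the median, so most scores sit below the mean. The negatively skewed set shows the mirror image.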
Why Care About Variable Types
- Different measurement approaches for different variables
- Different statistical tests most appropriate for analysis
- Different interpretation methods for correctly interpreting results and drawing accurate conclusions
Nominal Variables
categories with no natural order
Important for
- Understanding patient choices
- Analysing demographic patterns
- Cultural differences in mental health
Ordinal Variables
Ordered categories
Important for
- Better understanding of patients' subjective experience
- May be useful in developing individualised treatments
- Informs decision-making and further research
Interval Scales
- Equal distances between points
- No true zero
Ratio Scale
- has true zero