Metod och Analys II Flashcards
Descriptive statistics – three important parts
- Frequency distribution
- Central tendency
- Variability
Descriptive statistics – three purposes
- Determining how many people got each score
- Providing information on the standing of a score relative to all other scores
- Graphically summarizing the set of scores
What are 4 purposes of a frequency distribution?
- It is a record of the number of people with each score (or in each category) of the variable
- It allows examination of the full distribution at a “glance”
- Ideally, this will allow the reader to get a basic understanding of the data without being overwhelmed by all the raw scores
- It provides a visual assessment of central tendency and variability
What are two important factors when it comes to a frequency distribution?
- There should be a listing of each possible score and its frequency of occurrence
- As a check, the sum of the frequencies should be equal to n (the sample size)
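A minimal sketch of the two points above, assuming a small set of made-up scores (the data and variable names are invented for illustration): it lists every possible score with its frequency and then checks that the frequencies sum to n.

```python
from collections import Counter

# Hypothetical raw scores, invented for illustration
scores = [3, 5, 4, 3, 2, 5, 3, 4, 1, 3]

# Count how many times each score occurs
counts = Counter(scores)

# List every possible score between the lowest and highest observed score,
# including any score that happens to have a frequency of 0
frequency_table = {
    score: counts.get(score, 0)
    for score in range(min(scores), max(scores) + 1)
}

for score, frequency in frequency_table.items():
    print(score, frequency)

# Check: the sum of the frequencies should equal n (the sample size)
total = sum(frequency_table.values())
print("n =", len(scores), "| sum of frequencies =", total)
```

If the two numbers on the last line differ, a score has been missed or counted twice.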
Describe the characteristics of frequency distribution shapes.
- Modality: the number of humps in a distribution
- Skewness: a measure of whether the distribution is symmetrical or not
- Kurtosis: characterizes the relative peaked-ness/flatness of a distribution compared to the normal distribution
What is Normal Distribution?
- Can be described as the bell-shaped curve
- The majority of scores lie around the center of the distribution
- Symmetrical curve
What is a left (negatively) skewed distribution?
- Frequent scores are clustered at the higher end, and the tail points towards the lower or more negative scores
- The curve is not symmetrical
What is a right (positively) skewed distribution?
- Frequent scores are clustered at the lower end, and the tail points towards the higher or more positive scores
- The curve is not symmetrical
What is a leptokurtic distribution?
- The curve is symmetrical, similar to a normal distribution
- But the center peak is much higher
- That is, frequent scores are near the mean
What is a platykurtic distribution?
- The curve is symmetrical, similar to a normal distribution
- But the frequency of most of the values is the same
- As a result, the curve is very flat, or plateau-like
What happens when data are not normally distributed?
- Data that are positively skewed (many scores are low) may cause the mean score to be artificially inflated
Resulting in the mean being pushed to a higher score
- Data that are negatively skewed (many scores are high) might lead to an artificially deflated mean
Resulting in the mean being pushed to a lower score
- Leptokurtic distributions (high peak) may offer little variation in the data
Resulting in a risk of not detecting a result
- Platykurtic distributions (low peak) may offer too much variation in the data
Resulting in a risk of getting results that are too high
- If the normal distribution has been compromised, we may have less confidence in the outcome of parametric tests
How do you check for a normal distribution in SPSS?
- Inspect the histograms
- If the skewness and kurtosis values are within the ±1 range, we can assume that the distribution is normal – strict criterion
- If the skewness and kurtosis values are within the ±2 range, we can assume that the distribution is normal – reasonable criterion
- When testing normality statistically, we use the Kolmogorov-Smirnov test if N is larger than 50
- We use the Shapiro-Wilk test if N is smaller than 50
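The ±1 and ±2 criteria can be sketched in plain Python. This is an illustration under stated assumptions, not what SPSS does internally: the scores are made up, the helper names are hypothetical, and the formulas are the simple moment-based ones, whereas statistical packages such as SPSS typically apply a small-sample correction and so report somewhat different values. The Kolmogorov-Smirnov and Shapiro-Wilk tests are not reproduced here.

```python
def skewness_and_excess_kurtosis(scores):
    """Moment-based skewness and excess kurtosis (both 0 for a normal curve)."""
    n = len(scores)
    mean = sum(scores) / n
    # Central moments: the average deviation from the mean raised to powers 2, 3, 4
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis


def within_range(value, limit):
    """True if value lies between -limit and +limit."""
    return -limit <= value <= limit


# Hypothetical scores with a long tail towards the higher values
scores = [1, 1, 2, 2, 2, 3, 3, 4, 6, 12]
skew, kurt = skewness_and_excess_kurtosis(scores)

print("skewness:", round(skew, 2), "| excess kurtosis:", round(kurt, 2))
print("strict criterion (±1) met:", within_range(skew, 1) and within_range(kurt, 1))
print("reasonable criterion (±2) met:", within_range(skew, 2) and within_range(kurt, 2))
```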
What can we do if the distribution happens to not be normal?
- Check for outliers
- Transform data
- See textbook 57-61
What is Central Tendency?
- The goal of central tendency is to describe the average score on a variable for a distribution (e.g. sample or population)
- Ideally, this will be a single value that estimates the middle or typical score in the distribution
Which three common measures of central tendency are there?
- Mean – probably the measure most frequently thought of as the average
- Median – the middle score, merely as a function of the total number of scores in the distribution
- Mode – the most frequently occurring score in the distribution
In which 3 ways are distribution shapes and central tendency measures related?
- In a perfectly normal distribution, the mean, median and mode are the same value
Mean = median = mode
- In a positively (right) skewed distribution the mean is bigger than the median and the median is bigger than the mode
Mean > median > mode
- In a negatively (left) skewed distribution the mean is smaller than the median and the median is smaller than the mode
Mean < median < mode
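A small sketch of the positively skewed case, using Python's built-in statistics module. The scores are made up for illustration, and the ordering shown is the typical pattern described above rather than a guarantee for every data set.

```python
import statistics

# Hypothetical positively (right) skewed scores: most are low, a few are high
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

mean = statistics.mean(scores)      # pulled upwards by the few high scores
median = statistics.median(scores)  # the middle of the ordered scores
mode = statistics.mode(scores)      # the most frequently occurring score

print("mean:", mean, "| median:", median, "| mode:", mode)
print("mean > median > mode:", mean > median > mode)
```

Reversing the tail (a few very low scores among many high ones) would be expected to flip the ordering.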
What is variability?
- Variability refers to how spread out the scores in the distribution are
- The mean is good for representing the typical score of a distribution, but the mean alone does not completely describe the distribution
- For example – two different distributions both have the sample size n = 1000, and each has the mean M = 100
But we still know nothing about how the scores are spread out
Which three ways of measuring variability are there?
- Range
- Interquartile range
- Standard deviation (most frequently used)
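As a rough illustration, the sketch below computes all three measures for two made-up distributions that share the same mean but differ in spread. The function name describe_spread and the data are hypothetical, and quartile conventions differ between programs, so the interquartile range may not match other software exactly.

```python
import statistics


def describe_spread(scores):
    """Return the range, interquartile range and sample standard deviation."""
    # Quartile cut points, using the statistics module's default method
    q1, _q2, q3 = statistics.quantiles(scores, n=4)
    return {
        "range": max(scores) - min(scores),
        "iqr": q3 - q1,
        "sd": statistics.stdev(scores),
    }


# Two hypothetical distributions with the same mean (100) but different spread
narrow = [98, 99, 100, 101, 102]
wide = [60, 80, 100, 120, 140]

print("means:", statistics.mean(narrow), statistics.mean(wide))
print("narrow:", describe_spread(narrow))
print("wide:", describe_spread(wide))
```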
What is Standard Deviation?
- A deviation score is merely the difference between an individual score (Xi) and the mean of the distribution (e.g. M)
- Deviation score = (Xi-M)
- We can think of the standard deviation as an average deviation score
- For example – we would expect smaller deviation scores, on average, in a distribution that has less variability (spread in the scores)
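The idea of an "average deviation score" can be worked through step by step. This is a sketch with made-up scores; it assumes the sample formula, which divides by n - 1 (dividing by n instead gives the population standard deviation).

```python
import math

# Hypothetical scores, invented for illustration
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)
mean = sum(scores) / n

# Deviation score = (Xi - M) for every individual score
deviations = [x - mean for x in scores]

# The raw deviations always sum to zero, so they are squared before averaging
sum_of_squares = sum(d ** 2 for d in deviations)

# Sample variance divides by n - 1; the standard deviation is its square root
sample_variance = sum_of_squares / (n - 1)
standard_deviation = math.sqrt(sample_variance)

print("deviation scores:", deviations)
print("sum of deviations:", sum(deviations))
print("standard deviation:", round(standard_deviation, 2))
```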
How are the scores spread around the mean in a normally distributed sample?
- About 68% of the scores fall within 1 standard deviation from the mean
- About 95% of the scores fall within 2 standard deviations from the mean
- About 99.7% of the scores fall within 3 standard deviations from the mean
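These percentages can be checked against a theoretical normal curve with Python's statistics.NormalDist. The mean of 100 and standard deviation of 15 are arbitrary values chosen for illustration; any normal distribution gives the same proportions.

```python
from statistics import NormalDist

# Hypothetical normal distribution (mean and SD chosen arbitrarily)
dist = NormalDist(mu=100, sigma=15)

proportions = {}
for k in (1, 2, 3):
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    # Proportion of scores between the lower and upper bound
    proportions[k] = dist.cdf(upper) - dist.cdf(lower)
    print(f"within {k} SD: {proportions[k] * 100:.1f}%")
```

The rounded figures on the flashcard are approximations of these proportions.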
How can inferential statistics be described?
In contrast to descriptive statistics, we no longer try to describe our sample; we now try to draw inferences about the population from the sample statistics
What is a direct vs an indirect approach when it comes to hypothesis testing?
Direct approach
* Conduct the study in the entire population
* Determine if the hypothesis is supported
* Is typically not feasible or even possible
Indirect approach
* Obtain a sample from the population
* Compute statistics in the sample (e.g. mean)
* Infer relations in population from the sample
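The indirect approach can be simulated as a sketch. Everything here is an assumption made for illustration: the population is generated artificially (in real research it is usually out of reach, which is the point of sampling), and the sizes, mean and standard deviation are arbitrary.

```python
import random
import statistics

random.seed(42)  # fixed seed so the simulation is repeatable

# Hypothetical population of 100,000 scores (normally unavailable in practice)
population = [random.gauss(100, 15) for _ in range(100_000)]

# Step 1: obtain a sample from the population
sample = random.sample(population, 200)

# Step 2: compute a statistic in the sample (here the mean)
sample_mean = statistics.mean(sample)

# Step 3: use the sample mean as an estimate of the population mean
population_mean = statistics.mean(population)
print("sample mean:", round(sample_mean, 1))
print("population mean:", round(population_mean, 1))
```

The sample mean will rarely equal the population mean exactly, which is why inferential statistics deals with uncertainty.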
There are 2 different types of hypotheses. What are they?
- Scientific hypothesis
- Statistical hypothesis, which comes in two forms:
* Null hypothesis
* Alternative hypothesis
What is the Scientific hypothesis?
This is what the researcher expects to find
E.g.
* A new type of therapy will be more effective at reducing depressive symptoms than the old type
* Depression is related to low life satisfaction