PSY1022 WEEK 11 DISC 5 Flashcards
DESCRIPTIVE STATISTICS
Techniques that help describe a set of data. eg. Graphs, tables, calculating an average score.
Creating a meaningful summary, so patterns and trends in the raw data can be seen.
INFERENTIAL STATISTICS
Methods that use the limited information from samples to answer general questions about populations.
- help researchers determine when it is appropriate to generalize from a sample to a population
STATISTIC
A summary value that describes a sample.
- eg. average age of students in a sample.
Two purposes:
- They describe or summarize the entire set of scores in the sample.
- They provide information about the corresponding summary values for the entire population.
PARAMETER
A summary value that describes a population.
- eg. average age of students in the population.
- symbolised by Greek letters.
FREQUENCY DISTRIBUTIONS
A method of simplifying and organising a set of scores by grouping them into an organised display that shows the entire set.
- consists of a tabulation of the number of individuals in each category on the scale of measurement (X and f)
- displays the set of categories and the number of individuals with scores in each category.
- advantage: allows researcher to view the entire set of scores
- disadvantage: can be tedious with large sets of data.
FREQUENCY DISTRIBUTION GRAPH
Scale of measurement, or categories, on X axis.
Frequencies on Y axis.
- histogram (touching bars, one above each score or interval)
- polygon (a dot above each score, joined by straight lines)
- when not numerical = bar graph. Similar to histogram, but bars don’t touch.
- first step in examining a set of data. Rarely used in published reports.
FREQUENCY DISTRIBUTION TABLE
Two columns of information
- first is scale of measurement, or categories. Labelled X.
- second is frequency or number of individuals
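A minimal sketch of building the two columns (X and f) from raw scores. The scores are made up for illustration, and listing the highest score first is a layout assumption, not something these notes specify. Scores that nobody obtained are simply left out.

```python
from collections import Counter

def frequency_table(scores):
    """Return (X, f) rows, highest score first."""
    counts = Counter(scores)  # f for each distinct score X
    return [(x, counts[x]) for x in sorted(counts, reverse=True)]

# Hypothetical quiz scores, invented for this example.
scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8]
table = frequency_table(scores)

print("X   f")
for x, f in table:
    print(f"{x:<3} {f}")
```

The f column should always add up to the number of individuals, which is a quick check that no score was missed.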
CENTRAL TENDENCY
A statistical measure that identifies a single score that defines the center of a distribution. The goal of central tendency is to identify the value that is most typical or most representative of the entire group.
- mean, median, mode
MEAN
Add all scores, divide by number of individuals.
Represented by M in research papers.
Mean for population represented by µ (mu).
- can be skewed by a few extreme scores
- can’t use with nominal scale
- ordinal is generally inappropriate.
MEDIAN
The score that divides a distribution in half.
Good when there are extremes.
Is the “middle” number. Write scores in order, pick the middle one.
If two in the middle add them together and divide by 2.
Also called 50th percentile.
- can be used for ordinal, ratio, and interval
- good alternative to mean when there are a few extreme scores.
MODE
Is the score or category with the greatest frequency. In a frequency distribution graph, the mode identifies the location of the peak (highest point) in the distribution.
- possible to be bimodal or multimodal.
- can be used for all four scales of measurement: nominal, ordinal, interval, ratio (NOIR)
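A short sketch of the three measures using Python's standard `statistics` module. The scores are invented; the second data set adds one extreme score to show why the median is the safer choice when there are extremes.

```python
import statistics

# Hypothetical scores, invented for this example.
scores = [2, 3, 3, 4, 5, 5, 5, 6]

mean = statistics.mean(scores)        # add all scores, divide by n
median = statistics.median(scores)    # middle score; averages the two middle scores when n is even
modes = statistics.multimode(scores)  # returns a list, so bimodal or multimodal data is handled

# Add a single extreme score and recompute.
with_extreme = scores + [60]
mean_extreme = statistics.mean(with_extreme)
median_extreme = statistics.median(with_extreme)

print(mean, median, modes)
print(mean_extreme, median_extreme)
```

With the extreme score added, the mean more than doubles while the median moves by only half a point.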
VARIABILITY
Descriptive: describes the spread of scores in a distribution.
Inferential: provides a measure of how accurately any individual score or sample represents the entire population.
A primary value to describe a distribution of scores.
Small = clustered together (good representation)
Large = spread out (distorted picture)
Measured by range or standard deviation.
STANDARD DEVIATION
Uses the mean of the distribution as a reference point and measures variability by measuring the standard (average) distance between each score and the mean.
- square root of the average squared distance from the mean (ie. the square root of the variance).
- represented by s or SD
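- distances for scores above the mean are positive and those for scores below are negative; they always sum to zero, which is why they are squared before averaging.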
VARIANCE
Average squared distance from the mean.
= s^2
Measures variability.
CALCULATE STANDARD DEVIATION
- For each score, measure distance from mean (score = 84, mean = 80, deviation = 4)
- Square each distance and sum the squares (sum of squared deviations, SS)
- Find variance, the average squared distance (divide SS by n-1, not n).
- square root to find standard deviation
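The calculation can be sketched step by step as below. The function names are illustrative, and the five scores are made up around the card's example (score = 84, mean = 80).

```python
import math

def sample_variance(scores):
    n = len(scores)
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]  # distance of each score from the mean
    ss = sum(d ** 2 for d in deviations)     # sum of the squared distances
    return ss / (n - 1)                      # average squared distance, using n-1

def sample_sd(scores):
    return math.sqrt(sample_variance(scores))  # square root of the variance

# Hypothetical sample with mean 80; the first score has a deviation of 4.
scores = [84, 80, 76, 82, 78]

print(sample_variance(scores))  # squared deviations 16 + 0 + 16 + 4 + 4 = 40, and 40 / 4 = 10.0
print(sample_sd(scores))
```

The standard library's `statistics.variance` and `statistics.stdev` also divide by n-1, so they should agree with this sketch.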
DEGREES OF FREEDOM
df = n-1 for a sample.
Dividing by n-1 rather than n produces a variance for the sample that is an accurate and unbiased representation of the population variance.
NORMAL CURVE
A symmetrical, bell-shaped frequency polygon representing a normal distribution.
NORMAL DISTRIBUTION
A theoretical frequency distribution that has certain special characteristics.
Also called Gaussian.
- bell-shaped, symmetrical
- mean, median, mode located at center. Only one mode.
- when standard deviations are plotted on the x-axis, the percentage of scores falling between the mean and any given point on the x-axis is the same for all normal curves.
KURTOSIS
How peaked or flat a distribution is, compared with a normal distribution.
MESOKURTIC
Peaks of medium height, distributions are moderate in breadth.
LEPTOKURTIC
Peak is tall and thin. Distribution is narrow.
PLATYKURTIC
Peak is broad and flat. Distribution is wide.
POSITIVELY SKEWED DISTRIBUTION
A distribution in which the peak is to the left of the center point, and the tail extends toward the right, or in the positive direction.
Mode = highest point
Median = divides in half
Mean = pulled in direction of tail (+ve) due to a few extremely high scores.
- most people have low scores, only a few have high.
NEGATIVELY SKEWED DISTRIBUTION
A distribution in which the peak is to the right of the center point, and the tail extends toward the left, or in the negative direction.
Mode = highest point
Median = divides in half
Mean = pulled in direction of tail (-ve) due to a few extremely low scores.
- most people have high scores, only a few have low.
Z-SCORE (STANDARD SCORE)
A number that indicates how many standard deviation units a raw score is from the mean of a distribution. Tells you exactly where the score is located relative to all the other scores (ie. is 78 a high score or a low score?)
- appropriate for interval or ratio scales of measurement
- +ve = above mean, -ve = below mean
- zero means score = mean
(score - mean) / SD = z-score
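A small sketch of the formula, using the card's question about whether 78 is a high or low score. The two class distributions (means and SDs) are hypothetical.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# The same raw score of 78 in two hypothetical classes.
z_class_a = z_score(78, mean=70, sd=4)  # (78 - 70) / 4 = 2.0, well above the mean
z_class_b = z_score(78, mean=85, sd=7)  # (78 - 85) / 7 = -1.0, below the mean

print(z_class_a, z_class_b)
```

The raw score alone says little; the sign and size of z show where it sits in each distribution.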
STANDARD NORMAL DISTRIBUTION
A normal distribution with a mean of 0 and a standard deviation of 1. Obtained by converting the X scores of a normal distribution into z-scores.
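- approximately 68% of scores fall between -1.0 and +1.0 standard deviations from the mean (about 34% on each side).
- about 13.5% fall between 1.0 and 2.0 standard deviations on each side of the mean
- about 2% (2.14%) fall between 2.0 and 3.0 standard deviations on each side
- about 0.13% have a z-score beyond 3.0 on each side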
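These proportions can be checked with `statistics.NormalDist` from the standard library. The helper name `proportion_between` is made up for this sketch. The outer bands are computed for the upper side only, so by symmetry the same proportion applies below the mean.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)  # standard normal distribution

def proportion_between(z_low, z_high):
    """Area under the standard normal curve between two z-scores."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

within_1 = proportion_between(-1, 1)  # both sides of the mean together
band_1_2 = proportion_between(1, 2)   # upper side only
band_2_3 = proportion_between(2, 3)   # upper side only
beyond_3 = 1 - std_normal.cdf(3)      # upper tail only

for label, p in [("-1 to +1", within_1), ("+1 to +2", band_1_2),
                 ("+2 to +3", band_2_3), ("beyond +3", beyond_3)]:
    print(f"{label}: {p * 100:.2f}%")
```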
PROBABILITY
The expected relative frequency of a particular outcome.
- find appropriate proportion on unit normal table
- multiply by 100 to express the probability as a percentage.
PERCENTILE RANK
A score that indicates the percentage of people who scored at or below a given raw score.
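A minimal sketch of the definition, counting scores at or below a given raw score. The exam scores are invented for illustration.

```python
def percentile_rank(scores, x):
    """Percentage of scores that are at or below the raw score x."""
    at_or_below = sum(1 for s in scores if s <= x)
    return 100 * at_or_below / len(scores)

# Hypothetical exam scores, invented for this example.
scores = [55, 60, 62, 70, 70, 75, 81, 88, 90, 95]

rank_70 = percentile_rank(scores, 70)  # 5 of the 10 scores are at or below 70
print(rank_70)
```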
GROUPED FREQUENCY DISTRIBUTION TABLE
The X column lists groups of scores, called class intervals, rather than individual values, eg. 1-10, 11-20 (intervals must not overlap). All intervals have the same width, chosen so there will be approximately 10 intervals.
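A rough sketch of grouping scores into equal-width class intervals. It assumes whole-number scores that are all at or above `start`. The function name, the width of 10 and the scores are illustrative choices only.

```python
from collections import Counter

def grouped_frequency_table(scores, width, start):
    """Return (interval label, f) rows, highest interval first.

    Assumes integer scores, all >= start. Each interval runs from
    low to low + width - 1, so neighbouring intervals never overlap.
    """
    counts = Counter((s - start) // width for s in scores)  # interval index for each score
    rows = []
    for k in range(max(counts), -1, -1):
        low = start + k * width
        rows.append((f"{low}-{low + width - 1}", counts.get(k, 0)))
    return rows

# Hypothetical scores, invented for this example.
scores = [3, 7, 12, 15, 18, 21, 25, 29, 30, 34]
table = grouped_frequency_table(scores, width=10, start=1)

for label, f in table:
    print(f"{label:<6} {f}")
```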
RELATIVE FREQUENCY
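The proportion of the whole group that falls in each category (f / N). Used on the y axis in place of raw frequencies, eg. for bar graphs where the actual numbers are too big.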
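A small sketch, assuming relative frequency means each category's frequency divided by the total number of individuals (f / N). The categories are invented.

```python
from collections import Counter

def relative_frequencies(values):
    """Map each category to its proportion of the total (f / N)."""
    counts = Counter(values)
    total = len(values)
    return {category: f / total for category, f in counts.items()}

# Hypothetical nominal data, invented for this example.
majors = ["psych", "bio", "psych", "law", "psych", "bio", "law", "psych"]
rel = relative_frequencies(majors)

print(rel)
```

The proportions always sum to 1, whatever the size of the group, which is what makes them usable on the y axis for very large data sets.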
SMOOTH CURVE
If scores in the population are measured on an interval or ratio scale it is customary to present the data as a smooth curve rather than a jagged histogram or polygon.
- because the curve shows the relative changes from one score to the next, not the exact frequency of each score.
MODE, MEDIAN, AND MEAN FOR SKEWED DISTRIBUTION
Mode = highest point
Mean = skewed towards tail
Median = in between
(in general)
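A quick check of this ordering on a made-up, positively skewed set of scores (most scores low, a few high).

```python
import statistics

# Hypothetical positively skewed scores, invented for this example.
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

mode = statistics.mode(scores)      # most frequent score (the peak)
median = statistics.median(scores)  # divides the scores in half
mean = statistics.mean(scores)      # pulled toward the high-scoring tail

print(mode, median, mean)
```

For this data the mode is lowest, the mean is highest, and the median sits between them. For a negatively skewed set the order would typically reverse.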
RANGE
Distance between lowest and highest scores.
Range = highest score - lowest score.
Problems with range
- entirely determined by most extreme values
- therefore crude and unreliable.
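A small sketch of why the range is crude: the two made-up data sets below differ in only one score, yet their ranges are very different.

```python
def score_range(scores):
    """Range = highest score - lowest score."""
    return max(scores) - min(scores)

# Hypothetical data sets, identical except for the last score.
clustered = [48, 49, 50, 50, 51, 52]
with_outlier = [48, 49, 50, 50, 51, 95]

print(score_range(clustered))     # 52 - 48 = 4
print(score_range(with_outlier))  # 95 - 48 = 47
```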
STANDARD DEVIATION NUMBERS
In a normal distribution, approximately:
68% of scores fall within one standard deviation of the mean
95% fall within two standard deviations
99.7% fall within three standard deviations
Z-SCORE FORMULA
Z = (X - mean) / standard deviation
can be used for samples or populations
TRANSFORMING Z-SCORES INTO X VALUES
X = mean + Z(standard deviation)
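A sketch of the conversion in both directions. The mean of 100 and standard deviation of 15 are assumed values for illustration only.

```python
def z_to_x(z, mean, sd):
    """X = mean + z * sd"""
    return mean + z * sd

def x_to_z(x, mean, sd):
    """z = (X - mean) / sd"""
    return (x - mean) / sd

# Hypothetical distribution with mean 100 and standard deviation 15.
x_high = z_to_x(2.0, mean=100, sd=15)   # 100 + 2(15) = 130.0
x_low = z_to_x(-1.0, mean=100, sd=15)   # 100 - 1(15) = 85.0

print(x_high, x_low)
print(x_to_z(x_high, mean=100, sd=15))  # converting back recovers z = 2.0
```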
Z-SCORE AS A STANDARDISED DISTRIBUTION
Basically just relabelling x-axis in z-scores.
Does not change shape of graph.
Does not change individual location.
- MEAN will ALWAYS be ZERO, STANDARD DEVIATION will ALWAYS be ONE.
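The relabelling can be checked numerically. This sketch treats a small made-up set of scores as a whole population, so it uses `statistics.pstdev`, which divides by N. That choice is an assumption of the example.

```python
import statistics

# Hypothetical population of scores, invented for this example.
scores = [84, 80, 76, 82, 78]

mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # population standard deviation

# Convert every X into a z-score.
z_scores = [(x - mean) / sd for x in scores]

z_mean = statistics.mean(z_scores)
z_sd = statistics.pstdev(z_scores)

print(round(z_mean, 6), round(z_sd, 6))
```

Up to floating-point rounding, the z-scores have a mean of zero and a standard deviation of one, and each individual keeps the same position relative to the others.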
ADVANTAGES OF A STANDARDISED DISTRIBUTION (Z-SCORES)
Can directly compare scores from two distributions that had different means and standard deviations.
PROBABILITIES ON GRAPHS
On frequency distribution = proportion of distribution
On graph = proportion of area under the curve
UNIT NORMAL TABLE
C1: z-score values
C2: proportion between mean and z-score
C3: proportion beyond z-score
Probability = proportion. Percentage = proportion x 100.
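One row of the table can be reproduced with `statistics.NormalDist`. This is a sketch for z-scores at or above zero; `unit_normal_row` is a made-up helper name, and printed tables may round slightly differently.

```python
from statistics import NormalDist

def unit_normal_row(z):
    """Return (z, proportion between mean and z, proportion beyond z) for z >= 0."""
    below = NormalDist().cdf(z)       # proportion of the distribution below z
    return (z, below - 0.5, 1 - below)

z, between_mean_and_z, beyond_z = unit_normal_row(1.0)

print(f"C1 z = {z}")
print(f"C2 between mean and z = {between_mean_and_z:.4f}")
print(f"C3 beyond z = {beyond_z:.4f}")
```

The two proportions always add up to 0.5, because together they cover exactly one half of the distribution.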