Intro Flashcards
What is a measure of central tendency?
A measure that gives a general idea of the centre of the distribution of the data
If you have nominal data, what measure would you use to find the central tendency?
Mode
Advantage and 2 disadvantages of mode?
Adv: Not affected by extremely high or low values (outliers)
Dis:
- some distributions have no mode (uniform dist.)
- ignores distribution of observations - non modal values therefore have no weight
Problems with the mode? (2)
Some distributions have more than one mode (bimodal)
With ordinal, interval or ratio data the modal score may not be central to the distribution as a whole
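A minimal sketch of the mode cards above, using Python's standard `statistics.multimode` (Python 3.8+). The score lists are hypothetical, invented only for illustration. Note that `multimode` returns every value when all values tie, which is the case the cards describe as having no mode.

```python
from statistics import multimode

# Hypothetical scores, made up for illustration only.
one_mode = [2, 3, 3, 4, 5]
two_modes = [1, 1, 2, 3, 3]    # bimodal
no_clear_mode = [1, 2, 3, 4]   # every value occurs once (uniform)

modes_one = multimode(one_mode)        # [3]
modes_two = multimode(two_modes)       # [1, 3]
modes_flat = multimode(no_clear_mode)  # all values tie, so all are returned

# Adding an extreme value leaves the mode unchanged.
modes_with_outlier = multimode(one_mode + [1000])  # still [3]

print(modes_one, modes_two, modes_flat, modes_with_outlier)
```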
Four types of measurement of data?
Nominal
Ordinal
Interval
Ratio
NOIR
What is nominal data (3) and example?
Observations of a qualitative variable are measured and recorded as labels or names.
Data is classified into categories and can’t be sorted into an order.
Only mathematical operation permitted is classifying and counting.
Example: gender
2 characteristics of nominal data?
Mutually exclusive
Exhaustive (each object must appear in one of the categories)
What is ordinal data (2) and example?
Data are arranged in an order, but differences between values are meaningless.
Most advanced mathematical operation on this data is ranking of categories.
Eg. Level of education:
GCSE
A level
Degree
What is interval data (2) and example?
Meaningful amounts of differences between data values can be determined.
No absolute zero score
Eg. Temperature in Celsius or shoe size
Why can’t you say ‘100 degrees Celsius is twice as hot as 50 degrees Celsius’?
Because 0 on the Celsius scale is not an absolute zero (it marks the freezing point of water, not the absence of heat), so ratios of Celsius values are meaningless
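One way to check the Celsius card: convert both temperatures to kelvin, which does have an absolute zero, and compare the ratios. The helper name `celsius_to_kelvin` is just an illustrative choice.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to kelvin (0 K is absolute zero)."""
    return celsius + 273.15

# Ratio of the raw Celsius numbers: looks like "twice as hot".
ratio_celsius = 100 / 50

# Ratio on the kelvin scale, where ratios are meaningful.
ratio_kelvin = celsius_to_kelvin(100) / celsius_to_kelvin(50)

print(ratio_celsius)           # 2.0
print(round(ratio_kelvin, 3))  # about 1.155, nowhere near 2
```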
What is ratio data and example?
Extension of interval data that includes an inherent zero starting point, so ratios between values are meaningful.
Eg. Weight, age, temperature in kelvin
Advantages of median? (3)
Not affected by outliers
Unique median for each data set
Can be computed for ordinal, interval and ratio data
What is a parameter?
A measurable characteristic of a population
What is a statistic?
A measurable characteristic of a sample
Characteristics of mean? (4+ formula)
Unique (only one per dataset)
Requires interval or ratio data
Every single score affects it
Sum of deviations from mean is always zero
Σ(xi - xbar) = 0
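A quick numeric check of Σ(xi - xbar) = 0, using hypothetical scores. With non-integer data, floating-point rounding can leave a tiny non-zero remainder, so in general compare against zero with a tolerance.

```python
from statistics import mean

scores = [2, 4, 6, 8, 10]  # hypothetical scores for illustration

xbar = mean(scores)                      # 6
deviations = [x - xbar for x in scores]  # [-4, -2, 0, 2, 4]
total = sum(deviations)                  # 0

print(xbar, deviations, total)
```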
What is the least squares principle? (1+formula)
If the differences between the scores and the mean are squared and then added, the resulting sum is smaller than the sum of squared differences about any other value.
Σ(xi - xbar)^2 = min
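A small sketch of the least squares principle with hypothetical scores. It relies on the identity Σ(xi - c)^2 = Σ(xi - xbar)^2 + n(xbar - c)^2, so any centre c other than the mean adds a positive term. The function name `sum_of_squares` is an illustrative choice.

```python
from statistics import mean

def sum_of_squares(data, centre):
    """Sum of squared differences between each score and `centre`."""
    return sum((x - centre) ** 2 for x in data)

scores = [2, 4, 6, 8, 10]  # hypothetical scores for illustration
xbar = mean(scores)        # 6

ss_at_mean = sum_of_squares(scores, xbar)  # 40
ss_at_5 = sum_of_squares(scores, 5)        # 45 = 40 + 5 * (6 - 5)^2
ss_at_7 = sum_of_squares(scores, 7)        # 45 = 40 + 5 * (6 - 7)^2

print(ss_at_mean, ss_at_5, ss_at_7)
```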
Disadvantage of mean?
Affected by extreme values (outliers)
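To see the effect of an outlier, compare the mean and median of a small hypothetical data set before and after one value is made extreme.

```python
from statistics import mean, median

original = [10, 12, 13, 15, 20]       # hypothetical data
with_outlier = [10, 12, 13, 15, 200]  # largest value replaced by an outlier

mean_before = mean(original)          # 14
mean_after = mean(with_outlier)       # 50
median_before = median(original)      # 13
median_after = median(with_outlier)   # 13, unchanged

print(mean_before, mean_after)
print(median_before, median_after)
```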
Three common data shapes?
Mean = median - symmetrical dist.
Mean > median - positive skew
Mean < median - negative skew
Coefficient of skewness equation?
sk = (3(xbar - median))/s
s is standard deviation
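A sketch of the coefficient of skewness on hypothetical data. It assumes s is the sample standard deviation (n - 1 denominator, `statistics.stdev`); the card does not say which version is meant. The function name `coefficient_of_skewness` is an illustrative choice.

```python
from statistics import mean, median, stdev

def coefficient_of_skewness(data):
    """sk = 3 * (mean - median) / s, with s the sample standard deviation."""
    return 3 * (mean(data) - median(data)) / stdev(data)

# Hypothetical data sets for illustration.
symmetric = [1, 2, 3, 4, 5]    # mean = median
right_tail = [1, 2, 2, 3, 10]  # mean > median
left_tail = [1, 8, 9, 9, 10]   # mean < median

sk_symmetric = coefficient_of_skewness(symmetric)  # zero
sk_right = coefficient_of_skewness(right_tail)     # positive skew
sk_left = coefficient_of_skewness(left_tail)       # negative skew

print(sk_symmetric, sk_right, sk_left)
```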
2 equations for frequency distribution?
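Choose the smallest k such that 2^k > N (k is no. of classes, N is no. of observations)
Class interval determined by:
i >= (H-L)/k, i.e. compute (H-L)/k THEN ROUND UP (H is highest value, L is lowest value)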
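The two rules can be sketched as follows. The sample size and the highest and lowest values are hypothetical, and rounding up to the next whole number is an assumption; in practice the interval is often rounded up to a convenient figure such as a multiple of 5 or 10.

```python
import math

def number_of_classes(n):
    """Smallest k such that 2**k > n (n is the number of observations)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k

def class_interval(highest, lowest, k):
    """(H - L) / k, rounded up to the next whole number."""
    return math.ceil((highest - lowest) / k)

n = 80                    # hypothetical number of observations
k = number_of_classes(n)  # 2**6 = 64 is not > 80, 2**7 = 128 is, so k = 7

i = class_interval(98, 12, k)  # (98 - 12) / 7 is about 12.29, rounded up to 13

print(k, i)
```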
Upper and lower limit of frequency distribution classes?
Must include all values in data
Histogram axis labels?
X - class intervals Y - class frequencies
BARS TOUCH
What is a frequency polygon?
A graph of straight lines joining the points given by each class midpoint and its class frequency (the midpoints of the tops of adjacent histogram bars)
If there is a weird column on the left hand side of a stem and leaf diagram what is it?
Cumulative frequency
What goes on the axis of a bar chart?
X - classes (categories) Y - class frequencies
BARS DO NOT TOUCH
What is a line chart good for showing?
Change over time