descriptive statistics Flashcards

1
Q

what is a population

A

the entire group (of people, objects or events) we are interested in

2
Q

what is a sample

A

a subset of our population; its size is usually represented with n

3
Q

what is categorical data

A
  • 2 or more categories
  • nominal or ordinal
4
Q

what is discrete data

A

takes only fixed, separate values (such as whole-number counts) with a logical order
usually ordinal, ratio or interval

5
Q

what is continuous data

A
  • can take any fractional value
  • usually ratio or interval
6
Q

how can we present frequency in categorical data

A
  • raw frequency
  • bar graphs
7
Q

how can we present frequency in discrete data

A
  • raw frequency or percentage - bar graph
  • cumulative frequency
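
A minimal Python sketch of the frequency measures named above, using made-up data (the `scores` list is hypothetical, not from the deck). It tallies raw frequency, converts it to a percentage, and builds the cumulative frequency as a running total over the ordered values.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data: number of siblings reported by 10 people
scores = [0, 1, 1, 2, 2, 2, 3, 3, 4, 1]

counts = Counter(scores)
values = sorted(counts)                          # discrete values in logical order
raw = [counts[v] for v in values]                # raw frequency of each value
percent = [100 * f / len(scores) for f in raw]   # frequency as a percentage
cumulative = list(accumulate(raw))               # running total of raw frequency

for v, f, p, c in zip(values, raw, percent, cumulative):
    print(v, f, p, c)
```

The last cumulative frequency always equals the number of scores, which is a quick way to check the table.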
8
Q

when would we use measures of central tendency

A
  • when we want to condense the entire frequency distribution to a single number
9
Q

what is the mode

A
  • appears most often
  • nominal data
  • can be more than one value
10
Q

what is the median

A
  • the middle value of the dataset once the scores are put in order
11
Q

what is the mean

A
  • the sum of the data points divided by the number of data points
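
As a rough illustration of the three measures of central tendency, the sketch below uses Python's standard `statistics` module on a made-up `scores` list (hypothetical data). `multimode` is used because, as noted on the mode card, there can be more than one mode; it needs Python 3.8 or later.

```python
import statistics

scores = [2, 3, 3, 5, 7, 7, 8]   # hypothetical scores

modes = statistics.multimode(scores)   # every value tied for most frequent
median = statistics.median(scores)     # middle value of the sorted scores
mean = statistics.mean(scores)         # sum of the scores divided by their count

print(modes, median, mean)
```

Here 3 and 7 both appear twice, so the data has two modes, while the median and mean happen to coincide.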
12
Q

what are pros of using median

A
  • insensitive to outliers
  • gives a real, meaningful data value
  • useful for ordinal data and skewed interval/ratio data
13
Q

what are cons of using median

A
  • ignores a lot of data
  • difficult to calculate without a computer
  • can’t use with nominal data
14
Q

what are the pros of using the mean

A
  • uses all the data
  • most effective for normally distributed data
15
Q

what are the cons of using mean

A
  • sensitive to outliers
  • values are not always meaningful
  • only meaningful for ratio and interval data
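
The outlier points on the median and mean cards can be checked with a small Python sketch. The data are made up: `scores` is a hypothetical list, and `with_outlier` adds one extreme value to it.

```python
import statistics

scores = [4, 5, 5, 6, 7]          # hypothetical scores
with_outlier = scores + [60]      # same scores plus one extreme value

mean_before = statistics.mean(scores)
median_before = statistics.median(scores)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(mean_before, median_before)   # centre without the outlier
print(mean_after, median_after)     # centre with the outlier
```

In this example a single extreme score drags the mean well above every typical score, while the median barely moves.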
16
Q

what measures of spread are used with the median

A
  • range
  • interquartile range
17
Q

what measures of spread are used with the mean

A
  • variance
  • standard deviation
18
Q

what is interquartile range

A
  • similar to range but ignores most extreme values
  • is the range of scores within the middle 50% of scores
  • lower quartile - median of lower half of the data
  • upper quartile - median of upper half of the data
  • interquartile range = upper quartile - lower quartile
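
The quartile method described above can be sketched in Python as follows. The helper name `quartiles_by_halves` and the data are hypothetical, and one assumption is made that the card does not state: with an odd number of scores, the middle score is left out of both halves. Other quartile conventions exist and can give slightly different values.

```python
import statistics

def quartiles_by_halves(data):
    """Return (lower quartile, upper quartile) as medians of the two halves.

    Assumption: with an odd number of scores, the middle score is
    excluded from both halves.
    """
    if len(data) < 2:
        raise ValueError("need at least two scores")
    ordered = sorted(data)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[-half:]
    return statistics.median(lower_half), statistics.median(upper_half)

scores = [1, 3, 4, 6, 7, 8, 10, 12, 40]   # hypothetical; 40 is an extreme value
q1, q3 = quartiles_by_halves(scores)
iqr = q3 - q1                             # upper quartile - lower quartile
data_range = max(scores) - min(scores)    # full range, for comparison

print(q1, q3, iqr, data_range)
```

The range is dominated by the single extreme score, whereas the interquartile range only reflects the middle of the data.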
19
Q

what is deviance

A
  • the mean is subtracted from each score (deviance = score - mean)
  • could see a deviance of 0
20
Q

what is sum of squared errors

A
  • each deviance is squared and all the squared deviances are summed
  • more data points = a bigger SS
21
Q

what is variance

A

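the average of the squared deviances: the sum of squares divided by the number of data points (or by N - 1 when estimating from a sample)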
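
The chain from deviance to sum of squares to variance can be traced step by step in Python. This is a sketch on made-up data (`scores` is hypothetical). Because the card only says "average", both divisors are shown: N - 1 for a sample estimate and N for a whole population.

```python
import statistics

scores = [2, 4, 4, 6, 9]                     # hypothetical scores
mean = sum(scores) / len(scores)

deviances = [x - mean for x in scores]       # score minus mean, for each score
ss = sum(d ** 2 for d in deviances)          # sum of squared errors (SS)

sample_variance = ss / (len(scores) - 1)     # divide by N - 1 (sample estimate)
population_variance = ss / len(scores)       # divide by N (whole population)

print(deviances, ss, sample_variance, population_variance)
```

The raw deviances sum to zero here, which is why they are squared before being added up. The results agree with `statistics.variance` and `statistics.pvariance` from the standard library.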

22
Q

what are pros of variance

A
  • uses all the data
  • forms the basis of several other tests
23
Q

what are cons of variance

A
  • requires a normal distribution
  • sensitive to outliers
  • units are squared, so they are hard to interpret
24
Q

what is standard deviation

A
  • a measure of spread expressed in the same unit of measurement as the dependent variable
  • calculated as the square root of the variance
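
A short Python sketch of the last step, again on hypothetical data: taking the square root of the variance returns the measure of spread to the original units. The sample (N - 1) variance is assumed here.

```python
import math
import statistics

scores = [2, 4, 4, 6, 9]                   # hypothetical scores, e.g. in seconds

variance = statistics.variance(scores)     # sample variance, in seconds squared
sd = math.sqrt(variance)                   # standard deviation, back in seconds

print(variance, sd)
```

The same value is available directly from `statistics.stdev(scores)`.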