Descriptive part 2 Flashcards

1
Q

Three concepts that are traditional statistics

A

Measures of Central Tendency
Measures of Variation
Measures of Position

2
Q

statistical and parametric measurements of where the data are centered

A

Measures of Central Tendency

3
Q

Measures of Central Tendency

A

mean, median, or mode

4
Q

statistical and parametric measurement of how dispersed the data are

A

Measures of Variation

5
Q

describe the position of a data value in relation to the rest of the data set

A

Measures of Position

6
Q

a characteristic or measure obtained by using the data values from a sample.

A

statistic

7
Q

a characteristic or measure obtained by using all the data values from a specific population.

A

parameter

8
Q

Population size

A

N

9
Q

sample size

A

n

10
Q

describes where the distribution may be ‘centered’

A

Measures of Central Tendency

11
Q

Measures of Central Tendency are also known as…

A

Average

12
Q

center of gravity

A

Mean

13
Q

value in the middle

A

Median

14
Q

most typical value

A

Mode

15
Q

the average of the values: equal to the sum total of all values divided by the number of values

A

Mean
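
A minimal sketch of the definition above, in Python. The data values are hypothetical and chosen only for illustration.

```python
values = [2, 4, 4, 6, 9]  # hypothetical raw data

# Mean = sum total of all values divided by the number of values.
total = sum(values)    # 25
count = len(values)    # 5
x_bar = total / count  # 5.0
print(x_bar)
```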

16
Q

The central tendency that is affected by the presence of outliers in the data

A

Mean

17
Q

________ letters are used to denote parameters

A

Greek

18
Q

________ letters are used to denote statistics

A

Roman

19
Q

What are the two types of mean?

A

parametric/population mean
statistical/sample mean

20
Q

Mean: values in the data set are of the whole population.

A

parametric/population mean

21
Q

parametric/population mean is represented by the Greek letter ______

A

μ (mu)

22
Q

Mean: values that comprise samples.

A

statistical/sample mean

23
Q

statistical/sample mean is represented by the Roman letter ______

A

x̄ (x bar)

24
Q

TWO WAYS OF COMPUTING THE MEAN

A

Mean for Ungrouped Data
Mean for Grouped Data

25
Q

Mean: comes from the raw data

A

Mean for Ungrouped Data

26
Q

ROUNDING RULE FOR THE MEAN

A

The mean should be rounded to one more decimal place than occurs in the raw data.
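
The rounding rule can be sketched as below. The helper name `round_mean` and its `raw_decimals` parameter are hypothetical; the caller is assumed to know how many decimal places the raw data carry. Python's built-in `round` may differ from hand rounding in rare exact-half cases.

```python
def round_mean(values, raw_decimals):
    """Mean rounded to one more decimal place than the raw data.

    raw_decimals (hypothetical parameter): decimal places in the raw data.
    """
    return round(sum(values) / len(values), raw_decimals + 1)

print(round_mean([3, 4, 4], 0))        # whole-number data, 1 decimal kept
print(round_mean([1.2, 2.5, 3.1], 1))  # 1-decimal data, 2 decimals kept
```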

27
Q

Mean: comes from the frequency distribution table

A

Mean for Grouped Data

28
Q

the procedure for finding the mean for grouped data uses the _________ of the classes.

A

midpoints
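
A rough sketch of the grouped-data procedure, assuming each class midpoint is taken as (lower limit + upper limit) / 2 and the mean is the frequency-weighted average of the midpoints. The class limits and frequencies are hypothetical.

```python
# Hypothetical frequency distribution: (lower limit, upper limit, frequency)
classes = [(10, 19, 3), (20, 29, 5), (30, 39, 2)]

# Midpoint of each class: 14.5, 24.5, 34.5
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]

# n = total frequency
n = sum(freq for _, _, freq in classes)

# Grouped mean = sum(frequency * midpoint) / n
grouped_mean = sum(
    freq * xm for (_, _, freq), xm in zip(classes, midpoints)
) / n
print(grouped_mean)
```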

29
Q

the middlemost value

A

Median

30
Q

the midpoint of the data array

A

Median

31
Q

the symbol for the median

A

MD

32
Q

obtained by sorting the values from lowest to highest and getting the value in the middle (halfway point)

A

Median
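
The sort-and-take-the-middle procedure might look like this in Python. Averaging the two middle values when the count is even is the usual convention and is an assumption here, since the card does not cover that case. The function name is hypothetical.

```python
def median_sorted(values):
    """Sort from lowest to highest, then return the halfway point."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count (assumed convention): average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_sorted([7, 1, 3]))
print(median_sorted([4, 1, 3, 2]))
```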

33
Q

preferred over the mean as the typical value (or center) when the distribution is skewed or contains outliers

A

Median

34
Q

most frequently occurring value in a data set, most typical

A

Mode

35
Q

most descriptive when distributions are highly peaked (leptokurtic), suggesting a large concentration on a single value

A

Mode

36
Q

one value occurs with the greatest frequency

A

unimodal

37
Q

two values with the same greatest frequency

A

Bimodal

38
Q

more than two values occurring at the same greatest frequency

A

Multimodal

39
Q

no data value occurs more than once

A

No mode
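
The four cases (unimodal, bimodal, multimodal, no mode) can be sketched with a frequency count. The function name is hypothetical, and "no mode" is applied only when no value occurs more than once, as on the card above; non-empty input is assumed.

```python
from collections import Counter

def classify_modes(values):
    """Return (label, modes) for a non-empty list of values."""
    counts = Counter(values)
    top = max(counts.values())  # greatest frequency
    if top == 1:
        # No data value occurs more than once.
        return "no mode", []
    modes = sorted(v for v, c in counts.items() if c == top)
    label = {1: "unimodal", 2: "bimodal"}.get(len(modes), "multimodal")
    return label, modes

print(classify_modes([1, 2, 2, 3]))
print(classify_modes([1, 1, 2, 2, 3]))
print(classify_modes([1, 2, 3]))
```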

40
Q

a rough estimate of the middle

A

Midrange

41
Q

found by adding the lowest and highest values in the data set and dividing by 2

A

Midrange

42
Q

a very rough estimate of the average and can be affected by one extremely high or low value

not reliable, because the values between the lowest and highest are not taken into consideration

A

Midrange
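
A small sketch of the midrange and of its sensitivity to a single extreme value. The data are hypothetical.

```python
def midrange(values):
    """(lowest value + highest value) / 2."""
    return (min(values) + max(values)) / 2

typical = [2, 3, 4, 5, 6]
with_extreme = [2, 3, 4, 5, 100]  # one extremely high value

print(midrange(typical))       # 4.0
print(midrange(with_extreme))  # 51.0, pulled far from most of the data
```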

43
Q

What is the symbol used for midrange?

A

MR

44
Q

used to find the mean of a data set in which not all values are equally represented

A

Weighted mean
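
One common way to compute a weighted mean is sum(weight × value) / sum(weights). The sketch below assumes that formula; the grades and unit weights are hypothetical.

```python
def weighted_mean(values, weights):
    """sum(w * x) / sum(w); values and weights must have equal length."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

grades = [90, 80]  # hypothetical course grades
units = [3, 1]     # hypothetical credit units used as weights
print(weighted_mean(grades, units))  # (270 + 80) / 4 = 87.5
```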

45
Q

Symbol for Mode

A

None

46
Q

a single bar stands out

A

Unimodal

47
Q

two bars stand out

A

Bimodal

48
Q

more than two bars stand out

A

Multimodal

49
Q

all bars are of the same height

A

No mode

50
Q

describes the symmetry of a histogram

A

Skewness

51
Q

right-side is mirror image of the left-side

A

Symmetric

52
Q

asymmetric distribution in which the tapering of the two sides (tails) is different

A

Skewed

53
Q

the right tail is longer; more values concentrated on the left (more lower values)

if the mean is greater than the median

A

Skewed to the right (positively skewed)

54
Q

the left tail is longer; more values concentrated on the right (more higher values)

if the mean is less than the median

A

Skewed to the left (negatively skewed)
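
The mean-versus-median comparison on the skewness cards can be sketched as a rule of thumb. It is only a rough indicator and can fail for some data sets. The function name and data are hypothetical.

```python
from statistics import mean, median

def skew_direction(values):
    """Rule of thumb only: compare the mean with the median."""
    m, md = mean(values), median(values)
    if m > md:
        return "skewed to the right (positively skewed)"
    if m < md:
        return "skewed to the left (negatively skewed)"
    return "roughly symmetric"

print(skew_direction([1, 2, 2, 3, 12]))     # mean 4 > median 2
print(skew_direction([1, 10, 11, 11, 12]))  # mean 9 < median 11
```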

55
Q

measures how flat or how peaked the distribution of the data values is

A

Kurtosis

56
Q

flat, broad peak with light (thin) tails

A

Flat (Platykurtic)

57
Q

sharp peaks

A

Highly-peaked (Leptokurtic)

58
Q

rounded peak, symmetric tails

A

Bell-shaped (Mesokurtic)

59
Q

measure the spread or variability of the values from each other

A

Measures of variation

61
Q

What are the 5 measures of variation

A

range
variance
standard deviation
coefficient of variation
interquartile range (IQR)

62
Q

simplest measure of dispersion, used to get a quick idea of the spread

A

Range

63
Q

the difference between the highest and lowest value

A

range

64
Q

rest of the values are not used in the calculation
waste of information

one of the weakest measures of dispersion

A

Range

65
Q

average of the squared deviations of values from the mean

A

Variance

66
Q

takes into consideration all of the values

measured in square of the original units
makes it a problem for interpretation

A

Variance

67
Q

square root of the variance
measured in the same units as the data values

A

Standard deviation

68
Q

The symbol ‘__’ represents the population standard deviation.

A

σ

69
Q

population variance is symbolically represented by

A

σ^2

70
Q

ROUNDING RULE FOR SD

A

The rounding rule for the standard deviation is the same as that for the mean. The final answer should be rounded to one more decimal place than that of the original data.

71
Q

an estimate of the population variance/standard deviation

A

Sample variance and SD

72
Q

Sample variance is denoted by

A

s^2

73
Q

standard deviation of a sample is denoted by

A

s
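
A sketch contrasting the population and sample versions. The cards do not state the divisors; dividing by N for the population and by n − 1 for the sample is the usual convention and is assumed here. The data are hypothetical.

```python
from statistics import pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values

mu = sum(data) / len(data)                  # mean = 5.0
squared_devs = [(x - mu) ** 2 for x in data]  # squared deviations, sum = 32

# Population: average of the squared deviations (divide by N).
pop_var = sum(squared_devs) / len(data)  # sigma^2 = 4.0
pop_sd = pop_var ** 0.5                  # sigma = 2.0

# Sample (assumed convention): divide by n - 1 instead.
samp_var = sum(squared_devs) / (len(data) - 1)  # s^2 = 32 / 7
samp_sd = samp_var ** 0.5                       # s

# Cross-check against the standard library.
assert abs(pop_var - pvariance(data)) < 1e-9
assert abs(samp_var - variance(data)) < 1e-9
```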

74
Q

ratio of the standard deviation to the mean

A

Coefficient of Variation

75
Q

used to compare the measure of spread between sets of data that are measured in different units

A

Coefficient of Variation
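
A minimal sketch of comparing spread across different units with the ratio of standard deviation to mean. The sample standard deviation is used here as an assumption, and the data are hypothetical. The ratio is often reported as a percentage.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless ratio)."""
    return stdev(values) / mean(values)

heights_cm = [150, 160, 170]  # hypothetical, s = 10, mean = 160
weights_kg = [50, 60, 70]     # hypothetical, s = 10, mean = 60

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)
print(f"{cv_height:.2%} vs {cv_weight:.2%}")
```

Both sets have the same standard deviation, but the weights vary more relative to their mean.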

76
Q

measures that describe the position or location of particular values along the cumulative distribution

A

Measures of position

77
Q

they are sometimes useful for determining cut-off points for certain categories

A

Measures of position

78
Q

3 types of measures of position

A

Standard Score (z Score)
Percentile (quantiles)
Quartiles and Deciles (quantiles)

79
Q

number of standard deviations that a data value is above or below the mean

A

Standard score

80
Q

Standard score is also known as

A

Z score

81
Q

if a standard score is zero, then the data value is the same as the

A

Mean

82
Q

in a normal distribution curve, _______ measures how far a value is from the mean

A

z-score

83
Q

if z-score is 2, the value is 2 standard deviations away from the _____

A

mean
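
The standard score can be sketched as z = (value − mean) / standard deviation. The function name and the numbers are hypothetical.

```python
def z_score(value, mean, sd):
    """Number of standard deviations a value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

print(z_score(85, 75, 5))  # 2.0: two standard deviations above the mean
print(z_score(75, 75, 5))  # 0.0: the value equals the mean
print(z_score(70, 75, 5))  # -1.0: one standard deviation below the mean
```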

84
Q

divide the data into 100 equal parts

A

Percentile

85
Q

indicate the position of an individual in a group

A

Percentile

86
Q

divide the data into 10 equal parts

A

Decile

87
Q

divide the data into 4 equal parts

A

Quartile

88
Q

the range of values bounded by the 25th and 75th percentiles (P25 and P75)

A

Interquartile Range

89
Q

it gives information on the values of the middle 50% of the data

A

Interquartile range

90
Q

the higher the IQR, the larger the _______ in the middlemost values of the data

A

variation

91
Q

IQR formula

A

IQR = Q3-Q1
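
A sketch of the formula using `statistics.quantiles` from the Python standard library with its default method. Textbooks use several quartile conventions, so hand-computed quartiles may differ slightly. The data are hypothetical.

```python
from statistics import quantiles

data = list(range(1, 12))  # hypothetical values 1..11

# n=4 splits the data into 4 equal parts and returns the 3 cut points.
q1, q2, q3 = quantiles(data, n=4)

iqr = q3 - q1  # spread of the middle 50% of the data
print(q1, q3, iqr)
```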

92
Q

an extremely high or an extremely low data value when compared with the rest of the data values

A

Outlier

93
Q

relatively less affected by outliers than a nonresistant statistic

A

resistant statistic
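
To illustrate resistance, the sketch below adds one extreme value to a hypothetical data set and compares how far the mean and the median move.

```python
from statistics import mean, median

base = [10, 11, 12, 12, 13]  # hypothetical data
with_outlier = base + [100]  # same data plus one extreme value

mean_shift = mean(with_outlier) - mean(base)        # large change
median_shift = median(with_outlier) - median(base)  # no change here

print(mean_shift, median_shift)
```

Here the median behaves as the resistant statistic and the mean as the nonresistant one.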

94
Q

when a distribution is skewed or contains outliers, ______ may more accurately summarize the data than traditional statistics

A

EDA - Exploratory Data Analysis

95
Q

A ________ can be used to graphically represent the data set. These plots involve five specific values called the five-number summary of the data set

A

boxplot

96
Q

a graph that shows some of the most important statistics in the data set, specifically:
- the median (central tendency);
- P25 and P75 (location and variation)
- some extreme values (outliers)

A

Boxplot
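
The values a boxplot draws can be sketched as a five-number summary, assumed here to be the minimum, Q1 (P25), median, Q3 (P75), and maximum. Quartiles come from `statistics.quantiles` with its default method; other conventions give slightly different values. The function name and data are hypothetical.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, med, q3 = quantiles(values, n=4)
    return min(values), q1, med, q3, max(values)

data = list(range(1, 12))  # hypothetical values 1..11
print(five_number_summary(data))
```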

97
Q

it is a very versatile graph for showing distributions, comparisons and associations between variables

A

Boxplot