Exam 1 Flashcards

1
Q

Variables that take categories as their values, such as “yes” or “no,” or blue, brown, green

A

Categorical (qualitative)

2
Q

Variables that have values that represent a counted or measured quantity

A

Numerical (quantitative)

3
Q

Variables that arise from a counting process

A

Discrete / numerical (quantitative)

4
Q

Variables that arise from a measuring process

A

Continuous / numerical (quantitative)

5
Q

Facts and figures collected, analyzed, and summarized for presentation and interpretation

A

Data

6
Q

All the data collected in a particular study are referred to as the __________ for the study

A

Data set

7
Q

The entities on which data are collected

A

Elements

8
Q

A characteristic of interest for the elements

A

Variable

9
Q

The set of measurements obtained for a particular element is called _____

A

An observation

10
Q

A data set with n ________ contains n __________.

A

Elements, observations

11
Q

How do you calculate the total number of data values?

A

The total number of data values in a complete data set is the number of elements multiplied by the number of variables
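As a quick illustration of the elements-times-variables rule, here is a minimal Python sketch. The students, their majors, and their GPAs are made-up values used only for the example.

```python
# Hypothetical data set: each dictionary is one element (a student),
# and each key is one variable recorded for that element.
data_set = [
    {"name": "Ana", "major": "Finance", "gpa": 3.4},
    {"name": "Ben", "major": "Marketing", "gpa": 3.1},
    {"name": "Cho", "major": "Accounting", "gpa": 3.8},
    {"name": "Dev", "major": "Finance", "gpa": 2.9},
]

n_elements = len(data_set)        # one observation per element
n_variables = len(data_set[0])    # variables measured on each element
total_values = n_elements * n_variables

print(n_elements, "elements x", n_variables, "variables =", total_values, "data values")
```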

12
Q

Nominal data

A

Defined categories such as eye color, political party, marital status

13
Q

Ordinal categories

A

Categorical - Ordered categories such as good, better, best or low, medium, high

14
Q

Data classified into distinct categories in which no ranking is implied

A

A nominal scale. EX: Do you have a Facebook profile? (Y or N); cellular provider? (Verizon, AT&T, etc.)

15
Q

Classifies data into distinct categories in which ranking is implied

A

Ordinal data - EX: grades, ratings, product satisfaction

16
Q

Data that has the properties of ordinal data and the ___ between observations is expressed in terms of a fixed unit of measure. It is always ___. The scale ____ contain a ____ value that indicates that nothing exists for the variable at the _____ point.

A

Interval, numeric, does not, zero, zero

17
Q

Data that has all the properties of interval data and the ___ of two values is meaningful. The scale ____ contain a zero value that indicates that nothing exists for the variable at the zero point.

A

Ratio, must

18
Q

Data that is collected at the same or approximately the same point in time.

A

Cross-sectional data. EX: data detailing the number of building permits issued in Nov. 2019 in each of the counties of Ohio.

19
Q

Data that is collected over several time periods

A

Time series data. EX: data detailing the number of building permits issued in Lucas County, Ohio, in each of the last 36 months. Graphs of time series help analysts understand what happened in the past, identify any trends over time, and project future levels for the time series.

20
Q

The set of all elements of interest in a particular study

A

Population

21
Q

A subset of the population

A

Sample

22
Q

The process of using data obtained from a sample to make estimates and test hypotheses about the characteristics of a population

A

Statistical inference

23
Q

Collecting data for the entire population

A

Census

24
Q

Collecting data for a sample

A

Sample survey

25
Collecting data via sampling is used when doing so is:
Less time-consuming than selecting every item in the population, less costly than selecting every item in the population, and less cumbersome and more practical than analyzing the entire population.
26
Summarizes the value of a specific variable for a population.
Population Parameter
27
Summarizes the value of a specific variable for sample data
Sample statistic
28
Tallies the frequencies or percentages of items in a set of categories so that you can see differences between categories
A summary table
29
A summary of data showing the number (frequency) of observations in each of several non-overlapping categories or classes.
Frequency distribution
30
The ____ ______ of a class is the fraction or proportion of the total number of data items belonging to the class. What is the equation?
Relative frequency. Relative frequency of a class = frequency of the class / n, where n is the total number of data items
31
How do you calculate percent frequency of a class?
The relative frequency multiplied by 100
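The frequency, relative frequency, and percent frequency calculations can be sketched in a few lines of Python. The eye-color responses below are invented purely for illustration.

```python
from collections import Counter

# Hypothetical categorical responses (eye color), made up for this example.
responses = ["blue", "brown", "brown", "green", "brown",
             "blue", "brown", "green", "blue", "brown"]

n = len(responses)               # total number of data items
frequency = Counter(responses)   # frequency distribution: count per class

# Relative frequency of a class = frequency of the class / n
relative_frequency = {c: count / n for c, count in frequency.items()}

# Percent frequency = relative frequency x 100 (computed as 100 * count / n)
percent_frequency = {c: 100 * count / n for c, count in frequency.items()}

for c in frequency:
    print(c, frequency[c], relative_frequency[c], percent_frequency[c])
```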
32
Used to study patterns that may exist between the responses of two or more categorical variables.
A contingency table
33
It cross tabulates or tallies jointly the responses of the categorical variables
A contingency table.
34
A tabular summary of data for two variables.
A crosstabulation
35
Contingency table - For two variables, the tallies for one variable are located in the ____ and the tallies for the second variable are located in the ________.
Rows, columns
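A contingency table can be built by tallying pairs of responses jointly. The sketch below assumes a small, made-up survey of two categorical variables (whether the respondent has a profile, and their cellular provider); the variable names are illustrative only.

```python
from collections import Counter

# Hypothetical paired responses: (has_profile, provider) for each respondent.
responses = [
    ("Y", "Verizon"), ("N", "AT&T"), ("Y", "AT&T"),
    ("Y", "Verizon"), ("N", "Verizon"), ("Y", "AT&T"),
]

joint = Counter(responses)  # joint tally of both variables

row_labels = sorted({profile for profile, _ in responses})     # first variable in rows
col_labels = sorted({provider for _, provider in responses})   # second variable in columns

# table[row][column] holds the number of respondents with that combination.
table = {r: {c: joint[(r, c)] for c in col_labels} for r in row_labels}

for r in row_labels:
    print(r, table[r])
```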
36
A sequence of data, in rank order, from the smallest value to the largest value.
An ordered array.
37
It shows range (minimum value to maximum value)
Ordered array.
38
May help identify outliers (unusual observations)
An ordered array
39
A summary table in which the data are arranged into numerically ordered classes
Frequency distribution
40
You must give attention to selecting the appropriate number of _____ ______ for the table, determining a suitable width of a class grouping, and establishing the boundaries of each class grouping to avoid overlapping.
Class groupings.
41
How do you determine the width of a class interval?
Divide the range (highest value-lowest value) of the data by the number of class groupings desired
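The class-width rule can be sketched as follows. The data values and the choice of four class groupings are assumptions made for the example; in practice the result is often rounded up to a convenient number.

```python
# Hypothetical numerical data and a desired number of class groupings.
values = [12, 15, 21, 27, 33, 38, 44, 52]
num_classes = 4

data_range = max(values) - min(values)    # highest value - lowest value
class_width = data_range / num_classes    # width of each class interval

print("range =", data_range, "class width =", class_width)
```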
42
A ______ visualizes a categorical variable as a series of bars. The length of each bar represents either the _____ or ___ of values for each category. Each bar is separated by a space called a ______.
Bar chart, frequency or percentage, gap
43
A ___ ___ is a circle that is broken up into slices that represent categories. The size of each ___ varies according to the percentage in each category.
Pie chart, slice
44
A ___ ____ is the outer part of a circle broken up into pieces that represent categories. The size of each piece varies according to the percentage in each category.
Doughnut chart
45
Used to portray categorical data. A vertical bar chart where categories are shown in descending order of frequency. A cumulative polygon is shown in the same graph. Used to separate the “___ ___” from the “___ ___.”
The Pareto chart. “Vital few” from the “trivial many”
46
Represents data from a contingency table
Side-by-side chart
47
Can be used to represent the data from a contingency table
Doughnut chart
48
Organizes data in groups (called ___) so that values within each group (the ____) branch out to the right of each row.
Stem-and-leaf display (stems, leaves)
49
A vertical bar chart of the data in a frequency distribution is called a _______.
Histogram
50
Formed by having the midpoint of each class represent the data in that class and then connecting the sequence of midpoints at their respective class percentages.
Percentage polygon
51
Displays the variable of interest along the X axis, and the cumulative percentages along the Y axis. Useful when there are two or more groups to compare.
Cumulative percentage polygon, or ogive.
52
Used for numerical data consisting of paired observations taken from two numerical variables.
Scatter plots
53
Used to examine possible relationships between two numerical variables.
Scatter plots
54
Used to study patterns in the values of a numeric variable over time.
Time-series plot
55
Constructed by tallying the responses of three or more categorical variables.
Multidimensional contingency table.
56
Provides a measure of central location
Mean
57
The average of all the data values
Mean
58
Perhaps the most important measure of location.
The mean
59
The value in the middle when the data items are arranged in ascending order
Median
60
Whenever a data set has extreme values, the ____ is the preferred measure of central location
Median
61
For an odd number of observations (in ____ order) the median is the ____ value
ascending, middle
62
For an even number of observations (in ____order), the median is the ______ of the two middle values
Ascending, average. EX: median = (19 + 26)/2 = 22.5
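The odd and even cases of the median can be sketched in Python as below. The helper name `median` and the sample values are illustrative; the even-count example reuses the two middle values 19 and 26 from the card above.

```python
def median(values):
    """Return the median: the middle value (odd n) or the average of the two middle values (even n)."""
    ordered = sorted(values)   # arrange the data in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                         # odd n: the middle value
    return (ordered[mid - 1] + ordered[mid]) / 2    # even n: average of the two middle values

odd_example = median([26, 12, 19])         # ordered: 12, 19, 26
even_example = median([40, 19, 12, 26])    # ordered: 12, 19, 26, 40

print(odd_example, even_example)
```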
63
The ____ of a data set is the value that occurs with the greatest frequency
Mode
64
The greatest frequency can occur at two or more different values
The Mode
65
If the data have exactly two modes, the data are ____.
Bimodal
66
If the data have more than two modes, the data are ______.
multimodal
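To see how one, two, or more modes arise, the sketch below uses `statistics.multimode` from the Python standard library. The helper `describe_modes` and its labels are assumptions for illustration; it expects non-empty data, and data in which every value occurs once would be reported as having many modes.

```python
from statistics import multimode

def describe_modes(values):
    """Return the list of modes and a label based on how many modes there are."""
    modes = multimode(values)   # every value tied for the greatest frequency
    if len(modes) == 1:
        label = "one mode"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label

single = describe_modes([4, 4, 5])
two = describe_modes([1, 2, 2, 3, 3])
many = describe_modes([1, 1, 2, 2, 3, 3])

print(single, two, many)
```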
67
Excel's mean function
=AVERAGE(data cell range)
68
Excel's median function
=MEDIAN(data cell range)
69
Excel's mode function
=MODE.SNGL(data cell range)
70
How does one calculate the geometric mean?
Finding the nth root of the product of n values
71
What is the geometric mean function?
=GEOMEAN(data cell range)
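The nth-root-of-the-product rule can be written directly in Python. This is a minimal sketch that assumes all values are positive; the function name and sample values are made up for the example.

```python
import math

def geometric_mean(values):
    """nth root of the product of n values (assumes every value is positive)."""
    n = len(values)
    return math.prod(values) ** (1 / n)

gm_two = geometric_mean([2, 8])        # square root of 16
gm_three = geometric_mean([1, 3, 9])   # cube root of 27

print(gm_two, gm_three)
```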
72
The _______ of a data set is a value such that at least _ percent of the items take on this value or less and at least (100- __) percent of the items take on this value or more.
pth percentile, p, p
73
Excel function used to compute percentiles
=PERCENTILE.EXC(data range, p/100)
74
Quartiles examples
First quartile = 25th percentile, second quartile = 50th percentile = median, third quartile = 75th percentile
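One common way to locate a percentile is to use position (p/100)(n + 1) in the ordered data and interpolate between neighbors. The sketch below implements that convention, which is assumed here to be the one behind the exclusive percentile function; the function name and data are illustrative.

```python
def percentile_exclusive(values, p):
    """pth percentile using position (p/100)*(n+1) with linear interpolation (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    position = (p / 100) * (n + 1)   # 1-based position in the ordered data
    if position < 1 or position > n:
        raise ValueError("p is out of range for this method and sample size")
    lower = int(position)            # whole part of the position
    fraction = position - lower      # how far to move toward the next value
    if lower == n:
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

data = [10, 20, 30, 40, 50, 60, 70]
q1 = percentile_exclusive(data, 25)   # first quartile
q2 = percentile_exclusive(data, 50)   # second quartile = median
q3 = percentile_exclusive(data, 75)   # third quartile

print(q1, q2, q3)
```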
75
Measures of ____ give information on the ____ or _____ or _____ of the data values
Spread, variability, or dispersion
76
The difference between the largest and smallest data values
The range
77
What is the range calculation?
Range = largest value - smallest value
78
The simplest measure of variability
Range
79
Is very sensitive to the smallest and largest data values
Range
80
A measure of variability that utilizes all the data
Variance
81
Based on the difference between the value of each observation and the mean
Variance
82
The ____ _____ of a data set is the positive square root of the variance
Standard deviation
83
Measured in the same units as the data, making it more easily interpreted than the variance
Standard deviation
84
Excel function for sample variance
=VAR.S(data cell range)
85
Excel function for sample standard deviation
=STDEV.S(data cell range)
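The sample variance and sample standard deviation can be sketched by hand in Python. The sketch assumes the sample versions, which divide by n - 1; the function names and data are made up for the example.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_std_dev(values):
    """Positive square root of the sample variance (same units as the data)."""
    return math.sqrt(sample_variance(values))

data = [1, 2, 3, 4, 5]
variance = sample_variance(data)
std_dev = sample_std_dev(data)

print(variance, std_dev)
```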
86
Indicates how large the standard deviation is in relation to the mean
Coefficient of variation
87
The number of standard deviations a data value is from the mean
Z-score
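The coefficient of variation and the z-score both combine the mean and the standard deviation. A minimal sketch, assuming sample statistics and made-up data:

```python
from statistics import mean, stdev

data = [1, 2, 3, 4, 5]   # hypothetical sample
x_bar = mean(data)       # sample mean
s = stdev(data)          # sample standard deviation (divides by n - 1)

# Coefficient of variation: standard deviation relative to the mean, as a percentage.
coefficient_of_variation = (s / x_bar) * 100

def z_score(x):
    """Number of standard deviations that x lies from the mean."""
    return (x - x_bar) / s

z_of_largest = z_score(5)
z_of_mean = z_score(3)

print(coefficient_of_variation, z_of_largest, z_of_mean)
```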
88
Describes how data are distributed
Shape of a distribution
89
Measures the extent to which data values are not symmetrical
Skewness
90
Measures the peakedness of the curve of the distribution, that is, how sharply the curve rises approaching the center of the distribution
Kurtosis
91
Summary measures describing a population, called ____, are denoted with Greek letters
Parameters
92
The sum of the values in the population divided by the population size, N
Population mean
93
The ___ ___ approximates the variation of data in a bell-shaped distribution
Empirical rule
94
The _____ measures the strength of the linear relationship between two ____ variables
Covariance, numerical
95
Excel function for the sample covariance
=COVARIANCE.S(X,Y)
96
Excel function for the coefficient of correlation
=CORREL(X,Y)
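The sample covariance and the coefficient of correlation can be computed by hand as sketched below. The paired values are invented so that y is exactly twice x, which should give a perfect positive linear relationship; the function names are illustrative.

```python
import math

def sample_covariance(x, y):
    """Sum of products of paired deviations from the means, divided by n - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Covariance divided by the product of the two sample standard deviations."""
    s_x = math.sqrt(sample_covariance(x, x))   # covariance of x with itself is its variance
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]   # hypothetical: y = 2x

cov_xy = sample_covariance(x, y)
r_xy = correlation(x, y)

print(cov_xy, r_xy)
```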