BUSN 203: Ch. 1-3 Vocab Flashcards

(106 cards)

1
Q

Analytics

A

The scientific process of transforming data into insight for making better decisions

2
Q

What is Big Data?

A

A set of data that cannot be managed, processed, or analyzed with commonly available software in a reasonable amount of time.

3
Q

What are the three characteristics of Big Data?

A

Great volume, high velocity, and wide variety.

4
Q

What does ‘great volume’ refer to in Big Data?

A

A large amount of data.

5
Q

What does ‘high velocity’ refer to in Big Data?

A

Fast collection and processing of data.

6
Q

What does ‘wide variety’ refer to in Big Data?

A

Data that could include nontraditional formats such as video, audio, and text.

7
Q

What is categorical data?

A

Labels or names used to identify an attribute of each element.

8
Q

What types of scales does categorical data use?

A

Nominal or ordinal scale of measurement.

9
Q

Can categorical data be numeric?

A

Yes, categorical data may be nonnumeric or numeric.

10
Q

Census

A

A survey to collect data on the entire population

11
Q

Cross-sectional Data

A

Data collected at the same or approximately the same point in time

12
Q

Data

A

The facts and figures collected, analyzed, and summarized for presentation and interpretation

13
Q

What is data mining?

A

The process of using procedures from statistics and computer science to extract useful information from extremely large databases.

14
Q

Which fields contribute procedures to data mining?

A

Statistics and computer science.

15
Q

Data Set

A

All the data collected in a particular study

16
Q

Descriptive Analysis

A

The set of analytical techniques that describe what has happened in the past

17
Q

Descriptive Statistics

A

Tabular, graphical, and numerical summaries of data

18
Q

Elements

A

The entities on which data are collected

19
Q

What is an interval scale?

A

A scale of measurement for a variable whose data demonstrate the properties of ordinal data and for which the interval between values is expressed in terms of a fixed unit of measure.

20
Q

What is a key characteristic of interval scales?

A

The interval between values is expressed in terms of a fixed unit of measure.

21
Q

Are interval scales always numeric?

A

Yes, interval scales are always numeric.

22
Q

What is a nominal scale?

A

A scale of measurement for a variable when the data are labels or names used to identify an attribute of an element.

23
Q

Can nominal data be numeric?

A

Yes, nominal data may be nonnumeric or numeric.

24
Q

Observation

A

The set of measurements obtained for a particular element

25
What is an ordinal scale?
A scale of measurement for a variable where the order or rank of the data is meaningful.
26
What properties do ordinal data exhibit?
Ordinal data exhibit the properties of nominal data and have a meaningful order or rank.
27
Can ordinal data be numeric?
Yes, ordinal data may be nonnumeric or numeric.
28
Population
The set of all elements of interest in a particular study
29
Predictive Analysis
The set of analytical techniques that use models constructed from past data to predict the future or assess the impact of one variable on another
30
Prescriptive Analysis
The set of analytical techniques that yield a best course of action
31
What is quantitative data?
Numeric values that indicate how much or how many of something
32
What scales of measurement are used to obtain quantitative data?
Interval or ratio scale of measurement
33
Quantitative Variable
A variable with quantitative data
34
Ratio Scale
A scale of measurement for a variable if the data demonstrate all the properties of interval data and the ratio of two values is meaningful
35
Are ratio data always numeric?
Yes, ratio data are always numeric.
36
Sample
A subset of the population
37
Sample Survey
A survey to collect data on a sample
38
What is statistical inference?
The process of using data obtained from a sample to make estimates or test hypotheses about the characteristics of a population.
39
What is the purpose of statistical inference?
To make estimates or test hypotheses about the characteristics of a population.
40
Statistics
The art and science of collecting, analyzing, presenting, and interpreting data
41
Time Series Data
Data collected over several time periods
42
Variable
A characteristic of interest for the elements
43
Bar Chart
A graphical device for depicting categorical data that have been summarized in a frequency, relative frequency, or percent frequency distribution
44
Class Midpoint
The value halfway between the lower and upper class limits
45
Crosstabulation
A tabular summary of data for two variables.
46
In crosstabulation, how are the classes for one variable represented?
By the rows.
47
In crosstabulation, how are the classes for the other variable represented?
By the columns.
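A minimal sketch of building a crosstabulation by hand, with one variable's classes on the rows and the other's on the columns. The restaurant records, class labels, and variable names are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical records: (quality rating, meal price band) for six restaurants.
records = [
    ("Good", "$10-19"),
    ("Very Good", "$20-29"),
    ("Good", "$10-19"),
    ("Excellent", "$30-39"),
    ("Very Good", "$10-19"),
    ("Excellent", "$20-29"),
]

# Count how many records fall into each (row class, column class) cell.
cell_counts = Counter(records)

row_classes = ["Good", "Very Good", "Excellent"]  # first variable: rows
col_classes = ["$10-19", "$20-29", "$30-39"]      # second variable: columns

# Build the table as a list of rows; a Counter returns 0 for empty cells.
crosstab = [[cell_counts[(r, c)] for c in col_classes] for r in row_classes]

for label, row in zip(row_classes, crosstab):
    print(f"{label:<10}", row)
```

Each inner list is one row of the table, so the row totals and column totals can be obtained by summing across or down the lists.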
48
Cumulative Frequency Distribution
A tabular summary of quantitative data showing the number of data values that are less than or equal to the upper class limit of each class
49
Cumulative Percent Frequency Distribution
A tabular summary of quantitative data showing the percentage of data values that are less than or equal to the upper class limit of each class
50
Cumulative Relative Frequency Distribution
A tabular summary of quantitative data showing the fraction or proportion of data values that are less than or equal to the upper class limit of each class
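The three cumulative distributions above differ only in the unit reported. A short sketch, assuming a hypothetical frequency distribution with five classes (the class limits and frequencies are invented for illustration):

```python
from itertools import accumulate

# Hypothetical frequency distribution: one frequency per class,
# listed in order of each class's upper class limit.
upper_limits = [14, 19, 24, 29, 34]
frequencies = [4, 8, 5, 2, 1]

n = sum(frequencies)  # total number of data values

# Running total of frequencies: values <= each upper class limit.
cumulative_freq = list(accumulate(frequencies))

# Same information expressed as a proportion and as a percentage.
cumulative_rel = [cf / n for cf in cumulative_freq]
cumulative_pct = [100 * cf / n for cf in cumulative_freq]

for limit, cf, cr, cp in zip(upper_limits, cumulative_freq,
                             cumulative_rel, cumulative_pct):
    print(f"<= {limit}: {cf:>2}  {cr:.2f}  {cp:.0f}%")
```

The last entry of each cumulative list always equals the total: n, 1.00, and 100% respectively.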
51
Data dashboard
A set of visual displays that organizes and presents information that is used to monitor the performance of a company or organization in a manner that is easy to read, understand, and interpret
52
Data Visualization
A term used to describe the use of graphical displays to summarize and present information about a data set
53
Dot Plot
A graphical device that summarizes data by the number of dots above each data value on the horizontal axis
54
Frequency Distribution
A tabular summary of data showing the number (frequency) of observations in each of several non-overlapping categories or classes
55
What is a Histogram?
A graphical display of a frequency distribution
56
What type of data is represented in a histogram?
Quantitative data
57
What is placed on the horizontal axis of a histogram?
Class intervals
58
What is placed on the vertical axis of a histogram?
Frequencies, relative frequencies, or percent frequencies
59
Percent Frequency Distribution
A tabular summary of data showing the percentage of observations in each of several non-overlapping classes
60
Pie Chart
A graphical device for presenting data summaries based on subdivision of a circle into sectors that correspond to the relative frequency for each class
61
Quantitative Data
Numerical values that indicate how much or how many
62
Relative Frequency Distribution
A tabular summary of data showing the fraction or proportion of observations in each of several non-overlapping categories or classes
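A frequency, relative frequency, and percent frequency distribution can all be derived from the same counts. The sketch below assumes a small hypothetical sample of soft drink purchases; the data and names are illustrative only.

```python
from collections import Counter

# Hypothetical sample of soft drink purchases (categorical data).
purchases = ["Coke", "Pepsi", "Coke", "Sprite", "Coke",
             "Pepsi", "Coke", "Sprite", "Pepsi", "Coke"]

n = len(purchases)

# Frequency: number of observations in each non-overlapping class.
frequency = Counter(purchases)

# Relative frequency: fraction of observations in each class.
relative = {label: count / n for label, count in frequency.items()}

# Percent frequency: relative frequency expressed as a percentage.
percent = {label: 100 * count / n for label, count in frequency.items()}

for label in frequency:
    print(f"{label:<7} {frequency[label]:>2}  {relative[label]:.2f}  {percent[label]:.0f}%")
```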
63
Scatter Diagram
A graphical display of the relationship between two quantitative variables. One variable is shown on the horizontal axis and the other variable is shown on the vertical axis
64
Side-by-side Bar Chart
A graphical display for depicting multiple bar charts on the same display
65
Simpson's Paradox
Conclusions drawn from two or more separate crosstabulations that can be reversed when the data are aggregated into a single crosstabulation
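A small numeric illustration of the reversal may help. The counts below are illustrative, and the group labels ("small", "large") and option labels ("A", "B") are hypothetical. Option A has the higher success rate in each separate table, yet option B has the higher rate once the tables are aggregated.

```python
# Illustrative counts: (successes, total) for options A and B in two groups.
groups = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, total):
    """Success rate as a proportion."""
    return successes / total

# Separate crosstabulations: compare A and B within each group.
for name, table in groups.items():
    print(name, {opt: round(rate(*cell), 3) for opt, cell in table.items()})

# Aggregate the two tables into a single crosstabulation.
aggregated = {
    opt: (
        sum(groups[g][opt][0] for g in groups),
        sum(groups[g][opt][1] for g in groups),
    )
    for opt in ("A", "B")
}
print("all", {opt: round(rate(*cell), 3) for opt, cell in aggregated.items()})
```

The reversal arises because the two options are applied to the groups in very different proportions, so the aggregated totals are dominated by different groups.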
66
Stacked Bar Chart
A bar chart in which each bar is broken into rectangular segments of a different color showing the relative frequency of each class in a manner similar to a pie chart
67
Stem-and-leaf Display
A graphical display used to show simultaneously the rank order and shape of a distribution of data
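One way to sketch a stem-and-leaf display in code, assuming hypothetical two-digit test scores where the tens digit is the stem and the units digit is the leaf:

```python
from collections import defaultdict

# Hypothetical two-digit test scores.
scores = [68, 72, 75, 81, 69, 73, 85, 91, 76, 82]

# Stem = tens digit, leaf = units digit.
# Sorting first puts the leaves on each stem in rank order.
stems = defaultdict(list)
for value in sorted(scores):
    stem, leaf = divmod(value, 10)
    stems[stem].append(leaf)

# One printed row per stem; row length shows the shape of the distribution.
for stem in sorted(stems):
    print(f"{stem} | {' '.join(str(leaf) for leaf in stems[stem])}")
```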
68
Trendline
A line that provides an approximation of the relationship between two variables
69
Boxplot
A graphical summary of data based on a five-number summary
70
Chebyshev's Theorem
A theorem that can be used to make statements about the proportion of data values that must be within a specified number of standard deviations of the mean
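The card does not state the bound itself. The standard form of Chebyshev's theorem says that at least 1 − 1/z² of the data values lie within z standard deviations of the mean, for any z greater than 1. A minimal sketch (the function name is hypothetical):

```python
def chebyshev_lower_bound(z: float) -> float:
    """Minimum proportion of data values within z standard deviations of the mean."""
    if z <= 1:
        # For z <= 1 the bound is zero or negative and says nothing useful.
        raise ValueError("Chebyshev's theorem is informative only for z > 1")
    return 1 - 1 / z**2

for z in (2, 3, 4):
    print(f"z = {z}: at least {chebyshev_lower_bound(z):.1%} of the data")
```

Unlike the Empirical Rule, this bound holds for any data set regardless of the shape of the distribution.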
71
Coefficient of Variation
A measure of relative variability computed by dividing the standard deviation by the mean and multiplying by 100
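A quick worked example, assuming a hypothetical sample of five class sizes and using the sample standard deviation:

```python
import statistics

# Hypothetical sample: sizes of five classes.
data = [46, 54, 42, 46, 32]

mean = statistics.mean(data)    # sum of values / number of observations
stdev = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)

# Coefficient of variation: standard deviation as a percentage of the mean.
cv = stdev / mean * 100

print(f"mean = {mean}, s = {stdev:.2f}, CV = {cv:.1f}%")
```

Because it is expressed relative to the mean, the coefficient of variation can be used to compare variability across data sets measured in different units.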
72
What is the correlation coefficient?
A measure of linear association between two variables that takes on values between −1 and +1
73
What does a correlation coefficient value near +1 indicate?
A strong positive linear relationship.
74
What does a correlation coefficient value near −1 indicate?
A strong negative linear relationship.
75
What does a correlation coefficient value near zero indicate?
The lack of a linear relationship.
76
What is covariance?
A measure of linear association between two variables.
77
What do positive values of covariance indicate?
A positive relationship between two variables.
78
What do negative values of covariance indicate?
A negative relationship between two variables.
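Sample covariance and the correlation coefficient can be sketched together, since the correlation coefficient is the covariance divided by the product of the two standard deviations. The data (weekly commercials and sales) and function names below are hypothetical.

```python
import math

# Hypothetical paired observations: commercials aired (x) and sales (y).
x = [2, 5, 1, 3, 4, 1, 5, 3, 4, 2]
y = [50, 57, 41, 54, 54, 38, 63, 48, 59, 46]

def sample_covariance(a, b):
    """Sum of products of deviations about the means, divided by n - 1."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (n - 1)

def sample_correlation(a, b):
    """Covariance scaled by both standard deviations; always in [-1, +1]."""
    s_a = math.sqrt(sample_covariance(a, a))  # covariance of a with itself = variance
    s_b = math.sqrt(sample_covariance(b, b))
    return sample_covariance(a, b) / (s_a * s_b)

s_xy = sample_covariance(x, y)
r_xy = sample_correlation(x, y)
print(f"covariance = {s_xy:.2f}, correlation = {r_xy:.2f}")
```

The covariance depends on the units of x and y, while the correlation coefficient does not, which is why only the latter has a fixed range.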
79
What is the Empirical Rule?
A rule that can be used to compute the percentage of data values within one, two, and three standard deviations of the mean.
80
What type of distribution does the Empirical Rule apply to?
Bell-shaped distribution
81
Five-number Summary
A technique that uses five numbers to summarize the data: smallest value, first quartile, median, third quartile, and largest value
82
Geometric Mean
A measure of location that is calculated by finding the nth root of the product of n values
83
What is the formula for calculating the growth factor?
One plus the percentage increase (expressed as a decimal) over a period of time
84
What does a growth factor less than 1 indicate?
Negative growth
85
What does a growth factor greater than 1 indicate?
Positive growth
86
Can the growth factor be less than 0?
No, the growth factor cannot be less than 0
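Growth factors and the geometric mean are usually used together to find an average growth rate. A minimal sketch, assuming three hypothetical annual returns:

```python
import math

# Hypothetical annual returns as decimals: +10%, -5%, +20%.
annual_returns = [0.10, -0.05, 0.20]

# Growth factor = 1 + percentage change (as a decimal).
# A factor below 1 means negative growth; above 1 means positive growth.
growth_factors = [1 + r for r in annual_returns]

# Geometric mean = nth root of the product of the n growth factors.
n = len(growth_factors)
geometric_mean = math.prod(growth_factors) ** (1 / n)

# Average growth rate per period implied by the geometric mean.
average_growth_rate = geometric_mean - 1

print(f"geometric mean = {geometric_mean:.4f}")
print(f"average growth per period = {average_growth_rate:.2%}")
```

Compounding at the geometric mean for n periods reproduces the same ending value as the actual sequence of growth factors, which the arithmetic mean of the returns does not.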
87
Interquartile Range (IQR)
A measure of variability, defined to be the difference between the third and first quartiles
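A sketch of the five-number summary and IQR using Python's standard library. The data are hypothetical. Quartile conventions differ between textbooks and software; this example assumes the `exclusive` method of `statistics.quantiles`, which may not match the convention used in a given course.

```python
import statistics

# Hypothetical sample of 11 values.
data = [12, 15, 17, 20, 22, 25, 28, 30, 33, 35, 40]

# n=4 splits the data into quartiles and returns the three cut points.
q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")

# Five-number summary: smallest, Q1, median, Q3, largest.
five_number = (min(data), q1, median, q3, max(data))

# Interquartile range: spread of the middle 50% of the data.
iqr = q3 - q1

print("five-number summary:", five_number)
print("IQR:", iqr)
```

These five values are exactly what a boxplot draws, with the box spanning Q1 to Q3.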
88
Mean
A measure of central location computed by summing the data values and dividing by the number of observations
89
Median
A measure of central location provided by the value in the middle when the data are arranged in ascending order
90
Mode
A measure of location, defined as the value that occurs with greatest frequency
91
Outlier
An unusually small or unusually large data value
92
pth Percentile
A value that divides the data into two parts such that approximately p% of the observations are less than the pth percentile and approximately (100 − p)% of the observations are greater than the pth percentile
93
Point Estimator
A sample statistic, such as x̄, s², and s, used to estimate the corresponding population parameter
94
Population Parameter
A numerical value used as a summary measure for a population (e.g., the population mean, μ, the population variance, σ², and the population standard deviation, σ)
95
What are the 25th, 50th, and 75th percentiles called?
The first quartile, the second quartile (median), and the third quartile.
96
What do quartiles do to a data set?
They divide the data set into four parts, with each part containing approximately 25% of the data.
97
Range
A measure of variability, defined to be the largest value minus the smallest value
98
Sample Statistic
A numerical value used as a summary measure for a sample (e.g., the sample mean, x̄, the sample variance, s², and the sample standard deviation, s)
99
What is skewness?
A measure of the shape of a data distribution.
100
What type of skewness results from data skewed to the left?
Negative skewness.
101
What type of skewness results from a symmetric data distribution?
Zero skewness.
102
What type of skewness results from data skewed to the right?
Positive skewness.
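The sign of skewness can be checked numerically. The sketch below assumes one common sample skewness formula, n / ((n − 1)(n − 2)) times the sum of cubed standardized deviations; other definitions exist, and the data sets and function name are hypothetical.

```python
import statistics

def sample_skewness(data):
    """Sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(data)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)

left_skewed = [1, 17, 18, 19, 20]   # long tail toward small values
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 3, 4, 20]     # long tail toward large values

for name, values in [("left", left_skewed), ("symmetric", symmetric),
                     ("right", right_skewed)]:
    print(f"{name:<9} skewness = {sample_skewness(values):+.3f}")
```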
103
Standard Deviation
A measure of variability computed by taking the positive square root of the variance
104
Variance
A measure of variability based on the squared deviations of the data values about the mean
105
Weighted Mean
The mean obtained by assigning each observation a weight that reflects its importance
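A minimal sketch of a weighted mean, assuming hypothetical course grade points weighted by credit hours:

```python
# Hypothetical grade points earned in three courses.
grade_points = [4, 3, 2]

# Weights: credit hours for each course, reflecting its importance.
credit_hours = [3, 4, 1]

# Weighted mean = sum(weight * value) / sum(weights).
weighted_mean = (
    sum(w * x for w, x in zip(credit_hours, grade_points)) / sum(credit_hours)
)

print(f"weighted mean = {weighted_mean}")
```

When all weights are equal, the weighted mean reduces to the ordinary mean.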
106
Z-Score
A value computed by dividing the deviation about the mean (xᵢ − x̄) by the standard deviation s. A z-score is referred to as a standardized value and denotes the number of standard deviations xᵢ is from the mean
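A short worked example of z-scores, assuming a hypothetical sample of five class sizes and the sample standard deviation:

```python
import statistics

# Hypothetical sample: sizes of five classes.
data = [46, 54, 42, 46, 32]

mean = statistics.mean(data)
s = statistics.stdev(data)  # sample standard deviation

# z-score: number of standard deviations each value lies from the mean.
z_scores = [(x - mean) / s for x in data]

for x, z in zip(data, z_scores):
    print(f"x = {x}: z = {z:+.2f}")
```

A negative z-score means the value is below the mean and a positive one means it is above; values with large absolute z-scores are candidates for outliers.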