BUSINESS ANALYTICS Flashcards

1
Q

Data

A

facts and figures from which conclusions can be drawn

2
Q

Data set

A

the data that are collected for a particular study

3
Q

Elements

A

people, objects, events, or other entities

4
Q

Variable

A

any characteristic of an element

5
Q

Measurement

A

a way to assign a value of a variable to the element

6
Q

Quantitative variable

A

the possible measurements of the values of a variable are numbers that represent quantities; numeric; mathematical operations are meaningful

7
Q

Qualitative variable

A

the possible measurements fall into several categories; categorical; labels or names used to identify an attribute of each element

8
Q

Cross-sectional data

A

data collected at the same or approx. the same point in time

9
Q

Time-series data

A

data collected over different time periods

10
Q

Existing sources

A

data already gathered by public or private sources; EX: internet, library, US gov’t, data collection agency

11
Q

Experimental and observational studies

A

data we collect ourselves for a specific purpose

12
Q

Response variable

A

variable of interest (Y)

13
Q

Factor/independent variable

A

other variables related to response variable (X)

14
Q

Transactional data

A

data recorded from customer transactions, such as purchases; companies hope to use this past behavior and other information to predict customer responses

15
Q

Data warehousing

A

a process of centralized data management and retrieval; its objective is the creation and maintenance of a central repository for all of an organization’s data

16
Q

Big data

A

massive amounts of data; often collected in real time in different forms; sometimes needing quick analysis

17
Q

Census

A

an examination of the entire population of measurements

18
Q

Population

A

a set of all elements about which we wish to draw conclusions; size = N

19
Q

Sample

A

a subset of the elements of a population; comes from the population; size = n

20
Q

Descriptive statistics

A

the science of describing the important aspects of a set of measurements

21
Q

Statistical inference

A

the science of using a sample of measurements to make generalizations about the important aspects of a population of measurements

22
Q

Sample size

A

number of elements (n)

23
Q

Random sample

A

a sample selected so that every element in the population has the same chance of being selected

24
Q

Random selection

A

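sampling with replacement (a selected element is returned to the population and can be chosen again); sampling without replacement (a selected element cannot be chosen again)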
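
The difference between the two kinds of random selection can be sketched in Python. This is only an illustration: the five-element population, the seed, and the sample size of 3 are assumptions, not part of the card.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = ["A", "B", "C", "D", "E"]  # hypothetical population

# With replacement: the same element may be drawn more than once.
with_replacement = rng.choices(population, k=3)

# Without replacement: every drawn element is distinct.
without_replacement = rng.sample(population, k=3)

print(with_replacement, without_replacement)
```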

25
Finite population
a population of limited size
26
Infinite population
a population of unlimited size
27
Non-probability sampling
convenience sampling, voluntary sampling, judgment sampling
28
Probability sampling
sampling where we know the chance that each element in the population will be included in the sample; required for statistical inference; EX: cluster, systematic, and stratified sampling
29
Convenience sampling
sampling where we select elements because they are convenient to sample; not a probability sample
30
Voluntary response sampling
samples in which participants self-select; frequently used by radio and television; overrepresents people with strong opinions
31
Judgment sampling
samples in which a person who is extremely knowledgeable about the population selects the population elements he or she feels are most representative; the quality of the sample depends entirely on the researcher's knowledge
32
Business analytics
the use of traditional and newly developed statistical methods, advances in information systems (IS), and techniques from management science to explore and investigate past performance; includes descriptive, predictive, and prescriptive analytics
33
Descriptive analytics
the use of traditional and newer graphics to represent easy-to-understand visual summaries of up-to-the-minute data
34
Predictive analytics
methods used to find anomalies, patterns, and associations in data sets to predict future outcomes
35
Prescriptive analytics
looks at variables and constraints, along with predictions from predictive analytics, to recommend courses of action
36
Data mining
the use of predictive analytics, algorithms, and IS techniques to extract useful knowledge from huge amounts of data
37
Nominative
a qualitative variable for which there is no meaningful order, or ranking, of categories; EX: gender, car color
38
Ordinal
a qualitative variable for which there is a meaningful order, or ranking, of the categories; EX: teaching effectiveness
39
Interval
all the characteristics of ordinal plus measurements are on a numerical scale with an arbitrary zero point; can only meaningfully compare values by the interval between them; EX: temperature
40
Ratio
all the characteristics of interval plus measurements are on a numerical scale with a meaningful zero point; values can be compared by their intervals and ratios; in business and finance most quantitative variables are ratio variables, such as anything related to money; EX: earnings, profit, loss, age, distance, height
41
Sampling designs
methods for obtaining a sample
42
Sample survey
the sample we take
43
Stratified random sampling
divide the population into non-overlapping groups called strata, then select a random sample of a certain number of elements from each stratum
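
A minimal Python sketch of stratified random sampling, for illustration only. The function name `stratified_sample`, the employee/department population, and the choice of 4 elements per stratum are assumptions made here, not part of the card.

```python
import random
from collections import defaultdict

def stratified_sample(elements, stratum_of, per_stratum, seed=0):
    """Draw `per_stratum` elements at random, without replacement, from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in elements:
        strata[stratum_of(element)].append(element)  # group elements by stratum
    return {
        name: rng.sample(members, min(per_stratum, len(members)))
        for name, members in strata.items()
    }

# Hypothetical population: (employee id, department) pairs.
population = [(i, "sales" if i % 3 == 0 else "ops") for i in range(30)]
picked = stratified_sample(population, stratum_of=lambda e: e[1], per_stratum=4)
print(picked)
```

Every stratum contributes to the sample, which is what distinguishes this design from cluster sampling, where only some groups are selected.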
44
Multistage cluster sampling
divide the population into clusters (groups), then randomly select some of the clusters to sample
45
Systematic sampling
list the population, randomly select a starting point, then take every nth element from the listing
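
Systematic sampling can be sketched in a few lines of Python. The function name `systematic_sample` and the 100-element listing are assumptions for illustration; the sampling interval is called `step` here to avoid confusion with the sample size n.

```python
import random

def systematic_sample(listing, step, seed=0):
    """Pick a random start within the first `step` positions, then take every `step`-th element."""
    if step <= 0:
        raise ValueError("step must be positive")
    rng = random.Random(seed)
    start = rng.randrange(step)  # random starting index in 0..step-1
    return listing[start::step]

population = list(range(1, 101))  # hypothetical listing of 100 elements
sample = systematic_sample(population, step=10)
print(sample)
```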
46
Dichotomous questions
questions with only two possible answers (EX: yes/no); clearly stated; easy to answer; easy to analyze; limited information
47
Types of surveys
phone surveys, mail surveys, web surveys, personal interviews
48
Phone surveys
inexpensive, low response rate
49
Mail surveys
inexpensive, low response rates (20-30%), requires multiple mailings
50
Web surveys
cheaper still, same problem as mail surveys (low response rates and requires multiple surveys)
51
Personal interviews
more expensive, more control, higher response rates
52
Frequency distribution
a table that summarizes the number of items in each of several nonoverlapping classes
53
Relative frequency
summarizes the proportion of items in each class; for each class, divide the frequency of the class by the total number of observations
54
Formula for relative frequency
frequency of each class / data size (total); multiply by 100 for percent frequency
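
The relative frequency formula can be checked with a short Python sketch. The helper name `relative_frequencies` and the list of car colors are made up for illustration.

```python
from collections import Counter

def relative_frequencies(observations):
    """Relative frequency of each class = class frequency / total number of observations."""
    counts = Counter(observations)
    total = len(observations)
    return {category: count / total for category, count in counts.items()}

colors = ["red", "blue", "red", "green", "red", "blue", "red", "blue"]  # hypothetical data
rel = relative_frequencies(colors)
percent = {category: 100 * value for category, value in rel.items()}  # percent frequency
print(rel)
print(percent)
```

The relative frequencies always sum to 1, and the percent frequencies to 100.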
55
Bar chart
a vertical or horizontal rectangle represents the frequency for each category; height can be frequency, relative frequency, or percent frequency
56
Pie chart
a circle divided into slices where the size of each slice represents its relative frequency or percent frequency; angle of each slice = relative frequency x 360 degrees
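
As a quick check of the slice-angle rule, the sketch below converts class frequencies to angles. The function name `slice_angles` and the three frequencies are illustrative assumptions.

```python
def slice_angles(frequencies):
    """Angle of each pie slice = relative frequency x 360 degrees."""
    total = sum(frequencies.values())
    return {category: 360 * count / total for category, count in frequencies.items()}

angles = slice_angles({"A": 10, "B": 30, "C": 60})  # hypothetical class frequencies
print(angles)
```

The angles of all slices add up to 360 degrees.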
57
How to construct a frequency distribution
1. find the number of classes 2. find the class length 3. form nonoverlapping classes of equal width 4. tally and count 5. graph the histogram
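
Steps 1 to 4 can be sketched in Python. The card does not say how to pick the number of classes or the class length, so this sketch assumes the common 2^K rule (the smallest K with 2^K >= number of measurements) and a class length of range / K rounded up to a whole number. The function name and the data are also hypothetical.

```python
import math

def frequency_distribution(data):
    """Return (lower, upper, count) for equal-width, nonoverlapping classes."""
    n = len(data)
    # Step 1 (assumed rule): smallest K such that 2**K >= n.
    num_classes = 1
    while 2 ** num_classes < n:
        num_classes += 1
    # Step 2 (assumed rule): class length = range / K, rounded up; at least 1.
    low, high = min(data), max(data)
    length = math.ceil((high - low) / num_classes) or 1
    # Steps 3 and 4: each class holds lower <= x < upper; a value equal to
    # the final upper boundary is counted in the last class.
    counts = [0] * num_classes
    for x in data:
        index = min(int((x - low) // length), num_classes - 1)
        counts[index] += 1
    return [(low + i * length, low + (i + 1) * length, counts[i])
            for i in range(num_classes)]

scores = [10, 12, 15, 18, 21, 22, 25, 27, 30, 34, 35, 39, 41, 44, 48, 50]
classes = frequency_distribution(scores)
print(classes)
```

The class counts are the bar heights a histogram (step 5) would use.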
58
Cumulative distribution
another way to summarize a distribution; use the same number of classes, class lengths, and class boundaries used for the frequency distribution; rather than the count in each class, we record the number of measurements that are less than the upper boundary of that class
59
Ogive
a graph of a cumulative distribution; plot a point above each upper class boundary at a height of the cumulative frequency; connect the points with line segments; can also be drawn using cumulative relative or percent distributions
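
A small Python sketch of a cumulative distribution and the points an ogive would plot. The class boundaries and frequencies are invented for illustration.

```python
from itertools import accumulate

upper_boundaries = [20, 30, 40, 50]  # hypothetical upper class boundaries
frequencies = [4, 7, 6, 3]           # hypothetical class frequencies

# Cumulative frequency: number of measurements below each upper boundary.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency: cumulative count / total count.
cumulative_relative = [c / cumulative[-1] for c in cumulative]

# Ogive points: (upper class boundary, cumulative frequency).
ogive_points = list(zip(upper_boundaries, cumulative))
print(ogive_points)
```

The last cumulative frequency equals the total number of measurements, so the last cumulative relative frequency is 1.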
60
Stem-and-leaf display
the purpose is to see the overall pattern of the data, by grouping the data into classes; best for small to moderately sized data distributions
61
How to construct a stem-and-leaf display
1. decide what units will be used 2. each leaf must be a single digit and stem values will consist of appropriate leading digits 3. place the stem values 4. enter the leaf values (each leaf should be single digit) 5. rearrange the leaves in increasing order 6. can split the stems as needed
62
Leaf units
in general, leaf units can be any power of 10; EX: 0.1, 1, 10, 100, 1000...; if no leaf unit is given for a stem-and-leaf display, we assume its value is 1.0
63
Original data value formula
(stem digits followed by the leaf digit) x leaf unit; EX: stem 2, leaf 3, leaf unit 10 gives 23 x 10 = 230
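
The stem-and-leaf cards above can be tied together with a short Python sketch. It assumes non-negative values that are whole multiples of the leaf unit. The function names and the data are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_unit=1):
    """Map each stem to its sorted single-digit leaves (non-negative values assumed)."""
    rows = defaultdict(list)
    for value in values:
        scaled = round(value / leaf_unit)  # express the value in leaf units
        stem, leaf = divmod(scaled, 10)    # last digit is the leaf, the rest is the stem
        rows[stem].append(leaf)
    return {stem: sorted(rows[stem]) for stem in sorted(rows)}

def original_value(stem, leaf, leaf_unit=1):
    """Recover a data value: (stem digits followed by the leaf digit) x leaf unit."""
    return (stem * 10 + leaf) * leaf_unit

display = stem_and_leaf([23, 25, 31, 38, 38, 42, 47, 29])
for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```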
64
Contingency table
classifies data on two dimensions; rows classify according to one dimension; columns classify according to a second dimension
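
A contingency table can be built by counting pairs of categories. In this sketch the first item of each pair is the row dimension and the second is the column dimension. The function name and the region/answer data are assumptions for illustration.

```python
from collections import Counter

def contingency_table(pairs):
    """Count observations for every (row category, column category) combination."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

# Hypothetical survey responses: (region, answer).
responses = [("north", "yes"), ("north", "no"), ("south", "yes"),
             ("south", "yes"), ("north", "yes")]
table = contingency_table(responses)
print(table)
```

Combinations that never occur appear with a count of 0, so every row has the same columns.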