BUSINESS ANALYTICS Flashcards

1
Q

Data

A

facts and figures from which conclusions can be drawn

2
Q

Data set

A

the data that are collected for a particular study

3
Q

Elements

A

people, objects, events, or other entities

4
Q

Variable

A

any characteristic of an element

5
Q

Measurement

A

a way to assign a value of a variable to the element

6
Q

Quantitative variable

A

the possible measurements of the values of a variable are numbers that represent quantities; numeric; mathematical operations are meaningful

7
Q

Qualitative variable

A

the possible measurements fall into several categories; categorical; labels or names used to identify an attribute of each element

8
Q

Cross-sectional data

A

data collected at the same or approx. the same point in time

9
Q

Time-series data

A

data collected over different time periods

10
Q

Existing sources

A

data already gathered by public or private sources; EX: internet, library, US gov’t, data collection agency

11
Q

Experimental and observational studies

A

data we collect ourselves for a specific purpose

12
Q

Response variable

A

variable of interest (Y)

13
Q

Factor/independent variable

A

other variables related to response variable (X)

14
Q

Transactional data

A

data recorded from business transactions, such as customer purchases; companies hope to use this past behavior and other information to predict customer responses

15
Q

Data warehousing

A

a process of centralized data management and retrieval; its objective is the creation and maintenance of a central repository for all of an organization’s data

16
Q

Big data

A

massive amounts of data; often collected in real time in different forms; sometimes needing quick analysis

17
Q

Census

A

an examination of the entire population of measurements

18
Q

Population

A

a set of all elements about which we wish to draw conclusions; size = N

19
Q

Sample

A

a subset of the elements of a population; comes from the population; size = n

20
Q

Descriptive statistics

A

the science of describing the important aspects of a set of measurements

21
Q

Statistical inference

A

the science of using a sample of measurements to make generalizations about the important aspects of a population of measurements

22
Q

Sample size

A

number of elements (n)

23
Q

Random sample

A

a sample in which every element of the population has the same chance of being selected

24
Q

Random selection

A

sample with replacement; sample without replacement
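
The two selection modes can be sketched in Python with the standard `random` module. This is only an illustration; the function names and the `seed` parameter are assumptions made for the example, not part of the card.

```python
import random

def sample_with_replacement(population, n, seed=None):
    """Draw n elements; each drawn element is put back, so repeats are possible."""
    rng = random.Random(seed)  # seed is a hypothetical convenience for repeatable draws
    return rng.choices(population, k=n)

def sample_without_replacement(population, n, seed=None):
    """Draw n elements; a drawn element is not put back, so n cannot exceed the population size."""
    rng = random.Random(seed)
    return rng.sample(population, n)
```

With replacement, the sample may be larger than the population; without replacement, every position in the population list appears at most once.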

25
Q

Finite population

A

a population of limited size

26
Q

Infinite population

A

a population of unlimited size

27
Q

Non-probability sampling

A

convenience sampling, voluntary sampling, judgment sampling

28
Q

Probability sampling

A

sampling where we know the chance that each element in the population will be included in the sample; required for statistical inference; EX: cluster, systematic, stratified

29
Q

Convenience sampling

A

sampling where we select elements because they are convenient to sample; not a probability sample

30
Q

Voluntary response sampling

A

samples in which participants self-select; frequently used by radio and television; overrepresents people with strong opinions

31
Q

Judgment sampling

A

samples in which a person who is extremely knowledgeable about the population selects the population elements he or she feels are most representative; the quality of the sample depends completely on the researcher's knowledge

32
Q

Business analytics

A

the use of traditional and newly developed statistical methods, advances in IS, and techniques from management science to explore and investigate past performance; descriptive, predictive, and prescriptive analytics

33
Q

Descriptive analytics

A

the use of traditional and newer graphics to present easy-to-understand visual summaries of up-to-the-minute data

34
Q

Predictive analytics

A

methods used to find anomalies, patterns, and associations in data sets to predict future outcomes

35
Q

Prescriptive analytics

A

looks at variables and constraints, along with predictions from predictive analytics, to recommend courses of action

36
Q

Data mining

A

the use of predictive analytics, algorithms, and IS techniques to extract useful knowledge from huge amounts of data

37
Q

Nominative

A

a qualitative variable for which there is no meaningful order, or ranking, of categories; EX: gender, car color

38
Q

Ordinal

A

a qualitative variable for which there is a meaningful order, or ranking, of the categories; EX: teaching effectiveness

39
Q

Interval

A

all the characteristics of ordinal plus measurements are on a numerical scale with an arbitrary zero point; can only meaningfully compare values by the interval between them; EX: temperature

40
Q

Ratio

A

all the characteristics of interval plus measurements are on a numerical scale with a meaningful zero point; values can be compared by their intervals and ratios; in business and finance most quantitative variables are ratio variables, such as anything related to money; EX: earnings, profit, loss, age, distance, height

41
Q

Sampling designs

A

methods for obtaining a sample

42
Q

Sample survey

A

the sample we take

43
Q

Stratified random sampling

A

divide the population into non-overlapping groups (strata), then select a random sample from each stratum; a certain number of elements is taken from every stratum
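
A minimal Python sketch of this idea, assuming each element can be mapped to its stratum by a caller-supplied function. The names `stratified_sample`, `stratum_of`, and `per_stratum` are hypothetical, and taking the same number of elements from each stratum is just one possible allocation.

```python
import random
from collections import defaultdict

def stratified_sample(elements, stratum_of, per_stratum, seed=None):
    """Group elements into strata, then randomly sample within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in elements:
        strata[stratum_of(element)].append(element)  # strata do not overlap
    sample = []
    for members in strata.values():
        k = min(per_stratum, len(members))  # a small stratum contributes all it has
        sample.extend(rng.sample(members, k))
    return sample
```

For example, `stratified_sample(people, lambda p: p[1], 2)` would take two people from each region if the second field of each record holds the region.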

44
Q

Multistage cluster sampling

A

divide the population into clusters (groups), then randomly select some of the clusters to sample; in a multistage design, the selection is repeated within the chosen clusters

45
Q

Systematic sampling

A

list the population, select a random starting point, and sample every n^th element; that is, we randomly select a starting point and take every n^th piece of data from a listing of the population
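
As a rough Python sketch, assuming the population is already available as an ordered list. The name `systematic_sample` and its `step` parameter (the "n" in "every n^th element") are hypothetical.

```python
import random

def systematic_sample(listing, step, seed=None):
    """Pick a random start among the first `step` positions, then take every step-th element."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random starting point in 0 .. step-1
    return listing[start::step]
```

With a listing of 100 elements and `step=10`, the sample contains 10 elements spaced exactly 10 positions apart.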

46
Q

Dichotomous questions

A

questions with only two possible answers (EX: yes/no); clearly stated; easy to answer; easy to analyze; limited information

47
Q

Types of surveys

A

phone surveys, mail surveys, web surveys, personal interviews

48
Q

Phone surveys

A

inexpensive, low response rate

49
Q

Mail surveys

A

inexpensive, low response rates (20-30%), requires multiple mailings

50
Q

Web surveys

A

cheaper still, same problem as mail surveys (low response rates and requires multiple surveys)

51
Q

Personal interviews

A

more expensive, more control, higher response rates

52
Q

Frequency distribution

A

a table that summarizes the number of items in each of several nonoverlapping classes

53
Q

Relative frequency

A

summarizes the proportion of items in each class; for each class, divide the frequency of the class by the total number of observations

54
Q

Formula for relative frequency

A

frequency of each class / data size (total); multiply by 100 for percent frequency
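
The formula can be sketched in Python as follows. The function names are made up for the example, and the classes are assumed to be the distinct category labels in the data.

```python
from collections import Counter

def relative_frequencies(observations):
    """Relative frequency = class frequency / total number of observations."""
    counts = Counter(observations)
    total = len(observations)
    return {cls: count / total for cls, count in counts.items()}

def percent_frequencies(observations):
    """Percent frequency = relative frequency x 100."""
    return {cls: 100 * rf for cls, rf in relative_frequencies(observations).items()}
```

The relative frequencies always sum to 1 (and the percent frequencies to 100), which is a quick check on the arithmetic.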

55
Q

Bar chart

A

a vertical or horizontal rectangle represents the frequency for each category; height can be frequency, relative frequency, or percent frequency

56
Q

Pie chart

A

a circle divided into slices where the size of each slice represents its relative frequency or percent frequency; degrees of each slice = relative frequency × 360
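
The slice-angle rule can be illustrated with a short Python sketch. The name `slice_degrees` is hypothetical, and the input is assumed to be a mapping from category to frequency.

```python
def slice_degrees(frequencies):
    """Angle of each pie slice = relative frequency of the category x 360 degrees."""
    total = sum(frequencies.values())
    return {category: freq / total * 360 for category, freq in frequencies.items()}
```

For frequencies `{"A": 1, "B": 3}`, category A has relative frequency 0.25 and so receives a quarter of the circle.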

57
Q

How to construct a frequency distribution

A
  1. find the number of classes
  2. find the class length
  3. form nonoverlapping classes of equal width
  4. tally and count
  5. graph the histogram
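
Steps 1 through 4 above can be sketched in Python as below (the histogram in step 5 is left out). The card does not say how to choose the number of classes, so the sketch assumes one common rule of thumb, the smallest K with 2^K >= n, and takes the class length as (largest - smallest) / K; both choices, and the function name, are assumptions.

```python
def frequency_distribution(data, num_classes=None):
    """Return (class boundaries, class counts) for equal-width, non-overlapping classes."""
    n = len(data)
    if num_classes is None:
        # Assumed rule of thumb: smallest K such that 2**K >= n.
        num_classes = 1
        while 2 ** num_classes < n:
            num_classes += 1
    low, high = min(data), max(data)
    if high == low:
        raise ValueError("all values are equal; classes would have zero length")
    length = (high - low) / num_classes  # class length
    counts = [0] * num_classes
    for x in data:
        # Tally: the largest value is placed in the last class.
        index = min(int((x - low) / length), num_classes - 1)
        counts[index] += 1
    boundaries = [low + i * length for i in range(num_classes + 1)]
    return boundaries, counts
```

In practice the class length is often rounded up to a convenient value; the sketch skips that refinement.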
58
Q

Cumulative distribution

A

another way to summarize a distribution; use the same number of classes, class lengths, and class boundaries used for frequency distribution; rather than count, we record the number of measurements that are LESS THAN the upper boundary of that class
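
Given the class counts of a frequency distribution, the cumulative counts are running totals. A small Python sketch (the function names are hypothetical):

```python
from itertools import accumulate

def cumulative_frequencies(class_counts):
    """Running total of class counts: measurements below each upper class boundary."""
    return list(accumulate(class_counts))

def cumulative_relative_frequencies(class_counts):
    """Cumulative counts divided by the total number of measurements."""
    total = sum(class_counts)
    return [c / total for c in cumulative_frequencies(class_counts)]
```

The last cumulative frequency always equals the total number of measurements, and the last cumulative relative frequency equals 1.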

59
Q

Ogive

A

a graph of a cumulative distribution; plot a point above each upper class boundary at a height of the cumulative frequency; connect the points with line segments; can also be drawn using cumulative relative or percent distributions

60
Q

Stem-and-leaf display

A

the purpose is to see the overall pattern of the data, by grouping the data into classes; best for small to moderately sized data distributions

61
Q

How to construct a stem-and-leaf display

A
  1. decide what units will be used
  2. each leaf must be a single digit and stem values will consist of appropriate leading digits
  3. place the stem values
  4. enter the leaf values (each leaf should be single digit)
  5. rearrange the leaves in increasing order
  6. can split the stems as needed
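
The steps above can be sketched in Python for non-negative data. The function name and the `leaf_unit` parameter are assumptions; values are rounded to the nearest leaf unit, stems with no leaves are omitted, and stem splitting (step 6) is not implemented.

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_unit=1):
    """Map each stem to its sorted single-digit leaves (non-negative values assumed)."""
    rows = defaultdict(list)
    for v in values:
        scaled = int(round(v / leaf_unit))  # express the value in leaf units
        stem, leaf = divmod(scaled, 10)     # last digit is the leaf, the rest is the stem
        rows[stem].append(leaf)
    # Stems in increasing order, leaves in increasing order within each stem.
    return {stem: sorted(rows[stem]) for stem in sorted(rows)}
```

For the values 12, 15, 21, 29, 29, 30 this gives stem 1 with leaves 2 and 5, stem 2 with leaves 1, 9, 9, and stem 3 with leaf 0.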
62
Q

Leaf units

A

in general, leaf units can be any power of 10; EX: 0.1, 1, 10, 100, 1000…; if no leaf unit is given for a stem-and-leaf display, we assume its value is 1.0

63
Q

Original data value formula

A

(stem digits followed by the leaf digit) × leaf unit; EX: stem 2, leaf 9, leaf unit 0.1 gives 29 × 0.1 = 2.9
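
As a one-line Python illustration of the formula, assuming a single-digit leaf and a non-negative stem (the function name is hypothetical):

```python
def original_value(stem, leaf, leaf_unit=1):
    """Join the stem and the single-digit leaf, then multiply by the leaf unit."""
    return (stem * 10 + leaf) * leaf_unit
```

With stem 2 and leaf 9, a leaf unit of 1 recovers 29 and a leaf unit of 10 recovers 290.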

64
Q

Contingency table

A

classifies data on two dimensions; rows classify according to one dimension; columns classify according to a second dimension
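
A small Python sketch of building such a table from (row category, column category) pairs. The function name and the returned layout (row labels, column labels, grid of counts) are assumptions made for the example.

```python
from collections import Counter

def contingency_table(pairs):
    """Count how many observations fall into each (row, column) combination."""
    pairs = list(pairs)
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})  # first dimension
    cols = sorted({c for _, c in pairs})  # second dimension
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table
```

Each cell holds the number of observations with that row category and that column category; the cells together sum to the number of observations.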