CHAPTER 1 Flashcards

1
Q

Data set

A

data collected to study information about elements

2
Q

variable

A

characteristic of an element

3
Q

measurement

A

assigning a value of a variable to the element

4
Q

Quantitative/numerical

A

measurements that answer "how much" or "how many"

5
Q

qualitative/categorical

A

record which of several categories an element falls into

6
Q

cross-sectional data

A

data collected at the same point in time (e.g., within a single month)

7
Q

time-series data

A

data collected over different time periods (e.g., 1999-2000)

8
Q

primary data

A
  • collected by individual/business
  • directly through planned experimentation/observation
9
Q

secondary data

A

from existing sources (public or private sector)

10
Q

Steps to start a study

A
  • define variable of interest/response variable
  • other variables (factors)
    + can manipulate the values of these factors -> experimental study
    + cannot manipulate the values of these factors -> observational study
11
Q

Performing survey/observe

A
  • ask about behaviors, opinions, beliefs, characteristics
  • observe behaviors
12
Q

data warehousing

A

process of centralised data management: creating and maintaining a central repository for all of an organization's data

13
Q

big data

A

massive amounts of data,
often collected at fast rates in real time and in different forms

14
Q

population

A

set of all elements

15
Q

population of measurements

A

carry out a measurement to assign a value of a variable to each and every element of the population

16
Q

Census

A

examine all population measurements

17
Q

sample

A

subset of the elements of population

18
Q

sample of measurement

A

measure a characteristic of the elements in a sample

19
Q

descriptive statistics

A

science of describing the important aspects of a set of measurements
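
As a rough illustration of describing the important aspects of a set of measurements, the sketch below summarizes a small list of numbers with Python's standard `statistics` module. The `sales` values are hypothetical and only serve as an example.

```python
import statistics

# Hypothetical measurements: units sold by seven stores in one month.
sales = [12, 15, 11, 19, 15, 22, 14]

# A few common descriptive measures of the set of measurements.
summary = {
    "n": len(sales),
    "mean": statistics.mean(sales),
    "median": statistics.median(sales),
    "stdev": statistics.stdev(sales),  # sample standard deviation
    "min": min(sales),
    "max": max(sales),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```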

20
Q

statistical inference

A

science of using a sample of measurements to make generalizations about the important aspects of a population of measurements

21
Q

random sample

A

sample selected so that every set of n elements in the population has the same chance of being selected

22
Q

business analytics

A

the use of traditional and newly developed statistical methods, advances in information systems, and techniques from management science to continuously and iteratively explore and investigate past business performance, with the purpose of gaining insight and improving business planning and operations.

23
Q

data mining

A

the process of discovering useful knowledge in extremely large data sets.

24
Q

sample with replacement

A

place the element chosen on any particular selection back into the population => it has a chance to be chosen on any succeeding selection

25
Q

sample without replacement

A

do not place the element chosen on a particular selection back into the population => it cannot be chosen again => best to sample w/o replacement
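
The difference between the two sampling schemes can be sketched with Python's standard `random` module. The 10-element `population` and the fixed seed are assumptions made only so the example is repeatable.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = list(range(1, 11))  # hypothetical frame of 10 numbered elements

# With replacement: every draw is made from the full population,
# so the same element may appear more than once.
with_replacement = rng.choices(population, k=5)

# Without replacement: once chosen, an element cannot be chosen again.
without_replacement = rng.sample(population, k=5)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

`random.sample` never returns duplicates from a list of distinct elements, while `random.choices` may.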

26
Q

frame

A

a list of all of the population elements

27
Q

random number table

A

a table containing random digits that is often used to select a random sample

28
Q

Process

A

a process is a sequence of operations that takes inputs (labor, materials, methods, machines, and so on) and turns them into outputs (products, services, and the like)

29
Q

finite population

A

a population that contains a finite number of elements

30
Q

infinite population

A

a population that is defined so that there is no limit to the number of elements that could potentially belong to the population

31
Q

probability sampling

A

sampling where we know the chance (prob.) that each population element will be included in the sample

32
Q

convenience sampling

not probability sampling

A

sampling where we select elements because they are easy or convenient to sample

33
Q

Voluntary response sample

overrepresent people with strong (usually negative) opinions

a type of convenience sampling

A

sampling in which the sample participants self-select

34
Q

judgement sampling

not probability sampling

A

sampling where an expert selects population elements that he/she feels are representative of the population

dangerous to use the sample to make statistical inferences about the population because it depends upon the judgment of the person selecting the sample

35
Q

improper sampling

unethical

A

purposely selecting a biased sample

e.g., using a nonrandom sampling procedure that overrepresents population elements supporting a desired conclusion or that underrepresents population elements not supporting the conclusion

36
Q

misleading charts, graphs, and descriptive measures

unethical

A

unethical statistical practice

37
Q

inappropriate statistical analysis or inappropriate interpretation of statistical results

A

selecting many different samples and running many different tests

to produce a result that seems to be true but is not

38
Q

descriptive analytics

A

The use of traditional and more recently developed statistical graphics to present to executives (and sometimes customers) easy-to-understand visual summaries of up-to-the-minute information concerning the operational status of a business.

39
Q

graphical descriptive analytics

A

use traditional and/or newer graphics to present to executives (and sometimes customers) easy-to-understand visual summaries of up-to-the-minute information concerning the operational status of a business.

40
Q

numerical descriptive analytics

A

association learning, text mining, cluster analysis, and factor analysis.

41
Q

association learning

A

identifying items that tend to co-occur and finding the rules that describe their co-occurrence.
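
A minimal sketch of the co-occurrence idea, assuming a handful of hypothetical shopping baskets. It counts how often pairs of items appear together and derives two common rule measures, support and confidence; these measure names are standard in association-rule work but are not taken from the card itself.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions (each set is one shopping basket).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "eggs"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in transactions:
    item_counts.update(basket)
    # Sort so each pair is always stored in the same order.
    pair_counts.update(combinations(sorted(basket), 2))


def support(pair):
    """Fraction of all baskets that contain both items of the pair."""
    return pair_counts[tuple(sorted(pair))] / len(transactions)


def confidence(antecedent, consequent):
    """Among baskets containing `antecedent`, the fraction also containing `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]


print(support(("bread", "milk")))   # share of baskets with both bread and milk
print(confidence("bread", "milk"))  # strength of the rule "bread -> milk"
```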

42
Q

text mining

A

The science of **discovering knowledge, insights, and patterns** from a collection of textual documents or databases

using latent semantic analysis

43
Q

Latent semantic analysis

A

analyze the relationship between a collection of documents and the words they contain to produce a set of key concepts or factors related to the documents and words

44
Q

cluster analysis

A

Finding natural grouping or clusters within data without having to prespecify a set of categories

45
Q

Factor analysis

A

Starting with a large number of correlated variables and finding fewer underlying, uncorrelated factors that describe the essential aspects of the large number of correlated variables

reducing large number of variables to fewer underlying factors helps a business focus its activities and strategies

46
Q

predictive analytics

A

methods used to find anomalies, patterns, and associations in data sets, with the purpose of predicting future outcomes. The applications of predictive analytics include anomaly (outlier) detection, association learning, classification, cluster detection, prediction and factor analysis

supervised learning technique

methods used to predict values of a response variable on the basis of one or more predictor variables.

47
Q

classification

A

assigning items to specified categories or classes

48
Q

2 classes of predictive analytics

A
  • nonparametric predictive analytics
  • parametric
49
Q

parametric predictive analytics

A

find a **math equation** that relates the response variable to the predictor variable(s) and involves unknown parameters that must be estimated and evaluated by using sample data
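
As one hedged example of this idea, the sketch below fits the equation y = b0 + b1·x by ordinary least squares, estimating the unknown parameters b0 and b1 from sample data. The function name `fit_line` and the data values are hypothetical.

```python
def fit_line(xs, ys):
    """Estimate intercept b0 and slope b1 of y = b0 + b1 * x by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical sample data: predictor x and response y.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

b0, b1 = fit_line(xs, ys)
print(f"estimated equation: y = {b0:.2f} + {b1:.2f} * x")

# Use the fitted equation to predict the response for a new predictor value.
predicted = b0 + b1 * 6.0
```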

50
Q

parametric predictive analytics include

A
  • classical linear regression
  • logistic regression
  • discriminant analysis
  • neural networks
  • time series forecasting
51
Q

prescriptive analytics

A

combine external and internal constraints with results from descriptive or predictive analytics to recommend an optimal course of action

52
Q

Prescriptive analytics include

A
  • decision theory methods
  • linear optimization
  • nonlinear optimization
  • simulation
53
Q

supervised learning

A

uses a training set to teach models to yield the desired output
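
One simple way to picture a training set at work is a 1-nearest-neighbor classifier: the labeled examples are the training set, and a new item gets the label of the closest example. This is only one of many supervised techniques, and the features and labels below are hypothetical.

```python
import math

# Hypothetical training set: (features, label) pairs.
# Features are (visits per month, average spend); labels are customer tiers.
training_set = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((5.0, 7.0), "high"),
    ((6.0, 6.5), "high"),
]


def predict(features):
    """Return the label of the closest training example (1-nearest neighbor)."""
    _, nearest_label = min(
        training_set,
        key=lambda example: math.dist(example[0], features),
    )
    return nearest_label


print(predict((1.2, 1.4)))
print(predict((5.5, 6.0)))
```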

54
Q

2 types of quantitative variables

A

ratio and interval

55
Q

ratio variable

A
  • quantitative variable
  • measured on a scale such that ratios of its values are meaningful
  • there is an inherently defined zero value

distance of 0 miles = no distance at all
30 miles is twice as far as 15

56
Q

Interval variable

A
  • quantitative variable
  • ratios are not meaningful
  • no inherently defined zero value

0 degrees = cold, not "no temperature at all"
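
A small sketch of why ratios are meaningful for a ratio variable but not for an interval variable, assuming the "degree" example refers to a temperature scale such as Fahrenheit or Celsius. Changing units preserves the ratio of two distances but not the ratio of two temperatures.

```python
def miles_to_km(miles):
    return miles * 1.609344


def fahrenheit_to_celsius(f):
    return (f - 32) * 5 / 9


# Ratio variable (distance): 30 miles is twice 15 miles in any unit.
ratio_miles = 30 / 15
ratio_km = miles_to_km(30) / miles_to_km(15)

# Interval variable (temperature): the "ratio" depends on the unit chosen,
# because zero on either scale is arbitrary.
ratio_f = 60 / 30
ratio_c = fahrenheit_to_celsius(60) / fahrenheit_to_celsius(30)

print(ratio_miles, ratio_km)  # the two distance ratios agree
print(ratio_f, ratio_c)       # the two temperature ratios do not
```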

57
Q

2 types of qualitative variable

A

ordinal and nominative

58
Q

ordinal variable

A
  • qualitative
  • meaningful ordering/ranking of the categories

good-average-poor/1->5

59
Q

nominal variable

gender, color, etc.

A
  • qualitative variable
  • no meaningful ordering/ranking
60
Q

sampling design

A

methods for obtaining a sample

61
Q

stratified random sample

A

divide the pop. into nonoverlapping groups of similar elements (strata)
- random sample is selected from each stratum
- these samples are combined to form the full sample

wise to stratify when the pop. consists of 2 or more groups that differ with respect to the variable of interest. (age, gender, ethnic group, income)
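
The procedure above can be sketched as follows. The function name `stratified_sample`, the age-group labels, and the choice of an equal sample size per stratum are all assumptions for illustration; other allocation rules are possible.

```python
import random


def stratified_sample(population, stratum_of, per_stratum, rng):
    """Split the population into strata, then randomly sample each stratum."""
    strata = {}
    for element in population:
        strata.setdefault(stratum_of(element), []).append(element)

    sample = []
    for name in sorted(strata):
        # Random sample (without replacement) from this stratum.
        sample.extend(rng.sample(strata[name], per_stratum))
    return sample


rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: (id, age group) pairs in two nonoverlapping groups.
population = [(i, "40_plus" if i % 3 == 0 else "under_40") for i in range(30)]

sample = stratified_sample(population, lambda e: e[1], per_stratum=4, rng=rng)
print(sample)
```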

62
Q

multistage cluster sampling

A
  1. Stage 1: Randomly select a sample of counties from all of the counties in the US
  2. Stage 2: Randomly select a sample of townships from each county selected in Stage 1
  3. Stage 3: Randomly select a sample of voting precincts from each township selected in Stage 2
  4. Stage 4: Randomly select a sample of registered voters from each voting precinct selected in Stage 3

take a sample of registered voters from all registered voters in the US

advantageous when selecting sample from a very large geographical region (a frame doesn’t exist)
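
The four stages can be sketched on a small made-up hierarchy. The nested dictionary `counties`, the function name `multistage_sample`, and the sample sizes at each stage are hypothetical; only the stage-by-stage random selection follows the card.

```python
import random


def multistage_sample(counties, n_counties, n_townships, n_precincts, n_voters, rng):
    """Sample counties, then townships, then precincts, then voters."""
    voters = []
    for county in rng.sample(sorted(counties), n_counties):              # Stage 1
        townships = counties[county]
        for township in rng.sample(sorted(townships), n_townships):      # Stage 2
            precincts = townships[township]
            for precinct in rng.sample(sorted(precincts), n_precincts):  # Stage 3
                voters.extend(rng.sample(precincts[precinct], n_voters)) # Stage 4
    return voters


# Hypothetical hierarchy: 5 counties x 4 townships x 3 precincts x 20 voters.
counties = {
    f"county{c}": {
        f"township{c}.{t}": {
            f"precinct{c}.{t}.{p}": [f"voter{c}.{t}.{p}.{v}" for v in range(20)]
            for p in range(3)
        }
        for t in range(4)
    }
    for c in range(5)
}

rng = random.Random(2)  # fixed seed so the sketch is repeatable
sample = multistage_sample(counties, 2, 2, 2, 5, rng)
print(len(sample), "voters selected")
```

Note that no list of all voters is ever needed; a list is only required for the units selected at each stage.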

63
Q

systematic sampling

A

a sample taken by moving systematically through the population.

  • Select a sample of n elements w/o replacement from a frame of N elements: divide N by n (round down to nearest whole number) = l
  • Randomly select one element from the first l elements in the frame
  • The remaining elements in the sample are obtained by selecting every l-th element following the first element
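
The three steps above can be sketched as follows, writing the card's `l` as `interval`. The function name `systematic_sample` and the 100-element frame are assumptions for illustration.

```python
import random


def systematic_sample(frame, n, rng):
    """Select n elements by taking every l-th element after a random start."""
    interval = len(frame) // n       # l = N / n, rounded down
    start = rng.randrange(interval)  # random position among the first l elements
    return [frame[start + i * interval] for i in range(n)]


rng = random.Random(3)          # fixed seed so the sketch is repeatable
frame = list(range(1, 101))     # hypothetical frame of N = 100 elements
sample = systematic_sample(frame, 10, rng)
print(sample)
```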
64
Q

types of survey questions

A
  • dichotomous (yes/no)
  • MCQ
  • open-ended questions
65
Q

Dichotomous Questions

A
  • clearly stated
  • can be answered quickly
  • yield data that are easily analyzed
  • cons: info may be limited by the two-option format
66
Q

MCQ

A
  • several different forms
  • either categorical or numerical
67
Q

open-ended questions

A
  • most honest and complete information
  • no suggested answers to divert or bias a person’s response
68
Q

phone survey

A
  • inexpensive
  • conducted by callers who have very little training
  • impersonal nature -> respondent may misunderstand some of the questions
  • some people cannot be reached and others may refuse to answer some or all of the questions

=> low response rate

69
Q

response rate

A

the proportion of all people whom we attempt to contact that actually respond to a survey.
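
The definition amounts to a single division, sketched below. The function name `response_rate` and the counts (150 respondents out of 500 contact attempts) are hypothetical.

```python
def response_rate(responded, contacted):
    """Proportion of people we attempted to contact who actually responded."""
    if contacted <= 0:
        raise ValueError("contacted must be a positive count")
    return responded / contacted


# Hypothetical survey: 500 people contacted, 150 responded.
rate = response_rate(150, 500)
print(f"response rate: {rate:.0%}")
```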

70
Q

mail surveys (self-administered surveys)

A
  • inexpensive
  • recipients often won’t reply unless they receive some kind of financial incentive or other reward
  • the process can take significantly longer than a phone survey
71
Q

web-based surveys

A
  • same problems as mail surveys
  • respondents may record their true reactions incorrectly because they have misunderstood some of the questions posed
72
Q

personal interview

A
  • more control
  • more likely to respond (because of face-to-face)
  • questions are less likely to be misunderstood because the people conducting the interviews are typically trained employees who can clear up any confusion
  • cons: interviewers can potentially “lead” a respondent by body language + more costly

mall survey, 50% response rate

73
Q

target population

A

the entire population of interest to us in a particular study

74
Q

Sample frame

A

a **list of sampling elements** (people or things) from which the sample will be selected
(should closely agree with the target population)

75
Q

Sampling error

A

The difference between a numerical descriptor of the population and the corresponding descriptor of the sample
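
A minimal sketch of this definition, using the mean as the numerical descriptor. The simulated population, the sample size of 100, and the sign convention (sample minus population) are assumptions; the card does not fix a sign.

```python
import random
import statistics

rng = random.Random(4)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 measurements.
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Random sample of 100 measurements, drawn without replacement.
sample = rng.sample(population, 100)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

# Sampling error: sample descriptor minus the corresponding population descriptor.
sampling_error = sample_mean - population_mean
print(f"sampling error of the mean: {sampling_error:.3f}")
```

Drawing a different random sample would generally give a different sampling error.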

76
Q

Two types of survey errors

A
  • errors of nonobservation: related to population elements that are not observed
  • errors of observation: occur when the data collected in a survey differ from the truth
77
Q

Error of coverage

A

sample frame is different from the target population

  • undercoverage: some pop. elements are excluded from the process of selecting the sample
78
Q

Nonresponse

problem

A

occurs whenever some of the individuals who were supposed to be included in the sample are not

79
Q

selection bias

A
  • bias in the results
  • related to how survey participants are selected
80
Q

response bias

A
  • bias in the results
  • related to how survey participants answer the survey questions