class10: Analyze the Data Flashcards

1
Q

What is the SOAR analytics model?

A
  1. Specify the question
  2. Obtain the data
  3. Analyze the data
  4. Report the results
2
Q

a group with something in common

A

population

3
Q

characteristic of a population/characteristic of a sample

A

parameter / statistic

4
Q

what is the sample used for?

A

a sample is used to make inferences and draw conclusions about the characteristics of a population

5
Q

measures that describe a population/sample

A

descriptive statistics

6
Q

measures calculated from a sample and used to draw conclusions about the population

A

inferential statistics

7
Q

explain 3 sampling methods

A
  1. simple random sampling
    - every member of the population has an equal chance of being selected
  2. stratified random sampling
    - divide members into similar groups before sampling
  3. cluster sampling
    - divide the population into groups, then select a few groups
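
The three sampling methods on this card can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the `customers` population, the `region` grouping key, and all function names are hypothetical, and the seed is fixed only so the draw is repeatable.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Pick n members so that each one is equally likely to be chosen."""
    return random.Random(seed).sample(population, n)

def group_by(population, key):
    """Split the population into groups that share the same key value."""
    groups = {}
    for member in population:
        groups.setdefault(key(member), []).append(member)
    return groups

def stratified_random_sample(population, key, n_per_group, seed=0):
    """Group similar members first, then sample randomly within every group."""
    rng = random.Random(seed)
    sample = []
    for members in group_by(population, key).values():
        sample.extend(rng.sample(members, min(n_per_group, len(members))))
    return sample

def cluster_sample(population, key, n_groups, seed=0):
    """Randomly select a few groups and keep every member of those groups."""
    rng = random.Random(seed)
    groups = group_by(population, key)
    chosen = rng.sample(sorted(groups), n_groups)
    return [member for name in chosen for member in groups[name]]

# Hypothetical population: 40 customers spread evenly over 4 regions.
regions = ["North", "East", "South", "West"]
customers = [{"id": i, "region": regions[i % 4]} for i in range(40)]

def by_region(customer):
    return customer["region"]

srs = simple_random_sample(customers, 5)
stratified = stratified_random_sample(customers, by_region, 2)
clustered = cluster_sample(customers, by_region, 2)
```

The stratified sample contains members from every region, while the cluster sample contains all members of only the selected regions.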
8
Q

process of reducing the size of the data set to a more manageable and suitable size for a business analysis project

A

data reduction

9
Q

4 common methods of data reduction

A
  1. filtering
  2. deduplication
  3. aggregation
  4. compression
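
A small sketch of the four data reduction methods, using only the Python standard library. The `sales` records, the "amount greater than zero" filter rule, and the choice of `zlib` for compression are assumptions made for illustration; they are not taken from the card.

```python
import json
import zlib
from collections import defaultdict

# Hypothetical sales records, including one duplicate and one irrelevant row.
sales = [
    {"order": 1, "region": "East", "amount": 120.0},
    {"order": 2, "region": "West", "amount": 80.0},
    {"order": 2, "region": "West", "amount": 80.0},  # duplicate of order 2
    {"order": 3, "region": "East", "amount": 0.0},   # not relevant to the question
    {"order": 4, "region": "West", "amount": 50.0},
]

# 1. Filtering: keep only the rows relevant to the question.
filtered = [row for row in sales if row["amount"] > 0]

# 2. Deduplication: keep the first row seen for each order number.
seen = set()
deduplicated = []
for row in filtered:
    if row["order"] not in seen:
        seen.add(row["order"])
        deduplicated.append(row)

# 3. Aggregation: summarize individual rows as one total per region.
totals = defaultdict(float)
for row in deduplicated:
    totals[row["region"]] += row["amount"]

# 4. Compression: store the same records in fewer bytes.
raw = json.dumps(sales * 100).encode("utf-8")
compressed = zlib.compress(raw)
```

Filtering, deduplication, and aggregation shrink the data by discarding or summarizing rows, whereas this kind of compression is lossless: `zlib.decompress(compressed)` returns the original bytes.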
10
Q

4 types of bias in business analytics

A
  1. nonresponse
  2. selection
  3. confirmation
  4. outlier
11
Q

Shows all possible values for a variable and how often they (could) occur

A

data distribution

12
Q

a statistical function that describes the possible values in a population and the chance that any given observation takes a given value or falls within a given range

A

probability distribution

13
Q

explain 2 types of numerical data

A
  1. continuous data
    - can take any numerical value within a range (infinitely many possible values)
    e.g., height, weight, currency
  2. discrete data
    - countable whole-number values
    e.g., number of customers
14
Q

3 basic things to understand about the data
(the starting point of the analysis)

A
  1. structure
  2. dispersion
  3. frequency
15
Q

measure of the shape of a distribution (how heavy its tails are)

A

kurtosis

16
Q

most common observation in a data set

A

mode

17
Q

5 terms describing the structure of the data (its center point and shape)

A

mean
median
mode
kurtosis
symmetry

18
Q

3 concepts describing the dispersion of data

A
  1. range
    - maximum minus minimum
  2. variance
  3. standard deviation
    - square root of the variance
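
The measures of center and dispersion from the cards above can be computed with Python's standard `statistics` module. This is a sketch on a made-up data set; it uses the population versions of variance and standard deviation (`pvariance`, `pstdev`), which is an assumption, since the cards do not say whether the population or sample formulas are meant.

```python
from statistics import mean, median, mode, pvariance, pstdev

# Hypothetical data set (e.g., purchases per customer).
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Center point of the data.
center = {
    "mean": mean(values),      # arithmetic average
    "median": median(values),  # middle value of the sorted data
    "mode": mode(values),      # most common observation
}

# Dispersion of the data.
data_range = max(values) - min(values)  # maximum minus minimum
variance = pvariance(values)            # average squared distance from the mean
std_dev = pstdev(values)                # square root of the variance
```

For sample data, `statistics.variance` and `statistics.stdev` divide by n - 1 instead of n.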
19
Q

two conditions for a standard normal distribution

A
  1. mean = mode = median = 0
  2. standard deviation = 1
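
As an illustration of these two conditions, the sketch below builds a standard normal distribution with `statistics.NormalDist` (Python 3.8 or later) and then rescales a made-up data set into z-scores. The `values` list is hypothetical, and rescaling only changes the mean and standard deviation; it does not turn arbitrary data into a normal distribution.

```python
from statistics import NormalDist, mean, pstdev

# The standard normal distribution: centered at 0 with standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Hypothetical data set with mean 5 and standard deviation 2.
values = [2, 4, 4, 4, 5, 5, 7, 9]
mu, sigma = mean(values), pstdev(values)

# A z-score measures how many standard deviations a value lies from the mean.
z_scores = [(x - mu) / sigma for x in values]
```

After rescaling, the z-scores have mean 0 and standard deviation 1, so they can be compared against the standard normal distribution.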
20
Q

information that results from examining data to understand the past; answers the question “what happened?”

A

descriptive analytics

21
Q

builds on descriptive analytics and tries to answer the question “why did this happen?”

A

diagnostic analytics

22
Q

information that results from analyses that focus on predicting the future

A

predictive analytics

23
Q

information that results from analyses that provide a recommendation of what should happen; answers the question “what should be done?”

A

prescriptive analytics

24
Q

what kind of analytics is useful to evaluate performance?

A

descriptive analytics

25
Q

3 issues of diagnostic analysis

A
  1. mistaking correlation for causation
  2. cannot predict the future
  3. the answers may not be 100% definitive
26
Q

3 approaches of diagnostic analysis for finding trends (relationships)

A
  1. data drilling
  2. data mining
  3. goals
27
Q

5 concerns with diagnostic analysis

A
  1. confounding variable
  2. lack of causal connection
  3. data dredging
  4. reporting issues
  5. low n
28
Q

implies a deeper connection/relationship of influence

determines the relationship’s effect

A

regression
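
To make the distinction concrete: correlation measures how strongly two variables move together, while a simple linear regression also estimates the size of the effect (the slope). The sketch below computes both by hand on made-up numbers; `ad_spend`, `sales`, and the function names are hypothetical, and neither result proves causality on its own.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Strength and direction of the linear relationship, from -1 to 1."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical data: advertising spend and resulting sales.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

r = pearson_r(ad_spend, sales)
slope, intercept = fit_line(ad_spend, sales)
```

Here `r` says how closely the points follow a line, and `slope` estimates how much `sales` changes for each additional unit of `ad_spend`.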

29
Q

causality exists when these two conditions are met

A
  1. significant correlation
  2. chronological sequence / experiment / theory