Final Flashcards

1
Q

The relationship between INCOME (in $) and CONTINENT of birth could be analyzed using an F-test

A

True

2
Q

The relationship between HEIGHT (in cm) of respondents and CONTINENT of birth could be analyzed using a chi-squared test

A

False

3
Q

The relationship between HEIGHT of respondents (in cm) and SEX could be observed using a scatter diagram

A

False

4
Q

The relationship between HEIGHT of respondents (in cm) and WEIGHT (in kg) could be observed using a box plot

A

False

5
Q

If we BIN two scale variables, HEIGHT of respondents (in cm) and WEIGHT (in kg), we would get ORDINAL versions that could be analyzed using a crosstab

A

True
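
A minimal sketch of the binning idea on this card, in plain Python. The respondents, cut points, and labels below are hypothetical values chosen for illustration; they are not from the course material.

```python
from collections import Counter

def bin_value(value, cut_points, labels):
    """Map a scale value to an ordered label using ascending cut points."""
    for cut, label in zip(cut_points, labels):
        if value < cut:
            return label
    # Values at or above the last cut point fall in the top category.
    return labels[-1]

# Hypothetical respondents: (height in cm, weight in kg).
respondents = [(158, 52), (165, 61), (172, 70), (181, 85), (190, 95), (168, 88)]

height_labels = ["short", "medium", "tall"]   # assumed cuts at 165 and 180 cm
weight_labels = ["light", "medium", "heavy"]  # assumed cuts at 60 and 80 kg

# The crosstab is a count of cases per (height bin, weight bin) cell.
crosstab = Counter(
    (bin_value(h, [165, 180], height_labels), bin_value(w, [60, 80], weight_labels))
    for h, w in respondents
)

for cell, count in sorted(crosstab.items()):
    print(cell, count)
```

Once both variables are ordinal, the cell counts are the kind of table a crosstab-based test works on.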

6
Q

A t-test is usually the right option to explore a bivariate relationship if we have a scale variable and a categorical variable with more than two categories

A

False

7
Q

An ANOVA test is sometimes followed by a post hoc test (Bonferroni, LSD)

A

True

8
Q

The null hypothesis for an F-test is that the mean of a scale variable is the same across the different categories of a categorical variable

A

True
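
To make the null hypothesis concrete, here is a rough sketch of the one-way ANOVA F statistic computed by hand in plain Python. The function name `one_way_f` and the income figures are assumptions made for illustration; a statistics package would also report the p-value.

```python
from statistics import mean

def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)       # number of categories
    n = len(all_values)   # total number of cases
    # Between-group sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of cases around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical incomes (in thousands of $) by continent of birth.
income_by_continent = {
    "Europe": [30, 32, 34],
    "Asia": [40, 42, 44],
    "Africa": [50, 52, 54],
}
f_stat = one_way_f(list(income_by_continent.values()))
print(f_stat)
```

When the sample means of all categories coincide, the between-group term is zero and so is F; the further apart the group means are relative to the within-group spread, the larger F becomes.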

9
Q

ANOVA is one of the usual inferential tests that complements a SCATTER X/Y graph

A

False

10
Q

If we get a p-value of 0.04 in a bivariate INDEPENDENCE test, that means we have evidence of a relationship at a maximum confidence level of 96%

A

True

11
Q

CLUSTER is a SUPERVISED CLASSIFICATION method

A

False

12
Q

Cluster analysis is mainly used to group CASES, not FIELDS

A

True

13
Q

Cluster is one of the main technical resources for PREDICTIVE ANALYTICS

A

False

14
Q

Cluster is one of the main technical resources for DESCRIPTIVE ANALYTICS

A

True

15
Q

The evaluation of a cluster solution is NOT MAINLY a technical assessment task

A

True

16
Q

We normally STANDARDIZE metric and categorical variables to run a cluster analysis

A

False
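
One reading of this card is that standardization applies to the metric variables only. A small sketch of z-score standardization in plain Python follows; the `income` and `age` values are hypothetical, and the helper name `z_scores` is an assumption.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Rescale a metric variable to mean 0 and standard deviation 1."""
    m = mean(values)
    s = pstdev(values)  # population standard deviation
    return [(v - m) / s for v in values]

# Hypothetical metric clustering variables on very different scales.
income = [20000, 30000, 40000, 50000, 60000]  # in $
age = [25, 35, 45, 55, 65]                    # in years

income_z = z_scores(income)
age_z = z_scores(age)
print(income_z)
print(age_z)
```

After rescaling, both variables contribute on the same footing to a distance calculation, instead of income dominating simply because its units are larger.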

17
Q

We normally run a HIERARCHICAL cluster only if the number of cases / individuals is relatively small

A

True

18
Q

We can use a HIERARCHICAL cluster combining metric and categorical variables

A

False

19
Q

The AGGLOMERATION schedule is not an output of interest in a Two-Step cluster

A

True

20
Q

The choice of the distance measure in hierarchical clusters depends, basically, on the type of variables (categorical, scale, …)

A

True
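
As an illustration of why the variable type matters, the sketch below contrasts two common distance measures in plain Python: Euclidean distance for scale fields and a simple matching distance for categorical fields. The cases are hypothetical, and these two measures are examples rather than the only options a clustering tool offers.

```python
from math import sqrt

def euclidean(a, b):
    """Distance for scale variables: square root of the summed squared differences."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def simple_matching_distance(a, b):
    """Distance for categorical variables: share of fields whose categories differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Hypothetical cases described by scale fields (height in cm, weight in kg).
d_scale = euclidean((170, 65), (173, 69))

# Hypothetical cases described by categorical fields (continent, sex, marital status).
d_cat = simple_matching_distance(
    ("Europe", "F", "single"),
    ("Europe", "M", "married"),
)
print(d_scale, d_cat)
```

Subtracting category labels has no meaning, which is why categorical fields need a measure based on matches and mismatches rather than on differences.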

21
Q

One of the advantages of Two-Step is its inherent ability to handle outliers

A

True

22
Q

Normally, a good cluster solution is shaped with a LARGE number of clustering variables (no fewer than 15)

A

False

23
Q

A field/variable can be very relevant to define/distinguish a SPECIFIC CLUSTER, without being of great importance for the solution as a whole

A

True

24
Q

The selection of the clustering VARIABLES highly conditions the cluster solution we get

A

True

25
Factor analysis is a SUPERVISED analysis technique
False
26
Factor analysis is normally used when we want to summarize a large group of variables into a smaller number of factors
True
27
Factor analysis is commonly used as a technique to reduce the number of cases in a dataset
False
28
We could use a FACTOR analysis to test if different items in a questionnaire are linked with the same latent/underlying concept
True
29
Factor analysis can be used to create a COMPOSITE INDEX
True
30
A low specificity for most of the input variables may suggest that factor analysis is feasible
True
31
Standard/Classic factor analysis is suitable for metric and categorical variables
False
32
Using the PCA extraction method, FACTORS will always be orthogonal if no rotation is applied
True
33
Rotation is used to improve factor interpretation
True
34
Oblique rotation is normally realistic since factors are normally correlated
True
35
Sometimes, we will find it of interest to retain factors with eigenvalues lower than ONE
True
36
The factor SCORES will all have zero mean
True
37
Prior to rotation, a variable may exhibit high correlation with more than one factor at the same time
True
38
Correlation between input variables should always be positive if we want to carry out a good factor analysis
False
39
In a good factor analysis, we expect to get as few factors as possible accounting for as much variance as possible
True
40
A low level of communality for a variable means that we will need a specific factor for that variable in our analysis
True
41
Given that input variables are used in standardized fashion, the sum of all the eigenvalues of all factors equals the number of variables
True
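
A small worked check of this property for the simplest case, two standardized variables. Their correlation matrix is [[1, r], [r, 1]], whose eigenvalues are 1 + |r| and 1 - |r|, so they always add up to 2. The item scores and the helper `pearson_r` are hypothetical, written in plain Python for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

# Hypothetical scores on two questionnaire items.
item_a = [1, 2, 3, 4, 5]
item_b = [2, 1, 4, 3, 5]

r = pearson_r(item_a, item_b)
# Eigenvalues of the 2x2 correlation matrix [[1, r], [r, 1]], largest first.
eigenvalues = [1 + abs(r), 1 - abs(r)]
print(r, eigenvalues, sum(eigenvalues))
```

The stronger the correlation between the two items, the more of the total of 2 is captured by the first eigenvalue, which is the two-variable version of a common underlying factor producing a high first eigenvalue.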
42
When there exists a common underlying factor for every variable we will get a high eigenvalue for the first factor
True
43
Trees are a SUPERVISED analysis technique
True
44
Trees are normally used when we want to balance “causation understanding” vs. prediction
True
45
CHAID uses iterated F-tests to grow the tree
False
46
One of the drawbacks of TREES is that they are a bit hard to interpret because of the technical complexity of the output
False
47
Trees are prone to over-fitting and thus require careful evaluation
True
48
CHAID is able to use scale variables automatically
True
49
The more we split the nodes, the more we avoid the over-fitting risk
False
50
Trees are more flexible than regression when causal relationships are not uniform across our whole sample
True
51
Trees are normally used for scale targets
False
52
We could use the result of a FACTOR analysis as input for a given CLUSTER analysis
True
53
We could use the result of a FACTOR analysis as input for a given REGRESSION analysis
True
54
We could use the result from a CLUSTER analysis as an input in a given TREE Analysis
True
55
We could use the result from a CLUSTER analysis as an input in a given standard FACTOR analysis
False
56
We could use the result from a CLUSTER analysis as an input in a given REGRESSION analysis without any transformation
False
57
We could use STANDARD REGRESSION in order to explain the result of a CLUSTER analysis
False
58
We could use STANDARD REGRESSION in order to explain a given FACTOR score with a set of explanatory variables
True