Final Flashcards

1
Q

Relationship between INCOME (in $) and CONTINENT of birth could be analyzed using an F-test

A

TRUE

2
Q

Relationship between HEIGHT (in cm) of respondents and CONTINENT of birth could be analyzed using a Chi-squared test

A

False

3
Q

Relationship between HEIGHT of respondents (in cm) and SEX could be observed using a Scatter Diagram

A

False

4
Q

Relationship between HEIGHT of respondents (in cm) and WEIGHT (in kg) could be observed using a Box Plot

A

False

5
Q

If we BIN two scale variables, HEIGHT of respondents (in cm) and WEIGHT (in kg), we would get ORDINAL versions that could be analyzed using a crosstab

A

True
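
A minimal sketch of the binning idea in this card, using made-up data. The helper `bin_value`, the sample lists, and the cut points are all illustrative assumptions, not part of the course material.

```python
from collections import Counter

def bin_value(value, cuts, labels):
    """Return the ordinal label for value, given ascending cut points."""
    for cut, label in zip(cuts, labels):
        if value < cut:
            return label
    return labels[-1]

# Hypothetical sample: heights in cm and weights in kg for six respondents.
heights_cm = [158, 165, 172, 180, 185, 191]
weights_kg = [52, 61, 70, 82, 79, 95]

# Arbitrary cut points chosen only for illustration.
height_bins = [bin_value(h, [165, 180], ["short", "medium", "tall"]) for h in heights_cm]
weight_bins = [bin_value(w, [60, 80], ["light", "medium", "heavy"]) for w in weights_kg]

# The crosstab is a count of each (height bin, weight bin) pair.
crosstab = Counter(zip(height_bins, weight_bins))
```

Each cell of `crosstab` is the number of respondents falling in that combination of ordinal categories, which is the table a crosstab procedure would report.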

6
Q

A t-test is usually the right option to explore a bivariate relationship if we have a scale variable and a categorical variable with more than two categories

A

False

7
Q

An ANOVA test is sometimes followed by a Post-Hoc Test (Bonferroni, LSD)

A

True

8
Q

The Null Hypothesis for an F-test is that the mean of a scale variable is the same across different categories of a categorical variable

A

True
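
To make the null hypothesis concrete, here is a rough sketch of the one-way ANOVA F statistic in plain Python. The function name `one_way_f` and the income figures are hypothetical. The statistic compares the variance between group means with the variance within groups.

```python
def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    k, n = len(groups), len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical incomes (in thousands of $) for three continents of birth.
same_means = [[30, 40, 50], [30, 40, 50], [30, 40, 50]]
different_means = [[30, 40, 50], [60, 70, 80], [90, 100, 110]]

f_same = one_way_f(same_means)       # group means identical
f_diff = one_way_f(different_means)  # group means far apart
```

When the group means coincide, the between-group term vanishes and F is zero. The further apart the means are relative to the within-group spread, the larger F becomes and the less plausible the null hypothesis looks.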

9
Q

ANOVA is one of the usual inferential tests that complements a SCATTER X/Y graph

A

False

10
Q

If we get a p-value of 0.04 in a bivariate INDEPENDENCE test, that means that we have evidence of a relationship at a maximum confidence level of 96%

A

True

11
Q

CLUSTER is a SUPERVISED CLASSIFICATION method

A

False

12
Q

Cluster analysis is mainly used to aggregate CASES, not FIELDS

A

True

13
Q

Cluster is one of the main technical resources for PREDICTIVE ANALYTICS

A

False

14
Q

Cluster is one of the main technical resources for PREDICTIVE ANALYTICS

A

True

15
Q

The evaluation of a cluster solution is NOT MAINLY a technical assessment task

A

True

16
Q

We normally STANDARDIZE metric and categorical variables to run a cluster analysis

A

False
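
One reading of this card is that standardization applies to the metric variables only, since a z-score has no meaning for category labels. A minimal sketch of that step, with a hypothetical helper `z_scores` and made-up heights:

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical metric clustering variable, in cm.
heights_cm = [150, 160, 170, 180, 190]
z_height = z_scores(heights_cm)
```

After this step every metric variable is on the same scale, so a variable measured in large units does not dominate the distances used to form the clusters.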

17
Q

We normally run a HIERARCHICAL cluster only if the number of cases / individuals is relatively small

A

True

18
Q

We can use a HIERARCHICAL cluster combining metric and categorical variables

A

False

19
Q

The AGGLOMERATION schedule is not an output of interest in a Two-Step cluster

A

True

20
Q

The choice of the distance measure in hierarchical clusters depends, basically, on the type of variables (categorical, scale, …)

A

True
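
As an illustration of why the variable type matters, the sketch below contrasts two common measures: Euclidean distance for scale variables and a simple matching distance for categorical ones. Both function names and the sample cases are assumptions made for this example. Statistical packages offer many other measures.

```python
import math

def euclidean(a, b):
    """Euclidean distance, a common choice for scale (metric) variables."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def simple_matching_distance(a, b):
    """Share of categorical attributes on which two cases disagree."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Two hypothetical cases described by scale variables (height cm, weight kg).
d_scale = euclidean([170, 65], [173, 69])

# Two hypothetical cases described by categorical variables.
d_cat = simple_matching_distance(["EU", "F", "urban"], ["EU", "M", "urban"])
```

Subtracting category labels, as the Euclidean formula would require, is not meaningful, which is why a matching-based measure is used for the categorical case.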

21
Q

One of the advantages of Two-Step is its inherent ability to handle outliers

A

True

22
Q

Normally, a good cluster solution is shaped with a LARGE number of clustering variables (not less than 15)

A

False

23
Q

A field/variable can be very relevant to define/distinguish a SPECIFIC CLUSTER, without being of great importance for the solution as a whole

A

True

24
Q

The selection of the clustering VARIABLES highly conditions the cluster solution we get

A

True

25
Q

Factor analysis is a SUPERVISED analysis technique

A

False

26
Q

Factor analysis is normally used when we want to summarize a large group of variables into a smaller number of factors

A

True

27
Q

Factor analysis is commonly used as a technique to reduce the number of cases in a dataset

A

False

28
Q

We could use a FACTOR analysis to test if different items in a questionnaire are linked with the same latent/underlying concept

A

True

29
Q

Factor analysis can be used to create a COMPOSITE INDEX

A

True

30
Q

A low specificity for most of the input variables may suggest that Factor analysis is feasible

A

True

31
Q

Standard/Classic factor analysis is suitable for metric and categorical variables

A

False

32
Q

Using the PCA extraction method, FACTORS will always be orthogonal if no rotation is applied

A

True

33
Q

Rotation is used to improve factor interpretation

A

True

34
Q

Oblique rotation is normally realistic since factors are normally correlated

A

True

35
Q

Sometimes we will find it of interest to retain factors with eigenvalues lower than ONE

A

True

36
Q

The factor SCORES will all have zero mean

A

True

37
Q

Prior to rotation, a variable may exhibit high correlation with more than one factor at the same time

A

True

38
Q

Correlation between input variables should always be positive if we want to carry out a good factor analysis

A

False

39
Q

In a good factor analysis, we expect to get as few factors as possible accounting for as much variance as possible

A

True

40
Q

A low level of communality for a variable means that we will need a specific factor for that variable in our analysis

A

True

41
Q

Given that input variables are used in standardized fashion, the sum of all the eigenvalues of all factors equals the number of variables

A

True
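
The reasoning behind this card can be written in one line, assuming PCA-style extraction from the correlation matrix of the standardized inputs. The symbols $R$ (correlation matrix), $p$ (number of variables), $\lambda_i$ (eigenvalues), and $r_{jj}$ (diagonal entries of $R$) are notation introduced here, not taken from the deck.

```latex
\sum_{i=1}^{p} \lambda_i \;=\; \operatorname{tr}(R) \;=\; \sum_{j=1}^{p} r_{jj} \;=\; p
```

Each standardized variable has unit variance, so every diagonal entry of $R$ equals 1, and the sum of the eigenvalues of any square matrix equals its trace. A factor with eigenvalue $\lambda_i$ therefore accounts for a share $\lambda_i / p$ of the total variance.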

42
Q

When there exists a common underlying factor for every variable we will get a high eigenvalue for the first factor

A

True

43
Q

Trees are a SUPERVISED analysis technique

A

True

44
Q

Trees are normally used when we want to balance “causation understanding” vs. prediction

A

True

45
Q

CHAID uses iterated F-tests to grow the tree

A

False

46
Q

One of the drawbacks of TREES is that they are a bit hard to interpret because of the technical complexity of the output

A

False

47
Q

Trees are prone to over-fitting and thus require careful evaluation

A

True

48
Q

CHAID is able to use scale variables automatically

A

True

49
Q

The more we split the nodes, the more we avoid the over-fitting risk

A

False

50
Q

A tree is more flexible than regression when causal relationships are not uniform across the whole sample

A

True

51
Q

Trees are normally used for scale targets

A

False

52
Q

We could use the result of a FACTOR analysis as input for a given CLUSTER analysis

A

True

53
Q

We could use the result of a FACTOR analysis as input for a given REGRESSION analysis

A

True

54
Q

We could use the result from a CLUSTER analysis as an input in a given TREE Analysis

A

True

55
Q

We could use the result from a CLUSTER analysis as an input in a given standard FACTOR analysis

A

False

56
Q

We could use the result from a CLUSTER analysis as an input in a given REGRESSION analysis without any transformation

A

False
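
Cluster membership is a nominal variable: the cluster numbers are labels, not quantities. The card does not say which transformation it has in mind. One common option is dummy coding, sketched below with a hypothetical helper `dummy_code` and made-up memberships.

```python
def dummy_code(labels, reference):
    """Turn nominal labels into 0/1 indicator columns, dropping the reference."""
    categories = sorted(set(labels) - {reference})
    return [
        {f"cluster_{c}": int(label == c) for c in categories}
        for label in labels
    ]

# Hypothetical cluster memberships for five cases.
cluster_labels = [1, 3, 2, 1, 3]

# Cluster 1 is the reference category, so it gets no column of its own.
dummies = dummy_code(cluster_labels, reference=1)
```

Each resulting 0/1 column can enter a regression as an explanatory variable, with coefficients read as differences from the reference cluster.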

57
Q

We could use STANDARD REGRESSION in order to explain the result of a CLUSTER analysis

A

False

58
Q

We could use STANDARD REGRESSION in order to explain a given FACTOR score with a set of explanatory variables

A

True