Week 3 Flashcards

1
Q

A categorical variable is the same as what type of variable?

A

Nominal

2
Q

Arbitrary labels such as male, homeowner and non-smoker belong to what type of variable?

A

Categorical.

3
Q

Are categorical variables discrete or continuous?

A

Discrete: the values are labels.

4
Q

What two forms can categorical variable labels take?

A

Names (vanilla, chocolate) or numbers (group 1, group 17).

5
Q

Do the numerical values of categorical variable labels mean anything mathematically?

A

No, because they are merely labels.

6
Q

Do ordinal scales involve discrete or continuous labels?

A

Discrete.

7
Q

Does the order matter in ordinal scales? Why?

A

Yes. They have an inherent order; they are ranks. Moving along the scale indicates a change in amount, but not how much change.

8
Q

What is an example of an ordinal scale?

A

Movie ratings, level of education. Horse racing: you know the position each horse finished in, but not by how much it won or lost.

9
Q

Are the steps of an ordinal scale equal in size?

A

No, they may not be.

10
Q

Which two aspects describe an interval scale?

A

Order and equal intervals.

11
Q

Are the variables in interval scales discrete or continuous?

A

Continuous (although the measurement may not be).

12
Q

Can you perform mathematical operations on an interval scale?

A

Yes, addition and subtraction. How much more (or less) of something is there?

13
Q

Do interval scales have a true 0?

A

No. For example, temperature (in degrees Celsius or Fahrenheit) is an interval scale: 0 does not mean no heat. On an interval scale, 0 does not mean an absence of the thing being measured.

14
Q

What are the three characteristics of a ratio scale?

A

Order, equal intervals and a true zero.

15
Q

Give some examples of a ratio scale.

A

Mass, length, time, etc. 0 metres is an absence of length, and 0 kg is an absence of mass.

16
Q

Can you calculate ratios of different values (ratio scale)?

A

Yes. For example, 50 kg is twice as much as 25 kg.
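A small sketch of why ratios only make sense with a true zero. The kg-to-pounds factor and the Celsius example are illustrative assumptions, not part of the card: a ratio of masses survives a change of unit, while a "ratio" of Celsius temperatures does not.

```python
# Ratio scale (mass): the ratio is the same whatever unit is used.
KG_TO_LB = 2.20462  # approximate conversion factor, for illustration only
kg_a, kg_b = 50.0, 25.0
ratio_kg = kg_a / kg_b
ratio_lb = (kg_a * KG_TO_LB) / (kg_b * KG_TO_LB)

# Interval scale (temperature): zero is arbitrary, so the ratio depends on the unit.
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

c_a, c_b = 20.0, 10.0
ratio_c = c_a / c_b                   # 2.0 in Celsius
ratio_f = c_to_f(c_a) / c_to_f(c_b)   # 68 / 50 in Fahrenheit, not 2.0

print(ratio_kg, ratio_lb, ratio_c, ratio_f)
```

Differences still work on the interval scale: 20 degrees is 10 degrees warmer than 10 degrees, but it is not "twice as hot".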

17
Q

Does understanding the scale in statistics matter?

A

Yes. Valid interpretation depends on knowing the scale.

18
Q

What type of scale is a Likert Scale?

A

Ordinal: discrete numerical values.

19
Q

What is a continuous variable?

A

A variable that can, in theory, take any value between its minimum and maximum (infinite resolution).

20
Q

Can continuous variables be converted to discrete variables?

A

Yes, although it may cause a loss of information.
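A minimal sketch of that information loss. The heights, the band names and the cut-offs (160 cm and 180 cm) are made up for illustration.

```python
# Continuous measurements (hypothetical heights in cm).
heights_cm = [151.2, 158.9, 164.5, 171.0, 179.8, 186.3]

def height_band(cm):
    """Map a continuous height onto one of three discrete bands (arbitrary cut-offs)."""
    if cm < 160:
        return "short"
    if cm < 180:
        return "medium"
    return "tall"

bands = [height_band(h) for h in heights_cm]
print(bands)

# Six distinct measurements collapse into three labels: the exact values are gone.
print(len(set(heights_cm)), "distinct values ->", len(set(bands)), "distinct bands")
```

Once only the bands are stored, 164.5 and 179.8 are indistinguishable, which is why the reverse conversion is not possible.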

21
Q

Can discrete variables be converted to continuous variables?

A

No.

22
Q

Can a construct be continuous?

A

Yes, but the method of quantifying it may be discrete (e.g. happiness measured with a Likert scale).

23
Q

What are the four types of scales?

A

Nominal, ordinal, interval and ratio.

24
Q

What is it called when you have two modal values?

A

Bimodal.

25
Q

What is it called when you have more than two modal values?

A

Multimodal.
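A quick sketch of bimodal and multimodal data, using `statistics.multimode` from the Python standard library (Python 3.8+). The data values are made up.

```python
import statistics

bimodal_data = [1, 2, 2, 3, 3, 4]        # 2 and 3 each appear twice
multimodal_data = [1, 1, 2, 2, 3, 3, 4]  # 1, 2 and 3 each appear twice

# multimode returns every value that ties for the highest count.
bimodal_modes = statistics.multimode(bimodal_data)
multimodal_modes = statistics.multimode(multimodal_data)

print(bimodal_modes)
print(multimodal_modes)
```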

26
Q

What is the median?

A

The middle value when the data are sorted in order.

27
Q

How do you find the median if there is an even number of values?

A

Sort the values, add the two middle scores and divide the sum by 2.
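The rule can be sketched as follows, with made-up scores, and checked against `statistics.median` from the standard library.

```python
import statistics

scores = [7, 3, 9, 5]        # an even number of values
ordered = sorted(scores)     # the values must be sorted first
mid = len(ordered) // 2      # index of the upper middle value

# Average the two middle scores.
manual_median = (ordered[mid - 1] + ordered[mid]) / 2
library_median = statistics.median(scores)

print(manual_median, library_median)
```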

28
Q

What is the mean?

A

The average: the sum of the values divided by the number of values.

29
Q

How do you calculate the range?

A

Maximum minus minimum.

30
Q

As the sample size increases, what does the range do?

A

The range tends to increase as well.
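A small illustration with made-up numbers: adding observations to a sample can widen the range but can never narrow it, because the existing maximum and minimum stay in the data.

```python
def data_range(values):
    """Range = maximum minus minimum."""
    return max(values) - min(values)

small_sample = [12, 15, 11, 14]
larger_sample = small_sample + [9, 20]   # two extra observations

small_range = data_range(small_sample)
larger_range = data_range(larger_sample)

print(small_range, larger_range)
```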

31
Q

How many equal groups do quartiles sort data into?

A

Four.

32
Q

What is the interquartile range?

A

The difference between Q3 and Q1.
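A minimal sketch using `statistics.quantiles` (Python 3.8+) on made-up data. Note that several conventions exist for computing quartiles, so other software may give slightly different cut points; this uses the function's default method.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 asks for the three cut points that split the data into four groups.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(q1, q2, q3, iqr)
```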

33
Q

What does the interquartile range measure?

A

How spread out the middle 50 per cent of the data is. The bigger the IQR, the greater the dispersion.

34
Q

Is the median or the mean more affected by whether an outlier is included or discarded?

A

The mean will be more affected.
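A short demonstration with made-up values: adding one extreme value moves the mean a long way but barely shifts the median.

```python
import statistics

without_outlier = [10, 12, 13, 14, 16]
with_outlier = without_outlier + [100]   # one extreme value added

mean_before = statistics.mean(without_outlier)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(without_outlier)
median_after = statistics.median(with_outlier)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```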

35
Q

Should nominal data be summarised using descriptive statistics (e.g. IQR, mean, median)?

A

No - it wouldn’t make sense considering nominal data is merely labels.

36
Q

Can ordinal data be summarised using descriptive statistics?

A

Yes, with some of them: for example, the median, quartiles and IQR.

37
Q

Do data values tend to be symmetrically clustered around the mean? If they are, what shape would they make?

A

Yes, they are generally clustered around the mean, so they form a bell-shaped (normal) curve.

38
Q

Is a right-skewed graph positively or negatively skewed?

A

Positively.

39
Q

Is a left-skewed graph positively or negatively skewed?

A

Negatively.

40
Q

What does a positively skewed graph tell us about data?

A

That the high data values are more spread out than the low values.

41
Q

What does a negatively skewed graph tell us about data?

A

That the low data values are more spread out than the high values.

42
Q

If the mean and median are approximately equal, what does this tell us about the data?

A

That it is quite symmetric.

43
Q

Which graphs can be used to assess skewness?

A

Histograms.

44
Q

What is variance?

A

It is roughly the average of the squared differences from the mean.

45
Q

What does the variance measure?

A

The spread of the data.

46
Q

What is the formula for the standard deviation?

A

The square root of the variance.
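A worked sketch of variance and standard deviation on made-up data. The "roughly" in the variance card can be read as the choice of divisor: dividing by n gives the population variance, dividing by n - 1 gives the sample variance.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)
squared_diffs = [(x - mean) ** 2 for x in data]   # squared differences from the mean

population_variance = sum(squared_diffs) / len(data)        # divide by n
sample_variance = sum(squared_diffs) / (len(data) - 1)      # divide by n - 1
standard_deviation = math.sqrt(population_variance)         # square root of the variance

print(population_variance, sample_variance, standard_deviation)
```

The standard library offers the same quantities as `statistics.pvariance`, `statistics.variance`, `statistics.pstdev` and `statistics.stdev`.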

47
Q

If data is skewed, what is something that can make it more symmetric?

A

A mathematical transformation.
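The card does not name a particular transformation. As one common choice for right-skewed, positive data, this sketch assumes a base-10 logarithm and uses made-up values; the gap between mean and median serves as a rough symmetry check.

```python
import math
import statistics

values = [1, 10, 100, 1000, 10000]   # strongly right-skewed

raw_mean = statistics.mean(values)
raw_median = statistics.median(values)

# Log transformation (an assumed choice; it only works for positive values).
logged = [math.log10(v) for v in values]
log_mean = statistics.mean(logged)
log_median = statistics.median(logged)

print("raw:   ", raw_mean, raw_median)    # mean far above the median
print("logged:", log_mean, log_median)    # mean and median roughly equal
```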

48
Q

What is kurtosis?

A

A measure of the shape of the tails.
Heavy (fat) tails, often with a sharp peak, mean high kurtosis (leptokurtic).
Light (thin) tails, often with a flatter peak, mean low kurtosis (platykurtic).

49
Q

What does EDA stand for?

A

Exploratory data analysis

50
Q

What does EDA mean?

A

It refers to procedures designed to present data in an informative way.

51
Q

What is frequency?

A

One way of summarising data for a categorical variable. Represents the count of observations in each category.

52
Q

What is relative frequency?

A

Refers to the proportion of the whole represented by the counts in a category.
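A minimal sketch of frequency and relative frequency for one categorical variable, using `collections.Counter` and made-up ice cream choices.

```python
from collections import Counter

flavours = ["vanilla", "chocolate", "vanilla", "strawberry", "vanilla", "chocolate"]

# Frequency: the count of observations in each category.
frequency = Counter(flavours)

# Relative frequency: each count as a proportion of the whole.
total = len(flavours)
relative_frequency = {category: count / total for category, count in frequency.items()}

for category, count in frequency.most_common():
    print(f"{category:<10} {count}  {relative_frequency[category]:.2f}")
```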

53
Q

What is a two-way table (or contingency table) of frequencies used for summarising?

A

Two categorical variables
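A sketch of a two-way table built by counting pairs of categories. The observations (smoking status and housing status for five hypothetical people) are made up.

```python
from collections import Counter

# Each observation records two categorical variables for one person.
people = [
    ("smoker", "homeowner"),
    ("non-smoker", "homeowner"),
    ("non-smoker", "renter"),
    ("smoker", "renter"),
    ("non-smoker", "homeowner"),
]

table = Counter(people)                                # count each combination
rows = sorted({smoking for smoking, _ in people})      # levels of the first variable
cols = sorted({housing for _, housing in people})      # levels of the second variable
grid = [[table[(r, c)] for c in cols] for r in rows]   # frequencies as rows x columns

print(" " * 12 + "  ".join(cols))
for r, counts in zip(rows, grid):
    print(f"{r:<12}" + "  ".join(f"{n:^{len(c)}}" for n, c in zip(counts, cols)))
```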

54
Q

What are pie charts, and are they the most effective?

A

Graphical representations used for a single categorical variable with, typically, few categories. There are better options.

55
Q

How are bar graphs different to compound bar charts?

A

Compound bar charts are used to represent more than one variable.

56
Q

What are bar graphs?

A

Used for either one or two categorical variables. They are better than pie charts for summarising information.

57
Q

What are stem and leaf plots?

A

Group data into intervals of equal length. Actual values are retained, possibly rounded.
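A small text-based sketch of a stem and leaf plot for made-up two-digit scores, taking the tens digit as the stem and the units digit as the leaf.

```python
from collections import defaultdict

scores = [24, 12, 37, 21, 15, 41, 30, 24]

# Group the leaves (units digits) under each stem (tens digit).
stems = defaultdict(list)
for score in sorted(scores):
    stems[score // 10].append(score % 10)

lines = [
    f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
    for stem, leaves in sorted(stems.items())
]
print("\n".join(lines))
```

Each interval has equal length (ten units here), and every original value can still be read back from its stem and leaf.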

58
Q

What are histograms?

A

Graphs that group the data, usually into equal-sized intervals, and show the count in each interval as a bar.

59
Q

What are the intervals that the data are grouped into in a histogram often called?

A

‘Bins’.
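A rough sketch of binning, with made-up values and an assumed bin width of 5: each value is assigned to the bin that contains it, and the counts per bin become the bar heights.

```python
from collections import Counter

values = [1, 3, 4, 7, 8, 8, 12, 15, 19]
bin_width = 5   # assumed width; every bin covers the same interval length

# Label each bin by its lower edge: 0 covers 0-4, 5 covers 5-9, and so on.
counts = Counter((v // bin_width) * bin_width for v in values)

for start in sorted(counts):
    end = start + bin_width - 1
    print(f"{start:>2}-{end:<2} {'#' * counts[start]}")
```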

60
Q

What are box plots?

A

They are a way of presenting continuous data and giving a picture of how the data are distributed.

61
Q

Which part of the data do box plots focus on?

A

The central 50 per cent of the data (the median and IQR). Whiskers cover the remaining data.

62
Q

Histograms, box plots, bar graphs and other graphs are used to compare what?

A

They are used to compare continuous data across different categories (levels) of a categorical variable. The plots must be drawn on the same scale for the continuous variable.

63
Q

What are scatterplots used for?

A

Used to consider the relationship between two quantitative variables (e.g. price and thickness of textbooks).

64
Q

Can scatterplots be constructed to include categorical variables?

A

Yes, to differentiate the relationship between the quantitative variables by category (e.g. price vs thickness by cover type).

65
Q

What are the 7 characteristics of a good graph?

A
  1. Clear images
  2. Smooth and sharp lines
  3. Font is legible and simple
  4. Units of measurement are provided
  5. Axes are clearly labeled
  6. Elements within the figure are clearly labeled or explained
  7. Error bars are included when graphing descriptive statistics
66
Q

What are 5 examples of bad graphs?

A
  1. Outright mistakes
  2. 3D graphs where the third dimension doesn’t represent anything and obscures or distorts the information
  3. Bar charts with a scale starting above zero
  4. Scatterplots with the x and y scales not restricted to the range of the data
  5. Fanciful plots that result in optical illusions that are known to mislead