Lecture 8 Flashcards

1
Q

What is a variable?

A

Any quantity that can be measured

2
Q

Finish the sentence “In a dataset there will be ___ of a variable for each individual in the sample”

A

Observations

3
Q

What is central tendency?

A

The typical value of a variable

4
Q

What is dispersion?

A

How far from the typical value the individual observations of a variable are

5
Q

What is an association?

A

How a variable relates to another variable

6
Q

What are inferential statistics?

A

Statistics used to draw conclusions (inferences and predictions) about population parameters from sample data

7
Q

What are parameters?

A

Characteristics of the population (e.g. the population mean)

8
Q

What estimates the parameters?

A

Statistics computed from a sample

9
Q

What is probability?

A

The chance that a particular event will occur

10
Q

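What is a sampling distribution?

A

The probability distribution of a statistic (such as the sample mean) across all possible samples of a given size; it tells us how likely the value observed in our sample is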

11
Q

What is hypothesis testing?

A

A procedure for assessing whether the sample data support our beliefs (hypotheses) about the population

12
Q

When do we use descriptive statistics?

A

To summarise sample data

13
Q

What do we use statistical inference for?

A

To generalise about population parameters

14
Q

What determines or influences what statistical methods we can apply?

A

The level of measurement of the data

15
Q

What are descriptive statistics for?

A

To summarise the key features of data.
- To make it understandable for human readers
- To identify characteristics
- To identify patterns
- To provide basis for further analysis

16
Q

What are the measures of central tendency?

A

Mean (x̄), median (M), mode (Z)

17
Q

What is a measure of central tendency?

A

Single number that represents the ‘typical’ value of a variable (an average: mean, median, mode)
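
The three averages named on the cards above can be computed with Python's standard `statistics` module. This is a minimal sketch; the `observations` list is hypothetical data invented for illustration, not taken from the lecture.

```python
import statistics

# Hypothetical sample: one variable observed for seven individuals.
observations = [1, 2, 2, 3, 4, 7, 9]

mean_value = statistics.mean(observations)      # sum of the values / number of values
median_value = statistics.median(observations)  # middle value of the ordered data
mode_value = statistics.mode(observations)      # most frequently occurring value

print(mean_value, median_value, mode_value)     # prints: 4 3 2
```

Here the mean (4) is pulled above the median (3) by the two largest values, which is one reason the three averages can give different pictures of the "typical" value.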

18
Q

How would you visualise data?

A

With frequency tables and charts, e.g. bar charts and histograms

19
Q

What is skewness?

A

Asymmetry in a distribution: a relatively higher proportion of values sits at the low (left) or high (right) end of the range (on the graph)

20
Q

Where can you visualise skewness best?

A

By comparing the values of the mean, median and mode, and by looking at histograms

21
Q

What does a normal distribution look like?

A

Symmetric about the mean, with values evenly spread above and below it (bell shape)

22
Q

Which side does a positive skew lean towards?

A

Right: the long tail stretches to the right (high values), while most observations sit at the lower end

23
Q

Which way does a negative skew lean towards?

A

Left: the long tail stretches to the left (low values), while most observations sit at the higher end
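
A quick way to check the direction of skew without a histogram is to compare the mean with the median, since the mean is pulled towards the long tail. The sketch below is a rule of thumb only, not a formal skewness statistic; the function name `skew_direction` and both data sets are hypothetical.

```python
import statistics

def skew_direction(values):
    """Rule of thumb: the mean is dragged towards the long tail."""
    mean_value = statistics.mean(values)
    median_value = statistics.median(values)
    if mean_value > median_value:
        return "positive (right) skew"
    if mean_value < median_value:
        return "negative (left) skew"
    return "roughly symmetric"

# Hypothetical data: one very high value creates a long right tail.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]
# Hypothetical data: one very low value creates a long left tail.
left_tailed = [1, 17, 18, 18, 19, 19, 19, 20]

print(skew_direction(right_tailed))  # prints: positive (right) skew
print(skew_direction(left_tailed))   # prints: negative (left) skew
```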

24
Q

What is the mean the best representation of?

A

The average in most cases of continuous data

25
Q

What does the median identify?

A

The central point (middle value) of the ordered data

26
Q

What is the median useful for?

A

Summarising skewed data (it is less affected by extreme values than the mean), or when continuous variables are measured on subjective scales

27
Q

What is the mode suitable for?

A

Nominal data or grouped data

28
Q

What does dispersion measure?

A

How far, on average, each observation is from the central tendency (mean)

29
Q

What does the dispersion figure represent?

A

The variation in values within a variable

30
Q

What do lower values of dispersion indicate?

A

That the central tendency (mean) is a better representation of the ‘typical value’ (more accurate)

31
Q

What do the range and interquartile range provide?

A

A basic measure, useful for visualisation and identifying outliers

32
Q

Why should we use variance and standard deviation?

A

They are more statistically powerful measures: they use every observation rather than only the extremes or the quartiles

33
Q

What’s the interquartile range?

A

The range of the middle 50% of values: the difference between the upper quartile (median of the upper half) and the lower quartile (median of the lower half)
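
The "median of the halves" idea on this card can be sketched as follows. The function name `interquartile_range` and the data are hypothetical, and this is only one of several quartile conventions; software packages often interpolate and may give slightly different quartiles.

```python
import statistics

def interquartile_range(values):
    """Return (Q1, Q3, IQR) using the median-of-halves method."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]    # values below the middle
    upper_half = ordered[-half:]   # values above the middle (middle value skipped if n is odd)
    q1 = statistics.median(lower_half)
    q3 = statistics.median(upper_half)
    return q1, q3, q3 - q1

# Hypothetical data with eight observations.
print(interquartile_range([1, 3, 4, 6, 7, 9, 12, 15]))  # prints: (3.5, 10.5, 7.0)
```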

34
Q

What is variance?

A

The mean of the squared differences between each data point and the mean

35
Q

What is standard deviation?

A

Square root of the variance (most common measure of dispersion)

36
Q

What can measures of dispersion not be applied to?

A

Nominal variables

37
Q

What is a good visual form for understanding dispersion of a variable and identifying outliers?

A

Box plots

38
Q

What is a plot outlier?

A

Values that lie outside the box plot limits (beyond the ends of the whiskers)
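
The card does not say where the box plot limits sit. A common convention, assumed here and not stated in the source, places them 1.5 × IQR below the lower quartile and above the upper quartile. The sketch below uses `statistics.quantiles` from the standard library; the function name `find_outliers` and the data are hypothetical.

```python
import statistics

def find_outliers(values, whisker=1.5):
    """Flag values beyond whisker * IQR from the quartiles (assumed convention)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)  # three quartile cut points
    iqr = q3 - q1
    lower_limit = q1 - whisker * iqr
    upper_limit = q3 + whisker * iqr
    return [v for v in values if v < lower_limit or v > upper_limit]

# Hypothetical data: 60 sits far above the rest of the values.
print(find_outliers([10, 12, 13, 14, 15, 16, 18, 60]))  # prints: [60]
```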

39
Q

How do you calculate variance?

A

Mean of the squared differences between each value in the dataset and the mean

40
Q

How do you calculate standard deviation?

A

Square root of the variance
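
The two calculations above can be written out step by step. This sketch follows the cards' wording and divides by the number of values (the population form); many courses and packages divide by n − 1 for sample data, which is an alternative not shown here. The function names and data are hypothetical.

```python
import math

def variance(values):
    """Mean of the squared differences between each value and the mean."""
    mean = sum(values) / len(values)
    squared_differences = [(x - mean) ** 2 for x in values]
    return sum(squared_differences) / len(values)  # divides by n (population form)

def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))

# Hypothetical data: the mean is 5 and the squared differences sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data))            # prints: 4.0
print(standard_deviation(data))  # prints: 2.0
```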

41
Q

What does the measure of association consider?

A

The relationship between two variables

42
Q

What does Kurtosis mean?

A

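The flatness or peakedness of a distribution (strictly, how heavy its tails are compared with a normal distribution)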

43
Q

What is a large SD? (Standard deviation)

A

A flat, widely spread distribution

44
Q

What is a small SD? (Standard deviation)

A

A narrow distribution, tightly clustered around the mean

45
Q

What does standard deviation tell us about in terms of distribution?

A

The spread of the distribution (how flat or narrow it is)

46
Q

What’s the statistic for categorical data?

A

Chi-squared (χ²)

47
Q

What’s the statistic for continuous data?

A

Pearson’s correlation coefficient (r)

48
Q

What does Chi-Squared measure?

A

The association between two categorical variables

49
Q

What does chi-squared compare?

A

The frequencies we would expect if there were no relationship with the frequencies actually observed in the sample data
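
The comparison of expected and observed frequencies can be sketched for a contingency table as below. Each expected count is row total × column total / grand total, and the statistic sums (observed − expected)² / expected over all cells. The function name `chi_squared` and both tables are hypothetical, and the sketch stops at the statistic; it does not look up a critical value.

```python
def chi_squared(observed):
    """Chi-squared statistic for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, observed_count in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed_count - expected) ** 2 / expected
    return statistic

# Hypothetical 2x2 tables of counts.
related = [[30, 10], [20, 40]]    # observed counts differ from expected counts
unrelated = [[10, 20], [20, 40]]  # observed counts equal expected counts

print(round(chi_squared(related), 2))  # prints: 16.67
print(chi_squared(unrelated))          # prints: 0.0
```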

50
Q

What does correlation (r) measure?

A

The strength (magnitude) and direction of the linear association between two variables

51
Q

What does a positive correlation mean?

A

Increase in one variable associated with an increase in the other

52
Q

What does a negative correlation mean?

A

Increase in one variable associated with a decrease in the other

53
Q

What does 0 correlation mean?

A

No linear association

54
Q

What does Chi-Squared rely on?

A

Testing for statistical significance

55
Q

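What is statistical significance?

A

A result is statistically significant when it is unlikely to have arisen by chance alone if there were no real effect or relationship in the population; it says nothing about the importance or quality of the data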

56
Q

What is a critical value?

A

How far from the expected centre a test statistic must be before we call the result unusual (statistically significant)

57
Q

What is correlation used for?

A

Ordinal and scale data (continuous)

58
Q

What is Chi-Squared used for?

A

Nominal (Categorical) data

59
Q

What is covariance?

A

The degree to which two variables deviate from their expected values (mean) in similar ways

60
Q

What does a positive covariance indicate?

A

Variables that tend to ‘move together’ away from their means: if we observe a high value of x, we also expect to see a high value of y

61
Q

What’s a scatter plot good for?

A

Checking if there is a linear relationship between two variables

62
Q

What does negative covariance indicate?

A

Variables that move in opposite directions: if we observe a high value of x, we expect to see a value of y below its mean
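
Covariance and Pearson's r are closely linked: r is the covariance divided by the product of the two standard deviations, which rescales it to lie between −1 and +1. The sketch below uses the population (divide by n) form throughout; the function names and data are hypothetical.

```python
import math

def covariance(xs, ys):
    """Average product of the paired deviations from each variable's mean."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / len(xs)

def pearson_r(xs, ys):
    """Covariance rescaled by both standard deviations."""
    sd_x = math.sqrt(covariance(xs, xs))  # covariance of x with itself is its variance
    sd_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)

# Hypothetical data: y rises with x in the first pairing and falls in the second.
x = [1, 2, 3, 4, 5]
y_rising = [2, 4, 6, 8, 10]
y_falling = [10, 8, 6, 4, 2]

print(covariance(x, y_rising))   # prints: 4.0
print(covariance(x, y_falling))  # prints: -4.0
print(round(pearson_r(x, y_rising), 3), round(pearson_r(x, y_falling), 3))
```

The sign of the covariance gives the direction of the relationship, while r also shows its strength on a fixed scale.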

63
Q

What is a strong correlation’s r value?

A

r = ± 0.8

64
Q

What is a weak correlation’s r value?

A

r = ± 0.3

65
Q

What is an omitted variable?

A

A factor left out of the analysis that could lead to changes in both X and Y

66
Q

What is reverse causality?

A

A change in Y leads to the change in X

67
Q

What is Sample selection bias?

A

When the individuals sampled have a different tendency to show the association than the whole population

68
Q

What’s a measurement error?

A

Values in the data that differ from the true value of the variable

69
Q

What does association NOT imply?

A

Causation

70
Q

What are alternative reasons for finding a relationship?

A

Omitted variables, reverse causality, sample selection bias and measurement error

71
Q

What does the test we use depend on?

A

Data meeting certain assumptions

72
Q

What does the statistical inference process rely on?

A

Estimating the probability of obtaining our sample results, based on the distribution of sample statistics and population parameters.

73
Q

What’s the standard normal (Z) distribution’s mean and SD value?

A

Mean = 0
SD = 1

74
Q

What does a value’s Z-score represent?

A

How many standard deviations it lies from the mean

75
Q

What does a higher Z score mean?

A

A lower probability of observing a value that extreme: the value lies further from the mean
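
A Z-score is (value − mean) / SD, and the standard normal distribution then gives the probability of a value at least that far from the mean. The sketch below uses `statistics.NormalDist` from the standard library and assumes the variable is normally distributed; the function names and the numbers (mean 100, SD 15) are hypothetical.

```python
from statistics import NormalDist

def z_score(value, mean, sd):
    """Number of standard deviations between the value and the mean."""
    return (value - mean) / sd

def probability_at_least_this_extreme(z):
    """Two-tailed probability under the standard normal (mean 0, SD 1)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical variable with mean 100 and SD 15.
z_near = z_score(115, mean=100, sd=15)  # 1.0
z_far = z_score(130, mean=100, sd=15)   # 2.0

print(z_near, round(probability_at_least_this_extreme(z_near), 3))
print(z_far, round(probability_at_least_this_extreme(z_far), 3))
```

The value with the higher Z-score gets the smaller probability, which is the point of the card above.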

76
Q

What is the central limit theorem?

A

The sampling distribution of the mean will be approximately normal, whatever the shape of the population distribution, as long as the sample size is large enough

77
Q

What is the ‘sampling distribution of the sample means’?

A

The distribution consisting of the means of all random samples of a given size (n) that can be drawn from a population.
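
The sampling distribution of the sample means can be approximated by simulation: draw many random samples, record each sample's mean, and look at how those means are distributed. The sketch below draws from a strongly skewed population (an exponential distribution with mean 1 and SD 1); that population, the sample size of 50, the 2000 repetitions and the function name `sample_means` are all illustrative assumptions.

```python
import random
import statistics

def sample_means(sample_size, num_samples, seed=0):
    """Means of repeated random samples from a skewed (exponential) population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means

means = sample_means(sample_size=50, num_samples=2000)
centre = statistics.fmean(means)   # should sit close to the population mean, 1
spread = statistics.pstdev(means)  # should sit close to 1 / sqrt(50), about 0.14

# Share of sample means within one SD of the centre (about 68% for a normal curve).
within_one_sd = sum(abs(m - centre) <= spread for m in means) / len(means)
print(centre, spread, within_one_sd)
```

Even though the population is far from bell-shaped, the sample means cluster symmetrically around the population mean, which is what the central limit theorem describes.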

78
Q

What is the alternative hypothesis (H1)?

A

What we believe the data will support

79
Q

What’s the null hypothesis (H0)?

A

It covers all states we want to disprove

80
Q

What is the statistical significance level?

A

Conventionally 0.05 (a significance level of 5%)
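
To show how the 0.05 level is used, here is a minimal sketch of one simple test, a two-tailed one-sample Z-test. It is an illustration rather than the method taught in the lecture, and it assumes the population SD is known; the function name `one_sample_z_test` and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, null_mean, population_sd, n, alpha=0.05):
    """Return (z, p_value, reject_null) for H0: population mean == null_mean."""
    standard_error = population_sd / math.sqrt(n)  # SD of the sampling distribution
    z = (sample_mean - null_mean) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed probability
    return z, p_value, p_value < alpha

# Hypothetical study: 100 people, sample mean 103, H0 says the mean is 100.
z, p_value, reject_null = one_sample_z_test(103, 100, population_sd=15, n=100)
print(z, round(p_value, 4), reject_null)
```

With these numbers z is 2.0 and the p-value falls just under 0.05, so H0 would be rejected at the 5% level.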

81
Q

What are the 4 levels of measurement of data?

A

Nominal, ordinal, interval and ratio

82
Q

What do larger samples provide?

A

More reliable results
