Introduction to Statistics Flashcards

0
Q

Ordinal

A

Categorical variable that can be ordered

Finishing positions in a race (1st, 2nd, 3rd, 4th), or ranking one's qualifications

1
Q

Nominal

A

Categorical variable that cannot be ordered

Male / female, religious group, ethnic group

2
Q

Population

A

The entire group (all of the cases) that we are interested in

3
Q

Interval

A

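Metric variable where numbers are used to label and order, and the intervals between the numbers are equal. Zero is an arbitrary point, not the absence of something.
Temperature in Celsius or Fahrenheit: a 10-degree interval means the same anywhere on the scale.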

4
Q

Ratio

A

Metric variable where numbers are used to label and order, and zero means the absence of something
Age, or number of correct answers in a test

5
Q

Sample

A

A subset of the population, ideally representative of the population

6
Q

Sampling Bias

A

Any effect that makes our sample, and therefore our results, unrepresentative of the population

7
Q

Proportion Calculation

A

Frequency divided by total number

8
Q

Variable

A

Anything we want to measure that varies, such as age, gender or vehicle type

9
Q

Metric Variable

A

A variable whose values occur naturally as numbers

10
Q

Categorical Variable

A

A variable whose values can be put into groups; any numbers assigned to the groups are arbitrary labels

11
Q

Frequency

A

How many observations fall in each group

12
Q

Valid percent

A

The percentage calculated without counting the missing values. Always quote the valid percent.
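
The frequency, proportion and valid percent cards can be illustrated with a short sketch. The survey responses below are made up for illustration, and `None` is an assumed marker for a missing answer.

```python
from collections import Counter

# Hypothetical survey responses; None marks a missing answer.
responses = ["car", "bus", "car", None, "bike", "car", None, "bus"]

valid = [r for r in responses if r is not None]
frequency = Counter(valid)  # how many in each group

# Percent uses every case (including missing); valid percent uses only valid cases.
percent = {g: 100 * n / len(responses) for g, n in frequency.items()}
valid_percent = {g: 100 * n / len(valid) for g, n in frequency.items()}

for group, count in frequency.items():
    print(f"{group}: n = {count}, percent = {percent[group]:.1f}, "
          f"valid percent = {valid_percent[group]:.1f}")
```

The proportion is the same calculation without multiplying by 100.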

13
Q

Descriptive Statistics

A

Numbers and graphs that summarise a variable in the best way we can, such as frequencies, the mean or the standard deviation

14
Q

Which procedure is used for categorical data?

A

Frequencies

15
Q

Which procedure is used for metric data?

A

Explore procedure

16
Q

The Mean

A

The average

Add up all the numbers, then divide by how many there are

17
Q

The Median

A

The middle number when the data are put in order, or the 50% point

18
Q

Standard Deviation

A

How spread out the data are around the mean. The larger the number, the more spread out the data.

19
Q

Minimum

A

The smallest number

20
Q

Maximum

A

The largest number

21
Q

Mode

A

The most commonly occurring value
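
The descriptive statistics on the preceding cards (mean, median, standard deviation, minimum, maximum, mode) could be computed along these lines. This is a minimal sketch using Python's `statistics` module on made-up scores; `stdev` is the sample standard deviation (n - 1 divisor), which is assumed to be the one intended here.

```python
import statistics

scores = [4, 7, 7, 8, 10, 12, 15]  # hypothetical test scores

mean = statistics.mean(scores)      # sum divided by count
median = statistics.median(scores)  # middle value of the ordered data
mode = statistics.mode(scores)      # most common value
sd = statistics.stdev(scores)       # sample standard deviation (n - 1 divisor)

print(f"mean = {mean}, median = {median}, mode = {mode}")
print(f"s = {sd:.2f}, min = {min(scores)}, max = {max(scores)}")
```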

22
Q

Histogram

A

A graph used to show the distribution of metric data

23
Q

Percentiles

A

The percentage of observations that are less than the stated value

24
Q

Normal Distribution

A

Bell curve,
Symmetric distribution,
Mean in the centre,
Area under the bell curve represents probabilities

25
Q

68-95-99.7% Rule

A

For normally distributed data:
One standard deviation either side of the mean captures about 68% of the data,
Two standard deviations either side of the mean capture about 95% of the data,
Three standard deviations either side of the mean capture about 99.7% of the data.
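
The three percentages can be checked against the normal curve itself. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The rule is a rounded version of these areas, so it only holds approximately, and only when the data are close to normal.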

26
Q

What is the z-value?

A

The number of standard deviations a value lies away from the mean.

27
Q

Z score formula

A

Value of interest minus the mean, divided by the standard deviation: z = (value - mean) / standard deviation.

28
Q

When is a z score unusual?

A

When the value is more than two standard deviations from the mean (z below -2 or above 2).
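
A small sketch of the z-score calculation and the "more than two standard deviations" check. The function names and the example numbers are illustrative assumptions, not part of the course material.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

def is_unusual(z, cutoff=2.0):
    """Flag z-scores more than `cutoff` standard deviations from the mean."""
    return abs(z) > cutoff

# Hypothetical example: a score of 135 where the mean is 100 and the SD is 15.
z = z_score(135, mean=100, sd=15)
print(f"z = {z:.2f}, unusual: {is_unusual(z)}")
```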

29
Q

Variance

A

Takes into account all of the data, not just the two end points (as the range does).
Variance looks at how much each individual score differs from the mean, squares those differences, then averages them. The standard deviation is the square root of the variance.
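
The steps can be followed by hand in a few lines. One point worth hedging: "averaging" over n gives the population variance, while statistical software usually reports the sample variance, which divides by n - 1. The sketch below shows both on made-up scores.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
mean = sum(scores) / len(scores)

squared_deviations = [(x - mean) ** 2 for x in scores]

# Averaging over n gives the population variance;
# dividing by n - 1 gives the sample variance.
population_variance = sum(squared_deviations) / len(scores)
sample_variance = sum(squared_deviations) / (len(scores) - 1)

print(population_variance, statistics.pvariance(scores))
print(sample_variance, statistics.variance(scores))
```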

30
Q

In percentiles, what is the median?

A

The 50th percentile (the 50% point)

31
Q

In percentiles what is the first quartile?

A

The 25th percentile

32
Q

In percentiles what is the third quartile?

A

The 75th percentile
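
The quartiles and the median can be read off together. This sketch uses `statistics.quantiles` on made-up data; statistical packages use slightly different interpolation rules, so their quartiles may differ a little from these.

```python
import statistics

scores = [52, 55, 60, 61, 64, 67, 70, 73, 78, 85, 91]  # hypothetical data

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = statistics.quantiles(scores, n=4)

print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}")
```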

33
Q

Reporting Categorical Data

A

Sample Size, sample proportion / percentage, 95% confidence interval, anything else of interest

34
Q

Reporting Metric Data

A

Shape, centre (mean / median), Spread, Outliers

35
Q

What is Inference?

A

Taking information from a sample and using it to draw conclusions about the population.

36
Q

What is a hypothesis?

A

A research question turned into a statement. A hypothesis is not a question; it is a statement to be tested.

37
Q

What is a binomial test?

A

Looks at categorical data with exactly two categories; compares a sample percentage / proportion to a fixed value
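
A minimal sketch of an exact two-sided binomial test, written from scratch with `math.comb`. The rule used for "more extreme" (adding up every outcome no more likely than the observed one) is one common convention and is an assumption here; the counts are made up.

```python
from math import comb

def binomial_test(successes, n, p0):
    """Exact two-sided binomial test of a sample proportion against p0."""
    def prob(k):
        return comb(n, k) * p0**k * (1 - p0) ** (n - k)

    observed = prob(successes)
    # Add up every outcome that is no more likely than the observed one.
    return sum(prob(k) for k in range(n + 1) if prob(k) <= observed * (1 + 1e-9))

# Hypothetical data: 15 of 20 respondents say yes; compare with a fixed value of 50%.
p_value = binomial_test(15, 20, 0.5)
print(f"proportion = {15 / 20:.2f}, p = {p_value:.3f}")
```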

38
Q

What is a one-sample t-test?

A

For metric data; compares a sample mean to a fixed value.
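
The test statistic behind this comparison can be sketched as follows. The function name and data are hypothetical. Only t and the degrees of freedom are computed; the p-value needs the t distribution, which the Python standard library does not provide (a statistics package such as SciPy would normally supply it).

```python
import math
import statistics

def one_sample_t(sample, fixed_value):
    """Return the t statistic and degrees of freedom for a one-sample t-test."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t = (statistics.mean(sample) - fixed_value) / standard_error
    return t, n - 1

# Hypothetical data compared with a fixed value of 4.
t, df = one_sample_t([2, 4, 6, 8, 10], fixed_value=4)
print(f"t({df}) = {t:.2f}")
```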

39
Q

What is the structure of a report?

A

Hypothesis - what is being measured in the sample?
Sample - sample size; who is in the sample?
Comparison -
Name of test -
Quote test statistics - if significant, include the 95% confidence interval
Conclusion - use appropriate language

40
Q

What do we include when quoting the mean?

A

Standard deviation (s= )

41
Q

When is a p-value significant?

A

By convention, when it is below .05 (p < .05)

42
Q

What do we include when reporting a t value?

A

t-value -
Degrees of freedom (df) -
P value -

t(115) = 2.453, p = .016

43
Q

What is a p-value?

A

The p-value is the probability, assuming the null hypothesis is true, that our test statistic takes the observed value or a value more extreme.
The smaller the p-value, the stronger the evidence against the null hypothesis.

44
Q

How is the p-value quoted?

A

Without the zero in front of the decimal point.
Always quote three decimal places, e.g. p = .115.
Only use the less-than sign (<) when the value is below .001 (p < .001).
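
These quoting rules are mechanical enough to put in a helper. The function name `format_p` is made up for this sketch.

```python
def format_p(p):
    """Format a p-value to three decimals with no leading zero."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")

for value in (0.115, 0.016, 0.0004):
    print(format_p(value))
```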

45
Q

What is sampling variation?

A

The differences in results from one sample to the next when samples are drawn from the same population.

46
Q

What are the underlying propositions of sampling theory?

A

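Normal distribution - the sampling distribution is approximately normal when the sample is large enough
The mean of the sampling distribution of the sample proportion (or sample mean) equals the population proportion (or population mean)
The standard deviation of the sampling distribution depends on the size of the sample: the larger the sample, the smaller the spread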
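
How the spread of the sampling distribution shrinks with sample size can be sketched with the usual standard error formulas, sd / sqrt(n) for a mean and sqrt(p(1 - p) / n) for a proportion. The function names and population values are assumptions for illustration.

```python
import math

def se_mean(sd, n):
    """Standard deviation of the sampling distribution of a sample mean."""
    return sd / math.sqrt(n)

def se_proportion(p, n):
    """Standard deviation of the sampling distribution of a sample proportion."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical population: proportion .5, standard deviation 10.
for n in (25, 100, 400):
    print(n, se_mean(10, n), round(se_proportion(0.5, n), 3))
```

Quadrupling the sample size halves the standard error in both cases.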

47
Q

What is in the centre of a sampling distribution?

A

The proportion in the population

48
Q

In sampling distribution, what does the p-value represent?

A

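The area in the tails of the sampling distribution beyond the observed result. The result is significant when it falls in the area outside the 95% markers (the outer 5%).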

49
Q

What defines experimental design?

A

When the researcher is able to manipulate the independent variable (IV).

We can then draw causal conclusions.

50
Q

What defines observational design?

A

We are just observing what happens,
Not manipulating the IV.
No causal conclusions.

51
Q

When is observational research conducted?

A

When the researcher is unable to conduct an experimental study,

or when an experiment would be unethical.

52
Q

What can we determine from observational (correlational) design?

A

We cannot determine something for certain,

We cannot make definitive statements.

53
Q

What is a nuisance variable?

A

A variable, other than the IV, that correlates with (might affect) the dependent variable (DV).
The IV is NEVER a nuisance variable.
A nuisance variable must vary.

54
Q

What is a subject nuisance variable?

A

Associated with the participant: age, gender, driving experience, etc.

55
Q

What is a situational nuisance variable?

A

Associated with the conditions of the experiment.

56
Q

What is repeated measures design?

A

When the same participants are used for both conditions.

57
Q

What is matched pairs design?

A

Two separate groups in which each participant is matched with a participant in the other group who is as similar as possible.

58
Q

What is independent groups design?

A

Participants are randomly allocated to separate groups.

59
Q

What is the best way to deal with nuisance variables?

A

To hold them constant.

60
Q

What is a confounding factor?

A

A variable that alters the logic of the experiment by being correlated with both the IV and the DV.

61
Q

What is simple random sampling?

A

A sample in which every member of the population has an equal chance of being selected. Random numbers are drawn, from an arbitrary starting position, to select the sample.

62
Q

What is stratified sampling?

A

Where the population comprises subgroups (strata) and a sample is drawn from each subgroup.

63
Q

What is multi stage sampling?

A

Where we combine different sampling methods.

64
Q

What is cluster sampling?

A

The population falls into some kind of natural (ideally homogeneous) groups (clusters).
E.g. if the population is all Victorians, a cluster would be a local government area. Sample within the cluster.

65
Q

What is systematic sampling?

A

From a random starting point, sample every Kth item.
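
A minimal sketch of systematic sampling with list slicing. The function name, the optional `start` argument and the sampling frame are assumptions for illustration.

```python
import random

def systematic_sample(items, k, start=None):
    """Take every kth item, beginning at a random position in the first k items."""
    if start is None:
        start = random.randrange(k)  # random starting point: 0 .. k-1
    return items[start::k]

population = list(range(1, 21))  # hypothetical sampling frame of 20 people
print(systematic_sample(population, k=5))
```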

66
Q

What are we looking for in DV and IV?

A

Cause and effect

67
Q

What is another name for prediction?

A

Hypothesis

68
Q

What is another word for correlational design?

A

Observational design.

69
Q

How do nuisance variables affect our results?

A

They mask or hide the effects of the independent variable,

They destroy the logic of an experiment.

70
Q

What is an independent samples t-test?

A

Compares the sample means of two independent groups in order to make an inference about the difference between the population means

71
Q

What are the assumptions of an independent samples t-test?

A

DV is metric,
Independence of observations,
Both samples must come from normally distributed populations,
Equal variance: both samples should have a similar spread
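
Under the equal variance assumption, the test statistic pools the two sample variances. The sketch below computes t and the degrees of freedom only; the function name and scores are hypothetical, and the p-value would come from the t distribution in a statistics package.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_variance = ((n_a - 1) * statistics.variance(group_a)
                       + (n_b - 1) * statistics.variance(group_b)) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df

# Hypothetical scores for two separate groups.
t, df = independent_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(f"t({df}) = {t:.2f}")
```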

72
Q

What does a t-value round to?

A

The t value is rounded to two decimal places

73
Q

When quoting the 95% confidence interval, how is it rounded?

A

In line with the rounding of the sample statistic (the same number of decimal places).

74
Q

What is a paired samples t-test?

A

Used to test the mean difference between two conditions when we have a repeated measures or matched pairs research design.

75
Q

What do we want to infer from a sample?

A

Something about the population

76
Q

How do we report a paired samples t-test?

A

With the mean for each condition first, then the sample mean difference (x̄d).
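
The quantities being reported can be sketched as below: the two means, the mean of the paired differences, and the t statistic built from those differences. The function name and the scores are made up for illustration.

```python
import math
import statistics

def paired_t(first, second):
    """Mean difference, t statistic and degrees of freedom for paired scores."""
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    mean_difference = statistics.mean(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return mean_difference, mean_difference / standard_error, n - 1

# Hypothetical scores for the same four participants under two conditions.
before = [10, 12, 14, 16]
after = [11, 14, 15, 18]
mean_d, t, df = paired_t(before, after)
print(f"means: {statistics.mean(before)}, {statistics.mean(after)}; "
      f"mean difference = {mean_d}, t({df}) = {t:.2f}")
```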

77
Q

What does the 95% confidence interval allow us to do?

A

Infer about the population

78
Q

What indicates significance?

A

The p-value (below .05); the means show the direction of the difference

79
Q

What are the assumptions of a paired samples t-test?

A

Metric data
Independence of observations (between pairs)
Normality (of the differences)

80
Q

What does a p value represent?

A

The probability of getting a sample result at least as extreme as the one observed if the null hypothesis about the population were true.

81
Q

What is correlation?

A

Looking at the relationship between two metric variables

82
Q

Where is the independent variable on the scatterplot?

A

On the x axis (horizontal)

83
Q

Where is the dependent variable on the scatterplot?

A

On the y axis (vertical)

84
Q

How do we describe scatterplots?

A

Direction
Form
Strength
Outliers

85
Q

What does correlation not mean?

A

Causation

86
Q

Pearson’s r correlation coefficient?

A

A measure of the strength and direction of the linear association between two metric variables, ranging from -1 to +1
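
For readers who want to see the calculation, here is a minimal from-scratch sketch of Pearson's r. The function name and the data (hours studied and test score) are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    covariation = sum(a * b for a, b in zip(dx, dy))
    return covariation / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Hypothetical data: hours studied (IV) and test score (DV).
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 70, 72]
r = pearson_r(hours, score)
print(f"r = {r:.3f}, r squared = {r * r:.3f}")
```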

87
Q

When can’t Pearson’s r be applied?

A

When the form is non linear (curved)

88
Q

Coefficient of determination (R²)?

A

Tells us more about the relationship between two variables: the proportion of the variation in one variable that is explained by the other

89
Q

Coefficient of determination (R²) formula?

A

R² = r × r (r squared)
Example: r = .123 gives R² = .123 × .123 ≈ .015

90
Q

How do we interpret R²?

A

As the percentage of variation explained.
Examples: R² = .085 means 8.5%,
R² = .123 means 12.3%

91
Q

How do we interpret the 95% confidence interval for correlations?

A

Indicates that in the population the strength of the linear relationship is between …

92
Q

Spurious Correlation

A

Where we have a strong correlation between two variables that does not make sense as a direct link; it is sometimes explained by a third factor.

93
Q

The symbol for correlation in the population?

A

Rho (ρ),

which looks like a p

94
Q

Positive relationship?

A

Upwards from left to right.

More of IV means more of DV

95
Q

Negative relationship?

A

Downwards from left to right

More of IV means less of DV

96
Q

What is the strength of a relationship?

A

An indication of how well you can predict the value of the DV when you know the value of the IV

97
Q

What is this p value telling me? p = .005

A

That there are 5 chances in 1,000 of getting a result this extreme if the null hypothesis is true

98
Q

What is this p value telling me? p = .050

A

That there are 5 chances in 100 of getting a result this extreme if the null hypothesis is true

99
Q

What is the regression equation?

A

Y = a + b × X

100
Q

What does Y represent in regression equation?

A

Dependent variable

101
Q

What does a represent in regression equation?

A

A constant known as the vertical intercept

102
Q

What does b represent in the regression calculation?

A

Slope or regression coefficient.

103
Q

What does X represent in regression calculation?

A

Independent variable.
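
To connect the four symbols, here is a sketch that estimates a and b by ordinary least squares and then uses the equation to predict Y. Least squares is assumed as the fitting method (the cards do not name one), and the function name and data are made up.

```python
def fit_line(x, y):
    """Least-squares estimates of a (intercept) and b (slope) in Y = a + b * X."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x
    return a, b

# Hypothetical data: X is the independent variable, Y the dependent variable.
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(f"Y = {a:.2f} + {b:.2f} * X")
print("predicted Y when X = 5:", a + b * 5)
```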

104
Q

Regression?

A

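A method for describing the linear relationship between two metric variables with an equation, so that the DV can be predicted from the IV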

105
Q

What is the constant?

A

The vertical intercept

106
Q

In a report, what is the conclusion trying to tell us?

A

What the evidence from the test allows us to conclude about the hypothesis.

107
Q

What is a causal conclusion?

A

That a change in the IV will produce a change in the DV

108
Q

What is χ²?

A

Chi-squared (pronounced "kai"): a test of the relationship between two categorical variables.

109
Q

What does the chi squared test measure?

A

The relationship between two categorical variables, rather than the difference between groups that some other tests measure.
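
A sketch of how the chi-squared statistic is built from a table of counts: each observed count is compared with the count expected if the two variables were unrelated. The function name and the counts are hypothetical, and the p-value (from the chi-squared distribution) is left to a statistics package.

```python
def chi_squared(table):
    """Chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical counts: rows are two groups, columns are yes / no answers.
statistic, df = chi_squared([[10, 20], [20, 10]])
print(f"chi-squared({df}) = {statistic:.2f}")
```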

110
Q

What is a parametric test?

A

There is a specific population parameter that we are trying to estimate using the sample statistic

111
Q

What is a non-parametric test?

A

A test that does not estimate a specific population parameter from the sample and makes fewer assumptions about the population distribution. It simply tests for significance.