Stats Year 1 Flashcards

1
Q

What are the 5 things that must be in a hypothesis test

A
  • Null hypothesis
  • Alternative hypothesis
  • Test statistic
2
Q

Define the significance level

A

The probability of incorrectly rejecting the null hypothesis

3
Q

Define a population

A

The whole set of items that are of interest

4
Q

Define census

A

Observes or measures every member of a population

5
Q

Define sample

A

A selection of observations taken from a subset of the population which is used to find out information about the population as a whole

6
Q

Define sampling units

A

Individual units of a population

7
Q

What is a sampling frame

A

Sampling units of a population that are individually named or numbered to form a list

8
Q

What is an advantage of a census

A

It should give a completely accurate result

9
Q

What are the 3 disadvantages of a census

A
  • Time consuming and expensive
  • Cannot be used when the testing process destroys the item
  • Hard to process large quantity of data
10
Q

What are the 3 advantages of using a sample

A
  • Less time consuming and expensive than a census
  • Fewer people have to respond
  • Less data to process than in a census
11
Q

What are the 2 disadvantages of using a sample

A
  • The data may not be as accurate
  • The sample may not be large enough to give information about small sub-groups of the population
12
Q

What is a simple random sample

A

Where every possible sample of a given size has an equal chance of being selected

13
Q

How do you carry out a simple random sample

A
  • Allocate each person or thing in the sampling frame a unique number
  • Use a random number generator to select the required number of them
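
The two steps above can be sketched in Python. This is only an illustration: the sampling frame of 50 members, the sample size of 5 and the fixed seed are all assumptions made for the example.

```python
import random

# Hypothetical sampling frame: 50 members, each allocated a unique number 1..50.
sampling_frame = list(range(1, 51))

# A seeded generator is used only so the sketch is repeatable; omit the seed in real use.
rng = random.Random(1)

# Draw 5 distinct numbers; every possible set of 5 is equally likely to be chosen.
chosen = rng.sample(sampling_frame, 5)
print(sorted(chosen))
```

The members whose numbers appear in `chosen` form the sample.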
14
Q

What is systematic sampling

A

Required elements are chosen at regular intervals from an ordered list, e.g. every 5th item is selected
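
A minimal Python sketch of this idea follows. The function name `systematic_sample`, the list of 100 items and the random starting point within the first interval are assumptions for illustration; the card itself only specifies the regular interval.

```python
import random

def systematic_sample(ordered_list, sample_size, rng=random):
    """Pick every k-th item, starting from a random position in the first k items."""
    k = len(ordered_list) // sample_size   # sampling interval
    start = rng.randrange(k)               # random start index in 0..k-1
    return [ordered_list[start + i * k] for i in range(sample_size)]

frame = list(range(1, 101))             # hypothetical ordered list of 100 items
sample = systematic_sample(frame, 20)   # interval k = 100 // 20 = 5
print(sample)
```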

15
Q

What is stratified sampling

A

The population is divided into mutually exclusive strata (males and females for example) and a random sample is taken from each

16
Q

What is the equation to calculate how many people/things should be sampled from each stratum in stratified sampling

A

(Number in stratum ÷ number in population) × overall sample size
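
As a worked sketch of this formula, the Python below allocates a sample across two strata. The stratum sizes (120 and 80) and the overall sample size of 50 are made-up figures, and rounding to the nearest whole number is an assumption for cases where the formula does not give an integer.

```python
def stratum_sample_size(stratum_count, population_count, overall_sample_size):
    """(number in stratum / number in population) x overall sample size."""
    return stratum_count / population_count * overall_sample_size

# Hypothetical population split into two mutually exclusive strata.
strata = {"males": 120, "females": 80}
population = sum(strata.values())   # 200

# Round because you cannot sample a fraction of a person.
allocation = {
    name: round(stratum_sample_size(count, population, 50))
    for name, count in strata.items()
}
print(allocation)   # {'males': 30, 'females': 20}
```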

17
Q

What are the 3 advantages of simple random sampling

A
  • Free of bias
  • Easy and cheap to implement for small populations and small samples
  • Each sampling unit has a known and equal chance of selection
18
Q

What are the 2 disadvantages of simple random sampling

A
  • Not suitable when the population size or the sample size is large
  • A sampling frame is needed
19
Q

What are the 2 advantages of systematic sampling

A
  • Simple and quick to use
  • Suitable for large samples and large populations
20
Q

What are the 2 disadvantages of systematic sampling

A
  • A sampling frame is needed
  • It can introduce bias if the sampling frame is not random
21
Q

What are the 2 advantages of stratified sampling

A
  • Sample accurately reflects the population structure
  • Guarantees proportional representation of groups within a population
22
Q

What are the 2 disadvantages of stratified sampling

A
  • Population must be clearly classified into distinct strata
  • Selection within each stratum suffers from the same disadvantages as simple random sampling
23
Q

What is quota sampling

A

An interviewer or researcher selects a sample that reflects the characteristics of the whole population

24
Q

What is opportunity sampling

A

Consists of taking the sample from people who are available at the time the study is carried out and who fit the criteria you are looking for

25
Q

What are the 4 advantages of quota sampling

A
  • Allows a small sample to still be representative of the population
  • No sampling frame required
  • Quick, easy and inexpensive
  • Allows for easy comparison between different groups within a population
26
Q

What are the 4 disadvantages of quota sampling

A
  • Non-random sampling can introduce bias
  • Population must be divided into groups, which can be costly or inaccurate
  • Increasing scope of study increases number of groups, which adds time and expense
  • Non-responses are not recorded as such
27
Q

What are the 2 advantages of opportunity sampling

A
  • Easy to carry out
  • Inexpensive
28
Q

What are the 2 disadvantages of opportunity sampling

A
  • Unlikely to provide a representative sample
  • Highly dependent on individual researcher
29
Q

What are variables that are associated with numerical observations called

A

Quantitative variables

30
Q

What are variables associated with non-numerical observations called

A

Qualitative variables

31
Q

What is a continuous variable

A

A variable that can take any value in a given range

32
Q

What is a discrete variable

A

A variable that can take only specific values in a given range

33
Q

What is the mode

A

Value that occurs the most

34
Q

What is the median

A

The middle value

35
Q

How do you calculate the variance

A

Mean of the squares minus the square of the mean

36
Q

How do you calculate standard deviation from the variance

A

Square root it
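
The two calculations above can be checked with a short Python sketch. The data set is made up for illustration, and the variance here divides by n (the form the "mean of the squares minus the square of the mean" rule describes).

```python
from math import sqrt

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations
n = len(data)

mean = sum(data) / n                             # 40 / 8 = 5
mean_of_squares = sum(x * x for x in data) / n   # 232 / 8 = 29
variance = mean_of_squares - mean ** 2           # 29 - 25 = 4
standard_deviation = sqrt(variance)              # 2

print(variance, standard_deviation)   # 4.0 2.0
```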

37
Q

What are the 3 common definitions of an outlier if not stated in the question

A
  • Greater than upper quartile + 1.5 × IQR
  • Less than lower quartile - 1.5 × IQR
  • Outside the mean plus or minus 2 standard deviations
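
The quartile-based rule can be sketched in Python as below. The data and the quartile values (13 and 19) are assumed for the example; the quartiles are taken as given because different conventions for finding them give slightly different values.

```python
def quartile_fences(lower_quartile, upper_quartile):
    """Return the lower and upper outlier boundaries using 1.5 x IQR."""
    iqr = upper_quartile - lower_quartile
    return lower_quartile - 1.5 * iqr, upper_quartile + 1.5 * iqr

data = [3, 12, 14, 15, 15, 17, 18, 20, 41]   # hypothetical observations
q1, q3 = 13, 19                              # quartiles assumed to be given

low, high = quartile_fences(q1, q3)          # 13 - 9 = 4 and 19 + 9 = 28
outliers = [x for x in data if x < low or x > high]
print(low, high, outliers)   # 4.0 28.0 [3, 41]
```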
38
Q

What process is known as cleaning the data

A

Process of removing anomalies from a data set

39
Q

How do you calculate the frequency density

A

Frequency ÷ class width
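
For example, with some made-up grouped data (the class boundaries and frequencies below are assumptions), the frequency density of each class could be computed like this:

```python
# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
classes = [(0, 10, 5), (10, 20, 12), (20, 40, 8)]

# Frequency density = frequency / class width, worked out class by class.
densities = [freq / (upper - lower) for lower, upper, freq in classes]
print(densities)   # [0.5, 1.2, 0.4]
```

Note that the widest class (20 to 40) has the lowest density even though its frequency is not the smallest, which is why histograms plot density rather than frequency.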

40
Q

When comparing data what 2 things must you compare

A
  • A measure of location
  • A measure of spread
41
Q

What things are measures of location

A
  • Mode
  • Mean
  • Median
  • Quartiles
  • Percentiles
42
Q

What things are measures of spread

A
  • Range
  • Interquartile range
  • Variance
  • Standard deviation
43
Q

What are the 5 possible correlation descriptions

A
  • Strong negative correlation
  • Weak negative correlation
  • No correlation
  • Weak positive correlation
  • Strong positive correlation
44
Q

Is the explanatory variable dependent or independent

A

Independent

45
Q

Where does the explanatory variable go, x or y axis

A

x axis

46
Q

Is the response variable dependent or independent

A

Dependent

47
Q

On which axis should you plot the response variable

A

y axis

48
Q

What type of relationships do the variables have if a change in one causes a change in the other

A

Causal relationship

49
Q

What is the equation of a regression line

A

y=a+bx
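
The card gives only the form of the line. As a hedged sketch, the Python below finds a and b by least squares, which is an assumption about the fitting method rather than something stated on the card; the data points are made up and lie exactly on y = 1 + 2x.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line y = a + bx fitted by least squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # gradient
    a = y_bar - b * x_bar    # y-intercept
    return a, b

xs = [1, 2, 3, 4]            # explanatory variable (hypothetical)
ys = [3, 5, 7, 9]            # response variable (hypothetical)
a, b = least_squares_line(xs, ys)
print(a, b)   # 1.0 2.0
```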

50
Q

What is an experiment

A

A repeatable process that gives rise to a number of outcomes

51
Q

What is an event

A

A collection of one or more outcomes

52
Q

What is a sample space

A

The set of all possible outcomes

53
Q

What is the term used to describe when events have no outcomes in common

A

Mutually exclusive

54
Q

For mutually exclusive events, how do you calculate P(A or B)

A

P(A)+P(B)

55
Q

For independent events, how do you calculate P(A and B)

A

P(A) × P(B)

56
Q

How do you check whether events are independent

A

Check whether P(A and B) = P(A) × P(B)

57
Q

How do you check whether events are mutually exclusive

A

Check whether P(A or B) = P(A) + P(B), which is the same as P(A and B) = 0
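
Both checks can be tried out on a small sample space. The sketch below assumes a fair six-sided die and uses hypothetical helper names (`prob`, `independent`, `mutually_exclusive`); exact fractions avoid rounding problems when comparing probabilities.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair die (assumed)

def prob(event):
    """Probability of an event, given equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

def independent(a, b):
    return prob(a & b) == prob(a) * prob(b)

def mutually_exclusive(a, b):
    return prob(a | b) == prob(a) + prob(b)

even, odd, at_most_two = {2, 4, 6}, {1, 3, 5}, {1, 2}

print(independent(even, at_most_two))          # True: 1/6 = 1/2 x 1/3
print(mutually_exclusive(even, at_most_two))   # False: they share the outcome 2
print(mutually_exclusive(even, odd))           # True: no outcomes in common
print(independent(even, odd))                  # False: 0 is not 1/2 x 1/2
```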

58
Q

When using the binomial cumulative distribution (CD) function on your calculator, what must you remember it is calculating

A

P(X ≤ x), the probability of getting a value less than or equal to x
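
To see what the cumulative function is adding up, here is a minimal Python sketch. The names `binomial_pd` and `binomial_cd` are hypothetical helpers written for this example, and the distribution B(10, 0.5) is an assumed illustration.

```python
from math import comb

def binomial_pd(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_cd(x, n, p):
    """P(X <= x): the sum of P(X = k) for k = 0, 1, ..., x."""
    return sum(binomial_pd(k, n, p) for k in range(x + 1))

# Hypothetical example: X ~ B(10, 0.5)
p_at_most_3 = binomial_cd(3, 10, 0.5)        # P(X <= 3) = 176/1024
p_at_least_4 = 1 - binomial_cd(3, 10, 0.5)   # P(X >= 4) = 1 - P(X <= 3)
print(p_at_most_3, p_at_least_4)   # 0.171875 0.828125
```

Because the function gives "less than or equal to", a probability such as P(X ≥ 4) has to be found as 1 - P(X ≤ 3).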

59
Q

What is the basic notation for a binomial distribution

A

X ~ B(n, p)
- n = number of trials
- p = probability of success in each trial

60
Q

What is the null hypothesis

A

The hypothesis that you assume to be correct

61
Q

What is the alternative hypothesis

A

Tells you about the parameter if your assumption is shown to be wrong

62
Q

When dealing with a 2-tailed test, what happens to the significance level

A

It is halved, with one half assigned to each tail of the distribution
e.g. for a 5% significance level, 2.5% goes to the upper tail and 2.5% to the lower tail

63
Q

What is the rejection/ critical region

A

The region of the probability distribution which, if the test statistic falls within it, would cause you to reject the null hypothesis

64
Q

What is the critical value

A

The first value to fall inside the rejection region

65
Q

What is the actual significance level

A

The probability of incorrectly rejecting the null hypothesis, calculated from the critical region actually used. The significance level given in the question is rarely the exact % you find from the calculator
- Say the question gives a 5% significance level and the critical region has a probability of 4.4%, then the actual significance level is 4.4%
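
A one-tailed example can be sketched in Python. Everything specific here is an assumption for illustration: the test is lower-tailed, the null hypothesis is p = 0.3 with n = 20 trials, the significance level is 5%, and `lower_critical_value` is a hypothetical helper.

```python
from math import comb

def binomial_cd(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def lower_critical_value(n, p, significance_level):
    """Largest x with P(X <= x) <= significance level, or None if there is none."""
    critical = None
    for x in range(n + 1):
        if binomial_cd(x, n, p) <= significance_level:
            critical = x
        else:
            break
    return critical

critical_value = lower_critical_value(20, 0.3, 0.05)
actual_significance = binomial_cd(critical_value, 20, 0.3)
print(critical_value, round(actual_significance, 4))
```

Here the critical region is X ≤ 2, and its probability (about 3.5%) is the actual significance level, which is below the 5% stated for the test.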

66
Q

How many rejection regions are there in a 2-tailed test

A

2, one at each end of the distribution
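
As an illustration of the two regions, the sketch below finds both critical values for an assumed two-tailed test with null hypothesis p = 0.5, n = 20 trials and a 5% significance level, allowing 2.5% in each tail. The function names are hypothetical.

```python
from math import comb

def binomial_pd(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def two_tailed_critical_values(n, p, significance_level):
    """Return (lower, upper) critical values with at most half the level in each tail."""
    tail = significance_level / 2
    lower, cumulative = None, 0.0
    for x in range(n + 1):              # build up P(X <= x) from the bottom
        cumulative += binomial_pd(x, n, p)
        if cumulative > tail:
            break
        lower = x
    upper, cumulative = None, 0.0
    for x in range(n, -1, -1):          # build up P(X >= x) from the top
        cumulative += binomial_pd(x, n, p)
        if cumulative > tail:
            break
        upper = x
    return lower, upper

lower, upper = two_tailed_critical_values(20, 0.5, 0.05)
actual = (sum(binomial_pd(x, 20, 0.5) for x in range(lower + 1))
          + sum(binomial_pd(x, 20, 0.5) for x in range(upper, 21)))
print(lower, upper, round(actual, 4))
```

The critical region is X ≤ 5 or X ≥ 15, and the actual significance level is the combined probability of both tails.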