Stats- DATA Flashcards

1
Q

What are the range and IQR a measure of?

A

A measure of the spread around the median

2
Q

What are variance and SD a measure of?

A

Measure of spread around the mean

3
Q

When using histograms, what is the formula for frequency?

A

Frequency = frequency density × class width (the width of the bars)
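
A minimal sketch of this formula in Python. The class boundaries and frequency densities are made-up values for illustration, and `class_frequency` is a hypothetical helper name.

```python
# Hypothetical histogram classes: (lower bound, upper bound, frequency density)
classes = [(0, 10, 1.5), (10, 15, 3.0), (15, 35, 0.25)]

def class_frequency(lower, upper, density):
    """Frequency = frequency density x class width."""
    return density * (upper - lower)

frequencies = [class_frequency(lo, hi, d) for lo, hi, d in classes]
print(frequencies)  # [15.0, 15.0, 5.0]
```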

4
Q

When using histograms, what information is needed to calculate the mean and SD?

A

The frequency and midpoint of each class
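
A rough sketch of how midpoints and frequencies give an estimated mean and SD. The grouped data is invented for illustration, and the variance here divides by n; some courses divide by n - 1 instead, so treat the divisor as an assumption.

```python
import math

# Hypothetical grouped data: (class midpoint, frequency)
groups = [(5, 4), (15, 10), (25, 6)]

n = sum(f for _, f in groups)                      # total frequency
mean = sum(x * f for x, f in groups) / n           # sum(fx) / n
# Variance with divisor n: sum(f * x^2) / n minus the mean squared
variance = sum(f * x**2 for x, f in groups) / n - mean**2
sd = math.sqrt(variance)

print(n, mean, sd)  # 20 16.0 7.0
```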

5
Q

What are the equations for each quartile?

A

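Position of Q1 = (n+1)/4
Position of Q2 = (n+1)/2
Position of Q3 = 3(n+1)/4
(positions in the ordered data, where n is the number of values)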
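
A small sketch that treats these formulas as positions in the ordered data. The data set is made up, `quartile_positions` and `value_at` are hypothetical helpers, and averaging the two neighbouring values when a position is not a whole number is one convention among several.

```python
def quartile_positions(n):
    """1-based positions of Q1, Q2 and Q3 among n ordered values."""
    return (n + 1) / 4, (n + 1) / 2, 3 * (n + 1) / 4

def value_at(sorted_data, position):
    """Value at a 1-based position; if the position falls between two
    values, take the average of those two values (an assumed convention)."""
    lower = int(position)
    if position == lower:
        return sorted_data[lower - 1]
    return (sorted_data[lower - 1] + sorted_data[lower]) / 2

data = sorted([3, 7, 8, 5, 12, 14, 21])
positions = quartile_positions(len(data))
quartiles = [value_at(data, p) for p in positions]
print(positions)  # (2.0, 4.0, 6.0)
print(quartiles)  # [5, 8, 14]
```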

6
Q

What is the equation for the IQR?

A

Q3 - Q1

7
Q

When is removing outliers useful, and when is it not?

A

Useful if the outliers are data errors
Not useful if they are genuine data

8
Q

What are the conditions for a data value to be an outlier?

A

Using the common 1.5 × IQR rule, x is an outlier if:
x > Q3 + 1.5(Q3 - Q1)
x < Q1 - 1.5(Q3 - Q1)
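
A sketch of an outlier check built on the quartiles. It assumes the common convention of fences 1.5 × IQR beyond the quartiles (the `multiplier` parameter can be changed if a question specifies something else), and it uses `statistics.quantiles` with its default "exclusive" method to find the quartiles. The data and the name `find_outliers` are illustrative only.

```python
import statistics

def find_outliers(data, multiplier=1.5):
    """Return values lying more than multiplier x IQR beyond Q1 or Q3."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # default "exclusive" method
    iqr = q3 - q1
    lower_fence = q1 - multiplier * iqr
    upper_fence = q3 + multiplier * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]

data = [3, 5, 7, 8, 12, 14, 40]
# Here Q1 = 5 and Q3 = 14, so IQR = 9 and the fences are -8.5 and 27.5
print(find_outliers(data))  # [40]
```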

9
Q

When comparing two sets of data, what can you comment on using the median and mean?

A

The average (which data set is higher or lower on average)

10
Q

When comparing two sets of data, what can you comment on using the SD and IQR?

A

The spread of the data (consistency and variability)

11
Q

How can you tell if data is symmetrical, positively skewed or negatively skewed?

A

Symmetrical- mode = median = mean
Q3 - Q2 = Q2 - Q1

Positively skewed- mode < median < mean
Q3 - Q2 > Q2 - Q1

Negatively skewed- mode > median > mean
Q3 - Q2 < Q2 - Q1
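
A sketch of the quartile comparison as code. The three data sets are made up, `quartile_skew` is a hypothetical helper, and the quartiles come from `statistics.quantiles` with its default method, which is only one of several quartile conventions.

```python
import statistics

def quartile_skew(data):
    """Classify skew by comparing Q3 - Q2 with Q2 - Q1."""
    q1, q2, q3 = statistics.quantiles(data, n=4)
    upper, lower = q3 - q2, q2 - q1
    if upper > lower:
        return "positive"
    if upper < lower:
        return "negative"
    return "symmetrical"

print(quartile_skew([1, 2, 3, 4, 5, 6, 7]))        # symmetrical
print(quartile_skew([1, 2, 3, 4, 6, 9, 15]))       # positive
print(quartile_skew([1, 7, 10, 12, 13, 14, 15]))   # negative
```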

12
Q

What is bivariate data?

A

Data that comes in pairs
(X,Y)

13
Q

What is the PMCC and how can it be interpreted?

A

Product moment correlation coefficient
r = 0 means there is no linear correlation

r = -1 means perfect negative correlation
r = +1 means perfect positive correlation

The closer r is to zero, the weaker the correlation
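
A sketch of how r can be computed, using the standard formula r = Sxy / sqrt(Sxx × Syy). The formula is not stated on the card, and the data and the name `pmcc` are illustrative. The last example shows data with a clear V-shaped pattern that still gives r close to 0, because r only measures linear association.

```python
import math

def pmcc(xs, ys):
    """Product moment correlation coefficient of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
r_positive = pmcc(xs, [2, 4, 6, 8, 10])   # points on a rising line
r_negative = pmcc(xs, [10, 8, 6, 4, 2])   # points on a falling line
r_v_shape = pmcc(xs, [2, 1, 0, 1, 2])     # V-shaped, not linear
print(r_positive, r_negative, r_v_shape)
```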

14
Q

What is a regression line? What is its equation?

A

The line of best fit, calculated exactly rather than drawn by eye
y = a + bx
where a is the y-intercept and b is the gradient
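
A sketch of finding a and b, assuming the usual least squares formulas b = Sxy / Sxx and a = (mean of y) - b × (mean of x), which the card does not state. The data and the name `regression_line` are made up for illustration.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least squares line y = a + bx."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # gradient
    a = mean_y - b * mean_x    # y-intercept
    return a, b

a, b = regression_line([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
print(a, b)        # intercept and gradient of the fitted line
print(a + b * 6)   # predicted y when x = 6
```

The gradient b is in units of y per unit of x, so here each extra unit of x adds b units to the predicted y.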

15
Q

How do you interpret the gradient of a regression line?

A

For every increase of 1 (unit of x), y increases/decreases by (the gradient) (units of y)

16
Q

On regression lines, what are the units for the gradient?

A

The units of y per the units of x

17
Q

What are the three types of random sampling?

A

Simple random
Stratified
Systematic

18
Q

What is a simple random sample?

A

Each member of the population is allocated a number, and a random number generator is used to randomly select individuals (ignoring repeats of numbers)
Each member of the population has an equal chance of being selected
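
A minimal sketch of this procedure using Python's `random` module. The sampling frame is a made-up list of numbered members, and `simple_random_sample` is a hypothetical helper. `random.Random.sample` picks without replacement, so repeats never appear.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Pick n distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)   # seed only makes the example repeatable
    return rng.sample(sampling_frame, n)

# Hypothetical sampling frame: members numbered 1 to 50
members = list(range(1, 51))
sample = simple_random_sample(members, 5, seed=1)
print(sample)
```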

19
Q

What are advantages and disadvantages of simple random sampling?

A

Advantages-
Easy and cheap
Removes bias

Disadvantages-
May be time consuming if the population is large
A sampling frame is needed

20
Q

What is stratified sampling?

A

Where a population is divided into distinct groups called strata (singular: stratum). The number sampled from each stratum is in proportion to that stratum's share of the population. Random sampling is then used to select individuals from each stratum
No individual can be in more than one stratum
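
A sketch of proportional allocation followed by random selection within each group. The two year groups and their sizes are invented, and `stratified_sample` is a hypothetical helper. Rounding each group's share can make the total differ slightly from the target sample size for awkward proportions.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Sample from each stratum in proportion to its size."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Share of the sample = share of the population, rounded
        k = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: 60 students in Year 12, 40 in Year 13
strata = {
    "Year 12": [f"Y12-{i}" for i in range(60)],
    "Year 13": [f"Y13-{i}" for i in range(40)],
}
sample = stratified_sample(strata, 10, seed=0)
print({name: len(chosen) for name, chosen in sample.items()})
# {'Year 12': 6, 'Year 13': 4}
```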

21
Q

What are advantages and disadvantages of stratified sampling?

A

Advantages-
The sample reflects the proportions and structure of the population

Disadvantages-
The groups within the population must be very clear

22
Q

What is systematic sampling?

A

Where a sample of size n is chosen from a population of size N.
Use k = N/n
Every kth person is chosen for the sample, starting from a randomly chosen person among the first k
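
A sketch of this method, taking the interval as population size divided by sample size (rounded down) and assuming a random starting point within the first interval. The numbered population and the name `systematic_sample` are illustrative.

```python
import random

def systematic_sample(population, n, seed=None):
    """Take every k-th member after a random start, where k = N // n."""
    rng = random.Random(seed)
    k = len(population) // n      # sampling interval
    start = rng.randrange(k)      # random start within the first k members
    return population[start::k][:n]

# Hypothetical ordered list of 100 members, sample of 10
population = list(range(1, 101))
sample = systematic_sample(population, 10, seed=3)
print(sample)
```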

23
Q

What are advantages and disadvantages of systematic sampling?

A

Advantages-
Simple and quick
Suitable for large populations

Disadvantages-
A sampling frame is needed
It can be biased if the sampling frame is not in random order

24
Q

What is a census?

A

A measurement or observation of every member of the population

25
Q

State three types of non-random sampling

A

Opportunity
Quota
Cluster

26
Q

What is cluster sampling?

A

Where the population is divided into equally sized groups called clusters
One or two clusters are chosen at random to be the sample
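
A sketch of choosing whole clusters at random. The four classes are made-up clusters and `cluster_sample` is a hypothetical helper; everyone in a chosen cluster is included in the sample.

```python
import random

def cluster_sample(clusters, number_of_clusters, seed=None):
    """Pick whole clusters at random; all their members form the sample."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), number_of_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters: four classes of equal size
clusters = {
    "Class A": ["a1", "a2", "a3"],
    "Class B": ["b1", "b2", "b3"],
    "Class C": ["c1", "c2", "c3"],
    "Class D": ["d1", "d2", "d3"],
}
sample = cluster_sample(clusters, 2, seed=5)
print(sample)
```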

27
Q

What is opportunity sampling?

A

Where a researcher samples from people they have easiest access to until their desired sample size is reached

28
Q

What are advantages and disadvantages of opportunity sampling?

A

Advantages- easy and cheap

Disadvantages- not likely to be representative of population. Dependent on the individual researcher

29
Q

What is quota sampling?

A

Where researchers are given a quota of types of people to interview. The quotas are in proportion to the relevant subgroups in the whole population
Opportunity sampling is then used to sample each quota
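
A sketch of filling quotas by opportunity: people are taken in the order the researcher meets them until each quota is full. The groups, quota sizes, names, and the helper `quota_sample` are all invented for illustration.

```python
def quota_sample(arrivals, quotas):
    """Accept people in arrival order until every group's quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person, group in arrivals:
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break   # every quota is full
    return sample

# Hypothetical quotas and the order in which people are encountered
quotas = {"under 30": 2, "30 and over": 1}
arrivals = [
    ("Ana", "under 30"),
    ("Ben", "30 and over"),
    ("Cai", "30 and over"),   # skipped: that quota is already full
    ("Dee", "under 30"),
    ("Eli", "under 30"),      # never reached: all quotas full
]
print(quota_sample(arrivals, quotas))  # ['Ana', 'Ben', 'Dee']
```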

30
Q

What are advantages and disadvantages of quota sampling?

A

Advantages-
Sampling frames are not needed
Quick and easy
Small samples can still be representative of the whole population

Disadvantages-
Can be biased
The quotas can be inaccurate if the population is divided into groups incorrectly