Statistics Flashcards

1
Q

Define population

A

The whole set of items that are of interest

2
Q

Define census

A

Observes or measures every member of a population

3
Q

Define sample

A

A selection of observations taken from a subset of the population which is used to find out information about the population as a whole

4
Q

Advantages and disadvantages of a census and of a sample

A

Advantages of census
• It should give a completely accurate result

Disadvantages of census
• Time consuming and expensive
• Cannot be used when the testing process destroys the item
• Hard to process large quantity of data

Advantages of sample
• Less time consuming and expensive than a census
• Fewer people have to respond
• Less data to process than in a census

Disadvantages of sample
• The data may not be as accurate
• The sample may not be large enough to give information about small subgroups of the population

5
Q

Define sampling units

A

Individual units of a population

6
Q

Define sampling frame

A

List where sampling units of a population are individually named or numbered

7
Q

3 methods of random sampling

A

• Simple random sampling
• Systematic sampling
• Stratified sampling

8
Q

Define and give Advantages and disadvantages of simple random sampling

A

the researcher randomly selects a subset of participants from a population

Advantages
• Free of bias
• Easy and cheap to implement for small populations and small samples
• Each sampling unit has a known and equal chance of selection

Disadvantages
• Not suitable when the population size or the sample size is large as it is potentially time consuming, disruptive and expensive.
• A sampling frame is needed

9
Q

Advantages and disadvantages of systematic sampling

A

Advantages
• Simple and quick to use
• Suitable for large samples and large populations

Disadvantages
• A sampling frame is needed
• It can introduce bias if the sampling frame is not random

10
Q

Advantages and disadvantages of stratified sampling

A

Advantages
• Sample accurately reflects the population structure
• Guarantees proportional representation of groups within a population

Disadvantages
• Population must be clearly classified into distinct strata
• Selection within each stratum suffers from the same disadvantages as simple random sampling

11
Q

Define a simple random sample of size n

A

Every sample of size n has an equal chance of being selected
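
As a rough illustration of this definition, Python's standard `random.sample` draws n items without replacement so that every subset of size n is equally likely. The sampling frame of 50 numbered units and the seed are made up for the example.

```python
import random

# Hypothetical sampling frame: 50 units numbered 1 to 50.
frame = list(range(1, 51))

# A seeded generator so the example is repeatable; any seed would do.
rng = random.Random(1)

# Draw a simple random sample of size n = 5 without replacement.
sample = rng.sample(frame, 5)
print(sample)
```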

12
Q

Define systematic sampling

A

The required elements are chosen at regular intervals from an ordered list
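
A minimal sketch of this idea, assuming the interval is the population size divided by the sample size (rounded down) and the starting point is chosen at random among the first k items. The function name `systematic_sample` and the frame of 100 numbered units are hypothetical.

```python
import random

def systematic_sample(ordered_list, sample_size, start=None):
    """Take every k-th element, where k = population size // sample size."""
    if not 0 < sample_size <= len(ordered_list):
        raise ValueError("sample_size must be between 1 and the population size")
    k = len(ordered_list) // sample_size
    if start is None:
        start = random.randrange(k)  # random starting point among the first k items
    return ordered_list[start::k][:sample_size]

frame = list(range(1, 101))                      # hypothetical frame numbered 1..100
sample = systematic_sample(frame, 20, start=2)   # fixed start so the result is repeatable
print(sample)
```

With 100 units and a sample of 20, the interval is k = 5, so a start at index 2 picks units 3, 8, 13 and so on.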

13
Q

Define stratified sampling

A

The population is divided into mutually exclusive strata (e.g. males and females) and a random sample is taken from each

14
Q

Formula to calculate the number of people we should sample from each stratum

A

The number sampled in a stratum = (number in stratum / number in population) x overall sample size
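
The formula can be sketched in Python as follows. The function name `stratum_sample_sizes` and the factory figures are invented for illustration, and results are rounded to the nearest whole number because a stratum cannot supply a fraction of a person.

```python
def stratum_sample_sizes(stratum_sizes, sample_size):
    """Number to sample from each stratum, rounded to the nearest whole number."""
    population = sum(stratum_sizes.values())
    return {
        name: round(count * sample_size / population)
        for name, count in stratum_sizes.items()
    }

# Hypothetical factory: 120 office staff and 180 shop-floor staff, overall sample of 50.
sizes = stratum_sample_sizes({"office": 120, "shop_floor": 180}, 50)
print(sizes)  # {'office': 20, 'shop_floor': 30}
```

Note that Python's `round` sends exact halves to the nearest even number, so a value such as 2.5 would need handling by hand if the usual "round half up" convention is wanted.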

15
Q

2 types of non-random sampling

A

• Quota sampling
• Opportunity sampling

16
Q

Define quota sampling

A

an interviewer or researcher selects a sample that reflects the characteristics of the whole population

17
Q

Define opportunity sampling

A

consists of taking the sample from people who are available at the time the study is carried out and who fit the criteria you are looking for

18
Q

Advantages and disadvantages of quota sampling

A

Advantages
• Allows a small sample to still be representative of the population
• No sampling frame required
• Quick, easy and inexpensive
• Allows for easy comparison between different groups within a population

Disadvantages
• Non-random sampling can introduce bias
• Population must be divided into groups, which can be costly or inaccurate
• Increasing scope of study increases number of groups, which adds time and expense
• Non-responses are not recorded as such

19
Q

Advantages and disadvantages of opportunity sampling

A

Advantages
• Easy to carry out
• Inexpensive

Disadvantages
• Unlikely to provide a representative sample
• Highly dependent on individual researcher

20
Q

Define quantitative variables/data

A

Variables or data associated with numerical observations

21
Q

Define qualitative variables/data

A

Variables or data associated with non-numerical observations

22
Q

Define continuous variable

A

A variable that can take any value in a given range

23
Q

Define discrete variable

A

A variable that can take only specific values in a given range

24
Q

Define mode or modal class

A

The value or class that occurs most often

25
Q

Define median

A

The middle value when the data values are put in order

26
Q

Formula of mean

A

$\bar{x} = \frac{\sum x}{n}$

27
Q

Formula for mean in frequency table

A

$\bar{x} = \frac{\sum fx}{\sum f}$
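
A short sketch of the frequency-table mean, assuming the table is given as two parallel lists. The function name `frequency_table_mean` and the data are made up for the example.

```python
def frequency_table_mean(values, frequencies):
    """Mean of a frequency table: (sum of f * x) / (sum of f)."""
    total_fx = sum(x * f for x, f in zip(values, frequencies))
    total_f = sum(frequencies)
    return total_fx / total_f

# Hypothetical table: value x = 1, 2, 3, 4 with frequencies f = 2, 5, 2, 1.
mean = frequency_table_mean([1, 2, 3, 4], [2, 5, 2, 1])
print(mean)  # sum of fx = 22, sum of f = 10
```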

28
Q

Find median of both listed data and of grouped data

A

Listed data:
Find n/2
- If n/2 is not a whole number, round up and take that item
- If n/2 is a whole number, take the value halfway between this item and the one after

Grouped data:
Find n/2, then use linear interpolation

29
Q

Linear interpolation

A

Lower class boundary + ((number of values into the class / frequency of class) x class width)
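
The interpolation formula might be sketched like this, where the "amount into" the class is the target position minus the cumulative frequency below the class. The function name, argument names and the grouped data are all hypothetical.

```python
def interpolate(lower_boundary, class_width, class_frequency, cumulative_before, position):
    """Estimate the value at a given position inside a class by linear interpolation."""
    amount_into_class = position - cumulative_before
    return lower_boundary + amount_into_class / class_frequency * class_width

# Hypothetical grouped data: 40 values in total; the class 20 to 30 has
# frequency 16, and 12 values lie below 20. The median is at position 40 / 2 = 20.
median = interpolate(20, 10, 16, 12, 40 / 2)
print(median)  # 20 + (8 / 16) * 10 = 25.0
```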

30
Q

P_57
n=43

A

43 x 0.57 = 24.51, so P_57 is the 24.51th value

31
Q

Q_1 of 100 numbers

A

100/4=25th
Interpolation using 25th number

32
Q

P_10 of 41 numbers

A

41 x 10% = 4.1, so the 4.1th value
Interpolation using the 4.1th value
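
The position calculations on these percentile cards all follow the same pattern, n x k / 100, which can be checked with a small helper. The name `percentile_position` is made up here; the three calls reuse the figures from the cards.

```python
def percentile_position(n, k):
    """Position of the k-th percentile among n values in grouped data: n * k / 100."""
    return n * k / 100

print(percentile_position(43, 57))   # P_57 of 43 values
print(percentile_position(100, 25))  # Q_1 of 100 values
print(percentile_position(41, 10))   # P_10 of 41 values
```

The resulting position is then fed into linear interpolation rather than rounded.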

33
Q

Variance formula

A

Small sigma squared = (sum of squared values / number of values) - mean^2
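
As a quick check of this formula, here is a minimal sketch using a made-up data set. The function name `variance` is an assumption for illustration; the square root of the result gives the standard deviation.

```python
import math

def variance(values):
    """Mean of the squares minus the square of the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(x * x for x in values) / n - mean ** 2

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data: sum = 40, sum of squares = 232
var = variance(data)               # 232 / 8 - 5^2 = 29 - 25
sd = math.sqrt(var)
print(var, sd)  # 4.0 2.0
```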

34
Q

Standard deviation

A

Sigma = root of variance

35
Q

Coding standard deviation

A

Only multiplying or dividing affects the standard deviation; adding or subtracting a constant does not
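
A small demonstration of this rule, assuming a coding of the form y = (x - a) / b with the made-up values a = 5 and b = 2. It uses `statistics.pstdev`, the population standard deviation from Python's standard library.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]        # hypothetical raw values x
coded = [(x - 5) / 2 for x in data]    # coding y = (x - 5) / 2

sd_x = statistics.pstdev(data)    # standard deviation of x
sd_y = statistics.pstdev(coded)   # standard deviation of y

# Subtracting 5 has no effect; dividing by 2 halves the standard deviation.
print(sd_x, sd_y, sd_x / 2)
```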

36
Q

Common definition of an outlier

A

Either greater than Q_3 + k(Q_3 - Q_1)
Or less than Q_1 - k(Q_3 - Q_1)
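
The two conditions can be sketched as a single check. The function name `is_outlier` is hypothetical, and the default k = 1.5 is only a commonly used choice; the value of k is normally stated in the question.

```python
def is_outlier(value, q1, q3, k=1.5):
    """True if value lies beyond k interquartile ranges outside the quartiles."""
    iqr = q3 - q1
    return value > q3 + k * iqr or value < q1 - k * iqr

# Hypothetical quartiles Q_1 = 10 and Q_3 = 20 give limits of -5 and 35 when k = 1.5.
print(is_outlier(36, 10, 20))   # True: above 35
print(is_outlier(35, 10, 20))   # False: exactly on the limit
print(is_outlier(-6, 10, 20))   # True: below -5
```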

37
Q

Cleaning the data

A

The process of removing anomalies from a data set

38
Q

Formula to calculate the height of each bar (frequency density) on a histogram

A

Area of bar = k x frequency
When k = 1, height of bar (frequency density) = frequency / class width
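
A minimal sketch of the bar-height calculation, assuming k = 1 so that the area of each bar equals its frequency. The function name `frequency_density` and the two classes are made up for the example.

```python
def frequency_density(frequency, lower, upper):
    """Height of a histogram bar when area = frequency: frequency / class width."""
    return frequency / (upper - lower)

# Hypothetical classes: 0 to 10 with frequency 20, and 10 to 15 with frequency 15.
wide_bar = frequency_density(20, 0, 10)
narrow_bar = frequency_density(15, 10, 15)
print(wide_bar, narrow_bar)  # 2.0 3.0
```

The narrower class has the taller bar even though its frequency is lower, which is why heights alone should not be read as frequencies.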

39
Q

Frequency polygon from histogram

A

Join the midpoints of the tops of the bars of a histogram drawn with equal class widths

40
Q

When comparing data sets comment on:

A

A measure of location
A measure of spread

41
Q

What is Bivariate data

A

data which has pairs of values for two variables

42
Q

What does Correlation describe

A

the nature of the linear relationship between two variables

43
Q

causal relationship

44
Q

regression line

45
Q

The coefficient b tells you the change in y for each unit change in x
How does correlation change b?

A

• If the data is positively correlated, b will be positive
• If the data is negatively correlated, b will be negative

46
Q

When should you use the regression line

A

to make predictions for values of the dependent variable that are within the range of the given data

47
Q

Venn diagram

48
Q

Mutually exclusive events

A

P(A or B) = P(A) + P(B)
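
To see the addition rule on a concrete case, here is a sketch using one roll of a fair die. The events A and B are chosen for illustration so that they share no outcomes.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}   # one roll of a fair die (assumed example)
A = {1, 2}
B = {5, 6}

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

# A and B have no outcomes in common, so they are mutually exclusive.
print(A & B == set())                  # True
print(prob(A | B), prob(A) + prob(B))  # 2/3 2/3
```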

49
Q

Independent events

A

P(A and B) = P(A) x P(B)
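
The multiplication rule can be tested on a concrete case. This sketch uses one roll of a fair die, with events A and B picked for illustration.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}   # one roll of a fair die (assumed example)
A = {2, 4, 6}                   # the roll is even
B = {1, 2}                      # the roll is at most 2

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

# A and B happen together only when the roll is 2.
p_both = prob(A & B)
print(p_both, prob(A) * prob(B))  # 1/6 1/6, so A and B are independent
```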

50
Q

Tree diagram