Terms to memorise Flashcards

1
Q

Population

A

Whole set of items which are of interest

2
Q

Sample

A

A subset of the population intended to represent the population.

3
Q

Sampling unit

A

Each individual thing in a population or sampling frame.

4
Q

Sampling frame

A

A named or numbered list of all the sampling units in a population.

5
Q

Census vs sample

A

Data collected from an entire population vs data collected from a subset of the population.

6
Q

What does “trace” mean?

A

Less than 0.05 mm of rainfall.

7
Q

What does the Beaufort scale measure?

A

Wind speed. The scale runs from 0 (calm) to 12 (hurricane force). Daily mean windspeed itself is recorded in knots; the Beaufort conversion describes it in words such as calm, light, moderate and fresh.

8
Q

What does daily max relative humidity measure?

A

The highest relative humidity recorded that day, given as a percentage of air saturation with water vapour.

9
Q

What relative humidity gives rise to foggy and misty conditions?

A

Over 95% daily max relative humidity.

10
Q

What does n/a mean?

A

Not available: the reading is missing, possibly due to instrument failure.

11
Q

Which locations have high daily mean windspeeds?

A

Leuchars, Hurn, Camborne, Jacksonville and Perth. Coastal locations.

12
Q

How is visibility measured?

A

How far can be seen towards the horizon, measured in decametres (Dm). 1 Dm = 10 m.

13
Q

Why is the mean calculated from grouped data an estimate?

A

We use the midpoint of each class in place of the actual values, because we don’t know the exact value of each item in a group.
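
As a minimal sketch of why the result is only an estimate, the snippet below uses made-up classes (the `classes` list and the function name `estimated_mean` are illustrative assumptions, not part of the deck) and replaces every value by its class midpoint.

```python
# Hypothetical grouped data: (lower bound, upper bound, frequency).
classes = [(0, 10, 4), (10, 20, 9), (20, 30, 7)]


def estimated_mean(groups):
    """Estimate the mean by treating every value as its class midpoint."""
    total_frequency = sum(f for _, _, f in groups)
    total = sum(((lo + hi) / 2) * f for lo, hi, f in groups)
    return total / total_frequency


print(estimated_mean(classes))  # midpoints 5, 15, 25 stand in for the real values
```

The true mean could be anywhere the hidden values allow, which is why the answer is quoted as an estimate.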

14
Q

Standard deviation phrase

A

Mean of the squares minus the square of the mean, all square rooted.
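
The phrase can be checked directly. This is a small sketch with made-up data (`data` and `std_dev` are illustrative names), compared against Python's built-in population standard deviation.

```python
import math
import statistics


def std_dev(values):
    """Square root of (mean of the squares minus square of the mean)."""
    n = len(values)
    mean_of_squares = sum(x * x for x in values) / n
    square_of_mean = (sum(values) / n) ** 2
    return math.sqrt(mean_of_squares - square_of_mean)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
print(std_dev(data))
print(statistics.pstdev(data))  # population standard deviation, for comparison
```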

15
Q

Which measures change and which don’t when you apply coding?

A

The mean and median are changed by every step of the coding. The standard deviation is unchanged by adding or subtracting a constant; it only changes when multiplying or dividing.
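
A quick way to convince yourself is to code some made-up data and compare the summaries. The data and the codings y = x - 3 and y = (x - 3) / 2 below are assumptions chosen purely for illustration.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data


def summary(values):
    """Return (mean, median, population standard deviation)."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.pstdev(values),
    )


shifted = [x - 3 for x in data]        # coding y = x - 3
scaled = [(x - 3) / 2 for x in data]   # coding y = (x - 3) / 2

print(summary(data))     # original
print(summary(shifted))  # mean and median drop by 3, standard deviation unchanged
print(summary(scaled))   # everything is also divided by 2
```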

16
Q

What does correlation tell us?

A

The strength and type (positive or negative) of the linear relationship between two variables.

17
Q

What is data with two variables called?

A

Bivariate data.

18
Q

What is a causal relationship? Does a strong correlation always mean a causal relationship?

A

One where a change in one variable directly causes a change in the other. No: a strong correlation does not necessarily mean the relationship is causal.

19
Q

When is the linear regression line justified?

A

When there is a linear relationship, i.e. the points on the scatter graph lie close to a straight line, or equivalently r is close to 1 or -1.

20
Q

When should a linear regression line not be used to make predictions?

A

When extrapolating, as data outside the range of the given values might not follow the same pattern as the data given. Also when predicting a value of the independent variable from the dependent variable.
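
The sketch below fits a least-squares line of y on x to made-up data and refuses to predict outside the observed x range. The data and the helper names `fit_line` and `predict` are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Least-squares regression line of y on x: returns (a, b) for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    return a, b


def predict(x, xs, ys):
    """Predict y from x, refusing to extrapolate beyond the observed x values."""
    if not min(xs) <= x <= max(xs):
        raise ValueError("x is outside the data range: this would be extrapolation")
    a, b = fit_line(xs, ys)
    return a + b * x


xs = [1, 2, 3, 4, 5]                 # hypothetical independent variable
ys = [2.1, 3.9, 6.2, 7.8, 10.0]      # hypothetical dependent variable

print(predict(3.5, xs, ys))  # interpolation: inside the data range
```

Calling `predict(10, xs, ys)` raises an error, because x = 10 lies outside the data. Note the line is y on x, so it is only used to predict y from x, not the other way round.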

21
Q

What is an experiment?

A

A repeatable process that gives rise to a number of outcomes.

22
Q

What is an event?

A

A set of one or more outcomes.

23
Q

What does mutually exclusive mean?

A

The two events can’t happen at the same time; there is no intersection, so P(A and B) = 0.

24
Q

What does independence mean?

A

One event happening doesn’t affect the probability that the other one will happen.
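
One way to see the difference between independent and mutually exclusive events is to list a sample space and count. The sketch assumes two fair dice; the event names are made up for illustration. It uses the standard check that A and B are independent when P(A and B) = P(A) × P(B).

```python
from fractions import Fraction
from itertools import product

# Sample space for rolling two fair dice: all 36 equally likely outcomes.
sample_space = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event, given as a function outcome -> True/False."""
    favourable = [o for o in sample_space if event(o)]
    return Fraction(len(favourable), len(sample_space))


def first_is_six(o):
    return o[0] == 6


def second_is_even(o):
    return o[1] % 2 == 0


def total_is_two(o):
    return sum(o) == 2


# Independent: P(A and B) equals P(A) * P(B).
p_both = prob(lambda o: first_is_six(o) and second_is_even(o))
independent = p_both == prob(first_is_six) * prob(second_is_even)

# Mutually exclusive: the events cannot both happen, so P(A and B) = 0.
mutually_exclusive = prob(lambda o: first_is_six(o) and total_is_two(o)) == 0

print(independent, mutually_exclusive)
```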

25
Q

What does a random variable represent?

A

The outcome of a single experiment/trial. It takes a set of possible values, each with a probability. P(X = x) is the probability that the random variable X takes the specific value x.

26
Q

When can a distribution be modelled as binomial?

A

1) Fixed number of trials, n.
2) Two possible outcomes, success and failure.
3) Fixed probability of success, p.
4) Trials are independent of each other.
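
When the four conditions above hold, X ~ B(n, p) and probabilities follow from the usual binomial formula. A minimal sketch (the function names and the example B(10, 0.5) are illustrative assumptions):

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(binomial_pmf(k, n, p) for k in range(0, x + 1))


# Hypothetical example: 10 fair coin flips, X = number of heads.
print(binomial_pmf(5, 10, 0.5))  # P(X = 5)
print(binomial_cdf(3, 10, 0.5))  # P(X <= 3)
```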

27
Q

What is a hypothesis test?

A

A statistical test used to determine whether there is enough evidence in a sample of data to infer that a certain condition is true for the entire population.

28
Q

What is a hypothesis?

A

A statement made about the value of the population parameter that we wish to test by collecting evidence in the form of a sample.

29
Q

What is a test statistic?

A

A statistic that is calculated from sample data in order to test a hypothesis about a population.

30
Q

What is the significance level?

A

The probability threshold for rejecting the null hypothesis: the maximum probability of incorrectly rejecting it that we are willing to accept.

31
Q

What is the null hypothesis?

A

The default position: the hypothesis we assume to be true unless the sample provides enough evidence against it.

32
Q

What is the critical region?

A

The set of values of the test statistic that would lead us to reject the null hypothesis.

33
Q

What is the alternative hypothesis?

A

The hypothesis that there has been some change in the population parameter.

34
Q

What is the actual significance level?

A

The actual probability of the test statistic falling in the critical region, assuming the null hypothesis is true.
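
For a discrete test statistic the actual significance level is usually smaller than the stated one. The sketch below assumes a one-tailed binomial test with H0: p = 0.5 against H1: p > 0.5, n = 10, at the 5% level; all of these numbers and the function names are made up for illustration.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def upper_tail_critical_region(n, p0, alpha):
    """Find the smallest c with P(X >= c) <= alpha under H0: X ~ B(n, p0).

    Returns (c, actual significance level), or (None, 0.0) if no value qualifies.
    """
    tail = 0.0
    best = (None, 0.0)
    for c in range(n, -1, -1):      # work down from the largest value
        tail += binomial_pmf(c, n, p0)
        if tail > alpha:
            break                   # adding c would exceed the significance level
        best = (c, tail)
    return best


c, actual = upper_tail_critical_region(10, 0.5, 0.05)
print(c, actual)  # critical region is X >= c; 'actual' is the actual significance level
```

Here P(X ≥ 8) is just over 5%, so the critical region has to start at 9 and the actual significance level ends up well below 5%.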

35
Q

What does the PMCC (r) describe? How could correlation be strong even though r isn’t close to 1 or -1?

A

The strength and direction of the linear correlation between two variables. It can take values between -1 and +1. The data could follow a different model strongly, for example an exponential model, without being close to a straight line.
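
To illustrate, the sketch below computes r for made-up data following y = 2^x exactly, and again after taking logs of y. The data and the function name `pmcc` are assumptions for illustration.

```python
import math


def pmcc(xs, ys):
    """Product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


xs = list(range(1, 11))
ys = [2**x for x in xs]  # hypothetical data following y = 2^x exactly

r_raw = pmcc(xs, ys)                              # linear fit to exponential data
r_log = pmcc(xs, [math.log(y) for y in ys])       # log y against x is a straight line

print(r_raw, r_log)
```

Even though y is completely determined by x, r for the raw data is noticeably further from 1 than r for the log-transformed data.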

36
Q

What is r and what is ρ in a PMCC hypothesis test?

A

r is the test statistic (the PMCC of the sample) and ρ (rho) is the population parameter (the PMCC of the population).

37
Q

What needs to be done in a hypothesis test when it is two-tailed?

A

Halve the significance level, so that half of it is used in each tail.

38
Q

When can a distribution be modelled as a normal distribution?

A

Curve is bell shaped. Data is symmetrical about the mean/median/mode. Variable is continuous and area under the graph is equal to 1.

39
Q

What is the standard deviation rule?

A

Approximately 68% of the data lie within one standard deviation of the mean, 95% within two, and 99.7% within three.
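
The percentages can be checked with the standard normal distribution. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `within_k_sd` is made up.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def within_k_sd(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(within_k_sd(k), 4))
```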

40
Q

What is standardising the normal distribution?

A

z = (X - mean) / standard deviation.
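
As a small worked sketch, assume X ~ N(50, 4²) (made-up numbers) and find P(X < 56) by standardising first. The function name `standardise` is illustrative.

```python
from statistics import NormalDist


def standardise(x, mean, sd):
    """Convert a value of X ~ N(mean, sd^2) to a value of Z ~ N(0, 1)."""
    return (x - mean) / sd


z = standardise(56, 50, 4)           # (56 - 50) / 4
p_via_z = NormalDist().cdf(z)        # P(Z < z) using the standard normal
p_direct = NormalDist(50, 4).cdf(56) # P(X < 56) without standardising

print(z, p_via_z, p_direct)
```

Both routes give the same probability, which is the point of standardising: any normal distribution can be read off the standard normal one.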

41
Q

When can we approximate the binomial distribution with a normal distribution?

A

When n is large and p is close to 0.5

42
Q

What needs to be applied when doing a normal approximation?

A

A continuity correction. Rewrite the probability using ≥ or ≤, then extend the range by 0.5 at each end. For example, P(X ≥ 7) is approximated by P(Y > 6.5).
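
The sketch below compares an exact binomial probability with its normal approximation, using the standard approximating distribution N(np, np(1 - p)). The example X ~ B(100, 0.5) with P(X ≥ 55) and the function names are assumptions for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist


def exact_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ B(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


def approx_upper_tail(k, n, p):
    """Normal approximation to P(X >= k), with a continuity correction."""
    y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return 1 - y.cdf(k - 0.5)  # P(X >= k) becomes P(Y > k - 0.5)


exact = exact_upper_tail(55, 100, 0.5)
approx = approx_upper_tail(55, 100, 0.5)
print(exact, approx)
```

With n large and p = 0.5 the two values agree closely; dropping the 0.5 correction makes the approximation noticeably worse.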

43
Q

What is extrapolation? What is the danger of extrapolation?

A

Using a model to predict values outside of the range of the original data. It’s unreliable as the model may not be valid outside of the range.

44
Q

What is a sample space?

A

The set of all possible outcomes.

45
Q

When do we compare using mean/sd and when do we compare using median/IQR?

A

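Use the median and IQR when the data contain outliers or are skewed, since these measures are not affected by extreme values. Use the mean and standard deviation when the data are roughly symmetrical with no outliers.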

46
Q

What is the assumption during linear interpolation?

A

The data in each class are evenly distributed across the class, so the result is only an estimate.
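
As a sketch of how that assumption is used, the snippet below estimates the median of made-up grouped data by linear interpolation, taking the median as the (n/2)th value. The classes and the function name `interpolate_median` are illustrative assumptions.

```python
def interpolate_median(groups):
    """Estimate the median from (lower, upper, frequency) classes in ascending order.

    Assumes the values in each class are spread evenly across the class.
    """
    n = sum(f for _, _, f in groups)
    target = n / 2                   # position of the median
    cumulative = 0
    for lo, hi, f in groups:
        if f > 0 and cumulative + f >= target:
            # Go the right fraction of the way through this class.
            return lo + (target - cumulative) / f * (hi - lo)
        cumulative += f
    raise ValueError("no data")


classes = [(0, 10, 4), (10, 20, 9), (20, 30, 7)]  # hypothetical grouped data
print(interpolate_median(classes))
```

Here the 10th value is the 6th of the 9 values in the 10 to 20 class, so the estimate sits 6/9 of the way through that class.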