Definitions Flashcards

1
Q

DATA

A

Raw information from which statistics are created

2
Q

POPULATION

A

The entire pool from which a statistical sample is drawn, e.g. all tech start-ups in Asia.

3
Q

SAMPLES

A

A sample is a subset of units collected from the statistical population.

4
Q

SYSTEMATIC SAMPLING

A

Systematic sampling is where units are selected at regular intervals from an ordered list, e.g. every 10th person.
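A minimal Python sketch of the "every 10th person" idea. The function name `systematic_sample`, the random starting point and the ID list are illustrative assumptions, not part of the card.

```python
import random

def systematic_sample(units, k, seed=None):
    """Return every k-th unit, starting at a random position in the first k."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start between 0 and k-1
    return units[start::k]

people = list(range(1, 101))  # hypothetical population: IDs 1..100
sample = systematic_sample(people, k=10, seed=1)
print(sample)
```

With 100 units and an interval of 10, the sample always holds 10 units spaced 10 apart.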

5
Q

STRATIFIED SAMPLING

A

Dividing the population into strata (subgroups) and then taking a random sample from each stratum, normally in proportion to the stratum's actual share of the population.
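A rough Python sketch of proportional stratified sampling. The strata names, sizes and the function `stratified_sample` are made up for illustration. Simple rounding is assumed, so in general the stratum sample sizes may not add up exactly to the requested total.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Draw a random sample from each stratum, sized in proportion
    to the stratum's share of the population (rounded)."""
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        n = round(total_n * len(units) / population_size)
        sample[name] = rng.sample(units, n)
    return sample

# Hypothetical population of 100 firms split into three strata
strata = {
    "small": [f"S{i}" for i in range(60)],
    "medium": [f"M{i}" for i in range(30)],
    "large": [f"L{i}" for i in range(10)],
}
picked = stratified_sample(strata, total_n=20, seed=0)
sizes = {name: len(units) for name, units in picked.items()}
print(sizes)
```

Here the 60/30/10 split of the population carries over to a 12/6/2 split of the sample.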

6
Q

CLUSTER SAMPLING

A

Cluster sampling begins by dividing the population into clusters, e.g. suburbs. Clusters are then selected at random, and every unit in the selected clusters is included.
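A small Python sketch of the same steps, assuming the clusters are stored as a dictionary of suburb name to residents. The suburb names and the function `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical suburbs and their residents
suburbs = {
    "North": ["n1", "n2", "n3"],
    "South": ["s1", "s2"],
    "East": ["e1", "e2", "e3", "e4"],
    "West": ["w1"],
}
chosen, units = cluster_sample(suburbs, n_clusters=2, seed=3)
print(chosen, units)
```

The randomness applies to which clusters are picked, not to the units inside them.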

7
Q

CATEGORICAL DATA

A

Categorical variables place individuals into categories, e.g. male/female, black/white, age group.

8
Q

NUMERICAL DATA

A

Numerical data is data that can be measured or counted, such as time, height, weight or amount.

9
Q

DISCRETE DATA

A

A discrete variable is one where data is counted, e.g. how many eggs a hen lays each day. In this example the count can never be negative, and there will never be half an egg. The possible values can all be listed and are usually whole numbers. Counts are quantitative, although some texts also describe categorical data as discrete.

10
Q

CONTINUOUS VARIABLE

A

A continuous variable is one where data is measured and can take any value within a range, e.g. how many litres of milk a cow gives daily.

11
Q

ORDINAL DATA

A

Ordinal data can be arranged in order, but the differences between values have no consistent meaning, e.g. "On a scale of 1-10, how happy are you?"

12
Q

QUANTITATIVE

A

A quantitative variable has a numerical value or measurement.

13
Q

QUALITATIVE

A

A qualitative variable describes an individual by placing it into a category or group, e.g. male or female.

14
Q

SIMPLE RANDOM SAMPLE

A

A sample taken from a population at random, where each unit has the same chance of being selected (and every possible sample of the same size is equally likely).
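As a quick illustration, Python's standard `random.sample` draws units without replacement so that each unit is equally likely to be chosen. The population of IDs and the seed below are arbitrary assumptions.

```python
import random

rng = random.Random(42)          # fixed seed so the draw is repeatable
population = list(range(1, 51))  # hypothetical population: IDs 1..50
srs = rng.sample(population, 5)  # simple random sample of 5 units
print(srs)
```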

15
Q

REPRESENTATIVE

A

A representative sample is one whose characteristics closely reflect those of the population it was drawn from.

16
Q

BIAS (Statistics)

A

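The opposite of representative: a biased sample systematically over- or under-represents parts of the population, so its results differ systematically from the true population values.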

17
Q

Co-efficient of variation.

A

CV = (sample standard deviation / sample mean) × 100%. Used to compare the relative spread of data sets measured in different units or on different scales, e.g. pounds vs rupees.
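A short Python sketch using the conventional definition, standard deviation divided by mean, expressed as a percentage. The price lists are made-up numbers. The second list is the first multiplied by 100, to show that a change of scale leaves the CV unchanged.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the sample mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100

prices_gbp = [10, 12, 11, 13, 14]            # made-up prices in pounds
prices_inr = [1000, 1200, 1100, 1300, 1400]  # same prices on a 100x scale

cv_gbp = coefficient_of_variation(prices_gbp)
cv_inr = coefficient_of_variation(prices_inr)
print(round(cv_gbp, 2), round(cv_inr, 2))
```

Both data sets have the same CV even though their standard deviations differ by a factor of 100.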

18
Q

Variance in relation to standard deviation.

A

The variance is the square of the standard deviation (equivalently, the standard deviation is the square root of the variance).
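A quick check of this relationship in Python, using the population versions from the standard `statistics` module on an arbitrary data set:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]       # arbitrary example values
variance = statistics.pvariance(data)  # population variance
std_dev = statistics.pstdev(data)      # population standard deviation
print(variance, std_dev, std_dev ** 2)
```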

19
Q

Descriptive statistics.

A

Summarising and describing data from a sample using graphs and other descriptive tools, e.g. averages, modes, etc.

20
Q

Statistics

A
The collection, organisation, analysis and interpretation of data.
21
Q

Inferential statistics

A

Using the data from a sample to infer information about a population.

22
Q

Sampling frame

A

The list of individuals in the population from which the sample is actually drawn.

23
Q

Sampling error vs non-sampling error

A

Sampling error is the difference between a sample statistic and the corresponding population value, which arises because only part of the population is observed. Non-sampling error comes from poor sample design, sloppy data collection, faulty measuring equipment, etc.

24
Q

Observational study vs Experiment.

A

An observational study is where observations and measurements are taken without influencing the responses. An experiment is where a treatment is deliberately imposed on the individuals in order to observe a possible change in the response.

25
Q

Control group

A

The group that receives no treatment or a dummy treatment (placebo), used as a baseline to compare against the treatment group.

26
Q

Lurking variable

A

A variable that is not included in the study but generally has an effect on both the explanatory and response variables. It is generally difficult to measure.

27
Q

Confounding variable

A

A variable, other than the independent variable, that affects what is being measured, so its effects can be confused (confounded) with the effects of the independent variable. It often cannot be controlled, so it must be taken into account when conducting an experiment.

28
Q

Discrete probability distribution

A

A discrete probability distribution is a distribution where the possible outcomes are discrete, e.g. the roll of a die or the toss of a coin.

29
Q

How do you know that a probability distribution is valid?

A

Every probability is between 0 and 1, and the probabilities add up to 1.
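A minimal Python check of these conditions. The helper name `is_valid_distribution` is an assumption, and `math.isclose` is used because floating-point sums are rarely exactly 1.

```python
import math

def is_valid_distribution(probabilities):
    """True if every probability is in [0, 1] and they sum to 1."""
    in_range = all(0 <= p <= 1 for p in probabilities)
    return in_range and math.isclose(sum(probabilities), 1.0)

fair_die = [1 / 6] * 6
print(is_valid_distribution(fair_die))    # valid
print(is_valid_distribution([0.5, 0.6]))  # sums to 1.1, not valid
```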

30
Q

≤

A

Less than or equal to

31
Q

How do you write “the probability of X being between 1 and 3 (inclusive)”?

A

P(1≤X≤3)
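For a discrete variable, this probability is the sum of the probabilities of each value in the range. A sketch in Python, assuming X is the score on a fair six-sided die (an example chosen here, not from the card):

```python
from fractions import Fraction

# Distribution of X: each face 1..6 has probability 1/6
die = {x: Fraction(1, 6) for x in range(1, 7)}

# P(1 <= X <= 3) = P(X=1) + P(X=2) + P(X=3)
p_1_to_3 = sum(p for x, p in die.items() if 1 <= x <= 3)
print(p_1_to_3)  # 1/2
```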

32
Q

µ or x bar.

A

µ (mu) represents the population mean. x̄ (x-bar) represents the sample mean.

33
Q

Σ

A

The sum of

34
Q

σ

A

Population Standard deviation

35
Q

E(X) statistics

A

Expected value of X
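For a discrete variable, E(X) is each value multiplied by its probability, summed over all values: E(X) = Σ x·P(X = x). A Python sketch, again assuming a fair six-sided die as the example:

```python
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}

# E(X) = sum of value * probability
expected = sum(x * p for x, p in die.items())
print(expected)  # 7/2
```

The expected value of 3.5 is not itself a possible outcome. It is the long-run average score.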

36
Q

What is a probability distribution?

A

Describes the values that could occur and the probability that each value might occur.

37
Q

X~Bin (n,p)

5 properties?

A

Binomial distribution.

  • There must be a fixed number (n) of trials.
  • Each trial has only two possible outcomes, “success” or “failure”.
  • The result of each trial is independent of the other trials.
  • The probability (p) of “success” is the same in each trial.
  • X is defined as the number of successes in the n trials.
38
Q

At most

At least

A

At most means up to and including the number (≤).

At least means that number or more (≥).
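A Python sketch of "at most" and "at least" for a binomial variable. The values n = 10 and p = 0.5 are illustrative assumptions, and `binomial_pmf` is a helper written here using the standard binomial formula.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.5  # illustrative values: 10 tosses of a fair coin

# "At most 2" is P(X <= 2): add the probabilities for 0, 1 and 2.
p_at_most_2 = sum(binomial_pmf(x, n, p) for x in range(0, 3))

# "At least 2" is P(X >= 2): add the probabilities for 2, 3, ..., n.
p_at_least_2 = sum(binomial_pmf(x, n, p) for x in range(2, n + 1))

print(round(p_at_most_2, 4), round(p_at_least_2, 4))
```

Both sums include the value 2 itself, which is what separates ≤ and ≥ from < and >.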

39
Q

Short cut formulas for µ and σ of binomial distributions.

A

Mean µ = np

Standard deviation σ = √(np(1-p))
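A Python sketch comparing the shortcut formulas with the "long way" of building the full distribution and applying the general definitions of mean and standard deviation. The values n = 10 and p = 0.5 are illustrative.

```python
from math import comb, sqrt

n, p = 10, 0.5  # illustrative values

# Shortcut formulas
mu = n * p
sigma = sqrt(n * p * (1 - p))

# Long way: full distribution, then the general definitions
pmf = {x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}
mu_long = sum(x * prob for x, prob in pmf.items())
sigma_long = sqrt(sum((x - mu_long) ** 2 * prob for x, prob in pmf.items()))

print(mu, round(sigma, 4))
print(mu_long, round(sigma_long, 4))
```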

40
Q

X~N(µ,σ)

A

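Notation stating that X follows a normal distribution with mean µ and standard deviation σ. (Many texts give the second parameter as the variance, σ².)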

41
Q

Standardise formula (z score)

A

z = (x - µ) / σ
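A small Python sketch of standardising a value. The numbers (µ = 100, σ = 15, x = 130) are hypothetical. `statistics.NormalDist` is used only to show that the probability below x on the original scale matches the probability below z on the standard normal scale.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations x lies above (or below) the mean."""
    return (x - mu) / sigma

mu, sigma = 100, 15  # hypothetical population mean and standard deviation
z = z_score(130, mu, sigma)

p_original = NormalDist(mu, sigma).cdf(130)  # P(X <= 130)
p_standard = NormalDist(0, 1).cdf(z)         # P(Z <= z)
print(z, round(p_original, 4), round(p_standard, 4))
```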

42
Q

What is the standard deviation of the SAMPLING DISTRIBUTION OF THE MEAN called?

A

Standard error.

43
Q

n=

A

sample size

44
Q

SAMPLING DISTRIBUTION OF THE MEAN: formula for converting the standard deviation to the standard error.

A

σ / √n
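A short Python sketch of this formula with made-up values (σ = 15), showing that a larger sample gives a smaller standard error:

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the mean."""
    return sigma / sqrt(n)

se_36 = standard_error(sigma=15, n=36)    # 15 / 6
se_144 = standard_error(sigma=15, n=144)  # 15 / 12
print(se_36, se_144)
```

Quadrupling the sample size halves the standard error.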

45
Q

e

A

e is the error amount (margin of error).

46
Q

Rules for CLT?

A

Sample must be large enough (rule of thumb: n ≥ 30).

Must be a random sample.

47
Q

k

A

Critical value

48
Q

Is it a random sample?

A

Not sure, read the question, ask the question.

49
Q

What is the rule for T distribution use?

A

If n ≥ 30, use the normal distribution. If n < 30 (and the population standard deviation is unknown), use the t-distribution.
To use the t-distribution, the sample must come from a normally distributed population.

50
Q

Quartile
Decile
Percentile

A

Quartile: distribution divided into 4 equal parts (steps of 0.25)
Decile: distribution divided into 10 equal parts (steps of 0.1)
Percentile: distribution divided into 100 equal parts (steps of 0.01)
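A Python sketch using the standard `statistics.quantiles` function, which returns the cut points that split the data into n equal-probability groups. Splitting into n groups needs n - 1 cut points. The data set 1..100 is an arbitrary example.

```python
import statistics

data = list(range(1, 101))  # arbitrary example data: 1..100

quartiles = statistics.quantiles(data, n=4)      # 3 cut points
deciles = statistics.quantiles(data, n=10)       # 9 cut points
percentiles = statistics.quantiles(data, n=100)  # 99 cut points

print(len(quartiles), len(deciles), len(percentiles))
print(quartiles)
```

The second quartile, the fifth decile and the 50th percentile are all the median.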

51
Q

Difference between x-bar and p-hat

A

You need to be a little cautious about assuming that particular symbols like xbar and phat will always have the same meaning, as they are just symbols. However, those two are quite common and consistent. The first is a mean which is the sum of the observations divided by the number of observations. The second is a proportion, the number of ‘successes’ divided by the number of ‘attempts’.

52
Q

p

A

p is considered to be the exact probability of an event happening on a given trial.

53
Q

Conditional probability

A

The probability of an event given that another event has already occurred, written P(A|B). E.g. if you have a bag of red and blue marbles, pulling one out (without replacing it) changes the probability of the colour of the next one.
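The marble example worked through in Python, assuming a bag of 3 red and 2 blue marbles drawn without replacement. The counts and the helper `p_next_red` are made up for illustration.

```python
from fractions import Fraction

def p_next_red(red, blue):
    """Probability that the next marble drawn is red."""
    return Fraction(red, red + blue)

red, blue = 3, 2  # hypothetical bag

p_first_red = p_next_red(red, blue)           # 3/5
p_red_given_red = p_next_red(red - 1, blue)   # a red was removed: 2/4
p_red_given_blue = p_next_red(red, blue - 1)  # a blue was removed: 3/4

print(p_first_red, p_red_given_red, p_red_given_blue)
```

The chance of red on the second draw depends on what came out first, so the two draws are not independent.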

54
Q

Contingency table

A

A table that cross-tabulates the frequencies (or proportions) of two categorical variables, so that joint, marginal and conditional probabilities can be calculated.

55
Q

Statistical independence

A

When one outcome does not affect the probability of another outcome or event, so that P(A and B) = P(A) × P(B).
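A Python sketch that checks independence from a contingency table by comparing each joint proportion with the product of its row and column proportions. The tables and the function `is_independent` are made up for illustration. Exact equality is assumed here, whereas real sample data would normally need a statistical test.

```python
from fractions import Fraction

def is_independent(table):
    """table maps (row_label, column_label) -> count.
    True if P(row and column) == P(row) * P(column) for every cell."""
    total = sum(table.values())
    rows = {r for r, _ in table}
    cols = {c for _, c in table}
    for r in rows:
        p_row = Fraction(sum(table[(r, c)] for c in cols), total)
        for c in cols:
            p_col = Fraction(sum(table[(r2, c)] for r2 in rows), total)
            if Fraction(table[(r, c)], total) != p_row * p_col:
                return False
    return True

# Hypothetical counts: group (A/B) against answer (yes/no)
independent_table = {("A", "yes"): 20, ("A", "no"): 30,
                     ("B", "yes"): 20, ("B", "no"): 30}
dependent_table = {("A", "yes"): 40, ("A", "no"): 10,
                   ("B", "yes"): 10, ("B", "no"): 40}

print(is_independent(independent_table))
print(is_independent(dependent_table))
```

In the first table, the proportion answering "yes" is the same in both groups. In the second it differs sharply between groups.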