Introduction to statistics Flashcards
What is statistics?
Exploring, analysing and summarising data. Designing methods to collect data. Drawing conclusions from data. Making decisions.
Where will we use statistics?
At university: Research. Communication. Design. Analysis of laboratory experiments. Surveys.
Career: Evaluating experimental results. Epidemiology. Pharmaceutical. Food industry. Clinical trials. Marketing studies. Sales. Data informing policies.
Where can the data come from?
Laboratory experiments.
Questionnaires.
Observations.
What is a variable?
A characteristic of interest that is measured or observed, or a factor used to group data. Examples: height, cholesterol level, colour.
What can data be?
Numerical = measurements. Categorical = groups.
Are the variables numerical or categorical? Height of males. Cakes produced. Gender. Voting. Education. Cholesterol levels. Salt concentration.
N. N. C. C. C. N. N.
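When is a variable continuous and when is it discrete?
A variable is continuous when it is a measurement that can take any value within a range. It is discrete when it is a count that can take only whole-number values.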
Is the variable discrete or continuous?
Weight. Participants. Height. Blood cholesterol concentration. Cell count. Enzyme activity. Live births. Reaction time (msecs).
C. D. C. C. D. C. D. C.
When is a variable nominal and when is it ordinal?
When the categories have a natural order, the variable is ordinal. When they have no natural order, it is nominal.
Is the variable ordinal or nominal?
Gender. Army rank. Favourite restaurant. Voting. Education levels. Marital status. Exam grade. Council tax band.
N. O. N. N. O. N. O. O.
What do statistics summarise?
Centre.
Position.
Spread.
Shape of data.
Which are the measures that characterise the centre of a dataset?
Mean.
Mode.
Median.
What is the mode?
The most frequently occurring value in the data set.
What is a modal class in a histogram?
The class (range of values) with the highest frequency.
How many modal classes can we have in a histogram?
Usually one (unimodal) or two (bimodal); more than two are possible (multimodal).
How can we calculate the mean?
Sum of all values/number of values (n).
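The formula above can be sketched in Python. The values are hypothetical and chosen only for illustration; `statistics.mean` from the standard library is used as a cross-check.

```python
import statistics

# Hypothetical data, for illustration only
values = [4, 8, 6, 2]

n = len(values)            # number of values
mean = sum(values) / n     # sum of all values divided by n

# Cross-check against the standard library
assert mean == statistics.mean(values)
print(mean)
```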
How can we calculate the median?
Put the values in order.
Find the value in the middle.
Formally: position = (n + 1)/2 in the ordered data set.
The median is the value at that position. If the position ends in .5, take the average of the two values on either side.
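A minimal Python sketch of the position rule (n + 1)/2. The function name `median_by_position` and the example values are hypothetical, and the sketch assumes that when the position ends in .5 the two neighbouring values are averaged.

```python
def median_by_position(values):
    """Median via the (n + 1) / 2 position in the ordered data."""
    ordered = sorted(values)           # put the values in order
    n = len(ordered)
    position = (n + 1) / 2             # 1-based position of the median
    lower = ordered[int(position) - 1] # list indices start at 0
    if position.is_integer():
        return lower                   # odd n: a single middle value
    upper = ordered[int(position)]     # even n: average the two middle values
    return (lower + upper) / 2


odd_median = median_by_position([7, 3, 9, 1, 5])   # ordered: 1 3 5 7 9
even_median = median_by_position([7, 3, 9, 1])     # ordered: 1 3 7 9
print(odd_median, even_median)
```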
For which variables can we find the mode?
For all types of variables.
For which variables can we find the median?
Only for ordinal or numerical variables.
For which variables can we find the mean?
Only for numerical variables.
How can we find the mode from an age group dataset?
Find the age group with the highest number of students, i.e. the group that occurs most often.
How can we find the mode of a gender dataset?
The value with the highest frequency, i.e. the one that occurs most often.
Which are the measures of position?
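Finding the mode of a categorical variable amounts to counting frequencies. A small Python sketch, using hypothetical responses and the standard-library `collections.Counter`; it returns every value tied for the highest frequency:

```python
from collections import Counter

# Hypothetical responses, for illustration only
genders = ["F", "M", "F", "F", "M"]

counts = Counter(genders)          # frequency of each value
highest = max(counts.values())     # the highest frequency

# Keep every value that reaches the highest frequency (handles ties)
modes = [value for value, count in counts.items() if count == highest]
print(counts, modes)
```

The same approach works for grouped data such as age groups, because it only needs the values to be countable, not ordered or numerical.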
Quartiles.
Q1: lower quartile.
Q2: median.
Q3: upper quartile.
What do quartiles do?
Divide an ordered data set into four equal parts.
Help characterise the shape of the data set.
What does the median Q2 do?
Splits the ordered data series into two halves.
What do the quartiles Q1 and Q3 do?
Each splits one half of the ordered data series in two: Q1 splits the lower half and Q3 splits the upper half.
What is Q1?
The lower (first) quartile.
25% of values lie below Q1.
What is Q2?
The median (second quartile).
50% of values lie below Q2.
What is Q3?
The upper (third) quartile.
75% of values lie below Q3.
Which are the quartile positions?
0.25 x (n+1): lower quartile.
0.5 x (n+1): median.
0.75 x (n+1): upper quartile.
What do the formulae of quartiles calculate?
The position of the value in an ordered data series.
Data:
60 70 82 90 68 68 76 76 62 74 76 70 80 62 78 76 68 60 74 60 80
Work out quartiles.
Order data:
60 60 60 62 62 68 68 68 70 70 74 74 76 76 76 76 78 80 80 82 90
n=21
n+1 = 22
Q1 position= 0.25x(n+1) =0.25 x 22=5.5
Q1=(62+68)/2=65
Q2 position= 0.5x(n+1) = 0.50 x 22=11
Q2=74
Q3 position= 0.75x(n+1)=0.75 x 22=16.5
Q3=(76+78)/2=77
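The worked example can be cross-checked in Python. This sketch assumes Python 3.8 or later, where `statistics.quantiles` is available; its default `method="exclusive"` interpolates at positions based on n + 1, so it should agree with the hand calculation here. Other software may use different quartile conventions and give slightly different values.

```python
import statistics

data = [60, 70, 82, 90, 68, 68, 76, 76, 62, 74, 76,
        70, 80, 62, 78, 76, 68, 60, 74, 60, 80]

n = len(data)  # 21 values

# Positions in the ordered series: 0.25, 0.5 and 0.75 times (n + 1)
positions = [p * (n + 1) for p in (0.25, 0.5, 0.75)]

# n=4 asks for the three cut points that divide the data into four parts;
# the data are sorted internally, so no manual ordering is needed
q1, q2, q3 = statistics.quantiles(data, n=4)

print(positions)
print(q1, q2, q3)
```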
How does SPSS output present the quartiles?
As the 25th, 50th and 75th percentiles.
What if the position we find for Q1 ends in .25?
Q1 = value + (next value - value)/4, where "value" is the data value at the whole-number part of the position.
What if the position we find for Q3 ends in .75?
Q3 = value + 3 x (next value - value)/4, i.e. three quarters of the way from the value to the next value.
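Fractional positions can be handled with one general rule. The sketch below assumes linear interpolation between the two neighbouring values, so a position ending in .25 lies a quarter of the way to the next value and one ending in .75 lies three quarters of the way. The helper `value_at_position` and the data are hypothetical.

```python
import math


def value_at_position(ordered, position):
    """Value at a 1-based, possibly fractional, position in ordered data.

    Assumes linear interpolation between the two neighbouring values.
    """
    whole = math.floor(position)       # whole-number part of the position
    fraction = position - whole        # e.g. 0.25, 0.5 or 0.75
    value = ordered[whole - 1]         # list indices start at 0
    if fraction == 0:
        return value
    next_value = ordered[whole]
    return value + fraction * (next_value - value)


# Hypothetical ordered data with n = 8, so n + 1 = 9
ordered = [2, 4, 8, 10, 12, 16, 20, 22]
n = len(ordered)

q1 = value_at_position(ordered, 0.25 * (n + 1))  # position 2.25
q3 = value_at_position(ordered, 0.75 * (n + 1))  # position 6.75
print(q1, q3)
```

Here Q1 is a quarter of the way from 4 to 8 and Q3 is three quarters of the way from 16 to 20.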