Lecture 2: descriptive statistics Flashcards

1
Q

What are the goals of analysis?

A

1- To summarise data from a sample included in an experiment or observational study
2- To test hypotheses and make inferences about the larger population from which the sample was drawn

2
Q

What are the types of statistical analysis?

A

Descriptive statistics

Inferential statistics

3
Q

What is descriptive statistics?

A

Methods used to summarise or describe the main features of a collection of data

Describe the characteristics of a sample

4
Q

What is Inferential statistics?

A

Methods used to make inferences from the sample to the larger population

5
Q

What are the methods of Descriptive statistics?

A

Graphical techniques- Diagrams : Histograms, box-and-whisker plots, scatterplots, bar charts, pie charts

Numerical techniques- Summary Statistics: Mean, standard deviation, range, median, inter-quartile range (IQR), mode, frequencies, percentages (incl. incidence, prevalence, risk, odds)

6
Q

What type of Diagrams are used for Numerical data in Descriptive Statistics?

A

–Histogram
–Box-and-whisker plots (boxplots) for comparison by a categorical variable (e.g. sex)
–Scatterplots – relationship between two interval variables

7
Q

What type of Diagrams are used for Categorical data in Descriptive Statistics?

A

–Bar charts, pie charts

–Clustered or stacked bar charts for comparison by a second categorical variable

8
Q

What type of Diagrams are used for Numerical data?

A
  • Histogram (for continuous data)
  • Box-and-whisker plots (boxplots)
  • Scatterplots
9
Q

What are the characteristics of a Normal distribution?

A

–Symmetrical or bell-shaped

–Exactly half of the values are to the left of the center and the other half to the right

10
Q

What are the characteristics of a Skewed distribution?

A

–Asymmetric distribution
–Right or positive skew – extreme values to the right
–Left or negative skew – extreme values to the left

11
Q

What is the 5-number summary used in a box-and-whisker plot?

A
Minimum = Min
1st Quartile= Q1
Median= Q2
3rd Quartile= Q3
Maximum = Max
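
As a rough illustration of the 5-number summary, the sketch below computes it for a small made-up sample. The data and the helper name `five_number_summary` are hypothetical, and the quartiles use the "inclusive" method of Python's `statistics.quantiles`, which is only one of several conventions; other software may give slightly different Q1 and Q3 values.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers.

    Quartiles use the 'inclusive' convention (an assumption; other
    quartile definitions exist and can give slightly different values).
    """
    data = sorted(values)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, q2, q3, data[-1]


# Hypothetical sample of 9 measurements
sample = [2, 4, 4, 5, 7, 8, 9, 10, 12]
minimum, q1, median, q3, maximum = five_number_summary(sample)
print(minimum, q1, median, q3, maximum)
```

A boxplot draws the box from Q1 to Q3 with a line at the median; how far the whiskers extend depends on the plotting tool.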
12
Q

Boxplots are useful for?

A

Comparing groups

13
Q

Scatter plots are useful for?

A

Showing correlation

14
Q

What are the Diagrams used for Categorical data (and Quantitative discrete)?

A
  • Bar charts
  • Clustered or stacked bar charts
  • Pie charts
15
Q

What is the simplest way to present data?

A

By using Frequencies (counts) or percentages
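
A minimal sketch of how frequencies and percentages could be tabulated, assuming a made-up list of blood groups; the helper name `frequency_table` is hypothetical and not part of the lecture.

```python
from collections import Counter


def frequency_table(values):
    """Map each category to (count, percentage of the total)."""
    counts = Counter(values)
    total = len(values)
    return {category: (n, 100 * n / total) for category, n in counts.items()}


# Hypothetical sample: blood groups of 10 patients
blood_groups = ["A", "O", "O", "B", "A", "O", "AB", "O", "A", "O"]
table = frequency_table(blood_groups)

for group, (count, pct) in sorted(table.items()):
    print(f"{group}: {count} ({pct:.0f}%)")
```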

16
Q

What are the different ways you can display frequencies and percentages?

A

Table
Bar chart
Frequency distribution
Pie chart

17
Q

What are the two preferred methods for numerical summaries in Descriptive Statistics?

A
  • Measures of central tendency
  • Measures of dispersion/spread

18
Q

What are Measures of central tendency?

A

Also known as averages

Used to identify the “centre” around which data are distributed.
–Mean: arithmetic average
–Median: middle value of a data set
–Mode: most frequently occurring value

19
Q

What is the Mean?

A

Arithmetic average

Mean = sum of data points / number of data points

20
Q

What is the Median?

A

Middle value of a data set

Divides the data into 2 equal sets

  • If there is an odd # of elements, median is the middle number
  • If there is an even # of elements, median is the average of 2 middle numbers
21
Q

What is the Mode?

A

Most frequently occurring value
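
The three averages can be sketched in a few lines of Python. The samples and function names below are made up for illustration; the median follows the odd/even rule from the median card, and the mode returns the first of any tied values, which is a simplifying assumption.

```python
from collections import Counter


def mean(values):
    """Arithmetic average: sum of data points / number of data points."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the ordered data (average of the 2 middle values if even)."""
    data = sorted(values)
    n = len(data)
    mid = n // 2
    if n % 2 == 1:
        return data[mid]
    return (data[mid - 1] + data[mid]) / 2


def mode(values):
    """Most frequently occurring value (first one encountered if there is a tie)."""
    return Counter(values).most_common(1)[0][0]


odd_sample = [3, 5, 5, 8, 9]        # 5 elements
even_sample = [3, 5, 5, 8, 9, 12]   # 6 elements

print(mean(odd_sample), median(odd_sample), mode(odd_sample))
print(median(even_sample))
```

Python's standard `statistics` module provides `mean`, `median` and `mode` functions that could be used instead of these hand-written versions.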

22
Q

What do numerical descriptive statistics measure?

A

Measures of central tendency
–Mean
–Median
–Mode

23
Q

The choice of summary measure is determined by?

A

The distribution of the data

24
Q

In a symmetric distribution, mean and median?

A

Are the same

25
If median and mean are different, this indicates that?
The data are Skewed
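
A small sketch of this idea with made-up data: an extreme value on the right pulls the mean above the median. The helper `skew_direction` is hypothetical, and comparing mean and median is only a rule of thumb, so a histogram is still worth checking.

```python
import statistics


def skew_direction(values):
    """Rule-of-thumb skew check based on comparing the mean with the median."""
    m = statistics.mean(values)
    med = statistics.median(values)
    if m > med:
        return "right"   # mean pulled up by extreme high values
    if m < med:
        return "left"    # mean pulled down by extreme low values
    return "none"        # exact equality is rare in real data


symmetric = [4, 5, 6, 7, 8]
right_skewed = [4, 5, 6, 7, 40]   # one extreme value on the right
left_skewed = [1, 30, 31, 32, 33]  # one extreme value on the left

for name, data in [("symmetric", symmetric),
                   ("right_skewed", right_skewed),
                   ("left_skewed", left_skewed)]:
    print(name, statistics.mean(data), statistics.median(data), skew_direction(data))
```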
26
What are Measures of variability/dispersion?
The spread of the distribution - how widely the observations are spread out around the measure of central tendency
27
What are the commonly used measures of dispersion to indicate how spread-out the data is?
–Range (min, max)
–Interquartile range, IQR (between the 25th and 75th percentiles)
–Standard deviation, SD (measure of variability around the mean)
28
What is the Range?
The difference between the highest and lowest values: Range = Max - Min
29
What is the disadvantage of the range?
Not very representative: it uses only the two most extreme values, so a single outlier can change it a lot
30
What is the interquartile range (IQR)?
The quartiles split the ordered data into 4 equal parts; the IQR measures the range covered by the middle 50% of the distribution. IQR = Q3 - Q1
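
To illustrate the difference between the range and the interquartile range, the sketch below computes both for a made-up sample, with and without one extreme value. The function names are hypothetical, and the quartiles use the "inclusive" method of `statistics.quantiles`, which is an assumption; other quartile conventions can give slightly different results.

```python
import statistics


def value_range(values):
    """Range = max - min."""
    return max(values) - min(values)


def interquartile_range(values):
    """Q3 - Q1, using the 'inclusive' quartile convention (an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


sample = [2, 4, 4, 5, 7, 8, 9, 10, 12]
with_outlier = [2, 4, 4, 5, 7, 8, 9, 10, 60]  # same data, one extreme value

print(value_range(sample), interquartile_range(sample))
print(value_range(with_outlier), interquartile_range(with_outlier))
```

In this example the outlier changes the range a great deal while the quartile-based spread stays the same, which is one reason the range alone can be unrepresentative.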
31
What is the standard deviation?
Roughly, the average distance of the data points from the sample mean (calculated as the square root of the average squared difference from the mean)
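
A minimal sketch of how the sample standard deviation is usually computed, assuming the common n - 1 divisor and a made-up data set. The mean absolute difference is shown alongside for comparison, since the two are related but not identical; the function names are hypothetical.

```python
import math
import statistics


def sample_sd(values):
    """Square root of the sum of squared deviations divided by n - 1."""
    n = len(values)
    m = sum(values) / n
    squared_deviations = sum((x - m) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


def mean_absolute_deviation(values):
    """Plain average of the absolute differences from the mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample, mean = 5

print(sample_sd(data))                 # agrees with statistics.stdev(data)
print(mean_absolute_deviation(data))   # a different, smaller number here
```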
32
What is the Empirical rule?
For data with a symmetric, bell-shaped distribution, the standard deviation has the following characteristics:
•About 68% of sample data fall within ± 1 SD of the mean
•About 95% of sample data fall within ± 2 SD of the mean
33
For data with symmetric shape, the standard deviation has the following characteristics?
•About 68% of sample data fall within ± 1 SD of the mean
•About 95% of sample data fall within ± 2 SD of the mean
34
Empirical rule example
•At a 1-year review weight-loss meeting, the mean weight lost by patients was 10 kg
•The standard deviation of the group was calculated to be 2.5 kg
Therefore:
± 1 SD = 10 kg ± 2.5 kg = 7.5 kg to 12.5 kg, so we can state that about 68% of patients lost between 7.5 kg and 12.5 kg
± 2 SD = 10 kg ± 2(2.5 kg) = 5 kg to 15 kg, so we can state that about 95% of patients lost between 5 kg and 15 kg
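
The arithmetic in this example can be reproduced with a short sketch. The helper name `empirical_intervals` is hypothetical, and the percentages are approximate and assume roughly bell-shaped data.

```python
def empirical_intervals(mean, sd):
    """Return the mean ± 1 SD and mean ± 2 SD intervals as (low, high) pairs."""
    return {
        "about 68%": (mean - sd, mean + sd),
        "about 95%": (mean - 2 * sd, mean + 2 * sd),
    }


# Values from the weight-loss example: mean 10 kg, SD 2.5 kg
intervals = empirical_intervals(10, 2.5)
for label, (low, high) in intervals.items():
    print(f"{label} of patients lost between {low} kg and {high} kg")
```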
35
What kind of distribution is suggested when the mean and standard deviation are used as summary measures?
Normal distribution
36
What kind of distribution is suggested when the median and interquartile range are used as summary measures?
Skewed
–Left skew – extreme values on the left
–Right skew – extreme values on the right