Descriptive Statistics Flashcards

1
Q

Which comes first? Descriptive or Inferential?

A
  1. Descriptive
  2. Inferential
2
Q

Descriptive Statistics?

A

Descriptive statistics is a means of describing features of a data set by generating summaries about data samples.

3
Q

Inferential Statistics?

A

Inferential statistics use measurements from the sample of subjects in the experiment to compare the treatment groups and make generalizations about the larger population of subjects.

4
Q

What are the measures of central tendency?

A
  1. Arithmetic Mean
  2. Median
  3. Mode
5
Q

Arithmetic Mean?

A

The average value: the sum of all values divided by the number of values.
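
As a small illustration, the arithmetic mean can be computed by hand or with Python's standard `statistics` module. The `scores` list is made up for this sketch.

```python
from statistics import mean

scores = [2, 4, 4, 5, 10]  # hypothetical sample values

# Sum of the values divided by how many values there are.
manual_mean = sum(scores) / len(scores)

# The standard library gives the same result.
library_mean = mean(scores)

print(manual_mean, library_mean)
```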

6
Q

What is arithmetic mean suitable for?

A

Suitable for symmetrical distributions

7
Q

Symmetrical graph shape and distribution?

A

Bell-shaped & Normal Distribution

8
Q

Asymmetrical graph shape and distribution?

A

A skewed distribution, for example one skewed to the right (positively skewed), with a long tail toward higher values.

9
Q

What affects the mean of asymmetrical graphs?

A

Outliers: extreme values pull the mean toward the long tail of the distribution.

10
Q

Median?

A

The median is the value in the middle of a data set, meaning that 50% of data points have a value smaller or equal to the median and 50% of data points have a value higher or equal to the median.

11
Q

Why is median not affected by outliers?

A

The median is a robust measure: it depends only on the middle of the ordered data, so extreme values have little or no effect on it.
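
A minimal sketch of this robustness, using made-up numbers: adding one extreme value moves the mean a long way, while the median barely changes. The `incomes` data is hypothetical.

```python
from statistics import mean, median

incomes = [30, 32, 35, 36, 37]   # hypothetical values without outliers
with_outlier = incomes + [400]   # the same data plus one extreme value

# Without the outlier, mean and median are close together.
print(mean(incomes), median(incomes))

# With the outlier, the mean jumps but the median hardly moves.
print(mean(with_outlier), median(with_outlier))
```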

12
Q

What is median ideal for?

A

Asymmetrical Distribution

13
Q

Why is median ideal for asymmetrical data?

A

It is a robust measure and is not affected by outliers.

14
Q

Measures of central tendency?

A

Central tendency is a descriptive summary of a dataset through a single value that reflects the center of the data distribution. Along with variability (dispersion), it is one of the main topics of descriptive statistics and one of the most fundamental concepts in statistics.

15
Q

Which measure of central tendency do you use for a/symmetrical data?

A

Symmetrical data: arithmetic mean

Asymmetrical data: median

16
Q

Mode?

A

Most frequent value
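
For illustration, the mode can be found with Python's standard `statistics` module. The data below is made up; `multimode` (Python 3.8+) is used to show that a dataset can have more than one most frequent value.

```python
from statistics import mode, multimode

colors = ["red", "blue", "red", "green", "blue", "red"]  # hypothetical data

# The single most frequent value; works for non-numeric data too.
most_common_color = mode(colors)

# When several values tie for most frequent, multimode returns all of them.
tied_modes = multimode([1, 1, 2, 2, 3])

print(most_common_color, tied_modes)
```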

17
Q

Why is mode not affected by outliers?

A

It is a robust measure: it depends only on how often each value occurs, not on how large the values are.

18
Q

Robust measurement?

A

Robust measures of scale are methods that quantify the statistical dispersion in a sample of numerical data while resisting outliers. The most common such robust statistics are the interquartile range (IQR) and the median absolute deviation (MAD).
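
As a rough sketch of one of these robust measures, the median absolute deviation can be computed as the median of the absolute distances from the median. The function name and data are illustrative assumptions, and this is the unscaled form (no normal-consistency constant applied).

```python
from statistics import median

def median_absolute_deviation(values):
    """Median of the absolute deviations from the median (unscaled)."""
    center = median(values)
    deviations = [abs(v - center) for v in values]
    return median(deviations)

data = [1, 2, 3, 4, 100]  # hypothetical sample with one outlier

# The outlier (100) has almost no influence on the result.
mad = median_absolute_deviation(data)
print(mad)
```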

19
Q

How can we know that a graph is symmetrically distributed?

A

The measures of central tendency (mean, median, mode) will be close together.

In an asymmetrical distribution, the measures of central tendency are spread apart.
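
One way to see this, sketched with two made-up datasets: compute all three measures and compare them. The helper name `central_tendency` is an assumption for this example only.

```python
from statistics import mean, median, mode

def central_tendency(values):
    """Return (mean, median, mode) for a list of numbers."""
    return mean(values), median(values), mode(values)

symmetric = [1, 2, 3, 3, 3, 4, 5]   # hypothetical symmetric data
skewed = [1, 1, 1, 2, 3, 4, 30]     # hypothetical right-skewed data

# Symmetric data: all three measures coincide.
print(central_tendency(symmetric))

# Right-skewed data: mean > median > mode.
print(central_tendency(skewed))
```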

20
Q

Measures of spread?

A
  1. Variance and standard deviation
  2. Range
  3. Interquartile Range
21
Q

Variance?

A

Average squared distance from the mean

Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers is spread out from their average value.
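
A minimal sketch of "average squared distance from the mean", using made-up data. It assumes the population form (dividing by n); the sample variance divides by n - 1 instead and is available as `statistics.variance`.

```python
from statistics import mean, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values

center = mean(data)

# Average of the squared distances from the mean (population variance).
manual_variance = sum((x - center) ** 2 for x in data) / len(data)

# The standard library's population variance agrees.
library_variance = pvariance(data)

print(manual_variance, library_variance)
```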

22
Q

Why don’t we use variance?

A

Variance is expressed in squared units, so it is not easy to interpret or apply.

23
Q

Standard Deviation?

A

Square root of the variance

The standard deviation is the average amount of variability in your data set. It tells you, on average, how far each score lies from the mean.
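
To illustrate the relationship, the sketch below takes the square root of the population variance and compares it with `statistics.pstdev`. The data is made up, and the population form (dividing by n) is an assumption of this example.

```python
import math
from statistics import pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values

# Standard deviation as the square root of the variance.
sd_from_variance = math.sqrt(pvariance(data))

# The library function computes the same quantity directly.
sd_direct = pstdev(data)

print(sd_from_variance, sd_direct)
```

Unlike the variance, the result is in the same units as the original data.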

24
Q

Why is standard deviation ideal to use?

A

-Expressed in the same units as the data, not squared units
-Affected by outliers (a limitation to keep in mind)

25
Q

Which kind of distribution do we use variance and standard deviation for?

A

We use the mean together with the standard deviation for symmetrical distributions.

We do not use them for asymmetrical distributions.

26
Q

When are statistics stronger for variance and standard deviation?

A

When the standard deviation is low (the values cluster closely around the mean)

27
Q

What is standard deviation a good descriptor of?

A

A good descriptor of spread for normal (symmetrical) distributions

28
Q

Range?

A

The range is the difference between the maximum and minimum values.

29
Q

Range equation?

A

Range = maximum value - minimum value
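
The equation translates directly into code. The values below are made up for this sketch.

```python
values = [3, 7, 8, 12, 15]  # hypothetical sample

# Range = maximum value - minimum value
data_range = max(values) - min(values)

print(data_range)
```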

30
Q

Range or interquartile range: which is preferred?

A

Interquartile range

31
Q

Interquartile range?

A

-The difference between the 75th and 25th percentiles (Q3 - Q1)
-Covers the middle 50% of values
-A measure of statistical dispersion, i.e. the spread of the data
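
A rough sketch of the calculation using `statistics.quantiles` (Python 3.8+). The data is made up, and quartile conventions differ between textbooks and software, so other tools may give slightly different values for the same data.

```python
from statistics import quantiles

values = list(range(1, 12))  # hypothetical sample: 1, 2, ..., 11

# n=4 splits the data into quartiles and returns the three cut points.
# The default method is "exclusive".
q1, q2, q3 = quantiles(values, n=4)

# IQR = 75th percentile - 25th percentile
iqr = q3 - q1

print(q1, q3, iqr)
```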

32
Q

Nominal Data:

-Mode
-Median, IQR, Range
-Mean, SD

A

Mode: Yes
Median: No (can’t establish a logical order)
Mean, SD: No

33
Q

Ordinal Data:

-Mode
-Median, IQR, Range
-Mean, SD

A

Mode: Yes
Median: Yes (data has order)
Mean, SD: No

34
Q

Interval Data:

-Mode
-Median, IQR, Range
-Mean, SD

A

Mode: Yes
Median: Yes
Mean, SD: Yes

35
Q

Ratio Data:

-Mode
-Median, IQR, Range
-Mean, SD

A

Mode: Yes
Median: Yes
Mean, SD: Yes
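
The four cards on nominal, ordinal, interval, and ratio data can be summarised as a small lookup table. This is only a sketch: the dictionary layout and the `is_meaningful` helper are illustrative choices, not a standard API.

```python
# Which summary statistics are meaningful at each level of measurement,
# transcribed from the four cards above.
ALLOWED_STATISTICS = {
    "nominal": {"mode"},
    "ordinal": {"mode", "median"},
    "interval": {"mode", "median", "mean", "sd"},
    "ratio": {"mode", "median", "mean", "sd"},
}

def is_meaningful(level, statistic):
    """Return True if the statistic makes sense for that level of data."""
    return statistic in ALLOWED_STATISTICS[level]

print(is_meaningful("nominal", "median"))   # no logical order to rank by
print(is_meaningful("interval", "mean"))
```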

36
Q

Why do we prefer to use IQR more than range?

A

When measuring variability, statisticians prefer using the interquartile range instead of the full data range because extreme values and outliers affect it less. Typically, use the IQR with a measure of central tendency, such as the median, to understand your data’s center and spread.
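
A small sketch of this effect, with made-up data: replacing one value with an extreme outlier inflates the range but leaves the IQR unchanged. The `spread` helper is an assumption for this example, and the quartiles use the default convention of `statistics.quantiles`.

```python
from statistics import quantiles

def spread(values):
    """Return (range, IQR) for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4)
    return max(values) - min(values), q3 - q1

clean = list(range(1, 12))           # hypothetical data: 1, 2, ..., 11
with_outlier = clean[:-1] + [1000]   # largest value replaced by an outlier

# The range grows from 10 to 999, while the IQR stays at 6.0.
print(spread(clean))
print(spread(with_outlier))
```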