Statistics (2) Flashcards

1
Q

What is the purpose of descriptive statistics?

A

To explore and compare data meaningfully, assess major differences, determine the shape of the data distribution, check for missing or unusual data, gauge the amount of noise, and verify that the data are fit for further testing.

Descriptive statistics provides a summary of the data but does not allow for objective decisions regarding hypotheses.

2
Q

What are some key functions of descriptive statistics?

A
  • Explore and compare data meaningfully
  • Assess major differences between conditions/variables
  • Determine the shape of data distributions
  • Check for missing data or outliers
  • See the amount of noise in the data
  • Verify data fit for further statistical testing

These functions help in understanding the basic characteristics of the data before applying inferential statistics.

3
Q

True or False: Descriptive statistics can help us make objective decisions about our alternative hypothesis.

A

False

For objective decisions regarding the alternative hypothesis, inferential statistics is required.

4
Q

Fill in the blank: Descriptive statistics allows us to check for _______ or unusual data.

A

missing data

Identifying missing data and outliers is critical for ensuring the integrity of data analysis.

5
Q

What is needed to arrive at an objective decision about the alternative hypothesis?

A

Inferential statistics

Inferential statistics allows researchers to make predictions or inferences about a larger population based on sample data.

6
Q

What are descriptive statistics used for?

A

Descriptive statistics allow us to:
* Look at measures of central tendency, dispersion, and variation
* Organise and aggregate or disaggregate data in a meaningful way
* Get a ‘feel’ for any relevant patterns
* Present data graphically or in a tabular format

Descriptive statistics summarize data without making inferences about a larger population.

7
Q

What do inferential statistics allow us to do?

A

Inferential statistics allow us to:
* Test hypotheses about distributions
* Determine whether differences or relationships are statistically meaningful
* Express whether we can retain or reject the null hypothesis

Inferential statistics make predictions or generalizations about a population based on a sample.

8
Q

Fill in the blank: Descriptive statistics focus on analyzing _______ data.

A

observed

9
Q

Fill in the blank: Inferential statistics are used to determine if differences or relationships are statistically _______.

A

meaningful

10
Q

True or False: Descriptive statistics can present data graphically.

A

True

11
Q

True or False: Inferential statistics provide a summary of data without making predictions.

A

False

12
Q

What is the normal curve also known as?

A

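The normal distribution (also called the Gaussian distribution or bell curve)

The standard normal distribution is the special case with a mean of 0 and a standard deviation of 1.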

13
Q

What are the important measures that tend to be in the center of the distribution?

A

Mean, median, and mode

14
Q

What do the mean, median, and mode represent in a distribution?

A

Numbers that are representative of the distribution as a whole

15
Q

Fill in the blank: The mean, median, and mode are measures of _______.

A

Central tendency

16
Q

True or False: The mode is the measure that represents the least common value in a distribution.

A

False

17
Q

What is the significance of the center of the distribution?

A

It is where the measures of central tendency (mean, median, and mode) tend to be found

18
Q

What is the mean in statistics?

A

The mean is the average of a set of numbers, calculated by adding all items in a set and dividing by the number of items.

The mean is commonly used to represent general performance in statistics.

19
Q

What types of data is the mean primarily used with?

A

The mean is used mostly with interval and ratio data.

Interval data is numerical data where the difference between values is meaningful, while ratio data has a true zero point.

20
Q

How is the mean mathematically represented?

A

x̄ = (Σxi) / N

Where x̄ is the mean, Σxi is the sum of all items in the set, and N is the number of items.
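
As a minimal sketch of this formula in Python (the scores are made-up illustration values, not data from the course), the mean can be computed by hand and checked against the standard library:

```python
from statistics import mean

# Hypothetical scores, used only for illustration.
scores = [4, 8, 6, 5, 7]

# Direct translation of the formula: sum of all items divided by N.
manual_mean = sum(scores) / len(scores)

# The standard library's mean() should agree with the manual calculation.
library_mean = mean(scores)

print(manual_mean, library_mean)
```

The same code works unchanged for decimal values such as 4.5 or 7.25.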

21
Q

True or False: The mean can only be calculated for integer values.

A

False

The mean can be calculated for both integers and decimals.

22
Q

What is the median?

A

The middle value of a set of values when they are arranged from smallest to largest. With an even number of values, it is the average of the two middle values.

The median is particularly useful in non-normal distributions or with extreme scores.
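
A short Python sketch of why the median helps with extreme scores. The two lists are hypothetical and differ only in their largest value:

```python
from statistics import mean, median

# Hypothetical reaction times; the second list has one extreme score.
typical = [20, 22, 23, 25, 30]
with_extreme = [20, 22, 23, 25, 300]

# The mean is pulled strongly towards the extreme score...
mean_typical = mean(typical)
mean_extreme = mean(with_extreme)

# ...while the median (the middle value after sorting) does not move.
median_typical = median(typical)
median_extreme = median(with_extreme)

print(mean_typical, mean_extreme)
print(median_typical, median_extreme)
```

Here one extreme score more than triples the mean while the median stays the same.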

23
Q

When is the median most useful?

A

When you have a non-normal distribution, extreme scores, or ordinal data.

Ordinal data can be ranked in order, but the distances between ranks are not necessarily equal.

24
Q

What is the mode?

A

The most commonly occurring value in a set of data.

The mode is most frequently used with nominal data.
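
A small Python illustration with nominal data. The category labels are invented for the example; `mode` and `multimode` come from the standard `statistics` module (Python 3.8 or later):

```python
from statistics import mode, multimode

# Hypothetical nominal data: categories with no inherent order.
eye_colours = ["brown", "blue", "brown", "green", "brown", "blue"]

# The mode is simply the most frequent category.
most_common = mode(eye_colours)

# When several categories tie, multimode() returns all of them
# in the order they first appear.
pets = ["cat", "dog", "cat", "dog", "bird"]
tied_modes = multimode(pets)

print(most_common, tied_modes)
```

A mean or median would be meaningless for these labels, which is why the mode is the usual choice for nominal data.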

25
Q

When is the mode most frequently used?

A

With nominal data.

Nominal data is categorical data without a specific order.

26
Q

What do measures of variability or dispersion indicate?

A

They indicate how much the data vary, that is, how spread out the values are.

27
Q

What additional information can measures of variability provide?

A

They can provide insight into the amount of ‘noise’ in the data set.

28
Q

List common measures of dispersion.

A
  • Range
  • Interquartile range
  • Mean absolute deviation
  • Variance
  • Standard deviation
29
Q

True or False: The interquartile range (IQR) is a common measure of dispersion.

A

True

30
Q

Fill in the blank: Common measures of dispersion include range, interquartile range, mean absolute deviation, ______, and standard deviation.

A

variance

31
Q

What is the range in descriptive statistics?

A

The difference between the smallest and largest value in a distribution

32
Q

What is a limitation of using the range?

A

It is susceptible to extreme scores in a distribution

33
Q

What is the interquartile range (IQR)?

A

The range of the middle 50% of values, between the 25th and 75th percentile

34
Q

Why is the interquartile range (IQR) preferred over the range?

A

It is far less susceptible to extreme scores, because it ignores the lowest 25% and highest 25% of values
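
The contrast between the range and the IQR can be sketched in Python. The data are hypothetical, and the quartiles use the default method of `statistics.quantiles`; other software may use a slightly different quartile convention and report slightly different values:

```python
from statistics import quantiles

def value_range(data):
    """Difference between the largest and smallest value."""
    return max(data) - min(data)

def iqr(data):
    """Distance between the 25th and 75th percentiles."""
    q1, _q2, q3 = quantiles(data, n=4)  # three cut points: Q1, median, Q3
    return q3 - q1

# Hypothetical scores; the second list replaces 7 with an extreme score.
clean = [1, 2, 3, 4, 5, 6, 7]
with_extreme = [1, 2, 3, 4, 5, 6, 70]

print(value_range(clean), value_range(with_extreme))
print(iqr(clean), iqr(with_extreme))
```

In this sketch the single extreme score inflates the range from 6 to 69, while the IQR is unchanged.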

35
Q

How can the interquartile range (IQR) be displayed graphically?

A

On a boxplot

36
Q

Fill in the blank: The IQR is the range of values between the _______ and _______ percentile.

A

25th and 75th

37
Q

What is the absolute mean deviation?

A

A measure of how far, on average, values deviate from the mean (also known as the mean absolute deviation)

It is calculated from the difference between each value and the mean, ignoring negative signs.

38
Q

How is the absolute mean deviation calculated?

A

By working out the absolute difference between each value and the mean (ignoring negative signs), summing these differences, and dividing by N

N represents the total number of values.
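
A minimal Python sketch of this calculation, using made-up values. It also shows why the negative signs have to be ignored: the raw deviations from the mean always cancel out.

```python
# Hypothetical data set, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
x_bar = sum(data) / n  # the mean

# Raw deviations cancel out, so their sum is zero.
raw_deviations = [x - x_bar for x in data]
sum_of_raw = sum(raw_deviations)

# Taking absolute values first gives a usable measure of spread.
absolute_deviations = [abs(d) for d in raw_deviations]
mean_absolute_deviation = sum(absolute_deviations) / n

print(sum_of_raw, mean_absolute_deviation)
```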

39
Q

What is a more useful measure than absolute mean deviation?

A

Standard deviation

For normally distributed data, the standard deviation maps onto the standard normal distribution, so it can be used to assess what proportion of scores falls within a given distance of the mean.
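
One way to see the link with the standard normal distribution is to ask what proportion of a normal distribution lies within 1, 2, or 3 standard deviations of the mean. The sketch below uses `NormalDist` from Python's standard library; the helper name `proportion_within` is an illustrative choice, not something from the course:

```python
from statistics import NormalDist

# The standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
```

The results are roughly 68%, 95%, and 99.7%. They apply only when the data are approximately normal.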

40
Q

What is the relationship between standard deviation and variance?

A

The standard deviation is the square root of variance.

41
Q

What is variance?

A

Variance is a statistic that indicates the overall amount of variability in a set of data. For a sample, it is calculated by adding up the squared differences between each value and the mean and dividing the total by N - 1.

42
Q

How is variance calculated?

A

Variance is calculated by the formula:

Var(X) = (Σ(Xi - x̄)²) / (N - 1)

where Xi represents each value, x̄ is the mean, and N is the number of observations.
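
The formula can be followed step by step in Python. This is a sketch with made-up values, checked against `statistics.variance`, which also divides by N - 1:

```python
from statistics import variance

# Hypothetical data set, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
x_bar = sum(data) / n  # the mean

# Squared difference between each value and the mean.
squared_diffs = [(x - x_bar) ** 2 for x in data]

# Sample variance: sum of squared differences divided by N - 1.
sample_variance = sum(squared_diffs) / (n - 1)

print(round(sample_variance, 3), round(variance(data), 3))
```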

43
Q

What does variance indicate?

A

Variance indicates the overall amount of variability in a set of data.

44
Q

What is the relationship between variance and statistical formulas?

A

Variance is used in many statistical formulas.

45
Q

In the variance formula, what does N represent?

A

N represents the number of observations in the data set.

46
Q

In the variance formula, what does Xi represent?

A

Xi represents each individual value in the data set.

47
Q

True or False: Variance is expressed in the same units as the original data.

A

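False

Variance is expressed in squared units of the original data; the standard deviation is in the original units.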

48
Q

Fill in the blank: The formula for sample variance is Var(X) = _______.

A

(Σ(Xi - x̄)²) / (N - 1)

49
Q

Fill in the blank: The formula for population variance is Var(X) = _______.

A

(Σ(Xi - μ)²) / N
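
The difference between the two denominators is easy to see on a small, hypothetical data set. Python's `statistics` module offers both versions: `variance` divides by N - 1 and `pvariance` divides by N.

```python
from statistics import pvariance, variance

# Hypothetical values; the sum of squared differences from the mean is 10.
values = [1, 2, 3, 4, 5]

sample_var = variance(values)       # 10 / (5 - 1)
population_var = pvariance(values)  # 10 / 5

print(sample_var, population_var)
```

The sample version is always the larger of the two, and the gap shrinks as N grows.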

50
Q

What is the standard deviation?

A

A measure of variability that is easier to interpret than the variance; it is the square root of the variance.

Standard deviation (s) is measured in the original units of measurement and relates to the standard normal distribution.

51
Q

What does the standard deviation help us understand?

A

It helps us get a much better sense of the distribution of scores in our data.

52
Q

What is the formula for calculating sample standard deviation?

A

s = √(Σ(xi - x̄)² / (N - 1))
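
A minimal Python sketch of this formula, using made-up values and checked against `statistics.stdev`, which also uses N - 1:

```python
from math import sqrt
from statistics import stdev

# Hypothetical data set, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
x_bar = sum(data) / n  # the mean

# Square root of the sample variance.
s = sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))

print(round(s, 3), round(stdev(data), 3))
```

Unlike the variance, the result is in the same units as the original scores.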

53
Q

What does ‘N’ represent in the standard deviation formula?

A

The number of observations in the sample.

54
Q

True or False: The standard deviation is only applicable to population data.

A

False

55
Q

Fill in the blank: The standard deviation relates to the _______ distribution.

A

standard normal

56
Q

What is the relationship between variance and standard deviation?

A

Standard deviation is the square root of variance.

57
Q

What is the standard error of the mean (SE)?

A

A measure of the error in estimating the population mean, especially with small sample sizes

The SE reflects the standard deviation of the sampling distribution of the sample mean, that is, how much sample means would vary across repeated samples.

58
Q

Why is the standard error important?

A

It assesses how much sample means vary from one sample to another, and so indicates the uncertainty in our estimate of the population mean

The sample mean is usually not exactly the same as the population mean.

59
Q

How is the standard error calculated?

A

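Conceptually, by imagining repeated re-sampling, where each new sample of scores gives a slightly different mean. In practice it is estimated from a single sample as the standard deviation divided by the square root of N: SE = s / √N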

This method allows for an assessment of the degree of error.
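
The imagined re-sampling can be simulated. The sketch below is an illustration under stated assumptions: the population (normal, mean 100, SD 15), the sample size, and the number of repetitions are all invented, and the single-sample estimate uses the standard formula SE = s / √N, which the cards themselves do not spell out.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population and sample size (assumptions for illustration).
POP_MEAN, POP_SD, N = 100, 15, 25

def draw_sample():
    """Draw one random sample of N scores from the population."""
    return [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]

# Estimate from a single sample: SE = s / sqrt(N).
one_sample = draw_sample()
se_estimate = stdev(one_sample) / sqrt(N)

# The "imagined re-sampling": draw many samples and measure
# how much their means vary.
sample_means = [mean(draw_sample()) for _ in range(2000)]
se_empirical = stdev(sample_means)

# Both should be in the region of 15 / sqrt(25) = 3.
print(round(se_estimate, 2), round(se_empirical, 2))
```

The spread of the simulated sample means is close to the theoretical value, while the single-sample estimate is rougher because it rests on only 25 scores.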

60
Q

Fill in the blank: The standard error is a measure of the _______ in estimating the population mean.

A

error

61
Q

True or False: The standard error indicates the degree of certainty around knowing the population mean.

A

False

The standard error indicates the degree of uncertainty.

67
Q

What are the two main measures of the shape of a distribution?

A

Height and breadth

These measures help to describe how data is distributed.

68
Q

What term refers to the height and breadth of a distribution's shape?

A

Kurtosis

Distributions can vary from tall and thin (leptokurtic) to short and wide (platykurtic).

69
Q

What do we refer to as the degree of asymmetry in a distribution?

A

Skew

Skew can vary in severity and can be positive, negative, or zero.

70
Q

What does a positive skewness value indicate?

A

Positive skew

A positive skew means that the tail on the right side of the distribution is longer or fatter.

71
Q

What does a negative skewness value indicate?

A

Negative skew

A negative skew means that the tail on the left side of the distribution is longer or fatter.

72
Q

What skewness value indicates a symmetrical distribution?

A

0

A skewness value of 0 indicates no asymmetry in the distribution.
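
The sign of the skewness value can be checked with a short Python sketch. It uses the simple moment-based coefficient (third central moment divided by the variance raised to the power 1.5); statistical packages often report a bias-adjusted variant, so their numbers may differ slightly. The data sets are hypothetical.

```python
def skewness(data):
    """Moment-based skewness: m3 / m2 ** 1.5 (no bias adjustment)."""
    n = len(data)
    x_bar = sum(data) / n
    m2 = sum((x - x_bar) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - x_bar) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]      # mirror image around the mean
right_tail = [1, 2, 2, 3, 10]    # long tail on the right
left_tail = [1, 8, 9, 9, 10]     # long tail on the left

print(skewness(symmetric), skewness(right_tail), skewness(left_tail))
```

The symmetric list gives 0, the list with a long right tail gives a positive value, and the list with a long left tail gives a negative value.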

73
Q

Fill in the blank: The shape of a distribution can vary from _______ to _______.

A

tall and thin, short and wide

This variation is captured by the concept of kurtosis.
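
A rough Python sketch of how kurtosis can be quantified. It uses the moment-based excess kurtosis (fourth central moment divided by the squared variance, minus 3, so that a normal distribution scores about 0); packages often apply a bias adjustment, so reported values may differ. The two data sets are invented to contrast a flat shape with a peaked one.

```python
def excess_kurtosis(data):
    """Moment-based excess kurtosis: m4 / m2 ** 2 - 3 (no bias adjustment)."""
    n = len(data)
    x_bar = sum(data) / n
    m2 = sum((x - x_bar) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - x_bar) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

flat = [1, 2, 3, 4, 5]              # values spread evenly: short and wide
peaked = [1, 3, 3, 3, 3, 3, 5]      # most values at the centre: tall and thin

print(round(excess_kurtosis(flat), 2), round(excess_kurtosis(peaked), 2))
```

Negative values indicate a flatter shape than the normal curve and positive values a more peaked, heavier-tailed one.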

74
Q

What can extreme values in a distribution of data do to measures of central tendency?

A

Skew them positively or negatively.

75
Q

What is the purpose of the Shapiro-Wilk test in JAMOVI?

A

To test for normality in a data distribution.

76
Q

What is one visual method to identify extreme values in data?

A

A boxplot.
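
Boxplots conventionally flag points that lie more than 1.5 × IQR beyond the quartiles. That rule is a common convention rather than something stated in these cards, and the data and function name below are hypothetical. A Python sketch of the same check without drawing the plot:

```python
from statistics import quantiles

def boxplot_outliers(data, k=1.5):
    """Return values outside the conventional boxplot fences."""
    q1, _median, q3 = quantiles(data, n=4)  # default quartile method
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]

# Hypothetical scores with one extreme value.
scores = [12, 14, 15, 15, 16, 17, 18, 19, 45]

print(boxplot_outliers(scores))
```

With these scores the fences fall at 8.5 and 24.5, so only the value 45 is flagged.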

77
Q

Fill in the blank: Extreme values in a distribution of data can _______ the measures of central tendency.

A

skew

78
Q

True or False: The Shapiro-Wilk test is a method for testing the presence of extreme values in a dataset.

A

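False

The Shapiro-Wilk test assesses normality; extreme values are better identified visually, for example with a boxplot.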