Statistics Flashcards

Question 1

Q

Which measure of central tendency represents the most frequently occurring value in a dataset?

A) Mean
B) Median
C) Mode
D) Range

Question 2

Q

Which measure of central tendency is best used when the dataset contains outliers?

A) Mean
B) Median
C) Mode
D) Range

Answer

A

B) Median
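
A minimal sketch of why the median is preferred here, using made-up income figures (the values and variable names are illustrative assumptions, not part of the deck). Adding one extreme value moves the mean a long way but barely moves the median.

```python
import statistics

# Hypothetical incomes (in thousands); the appended value is an extreme outlier.
typical = [30, 32, 35, 38, 40]
with_outlier = typical + [1000]

mean_before = statistics.mean(typical)          # 35
median_before = statistics.median(typical)      # 35
mean_after = statistics.mean(with_outlier)      # about 195.8
median_after = statistics.median(with_outlier)  # 36.5

print("without outlier:", mean_before, median_before)
print("with outlier:   ", mean_after, median_after)
```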

Question 3

Q

The difference between the highest and lowest observations in a dataset

Question 4

Q

True or False

The median is less affected by outliers than the mean, making it a better measure of central tendency when outliers are present.

Question 5

Q

Which measure of variability indicates the average distance of each data point from the mean?

A) Range
B) Interquartile range
C) Variance
D) Standard deviation

Answer

A

D) Standard deviation
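
As a rough illustration with made-up data, the sketch below computes the variance and standard deviation using Python's standard `statistics` module. Strictly speaking, the standard deviation is the square root of the average squared distance from the mean, so it is usually close to, but not the same as, the plain average of absolute distances, which is shown for contrast.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values; their mean is 5

mean = statistics.mean(data)
variance = statistics.pvariance(data)  # average squared distance from the mean: 4
std_dev = statistics.pstdev(data)      # square root of the variance: 2.0

# For contrast: the plain average of absolute distances from the mean.
mean_abs_dev = sum(abs(x - mean) for x in data) / len(data)  # 1.5

print(variance, std_dev, mean_abs_dev)
```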

Question 6

Q

used to measure how far the data values are dispersed from the mean

Question 7

Q

True or False

Standard deviation measures the average distance of each data point from the mean, providing insight into the spread of the data.

Question 8

Q

True or False

In a positively skewed distribution, the mean is lower than the median, which is lower than the mode, due to the tail on the right side.

Answer

A

False

In a positively skewed distribution, the mean is greater than the median, which is greater than the mode, due to the tail on the right side.
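
A small check of this ordering on a made-up right-skewed dataset (the values are an assumption chosen for illustration; real data will not always order this cleanly):

```python
import statistics

# Illustrative data with a long right tail (9 and 15 pull the mean upward).
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

mode = statistics.mode(right_skewed)      # 2
median = statistics.median(right_skewed)  # 3.0
mean = statistics.mean(right_skewed)      # 4.6

print(mode, median, mean)
```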

Question 9

Q

Which of the following is a measure of the central location of a dataset?

A) Standard deviation
B) Variance
C) Median
D) Interquartile range

Answer

A

C) Median

Question 10

Q

Which measure of spread is defined as the difference between the first quartile and the third quartile?

A) Range
B) Standard deviation
C) Variance
D) Interquartile range

Answer

A

D) Interquartile range
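
A short sketch of the calculation, IQR = Q3 - Q1, on illustrative data. It relies on `statistics.quantiles` with its default "exclusive" method; other quartile conventions can give slightly different values for the same data.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]  # illustrative values

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(q1, q2, q3, iqr)  # 3.0 6.0 9.0 6.0
```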

Question 11

Q

In hypothesis testing, what is the p-value?

A) The probability of accepting the null hypothesis
B) The probability of rejecting the null hypothesis when it is true
C) The probability of observing the test results under the null hypothesis
D) The level of significance

Answer

A

C) The probability of observing the test results under the null hypothesis
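
More precisely, the p-value is the probability of results at least as extreme as those observed, assuming the null hypothesis is true. The sketch below works this out exactly for a hypothetical coin-flip experiment (10 flips, 9 heads, null hypothesis of a fair coin); the experiment and the helper `binom_pmf` are assumptions made for illustration.

```python
from math import comb

n_flips = 10         # hypothetical experiment
observed_heads = 9   # hypothetical result

def binom_pmf(k, n, p=0.5):
    """Probability of exactly k heads in n flips when P(heads) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Two-sided test: count every outcome at least as far from the expected
# 5 heads as the observed 9 heads (that is, 0, 1, 9 or 10 heads).
distance = abs(observed_heads - n_flips / 2)
p_value = sum(
    binom_pmf(k, n_flips)
    for k in range(n_flips + 1)
    if abs(k - n_flips / 2) >= distance
)

print(p_value)  # 0.021484375
```

With a significance level of 0.05, this p-value would lead to rejecting the null hypothesis of a fair coin.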

Question 12

Q

Which of the following is a Type I error?

A) Rejecting the null hypothesis when it is true
B) Accepting the null hypothesis when it is false
C) Failing to reject the null hypothesis when it is false
D) Failing to accept the null hypothesis when it is true

Answer

A

A) Rejecting the null hypothesis when it is true

Question 13

Q

If you were describing the findings of a survey on the annual income of the people of Makati City, where the respondents include both extremely wealthy and extremely poor people, which two measures would you use?

A) Mean and Mode
B) Mean and Range
C) Mean and Standard Deviation
D) Mode and Standard Deviation

Answer

A

C) Mean and Standard Deviation

Question 14

Q

In a confidence interval, what does the margin of error represent?

A) The range of values within which the population parameter lies
B) The standard deviation of the sample
C) The maximum error allowed in the estimate
D) The sample mean

Answer

A

C) The maximum error allowed in the estimate

Question 15

Q

indicates the range within which we expect the true population parameter to lie

Answer

A

margin of error

Question 16

Q

What does a confidence level of 95% mean?

A) There is a 95% probability that the sample mean is within the confidence interval
B) 95% of the population data lies within the confidence interval
C) 95% of the time, the true population parameter lies within the confidence interval
D) There is a 5% chance that the sample mean lies outside the confidence interval

Answer

A

C) 95% of the time, the true population parameter lies within the confidence interval
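
"95% of the time" refers to repeated sampling: if the same procedure were run on many samples, about 95% of the resulting intervals would contain the true parameter. The simulation below is a sketch of that idea under simplifying assumptions (normally distributed data, known population standard deviation, and made-up values for the mean, sample size, and number of trials).

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_MEAN, SIGMA = 50.0, 10.0  # hypothetical population parameters
N, TRIALS = 30, 1000           # sample size and number of repeated samples

z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
margin = z * SIGMA / N ** 0.5               # margin of error with known sigma

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    sample_mean = statistics.fmean(sample)
    # The interval is sample_mean +/- margin; check whether it contains the truth.
    if sample_mean - margin <= TRUE_MEAN <= sample_mean + margin:
        covered += 1

coverage = covered / TRIALS
print(f"intervals containing the true mean: {coverage:.1%}")
```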

Question 17

Q

What is the purpose of a t-test?

A) To compare the variances of two populations
B) To compare the means of two populations
C) To test the independence of two variables
D) To test the relationship between two variables

Answer

A

B) To compare the means of two populations

Question 18

Q

the variable whose values we want to explain or forecast; its values depend on something else; denoted as y

Answer

A

Dependent Variable

Question 19

Q

the variable that explains the other one; denoted as x

Answer

A

Independent Variable

Question 20

Q

point where the regression line crosses the Y-axis, representing the value of Y when X is zero

Answer

A

intercept

Question 21

Q

What does the coefficient of determination (R²) indicate?

A) The strength of the linear relationship between two variables
B) The percentage of variation in the dependent variable explained by the independent variable
C) The slope of the regression line
D) The correlation between two variables

Answer

A

B) The percentage of variation in the dependent variable explained by the independent variable
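
A minimal worked example of a simple least-squares fit on made-up data, showing the slope, the intercept (the value of y when x is zero), and R² computed as 1 - SS_res / SS_tot. The data points and variable names are illustrative assumptions.

```python
x = [1, 2, 3, 4, 5]  # independent variable (illustrative)
y = [2, 4, 5, 4, 5]  # dependent variable (illustrative)

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares estimates for the line y = intercept + slope * x.
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# R^2: share of the variation in y that the fitted line accounts for.
predicted = [intercept + slope * xi for xi in x]
ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
ss_tot = sum((yi - mean_y) ** 2 for yi in y)
r_squared = 1 - ss_res / ss_tot

print(round(slope, 3), round(intercept, 3), round(r_squared, 3))  # 0.6 2.2 0.6
```

Here an R² of 0.6 means the line accounts for about 60% of the variation in y.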

Question 22

Q

explaining or predicting a single Y variable from two or more X variables

Answer

A

Multiple Regression Analysis

Question 23

Q

occurs when independent variables in a regression model are highly correlated

Answer

A

Multicollinearity

Question 24

Q

Which of the following best describes heteroscedasticity in regression analysis?

A) The error terms have constant variance
B) The error terms have increasing or decreasing variance
C) The error terms are normally distributed
D) The error terms are autocorrelated

Answer

A

B) The error terms have increasing or decreasing variance

Question 25

Q

means that the variance of the error terms changes across observations

Answer

A

Heteroscedasticity

Question 26

Q

to find the relationships between two data factors

Answer

A

Logistic Regression

Question 27

Q

In logistic regression, what type of dependent variable is used?

A) Continuous
B) Ordinal
C) Nominal
D) Binary

Answer

A

D) Binary

Question 28

Q

Logistic regression is used for modeling binary (____) outcomes

Answer

A

dichotomous

Question 29

Q

What is the purpose of an ANOVA test?

A) To compare the means of two groups
B) To compare the variances of two groups
C) To compare the means of three or more groups
D) To test the independence of two categorical variables

Answer

A

C) To compare the means of three or more groups

Question 30

Q

In ANOVA, what does a significant F-test indicate?

A) All group means are equal
B) At least one group mean is different
C) The variances are equal across groups
D) There is a linear relationship between groups

Answer

A

B) At least one group mean is different

Question 31

Q

What is the null hypothesis in a chi-square test of independence?

A) The two variables are independent
B) The two variables are dependent
C) The two variables have equal variances
D) The two variables have equal means

Answer

A

A) The two variables are independent

Question 32

Q

purpose of this test is to determine if a difference between observed data and expected data is due to chance, or if it is due to a relationship between the variables you are studying.

Answer

A

chi-square test

Question 33

Q

In a chi-square test, what is the expected frequency?

A) The observed frequency in each category
B) The frequency expected if the null hypothesis is true
C) The sum of observed frequencies
D) The total number of observations

Answer

A

B) The frequency expected if the null hypothesis is true
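
For a test of independence, the expected frequency of each cell is (row total × column total) / grand total. The sketch below applies this to a made-up 2×2 table and then forms the chi-square statistic; the counts are illustrative assumptions.

```python
# Hypothetical observed counts: rows and columns are two categorical variables.
observed = [
    [10, 20],
    [30, 40],
]

row_totals = [sum(row) for row in observed]        # [30, 70]
col_totals = [sum(col) for col in zip(*observed)]  # [40, 60]
grand_total = sum(row_totals)                      # 100

# Counts we would expect in each cell if the two variables were independent.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print(expected)              # [[12.0, 18.0], [28.0, 42.0]]
print(round(chi_square, 4))  # 0.7937
```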

Question 34

Q

Which of the following is a characteristic of a simple random sample?

A) Each member of the population has an equal chance of being selected
B) The population is divided into subgroups and samples are taken from each subgroup
C) Samples are chosen based on convenience
D) Samples are chosen based on specific characteristics

Answer

A

A) Each member of the population has an equal chance of being selected

Question 35

Q

What is stratified sampling?

A) Dividing the population into strata and randomly selecting samples from each stratum
B) Selecting samples based on convenience
C) Selecting every nth member of the population
D) Grouping the population into clusters and randomly selecting clusters

Answer

A

A) Dividing the population into strata and randomly selecting samples from each stratum

Question 36

Q

What is the main advantage of using a larger sample size?

A) It reduces the population size
B) It increases the standard error
C) It increases the accuracy of the sample mean
D) It reduces the variability of the population

Answer

A

C) It increases the accuracy of the sample mean

Question 37

Q

The coefficient of variation (CV) is measured in terms of what unit?

A) same unit as the data
B) squared unit
C) percent
D) square root of the given unit

Answer

A

C) percent
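
Because the CV divides the standard deviation by the mean, the units cancel. A quick sketch with made-up heights shows the same CV whether the data are recorded in centimetres or metres (the helper name `coefficient_of_variation` and the data are assumptions for illustration).

```python
import statistics

heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]  # illustrative data
heights_m = [h / 100 for h in heights_cm]         # same data, different unit

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

print(f"{cv_cm:.2f}% vs {cv_m:.2f}%")  # both about 9.30%
```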

Question 38

Q

standardized measure of the dispersion of a probability distribution or frequency distribution

Answer

A

coefficient of variation (CV)

Question 39

Q

What is the shape of the normal distribution?

A) Skewed left
B) Skewed right
C) Symmetrical bell-shaped
D) Uniform

Answer

A

C) Symmetrical bell-shaped

Question 40

Q

Which of the following best describes the central limit theorem?

A) The sum of a large number of random variables is normally distributed
B) The mean of a large number of random variables is normally distributed
C) The variance of a large number of random variables is normally distributed
D) The median of a large number of random variables is normally distributed

Answer

A

B) The mean of a large number of random variables is normally distributed

Question 41

Q

states that the distribution of the sample mean approaches a normal distribution as the sample size becomes large

Answer

A

central limit theorem
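
A simulation sketch of the theorem, assuming draws from an exponential distribution (which is strongly right-skewed, with mean 1 and standard deviation 1). The sample size, number of samples, and seed are arbitrary choices. The sample means should cluster around 1 with a spread near 1/sqrt(n).

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 50    # observations per sample
NUM_SAMPLES = 2000  # number of repeated samples

sample_means = []
for _ in range(NUM_SAMPLES):
    sample = [random.expovariate(1.0) for _ in range(SAMPLE_SIZE)]
    sample_means.append(statistics.fmean(sample))

center = statistics.fmean(sample_means)  # expected to be near 1.0
spread = statistics.stdev(sample_means)  # expected to be near 1/sqrt(50), about 0.141

print(round(center, 3), round(spread, 3))
```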

Question 42

Q

What does the term "statistical power" refer to?

A) The probability of making a Type I error
B) The probability of making a Type II error
C) The probability of correctly rejecting the null hypothesis
D) The probability of accepting the null hypothesis

Answer

A

C) The probability of correctly rejecting the null hypothesis

Question 43

Q

likelihood that a test will detect an effect when there is an effect to be detected

Answer

A

Statistical power

Question 44

Q

What is the purpose of standardizing a variable?

A) To change the variable’s mean to 1
B) To change the variable’s standard deviation to 0
C) To make the variable’s mean 0 and standard deviation 1
D) To convert the variable to a binary format

Answer

A

C) To make the variable’s mean 0 and standard deviation 1
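
A short sketch of standardization (z-scores) on made-up test scores: subtract the mean, then divide by the standard deviation. The population standard deviation is used here as an assumption; using the sample version would give a slightly different scale.

```python
import statistics

scores = [60, 70, 80, 90, 100]  # illustrative data

mean = statistics.mean(scores)  # 80
sd = statistics.pstdev(scores)  # about 14.14

# z-score: how many standard deviations each value lies from the mean.
z_scores = [(s - mean) / sd for s in scores]

print([round(z, 3) for z in z_scores])
print(round(statistics.fmean(z_scores), 6), round(statistics.pstdev(z_scores), 6))
```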

Question 45

Q

What is the purpose of a boxplot?

A) To display the frequency of data
B) To show the distribution of data based on a five-number summary
C) To display the relationship between two variables
D) To show the central tendency of data

Answer

A

B) To show the distribution of data based on a five-number summary

Question 46

Q

visually displays the distribution of a dataset using the minimum, first quartile, median, third quartile, and maximum

Question 47

Q

Which of the following statements about measures of variability must always be true if the standard deviation is equal to 1?

A) The standard deviation is equal to the variance
B) The standard deviation is less than the variance
C) The standard deviation is greater than the variance
D) None of the above

Answer

A

A) The standard deviation is equal to the variance

Question 48

Q

Which of the following statements is true about a normal distribution?

A) It is skewed to the right
B) It is skewed to the left
C) It is symmetric about the mean
D) It has two peaks

Answer

A

C) It is symmetric about the mean

Question 49

Q

What is the purpose of using a scatter plot?

A) To display the frequency of different categories
B) To show the relationship between two variables
C) To compare the means of different groups
D) To show the distribution of a single variable

Answer

A

B) To show the relationship between two variables

Question 50

Q

What does the null hypothesis in hypothesis testing typically state?

A) There is an effect or difference
B) There is no effect or difference
C) The effect or difference is greater than expected
D) The effect or difference is less than expected

Answer

A

B) There is no effect or difference

Question 51

Q

typically states that there is no effect or difference, serving as a starting point for statistical testing

Answer

A

null hypothesis

Question 52

Q

What is a confidence interval?

A) A range of values within which the sample mean lies
B) A range of values within which the population parameter lies
C) The range between the smallest and largest values in a dataset
D) The range of values within one standard deviation of the mean

Answer

A

B) A range of values within which the population parameter lies

Question 53

Q

What is the main purpose of a control group in an experiment?

A) To provide a comparison for the experimental group
B) To increase the sample size
C) To reduce the variability within the data
D) To eliminate the need for random sampling

Answer

A

A) To provide a comparison for the experimental group

Question 54

Q

In the context of regression analysis, what is multiple regression used for?

A) Analyzing the effect of a single independent variable on a dependent variable
B) Analyzing the effect of multiple independent variables on a dependent variable
C) Analyzing the effect of a single independent variable on multiple dependent variables
D) Analyzing the effect of multiple independent variables on multiple dependent variables

Answer

A

B) Analyzing the effect of multiple independent variables on a dependent variable

Question 55

Q

What is the difference between a bar chart and a histogram?

A) A bar chart displays categorical data, while a histogram displays numerical data
B) A bar chart displays numerical data, while a histogram displays categorical data
C) A bar chart is used for one variable, while a histogram is used for two variables
D) There is no difference; they are the same

Answer

A

A) A bar chart displays categorical data, while a histogram displays numerical data

Question 56

Q

What is an outlier in a dataset?

A) A value that is exactly equal to the mean
B) A value that is very different from the other values in the dataset
C) A value that occurs most frequently
D) A value that falls within the interquartile range

Answer

A

B) A value that is very different from the other values in the dataset

Question 57

Q

Which of the following refers to the degree of flatness or peakedness of a curve?

A) Central Tendency
B) Dispersion
C) Skewness
D) Kurtosis

Answer

A

D) Kurtosis

Question 58

Q

measures the "tailedness" of the distribution, indicating whether the data are heavy-tailed or light-tailed relative to a normal distribution.

It provides information about the height and sharpness of the central peak relative to that of a normal distribution

Question 60

Q

describes the size of the distribution of values expected for a specific variable.

Answer

A

Dispersion

Question 61

Q

measure of the asymmetry of a distribution; right (positive) or left (negative) skewness