Introduction to statistics Flashcards

1
Q

What is the primary goal of statistics?
a) Collecting and organizing data
b) Analyzing and interpreting data
c) Deriving information from data
d) All of the above

What does statistics help us do with the enormous amount of data around us?
a) Collect and organize it efficiently
b) Analyze and interpret it effectively
c) Separate sense from nonsense
d) All of the above

What is the role of statistics in dealing with data?
a) To make data more confusing
b) To make data more organized
c) To make sense of data
d) None of the above

A

d) All of the above
d) All of the above
c) To make sense of data

2
Q

Statistics is the science of __________, __________, __________, and __________ information from data.

A knowledge of statistics helps separate _________ from _________.

Statistics helps us deal with the _________ amount of data that is around us.

A

Statistics is the science of collecting, organizing, analyzing, and interpreting information from data.
A knowledge of statistics helps separate sense from nonsense.
Statistics helps us deal with the enormous amount of data that is around us.

3
Q
  1. Is the following example an instance of qualitative (dichotomic), qualitative (polynomic), or quantitative (discrete) data?
    Example: Eye color (blue, brown, green)
  2. Is the following example an instance of qualitative (dichotomic), qualitative (polynomic), or quantitative (discrete) data?
    Example: Gender (male, female)
  3. Is the following example an instance of qualitative (dichotomic), qualitative (polynomic), or quantitative (discrete) data?
    Example: Number of siblings
  4. Is the following example an instance of qualitative (dichotomic), qualitative (polynomic), or quantitative (continuous) data?
    Example: Height (in centimeters)
  5. Is the following example an instance of qualitative (dichotomic), qualitative (polynomic), or quantitative (discrete) data?
    Example: Level of education (primary, secondary, tertiary)
  6. Is the following example an instance of qualitative (dichotomic), qualitative (polynomic), or quantitative (continuous) data?
    Example: Temperature (in degrees Celsius)
A
  1. Qualitative (polynomic)
  2. Qualitative (dichotomic)
  3. Quantitative (discrete)
  4. Quantitative (continuous)
  5. Qualitative (polynomic)
  6. Quantitative (continuous)
4
Q

Which type of variable categorizes or describes an element of a population?
a) Qualitative
b) Quantitative

Which type of variable quantifies an element of a population?
a) Qualitative
b) Quantitative

Arithmetic operations, such as addition and averaging, are not meaningful for data resulting from which type of variable?
a) Qualitative
b) Quantitative

A

a) Qualitative
b) Quantitative
a) Qualitative

5
Q

A characteristic that may differ from one entity to another is called a ________.

Qualitative variables are also known as ________ or ________ variables.

Quantitative variables are also known as ________ or ________ variables.

A

variable
attribute, categorical
numerical, measurement

6
Q

Explain the difference between qualitative (categorical) and quantitative (numerical) variables and why arithmetic operations are not meaningful for qualitative data.

Provide an example of a qualitative variable and explain why arithmetic operations cannot be applied to it.

Give an example of a quantitative variable and explain why arithmetic operations can be applied to it.

A

Qualitative (categorical) variables describe characteristics or attributes and do not involve numerical values. They can be divided into categories or groups, and arithmetic operations like addition or averaging cannot be performed on these categories as they lack numerical meaning. On the other hand, quantitative (numerical) variables provide numerical quantities that can be measured or counted, allowing for arithmetic operations.

Example: Hair color. Arithmetic operations are not meaningful for hair color because hair colors are categories like blonde, brunette, or red. It does not make sense to add or average hair colors.

Example: Age. Age is a quantitative variable because it represents a numerical quantity. Arithmetic operations like addition and averaging can be applied to age, allowing us to calculate things like average age, add ages together, or find the difference between two ages.
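A minimal sketch of the contrast above, using made-up ages and hair colors (all values are illustrative assumptions). Arithmetic is applied to the quantitative variable, while the qualitative variable is only counted:

```python
from collections import Counter
from statistics import mean

# Hypothetical data, invented for illustration only.
ages = [19, 21, 22, 24, 24]                                        # quantitative
hair_colors = ["blonde", "brunette", "red", "brunette", "brunette"]  # qualitative

# Arithmetic is meaningful for ages.
average_age = mean(ages)            # sum of ages divided by their count
age_gap = max(ages) - min(ages)     # difference between oldest and youngest

# Hair colors can only be counted per category; adding or averaging
# the labels themselves has no meaning.
hair_counts = Counter(hair_colors)

print(average_age, age_gap, hair_counts)
```

Counting categories is the step that later leads to percentages, which is how categorical data are usually summarized.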

7
Q

Categorical data are commonly summarized using:
a) Percentages
b) Averages
c) Medians
d) Standard deviations

Measurement data are typically summarized using:
a) Percentages
b) Averages
c) Medians
d) Standard deviations

Which summary measure is commonly used for categorical data?
a) Mean
b) Median
c) Percentage
d) Standard deviation

Categorical data are commonly summarized using ________ (or proportions).

Measurement data are typically summarized using ________ (or means).

A

a) Percentages
b) Averages
c) Percentage

percentages
averages

8
Q
  1. Which of the following is an example of qualitative dichotomic data?
    a) Age of students in a class
    b) Height of trees in a forest
    c) Gender (Male/Female)
    d) Number of cars in a parking lot
  2. Which of the following is an example of qualitative polynomic data?
    a) Temperature in degrees Celsius
    b) Number of siblings a person has
    c) Blood pressure reading
    d) Eye color (Blue, Brown, Green)
  3. Which of the following is an example of quantitative discrete data?
    a) Time taken to complete a race in seconds
    b) Number of pets in a household
    c) Weight of fruits in kilograms
    d) Height of students in centimeters
  4. Which of the following is an example of quantitative continuous data?
    a) Number of books in a library
    b) Shoe sizes of students
    c) Income levels of individuals
    d) Number of goals scored in a soccer match
A
  1. Answer: c) Gender (Male/Female)
  2. Answer: d) Eye color (Blue, Brown, Green)
  3. Answer: b) Number of pets in a household
  4. Answer: c) Income levels of individuals
9
Q

Which type of variable categorizes or describes an element of a population?
a) Nominal
b) Ordinal
c) Discrete
d) Continuous

Which type of variable involves an ordering or relative ranking of measurements?
a) Nominal
b) Ordinal
c) Discrete
d) Continuous

Which type of variable can assume a countable number of values with a gap between any two values?
a) Nominal
b) Ordinal
c) Discrete
d) Continuous

Which type of variable can assume an uncountable number of values?
a) Nominal
b) Ordinal
c) Discrete
d) Continuous

Nominal variables categorize or describe an element of a population based on ________ rather than by numerical measurement.

Ordinal variables involve an ordering or ________ ranking of measurements.

Discrete variables are quantitative variables that can assume a countable number of values, with a ________ between any two values.

Continuous variables are quantitative variables that can assume an ________ number of values.

A

a) Nominal
b) Ordinal
c) Discrete
d) Continuous
Fill in the blanks:

qualities
relative
gap
uncountable

10
Q

Explain the difference between nominal and ordinal variables, providing examples for each.

Give an example of a discrete variable and explain why there is a gap between any two values.

Provide an example of a continuous variable and explain why it can assume an uncountable number of values.

A

Nominal variables categorize or describe elements based on qualities or characteristics, without any specific order or ranking. Examples include flower color (red, blue, yellow) and gender (male, female).

Ordinal variables involve an ordering or ranking of measurements, indicating the relative position or preference. Examples include academic positions (assistant professor, associate professor, full professor) and Likert scale (strongly agree, agree, neutral, disagree, strongly disagree).

Example: Number of children in a family. There is a gap between any two values (e.g., 2 and 3) because you cannot have a fractional or intermediate value for the number of children.

Example: Weight in kilograms. Weight can take an uncountable number of values (e.g., 4.3 kg, 1.9 kg) because there are infinitely many possible weights within any given range.
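A short sketch of the nominal/ordinal distinction, assuming made-up responses. The ranking stored in `LIKERT_ORDER` is an illustrative assumption. Nominal values can only be counted, while ordinal values can also be sorted by rank:

```python
from collections import Counter

# Nominal: no inherent order, so counting is the only meaningful summary.
flower_colors = ["red", "blue", "yellow", "red", "red"]
color_counts = Counter(flower_colors)

# Ordinal: categories carry a ranking (this order is assumed for illustration).
LIKERT_ORDER = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
responses = ["agree", "neutral", "strongly agree", "disagree", "agree"]

# Sort responses by their position in the ranking, then take the middle one.
ranked = sorted(responses, key=LIKERT_ORDER.index)
median_response = ranked[len(ranked) // 2]  # valid because the list length is odd

print(color_counts.most_common(1), median_response)
```

A middle response exists only because the categories are ordered. No such value can be defined for flower colors.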

12
Q

Explain why categorical data is commonly summarized using percentages (or proportions) and provide an example.

Describe why measurement data is typically summarized using averages (or means) and provide an example.

A

Categorical data is summarized using percentages (or proportions) because it helps express the relative frequency or distribution of different categories in the data. For example, if we have data on the hair length of students, we can summarize it by saying, “11% of students have long hair.”

Measurement data is typically summarized using averages (or means) because it provides a central tendency or average value of the measurements. It helps understand the typical or representative value of the data. For example, if we have data on the ages of students, we can summarize it by saying, “The average age of SD students in 2020 is 24.2 years.”
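The two kinds of summary can be sketched as follows, with invented data (the hair lengths and ages are assumptions, not figures from any study):

```python
from collections import Counter
from statistics import mean

# Hypothetical categorical data: hair length of ten students.
hair_lengths = ["short", "long", "short", "medium", "short",
                "long", "short", "medium", "short", "short"]

# Hypothetical measurement data: ages of five students.
ages = [23, 25, 24, 26, 22]

# Categorical data: percentage of observations in each category.
counts = Counter(hair_lengths)
percentages = {category: 100 * count / len(hair_lengths)
               for category, count in counts.items()}

# Measurement data: the arithmetic mean.
average_age = mean(ages)

print(percentages, average_age)
```

The percentages always sum to 100 because every observation falls into exactly one category.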

13
Q

What does the term “population” refer to in a study?
a) The sample size
b) The characteristics of the objects
c) The entire collection of objects of interest
d) The data collection process

What can be included as objects in a population?
a) Only people
b) Only animals
c) Only plants
d) People, animals, and more

What is the purpose of data collection in relation to the population?
a) To determine the sample size
b) To analyze the characteristics of the objects
c) To estimate the population size
d) To identify the research question

A population is an entire collection of ________ in which a study is interested.

The population size can vary from very ________ to very ________.

________ involves certain characteristics of the objects, such as weight in kg.

A study that involves the population is called a ________.

A

c) The entire collection of objects of interest
d) People, animals, and more
b) To analyze the characteristics of the objects
Fill in the blanks:

objects
large, small
data collection
census

14
Q

Explain the concept of a census and how it relates to studying a population.

Provide an example of a study where the population size is large, and explain why a census might be appropriate in that case.

A

A census is a study that involves examining and gathering information from the entire population of objects of interest. It aims to collect data on all members of the population rather than a sample. A census provides a comprehensive understanding of the population’s characteristics, allowing researchers to draw accurate conclusions.

Example: A study on the voting preferences of a country’s citizens. The population size in this case would be very large, potentially in the millions or hundreds of millions. A census would be appropriate when complete coverage is required and the resources exist to reach every eligible voter, because it collects data from every member of the population rather than estimating from a sample. A census ensures that no subgroups within the population are overlooked, providing a complete picture of the entire population’s voting behavior.

15
Q

What are summaries of population data called?
a) Parameters
b) Variables
c) Samples
d) Statistics

How many values are there for each population parameter?
a) One
b) Multiple
c) Depends on the population size
d) Depends on the sample size

What symbol is used to represent the population size?
a) N
b) µ (mu)
c) σ²
d) σ

Fill in the blanks:

Summaries of population data are called ________.

There is ________ value(s) for each population parameter.

The population size is denoted by ________.

The population mean is denoted by ________.

The population variance is denoted by ________.

A

Multiple-choice:

a) Parameters
a) One
a) N
Fill in the blanks:

Parameters
One
N
µ (mu)
σ²

16
Q

Explain the concept of a population parameter and provide an example.

Discuss why there is usually only one value for each population parameter and why it is estimated.

A

A population parameter is a summary measure that describes a characteristic of a population. It represents a fixed value that is usually unknown and needs to be estimated. For example, the population mean is a parameter that represents the average value of a particular variable in the entire population.

There is only one value for each population parameter because the population is a single, fixed collection, so a summary computed over all of its members can take only one value. Because measuring the entire population is usually impractical, that value is typically unknown; we take a sample from the population and use statistics calculated from the sample to estimate the parameter. Estimation allows us to make inferences about the population based on the information we have from the sample.
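As a small illustration, assume a tiny population of five objects whose weights are all known (the numbers are invented). Each parameter then has exactly one value:

```python
from statistics import mean, pvariance

# Hypothetical population: weights in kg of all five objects of interest.
population_weights_kg = [60, 62, 65, 70, 73]

N = len(population_weights_kg)                    # population size
mu = mean(population_weights_kg)                  # population mean
sigma_squared = pvariance(population_weights_kg)  # population variance (divides by N)

print(N, mu, sigma_squared)
```

`pvariance` divides by N, which matches the population variance σ². The sample variance s² divides by n − 1 instead and is computed with `statistics.variance`.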

17
Q

What is a sample?
a) The entire population of interest
b) A subset of the population
c) The estimated parameters of the population
d) The summary statistics of the population

How is the sample size denoted?
a) N
b) µ (mu)
c) σ²
d) n

Is the sample size smaller or larger than the population size?
a) Smaller
b) Larger
c) Same
d) Depends on the study

Why is it important to determine the sample size before drawing the sample?
a) To estimate population parameters accurately
b) To increase the representativeness of the sample
c) To reduce sampling bias
d) All of the above

Fill in the blanks:

A sample is a subset drawn from the ________ of interest.

The sample size is denoted by ________.

The sample size is ________ than the population size.

A

Multiple-choice:

b) A subset of the population
d) n
a) Smaller
d) All of the above
Fill in the blanks:

population
n
smaller

18
Q

Discuss the factors that should be considered when determining the sample size for a study.

Explain why the sample size is typically smaller than the population size.

A

When determining the sample size for a study, factors such as the desired level of precision, the variability of the population, the research question, available resources, and the desired confidence level should be considered. A larger sample size generally provides more precise estimates and increases the power of statistical tests, but it may also be more costly and time-consuming to collect.

The sample size is typically smaller than the population size because it is not feasible or practical to measure or collect data from every individual in the population. Sampling allows researchers to obtain a representative subset of the population that can provide valuable insights and generalizable results without the need to study the entire population. The size of the sample is determined based on statistical considerations to ensure that it is large enough to draw meaningful conclusions while being manageable in terms of time, cost, and resources.
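One common textbook formula ties three of these factors together when the goal is to estimate a mean: n = (z·σ / E)², rounded up, where σ is the assumed population standard deviation, E is the desired margin of error, and z is the critical value for the confidence level. The sketch below assumes that formula. The function name and the input values are hypothetical:

```python
import math

def sample_size_for_mean(sigma, margin_of_error, z=1.96):
    """Smallest n such that z * sigma / sqrt(n) <= margin_of_error.

    z = 1.96 corresponds to a 95% confidence level under a normal
    approximation; sigma is an assumed (not measured) standard deviation.
    """
    return math.ceil((z * sigma / margin_of_error) ** 2)

# Halving the margin of error roughly quadruples the required sample size.
n_wide = sample_size_for_mean(sigma=10, margin_of_error=2)
n_narrow = sample_size_for_mean(sigma=10, margin_of_error=1)

print(n_wide, n_narrow)
```

More variability or a tighter margin of error both push the required n upward, which is the trade-off against cost described above.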

19
Q

How is the sample size denoted?
a) N
b) µ (mu)
c) σ²
d) n

What symbol is used to represent the sample mean?
a) N
b) x̄ (x bar)
c) σ²
d) s

What symbol is used to represent the sample variance?
a) N
b) x̄ (x bar)
c) σ²
d) s²

What are summaries of sample data called?
a) Parameters
b) Variables
c) Samples
d) Statistics

Fill in the blanks:

The sample size is denoted by ________.

The sample mean is denoted by ________.

The sample variance is denoted by ________.

A

Multiple-choice:

d) n
b) x̄ (x bar)
d) s²
d) Statistics
Fill in the blanks:

n
x̄ (x bar)
s²

20
Q

Explain why different samples drawn from the same population can give different sample statistics.

Discuss the difference between parameters and statistics.

A

Different samples drawn from the same population can give different sample statistics due to random sampling variability. Each sample is a subset of the population, and the individuals included in each sample can vary. Random variations in the composition of samples can lead to different observed values for sample statistics, such as the sample mean or variance.

Parameters refer to summary measures that describe characteristics of a population, while statistics refer to summary measures calculated from sample data that provide information about the corresponding population parameters. Parameters aim to describe the entire population, while statistics aim to estimate population parameters based on the information available from the sample.
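Sampling variability can be sketched by drawing several random samples from one hypothetical population and comparing their means with the single population mean. The population, the sample size, and the seed below are all assumptions made for illustration:

```python
import random
from statistics import mean, variance

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of N = 100 values.
population = list(range(1, 101))
mu = mean(population)  # the parameter: one fixed value

n = 10  # sample size
sample_means = []
for _ in range(5):
    sample = random.sample(population, n)  # draw without replacement
    sample_means.append(mean(sample))      # the statistic: varies by sample

# Sample variance s² of the last sample (divides by n - 1).
last_sample_variance = variance(sample)

print(mu, sample_means, last_sample_variance)
```

The parameter `mu` is computed once from the whole population. Each entry in `sample_means` is an estimate of it based on a different subset.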

21
Q

Descriptive statistics summarize data in terms of:
a) Center point, dispersion, and distribution
b) Population parameters and sample statistics
c) Confidence intervals and p-values
d) Correlations and regression coefficients

Which of the following is a measure of center point?
a) Variability
b) Median
c) Correlation
d) Standard deviation

Inferential statistics are used to:
a) Summarize data
b) Compare sample statistics
c) Estimate population parameters
d) Calculate confidence intervals

Fill in the blanks:

Descriptive statistics summarize the data in terms of center point, dispersion, and ________.

Inferential statistics are used to estimate ________ parameters and test for significant differences between populations.

A

Multiple-choice:

a) Center point, dispersion, and distribution
b) Median
c) Estimate population parameters
Fill in the blanks:

distribution
population

22
Q

Discuss the importance of the measures of center point (mean, median, mode) in summarizing data and provide an example of when each measure would be useful.

Explain the process of hypothesis testing in inferential statistics and how it is used to compare significant differences between populations.

A

Measures of center point (mean, median, mode) are essential in summarizing data as they provide information about the central tendency of the dataset. The mean represents the arithmetic average of the data and is most useful when the data are roughly symmetric and free of extreme values. The median represents the middle value when the data is ordered, and it is more robust to outliers. The mode represents the most frequently occurring value in the data. For example, if we have data on the ages of students, the mean would be useful to understand the average age, the median would be more appropriate if there are extreme values or outliers affecting the mean, and the mode would show the most common age.
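A brief sketch with invented student ages, including one extreme value, shows how the three measures respond differently:

```python
from statistics import mean, median, mode

# Hypothetical ages; 58 is an outlier added for illustration.
ages = [19, 20, 20, 21, 22, 58]

average = mean(ages)     # pulled upward by the outlier
middle = median(ages)    # average of the two middle values, 20 and 21
most_common = mode(ages) # the value that occurs most often

print(average, middle, most_common)
```

Here the mean lands above every age except the outlier, while the median and mode stay close to the typical student.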

Hypothesis testing in inferential statistics involves formulating a null hypothesis and an alternative hypothesis, collecting sample data, and performing a statistical test to determine if there is enough evidence to reject the null hypothesis. The process involves calculating a test statistic (for example, from a t-test or a chi-square test) and either comparing it to a critical value or comparing its p-value to a chosen significance level. This helps assess whether there are significant differences between populations or relationships between variables. Hypothesis testing allows researchers to draw conclusions about the larger population based on the information obtained from the sample.
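The same steps can be sketched with an exact permutation test, a different technique from the t-test and chi-square test named above. It is chosen here only because it needs no statistical tables. The two groups and the 0.05 significance level are assumptions for illustration:

```python
from itertools import combinations
from statistics import mean

# Hypothetical samples from two populations.
group_a = [24, 25, 26, 27]
group_b = [20, 21, 22, 23]

# Null hypothesis: both groups come from the same population.
# Test statistic: difference between the two sample means.
observed = mean(group_a) - mean(group_b)

# Under the null, every split of the pooled values is equally likely,
# so enumerate all ways to choose which values form the first group.
pooled = group_a + group_b
n_a = len(group_a)
extreme = 0
total = 0
for idx in combinations(range(len(pooled)), n_a):
    chosen = set(idx)
    a = [pooled[i] for i in chosen]
    b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
    total += 1
    if abs(mean(a) - mean(b)) >= abs(observed):
        extreme += 1

p_value = extreme / total     # two-sided p-value
alpha = 0.05                  # assumed significance level
reject_null = p_value < alpha

print(observed, p_value, reject_null)
```

The p-value is the share of all possible splits that produce a difference at least as large as the observed one. A small share is taken as evidence against the null hypothesis.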