1.1-1.5 Flashcards
Define statistics.
Statistics is the science of data. More specifically, it is the science of the collection, classification,
analysis, and interpretation of information/data.
Why is it important to ensure statistics are being applied properly? Provide an example.
Statistics must be applied properly; otherwise, the effects can be catastrophic.
o Example: Sally Clark was wrongly convicted of murdering her two infant sons due to
the improper use of statistical findings.
What are the two processes involved in statistics?
o Describing data sets
o Drawing conclusions from that data set
Define measurement in statistics.
Measurement is the process we use to assign numbers to variables of individual
population units
What are some ways measurements can be made?
▪ Using instruments
▪ Asking questions
▪ Ratings using scales
Why is it important to know whether the data you are dealing with is qualitative or quantitative?
It is essential that you understand whether the data you are interested in are
quantitative or qualitative, since the statistical method appropriate for describing,
reporting, and analyzing the data depends on the data type (quantitative or qualitative).
What are the three ways of obtaining data?
o Published source
o Designed experiment
o Observational study
What are the two classes of variables?
Quantitative and qualitative.
What are the two subclasses of quantitative variables?
Discrete and continuous.
What are the two subclasses of qualitative variables?
Ordinal and nominal
What are the four elements of a descriptive statistics problem?
- The population or sample of interest
- One or more variables (characteristics of the population or sample units) that are to be
investigated
- Tables, graphs, or numerical summary tools
- Identification of patterns in the data
What are the five elements of an inferential statistics problem?
- The population of interest
- One or more variables (characteristics of the population units) that are to be investigated
- The sample of population units
- The inference about the population based on information contained in the sample
- A measure of the reliability of the inference
What is a response value?
The value of the variable we are interested in finding.
What is a sample size?
The number of experimental units (e.g., participants) in the sample.
What is a population?
Set of all experimental units we are interested in studying
What is a sample?
A subset of the population for which we have actual observations.
Why do we use samples?
We use samples because it is often impossible to gather data on all experimental
units.
Define descriptive statistics.
Utilizes numerical and graphical methods to look for patterns in a
data set, to summarize the information revealed in a data set, and to present that
information in a convenient form.
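As a rough illustration of the numerical side of descriptive statistics, the sketch below summarizes a small data set with Python's standard `statistics` module. The exam scores are invented for this example and are not from the course material.

```python
import statistics

# Hypothetical data set: exam scores for ten students (invented for illustration).
scores = [72, 85, 90, 66, 85, 78, 94, 85, 70, 75]

# Numerical summaries that describe the data set without generalizing beyond it.
summary = {
    "n": len(scores),
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "range": max(scores) - min(scores),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Nothing here makes a claim about a larger population; that step belongs to inferential statistics.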
What is inferential statistics?
Utilizes sample data to make estimates, decisions, predictions, or
other generalizations about a larger set of data.
What is a variable?
A characteristic or property of an individual experimental (or observational)
unit in the population.
What is a census?
A study that includes a measurement of every experimental unit.
What is a statistical inference?
An estimate, prediction, or some other generalization about a
population based on information contained in a sample
What is a measure of reliability?
A statement (usually quantitative) about the degree of
uncertainty associated with a statistical inference.
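One common measure of reliability (an assumption here, since the card does not name a specific one) is the standard error of a sample mean, computed as the sample standard deviation divided by the square root of the sample size. The measurements below are invented for illustration.

```python
import math
import statistics

# Hypothetical sample of five measurements (invented for illustration).
data = [4.1, 3.9, 4.3, 4.0, 4.2]

estimate = statistics.mean(data)   # the inference: an estimate of the population mean
s = statistics.stdev(data)         # sample standard deviation (n - 1 denominator)
se = s / math.sqrt(len(data))      # standard error of the sample mean

print(f"estimate = {estimate:.2f}, standard error = {se:.3f}")
```

A smaller standard error suggests less uncertainty in the estimate, assuming the data come from a random sample.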
Define quantitative data.
Data that are measured on a naturally occurring numerical scale.
Differentiate between interval and ratio data.
Quantitative data can be either:
o Interval data: the origin (zero point) has no meaning
o Ratio data: the origin (zero point) is a meaningful number
Define qualitative data.
Measurements that cannot be measured on a natural
numerical scale; they can only be classified into one of a group of categories
Differentiate between ordinal and nominal data.
Qualitative data can be either:
o Nominal: categories cannot be ranked
o Ordinal: categories can be ranked/ordered
What is a published source?
The data set of interest has already been collected.
What are designed experiments?
Studies conducted by the researcher in which the units in the study are placed under
strict controls.
What is an observational study?
A study in which researchers observe experimental units in their natural settings
and record the variables of interest. In this case, there are no
controls placed on the units.
What is a representative sample?
A sample that exhibits characteristics typical of those possessed by the target
population.
What is a simple random sample?
A simple random sample of n experimental units is a sample selected from the
population in such a way that every different sample of size n has an equal chance of
selection.
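A minimal sketch of drawing a simple random sample, assuming the population can be listed in full. The population of 100 labeled units and the fixed seed are hypothetical choices made so the example is repeatable.

```python
import random

# Hypothetical population: 100 experimental units labeled 1..100.
population = list(range(1, 101))
n = 10  # desired sample size

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# random.sample draws n distinct units without replacement, so every
# subset of size n is equally likely to be chosen.
sample = rng.sample(population, n)
print(sorted(sample))
```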
What is selection bias?
Selection bias occurs when some experimental units in the population have less chance
of being included in the sample than others
What is nonresponse bias? When is it likely to happen?
A type of selection bias that results when data on all experimental
units in a sample are not obtained. It is likely to happen in surveys where people choose whether or not they will respond.
What is measurement error? Give an example.
Inaccuracies in the values of the data collected. For example, in surveys the
error may be due to ambiguous or leading questions and the interviewer's effect on the
respondent.
What is a parameter?
A numerical summary of the population
What are some drawbacks of random sampling?
Not everyone wants to participate.
Those selected may not be qualified to participate.
The random sample might not have enough range.
Explain stratified random sampling.
The population is divided into non-overlapping subpopulations (strata).
Characteristics of cases are more similar within strata than across strata.
* Then take a simple random sample in each stratum.
* Combine the random selection from each stratum to make overall sample.
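The steps above can be sketched as follows. The helper `stratified_sample`, the student records, and the choice of four units per stratum are all hypothetical; real designs often sample each stratum in proportion to its size instead.

```python
import random

def stratified_sample(units, stratum_of, n_per_stratum, rng):
    """Take a simple random sample of n_per_stratum units from each stratum."""
    # Step 1: divide the population into non-overlapping strata.
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)

    # Steps 2 and 3: simple random sample within each stratum, then combine.
    sample = []
    for label in sorted(strata):
        sample.extend(rng.sample(strata[label], n_per_stratum))
    return sample

# Hypothetical population: 60 students as (id, class year), 20 per year.
years = ["first-year", "second-year", "third-year"]
students = [(i, years[i % 3]) for i in range(60)]

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
sample = stratified_sample(students, lambda s: s[1], 4, rng)
print(sample)
```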
Explain cluster sampling. When is this method commonly used?
The population is divided into a large number of (small) clusters.
* We randomly sample a few clusters.
* We then choose all cases within the sampled clusters.
* This technique is usually used when the population is dispersed, e.g. across a wide geographic region.
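A minimal sketch of cluster sampling, assuming the clusters are already known. The helper `cluster_sample` and the village/household labels are hypothetical and exist only for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick n_clusters clusters and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [unit for label in chosen for unit in clusters[label]]
    return chosen, sample

# Hypothetical population: 20 villages (clusters) with 5 households each.
clusters = {
    f"village_{v:02d}": [f"v{v:02d}_h{h}" for h in range(5)]
    for v in range(20)
}

rng = random.Random(1)  # fixed seed only so the sketch is repeatable
chosen, sample = cluster_sample(clusters, 3, rng)
print(chosen)
print(sample)
```

Only the clusters are chosen at random; once a cluster is selected, all of its units enter the sample.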
What is systematic sampling?
Cases are selected in an ordered sampling scheme.
* First case is selected at random, subsequent elements
follow a predetermined pattern.
* For example, selecting every kth case from the list of all cases.
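A minimal sketch of systematic sampling, assuming the cases are available as an ordered list. The helper `systematic_sample` and the list of 100 cases are hypothetical.

```python
import random

def systematic_sample(units, k, rng):
    """Pick a random starting case among the first k, then every kth case after it."""
    start = rng.randrange(k)  # random index in 0..k-1
    return units[start::k]

# Hypothetical ordered list of 100 cases.
units = list(range(1, 101))

rng = random.Random(7)  # fixed seed only so the sketch is repeatable
sample = systematic_sample(units, 10, rng)
print(sample)
```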
What is the most common type of nonprobability sampling? Explain what it is.
The most common type of nonprobability sampling is volunteer sampling.
* It occurs when a survey (e.g., a questionnaire) is sent to a selection of people that is not chosen at random, and the data come only from those who volunteer a
response.
Describe the difference between a statistic and a parameter.
A parameter is a number describing a whole population (e.g., population mean), while a statistic is a number describing a sample (e.g., sample mean).
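To make the distinction concrete, the sketch below simulates a hypothetical population of 1,000 heights (invented values, in cm), computes the population mean as the parameter, and then computes the mean of a simple random sample of 50 as the statistic.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed only so the sketch is repeatable

# Hypothetical population: 1,000 simulated heights in cm.
population = [rng.gauss(170, 10) for _ in range(1000)]

# Parameter: describes the whole population (usually unknown in practice).
parameter = statistics.mean(population)

# Statistic: describes a sample drawn from that population.
sample = rng.sample(population, 50)
statistic = statistics.mean(sample)

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

The statistic typically lands near the parameter but not exactly on it, and it would change if a different sample were drawn.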
Explain the difference between an explanatory variable and a response variable.
An explanatory variable is the expected cause, and it explains the results. A response variable is the expected effect, and it responds to other variables.