HW1 CH6 - intro to Stats & Sampling Flashcards
Define individuals
the objects described by a set of data; individuals may be people, animals, or things
define variables
describes some characteristic of an individual such as a person’s height, sex, or age
define categorical variable
Places each individual into a category such as male or female
define quantitative variable
has numerical values that measure some characteristic of each individual, such as height in cm or age in years (numerical information)
Define ordinal variable
a ranked categorical variable (example: freshman, sophomore, junior, senior)
explain exploratory data analysis
uses graphs and numerical summaries to describe the variables in a data set and the relations among them
define distribution
describes what values the variable takes and how often it takes these values
define frequency distribution
a listing of the values (levels or choices) of a variable and how often each one occurs. The frequency of a value is its number of occurrences
TEST!!!! define relative frequency
ratio of frequency to the total number of observations in the data set. (the frequency/the total) = relative frequency
define relative frequency distribution
a listing of distinct values from the data set and their relative frequencies.
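A minimal Python sketch of a frequency distribution and a relative frequency distribution. The class-year data are made up for illustration and are not from the homework.

```python
from collections import Counter

# Hypothetical data: class year for 10 students (illustrative only).
years = ["freshman", "senior", "junior", "freshman", "sophomore",
         "freshman", "junior", "senior", "freshman", "junior"]

# Frequency distribution: each distinct value and how often it occurs.
frequency = Counter(years)

# Relative frequency = frequency / total number of observations.
total = len(years)
relative_frequency = {value: count / total for value, count in frequency.items()}

for value, count in frequency.items():
    print(f"{value:10s} frequency={count}  relative frequency={relative_frequency[value]:.2f}")
```

The relative frequencies should add up to 1, which is a quick check on the arithmetic.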
define pie charts
displays the distribution of a categorical variable as slices of a circle, so each category can be compared with the whole
define bar graphs
A graph drawn using vertical bars. The height of each bar represents the frequency or relative frequency of each category. For categorical data, the bars are kept separate.
define histograms
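displays the distribution of a quantitative variable using adjacent bars; the height of each bar shows the frequency or relative frequency of the values in that class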
define dotplots
displays the distribution of a quantitative variable by placing one dot above a number line for each observation
define outliers
are observations that lie outside the overall pattern of a distribution. always try to explain them
define time plot
used when observations on a variable are taken over time; it plots time on the horizontal axis and the values of the variable on the vertical axis
Define population
the entire group of individuals about which we want information
define sample
The part of the population from which we collect information. We use this to draw conclusions about the entire population
A good sample…
represents the population
Use the line from the Table of Random Digits (Table B) shown below to generate 5 random numbers between 01 and 49. The first random number is _____.
14459 26056 31424 80371 65103 62253 22490 61181
14
If people tend to respond differently to a question depending on whether the questioner is male or female, which type of bias is present?
Response Bias
A study that contacts participants at regular time intervals to collect relevant information about events that occur after the start of the study is called a __________ study.
Prospective
What type of study can be used to provide the most convincing evidence of cause and effect?
An experiment
Determine whether the data set is a population or a sample.
All male students at Lincoln High School.
Population because it is a collection of all male students
You would like to select an SRS of 5 packages of peanuts from a case containing 30 packages of peanuts. You begin by labeling the packages 01 to 30. A line from a table of random digits is shown below.
14459 26056 31424 80371 65103 62253 22490 61181
If the line above is used to select the sample, what will be the label numbers for the five packages of peanuts in the random sample?
14, 03, 10, 22, 06
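The Table B procedure can be sketched in Python: read the digits two at a time, skip groups outside the label range, and skip repeats. The function name `select_labels` and its parameters are hypothetical helpers for illustration, not part of the homework.

```python
def select_labels(digit_line, max_label, how_many, width=2):
    """Read a line of random digits in groups of `width` digits and keep
    labels from 1 to max_label, skipping out-of-range groups and repeats."""
    digits = digit_line.replace(" ", "")
    chosen = []
    # Step through the digits one group at a time; stop before a partial group.
    for start in range(0, len(digits) - width + 1, width):
        group = digits[start:start + width]
        label = int(group)
        if 1 <= label <= max_label and group not in chosen:
            chosen.append(group)
        if len(chosen) == how_many:
            break
    return chosen


line = "14459 26056 31424 80371 65103 62253 22490 61181"
print(select_labels(line, max_label=30, how_many=5))
# → ['14', '03', '10', '22', '06']
```

Calling the same function with `max_label=49` gives 14, 45, 31, 42, 48, which matches the cards that ask for numbers between 01 and 49.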
A Gallup poll sponsored by the disposable diaper industry asked, “It is estimated that disposable diapers account for less than 2% of the trash in today’s landfills. In contrast, beverage containers, third-class mail and yard waste are estimated to account for about 21% of the trash in landfills. Given this, in your opinion, would it be fair to ban disposable diapers?” 84% responded no. (Data from EESEE.)
Which type of bias does this poll suffer from?
Wording effect: the question is phrased so that it favors one answer over the other
A researcher wonders if listening to classical music will improve subjects’ abilities to memorize a list of words. She asks 5 students to listen to music while memorizing the words. Another group of students does not listen to music while memorizing the words. All the subjects are then tested to see how many of the words they remember, and the researcher observes the results. This is a(n) __________
experiment
When making a decision to build a campus dormitory, a university that has many commuter students wants to know the percentage of students who prefer to live in a campus dormitory. The population of interest is the collection of __________
all students at this university
In a __________ study, researchers enroll subjects from a common demographic background and observe them at regular intervals over an extended period of time.
Cohort
In a survey about a new immigration law in Georgia, “46% of the 132 Georgia farmers, agricultural processors, and farm service businesses who responded said they were experiencing some degree of labor shortage” (Al Hackle, “Groups seek reform, not repeal,” Statesboro Herald, June 23, 2011). The 132 who responded is the ______________.
sample
A researcher wonders if children who engage in active play for at least 10 minutes a day have a lower obesity rate than children who do not. Staff from her laboratory shadow 100 children for 1 week and record the number of minutes of active play each child participated in each day. They also record the child’s weight and BMI to assess obesity.
an observational study
A __________ design tells us how to select the sample.
sampling
One line from the Table of Random Digits is shown below. Use the random digits to generate 5 random numbers between 01 and 49. The third random number is _____.
14459 26056 31424 80371 65103 62253 22490 61181
31
“Ann Landers” is famous for giving advice. Once, in response to a letter from an engaged couple, she asked her readers to write and respond to the question “If you could do it over again, would you have children?” She received over 10,000 responses, 70% of whom said they would not have had children if they could do it over again. These results __________.
don’t indicate much due to voluntary response
A study that contacts participants at regular time intervals to collect relevant information about events that occur __________ of the study is called a prospective study.
after the start
A study of violent crime rates in a city found that the rates of violent crimes were highest on the days when ice cream sales were also high. A researcher would like to conclude that eating ice cream may be a cause of violent behavior.
The effect of ice cream sales on violent crime may be confounded with the effect of temperature. Thus temperature is a potential __________ variable.
lurking
The National Survey of Student Engagement (NSSE) regularly surveys freshmen and seniors at colleges and universities about their experiences both in and out of the classroom. Random samples from participating schools are selected, then students are asked and encouraged to complete the survey. At one university there are 3523 freshmen; a sample of 1000 of these is desired. We will use the random digits below to select the first four student labels to be included in the sample.
Random Digits: 14459 26056 31424 80371 65103 62253 22490 61181 34041
If we label the 3523 students from 0001 to 3523, the second student selected in the sample is
0371
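With four-digit labels the digits are read four at a time. A short Python check of this card, assuming groups above 3523 (and 0000) are skipped and repeats are ignored:

```python
line = "14459 26056 31424 80371 65103 62253 22490 61181 34041"
digits = line.replace(" ", "")

# Split into consecutive four-digit groups, dropping any leftover digits.
groups = [digits[i:i + 4] for i in range(0, len(digits) - 3, 4)]

# Keep groups that match a label from 0001 to 3523, without repeats.
selected = []
for group in groups:
    if 1 <= int(group) <= 3523 and group not in selected:
        selected.append(group)

print(groups)
print(selected[:4])
# → ['1445', '0371', '1181', '3404']
```

The groups 9260, 5631, and 4248 are skipped because no student has those labels, so 0371 is the second student selected.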
A Gallup poll sponsored by the disposable diaper industry asked, “It is estimated that disposable diapers account for less than 2% of the trash in today’s landfills. In contrast, beverage containers, third-class mail and yard waste are estimated to account for about 21% of the trash in landfills. Given this, in your opinion, would it be fair to ban disposable diapers?” 84% responded no. (Data from EESEE.) The _____ of the question introduced bias in this poll.
wording
A researcher wonders if listening to music will help students memorize a list of words. She gives the students the list to take home and memorize. The next day the researcher asks each student whether or not they listened to music while memorizing the words. All the subjects are then tested to see how many of the words they remember.
What type of study is this?
An observational study
what is a case control study
In a case-control study, case-subjects are selected as a random sample of individuals with a condition of interest, and control-subjects are selected as a random sample of individuals without the condition. The two groups are then compared to help identify factors which may be associated with the condition.
Define discrete variable
a variable whose possible values can be listed, usually whole-number counts (the list may be finite or go on without end, such as 0, 1, 2, ...)
define continuous variable
a variable that can take any value in an interval of numbers, so recorded values are rounded off. Examples are measurements: “length, time, and weight”
define single-value grouping
a method of grouping in which each class represents a single possible value
define cutpoint grouping
a method of grouping used with continuous data in which each class contains a range of values bounded by cutpoints. The difference between consecutive cutpoints is the class width.
what are the rules for the cutpoint grouping?
- number of classes should be between 5 and 20
- all values must be included and each value belongs to ONLY one class
- all classes share the same width
- If possible, class width should be a whole number: class width = (max value - min value) / (# of classes), rounded up
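The rules above can be sketched in Python. The measurements, the choice of 5 classes, and starting the first cutpoint at the whole number at or below the minimum are all assumptions for illustration; other data may need an extra class so the maximum still fits.

```python
import math

# Hypothetical measurements (illustrative only).
data = [12.1, 15.4, 18.9, 21.0, 22.7, 25.3, 27.8, 30.2, 33.6, 36.5, 38.0, 41.9]
num_classes = 5

# Class width: (max - min) / number of classes, rounded up to a whole number.
width = math.ceil((max(data) - min(data)) / num_classes)

# Cutpoints start at a whole number at or below the minimum.
start = math.floor(min(data))
cutpoints = [start + i * width for i in range(num_classes + 1)]

# Each value belongs to exactly one class: lower cutpoint <= value < upper cutpoint.
counts = [0] * num_classes
for value in data:
    index = int((value - start) // width)
    counts[index] += 1

for i in range(num_classes):
    print(f"{cutpoints[i]} to under {cutpoints[i + 1]}: {counts[i]}")
```

Here the width works out to 6 and the cutpoints are 12, 18, 24, 30, 36, 42, so every class has the same width and every value lands in exactly one class.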
Define Stratified Random sampling
the division of a population into smaller groups (strata) of similar individuals; a separate random sample is then selected from each group
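A minimal Python sketch of stratified random sampling. The two strata, their sizes, the labels, and the sample size of 5 per stratum are hypothetical.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population split into strata by class year.
strata = {
    "freshman": [f"F{i:02d}" for i in range(1, 41)],
    "senior": [f"S{i:02d}" for i in range(1, 21)],
}

# Draw a separate simple random sample from each stratum, then combine them.
per_stratum = 5
sample = []
for name, members in strata.items():
    sample.extend(rng.sample(members, per_stratum))

print(sample)
```

Every stratum is guaranteed to appear in the sample, which a single simple random sample of the whole population would not guarantee.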
Define simple random sampling
𝑛 individuals from the population chosen in such a way that every set of 𝑛 individuals has an equal chance to be the sample actually selected.
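In software, a simple random sample can be drawn without a table of random digits. A minimal sketch using Python's standard `random.sample`, with a hypothetical population of 30 labels and an arbitrary seed:

```python
import random

rng = random.Random(2024)  # arbitrary seed so the sketch is repeatable

# Hypothetical population: 30 individuals labeled 01 to 30.
population = [f"{i:02d}" for i in range(1, 31)]

# random.sample draws without replacement, so no label can repeat.
srs = rng.sample(population, 5)
print(srs)
```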
Define Random Cluster Sampling
divides the population into groups (clusters), often by geographic location, randomly chooses one or more whole clusters, and uses the individuals in the chosen clusters as the sample. For example, only uses mice from box 6 after box 6 was chosen at random
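A minimal Python sketch of random cluster sampling, following the mice example. The number of boxes and the number of mice per box are hypothetical.

```python
import random

rng = random.Random(3)  # arbitrary seed so the sketch is repeatable

# Hypothetical clusters: 8 boxes holding 5 mice each.
boxes = {box: [f"box{box}-mouse{m}" for m in range(1, 6)] for box in range(1, 9)}

# Randomly choose one whole cluster, then use every individual in it.
chosen_box = rng.choice(list(boxes))
sample = boxes[chosen_box]
print(chosen_box, sample)
```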
Define Voluntary Sampling (biased) also known as volunteer or self-selected sample
Let individuals choose whether to participate in the study (write-ins, call-ins, or online quick votes)
Define the Systematic sampling method
creating a sorted list of the population, randomly choosing the first participant to include in the sample, and selecting every 𝑛th participant after the first one until enough participants are selected for the sample.
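A minimal Python sketch of systematic sampling. The population of 100, the sample size of 10, and the name `k` for the sampling interval are assumptions for illustration.

```python
import random

rng = random.Random(7)  # arbitrary seed so the sketch is repeatable

# Hypothetical sorted population of 100 labeled individuals.
population = list(range(1, 101))
sample_size = 10

# Sampling interval: take every k-th individual.
k = len(population) // sample_size

# Randomly choose the starting position among the first k individuals.
start = rng.randrange(k)
sample = population[start::k][:sample_size]
print(sample)
```

Only the starting point is random; once it is chosen, the rest of the sample is fixed by the interval.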
Define The Convenience Sampling method
selecting the most easily accessible items from the population for the sample
Define unblinded Study
both the experimenter and the subject know which treatment the subject is receiving
Define single-blind study
Either the experimenter or the subject does not know which treatment is being administered, but not both
define double-blind study
both the researcher and the subject are unaware of which treatment was administered
Define Response bias
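bias caused by the behavior of the respondent or the interviewer, such as giving untruthful answers or answering differently depending on who asks the question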
define nonresponse bias
bias that occurs when a selected individual cannot be contacted or refuses to participate
define undercoverage sampling or coverage bias
When some groups in the target population are left out of the process of selecting the sample
What is the definition that links population and sample?
A population is the complete group under study. A sample is the sub-collection of members of the population from which data are collected