Module 1 Flashcards
Statistics
Field of study involving the science of collecting, describing, and analyzing data
Population
Consists of all individuals or objects of interest; often, the group being studied
Sample
A group selected from the population or a subset of the population; provides information used to infer information about populations
Why are samples used
Because it is difficult to get access to an entire population
Question posed to entire STAT250 section:
What is your favorite streaming service?
What is the population?
What is the sample?
Can the sample data be used to make inferences about the population?
Population: All Mason students
Sample: STAT250 students in this section
Can this sample be used to make inferences about the population? Yes and no; it depends on whether this section is representative of all Mason students. For example, are all students taking a math/logic course as their Mason Core?
What happened when Truman was predicted to lose?
The polls were answered by people (generally wealthier and with different voting preferences than the population as a whole) who were not a representative sample of the population, so the premature prediction was inaccurate.
Random Sample
Names in a hat method;
When choosing a simple random sample of n units, all groups of size n in the population have the same chance of becoming the sample
Best ways to do random sampling?
Small group: names in a hat
Large group: technology (random.org, rguroo, statkey, excel)
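A minimal Python sketch of the technology route, assuming a hypothetical roster of 500 student IDs (the roster, the seed, and the sample size are illustrative and not from the course). The standard library's random.sample draws without replacement, so every group of n IDs has the same chance of becoming the sample.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n distinct units; every group of size n is equally likely."""
    rng = random.Random(seed)  # the seed only makes the illustration repeatable
    return rng.sample(list(population), n)

# Hypothetical roster: student IDs 1 through 500 (illustrative only)
roster = range(1, 501)
sample = simple_random_sample(roster, n=10, seed=250)
print(sample)
```

Leaving seed as None gives a fresh sample on each run, which is what an actual study would want.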
What things might prohibit sampling?
Inability to contact population; cost
Non-Random Sample
Targeting a particular group rather than selecting by chance, which may or may not bias the data obtained
Is it possible to successfully gather a random sample without tech?
Nope.
Explain Question Wording bias
- Question wording, or confusing or non-neutral language/leading questions;
- inaccurate responses (response bias)
- self-reporting data (usually have strong opinions one way or another)
What types of variables are considered during data collection?
Explanatory Variable & Response Variable
Explanatory Variable
The variable we think is “explaining” the change in the response variable (usually the variable scientists are manipulating)
Response Variable
The variable we think is being impacted or changed by the explanatory variable; it “responds” to changes in the explanatory variable
What is an association relationship between variables?
Two variables are associated if values of one variable tend to be related to values of the other variable
What is a causation relationship between two variables?
Two variables are causally related (or associated) if changing the value of the explanatory variable influences the value of the response variable
Which of the following statements imply a causal relationship? Also, think about which are the explanatory and response variables.
a) “Daily Exercise Improves Mental Performance.”
b) “Want to lose weight? Eat more fiber!”
c) “Sales stay the same no matter what medium of advertising is used.”
d) “Goldfish who live in large ponds tend to be larger than goldfish who live in small ponds.”
a & b have causal relationships
Is the following statement a causal or association relationship?
“Goldfish who live in large ponds tend to be larger than goldfish who live in small ponds”
Association: Words like “tend to” or “usually” do not imply causation
What words do not imply causation?
“tend to” or “usually”
What type of variable relationship does the following statement imply?
“Sales stay the same no matter what medium of advertising is used”
Neither causation nor association
Identify the explanatory and response variables:
a) “Daily Exercise Improves Mental Performance.”
b) “Want to lose weight? Eat more fiber!”
c) “Sales stay the same no matter what medium of advertising is used.”
d) “Goldfish who live in large ponds tend to be larger than goldfish who live in small ponds.”
Explanatory variables; Response variables
a) daily exercise; mental performance
b) fiber consumption; weight maintenance
c) advertising; sales
d) pond size; goldfish size
T/F An association between an explanatory and response variable, especially when very strong, always guarantees that the two variables are causally related
False; some variables are associated even when they have no cause and effect relationship
Confounding Variable
- A third variable that is associated with both the explanatory and response variable;
- Also called a confounding factor or lurking variable
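A small simulation can make the definition concrete. This is only a sketch under invented assumptions: the variable names (temperature, ice cream sales, sunburn cases), the numbers, and the helper pearson_r are all hypothetical and not from the course. Temperature drives both of the other variables, which never influence each other, yet they still come out associated.

```python
import random

def pearson_r(xs, ys):
    """Sample correlation coefficient, used here as a simple measure of association."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1)

# Hypothetical confounding variable: daily temperature
temperature = [rng.gauss(70, 10) for _ in range(1000)]

# Each variable depends on temperature plus its own noise, never on the other
ice_cream_sales = [t + rng.gauss(0, 5) for t in temperature]
sunburn_cases = [t + rng.gauss(0, 5) for t in temperature]

r = pearson_r(ice_cream_sales, sunburn_cases)
print(f"association between sales and sunburns: r = {r:.2f}")
```

The two variables are strongly associated only because both track the third variable, which is the situation the definition describes.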
T/F Whenever a confounding variable is present there is a causal relationship
False; a causal relationship cannot be determined when a confounding variable is present
Name two methods of data collection
Observational & Experimental
Observational Study
A study in which the researcher does not actively control the value of any variable, but simply observes values as they naturally exist
Experiment Collection method
A study in which the researcher actively controls one or more of the explanatory variables; also known as a randomized experiment
Explain some characteristics of randomized experiments
- researcher controls the assignment of one or more of the variables
- to avoid confounding, the researcher will use random assignment
- in a randomized experiment, the value of the explanatory variable for each unit is determined randomly, before the response variable is measured
- if a randomized experiment yields an association between the explanatory and response variable, we can establish a causal relationship between these two variables
What are some characteristics of observational studies?
- observes individuals and measures variables of interest but does not attempt to influence the responses
- often, the researcher is collecting available data after the fact
- can almost never be used to establish causation since they almost always possess confounding variables
Experimental or Observational Method
We contact a random sample of 100 people and record how much each person exercises and also measure the chemicals in the brain of each person.
Observational; the researcher did not manipulate or control any variable, only recorded existing values
Experimental or Observational Method:
Using a random sample of 100 people, we randomly assign half of them to participate in a regular exercise program for six weeks while the other half makes no changes. At the end of the time period, we measure the brain chemicals of each person.
Experimental; the researcher controlled the explanatory variable (exercise) through random assignment and then measured the response variable (brain chemicals)
What are three explanations for why an association may be observed in sample data?
1) There is a causal relationship or causal association
2) There is an association, but it is due to confounding
3) There is no actual association; the association seen in the sample data occurred by random chance
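Explanation 3 can be illustrated with a short simulation, sketched here under stated assumptions: the two variables are generated independently (so there is no actual association), small samples are drawn repeatedly, and the code counts how often the sample still shows a moderately strong association. The sample size of 10, the 0.5 cutoff, the 1000 repetitions, and the helper pearson_r are illustrative choices, not from the course.

```python
import random

def pearson_r(xs, ys):
    """Sample correlation coefficient, used here as a simple measure of association."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(2)
repetitions = 1000
sample_size = 10
strong_by_chance = 0

for _ in range(repetitions):
    # Two independent variables: no real association exists between them
    xs = [rng.gauss(0, 1) for _ in range(sample_size)]
    ys = [rng.gauss(0, 1) for _ in range(sample_size)]
    if abs(pearson_r(xs, ys)) > 0.5:
        strong_by_chance += 1

share = strong_by_chance / repetitions
print(f"share of samples with |r| > 0.5 by chance alone: {share:.1%}")
```

Increasing sample_size should shrink that share, which is one reason larger samples make the random-chance explanation less plausible.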
Evaluating evidence for the causal explanation requires evaluating __________ against the other two competing explanations
evidence
What is the best evidence against (2) confounding variables?
the use of random assignment
Two main types of randomized experiments
Randomized Comparative Experiment & Matched Pairs Experiment
Randomized Comparative Experiment
Randomly assigning cases to different treatment groups (or to a treatment group and a control group) and then comparing results on the response variable
Matched Pairs Experiment
Each case gets both treatments in random order (or cases get paired up in some other obvious way). Then, we examine individual differences in the response variable between the two treatments
Four methods for randomization
1) Put all names/numbers in a hat
2) Put names or numbers on cards, shuffle cards, deal out cards into as many piles as there are treatments
3) For the matched pairs experiment, a coin can be flipped to determine which treatment the individual will receive first
4) Use technology (randomize.org, rguroo, statkey)
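Methods 2 and 3 can be sketched in Python as follows. The function names, the participant labels, and the seed are hypothetical; the 16-participant paper-versus-Kindle setup is borrowed from the reading-speed card in this deck. The first function shuffles and deals units into one pile per treatment; the second flips a simulated fair coin for each unit to decide which treatment comes first.

```python
import random

def assign_to_groups(units, treatments, seed=None):
    """Shuffle the units, then deal them out like cards, one pile per treatment."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

def matched_pairs_order(units, treatments, seed=None):
    """Flip a simulated fair coin per unit to decide which of two treatments comes first."""
    rng = random.Random(seed)
    first, second = treatments
    orders = {}
    for unit in units:
        heads = rng.random() < 0.5
        orders[unit] = (first, second) if heads else (second, first)
    return orders

# 16 hypothetical participant labels (illustrative only)
participants = [f"P{i}" for i in range(1, 17)]

groups = assign_to_groups(participants, ["paper", "Kindle"], seed=1)
orders = matched_pairs_order(participants, ["paper", "Kindle"], seed=1)

print(groups)
print(orders)
```

In the comparative design each participant lands in exactly one group, while in the matched pairs design every participant receives both treatments and only the order is random.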
Which type of randomized experiment is each of the following?
1) To see if people read faster on paper or a Kindle, one study was done by randomly assigning 16 participants into two groups (paper or Kindle) and timing their reading of a set of instructions.
2) Another study was done in which 16 people read two sets of instructions of similar length, one on a Kindle and one on paper. The order in which they read the instructions was randomized.
1) Randomized comparative experiment
2) Matched pairs design; this makes more sense in this case since individuals' reading speeds may vary a great deal