Stats 511 Flashcards

Question 1

Q

Creativity study: 47 subjects were randomly divided into two groups. Each subject received one questionnaire.
The questionnaires were designed to invoke either intrinsic motivation (e.g., the pleasure of writing something good) or extrinsic motivation (e.g., receiving public recognition for their writing). The subjects then wrote a haiku, which is how creativity was measured.
12 poets then evaluated each haiku and assigned it a creativity score from 1 to 40. The average of the judges’ scores for a subject’s haiku is that subject’s creativity score.

Salary Study:
Did Harris Trust and Savings Bank discriminate by paying higher salaries to men than to women between 1969 and 1977? The data set contains the starting salaries of 32 males and 61 females, which covers all people hired during this time.

List the differences between these studies.

Which one is a sample, and which one is a census (population)?

Which one is a randomized experiment?
Which one is an observational study?

Which one is approximately balanced?
Which one is not balanced?

List other differences.

Answer

A

Sample: The creativity study covers only 47 people (a sample) out of a larger group of individuals.

Census: The bank study covers all people (the population) who were hired during the time period.

Randomized: The creativity study is a randomized experiment; subjects were randomly assigned to groups.

Observational: The groups in the salary study were determined by the subjects themselves, not assigned by the investigator.

Approximately balanced: The group sizes in the creativity study are approximately balanced (both groups have about the same n).

Not balanced: The group sizes in the salary/bank study are not balanced (32 males vs. 61 females).

Other:
1. The response in the creativity study (creativity score) attempts to quantify something vague (creativity). The salaries are already numbers, so they are easier to quantify.
2. Duration of the study: a week or so for the creativity study vs. 8 years for the salary study.
3. Difference in expense.
4. The creativity score is subjective.
5. The environment needs to be controlled for the creativity study.
6. Were the poets blinded to the treatment? The salaries were assigned without blinding.

Question 2

Q

What is Scope of Inference?

Answer

A

What can we infer from this study? What does it tell us about the world?

Question 3

Q

What can we ask to determine the scope of inference? (2 separate questions.)

Answer

A

Were subjects randomly assigned to groups?
If subjects WERE randomly assigned, then we can infer causation: the treatment CAUSED any differences we observed in the response.

If the subjects were NOT randomly assigned to groups, we CANNOT infer causation.
In the salary study, we can’t infer that gender CAUSED the salary differences. There may be an underlying cause associated with gender (example: education).
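
Random assignment can be sketched in a few lines of Python. This is only an illustration, not the procedure the creativity study actually used: the subject IDs, the group names, the seed, and the helper `randomly_assign` are all hypothetical.

```python
import random


def randomly_assign(subjects, group_names=("intrinsic", "extrinsic"), seed=None):
    """Shuffle the subjects, then deal them into the groups in turn.

    The shuffle is the chance mechanism: which group a subject lands in
    does not depend on anything about the subject.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {name: [] for name in group_names}
    for i, subject in enumerate(shuffled):
        groups[group_names[i % len(group_names)]].append(subject)
    return groups


# 47 hypothetical subject IDs (the real study's records are not reproduced here)
subjects = [f"subject_{i}" for i in range(1, 48)]
groups = randomly_assign(subjects, seed=511)
print({name: len(members) for name, members in groups.items()})
```

Dealing the shuffled list out in turn keeps the groups approximately balanced: with 47 subjects, one group gets 24 and the other 23.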

Was it a random sample?
If the sample is a random sample from a larger population, then the results can be inferred to reflect the larger population. Otherwise, the results apply only to the subjects observed.
It is not easy to get a random sample.
Imagine repeating the random sample many times; eventually you’ll see the whole population.

Question 4

Q

Inference to Population?

Answer

A

Inferences to populations can be drawn from random sampling studies, but not otherwise. In a random sampling study, units are selected by the investigator from a well-defined population. All units in the population have a chance of being selected, and the investigator employs a chance mechanism (like a lottery) to determine actual selection.

The subjects of the creativity study volunteered their participation; the group was self-selected, so it was not a random sampling study.

Question 5

Q

Causal Conclusions/Inference: Can statistical analysis alone be used to establish causal relationships?
Confounding variable

Answer

A

Statistical inferences of cause-and-effect relationships can be drawn from randomized experiments, but not from observational studies.

Confounding variable: a variable that is related both to group membership and to the outcome. Its presence makes it hard to establish the outcome as being a direct consequence of group membership.

Question 6

Q

Do observational studies have value?

Answer

A

Yes
1. Establishing causation is not always the goal.
2. Establishing causation may be done in other ways (example: examining people exposed to radiation from an atomic blast and those far enough away not to be affected by it).
3. Analysis of observational data may lend evidence toward causal theories and suggest the direction of future research.

Question 7

Q

The most basic form of random sampling is:

Answer

A

A simple random sample
A simple random sample of size n from a population is a subset of the population consisting of n members selected in such a way that every subset of size n is afforded the same chance of being selected.
A typical method assigns each member of the population a computer-generated random number, sorts the members by that number, and selects the first n.
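
The random-number method can be sketched as follows. The population of 100 numbered members, the sample size, the seed, and the helper name `simple_random_sample` are assumptions made for illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Tag each member with a random number, sort by it, keep the first n."""
    if n > len(population):
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    tagged = [(rng.random(), member) for member in population]
    tagged.sort(key=lambda pair: pair[0])  # order members by their random number
    return [member for _, member in tagged[:n]]


population = list(range(1, 101))  # hypothetical population: members numbered 1..100
sample = simple_random_sample(population, n=10, seed=1)
print(sorted(sample))
```

Python’s standard library offers `random.sample(population, n)`, which draws a sample without replacement directly; the version above is spelled out only to mirror the method described on the card.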

Question 8

Q

An inference
A statistical inference

Answer

A

An inference is a conclusion that patterns in the data are present in some broader
context.
A statistical inference is an inference justified by a probability model linking the
data to the broader context.

Question 9

Q

Statistical Inferences Based on Chance Mechanisms

Question 10

Q

What is the Population?

Answer

A

Unless you have a random sample, the population is conceptual. Imagine repeating the study many times. The population is the collection of subjects over these many repetitions.

Creativity study: not a random sample, but volunteers.
Over many repetitions of the study you’d have a large collection of volunteers; that is your conceptual population, and you don’t really know much about it.

Question 11

Q

Statistical Hypothesis Test Structure

Answer

A

Null Hypothesis:
Test Statistic:
Sampling Distribution: distribution of test statistic over many repetitions of the study.
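
For a randomized experiment, "many repetitions of the study" can be imitated by re-shuffling the group labels. The sketch below rests on assumptions the card does not state: the null hypothesis is taken to be "the treatment has no effect on any subject's score", the test statistic is taken to be the difference in group means, and the scores are made up for illustration.

```python
import random
from statistics import mean


def randomization_distribution(group_a, group_b, repetitions=2000, seed=None):
    """Differences in means over many random regroupings of the pooled scores.

    Under the assumed null hypothesis each score would be the same in either
    group, so every regrouping is an equally likely outcome of the study.
    """
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    diffs = []
    for _ in range(repetitions):
        rng.shuffle(pooled)
        diffs.append(mean(pooled[:n_a]) - mean(pooled[n_a:]))
    return diffs


# Made-up scores, NOT data from the creativity study
group_a = [12.0, 15.5, 17.2, 19.3, 20.5, 22.1]
group_b = [10.1, 11.8, 13.6, 14.9, 16.4, 18.0]

observed = mean(group_a) - mean(group_b)
diffs = randomization_distribution(group_a, group_b, repetitions=2000, seed=0)

# Two-sided p-value: share of regroupings at least as extreme as the observed one
p_value = sum(abs(d) >= abs(observed) for d in diffs) / len(diffs)
print(f"observed difference = {observed:.2f}, approximate p-value = {p_value:.3f}")
```

The list `diffs` approximates the distribution of the test statistic under the null hypothesis; comparing the observed statistic against it gives the p-value.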

Question 12

Q

Measuring uncertainty in randomized Experiments

Question 13

Q

Typical Randomized Experiment (Image)

Question 14

Q

Standard deviation

Answer

A

A measure of spread: the amount of variation in a set of numbers. A low SD means the values tend to be close to the mean; a high SD indicates the values are spread out over a wider range.
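
A small sketch of the calculation, assuming the sample standard deviation (divisor n - 1) is the one intended; the card does not say which version it means. Both data sets are made up and share the same mean, so only the spread differs.

```python
from math import sqrt
from statistics import mean, stdev


def sample_sd(values):
    """Sample standard deviation: root of the summed squared deviations over n - 1."""
    m = mean(values)
    return sqrt(sum((x - m) ** 2 for x in values) / (len(values) - 1))


tight = [9, 10, 10, 11]  # made-up values close to the mean
wide = [2, 6, 14, 18]    # made-up values far from the mean

for name, values in (("tight", tight), ("wide", wide)):
    print(name, "mean =", mean(values), "SD =", round(sample_sd(values), 3))

# The hand-written formula agrees with the standard library's statistics.stdev
assert abs(sample_sd(wide) - stdev(wide)) < 1e-9
```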

Question 15

Q

Null Hypothesis and Alternative Hypothesis and T-Test

Answer

A

Insert Image

Question 16

Q

Histogram

Answer

A

A histogram is ordinarily used to show broad features, not exquisite detail, and the broad features will be apparent with many choices of bin width.

Question 17

Q

stem-and-leaf diagram
https://www.youtube.com/watch?v=8WMTdnDLAj4

Answer

A

A stem-and-leaf diagram is a cross between a graph and a table. It is used to get a quick idea of the distribution of a set of measurements with pencil and paper, or to present a set of numbers in a report.
Display 1.10 shows stem-and-leaf diagrams. The digits in each observation are separated into a stem and a leaf. Each number in a set is represented in the diagram by its leaf on the same line as its stem. All possible stem values are listed in increasing order from top to bottom, whether or not there are observations with those stems. At each stem, all corresponding leaves are listed in increasing order. Outliers may require a break in the string of stems.
Stem-and-leaf diagrams show the centers, spreads, and shapes of distributions in the same way histograms do. Their advantages include exact depiction of each number, ease of determining the median and quartiles of each set, and ease of construction. Disadvantages include difficulty in comparing distributions when the numbers of observations in the data sets are very different, and severe clutter with large data sets.
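
The construction rules above can be sketched in Python. This is a minimal version that assumes non-negative integers, with the last digit as the leaf and the remaining digits as the stem; the data are made up and the helper name `stem_and_leaf` is hypothetical.

```python
def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf diagram for non-negative integers.

    The stem is the value without its last digit; the leaf is the last digit.
    Every stem between the smallest and largest is listed, even if empty.
    """
    ordered = sorted(values)
    leaves_by_stem = {}
    for value in ordered:
        leaves_by_stem.setdefault(value // 10, []).append(value % 10)
    lines = []
    for stem in range(ordered[0] // 10, ordered[-1] // 10 + 1):
        leaves = "".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}".rstrip())
    return lines


data = [24, 12, 48, 21, 29, 15, 47, 24]  # made-up measurements
for line in stem_and_leaf(data):
    print(line)
# prints:
#  1 | 25
#  2 | 1449
#  3 |
#  4 | 78
```

The empty stem 3 is still listed, which is what lets the diagram show gaps in the distribution the way a histogram would.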

Question 18

Q

Box-and-whisker plot

Answer

A

(18 cards)