UNIT 4 Flashcards
Random Variable
a quantitative variable whose value depends on chance.
A _____ _____ describes the outcomes of a
statistical experiment in words
Random Variable
Discrete Random Variable/ Probability distribution function
A listing of the possible
values and corresponding probabilities of a discrete random variable, or a formula for the probabilities
Probability histogram
A graph of the probability distribution that displays the possible values of a discrete
random variable on the horizontal axis and the probabilities of those values on the vertical axis.
A discrete PDF (Probability Distribution Function) has ___ characteristics
two
Each probability is between _ and _, inclusive, in other words includes zero and one
0,1
PDF Function
0 ≤ P(X = x) ≤ 1
Sigma (Σ) means to
Sum up
The sum of all probabilities in a distribution is always _
1
equation for the sum of all probabilities
Σ P(X = x) = 1
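The two characteristics of a discrete PDF can be checked mechanically. The sketch below is only an illustration: the function name and the coin-toss distribution are assumptions, not part of the cards.

```python
import math

def is_valid_discrete_pdf(pdf):
    """Check the two characteristics of a discrete PDF.

    `pdf` maps each possible value x to P(X = x).
    """
    # Characteristic 1: every probability lies between 0 and 1, inclusive.
    each_in_range = all(0 <= p <= 1 for p in pdf.values())
    # Characteristic 2: the probabilities sum to 1 (allowing for rounding).
    sums_to_one = math.isclose(sum(pdf.values()), 1.0)
    return each_in_range and sums_to_one

# Hypothetical distribution: number of heads in two fair coin tosses.
heads = {0: 0.25, 1: 0.50, 2: 0.25}
print(is_valid_discrete_pdf(heads))  # True
```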
Expected Value
The long-term average, or mean: the average result you would expect from
repeating an experiment over and over.
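For a discrete random variable, the expected value is usually computed as the sum of each value times its probability, Σ x · P(X = x). A minimal sketch, assuming a fair six-sided die as the example (the die and the function name are illustrative, not from the cards):

```python
import random

def expected_value(pdf):
    """Return the sum of x * P(X = x) over every possible value x."""
    return sum(x * p for x, p in pdf.items())

# Hypothetical example: one roll of a fair six-sided die.
die = {face: 1 / 6 for face in range(1, 7)}
print(expected_value(die))  # approximately 3.5

# The "long-term average" view: simulate many rolls and average them.
rng = random.Random(0)  # fixed seed so the run is repeatable
rolls = [rng.randint(1, 6) for _ in range(100_000)]
print(sum(rolls) / len(rolls))  # should land near 3.5
```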
Variable
A characteristic that varies from one person or thing to another
Qualitative Variable
A non-numerically valued variable; categorical variable (hair color)
Quantitative Variable
A numerically valued variable; numerical variable (weight)
Discrete Variable:
A quantitative variable whose possible values can be listed. In particular, a quantitative
variable with only a finite number of possible values is a discrete variable.
Continuous Variable
A quantitative variable whose possible values form some interval of numbers
Data
Values of a variable.
Qualitative Data
Values of a qualitative variable.
Quantitative Data
Values of a quantitative variable
Discrete Data
Values of a discrete variable; result of counting
Continuous Data:
Values of a continuous variable; result of measuring
The number of books in a backpack is ____, the weight of these books is ____.
discrete; continuous
Census
Collects information from the entire population for which data already exist
Sampling
Collect information from a representative part of the population.
Sampling is done when
it would be impractical to collect data from the entire population
Experimentation
data generated by carefully conducting an experiment
Simple Random Sampling
A sampling procedure for which each possible sample of a given size is equally likely to be the one
obtained
Simple Random Sampling With Replacement (SRSWR):
wherein a member of the population can be
selected more than once
Simple Random Sampling Without Replacement (SRS)
wherein a member of the population can be
selected at most once
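The difference between the two procedures shows up directly in Python's standard `random` module. This is a sketch with a made-up population of 20 ID numbers: `choices` draws with replacement, `sample` draws without.

```python
import random

population = list(range(1, 21))  # hypothetical population of 20 ID numbers
rng = random.Random(42)          # fixed seed so the run is repeatable

# With replacement (SRSWR): a member can be selected more than once.
with_replacement = rng.choices(population, k=5)

# Without replacement (SRS): a member can be selected at most once.
without_replacement = rng.sample(population, k=5)

print(with_replacement)
print(without_replacement)
```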
random-number generators
software that generates random numbers
Systematic Random Sampling
Divide the population size by the sample size and round down = m
Use a random-number generator to pick a number k between 1 and m
Select members k, k+m, k+2m, …
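The three steps above can be sketched as a short function. The room numbers and the function name are hypothetical, and members are assumed to be numbered starting from 1.

```python
import random

def systematic_sample(population, sample_size, rng=random):
    """Select every m-th member, starting from a random position k."""
    m = len(population) // sample_size  # step 1: population / sample size, rounded down
    k = rng.randint(1, m)               # step 2: random start between 1 and m, inclusive
    # Step 3: take members numbered k, k+m, k+2m, ... (1-based numbering).
    positions = [k + i * m for i in range(sample_size)]
    return [population[pos - 1] for pos in positions]

rooms = list(range(1, 101))  # hypothetical dormitory rooms numbered 1 to 100
print(systematic_sample(rooms, 10, random.Random(1)))
```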
Cluster Sampling
Divide the population into groups (clusters)
Obtain a simple random sample of the clusters
Use all members of the clusters from step 2 as the sample
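A minimal sketch of the cluster procedure, assuming the population is already grouped (the dormitory floors and names below are invented for illustration):

```python
import random

def cluster_sample(clusters, num_clusters, rng=random):
    """Pick whole clusters at random and keep every member of each."""
    # Step 2: simple random sample of cluster names.
    chosen = rng.sample(sorted(clusters), num_clusters)
    # Step 3: every member of each chosen cluster goes into the sample.
    return [member for name in chosen for member in clusters[name]]

# Hypothetical population: students grouped by dormitory floor (step 1).
floors = {
    "floor 1": ["Ana", "Ben", "Cy"],
    "floor 2": ["Dee", "Eli"],
    "floor 3": ["Fay", "Gus", "Hal"],
    "floor 4": ["Ivy", "Jon"],
}
print(cluster_sample(floors, 2, random.Random(7)))
```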
Stratified Random Sampling with Proportional Allocation
Divide the population into strata (subpopulations)
From each stratum, obtain a simple random sample whose size is proportional to the stratum's size (stratum size / population size = % of the sample from that stratum)
Use all members from step 2 as the sample
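Proportional allocation can be sketched as follows. The class-year strata are hypothetical, and rounding each stratum's share is a simplification: with awkward proportions the total may differ slightly from the requested sample size.

```python
import random

def stratified_sample(strata, sample_size, rng=random):
    """Draw from each stratum in proportion to its share of the population."""
    population_size = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        # Stratum's share of the sample = stratum size / population size.
        n_stratum = round(sample_size * len(members) / population_size)
        # Simple random sample (without replacement) inside the stratum.
        sample.extend(rng.sample(members, n_stratum))
    return sample

# Hypothetical population of 100 students in three strata.
strata = {
    "freshmen": [f"F{i}" for i in range(50)],
    "sophomores": [f"S{i}" for i in range(30)],
    "juniors": [f"J{i}" for i in range(20)],
}
# A sample of 10 takes 5 freshmen, 3 sophomores, and 2 juniors.
print(stratified_sample(strata, 10, random.Random(0)))
```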
Convenience Sampling
A type of non-random sampling that uses results that are readily available, e.g. conducting a study related to
organic food by collecting data from customers as they walk through the doors of Whole Foods
You are conducting a survey of students in a
dormitory. You choose your sample by knocking on
the door of every 10th room.
Choosing every 10th room makes this a
______ ______. The sample may be
representative, as long as students were
randomly assigned to rooms.
Systematic sample
To survey opinions on a proposed new water line,
a research firm randomly draws the addresses of
150 homeowners from a public list of all
homeowners.
The records presumably list all homeowners,
so drawing randomly from this list produces a
_____ _____ _____. It has a good
chance of being representative of the
population.
Simple Random sample
Agricultural inspectors for Jefferson County check
the levels of residue from three common pesticides
on 25 ears of corn from each of the 104 corn-producing farms in the county.
Each farm may have different pesticide use,
so the inspectors consider corn from each
farm as a subgroup (stratum) of the full
population. By checking 25 ears of corn from
each of the 104 farms, the inspectors are
using _____ _____. If the ears are
collected randomly on each farm, each set of
25 is likely to be representative of its farm.
Stratified sampling
Anthropologists determine the average brain size
of early Neanderthals in Europe by studying skulls
found at three sites in southern Europe.
By studying skulls found at selected sites,
the anthropologists are using a
_______ _____. They have little
choice, because only a few skulls remain
from the many Neanderthals who once lived
in Europe. However, it seems reasonable to
assume that these skulls are representative
of the larger population.
Convenience sample
Sampling Bias
occurs when a sample is collected from a population and some members of the population are
not as likely to be chosen as others
Sampling bias can lead to
incorrect conclusions being drawn about the population being
studied
Sampling Errors
are those that occur in the actual sampling process, such as the sample not being large
enough
Non-Sampling Errors
are tied to factors not related to the sampling process, such as a defective
counting device
A sample can ____ __ an exact representative of the population (unless the sample is exactly equal to the
population) so there will always be some ____ ____.
never be; sampling error
Distribution of a Data Set
is a table, graph, or formula that provides the values of the observations and
how often they occur
Unimodal
one peak
bimodal
two equal peaks
multimodal
many equal peaks
Symmetrical distributions can be shaped as
bell, triangular, rectangular
Skewed Distributions skew ___ or ___
right or left
Reverse J Shaped distribution
swoop down left to right
Population Data
The values of a variable for the entire population
Sample Data:
The values of a variable for a sample of the population
The distribution of population data is called the
Population Distribution, or the distribution of the variable.
The distribution of sample data is called a
Sample Distribution
Truncated/Non-Truncated Graphs
By truncating the scale on the vertical axis it gives the impression that the
differences between the bars are far greater than they really are.
Improper Scaling
Number of homes this year will be double last year, so the developer doubled the width
and height, which makes it look like four times the number of homes will be built.
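The arithmetic behind the distortion: scaling both width and height by the same factor scales the area by that factor squared. A tiny sketch (the function name is illustrative):

```python
def apparent_ratio(scale_factor):
    """Area ratio when both width and height are multiplied by scale_factor."""
    return scale_factor * scale_factor

# Doubling width and height makes the picture four times as large.
print(apparent_ratio(2))  # 4
```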
Random Variable
is a quantitative variable whose value depends on chance
A _____ _____ describes the outcomes of a
statistical experiment in words
Random Variable
Typically, upper case
letters such as X or Y are used to represent
Random Variables
Continuous Random Variable
a random variable whose possible values form some interval or range of
numbers
Continuous Random Variables Represent
values that are measured, such as baseball batting averages, IQ scores, the length
of time a long-distance phone call lasts, and SAT scores
Probability Density Function pdf:
A curve representing the probability distribution of a continuous random
variable.
function of graphs
f(x)
Cumulative Distribution Function cdf:
Area under the curve used to evaluate probabilities
The area under the curve is always equal to
1
P(c < X < d) is the probability that
the random Variable (X) falls between values c and d on the x axis
Probability is found for ____ of x-values and NOT for _____ x-values
intervals; individual
P(X = c) = 0
the probability that X equals a specific value is zero
Uniform Distribution
a distribution that has constant probability since all events are equally likely to occur
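For a continuous uniform distribution on an interval [a, b], the density is the constant 1 / (b − a), so P(c < X < d) is the area of a rectangle. The sketch below assumes that continuous case; the interval endpoints and the wait-time example are made up.

```python
def uniform_probability(a, b, c, d):
    """P(c < X < d) for X uniform on [a, b], with a <= c <= d <= b."""
    if not (a <= c <= d <= b) or a == b:
        raise ValueError("need a <= c <= d <= b and a < b")
    height = 1 / (b - a)      # constant density f(x)
    return (d - c) * height   # area of the rectangle between c and d

# Hypothetical example: a wait time uniform between 0 and 15 minutes.
print(uniform_probability(0, 15, 5, 10))  # about one third
print(uniform_probability(0, 15, 7, 7))   # a single value has probability 0
```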
Almost all the observations in any data set lie within _____ standard
deviations to either side of the mean
three
Number of Standard Deviations is more commonly
referred to as the _____
z-score
Z Score =
(Data Value − Mean) / Standard Deviation
μ
mu = mean of population
σ
sigma = standard deviation
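The z-score formula translates directly into code. A minimal sketch, assuming a hypothetical set of test scores with mean 100 and standard deviation 15:

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that `value` lies from the mean."""
    return (value - mean) / std_dev

# Hypothetical test scores with mean 100 and standard deviation 15.
print(z_score(130, 100, 15))  # 2.0, two standard deviations above the mean
print(z_score(85, 100, 15))   # -1.0, one standard deviation below the mean
```

A negative z-score means the data value lies below the mean; a z-score beyond 3 in either direction is unusual, since almost all observations fall within three standard deviations.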