Final Exam Flashcards
Statistics
Is the science of conducting studies to collect, organize, analyze, and draw conclusions from data.
Variable
A characteristic or attribute that can assume different values.
Population
Consists of all subjects (human or otherwise) that are being studied
Sample
A group of subjects selected from a population.
Biased Sample
A sample collected in such a way that some members of the population are more likely to be selected than others, so it does not fairly represent the population.
Descriptive Statistics
Consists of the collection, organization, summarization, and presentation of data.
(Describes, data can be shown in graphs, tables, etc.)
Inferential Statistics
Consists of generalizing from samples to populations, performing estimations & hypothesis tests, determining relationships among variables, and making predictions.
(Statistician tries to make inferences)
Qualitative Variables
Variables that have distinct categories according to some characteristic or attribute (Ex. hair color, drink brand, Jersey #, gender, religion, geographic location)
Quantitative Variables
Variables that can be counted or measured (Ex. age, height, weight, body temp, # of frogs)
Discrete Variables
Assume values that can be counted, such as 0, 1, 2, 3, etc. (Ex. # of frogs in a contest, # of children in a family, calls received in a month)
Continuous Variables
Can assume an infinite # of values between any 2 specific values. They are obtained by measuring and often contain fractions and decimals. (Ex. distance a frog jumps, temp of a frog)
Nominal Level of Measurement
Classifies data into mutually exclusive (nonoverlapping) categories in which no order or ranking can be placed on the data (Ex. classifying people by zip codes, political party, religion, or marital status)
Ordinal Level of Measurement
Classifies data into categories that can be ranked; however, precise differences between the ranks DON'T exist (Ex. T-shirt size, placings, letter grades)
Interval Level of Measurement
Ranks data, and precise differences between units of measure DO exist; however, there is no meaningful zero (Ex. IQ score, temperature in °F or °C)
Ratio Level of Measurement
Possesses all the characteristics of interval measurement, and there is a true zero. Also, true ratios exist when the same variable is measured on two different members of the population (Ex. scales used to measure weight, height, and area) (Ratio Ex. one person can lift 200 lbs. and another can lift 100 lbs.; this is a 2:1 ratio between them)
Random sample
Sample where all members of the population have an equal chance of being selected.
Systematic sample
Sample is obtained by selecting every kth member of the population (Ex. picking every 5th person in line)
Stratified Sample
Sample obtained by dividing the population into subgroups/strata according to some characteristic relevant to the study (there can be several subgroups). Subjects are then selected at random from each subgroup.
Cluster Sample
Obtained by dividing the population into sections/clusters, then selecting one or more clusters at random and using all the members of the cluster(s) as the sample. (Used when the population is too large or involves multiple locations)
Convenience Sample
Researcher uses subjects that are convenient (Ex. interviewing people who walk into the mall)
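The four probability-based sampling methods on the cards above can be sketched in Python. This is a minimal illustration, not a survey tool: the population of ID numbers, the parity-based strata, and all function names are hypothetical choices made for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has an equal chance of selection."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Pick a random start among the first k members, then every kth member."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Group members into strata, then draw at random from each stratum."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every member of each."""
    chosen = rng.sample(clusters, n_clusters)
    return [member for cluster in chosen for member in cluster]

rng = random.Random(0)        # fixed seed so the sketch is repeatable
people = list(range(1, 101))  # hypothetical population: ID numbers 1..100

srs = simple_random_sample(people, 10, rng)
every_5th = systematic_sample(people, 5, rng)
by_parity = stratified_sample(people, lambda p: p % 2, 5, rng)  # strata: odd/even IDs
sections = [people[i:i + 20] for i in range(0, 100, 20)]        # five clusters of 20
two_sections = cluster_sample(sections, 2, rng)
```

A convenience sample has no counterpart here because it involves no random selection at all.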
Observational Study
Researcher merely observes what is happening or what has happened in the past and tries to draw conclusions based on these observations.
Experimental Study
Researcher manipulates one of the variables and tries to determine how the manipulation influences other variables
Independent Variable
The variable that is being manipulated by the researcher; the independent variable is AKA the explanatory variable. It is plotted on the x-axis.
Dependent Variable
Resultant variable or the outcome variable. It is plotted on the y-axis.
Statistic
Characteristic or measure obtained by using the data values from a sample
Parameter
Characteristic or measure obtained by using all the data values from a specific population
Mean
The sum of all the X values, divided by the total number of values
Median
Midpoint of the data array. Symbol is MD
Mode
The value that occurs most often in a data set
Range
Highest value minus the lowest value
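As a quick check on the mean, median, mode, and range cards, here is a small sketch using Python's standard `statistics` module. The data values are made up for illustration.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # hypothetical data values

mean = statistics.mean(data)        # sum of the values / number of values
median = statistics.median(data)    # midpoint of the sorted data
mode = statistics.mode(data)        # value that occurs most often
data_range = max(data) - min(data)  # highest value minus lowest value
```

With an even number of values, the median is the average of the two middle values (here 3 and 5).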
Five Number Summary
- Minimum
- Q1
- Median
- Q3
- Maximum
Outliers
Extremely high or extremely low data values when compared with the rest of the data values
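The five-number summary and an outlier check can be sketched as follows. Two assumptions are not stated on the cards: Q1 and Q3 are taken as the medians of the lower and upper halves of the sorted data (textbooks and software differ on this), and outliers are flagged with the common 1.5 × IQR rule of thumb. The function names and data are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Min, Q1, median, Q3, max; Q1/Q3 are medians of the lower/upper halves.

    Assumes at least two data values.
    """
    xs = sorted(values)
    half = len(xs) // 2
    lower = xs[:half]    # values below the median position
    upper = xs[-half:]   # values above the median position
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])

def outliers(values):
    """Values more than 1.5 * IQR below Q1 or above Q3 (a rule of thumb)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in values if x < low or x > high]

data = [5, 6, 12, 13, 15, 18, 22, 50]  # hypothetical data values
summary = five_number_summary(data)
extreme = outliers(data)
```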
Unimodal
Data set that only has 1 value that occurs with greatest frequency
Bimodal
Data set that has exactly 2 values that occur with the same greatest frequency
Multimodal
Data set that has more than 2 values that occur with the same greatest frequency
No mode
No data values occur more than once.
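The four mode cards can be turned into one classification routine. This is a sketch under the definitions given on the cards; the function name `describe_modes` is hypothetical, and `statistics.multimode` is the standard-library call that returns every value tied for the greatest frequency.

```python
import statistics

def describe_modes(values):
    """Classify a data set by how many values tie for the greatest frequency."""
    if len(set(values)) == len(values):
        return "no mode", []  # no data value occurs more than once
    modes = statistics.multimode(values)
    labels = {1: "unimodal", 2: "bimodal"}
    return labels.get(len(modes), "multimodal"), modes
```

For example, `describe_modes([1, 1, 2, 2, 3])` reports a bimodal data set with modes 1 and 2.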
Margin of Error
Also called the maximum error of the estimate, is the maximum likely difference between the point estimate of a parameter and the actual value of the parameter
Normalcdf
Used to find the probability, or area under the normal curve, between a lower and an upper bound
Invnorm
Used to find the z-value (or, in an application problem, the actual value, such as a length, height, salary, score, or price) that has a given area to its left under the normal curve
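For studying away from the calculator, the two commands above have close equivalents in Python's standard library. The functions below borrow the calculator names, but their signatures are my own choice for this sketch, and the mean of 100 and standard deviation of 15 are made-up inputs.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

def invnorm(area, mean=0.0, sd=1.0):
    """Value that has the given area to its left under the normal curve."""
    return NormalDist(mean, sd).inv_cdf(area)

p = normalcdf(-1.96, 1.96)          # area between z = -1.96 and z = 1.96
z = invnorm(0.975)                  # z-value with 97.5% of the area to its left
x = invnorm(0.90, mean=100, sd=15)  # 90th percentile of a hypothetical score scale
```

Note that the two functions are inverses of each other: `normalcdf` goes from values to area, and `invnorm` goes from area back to a value.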
Confidence intervals (critical z-values)
90% – 1.65
95% – 1.96
99% – 2.58
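These critical values come from the standard normal curve: for a given confidence level, the z-value leaves half of the remaining area in each tail. A short sketch, with the hypothetical helper name `critical_z`, shows where they come from.

```python
from statistics import NormalDist

def critical_z(confidence):
    """z_(alpha/2): the z-value that cuts off an area of alpha/2 in each tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

table = {level: round(critical_z(level), 3) for level in (0.90, 0.95, 0.99)}
```

To three decimal places the values are 1.645, 1.960, and 2.576; the cards list them rounded to two places, with 1.645 rounded up to 1.65.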
Confidence Level
The confidence level of an interval estimate of a parameter is the probability that the interval estimate will contain the parameter, assuming that a large number of samples are selected and that the estimation process on the same parameter is repeated
The term z_(α/2) · (σ/√n) represents the ________.
Margin of error
Z-interval
Used when population standard deviation is known.
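A z-interval for a mean is the sample mean plus or minus the margin of error, where the margin of error is the critical z-value times sigma divided by the square root of n. The sketch below assumes a known population standard deviation, as the card requires; the function name and the sample numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for a mean when the population sd (sigma) is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin = z * sigma / sqrt(n)                        # margin of error
    return sample_mean - margin, sample_mean + margin, margin

# Hypothetical sample: mean 50 from n = 100 values, with sigma known to be 10.
low, high, margin = z_interval(sample_mean=50.0, sigma=10.0, n=100)
```

Raising the confidence level widens the interval, and increasing n narrows it.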
1-PropZInt
Used to construct a confidence interval that estimates a population proportion
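The proportion interval on this card can be sketched with the usual large-sample formula: the sample proportion plus or minus the critical z-value times the square root of p̂(1 − p̂)/n. That this matches the calculator's output exactly is an assumption on my part, and the function name and counts below are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(x, n, confidence=0.95):
    """Confidence interval for a proportion from x successes in n trials."""
    p_hat = x / n                                       # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin = z * sqrt(p_hat * (1 - p_hat) / n)          # margin of error
    return p_hat - margin, p_hat + margin

# Hypothetical survey: 60 of 100 respondents answered "yes".
low, high = one_prop_z_interval(x=60, n=100)
```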
Null Hypothesis
Statistical hypothesis that states that there is no difference between a parameter and a specific value, or that there is no difference between 2 parameters
Alternative Hypothesis
Statistical hypothesis that states the existence of a difference between a parameter and a specific value, or states that there is a difference between 2 parameters
Type 1 error
Occurs if you reject the null hypothesis when it is true
Type 2 error
Occurs if you do not reject the null hypothesis when it is false
Independent Samples
Two samples where the subjects selected for the 1st sample in no way influence the way the subjects are selected for the 2nd sample
Dependent Samples
Two samples where the selection of subjects for the 1st group in some way influenced the selection of subjects for the other group
Significance level
The maximum probability of committing a type 1 error
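The hypothesis-testing cards fit together in a single decision: compute a test value, find its P-value, and reject the null hypothesis when the P-value is at or below the significance level. The sketch below uses a two-tailed z test with a known population standard deviation; that choice of test, the function name, and the sample numbers are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z test of H0: mu = mu0 against H1: mu != mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))   # test value
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # area in both tails
    reject = p_value <= alpha  # rejecting a true H0 would be a type 1 error
    return z, p_value, reject

# Hypothetical sample: mean 52.5 from n = 100 values, sigma known to be 10.
z, p_value, reject = z_test(sample_mean=52.5, mu0=50.0, sigma=10.0, n=100)
```

If the null hypothesis is actually false and `reject` comes back `False`, the result is a type 2 error.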
Properties of a good estimator
- unbiased
- consistent
- efficient