Exam 1 Flashcards
Data
- the facts and figures collected, analyzed, and summarized for presentation and interpretation
Observation
- the set of measurements obtained for a particular element.
Nominal Scale
- the data for a variable consists of labels or names (the order of the labels IS NOT meaningful).
Ordinal Scale
- the data for a variable consists of labels or names (the order of the labels IS meaningful).
Interval Scale
- numeric data where the interval between values is a fixed unit of measure.
Ratio Scale
- the data have the properties of the interval scale, and the ratio of 2 values is meaningful.
Categorical Data
- use labels or names to identify an attribute of an element.
Quantitative Data
- use numeric values that indicate how much or how many.
Cross-Sectional Data
- are data collected at approximately the same point in time.
Time Series Data
- data collected over several time periods.
Statistical Inference
- The process of using data collected on a sample to draw conclusions about a population
Population
- the set of all elements of interest in a particular study.
Census
- The process of collecting data on the entire population
Sample
- a subset of the population
Sample Survey
- The process of collecting data on a sample
Frequency Distribution
- Tabular summary of the data showing the number of items in each class.
Relative Frequency Distribution
- Shows the proportion of items belonging to a class.
- Relative Frequency = Frequency/n
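As a small illustration of Relative Frequency = Frequency/n, the sketch below counts how often each class appears and divides by the number of observations. The sample data and the function name `relative_frequencies` are made up for this example.

```python
from collections import Counter

def relative_frequencies(items):
    """Return {class: frequency / n} for a list of categorical observations."""
    counts = Counter(items)   # frequency distribution: items per class
    n = len(items)            # total number of observations
    return {label: count / n for label, count in counts.items()}

# Hypothetical sample of 10 soft-drink purchases (made-up data).
sample = ["Coke", "Pepsi", "Coke", "Sprite", "Coke",
          "Pepsi", "Coke", "Sprite", "Pepsi", "Coke"]
rel_freq = relative_frequencies(sample)
print(rel_freq)
```

The relative frequencies always sum to 1, which is a quick check on the arithmetic.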
Bar Graph
- Graph showing the frequency, relative frequency, or percent frequency distribution of categorical data.
Pie Chart
- Presents the relative or percent frequency distribution.
Histogram
- Graph that shows the frequency, relative frequency, or percent frequency distribution of quantitative data, with the classes placed along the horizontal axis.
Cumulative Frequency Distributions
- Shows the number of data items with values less than or equal to the upper class limit
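A cumulative frequency distribution can be sketched as a running total of the class frequencies. The classes and counts below are hypothetical.

```python
from itertools import accumulate

# Hypothetical classes, identified by their upper class limits,
# and the number of items in each class (made-up data).
upper_limits = [14, 19, 24, 29, 34]
frequencies = [4, 8, 5, 2, 1]

# Running total: items with values <= each upper class limit.
cumulative = list(accumulate(frequencies))

for limit, total in zip(upper_limits, cumulative):
    print(f"<= {limit}: {total}")
```

The last cumulative frequency equals n, the total number of observations.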
Stem and Leaf Display
- Shows the rank order of the data.
- Shows the shape of the data set
Cross Tabulations
- Tabular summary of two variables
Scatter Diagram
- a graphical representation for two quantitative variables.
Sample Statistics
- Numbers that describe a sample
Population Parameters
- Numbers that describe a population
Mean
- Average Value
Median
- Middle Value
Mode
- Value that occurs most often
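The three measures of location above can be checked with Python's standard `statistics` module. The five data values are made up for illustration.

```python
import statistics

# Hypothetical sample of five observations (made-up data).
data = [46, 54, 42, 46, 32]

mean = statistics.mean(data)      # sum of values divided by n: 220 / 5
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # value that occurs most often

print(mean, median, mode)
```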
Percentiles
- The pth percentile is the value with at least p percent of the observations less than or equal to it and at least (100 − p) percent of the observations greater than or equal to it.
Quartiles
- 1st Quartile = 25th percentile
- 2nd Quartile = 50th percentile = median
- 3rd Quartile = 75th percentile
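One way to turn the percentile definition into a calculation is sketched below. It assumes a common textbook convention: compute the position i = (p/100)n; if i is not an integer, round up and take that value; if i is an integer, average the values in positions i and i + 1. Statistical software often interpolates differently, so results may not match every package. The data are made up.

```python
def percentile(data, p):
    """pth percentile (integer p, 0 < p < 100) using the position rule
    i = (p / 100) * n. This is one convention among several."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    values = sorted(data)
    n = len(values)
    whole, remainder = divmod(p * n, 100)  # i = whole + remainder / 100
    if remainder:
        # i is not an integer: round up to position whole + 1 (1-based).
        return values[whole]
    # i is an integer: average the values in positions i and i + 1 (1-based).
    return (values[whole - 1] + values[whole]) / 2

# Hypothetical sample of 12 monthly salaries (made-up data).
salaries = [3310, 3355, 3450, 3480, 3480, 3490,
            3520, 3540, 3550, 3650, 3730, 3925]

q1 = percentile(salaries, 25)   # 1st quartile
q2 = percentile(salaries, 50)   # 2nd quartile = median
q3 = percentile(salaries, 75)   # 3rd quartile
print(q1, q2, q3)
```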
Outliers
- observations that are much larger or smaller than the rest of the data.
Small Outliers
- less than Q1 − 1.5(Q3 − Q1).
Large Outliers
- greater than Q3 + 1.5(Q3 − Q1)
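The two outlier limits above can be applied as in this sketch. The quartiles are passed in directly (here Q1 = 3465 and Q3 = 3600, assumed values for a made-up salary sample) because different quartile conventions give slightly different limits.

```python
def find_outliers(data, q1, q3):
    """Split outliers into small and large using the 1.5 * IQR limits."""
    iqr = q3 - q1                  # interquartile range
    lower_limit = q1 - 1.5 * iqr   # below this: small outlier
    upper_limit = q3 + 1.5 * iqr   # above this: large outlier
    small = [x for x in data if x < lower_limit]
    large = [x for x in data if x > upper_limit]
    return small, large

# Hypothetical sample of 12 monthly salaries (made-up data).
salaries = [3310, 3355, 3450, 3480, 3480, 3490,
            3520, 3540, 3550, 3650, 3730, 3925]

small, large = find_outliers(salaries, q1=3465, q3=3600)
print(small, large)
```

With these assumed quartiles the limits are 3262.5 and 3802.5, so only 3925 is flagged.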
Range
- The difference between the largest and smallest numbers in the data set.
Interquartile Range (IQR)
- The difference between the third quartile and the first quartile.
Variance
- The variance is based on the difference between each observation and the mean.
Standard Deviation
- The square root of the variance.
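As a worked example, the sketch below computes the variance from the squared differences between each observation and the mean, then takes the square root. It assumes the sample version of the formula, which divides by n − 1; the population version divides by n instead. The data are made up.

```python
import math

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

# Hypothetical sample (made-up data); its mean is 44.
data = [46, 54, 42, 46, 32]

variance = sample_variance(data)   # (4 + 100 + 4 + 4 + 144) / 4
std_dev = math.sqrt(variance)      # square root of the variance
print(variance, std_dev)
```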
Z-Scores
- z-scores give the relative locations of observations within the data
- z-scores show how far a particular value is from the mean
- Z-Score = (Observation - Mean) / Standard Deviation
- z is the number of standard deviations the observation is from the mean
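The z-score formula can be applied to every observation at once, as in this sketch. It assumes the sample standard deviation (`statistics.stdev`, which divides by n − 1); the data are made up.

```python
import statistics

def z_scores(data):
    """Return (observation - mean) / standard deviation for each value."""
    mean = statistics.mean(data)
    std_dev = statistics.stdev(data)   # sample standard deviation
    return [(x - mean) / std_dev for x in data]

# Hypothetical sample (made-up data): mean 44, standard deviation 8.
data = [46, 54, 42, 46, 32]
scores = z_scores(data)
print(scores)
```

A z-score of -1.5 for the value 32 means it lies one and a half standard deviations below the mean.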
Empirical Rule
- For a mound-shaped distribution (unimodal, symmetric, approximately normal) we can use sharper approximations:
– 68.3% of the values of a normal random variable are within plus or minus one standard deviation of its mean.
– 95.4% of the values of a normal random variable are within plus or minus two standard deviations of its mean.
– 99.7% of the values of a normal random variable are within plus or minus three standard deviations of its mean.
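The three percentages can be checked numerically. For a normal random variable, the probability of falling within k standard deviations of the mean is erf(k / sqrt(2)), which the sketch below evaluates with the standard library. The function name `prob_within` is made up for this example.

```python
import math

def prob_within(k):
    """P(mean - k*sd <= X <= mean + k*sd) for a normal random variable X."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.1%}")
```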
Probability
- a numerical measure of the likelihood that an event will occur
Experiment
- a process that generates well defined outcomes
Sample Space
- the set of all outcomes
Classical Method
- Used when all the experimental outcomes are equally likely.
Relative Frequency Method
- used when data are available to estimate the proportion of the time each outcome occurs
Subjective Method
- used when we cannot assume that the outcomes are equally likely, and we have little relevant data available
Event
- A collection of outcomes (or sample points)
Complement of an event
- Suppose A is an event. The complement of A, denoted by A^c, is another event that consists of all possible outcomes that are not in A.
Union of Two Events
- If A and B are events, the union of A and B, denoted A ∪ B, is the event containing all outcomes that belong to A, B, or both.
Intersection of 2 events
- If A and B are events, the intersection of A and B, denoted A ∩ B, is the event containing all outcomes that belong to both A and B.
The Addition Law
- P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Mutually Exclusive
- Events A and B are mutually exclusive if they have no outcomes in common.
- In that case P(A ∩ B) = 0, so P(A ∪ B) = P(A) + P(B)
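The event cards above (complement, union, intersection, addition law, mutually exclusive) map directly onto set operations. The sketch below assumes one roll of a fair six-sided die, so the classical method applies and every outcome has probability 1/6. The events A, B, and C are chosen only for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair die (assumed)

def prob(event):
    """Classical method: outcomes in the event / outcomes in the sample space."""
    return Fraction(len(event), len(sample_space))

A = {2, 4, 6}   # roll is even
B = {4, 5, 6}   # roll is greater than 3
C = {1, 3}      # shares no outcomes with A

complement_A = sample_space - A   # A^c
union = A | B                     # A ∪ B
intersection = A & B              # A ∩ B

# Addition law: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
addition_law = prob(A) + prob(B) - prob(intersection)
print(prob(union), addition_law)

# A and C are mutually exclusive, so the intersection term drops out.
print(prob(A | C), prob(A) + prob(C))
```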
Random Variable
- a numerical description of the outcome of an experiment.
Discrete Random Variable
- A random variable that assumes either a finite number of values or an infinite sequence of values such as 0, 1, 2, ...
Continuous Random Variable
- A random variable that may assume any numerical value in an interval or collection of intervals
Probability Distributions
- describe how probabilities are distributed over the values of the random variable.
Expected Value or Mean
- A measure of the central location of a discrete random variable.
Variance (Probability)
- A measure of the variability of a discrete random variable.
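For a discrete random variable, the standard formulas are E(x) = Σ x·f(x) for the expected value and Var(x) = Σ (x − μ)²·f(x) for the variance, where f(x) is the probability of the value x. The sketch below applies them to a made-up probability distribution; the function names are hypothetical.

```python
def expected_value(dist):
    """E(x) = sum of x * f(x) over all values x."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(x) = sum of (x - mu)^2 * f(x) over all values x."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# Hypothetical probability distribution {value: probability} (made-up data).
dist = {0: 0.1, 1: 0.2, 2: 0.4, 3: 0.2, 4: 0.1}

mu = expected_value(dist)
var = variance(dist)
print(mu, var)
```

The probabilities must sum to 1 for the dictionary to be a valid probability distribution.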
Normal Distribution
- the most important probability distribution for continuous random variables.