MBAD 503 Flashcards
Best illustrates the distinction between statistical significance and practical importance
Increased life of hard drive from 240,000 to 250,000 hours
25% of the students in a stats class watch 8+ hours of TV a week, so I conclude that 25% of all university students do the same. Which fallacy is this?
Uses a sample not representative of all the students
NASA Challenger and Columbia disasters suggest that
Limited data may still contain important clues
Smoking isn’t harmful, my aunt lived to 90, illustrates which fallacy
Small sample generalization
Bob didn’t wear his lucky T-shirt to class, so he failed his test. Which fallacy does this illustrate?
Post hoc reasoning
T/F. Statistics is the science of believability
True
Characteristics of the statistically savvy
Technically current
Communicates well
Can deal with imperfect information
Which are practical constraints facing a business researcher
Time and money are limited
Research on humans is fraught with danger and ethics
The world is no laboratory, so some things are impractical
Which is not true?
Inconsistent treatment of data by researcher is a symptom of poor survey or research design
Science of stats tells us whether the sample evidence is convincing
The post hoc fallacy says that when B follows A then B is caused by A
Valid statistical inferences may be made when sample sizes are small if the rules for handling them are followed.
Inconsistent treatment of data by researcher is a symptom of poor survey or research design
Bond rating from firms like B+, AA, etc are examples of which measurement of data?
Ordinal
Type of charge card is an example of which kind of variable?
Nominal
Duration of a flight is an example of which kind of variable?
Continuous Ratio
Number of Nobel prize winning faculty at Oxnard U is an example of which kind of data measurement?
Discrete Ratio (involves zero)
Temp in degrees Celsius at 7:00am today is an example of which measurement of data?
Continuous Interval
T/F. Cluster Sampling is useful when strata characteristics are unknown?
True
Before deciding to assess heavy fines on noisy airlines, which sampling method would the FAA probably use to measure peak noise of departing jets?
Stratified Sample
To record aircraft size, type, carrier for a week and use this to construct a stratified sample
Sampling Bias can best be reduced by?
Random Sampling
If we use a random number generator to draw integers from 0 to 99, we would most likely find that
Some numbers would occur more than once
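A quick simulation makes the point concrete. This is only a sketch: the card does not say how many numbers are drawn, so 100 draws and the seed value are assumptions, and the variable names are illustrative.

```python
import random

# Assumption: 100 draws; the card does not say how many numbers are generated.
rng = random.Random(503)  # arbitrary seed so the run is repeatable
draws = [rng.randint(0, 99) for _ in range(100)]  # integers 0..99 inclusive

distinct = len(set(draws))
repeats = len(draws) - distinct
print(f"{distinct} distinct values, {repeats} repeated draws")
```

Drawing with replacement means a value can come up again, so some repeats are expected well before every value has appeared once.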
Which describes the observations in a dataset consisting of the GPAs and credits taken in the current quarter for randomly selected EWU students?
The EWU students
What is a time series variable
Net earnings reported by Xenia Corp. for the last 10 quarters
CDC wants to estimate the extra hospital stay that occurs when pts experience post-op a-fib. They divide the USA into 9 regions. In each region, hospitals are selected at random within each hospital size group. In each hospital, surgery pts are sampled according to known percentages by age, gender, etc.
Which sampling methods are used?
Cluster
Stratified
Simple Random
T/F. Running times for 500 runners in a race would be a univariate data set.
True
T/F. List of the ages, genders, salaries, years of experience for 50 CEOs is a multivariate data set.
True
Bond ratings for Aardco INC are B+ while bonds of Deva Corp are AA. Which level of measurement would be appropriate for this data?
Ordinal
Auto exhaust emission of CO2 is what kind of data measurement?
Ratio
Number of passengers bumped on a particular flight is what data measurement?
Ratio
What sampling method is quicker and easier?
Convenience
Professor chose 7 students from his stats class of 35 students by picking those with red shirts that day. Which kind of sample is this?
Convenience
30 work orders are selected from a filing cabinet of 500 by choosing every 15th folder. Which sampling method is this?
Systematic
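The folder example can be sketched in a few lines. Assumptions not on the card: folders are numbered 1 to 500, the starting folder is picked at random among the first 15, and selection stops once 30 work orders are drawn.

```python
import random

POPULATION = 500   # folders in the cabinet
STEP = 15          # take every 15th folder
SAMPLE_SIZE = 30   # work orders wanted

rng = random.Random(1)        # arbitrary seed for a repeatable run
start = rng.randint(1, STEP)  # assumed: random starting folder among the first 15
sample = [start + STEP * i for i in range(SAMPLE_SIZE)]
print(sample[:5], "...", sample[-1])
```

Only the starting point is random; every later pick is fixed by the step size.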
A population has groups with a small amt of variation within them but large variation among or between the groups themselves. The proper sampling technique is?
Stratified
A manager chose 2 people from his team of 8 to give an oral presentation because he thought they were representative of the whole team's views. What sampling technique did he use in choosing these 2 people?
Judgement
A professor wants to know how many MBA students would take a summer elective and took a survey of the class she was teaching. What kind of sample is this?
Convenience
A sampling technique used when groups are defined by their geographical significance is?
Cluster
Which is not an area of application of statistics in business?
Questioning executives strategic decisions
Students' evaluation of a professor's teaching is an example of which measurement type?
Interval
Tom’s SUV rolled over. SUVs are dangerous. This best illustrates which fallacy type?
Small sample generalization
Your rating of the food served at a restaurant using a three point scale of 0=gross, 1=decent, 2=yummy is what kind of data measurement?
Ordinal (ranking)
Frequency Polygon
Line graph connecting the midpoints of the histogram bin intervals plus extra at the beginning and end.
Ogive is?
Useful for?
Line graph of the cumulative frequencies. Useful for finding percentiles or for comparing shape with a benchmark.
Stem and Leaf plot
Exploratory data analysis tool
Frequency tally
Stacked Dot Plot
Compares two or more groups, like home prices in 4 different regions
Bin width, given k bins (k may come from Sturges’ Rule)
Bin width = (Xmax - Xmin)/k
Relative frequencies calculation of data in a table
Absolute frequency per bin / total number of data values
Cumulative Relative Frequencies
Accumulate relative frequency values as bin limits increase.
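A small sketch of both calculations, using made-up bin counts (the numbers are hypothetical, not from the course data):

```python
from itertools import accumulate

# Hypothetical bin counts (absolute frequencies), lowest bin first.
counts = [4, 10, 6]
total = sum(counts)

relative = [c / total for c in counts]    # frequency per bin / number of values
cumulative = list(accumulate(relative))   # running sum as bin limits increase

for c, r, cr in zip(counts, relative, cumulative):
    print(f"{c:>3}  {r:.2f}  {cr:.2f}")
```

The relative frequencies sum to 1, so the last cumulative value is 1 (up to floating-point rounding).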
Histogram
Graphical representation of a frequency distribution
Appearance is identical if vertical axis shows frequency, relative frequency, or percent
Shows SHAPE of a population.
Frequency Polygon
Line graph connecting the midpoints of the histogram bin intervals, plus extra points at the beginning and end
Log Scale is used for
Time series data that could grow at a compound rate, common when period of time is long or for data that grows rapidly.
Which is vertical and which is horizontal, bar and column?
Bar is horizontal, column is vertical
Pareto chart
Column chart of categorical data in descending order of frequency
Stacked column chart
Bar height is sum of several subtotals
Scatter Plot
Pairs of observations, starting point for bivariate analysis. Investigate the relationship between 2 variables.
Pivot Table
Interactive analysis of a data matrix
Row and column data types of a pivot table
Variables must be what kind?
Categorical or discrete numerical
Numerical
Nonzero Origin
Exaggerates the trend
Elastic graph proportions
Exaggerates the trend. To avoid this, keep the aspect ratio below 2.0
Difference between bar and column charts
Bar is qualitative data and column is numerical
Which is least likely to be used in choosing bin frequency?
Sturges’ Rule
Aesthetic Judgement
Nice limits
Always starting at zero
Always starting at zero
Are line charts used for cross sectional data?
No
A column chart would not be suitable to display which data?
500 company CEO salaries (too many numbers)
Better would be a histogram
What kind of data is allowable for a pie chart?
Categorical / nominal
Sturges’ rule
k = 1 + 3.322 log10(n), where k is the number of bins and n is the number of data values
Attributes of Sturges’ Rule
Just a guideline
Purpose is to determine bins to use
Double sample size, then add one bin class
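A minimal sketch of the rule and of the "double the sample, add one bin" attribute. The function names and the data range 0 to 80 are hypothetical; rounding k to the nearest whole number is one reasonable choice, since the rule is just a guideline.

```python
import math

def sturges_bins(n: int) -> float:
    """Suggested number of bins k = 1 + 3.322 * log10(n), unrounded."""
    return 1 + 3.322 * math.log10(n)

def bin_width(x_min: float, x_max: float, k: int) -> float:
    """Bin width = (Xmax - Xmin) / k."""
    return (x_max - x_min) / k

n = 100
k = round(sturges_bins(n))                    # treated as a guideline only
print(k, bin_width(0.0, 80.0, k))             # hypothetical data range 0..80
print(sturges_bins(2 * n) - sturges_bins(n))  # close to 1 extra bin
```

Doubling n adds 3.322 × log10(2) to k, which is almost exactly 1.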
Pie Charts are popular in business because (3 reasons)
Convey a false sense of science
Can be labeled with data to facilitate interpretation
Can display major changes in parts of a whole
Empirical Rule
Gaussian distribution (bell shaped)
How to estimate sigma (range)?
(Xmax-Xmin)/6
How to find quartiles?
Find the median then the median of the bottom half and the top half.
Box Plot
Exploratory data analysis tool based on the five-number summary: Xmin, Q1, Q2 (median), Q3, Xmax
Midhinge
Average of first and third quartiles
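The median-of-halves method for quartiles and the midhinge can be sketched together. One assumption to flag: when the number of values is odd, this version leaves the median out of both halves (some textbooks and software use other conventions). The data are made up.

```python
from statistics import median

def quartiles(data):
    """Q1, Q2, Q3 by the median-of-halves method.

    Assumption: for an odd count, the median is left out of both halves.
    """
    xs = sorted(data)
    half = len(xs) // 2
    lower, upper = xs[:half], xs[-half:]
    return median(lower), median(xs), median(upper)

q1, q2, q3 = quartiles([1, 3, 5, 7, 9, 11, 13, 15])  # made-up data
midhinge = (q1 + q3) / 2  # average of the first and third quartiles
print(q1, q2, q3, midhinge)
```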
Covariance
Measures the degree to which values of x and y change together. If they’re unrelated, the covariance is zero.
Coefficient of Variation
Standard deviation / mean
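A short sketch of the calculation. It assumes the sample standard deviation (n - 1 divisor), which the card does not specify, and it refuses a mean of zero because the ratio is undefined there. The function name and data are illustrative.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """Sample standard deviation divided by the mean."""
    m = mean(data)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(data) / m

cv = coefficient_of_variation([10, 20, 30])  # made-up data
print(f"CV = {cv:.1%}")
```

Because the units cancel, the CV lets you compare spread across data sets measured in different units.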
Which way skewed if mean > median?
Skewed Right
Correlation coefficient
The standardized value of the covariance
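To see how the correlation coefficient standardizes the covariance, here is a hedged sketch using sample formulas (n - 1 divisor). The helper names and the paired data are made up for illustration.

```python
from statistics import mean, stdev

def sample_covariance(xs, ys):
    """Sum of (x - mean_x)(y - mean_y), divided by n - 1."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Covariance standardized by both sample standard deviations."""
    return sample_covariance(xs, ys) / (stdev(xs) * stdev(ys))

x = [1, 2, 3, 4]   # made-up paired data
y = [2, 4, 6, 8]   # y is exactly 2x, so r should be 1
print(sample_covariance(x, y), correlation(x, y))
```

The covariance depends on the units of x and y; dividing by both standard deviations gives a unit-free value between -1 and +1.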
T/F The skewness coefficient is zero in a sample from any normal distribution
False
T/F Coefficient of variation cannot be used when the mean is zero
True, because CV = SD/mean
T/F Standard Deviation is in same units as the mean?
True
Geometric mean is?
Nth root of data points multiplied together where N is the number of data points
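A minimal sketch of the definition, assuming all values are positive. The growth factors are hypothetical.

```python
import math

def geometric_mean(data):
    """N-th root of the product of N positive values."""
    return math.prod(data) ** (1 / len(data))

growth_factors = [1.10, 1.20, 0.95]  # hypothetical yearly growth factors
print(round(geometric_mean(growth_factors), 4))
print(geometric_mean([2, 8]))  # square root of 16
```

The geometric mean is the usual choice for averaging growth rates, since growth compounds by multiplication.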
Disadvantage of the Range is?
Only extreme values are used in its calculation
Mode is least appropriate for?
Continuous data
Which types of statistics offer robust (resistant to outliers) measures of center?
Median, Midhinge, Trimmed mean
Empirical rule says that….
about 32% of the data are beyond one SD from the mean
What percentages will lie within the first 3 standard deviations?
About 68%, 95%, and 99.7% lie within 1, 2, and 3 standard deviations of the mean
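These percentages can be checked against the standard normal distribution. A short sketch using Python's `statistics.NormalDist`; it applies only to bell-shaped (normal) data.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

# Share of a normal population within k standard deviations of the mean.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}
for k, share in within.items():
    print(f"within {k} SD: {share:.1%}; beyond: {1 - share:.1%}")
```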
Quick formula for estimating the SD?
Range / 6 OR (Xmax-Xmin)/6
Inner fence calculation
Lower inner fence = Q1 - 1.5(Q3 - Q1); upper inner fence = Q3 + 1.5(Q3 - Q1)
Outer fence calculation
Lower outer fence = Q1 - 3.0(Q3 - Q1); upper outer fence = Q3 + 3.0(Q3 - Q1)
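Both fence calculations can be sketched as below. The quartile values are hypothetical, and the labels "outlier" (beyond an inner fence) and "extreme outlier" (beyond an outer fence) follow a common convention that the cards themselves do not state.

```python
def fences(q1, q3):
    """Inner and outer fences built from the interquartile range."""
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)
    return inner, outer

def classify(x, q1, q3):
    """Label a value by where it falls relative to the fences."""
    (lo_in, hi_in), (lo_out, hi_out) = fences(q1, q3)
    if x < lo_out or x > hi_out:
        return "extreme outlier"  # assumed label, beyond an outer fence
    if x < lo_in or x > hi_in:
        return "outlier"          # assumed label, beyond an inner fence
    return "not an outlier"

inner, outer = fences(4, 12)  # hypothetical quartiles
print(inner, outer, classify(30, 4, 12))
```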
Geometric Distribution
How many Bernoulli trials it takes to get the first positive result (success)
Hypergeometric Distribution
Like the binomial, but sampling is done WITHOUT replacement, so the trials are not independent
Poisson Distribution
Number of occurrences within a specific unit of measure (like time)
Bernoulli Experiment
Random experiment with only 2 outcomes
Uniform Distribution
Every outcome has the same chance, like rolling a die
Expected Value
Each result multiplied by its probability, then all added together. It does not have to be an actual option or possible result
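A fair six-sided die is a handy illustration (the die is an assumed example, chosen because its expected value is not a face you can roll). Fractions keep the arithmetic exact.

```python
from fractions import Fraction

# Fair six-sided die: each face 1..6 has probability 1/6.
outcomes = {face: Fraction(1, 6) for face in range(1, 7)}

# Expected value: each result times its probability, summed.
expected = sum(x * p for x, p in outcomes.items())
print(expected, float(expected))
```

The result is 7/2, or 3.5, which is not a possible roll.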
Binomial Distribution SD equation
sqrt(nπ(1 - π)), where n is the number of trials and π is the probability of success on each trial
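A one-function sketch of the formula. The function name and the example values (100 trials, 50% chance of success) are hypothetical.

```python
import math

def binomial_sd(n: int, pi: float) -> float:
    """Standard deviation of a binomial count: sqrt(n * pi * (1 - pi))."""
    return math.sqrt(n * pi * (1 - pi))

print(binomial_sd(100, 0.5))  # hypothetical: 100 trials, 50% success chance
```

For a fixed n, the spread is largest at π = 0.5 and shrinks to zero as π approaches 0 or 1.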