Exam 1 (Modules 1-3) Flashcards
What should be avoided in constructing “good” graphs?
excess white space, clutter on the graph, 3D effects
Determine the five-number summary
(put data set in ascending order): minimum, Q1, Median (Q2), Q3, maximum
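A minimal Python sketch of the five-number summary follows. The card does not say how Q1 and Q3 are computed, so this sketch assumes the common "median of each half" method (the overall median is left out of both halves when the number of observations is odd). The function name and the sample data are hypothetical.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    values = sorted(data)  # put the data set in ascending order
    n = len(values)
    half = n // 2
    # Assumption: Q1 and Q3 are the medians of the lower and upper halves;
    # for an odd n the middle value is excluded from both halves.
    lower = values[:half]
    upper = values[half + 1:] if n % 2 else values[half:]
    return (values[0], median(lower), median(values), median(upper), values[-1])

summary = five_number_summary([7, 1, 3, 9, 5, 11, 13])
print(summary)  # (1, 3, 7, 11, 13)
```

Other quartile conventions exist and can give slightly different Q1 and Q3 values for the same data.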
Define “statistics”
science of collecting, organizing, summarizing, and analyzing information in order to describe and understand sources of variation in data
Define “the lurking variable”
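a variable not accounted for in the study that affects the variables being studied and can create a misleading association between them. Reminder: “Correlation does not equal causation!”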
Define “statistic”
numerical summary of a SAMPLE (Roman letters)
Define “descriptive statistics”
organizing and summarizing data (numerical summaries, graphs, tables)
Define “inferential statistics”
take result from a sample, extend it to the population, and measure the reliability of the result
Define “parameter”
numerical summary of a POPULATION (Greek letters)
Discrete variable
quantitative variable that has either a finite number of possible values OR a countable number of possible values. *Count to get the value. EX: number of pets, number of college credits, number of seats in an auditorium
Continuous variable
quantitative variable that has an infinite number of possible values that are not countable. *Measure to get the value. EX: distance, total rainfall, age, data use on a cell phone per month
The Process of Statistics
1) Identify the research objective (what questions need to be answered?). 2) Formulate the research question (with at least 1 variable). 3) Collect the data needed to answer the question(s). 4) Describe the data. 5) Perform inference.
Define “statistical thinking”
using statistics to analyze and critique information you come across, in order to be an informed consumer of information
Qualitative variable
contains a classification system for its variable values. May be text or numeric. EX: gender, zip code, nationality, phone number, numbers on team shirts
Quantitative variable
the variable values are a numerical range that can be added or subtracted to provide meaningful results. Equal interval magnitude scale. Can be discrete OR continuous. EX: height, weight
Frequency distribution
lists each category of data and the number of occurrences in each category of data. Frequency column = number of observations.
Relative frequency
proportion/percent of observations within a category. RF = frequency / number of observations
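As a rough illustration of the frequency and relative frequency cards, the sketch below tallies a small made-up list of categories. The data and variable names are hypothetical.

```python
from collections import Counter

observations = ["red", "blue", "red", "green", "red", "blue"]  # hypothetical data

# Frequency distribution: each category and its number of occurrences.
frequency = Counter(observations)
n = len(observations)  # number of observations

# Relative frequency: frequency / number of observations.
relative_frequency = {category: count / n for category, count in frequency.items()}

for category, count in frequency.items():
    print(category, count, round(relative_frequency[category], 3))
```

The relative frequencies should add up to 1 (or 100%), apart from rounding.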
Pareto chart
bar graph whose bars are drawn in decreasing order of frequency or relative frequency
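The ordering behind a Pareto chart can be sketched without any plotting: sort the categories by decreasing frequency and draw the bars in that order. The data below are hypothetical, and the text bars are only a stand-in for a real chart.

```python
from collections import Counter

responses = ["B", "A", "B", "C", "B", "A"]  # hypothetical data

# most_common() returns (category, frequency) pairs in decreasing order of frequency.
pareto_order = Counter(responses).most_common()

for category, count in pareto_order:
    print(category, "#" * count)  # one text "bar" per category
```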
Classes
categories in which data are grouped (e.g., 25-34, 35-44). Class width = difference between consecutive lower class limits (here, 35 - 25 = 10).
Class width value (CWV)
CWV = (largest data value - smallest data value) / number of classes, where the number of classes is usually between 5 and 20
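A short sketch of the class width calculation follows. Rounding the result up to a whole number is an assumption here (a common textbook convention so that the classes cover every data value); the card itself only gives the division. The function name and data are hypothetical.

```python
import math

def class_width(data, number_of_classes):
    """Return (raw width, width rounded up) for the given number of classes."""
    raw = (max(data) - min(data)) / number_of_classes
    # Assumption: round up so the classes span the whole data range.
    return raw, math.ceil(raw)

raw_width, rounded_width = class_width([20, 35, 41, 69], 5)
print(raw_width, rounded_width)  # 9.8 10
```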
Describe what can make a graph misleading or deceptive
scale of the graph, inconsistent scale, misplaced origin (aka not starting at 0), use of 3D effects
What makes a “good” graph?
Not too much white space, avoid “prettifying,” avoid 3D effects
3 characteristics of distribution
shape (bell-shaped, skewed), center (average value), spread (how far data goes from average value)
Population arithmetic mean
(μ, “mu”; N = size of population). μ = (x1 + x2 + … + xN) / N
Sample arithmetic mean
(x-bar; n = size of sample). x-bar = (x1 + x2 + … + xn) / n
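The population and sample means use the same arithmetic; what differs is whether the values are the whole population (a parameter) or a sample drawn from it (a statistic). A minimal sketch with hypothetical data:

```python
def arithmetic_mean(values):
    """Add up all the values and divide by how many there are."""
    return sum(values) / len(values)

population = [2, 4, 6, 8, 10]  # hypothetical population, N = 5
sample = [2, 6, 10]            # hypothetical sample from it, n = 3

mu = arithmetic_mean(population)  # population mean (a parameter)
x_bar = arithmetic_mean(sample)   # sample mean (a statistic)
print(mu, x_bar)  # 6.0 6.0
```

In this made-up example the two means happen to match; in general a sample mean only estimates the population mean.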