Statistics Flashcards
Probability
A mathematical tool to study randomness, dealing with the likelihood of an event occurring.
Statistics
The science that deals with the collection, analysis, interpretation, and presentation of data.
Descriptive Statistics
Organizing and summarizing data.
Inferential Statistics
Drawing conclusions from data, using formal methods to determine how confident we can be in those conclusions.
Population
A collection of persons, things, or objects under study.
Sample
A subset of a population that is studied directly to gain information about the larger population.
Statistic
A number that represents a property of a sample
Parameter
A numerical characteristic of a population that can be estimated by a statistic
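The difference between a parameter and a statistic can be sketched in a few lines. The population below is made-up data, and the mean is just one possible choice of numerical characteristic.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 1,000 people (made-up data).
population = [random.randint(18, 90) for _ in range(1000)]

# Parameter: a numerical characteristic of the whole population.
population_mean = statistics.mean(population)

# Statistic: the same quantity computed from a sample only.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two numbers are usually close but not equal; the statistic is an estimate of the parameter.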
Representative Sample
A sample that accurately represents the parameters of the whole population
Variable
A characteristic or measurement that can be determined for each member of the population.
Typically denoted as X or Y
Numerical Variable
A variable that takes on values with equal units, such as weight in pounds or time in hours.
Categorical Variables
Variables that identify a category that the object is in.
Data
The values of a variable
Datum
A single value.
Qualitative Data
The result of categorizing or describing attributes of a population. AKA categorical data.
Quantitative Data
Numbers. The result of counting or measuring attributes of a population.
Quantitative Discrete Data
Data that is measured on a scale that has a finite number of values within a finite interval.
Quantitative Continuous Data
Data measured on a scale that has an infinite number of values within a finite interval
Pie Chart
A graph in which categories of data are represented by wedges of a disk, each proportional in size to the percent of individuals in that category
Bar Graph
A graph in which the length of the bar is proportional to the number of individuals in each category
Pareto Chart
A bar graph in which bars are ordered from largest to smallest
Random Sampling
A sampling method in which each individual has an equal chance of being selected for the sample
Simple Random Sample
A random sampling method in which any group of n individuals is equally likely to be chosen as any other group of n individuals
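A minimal sketch of a simple random sample, assuming the population fits in a list. The roster and the helper name `simple_random_sample` are hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n distinct members; every group of n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

students = [f"student_{i}" for i in range(1, 31)]  # hypothetical roster
chosen = simple_random_sample(students, 5, seed=42)
print(chosen)
```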
Stratified Sample
A sample obtained by dividing the population into groups called strata and then taking a proportionate number from each stratum
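One way to sketch proportionate stratified sampling: take the same fraction from every stratum. The strata, the 10% fraction, and the rounding to a whole number are assumptions made for illustration.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Take the same fraction of members from every stratum."""
    rng = random.Random(seed)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)  # whole number of members to draw
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical strata: class years of different sizes.
strata = {
    "freshman": [f"fr_{i}" for i in range(40)],
    "sophomore": [f"so_{i}" for i in range(30)],
    "junior": [f"jr_{i}" for i in range(20)],
    "senior": [f"sr_{i}" for i in range(10)],
}
picked = stratified_sample(strata, 0.10, seed=1)
print(picked)
```

With these sizes the sample holds 4, 3, 2, and 1 members from the four strata, matching each stratum's share of the population.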
Cluster Sample
A sampling method in which one divides the population into clusters or groups and then randomly selects some of the clusters.
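A rough sketch of cluster sampling. The clusters are hypothetical, and keeping every member of each chosen cluster is an assumption about the usual convention rather than something the definition states.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep all of their members."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen_names}

# Hypothetical clusters: four departments of a company.
clusters = {
    "sales": ["Ann", "Bob", "Cy"],
    "legal": ["Dee", "Eli"],
    "it": ["Fay", "Gus", "Hal", "Ivy"],
    "hr": ["Jo", "Kim"],
}
selected = cluster_sample(clusters, 2, seed=5)
print(selected)
```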
Systematic Sample
A sampling method in which a starting point is chosen at random and then every nth piece of data from the population is added to the sample
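A minimal sketch of systematic sampling, assuming the population is an ordered list and the random start falls within the first `step` items. The function name and data are hypothetical.

```python
import random

def systematic_sample(population, step, seed=None):
    """Pick a random start, then take every `step`-th item after it."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random index among the first `step` items
    return population[start::step]

items = list(range(1, 101))  # hypothetical population of 100 numbered items
every_tenth = systematic_sample(items, 10, seed=7)
print(every_tenth)
```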
Convenience Sampling
A non-random method of sampling that involves taking the data that is readily available.
Sampling with Replacement
A chosen member goes back into the population before the next selection. This allows for the possibility of being chosen more than once.
Sampling without Replacement
When a member of a population can only be chosen once.
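The two kinds of sampling can be compared side by side. This is a sketch using Python's standard `random` module on a made-up five-member population.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable
population = ["A", "B", "C", "D", "E"]

# With replacement: every draw is made from the full population,
# so the same member may appear more than once.
with_replacement = rng.choices(population, k=5)

# Without replacement: a chosen member is not available for later draws,
# so every member appears at most once.
without_replacement = rng.sample(population, k=5)

print(with_replacement)
print(without_replacement)
```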
Sampling Errors
Errors in data resulting from the sampling process, such as a sample size that is too small
Nonsampling Errors
Errors in data not resulting from the sampling process such as a defective counter.
Sampling Bias
Created when some members of a population are more likely to be chosen than other members.
Level of Measurement
The way a set of data is measured
Nominal Scale
Used to measure qualitative data. The categories are not ordered in any way
Ordinal Scale
Similar to the nominal scale, it categorizes. But unlike the nominal scale, it is able to order the data.
Interval Scale
A measuring scale that has a definite ordering and meaningful differences between data points, but no true zero starting point
Ratio Scale
A quantitative measuring scale in which there is a starting point (0), and ratios can be calculated between data points
Frequency
The number of times a value of the data occurs
Relative Frequency
The ratio of the frequency of a particular data point to the total number of outcomes.
Cumulative Relative Frequency
The sum of the relative frequencies of all previous values plus the relative frequency of the current value.
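Frequency, relative frequency, and cumulative relative frequency can be built into one table. The data set below is made up, and the running total is taken to include the current row.

```python
from collections import Counter

data = [2, 3, 3, 4, 4, 4, 5, 5, 6, 6]  # hypothetical data
freq = Counter(data)                    # frequency of each value
total = len(data)

cumulative = 0.0
table = []
for value in sorted(freq):
    rel = freq[value] / total           # relative frequency
    cumulative += rel                   # running total of relative frequencies
    table.append((value, freq[value], rel, round(cumulative, 4)))

for row in table:
    print(row)
```

The last cumulative relative frequency should always come out to 1.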
Explanatory Variable
The variable that causes a change in another. AKA independent variable.
Response Variable
A variable that changes as a result of a change in the explanatory variable. AKA dependent variable.
Treatments
The different values of the explanatory variable
Experimental Unit
A single object or individual to be measured
Lurking Variables
Additional variables that can cloud a study
Random Assignment
Refers to randomly assigning the experimental units to the treatment groups.
Control Group
A group that is given a placebo treatment, one that cannot influence the response variable
Blinding
When a person involved in a research study does not know who is receiving the active treatments and who is receiving the placebo
Double Blind Experiment
A research study in which both the researchers and the subjects are blinded
Descriptive Statistics
An area of statistics concerned with displaying data through numerical and graphical ways.
Stem-and-Leaf Graph or Stemplot
A two-column table, [‘stem’, ‘leaf’], with the leaf being the data point’s final significant digit and the stem being the rest of the digits. The rows are in ascending order, from least to greatest.
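A small sketch of a stemplot for non-negative whole numbers, splitting each value into its final digit (leaf) and the remaining digits (stem). The data is made up, and stems with no leaves are simply skipped here, which some textbooks would list as empty rows.

```python
from collections import defaultdict

def stemplot(values):
    """Return stem-and-leaf rows for non-negative integers."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # leaf = final digit, stem = the rest
        rows[stem].append(leaf)
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(rows.items())
    ]

scores = [23, 12, 41, 30, 15, 23, 38, 21]  # hypothetical data
plot = stemplot(scores)
for line in plot:
    print(line)
```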
Outlier
An observation of data that does not fit the rest of the data. Sometimes called an extreme value.
Line Graph
A graph that uses the x-axis to plot one variable and the y-axis to plot another variable. Line segments are used to connect each point.
Bar Graphs
A graph that uses bars to display the magnitude of the data.
Histogram
A graph that consists of adjoining boxes. The horizontal axis is labeled with the data it represents, while the vertical axis is labeled with either the frequency or the relative frequency.
Frequency Polygon
A line graph with the data on the x axis and the frequency on the y axis
Time Series Graph
A graph with time on the horizontal axis and the data on the vertical axis
Quartiles
Measures of location on the horizontal axis. Q1 (25%), Q2 (50% or median), Q3 (75%).
Divides ordered data into quarters.
Percentiles
Divides ordered data into hundredths.
Median
The center of the ordered data. If the number N of data points is even, then the median is the average of the two middle values, at positions N/2 and (N/2)+1. If N is odd, then it is the value of the ((N-1)/2)+1 data point.
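The even and odd cases can be sketched directly. The positions in the comments are 1-based, while Python indexes from 0; the function is written out by hand for illustration, although the standard `statistics.median` does the same job.

```python
def median(values):
    """Median of a non-empty collection of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # 1-based position (n - 1)/2 + 1
    # Even n: average the values at 1-based positions n/2 and n/2 + 1.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([5, 1, 3]))     # odd number of data points
print(median([4, 1, 3, 2]))  # even number of data points
```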
Interquartile Range (IQR)
The spread between the first and third quartile.
IQR = Q3-Q1
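A sketch of the quartiles and the IQR, assuming the "median of each half" convention (Q1 is the median of the lower half, Q3 the median of the upper half, and the overall median is left out of both halves when the count is odd). Other quartile conventions exist and can give slightly different values. The data is made up.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + (n % 2):]  # skip the median itself when n is odd
    return (
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
    )

data = [1, 2, 4, 5, 7, 8, 9, 11, 12]  # hypothetical data
q1, q2, q3 = quartiles(data)
iqr = q3 - q1  # IQR = Q3 - Q1
print(q1, q2, q3, iqr)
```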
Box Plots or Box-Whisker Plots
Gives a good image of the concentration of data.
Constructed with the minimum value, Q1, the median (Q2), Q3, and the maximum value.
The min/max are the endpoints of the axis, Q1 marks the edge of the box closest to the min and Q3 marks the edge of the box closest to the max.
|--------|=====|====|----------|
min