Module 3 Flashcards
Data analytics from start to finish: The tools of BA range from collecting the data (the start) to analyzing data (the finish)
this includes the following:
- the data analytics process
- the importance of statistics
Data analytics process: what do we do with data?
- collect
-clean
-organize
-analyze
-communicate
measures of the data: what we study
- measures of the center of the data (mean, median, mode)
- measures of variation: range and standard deviation
mean
the most common measure of center
mean = sum of values divided by the number of values
most affected by outliers
The median
-middle number
-less sensitive to outliers
mode
occurs most often
not affected by outliers
may be no mode or several modes
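A small sketch of the three measures of center, using Python's standard statistics module. The numbers are made up for illustration, and summarize is a hypothetical helper name. The point is to see how adding one outlier moves the mean a lot, the median a little, and the mode not at all.

```python
import statistics

# Made-up values for illustration; 100 is added as a deliberate outlier.
values = [2, 3, 3, 5, 7]
with_outlier = values + [100]

def summarize(data):
    """Return the three measures of center for a list of numbers."""
    return {
        "mean": statistics.mean(data),         # sum of values / number of values
        "median": statistics.median(data),     # middle number of the sorted data
        "modes": statistics.multimode(data),   # list, since there may be several modes
    }

base = summarize(values)
shifted = summarize(with_outlier)

print(base)
print(shifted)

# Two values tie for most frequent here, so there are two modes.
print(statistics.multimode([1, 1, 2, 2]))
```

With these values the mean jumps from 4 to 20 once the outlier is added, while the median only moves from 3 to 4 and the mode stays at 3.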
sample standard deviation
most commonly used measure of variation
shows variation about the mean
square root of variance
in Excel it is STDEV (STDEV.S in newer versions)
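A minimal sketch of the sample standard deviation computed step by step, assuming the usual n - 1 divisor for a sample. The data values are made up, and the result is compared with Python's statistics.stdev as a check.

```python
import math
import statistics

# Made-up sample for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: the mean.
mean = sum(data) / len(data)

# Step 2: squared distances from the mean (variation about the mean).
squared_deviations = [(x - mean) ** 2 for x in data]

# Step 3: sample variance divides by n - 1, not n.
sample_variance = sum(squared_deviations) / (len(data) - 1)

# Step 4: standard deviation is the square root of the variance.
sample_sd = math.sqrt(sample_variance)

print(sample_sd)
print(statistics.stdev(data))  # library version of the same calculation
```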
Histogram
-representation of the ___________ of ______ data
-great for checking how the data is ____
- distribution, numerical
-spread
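A rough text-only sketch of what a histogram does: sort numerical data into equal-width bins and count how many values land in each. The data, the bin width of 2, and the name bin_start are all assumptions chosen for illustration.

```python
from collections import Counter

# Made-up numerical data and an assumed bin width.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
bin_width = 2

def bin_start(x):
    """Return the lower edge of the bin that x falls into."""
    return (x // bin_width) * bin_width

# Count how many values fall into each bin.
counts = Counter(bin_start(x) for x in data)

# Print every bin from the lowest to the highest, including empty ones,
# so gaps in the distribution stay visible.
for start in range(bin_start(min(data)), bin_start(max(data)) + 1, bin_width):
    print(f"{start}-{start + bin_width - 1}: {'#' * counts[start]}")
```

The row of # marks per bin shows where the data piles up and how far it is spread out.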
correlation
- measures whether two variables _____
- a ___ correlation means they are unrelated
-a ____ correlation means they move in the same direction
-the _____ is a good way to visualize correlation
-correlation is not ____
- move together
-0
-positive
-XY scatterplot
-causation
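A minimal sketch of the correlation coefficient (Pearson's r), written out by hand so the steps are visible. The function name pearson_r and all data values are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: how closely two variables move together."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of the products of paired deviations from the means.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: rescales the result to lie between -1 and 1.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_movement / (spread_x * spread_y)

x = [1, 2, 3, 4, 5]
same_direction = [2, 4, 6, 8, 10]   # rises whenever x rises
opposite = [10, 8, 6, 4, 2]         # falls whenever x rises
no_linear_link = [1, 2, 3, 2, 1]    # no overall linear trend with x

r_same = pearson_r(x, same_direction)
r_opposite = pearson_r(x, opposite)
r_none = pearson_r(x, no_linear_link)

print(r_same, r_opposite, r_none)
```

Here r is about 1 for the pair that moves in the same direction, about -1 for the pair that moves in opposite directions, and about 0 for the pair with no linear relationship.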
Two types of categorical variables
input variables
-group data into segments
-0/1
Outcome variable
-group data into two outcomes
-essential to the concept of probability
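A small sketch of coding categorical variables as 0/1. The records, the region labels, and the variable names are hypothetical. It illustrates why a two-outcome variable ties into probability: the mean of a 0/1 column is the share of 1s.

```python
# Hypothetical records: (region, purchased) pairs made up for illustration.
records = [
    ("east", "yes"),
    ("west", "no"),
    ("east", "yes"),
    ("west", "yes"),
    ("east", "no"),
]

# Input variable as 0/1: 1 if the record is in the "east" segment, else 0.
is_east = [1 if region == "east" else 0 for region, _ in records]

# Outcome variable as 0/1: 1 if a purchase happened, else 0.
purchased = [1 if outcome == "yes" else 0 for _, outcome in records]

# The mean of a 0/1 column is the share of 1s, an estimate of the probability.
p_purchase = sum(purchased) / len(purchased)

# Same share, restricted to the east segment picked out by the input variable.
east_outcomes = [y for x, y in zip(is_east, purchased) if x == 1]
p_purchase_east = sum(east_outcomes) / len(east_outcomes)

print(p_purchase, p_purchase_east)
```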
hypothesis testing
method to decide whether the data in hand (____) sufficiently support a particular hypothesis about population parameters
a hypothesis test makes _____ statements about ____ parameters
- samples
-probabilistic, population
P value
if p < .05, we ____ H0
if p > .05, we ____ H0
- reject
-retain
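The decision rule on this card can be sketched as a tiny function. The name decide is hypothetical, the p-values passed in are made up, and treating a p-value of exactly .05 as "retain" is an assumption, since the card does not cover that case.

```python
def decide(p_value, alpha=0.05):
    """Return the decision about H0 for a given p-value."""
    # Below the cutoff: the data are unlikely under H0, so reject it.
    # Otherwise: not enough evidence against H0, so retain it.
    return "reject H0" if p_value < alpha else "retain H0"

print(decide(0.01))
print(decide(0.20))
```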
Type 1
Type 2
Type 3
- paired: both samples come from the same group, compared with itself
- comparing sample 1 with sample 2, 2 populations with equal variances
- comparing two samples, 2 populations with unequal variances
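A rough sketch of how the t statistic is computed for each of the three types, using only the standard library. The sample values are made up and the function names are hypothetical. Reading the three types as paired, pooled (equal variances), and Welch (unequal variances) is an assumption based on the descriptions on this card. Turning t into a p-value needs a t distribution, which is left out here.

```python
import math
import statistics

def t_paired(a, b):
    """Type 1: same group measured twice; test the pairwise differences."""
    diffs = [y - x for x, y in zip(a, b)]
    std_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_error

def t_equal_var(a, b):
    """Type 2: two samples, variances assumed equal (pooled variance)."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * statistics.variance(a)
              + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (statistics.mean(b) - statistics.mean(a)) / std_error

def t_unequal_var(a, b):
    """Type 3: two samples, each variance kept separate (Welch)."""
    std_error = math.sqrt(statistics.variance(a) / len(a)
                          + statistics.variance(b) / len(b))
    return (statistics.mean(b) - statistics.mean(a)) / std_error

# Made-up measurements for illustration.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 15]

print(t_paired(before, after))
print(t_equal_var(before, after))
print(t_unequal_var(before, after))
```

On the same numbers the paired statistic comes out much larger than the two-sample ones, because pairing removes the person-to-person variation. With equal sample sizes, as here, the pooled and Welch statistics coincide. They differ in their degrees of freedom, which this sketch does not compute.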
Tails
2 tail, whether 2 populations are different from one another
1 tail, whether one population mean is greater than or less than the other