EDA (Exploratory Data Analysis): what is the stat process Flashcards

1
Q

What is the Mean?

A

The mean is the arithmetic average of a set of numbers: the sum of the values divided by the number of values.
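As a minimal sketch (the sample values are made up for illustration), the mean can be computed by hand or with Python's standard library:

```python
from statistics import mean

values = [2, 4, 4, 6, 9]  # hypothetical sample data

# Sum of the values divided by how many values there are.
by_hand = sum(values) / len(values)

print(by_hand)       # 5.0
print(mean(values))  # same result from the standard library
```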

2
Q

What is the Median?

A

The median is the middle value of a sorted set of numbers: in a data set, it is the value with an equal number of values (rows, JSON objects, etc.) on each side of it. With an even number of values, the median is the average of the two middle values.
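A short sketch with made-up numbers, showing both the odd and the even case:

```python
from statistics import median

odd_values = [7, 1, 5, 3, 9]  # hypothetical data, sorted: 1 3 5 7 9
even_values = [7, 1, 5, 3]    # hypothetical data, sorted: 1 3 5 7

# Odd count: the single middle value after sorting.
print(median(odd_values))   # 5

# Even count: the average of the two middle values, (3 + 5) / 2.
print(median(even_values))  # 4.0
```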

3
Q

What is the Mode?

A

The mode is the most frequently occurring value in a dataset or set of numbers.
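One way to find the mode in Python is to count occurrences; this is only a sketch, and the color values are invented for the example:

```python
from collections import Counter

values = ["red", "blue", "red", "green", "blue", "red"]  # hypothetical data

# Count how often each value occurs, then take the most common one.
counts = Counter(values)
mode_value, mode_count = counts.most_common(1)[0]

print(mode_value, mode_count)  # red 3
```

The mode also works for non-numeric data, as here, which is not true of the mean or median.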

4
Q

What is the Range?

A

The range is the largest value in a set of numbers minus the smallest value.
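A minimal sketch with made-up values:

```python
values = [12, 7, 3, 18, 10]  # hypothetical data

# Largest value minus smallest value: 18 - 3.
value_range = max(values) - min(values)

print(value_range)  # 15
```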

5
Q

What are the Central Tendencies?

A

Mean, median, and mode. In a skewed distribution the three no longer agree: the mean is pulled toward the long tail more than the median is.

6
Q

What is the Variance?

A

The variance is the average squared distance of the values from the mean.
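The sketch below works this out step by step on invented data. It assumes the population form (divide by n); the sample form divides by n - 1 instead.

```python
from statistics import pvariance, variance

values = [2, 4, 4, 6, 9]  # hypothetical data, mean is 5

center = sum(values) / len(values)
squared_distances = [(v - center) ** 2 for v in values]  # 9, 1, 1, 1, 16

# Population variance: average of the squared distances (divide by n).
population_variance = sum(squared_distances) / len(values)

# Sample variance: divide by n - 1 instead.
sample_variance = sum(squared_distances) / (len(values) - 1)

print(population_variance, pvariance(values))  # both about 5.6
print(sample_variance, variance(values))       # both 7
```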

7
Q

What is the Standard Deviation?

A

The standard deviation is the square root of the variance. Roughly, it is the typical amount we expect a point to differ from the mean, expressed in the same units as the data.
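A small sketch with made-up data, assuming the population form of the variance:

```python
from math import sqrt
from statistics import pstdev, pvariance

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean is 5

# Population variance is 4, so the standard deviation is its square root.
by_hand = sqrt(pvariance(values))

print(by_hand)         # 2.0
print(pstdev(values))  # 2.0
```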

8
Q

What are Correlations?

A

Correlations are the methods we use to test the relationship between variables, whether quantitative or categorical. More simply, they describe how things are related.

9
Q

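What is a Correlation Coefficient?

A

A correlation coefficient is a number that puts a value on the relationship, describing its strength and direction. Pearson's r, for example, ranges from -1 (perfect negative relationship) through 0 (no linear relationship) to 1 (perfect positive relationship).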
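As an illustration, here is a hand-rolled Pearson's r on invented data. The function name `pearson_r` is hypothetical, and Pearson's r is only one of several correlation coefficients.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: co-variation divided by the product of the spreads."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    co_variation = sum(a * b for a, b in zip(dx, dy))
    return co_variation / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

x = [1, 2, 3, 4, 5]  # hypothetical data
y = [2, 4, 5, 4, 5]  # hypothetical data

r = pearson_r(x, y)
print(round(r, 3))  # 0.775, a fairly strong positive relationship
```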

10
Q

What is Empirical Probability?

A

Empirical probability is the probability that we observe from the data: the number of times an event occurred divided by the number of trials.
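A sketch using simulated die rolls in place of real observations (the fair die, the seed, and the number of rolls are all assumptions made for the example):

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated "observed data": 10,000 rolls of a fair six-sided die.
rolls = [rng.randint(1, 6) for _ in range(10_000)]

# Empirical probability: how often the event happened / number of trials.
empirical = rolls.count(6) / len(rolls)
theoretical = 1 / 6

print(f"empirical: {empirical:.4f}  theoretical: {theoretical:.4f}")
```

The two numbers will be close but usually not identical, which is the difference between what we observe and the underlying ideal.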

11
Q

What is Theoretical Probability?

A

Theoretical probability is more of an ideal: the true probability out there in the universe that we can't directly see.

12
Q

What is the Additive Rule?

A

This rule states that if two events are mutually exclusive (they cannot both happen in the same occurrence), then the probability that one or the other happens is the sum of their individual probabilities: P(A or B) = P(A) + P(B).
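The rule can be checked by counting outcomes for a single roll of a fair die. This is a sketch; the helper name `prob` is hypothetical.

```python
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]  # one roll of a fair die

def prob(event):
    """Probability of an event = favorable outcomes / all outcomes."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

# Rolling a 1 and rolling a 2 cannot both happen on the same roll.
p_one = prob(lambda o: o == 1)
p_two = prob(lambda o: o == 2)
p_either = prob(lambda o: o in (1, 2))

print(p_one + p_two, p_either)  # 1/3 1/3
```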

13
Q

What is a GLM (general linear model)?

A

These models say that the data can be described as a model plus some degree of error. In the simplest case the model is a line of best fit to the data.
In most cases this is written as y = b + mx rather than y = mx + b.
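A minimal sketch of the simplest case, fitting y = b + mx by ordinary least squares on invented data. The variable names are assumptions chosen to match the card.

```python
x = [1, 2, 3, 4, 5]  # hypothetical predictor values
y = [2, 4, 5, 4, 5]  # hypothetical observed values

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Least-squares slope and intercept for the line of best fit.
m = sum((a - mean_x) * (c - mean_y) for a, c in zip(x, y)) / sum(
    (a - mean_x) ** 2 for a in x
)
b = mean_y - m * mean_x

# data = model + error: whatever the line does not explain is the error.
predicted = [b + m * a for a in x]
errors = [c - p for c, p in zip(y, predicted)]

print(f"y = {b:.1f} + {m:.1f}x")  # y = 2.2 + 0.6x
```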

14
Q

What are Confidence Intervals?

A

A confidence interval is an estimated range of values that seem reasonable based on what we've observed. Its center is still the sample mean, but we've got some room on either side for our uncertainty.
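A rough sketch of a 95% interval around a sample mean. It assumes a normal approximation for simplicity (with a small sample like this one, a t-based interval would be somewhat wider), and the measurements are invented.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

sample = [4.8, 5.1, 5.4, 4.9, 5.3, 5.0, 5.2, 5.1]  # hypothetical measurements

sample_mean = mean(sample)
standard_error = stdev(sample) / sqrt(len(sample))

# Assumption: normal approximation, so z is about 1.96 for a 95% interval.
z = NormalDist().inv_cdf(0.975)

# Center on the sample mean, then leave room on either side.
low = sample_mean - z * standard_error
high = sample_mean + z * standard_error

print(f"95% CI: {low:.3f} to {high:.3f}")
```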

15
Q

What is the T-Distribution?

A

A continuous probability distribution that's unimodal (has one peak) and symmetric, with heavier tails than the normal distribution; it's a useful way to represent sampling distributions, especially for small samples.

16
Q

What is the Normal Distribution?

A

A normal distribution is a symmetric bell curve in which the mode, median, and mean are all the same, so they sit at the same central peak when you visualize it.

17
Q

What is the Central Limit Theorem?

A

The central limit theorem states that the distribution of sample means, for samples of independent random observations, gets gradually closer to a normal distribution as the sample size gets bigger and bigger, even if the original population distribution isn't normal itself.
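A small simulation can illustrate this. The sketch below draws from a strongly skewed (exponential) population and compares the skewness of the sample means for a small and a large sample size; the seed, sample sizes, and helper names are assumptions made for the example.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

def skewness(xs):
    """Skewness: 0 for a symmetric shape, positive for a long right tail."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

def sample_means(sample_size, repeats=2000):
    """Means of many samples drawn from a skewed (exponential) population."""
    means = []
    for _ in range(repeats):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(sum(sample) / sample_size)
    return means

skew_small = skewness(sample_means(2))   # still clearly skewed
skew_large = skewness(sample_means(50))  # much closer to symmetric

print(f"n=2: {skew_small:.2f}   n=50: {skew_large:.2f}")
```

The skewness shrinking toward 0 as the sample size grows is one sign of the sample means approaching a normal shape.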

18
Q

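What is Null Hypothesis Significance Testing?

A

A form of the reductio ad absurdum argument, which tries to discredit an idea by assuming the idea is true and then showing that, under that assumption, something contradictory happens. Here the idea assumed true is the null hypothesis, and the "contradiction" is observing data that would be very unlikely if it were true.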

19
Q

What is a P-value?

A

In probability terms, the p-value is the probability of getting a sample as extreme as or more extreme than ours, given that the null hypothesis is true.
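As a worked sketch, suppose (hypothetically) we flip a coin 10 times and see 9 heads, with the null hypothesis that the coin is fair. The one-sided p-value is the probability of 9 or more heads under that null:

```python
from math import comb

flips = 10
heads_observed = 9  # hypothetical result

def prob_heads(k, n=flips):
    """Probability of exactly k heads in n flips of a fair coin."""
    return comb(n, k) / 2 ** n

# "As or more extreme" here means 9 or 10 heads.
p_value = sum(prob_heads(k) for k in range(heads_observed, flips + 1))

print(p_value)  # 11/1024, about 0.0107
```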

20
Q

What is a Critical Value?

A

The value of our test statistic that marks the limits of our extreme values: a test statistic beyond it counts as extreme enough to reject the null hypothesis.
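For illustration, the sketch below finds the two-sided critical value for a z statistic. The significance level of 0.05 and the helper name `is_extreme` are assumptions.

```python
from statistics import NormalDist

alpha = 0.05  # assumed significance level

# Two-sided test: put alpha / 2 in each tail of the standard normal.
critical = NormalDist().inv_cdf(1 - alpha / 2)

def is_extreme(z):
    """True when the z statistic falls beyond the critical value."""
    return abs(z) > critical

print(round(critical, 2))                 # 1.96
print(is_extreme(2.5), is_extreme(1.0))   # True False
```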

21
Q

What is a Test Statistic?

A

A test statistic is a number computed from the sample that allows us to quantify how close things are to our expectations or theories.
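One common example is the one-sample t statistic, sketched here on invented data with an assumed expected mean of 6:

```python
from math import sqrt
from statistics import mean, stdev

sample = [4, 6, 8, 10, 12]  # hypothetical data
expected_mean = 6           # hypothetical value under the null hypothesis

# Distance between what we saw and what we expected,
# measured in units of standard error.
standard_error = stdev(sample) / sqrt(len(sample))
t_statistic = (mean(sample) - expected_mean) / standard_error

print(round(t_statistic, 3))  # 1.414
```

A value near 0 means the sample is close to the expectation; larger absolute values mean it is further away.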

22
Q

What are Boxplots?

A

A boxplot is a form of visualization that pictures the data using the median, the quartiles, and the smallest and largest values (the five-number summary).
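The numbers a boxplot draws can be computed without any plotting library. This is a sketch on made-up data; note that quartile conventions differ between tools, and this uses the default method of `statistics.quantiles`.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical data

# Quartiles split the sorted data into four equal parts.
q1, q2, q3 = quantiles(data, n=4)

five_number_summary = {
    "min": min(data),
    "q1": q1,      # lower edge of the box
    "median": q2,  # line inside the box
    "q3": q3,      # upper edge of the box
    "max": max(data),
}
print(five_number_summary)
```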

23
Q

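What are Stem and Leaf plots?

A

A form of plot that splits each value into a stem (the leading digits) and a leaf (the last digit) and lists the leaves next to their stem, showing the shape of the distribution while keeping the actual values.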
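A text-only sketch, assuming non-negative whole numbers where the tens digit is the stem and the ones digit is the leaf (the data is invented):

```python
from collections import defaultdict

values = [23, 12, 34, 15, 21, 30, 23, 38]  # hypothetical data

# Group the ones digits (leaves) under their tens digit (stem).
stems = defaultdict(list)
for v in sorted(values):
    stems[v // 10].append(v % 10)

lines = [
    f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
    for stem, leaves in sorted(stems.items())
]
print("\n".join(lines))
# 1 | 2 5
# 2 | 1 3 3
# 3 | 0 4 8
```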