Statistics Flashcards
What are statistical measures?
Measure of frequency - Histogram and Frequency Distribution
Measure of central tendency - Mean, Median and Mode
Measure of spread - Standard deviation, Variance and Range
Mean formula
μ = (∑ xi) / N, where xi are the data points and N is the number of data points
Median formula
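Sort the data. If N is odd, the median is the value at position (N + 1) / 2. If N is even, it is the average of the values at positions N / 2 and (N / 2) + 1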
Mode formula
The value that occurs most often in the data (the data point with the highest count)
Standard deviation formula
σ = √( ∑(xi - μ)² / N )
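A minimal Python sketch of the central tendency and spread measures above, worked by hand on a made-up sample and cross-checked against the standard-library statistics module. The sample values are an assumption chosen only for illustration.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample, not from the flashcards

n = len(data)
mean = sum(data) / n                       # sum of the data points divided by N

ordered = sorted(data)
mid = n // 2
# Median: middle value for odd N, average of the two middle values for even N
median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

mode = max(set(data), key=data.count)      # value with the highest count

variance = sum((x - mean) ** 2 for x in data) / n   # population variance
std_dev = math.sqrt(variance)                       # population standard deviation
value_range = max(data) - min(data)

# Cross-check the hand-written formulas against the standard library
assert math.isclose(mean, statistics.mean(data))
assert math.isclose(median, statistics.median(data))
assert mode == statistics.mode(data)
assert math.isclose(std_dev, statistics.pstdev(data))
```

Note that this divides by N (population formula). Dividing by N - 1 instead gives the sample standard deviation, which is what statistics.stdev computes.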
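Normal distribution curve: what percentage of values falls within each standard deviation (sd) band?
About 68.3% = between -1 sd and +1 sd
About 13.6% on each side = between -2 sd and -1 sd, and between +1 sd and +2 sd
About 2.1% on each side = between -3 sd and -2 sd, and between +2 sd and +3 sd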
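The band percentages can be checked numerically. This is a small sketch that builds the standard normal cumulative distribution from math.erf; the helper names normal_cdf and band are made up for this example.

```python
import math

def normal_cdf(z: float) -> float:
    """Cumulative probability of a standard normal variable up to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def band(lo: float, hi: float) -> float:
    """Percentage of values lying between lo and hi standard deviations."""
    return 100.0 * (normal_cdf(hi) - normal_cdf(lo))

within_1sd = band(-1, 1)    # central band
one_to_two = band(1, 2)     # one side only; the -2 to -1 band is the same by symmetry
two_to_three = band(2, 3)   # one side only; the -3 to -2 band is the same by symmetry

print(round(within_1sd, 2), round(one_to_two, 2), round(two_to_three, 2))
```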
Formula for Linear Regression
y = b0 + b1*x1 + E, where b0 is the intercept, b1 is the coefficient of the variable x1, and E is the error term
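As an illustration, b0 and b1 can be estimated with ordinary least squares. The flashcard does not say how the coefficients are fitted, so least squares is an assumption here, and the x and y values are made up.

```python
xs = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical x1 values
ys = [3.1, 4.9, 7.2, 8.8, 11.0]   # hypothetical observed y values

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Least-squares estimates of the slope (b1) and intercept (b0)
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
s_xx = sum((x - x_mean) ** 2 for x in xs)
b1 = s_xy / s_xx
b0 = y_mean - b1 * x_mean

predictions = [b0 + b1 * x for x in xs]
residuals = [y - p for y, p in zip(ys, predictions)]   # the error term E per point

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
```

With least squares and an intercept term, the residuals sum to zero up to floating-point rounding.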
Output of logistic regression is binary, true or false?
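True. The model computes a probability between 0 and 1, which is then thresholded (commonly at 0.5) into one of two classes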
What is log transformation?
Applying a logarithm to a variable so that a nonlinear (for example, exponential) relationship becomes linear
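A short sketch of the idea, assuming an exponential curve y = a * exp(b * x) with made-up constants. Taking the natural log gives ln(y) = ln(a) + b * x, which is a straight line in x.

```python
import math

xs = [0, 1, 2, 3, 4]
ys = [3.0 * math.exp(0.5 * x) for x in xs]   # hypothetical curve: a = 3.0, b = 0.5

log_ys = [math.log(y) for y in ys]

# Step between consecutive points, before and after the transformation
raw_steps = [ys[i + 1] - ys[i] for i in range(len(ys) - 1)]
log_steps = [log_ys[i + 1] - log_ys[i] for i in range(len(log_ys) - 1)]

print(raw_steps)   # steps keep growing: the raw curve is nonlinear
print(log_steps)   # steps are constant (the slope b): the transformed curve is linear
```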
Sigmoid Curve
It's a nonlinear, S-shaped curve used in logistic regression to map any real number to a value between 0 and 1
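A minimal sketch of the sigmoid function and of how its output can be turned into a binary answer. The predict_class helper and the 0.5 threshold are assumptions for illustration, not part of the flashcards.

```python
import math

def sigmoid(z: float) -> float:
    """Map a real number z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(z: float, threshold: float = 0.5) -> bool:
    """Hypothetical helper: turn the sigmoid probability into True or False."""
    return sigmoid(z) >= threshold

for z in (-4.0, 0.0, 4.0):
    print(z, round(sigmoid(z), 3), predict_class(z))
```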
Which Python lib is used for data wrangling?
Pandas
Which Python lib is used for machine learning?
scikit-learn
Which Python lib is used for statistical functions?
NumPy
What is the difference between List and Tuple?
A list is defined with [] and a tuple with (). A list can be modified (mutable); a tuple cannot (immutable)
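A quick sketch of the difference. The variable names are made up for this example.

```python
items_list = [1, 2, 3]
items_tuple = (1, 2, 3)

# A list can be changed in place
items_list[0] = 10
items_list.append(4)

# A tuple rejects item assignment with a TypeError
try:
    items_tuple[0] = 10
    tuple_modified = True
except TypeError:
    tuple_modified = False

print(items_list, items_tuple, tuple_modified)
```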
What is an array in Python?
It's a list-like structure that the NumPy library understands (numpy.ndarray)
arr[1:2]: which index is inclusive and which is not?
1 is inclusive and 2 is not
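A small sketch, assuming NumPy is installed, that builds an array from a list and applies the slice from the card. The values are made up.

```python
import numpy as np

arr = np.array([10, 20, 30, 40])   # NumPy array built from a plain list

sliced = arr[1:2]    # start index 1 is included, stop index 2 is excluded
doubled = arr * 2    # arithmetic applies element-wise, unlike a plain list

print(sliced)    # only the element at index 1
print(doubled)
```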
How to create an arithmetic progression in a NumPy array?
Using the arange function (np.arange)
How to convert a 1-dim array to a 2-dim array in NumPy?
Using reshape
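Both calls together in a short sketch, assuming NumPy is installed. The start, stop, and step values are made up.

```python
import numpy as np

# Arithmetic progression: start at 2, stop before 14, step 2
progression = np.arange(2, 14, 2)

# Reshape the 6-element 1-dim array into 2 rows and 3 columns
matrix = progression.reshape(2, 3)

print(progression)
print(matrix)
```

The new shape must hold the same number of elements as the original array, otherwise reshape raises a ValueError.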
Joint Probability (probability of A and B occurring simultaneously)
Conditional Probability (probability of A given that B has occurred)
Marginal Probability (probability of A irrespective of B)
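Joint: P(A and B) = P(A) * P(B) when A and B are independent; in general P(A and B) = P(A | B) * P(B)
Conditional: P(A | B) = P(A and B) / P(B)
Marginal: P(A) = sum of P(A and B) over all possible outcomes of B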
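A worked sketch of the three kinds of probability on a made-up joint distribution of weather and arrival. The table, the outcome names, and the marginal helper are all assumptions for illustration.

```python
from fractions import Fraction

# Hypothetical joint distribution over (weather, arrival)
joint = {
    ("rain", "late"): Fraction(3, 20),
    ("rain", "on_time"): Fraction(1, 20),
    ("dry", "late"): Fraction(2, 20),
    ("dry", "on_time"): Fraction(14, 20),
}

def marginal(index: int, value: str) -> Fraction:
    """Sum the joint probabilities over every outcome of the other variable."""
    return sum(p for outcome, p in joint.items() if outcome[index] == value)

p_rain_and_late = joint[("rain", "late")]       # joint probability
p_rain = marginal(0, "rain")                    # marginal probability of rain
p_late = marginal(1, "late")                    # marginal probability of being late
p_late_given_rain = p_rain_and_late / p_rain    # conditional probability

# The product rule P(A) * P(B) only equals the joint probability under independence
independent = p_rain_and_late == p_rain * p_late

print(p_rain, p_late, p_late_given_rain, independent)
```

In this table the product of the marginals differs from the joint probability, so the two events are not independent.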
What is a probability distribution?
A probability distribution provides a list of all values
that the random variable can take, along with the
probability of each value occurring.
Discrete Random Variable vs Continuous Random Variable
A discrete random variable can only take a specific
value out of a predefined set of values. For example, a
throw of a die can only have 1 of 6 possible values, and a
coin toss can only have 1 of 2 possible values.
A continuous random variable can take any value
within a certain range, for example the mileage of a car
or the weight of a person.
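A tiny sketch of the contrast using the standard-library random module. The weight range is a made-up assumption.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Discrete: the result is always one of a predefined set of values
die_faces = [1, 2, 3, 4, 5, 6]
die_roll = random.choice(die_faces)

# Continuous: the result can be any real number in the range (hypothetical bounds)
weight_kg = random.uniform(50.0, 90.0)

print(die_roll, weight_kg)
```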
Bernoulli distribution
Vs
Binomial distribution
Vs
Uniform Distribution
Vs
Poisson distribution
Vs
Normal Distribution
Vs
Exponential Distribution
Bernoulli: A single trial with only 2 possible outcomes
Vs
Binomial: The number of successes in multiple independent trials, each with only 2 possible outcomes
Vs
Uniform: Every outcome has an equal probability
Vs
Poisson: The number of events occurring in a certain time
interval
Vs
Normal: Most values cluster around the mean, with symmetric tails on both sides
Vs
Exponential: The time until the next event occurs
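The standard formulas behind several of these distributions can be sketched with the standard library alone. The function names and the parameter values in the usage lines are made up for this example.

```python
import math

def bernoulli_pmf(k: int, p: float) -> float:
    """Single trial: probability of success (k = 1) or failure (k = 0)."""
    return p if k == 1 else 1.0 - p

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent two-outcome trials."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def uniform_pmf(n_outcomes: int) -> float:
    """Discrete uniform: every one of n_outcomes is equally likely."""
    return 1.0 / n_outcomes

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of k events in an interval with an average of lam events."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def exponential_cdf(t: float, rate: float) -> float:
    """Probability that the next event occurs within time t, given an event rate."""
    return 1.0 - math.exp(-rate * t)

print(binomial_pmf(5, 10, 0.5))    # 5 heads in 10 fair coin tosses
print(uniform_pmf(6))              # any single face of a fair die
print(poisson_pmf(0, 2.0))         # no events when 2 are expected on average
print(exponential_cdf(1.0, 2.0))   # next event within 1 time unit at rate 2
```

A binomial distribution with a single trial (n = 1) reduces to the Bernoulli distribution.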