5. distributions Flashcards
Normal distribution
1. Plotting the values of a variable on the X axis against how often each value occurs on the Y axis gives a bell-shaped curve
2. Centre of curve- mean
3. Left of curve- 50% of values
4. Right of curve- 50% of values
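A quick check of the 50/50 split, as a minimal sketch. It uses pnorm() (described later in these notes); the mean of 80 and sd of 15 are illustrative values only.

```r
# Illustrative parameters (assumed for this sketch)
m <- 80
s <- 15

left  <- pnorm(m, mean = m, sd = s)   # share of values below the mean
right <- 1 - left                     # share of values above the mean

print(c(left = left, right = right))
```

Both shares come out equal because the curve is symmetric around the mean.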
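R provides four functions for the normal distribution
1. dnorm(x, mean, sd)
2. pnorm(q, mean, sd)
3. qnorm(p, mean, sd)
4. rnorm(n, mean, sd)
parameters:
1. x, q- vector of values
2. p- vector of probabilities
3. n- number of random values to generate
4. mean- mean of the distribution; default value is 0
5. sd- standard deviation of the distribution; default value is 1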
dnorm()
dnorm(x, mean, sd)
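- Density
- Calculates the height of the probability density curve at each point for a given mean and sd
eg:
x <- c(10, 20, 30, 40)
#Mean and sd chosen close to the data; with mean = 0 and sd = 1 every height would be practically 0.
y <- dnorm(x, mean = 25, sd = 10)
plot(x,y)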
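To see the bell shape in numbers, here is a small sketch that evaluates the standard normal density on a grid. The grid limits (-4 to 4) are an arbitrary choice for illustration.

```r
# Evaluate the standard normal density on a grid of points
x <- seq(-4, 4, by = 0.1)
y <- dnorm(x)                    # defaults: mean = 0, sd = 1

peak <- x[which.max(y)]          # grid point where the curve is highest
height_at_mean <- dnorm(0)       # equals 1 / sqrt(2 * pi)

print(c(peak = peak, height = height_at_mean))
```

The highest point sits at the mean, which matches the "centre of curve" card.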
pnorm()
pnorm(q, mean, sd)
Direct look up
- Also known as the cumulative distribution function
- This function calculates the probability that a normally distributed random number is less than or equal to a given number
eg:
x <- seq(-1, 20, by = .2)
#Choosing the mean as 2.0 and standard deviation as 0.5.
y <- pnorm(x, mean = 2.0, sd = 0.5)
#Giving a name to the chart file.
png(file = "pnorm.png")
#Plotting the graph
plot(x,y)
#Saving the file.
dev.off()
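Because pnorm() is cumulative, the probability of landing between two values is the difference of two lookups. A minimal sketch with the same mean and sd as the example; the bounds a and b are chosen here for illustration (one sd either side of the mean).

```r
# P(a < X <= b) as a difference of cumulative probabilities
a <- 1.5
b <- 2.5
p_interval <- pnorm(b, mean = 2.0, sd = 0.5) - pnorm(a, mean = 2.0, sd = 0.5)

print(p_interval)   # share of values within one sd of the mean
```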
qnorm()
qnorm(p, mean, sd)
- Inverse lookup
- Takes a probability value as input and calculates the number whose cumulative value matches that probability
- This is the inverse of the cumulative distribution function (pnorm)
eg:
#Creating a sequence of probabilities between 0 and 1 incrementing by 0.02.
x <- seq(0, 1, by = 0.02)
#Choosing the mean as 2.0 and standard deviation as 0.5.
y <- qnorm(x, mean = 2.0, sd = 0.5)
#Giving a name to the chart file.
png(file = "qnorm.png")
#Plotting the graph
plot(x,y)
#Saving the file.
dev.off()
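A short round-trip sketch of the "inverse" idea: feeding the result of qnorm() back into pnorm() should return the original probabilities. The three probabilities are illustrative values.

```r
p <- c(0.025, 0.5, 0.975)                  # illustrative probabilities
q <- qnorm(p, mean = 2.0, sd = 0.5)        # values at those probabilities
p_back <- pnorm(q, mean = 2.0, sd = 0.5)   # pnorm() undoes qnorm()

print(q)
print(p_back)
```

The middle value of q is the mean, since half of the distribution lies below it.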
rnorm()
rnorm(n, mean, sd)
- Generates normally distributed random numbers; takes the sample size as input
eg:
#Creating a sample of 1500 random numbers with mean 80 and standard deviation 15.
x <- rnorm(1500, mean = 80, sd = 15)
#Giving a name to the chart file.
png(file = "rnorm.png")
#Plotting the histogram
hist(x, main = "Normal Distribution")
#Saving the file.
dev.off()
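Random draws change on every run. One common way to make them repeatable is set.seed(); the sketch below assumes an arbitrary seed of 42.

```r
set.seed(42)                                 # arbitrary seed, chosen for this sketch
x <- rnorm(1500, mean = 80, sd = 15)

set.seed(42)                                 # same seed again
x_again <- rnorm(1500, mean = 80, sd = 15)   # reproduces the same numbers

print(identical(x, x_again))
print(c(mean = mean(x), sd = sd(x)))         # close to 80 and 15, not exact
```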
Binomial distribution
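- Discrete probability distribution
- Finds the probability of a given number of successes in a fixed number of trials
- Each trial has only two possible outcomes; the best example is tossing a coin
functions for binom distr:
1. dbinom(x, size, prob)
2. pbinom(q, size, prob)
3. qbinom(p, size, prob)
4. rbinom(n, size, prob)
parameters:
1. x, q- vector of numbers of successes
2. p- vector of probabilities
3. n- number of random values to generate
4. size- number of trials
5. prob- probability of success on each trial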
dbinom():
Direct Look-Up, Points
- Gives the probability of each exact number of successes. In simple words, it calculates the probability mass function of the particular binomial distribution.
eg:
#Creating the possible numbers of successes in 50 trials.
x <- seq(0,50,by = 1)
#Creating the binomial distribution.
y <- dbinom(x,50,0.5)
#Giving a name to the chart file.
png(file = "dbinom.png")
#Plotting the graph.
plot(x,y)
#Saving the file.
dev.off()
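As a cross-check, dbinom() can be compared with the textbook binomial formula written out by hand. This is a sketch; the variable names and the choice of k = 25 are illustrative.

```r
n <- 50     # number of trials
p <- 0.5    # probability of success
k <- 25     # number of successes to look up

by_formula <- choose(n, k) * p^k * (1 - p)^(n - k)
by_dbinom  <- dbinom(k, n, p)
total      <- sum(dbinom(0:n, n, p))   # all point probabilities together

print(c(formula = by_formula, dbinom = by_dbinom, total = total))
```

The point probabilities over 0 to n add up to 1, as expected for a discrete distribution.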
pbinom():
Direct Look-Up, Intervals
- Calculates the cumulative probability of an event: a single value giving the probability of a given number of successes or fewer
- In simple words it calculates cumulat….n of particular binomial distribution
eg:
#Probability of getting 20 or fewer heads from 48 tosses of a coin.
x <- pbinom(20,48,0.5)
#Showing output
print(x)
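The cumulative value is the sum of the point probabilities from dbinom(). A minimal sketch using the same coin example; lower.tail = FALSE is a standard pbinom() argument that gives the upper tail instead.

```r
at_most_20   <- pbinom(20, 48, 0.5)                      # P(X <= 20)
summed       <- sum(dbinom(0:20, 48, 0.5))               # same value, added up point by point
more_than_20 <- pbinom(20, 48, 0.5, lower.tail = FALSE)  # P(X > 20)

print(c(at_most_20 = at_most_20, summed = summed, more_than_20 = more_than_20))
```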
qbinom():
Finding number of heads with the help of qbinom() function
Inverse Look-Up
- Takes a probability value and gives the smallest number whose cumulative value reaches that probability
- In simple words it calculates the inverse cumulative distribution function of the binomial distribution
- eg: Number of heads with a cumulative probability of 0.45 when a coin is tossed 48 times
x <- qbinom(0.45,48,0.5)
#Showing output
print(x)
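Since the binomial distribution is discrete, the cumulative value rarely hits the requested probability exactly. The sketch below checks the defining property instead: the result is the smallest count whose cumulative probability reaches 0.45.

```r
q <- qbinom(0.45, 48, 0.5)

reaches     <- pbinom(q, 48, 0.5) >= 0.45      # cumulative value at q reaches 0.45
falls_short <- pbinom(q - 1, 48, 0.5) < 0.45   # one head fewer does not

print(q)
print(c(reaches = reaches, falls_short = falls_short))
```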
rbinom()
Finding random values
- Generates the required number of random values for a given probability and number of trials
eg: Nine random values, each the number of successes in 160 trials with probability 0.5
x <- rbinom(9,160,0.5)
#Showing output
print(x)
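A repeatable version of the example, as a sketch with an arbitrary seed. Each value is a count of successes, so it must be a whole number between 0 and 160.

```r
set.seed(1)                  # arbitrary seed so the draw can be repeated
x <- rbinom(9, 160, 0.5)     # nine counts of successes out of 160 trials

print(x)
print(range(x))              # always inside 0..160
```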
Poisson distribution
1. Probability distribution of the number of independent events occurring in a particular time interval
2.Poisson distribution has been named after Siméon Denis Poisson(French Mathematician).
3.Many probability distributions can be easily implemented in R language with the help of R’s inbuilt functions.
There are four Poisson functions available in R:
dpois
ppois
qpois
rpois
1) dpois()
Gives the Poisson probability mass at each point; it is also used to draw the Poisson distribution in an R plot. The function dpois() calculates the probability that a random variable takes exactly the value k.
Syntax:
dpois(k, λ, log)
where,
k: number of events that happened in an interval
λ: mean number of events per interval
log: If TRUE then the function returns the probability in log form
eg:
dpois(2, 3)
dpois(6, 6)
Output:
[1] 0.2240418
[1] 0.1606231
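The first output can be reproduced from the standard Poisson formula, P(X = k) = e^(-λ) λ^k / k!. A minimal sketch; the variable names are illustrative.

```r
k <- 2
lambda <- 3

by_formula  <- exp(-lambda) * lambda^k / factorial(k)
by_dpois    <- dpois(k, lambda)
log_version <- dpois(k, lambda, log = TRUE)   # same probability on the log scale

print(c(formula = by_formula, dpois = by_dpois, from_log = exp(log_version)))
```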
2) rpois()
for generating random numbers from a given Poisson distribution
syn:
rpois(n, λ)
where,
n- number of random numbers needed
λ- mean
eg:
rpois(2, 3)
rpois(6, 6)
Output (random, so the values differ from run to run):
[1] 2 3
[1] 6 7 6 10 9 4
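With a large sample, the average of the random values should settle near λ, and so should the variance. A sketch with an arbitrary seed and sample size.

```r
set.seed(123)            # arbitrary seed for a repeatable draw
x <- rpois(10000, 6)     # 10000 random counts with mean 6

print(mean(x))           # close to 6
print(var(x))            # also close to 6 for a Poisson distribution
```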
3) qpois()
for generating quantiles of a given Poisson distribution
Quantiles divide the graph of the probability distribution into intervals which have equal probabilities
syn:
qpois(p, λ, log.p)
p- vector of probabilities
λ: mean
log.p: If TRUE then the probabilities p are given in log form
eg:
y <- c(.01, .05, .1, .2)
qpois(y, 2)
qpois(y, 6)
Output:
[1] 0 0 0 1
[1] 1 2 3 4
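Each quantile is the smallest count whose cumulative probability reaches the requested value. The sketch below checks that property against ppois() for the λ = 6 case.

```r
p <- c(.01, .05, .1, .2)
lambda <- 6

q <- qpois(p, lambda)

reaches     <- ppois(q, lambda) >= p      # cumulative value at q reaches p
falls_short <- ppois(q - 1, lambda) < p   # one count lower does not

print(q)
print(all(reaches) && all(falls_short))
```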
4) ppois()
Gives the probability that a random variable will be equal to or less than a number
syn:
ppois(q, λ, log.p)
where,
q- number of events that happened in an interval
λ- mean
log.p: If TRUE then the function returns the probability in log form
eg:
ppois(2, 3)
ppois(6, 6)
Output:
[1] 0.4231901
[1] 0.6063028
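As with the binomial case, the cumulative value is the sum of the point probabilities from dpois(). A minimal sketch for the first example; lower.tail = FALSE is a standard ppois() argument for the upper tail.

```r
at_most_2   <- ppois(2, 3)                       # P(X <= 2)
summed      <- sum(dpois(0:2, 3))                # P(0) + P(1) + P(2)
more_than_2 <- ppois(2, 3, lower.tail = FALSE)   # P(X > 2)

print(c(at_most_2 = at_most_2, summed = summed, more_than_2 = more_than_2))
```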
Linear regression
- Most commonly used type of predictive analysis in inferential statistics (checks how the dependent variable responds when the independent variable changes by one unit)
- Statistical approach for modelling the relationship between a dependent variable and a given set of independent variables
or
Relationship is estimated between two variables:
One response variable
One predictor variable
- It produces a straight line on the graph
- The goal is to identify the line that minimises the sum of squared differences between the observed data points and the line's predicted values
y = ax + b
x - independent variable or predictor variable
y - dependent variable or response variable
a and b - coefficients (a is the slope, b is the intercept)
in r:
lm(formula, data)
lm(var1 ~ var2)
2 types:
1) Simple linear regression
2) Multiple linear regression
eg:
x <- c(1, 2, 3, 4, 5)
y <- c(2, 3, 4, 5, 6)
model <- lm(y ~ x)
summary(model)
plot(x, y, main = "Linear Regression", xlab = "X", ylab = "Y")
abline(model, col = "red")
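The coefficients a and b from y = ax + b can be read off the fitted model with coef(), and predict() applies the line to new data. A sketch on the same data; the new value x = 6 is chosen only for illustration.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(2, 3, 4, 5, 6)
model <- lm(y ~ x)

b <- coef(model)[["(Intercept)"]]   # intercept of the fitted line
a <- coef(model)[["x"]]             # slope of the fitted line

# Predict the response for a new x value (illustrative)
y_new <- as.numeric(predict(model, newdata = data.frame(x = 6)))

print(c(slope = a, intercept = b, prediction = y_new))
```

This data lies exactly on the line y = x + 1, so the fit recovers a slope of 1 and an intercept of 1.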
Regression analysis
Statistical tool to estimate the relationship between two or more variables
3 types:
1. Linear regression
2. Multiple linear regression
3. Logistic regression
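A rough sketch of how each of the three types might be fitted in R. It assumes the built-in mtcars dataset and an arbitrary choice of columns (mpg, wt, hp, am); these are not from the notes.

```r
# 1. Linear regression: one predictor
simple <- lm(mpg ~ wt, data = mtcars)

# 2. Multiple linear regression: several predictors
multiple <- lm(mpg ~ wt + hp, data = mtcars)

# 3. Logistic regression: a 0/1 response, fitted with glm()
logistic <- glm(am ~ wt, data = mtcars, family = binomial)

print(coef(simple))
print(coef(multiple))
print(coef(logistic))
```

The first two use lm() and differ only in the number of predictors in the formula; logistic regression uses glm() with family = binomial because the response is a yes/no outcome.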
Simple linear regression
- Statistical method that is used for predictive analysis
- Shows a linear relationship between a dependent variable and one independent variable