Final Flashcards
ep=randn(30,1)*5;
sample a normal distribution with zero mean and prescribed standard deviation/spread
x=10+ep;
shift for a prescribed mean
ep=randn(30,1)*5;
x=10+ep;
figure(1)
dotplot(x,30,0.05,1)
function dotplot(x,nbins,delta,ymax)
[n,edges]=histcounts(x,nbins);
for i=1:length(n)
for j=1:n(i)
plot((edges(i)+edges(i+1))/2,j*delta,'ob','markersize',8,'markerfacecolor','w'); hold on
end
end
set(gca,'YTick',[]);
ylim([0 ymax])
end
code for dot plot
function dotplot(x,nbins,delta,ymax)
dot plot of data in x with nbins, dot spacing delta, and y-axis maximum ymax
[n,edges]=histcounts(x,nbins);
compute the bin counts and bin edges (MATLAB's binning algorithm)
set(gca,'YTick',[])
hide the y-axis ticks
plot(x,'o','markersize',8,'markerfacecolor','w'); hold on
ylabel('value')
xlabel('sample number')
scatter and time-series plot
boxplot(x)
ylabel('value')
xlabel('sample')
box plot
x=[8 4 1 8 4 7 6 9 8]; n=length(x); x=sort(x)
order the data for inspection.
q1=x(round(1/4*(n+1)))
q2=x(round(2/4*(n+1)))
q3=x(round(3/4*(n+1)))
IQR=(q3-q1)
here, quartiles come from rounding the index, not interpolation. MATLAB interpolates
min(x(x>q1-1.5*IQR))
max(x(x<q3+1.5*IQR))
whiskers
x((q1-3*IQR)<x & x<(q1-1.5*IQR))
x((q3+1.5*IQR)<x & x<(q3+3*IQR))
outliers
x(x<(q1-3*IQR))
x(x>(q3+3*IQR))
extreme outliers
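worked check of the quartile, whisker, and outlier cards (a minimal sketch on the same nine data values; the variable names lowWhisker, highWhisker, mild, and extreme are made up for illustration, and the values in the comments follow from MATLAB rounding 2.5 up to 3 and 7.5 up to 8)

```matlab
x = sort([8 4 1 8 4 7 6 9 8]);      % sorted: 1 4 4 6 7 8 8 8 9
n = length(x);                       % 9

% Quartiles by rounding the index (no interpolation)
q1 = x(round(1/4*(n+1)));            % x(3) = 4
q2 = x(round(2/4*(n+1)));            % x(5) = 7
q3 = x(round(3/4*(n+1)));            % x(8) = 8
IQR = q3 - q1;                       % 4

% Whiskers: most extreme data points inside the 1.5*IQR fences
lowWhisker  = min(x(x > q1-1.5*IQR));    % 1
highWhisker = max(x(x < q3+1.5*IQR));    % 9

% Outliers between the 1.5*IQR and 3*IQR fences, and beyond 3*IQR
mild    = x((q1-3*IQR < x & x < q1-1.5*IQR) | (q3+1.5*IQR < x & x < q3+3*IQR));
extreme = x(x < q1-3*IQR | x > q3+3*IQR);
% both are empty for this data set
```

for this data set both fences lie outside the data, so the whiskers are just the minimum and maximum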
histogram(x,10)
xlabel('value')
ylabel('counts')
histogram
scatterhist(x,y)
xlabel('x')
ylabel('y')
marginal plot for bivariate data
n=length(x)
number of data points
xbar=mean(x)
mean of data in x
yp=y-mean(y)
xp=x-mean(x)
deviations
Syy=sum(y.^2)-n*ybar^2 % correlations
Sxx=sum(x.^2)-n*xbar^2
Sxy=sum(x.*y)-n*xbar*ybar
r1=Sxy/sqrt(Sxx*Syy)
gets you the sums of squares and cross products (Syy, Sxx, Sxy) and the correlation coefficient (r1)
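quick check of the correlation coefficient (a minimal sketch; the five x and y values are invented for illustration and are not from the course data; corrcoef is MATLAB's built-in, used here only as a cross-check)

```matlab
x = [1 2 3 4 5];                     % hypothetical data
y = [2 4 5 4 5];                     % hypothetical data
n = length(x);
xbar = mean(x);                      % 3
ybar = mean(y);                      % 4

Sxx = sum(x.^2)  - n*xbar^2;         % 55 - 45 = 10
Syy = sum(y.^2)  - n*ybar^2;         % 86 - 80 = 6
Sxy = sum(x.*y)  - n*xbar*ybar;      % 66 - 60 = 6
r1  = Sxy/sqrt(Sxx*Syy);             % 6/sqrt(60)

R = corrcoef(x,y);                   % 2x2 matrix; R(1,2) is the coefficient
difference = abs(r1 - R(1,2));       % should be at round-off level
```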
f=pdf('norm',x,mu,sd)
normal PDF
F=cdf('norm',x,mu,sd)
normal CDF (cumulative distribution function)
P=cdf('norm',2,mu,sd)-cdf('norm',1,mu,sd)
using cumulative distributions
P=integral(@(x)pdf('norm',x,mu,sd),1,2)
using numerical integration; gives the same value as the cdf difference
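comparing the two routes to P(1 < X < 2) (a minimal sketch; mu=1.5 and sd=0.5 are assumed values chosen for illustration, which makes the interval exactly one standard deviation either side of the mean)

```matlab
mu = 1.5;                            % assumed mean (illustration only)
sd = 0.5;                            % assumed standard deviation (illustration only)

% Route 1: difference of the CDF at the two limits
P1 = cdf('norm',2,mu,sd) - cdf('norm',1,mu,sd);

% Route 2: numerical integration of the PDF between the same limits
P2 = integral(@(x) pdf('norm',x,mu,sd), 1, 2);

difference = abs(P1 - P2);           % small, limited by the integral tolerance
```

with these assumed values both routes give about 0.683, the familiar one-standard-deviation probability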
x=icdf('Gamma',xi,k,b);
evaluate the inverse CDF (the gamma distribution has two parameters, shape k and scale b)
histogram([1,1,3,3,3,5,7,8,12],5,'BinLimits',[0,20])
counts in each bin
i=0:40
1-sum(exp(-30)*30.^i./factorial(i))
1-cdf('Poisson',40,30)
answers: what is the probability that 10 randomly selected unit areas contain more than 40 defects? (Poisson with mean 30)
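the formula behind both lines of code, written out (here X is the defect count in the 10 areas and \lambda = 30 is the Poisson mean taken from the code; the symbols X and \lambda are introduced for illustration)

```latex
P(X > 40) = 1 - P(X \le 40)
          = 1 - \sum_{i=0}^{40} \frac{e^{-\lambda}\,\lambda^{i}}{i!},
\qquad \lambda = 30
```

the explicit sum and the cdf call evaluate the same expression, so they should agree to round-off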
1-cdf('Poisson',40,30)
MATLAB Poisson CDF (cumulative distribution function)