NumPy Statistics Flashcards

1
Q

What is the standard import statement for NumPy?

A

import numpy as np

2
Q

How do you generate a list of 10000 random, normally distributed data points, centered at “x” with a standard deviation of “y”?

A

list_name = np.random.normal(x,y,10000)

3
Q

How do you get the mean of a numpy list?

A

list_mean = np.mean(list_name)

4
Q

How do you get the median of a numpy list?

A

list_median = np.median(list_name)
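
As a rough illustration of the mean and median cards, the sketch below draws normally distributed data, computes both statistics, and then appends one extreme value to show that the median moves far less than the mean. The center (50), the spread (10), the seed, and the outlier value are arbitrary assumptions chosen for the example.

```python
import numpy as np

np.random.seed(0)  # fixed seed (an assumption) so the run is repeatable

# 10000 points centered at 50 with standard deviation 10 (illustrative values)
list_name = np.random.normal(50, 10, 10000)
list_mean = np.mean(list_name)
list_median = np.median(list_name)

# Append one extreme value and recompute both statistics
with_outlier = np.append(list_name, 1_000_000)
mean_shift = abs(np.mean(with_outlier) - list_mean)
median_shift = abs(np.median(with_outlier) - list_median)

print(list_mean, list_median)    # both should land close to 50
print(mean_shift, median_shift)  # the mean moves far more than the median
```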

5
Q

How do you generate a list of 10000 random integers, ranging from “x” to “y”?

A

list_name = np.random.randint(x, y, 10000)  # y is exclusive; pass y + 1 to include y
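
One detail worth checking with this call is that the upper bound of np.random.randint is exclusive. The sketch below compares the two variants; the bounds 1 and 6 and the seed are hypothetical values picked for illustration.

```python
import numpy as np

np.random.seed(1)  # arbitrary seed for repeatability

x, y = 1, 6  # hypothetical bounds, e.g. the faces of a die

exclusive = np.random.randint(x, y, 10000)      # draws from x .. y - 1
inclusive = np.random.randint(x, y + 1, 10000)  # draws from x .. y

print(exclusive.min(), exclusive.max())
print(inclusive.min(), inclusive.max())
```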

6
Q

How do you get the mode of a numpy list (remember the import statement)?

A

from scipy import stats

list_mode = stats.mode(list_name)
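
If SciPy is not available, a NumPy-only alternative (not the approach on this card) is to count the unique values and pick the most frequent one. This is a minimal sketch on a small made-up sample; on a tie it returns the smallest of the tied values, because np.unique sorts and np.argmax returns the first maximum.

```python
import numpy as np

list_name = np.array([3, 1, 2, 3, 2, 3, 5])  # small made-up sample

# Sorted unique values and how often each one occurs
values, counts = np.unique(list_name, return_counts=True)

list_mode = values[np.argmax(counts)]  # first value with the highest count
mode_count = counts.max()

print(list_mode, mode_count)  # 3 3
```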

7
Q

How do you get the standard deviation of a numpy list?

A

list_std = list_name.std()

8
Q

How do you get the variance of a numpy list?

A

list_var = list_name.var()
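
The two cards above are linked: the variance is the square of the standard deviation. The sketch below checks this on a small made-up array and also shows the ddof argument, which switches from the population formula (divide by N, the default) to the sample formula (divide by N - 1).

```python
import numpy as np

list_name = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # made-up data

list_std = list_name.std()  # population standard deviation (ddof=0): 2.0
list_var = list_name.var()  # population variance (ddof=0): 4.0

sample_var = list_name.var(ddof=1)  # divides by N - 1 instead of N: 32 / 7

print(list_std, list_var, sample_var)
```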

9
Q

How do you generate a list of 10000 random uniform data points, ranging from “x” to “y”?

A

list_name = np.random.uniform(x,y,10000)

10
Q

How do you generate an evenly-spaced list of numbers from x to y with a spacing of “gap”?

A

list_name = np.arange(x, y, gap)  # y is excluded; for example, np.arange(-3, 3, 0.001)
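
A short sketch of how np.arange treats its end point, next to np.linspace, which takes a number of points instead of a spacing. The range 0 to 1 and the step 0.25 are arbitrary values chosen so the results are easy to check by hand.

```python
import numpy as np

stepped = np.arange(0, 1, 0.25)  # fixed spacing, stop value excluded
counted = np.linspace(0, 1, 5)   # fixed number of points, stop value included

print(stepped)  # 0, 0.25, 0.5, 0.75
print(counted)  # 0, 0.25, 0.5, 0.75, 1
```

With a non-integer step, np.arange can be sensitive to floating-point rounding at the end point, so np.linspace is often the safer choice when the number of points matters.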

11
Q

How do you visualize the normal probability density function with a given list (include necessary import statements)?

A

from scipy.stats import norm
import matplotlib.pyplot as plt

plt.plot(list_name, norm.pdf(list_name))

12
Q

How do you visualize the exponential probability density function with a given list (include necessary import statements)?

A

from scipy.stats import expon
import matplotlib.pyplot as plt

plt.plot(list_name, expon.pdf(list_name))

13
Q

How do you visualize the binomial probability mass function with a given list (include necessary import statements)?

A

import numpy as np
from scipy.stats import binom
import matplotlib.pyplot as plt

n, p = 10, 0.5
# The PMF is nonzero only at the integer counts 0..n, so sample integers
list_name = np.arange(0, n + 1)
plt.plot(list_name, binom.pmf(list_name, n, p))

14
Q

How do you visualize the Poisson probability mass function with a given list (include necessary import statements)?

A

import numpy as np
from scipy.stats import poisson
import matplotlib.pyplot as plt

mu = 500
# Integer counts only; the PMF is zero at non-integer values
list_name = np.arange(400, 600)
plt.plot(list_name, poisson.pmf(list_name, mu))

15
Q

How do you calculate the nth percentile of a numpy list?

A

per = np.percentile(list_name,n)
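
A quick sanity check for this call is that the 50th percentile equals the median. The sketch below uses the integers 1 to 101 as a made-up sample so the percentiles fall exactly on data points.

```python
import numpy as np

list_name = np.arange(1, 102)  # the integers 1 .. 101, a made-up sample

p50 = np.percentile(list_name, 50)  # 51.0, same as the median
p90 = np.percentile(list_name, 90)  # 91.0

print(p50, p90, np.median(list_name))
```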

16
Q

What is a statistical moment?

A

A quantitative measure of the shape of a probability density function.

17
Q

Give the first four moments that were discussed.

A

First moment: mean.
Second moment: variance.
Third moment: skew.
Fourth moment: kurtosis.

18
Q

What is the scale of the skew value (what does it mean to have negative, zero, or positive skew)?

A

A longer tail to the left represents negative skew.
A longer tail to the right represents positive skew.
A symmetric distribution, such as a perfectly normal model, has zero skew.

19
Q

What is kurtosis?

A

It describes how heavy the tails of a distribution are and how sharp its peak is, compared with a normal distribution.

20
Q

What is the scale of the kurtosis value (what does it mean to have low or high kurtosis)?

A

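A sharp peak with heavy tails represents high kurtosis.
A flat peak with light tails represents low kurtosis.

Normal models have zero excess kurtosis (the Fisher definition, which scipy.stats.kurtosis uses by default); their raw kurtosis is 3.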

21
Q

How do you get the skew of a numpy list?

A

import scipy.stats as sp

sp.skew(list_name)

22
Q

How do you get the kurtosis of a numpy list?

A

import scipy.stats as sp

sp.kurtosis(list_name)
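
To connect the moment definitions to the two SciPy calls, the sketch below computes all four moments by hand with NumPy. It uses the biased (population) formulas and subtracts 3 from the kurtosis, which should correspond to the default settings of the SciPy functions; the helper name moments and both sample lists are made up for illustration.

```python
import numpy as np

def moments(data):
    """Return mean, variance, skew, and excess kurtosis of a 1-D sample."""
    data = np.asarray(data, dtype=float)
    mean = data.mean()
    var = data.var()                         # population variance (ddof=0)
    z = (data - mean) / np.sqrt(var)         # standardized values
    skew = np.mean(z ** 3)                   # third standardized moment
    excess_kurtosis = np.mean(z ** 4) - 3.0  # 0 for a normal distribution
    return mean, var, skew, excess_kurtosis

symmetric = [1, 2, 3, 4, 5]      # made-up sample with no skew
right_tailed = [1, 1, 1, 2, 10]  # made-up sample with a long right tail

print(moments(symmetric))      # skew is 0, excess kurtosis is -1.3
print(moments(right_tailed))   # skew is positive
```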

23
Q

What is covariance?

A

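A measure of how two variables vary together around their respective means. A positive covariance means they tend to move in the same direction, a negative covariance means they tend to move in opposite directions, and a near-zero covariance suggests little linear relationship. Its magnitude depends on the units of the variables, so a large value does not by itself imply a strong correlation.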

24
Q

What is correlation?

A

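A normalized measure of the linear relationship between two variables: the covariance divided by the product of their standard deviations. It ranges from -1 (perfect inverse correlation) through 0 (no linear correlation) to 1 (perfect correlation).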

25
Q

How do you get the correlation of two numpy lists?

A

np.corrcoef(x_list, y_list)

This returns a 2x2 array with the correlation at (0,1) and (1,0).

26
Q

How do you get the covariance of two numpy lists?

A

np.cov(x_list, y_list)

This returns a 2x2 array with the covariance at (0,1) and (1,0); the diagonal holds the variance of each list.
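
The covariance and correlation cards are two views of the same quantity: dividing the covariance by both standard deviations gives the correlation. The sketch below checks this on synthetic data; the seed, the slope of -3, and the noise level are assumptions chosen for the example.

```python
import numpy as np

np.random.seed(2)  # arbitrary seed for repeatability

# Synthetic, strongly (negatively) related data
x_list = np.random.normal(3.0, 1.0, 1000)
y_list = 100 - 3 * x_list + np.random.normal(0, 0.5, 1000)

covariance = np.cov(x_list, y_list)[0, 1]
correlation = np.corrcoef(x_list, y_list)[0, 1]

# np.cov divides by N - 1 by default, so use ddof=1 for the standard deviations
rescaled = covariance / (np.std(x_list, ddof=1) * np.std(y_list, ddof=1))

print(covariance, correlation, rescaled)  # the last two should agree
```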

27
Q

What is Bayes’ Theorem?

A

P(A|B) = P(A) * P(B|A) / P(B)
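
A worked instance of the formula may help. All probabilities below are hypothetical numbers chosen purely for illustration, and P(B) is obtained from the law of total probability.

```python
# Hypothetical probabilities, not taken from any real data
p_a = 0.01              # P(A): prior probability of A
p_b_given_a = 0.90      # P(B|A)
p_b_given_not_a = 0.05  # P(B|not A)

# Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' Theorem
p_a_given_b = p_a * p_b_given_a / p_b

print(round(p_a_given_b, 4))  # 0.1538
```

Even with P(B|A) at 0.90, the posterior stays near 0.15 because the prior P(A) is so small.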

28
Q

How do you get a linear regression from two variables?

A

from scipy import stats

slope, intercept, r_value, p_value, std_err = stats.linregress(x_list,y_list)
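
A minimal sketch of using the returned values, assuming made-up data that lies exactly on the line y = 3x + 2. Squaring r_value gives the coefficient of determination, and the slope and intercept can be used to predict new points.

```python
import numpy as np
from scipy import stats

x_list = np.arange(10, dtype=float)
y_list = 3.0 * x_list + 2.0  # made-up, perfectly linear data

slope, intercept, r_value, p_value, std_err = stats.linregress(x_list, y_list)

r_squared = r_value ** 2             # 1.0 for a perfect fit
predicted = slope * 12.0 + intercept  # prediction at x = 12

print(slope, intercept, r_squared, predicted)
```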

29
Q

How do you generate two lists of data that have varying linearity?

A

import numpy as np
import matplotlib.pyplot as plt

# The noise scale (0.1 here) controls how tightly the points follow a line
pageSpeeds = np.random.normal(3.0, 1.0, 1000)
purchaseAmount = 100 - (pageSpeeds + np.random.normal(0, 0.1, 1000)) * 3

plt.scatter(pageSpeeds, purchaseAmount)

30
Q

When is a polynomial regression appropriate?

A

If the data is clearly non-linear.

31
Q

How do you perform an nth degree polynomial regression?

A

import numpy as np
x = np.array(x_list)
y = np.array(y_list)

pn = np.poly1d(np.polyfit(x, y, n))
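
As a quick check of this recipe, the sketch below fits a degree-2 polynomial to noise-free data generated from a known quadratic (an assumption made so the expected coefficients are obvious) and then evaluates the fitted polynomial at a new point.

```python
import numpy as np

# Made-up data that lies exactly on y = 2x^2 + 3x + 1
x = np.linspace(-5, 5, 50)
y = 2 * x**2 + 3 * x + 1

n = 2  # degree of the polynomial to fit
pn = np.poly1d(np.polyfit(x, y, n))

print(pn.coeffs)  # close to 2, 3, 1 (highest degree first)
print(pn(10.0))   # close to 231
```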

32
Q

Does a high degree polynomial regression necessarily improve things? (Y/N)

A

No. A higher degree can overfit, tracking noise in the sample rather than the underlying trend.

33
Q

How is multivariate regression written?

A

y = A + B1 * var_1 + B2 * var_2 + …
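
One way to see this equation in action, sketched here with plain NumPy least squares rather than any particular regression library, is to build noise-free data from known coefficients and recover them. The coefficient values, the seed, and the variable ranges are assumptions chosen for illustration.

```python
import numpy as np

np.random.seed(3)  # arbitrary seed for repeatability

# Two made-up explanatory variables
var_1 = np.random.uniform(0, 10, 200)
var_2 = np.random.uniform(0, 10, 200)

# Known coefficients: A = 5, B1 = 2, B2 = -1.5 (no noise added)
y = 5.0 + 2.0 * var_1 - 1.5 * var_2

# Design matrix: a column of ones for A, then one column per variable
X = np.column_stack([np.ones_like(var_1), var_1, var_2])
coeffs, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
A, B1, B2 = coeffs

print(A, B1, B2)  # close to 5, 2, -1.5
```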

34
Q

How do you grab an Excel file at a specific link (don’t forget the import statement)?

A

import pandas as pd

df = pd.read_excel('link_name')

35
Q

How do you show the first few rows of a DataFrame “df” loaded from an Excel file?

A

df.head()

36
Q

How do you create a multivariate regression summary?

A

import pandas as pd
import statsmodels.api as sm

df['Model_ord'] = pd.Categorical(df.Model).codes
X = df[['Mileage', 'Model_ord', 'Doors']]
y = df[['Price']]
X1 = sm.add_constant(X)
est = sm.OLS(y, X1).fit()

est.summary()

37
Q

What is a multi-level model?

A

A model that contains a hierarchy of interdependent events.