Midterm Flashcards

1
Q

What are CSV files?

A

CSV - Comma Separated Values

Header Row, separated by commas

Data Rows, separated by commas

2
Q

Which fields would you expect to see in a CSV file of stock data?

  • # of employees
  • Date/Time
  • Company Name
  • Price of the stock
  • Company’s hometown
A
  • Date/Time
  • Price of the stock
3
Q

What does real stock data look like?

A

Header: Date, Open, High, Low, Close, Volume, Adjusted Close

Close - closing price reported at exchange

Volume - number of shares traded that day

Adjusted Close - closing price adjusted by the data provider for stock splits and dividend payments. Looking back over history, the rate of return computed from adjusted close should be larger than the one computed from close.

4
Q

What is a data frame?

A

A pandas DataFrame is a two-dimensional labeled table. In this course, columns represent the stock symbols (separate dataframes can hold different kinds of data: Adj Close, Volume, Close, etc.)

Rows represent time

5
Q

What pandas code would allow you to print the first or last 5 rows of the DataFrame df?

df = pd.read_csv("data/AAPL.csv")

A

First 5 rows:

print(df.head())

Last 5 rows:

print(df.tail())

Last n rows:

print(df.tail(n))

6
Q

How do you view a specific range of rows in a data frame? For example, rows 10 to 20?

df = pd.read_csv("data/AAPL.csv")

A

print(df[10:21])

Note that the second number is not inclusive, so 10:21 returns rows 10 through 20
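
A minimal sketch of head(), tail() and row slicing. It assumes a made-up in-memory CSV in place of data/AAPL.csv (the prices are not real), so it runs without any files:

```python
import io
import pandas as pd

# Made-up CSV text standing in for data/AAPL.csv; the values are not real.
rows = ["Date,Close,Volume"]
rows += ["2010-01-{:02d},{:.1f},{}".format(d, 100.0 + d, 1000 + d)
         for d in range(1, 31)]
df = pd.read_csv(io.StringIO("\n".join(rows)))

first_five = df.head()     # rows at positions 0..4
last_three = df.tail(3)    # rows at positions 27..29
middle = df[10:21]         # rows at positions 10..20; 21 is excluded

print(middle.index.tolist())
```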

7
Q

How do you compute the max closing price for a stock using pandas?

df = pd.read_csv("data/{}.csv".format(symbol))

A

max_value = df['Close'].max()

8
Q

How do you compute the mean volume for a symbol?

df = pd.read_csv("data/{}.csv".format(symbol))

A

mean = df['Volume'].mean()

9
Q

How would you plot the adjusted close of the following data?

df = pd.read_csv("data/AAPL.csv")

print(df['Adj Close'])

A

import matplotlib.pyplot as plt

df['Adj Close'].plot()

plt.show()

10
Q

Select the ‘High’ column from the dataframe and then plot it.

df = pd.read_csv("data/XXX.csv")

A

print(df['High'])

df['High'].plot()

plt.show()

11
Q

How do you plot two columns, such as 'Close' and 'Adj Close'?

df = pd.read_csv("data/AAPL.csv")

A

df[['Close', 'Adj Close']].plot()

plt.show()

12
Q

How many days were US stocks traded at NYSE in 2014?

A

252

13
Q

What is S&P 500 and what is SPY?

A

S&P 500 - Stock market index based on 500 large American companies listed on the NYSE or NASDAQ. Essentially a market-cap-weighted average of the companies' stock prices.

SPY - SPDR S&P 500 - An ETF (Exchange-Traded Fund) that tracks the S&P 500 index

14
Q

How do you create an empty data frame (df1) with a given datetime range?

start_date = '2010-01-22'

end_date = '2010-01-26'

A

dates = pd.date_range(start_date, end_date)

df1 = pd.DataFrame(index=dates)

15
Q

Using an empty data frame (df1) over a specified date range, how do you join df1 to a data frame for SPY (dfSPY)?

df1 = pd.DataFrame(index=dates)

A

First ensure that the SPY dataframe is indexed by the Date column, not by the default row numbers. Additionally, ensure that NaN values are interpreted as "not a number" and not as strings.

dfSPY = pd.read_csv("data/SPY.csv", index_col="Date", parse_dates=True, na_values=['nan'])

Join the two dataframes using DataFrame.join()

df1 = df1.join(dfSPY)

16
Q

How do you drop NaN values on a data frame (df1)?

A

df1 = df1.dropna()

17
Q

How do you drop NaN values when combining two dataframes (i.e. df1 and dfSPY)?

A

df1 = df1.join(dfSPY, how='inner')
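
A minimal sketch comparing the default join followed by dropna() with an inner join. The SPY prices are made up and read from an in-memory string instead of data/SPY.csv; 2010-01-23 and 2010-01-24 fall on a weekend, so they have no rows:

```python
import io
import pandas as pd

# Made-up SPY data standing in for data/SPY.csv (weekend dates are absent).
spy_csv = """Date,Adj Close
2010-01-26,103.0
2010-01-25,102.0
2010-01-22,101.0
"""
dfSPY = pd.read_csv(io.StringIO(spy_csv), index_col="Date",
                    parse_dates=True, na_values=["nan"])

dates = pd.date_range("2010-01-22", "2010-01-26")
df1 = pd.DataFrame(index=dates)

left = df1.join(dfSPY)                 # default how='left': 5 rows, 2 of them NaN
dropped = left.dropna()                # 3 rows remain
inner = df1.join(dfSPY, how="inner")   # the same 3 dates in a single step

print(left)
print(inner)
```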

18
Q

What is the default value of the "how" parameter in the DataFrame.join function?

A

The default is 'left', which means the calling dataframe's index is used. Therefore, all dates from the calling dataframe are preserved, potentially yielding NaN values for dates the other dataframe does not share.

19
Q

How can you read in multiple stocks into one dataframe though they may contain the same column names?

A

symbols = ['GOOG', 'IBM', 'GLD']

for symbol in symbols:

    df_temp = pd.read_csv("data/{}.csv".format(symbol), index_col='Date', parse_dates=True, usecols=['Date', 'Adj Close'], na_values=['nan'])

    # Rename 'Adj Close' to the symbol so the column names do not clash

    df_temp = df_temp.rename(columns={'Adj Close': symbol})

    df1 = df1.join(df_temp)
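
A runnable sketch of the same loop. It assumes made-up CSV text held in a dictionary (the hypothetical name fake_files) in place of the data/<symbol>.csv files:

```python
import io
import pandas as pd

# Made-up CSV text per symbol, standing in for data/<symbol>.csv files.
fake_files = {
    "GOOG": "Date,Adj Close,Volume\n2010-01-22,550.0,100\n2010-01-25,540.0,120\n",
    "IBM": "Date,Adj Close,Volume\n2010-01-22,120.0,200\n2010-01-25,121.0,210\n",
}

dates = pd.date_range("2010-01-22", "2010-01-25")
df1 = pd.DataFrame(index=dates)

for symbol, text in fake_files.items():
    df_temp = pd.read_csv(io.StringIO(text), index_col="Date", parse_dates=True,
                          usecols=["Date", "Adj Close"], na_values=["nan"])
    # Give each frame a unique column name before joining.
    df_temp = df_temp.rename(columns={"Adj Close": symbol})
    df1 = df1.join(df_temp)

df1 = df1.dropna()   # drop the weekend dates, which have no prices
print(df1)
```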

20
Q

In a dataframe (df) containing multiple symbols,

how would you drop dates in which SPY did not trade?

A

if symbol == 'SPY':  # inside the loop that joins each symbol

    df = df.dropna(subset=['SPY'])

21
Q

How do you select the slice of data from 2010-02-13 to 2010-02-15 for only GOOG and GLD?

A

df = df.loc['2010-02-13':'2010-02-15', ['GOOG', 'GLD']]

(Older pandas code uses df.ix here; .ix has since been removed in favor of .loc and .iloc.)

22
Q

What is the best way to normalize price data so that all prices start at 1.0?

A

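df1 = df1 / df1.iloc[0]

OR

df1 = df1 / df1.iloc[0, :]

Both divide every row by the first row. (Older pandas code writes df1.ix[0]; .ix has since been removed in favor of .iloc and .loc.)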
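
A small sketch of the normalization, assuming a made-up two-symbol price table. After dividing by the first row, every column starts at 1.0:

```python
import pandas as pd

# Made-up prices for illustration only.
prices = pd.DataFrame(
    {"SPY": [100.0, 110.0, 105.0], "GLD": [50.0, 55.0, 60.0]},
    index=pd.date_range("2010-01-04", periods=3),
)

# Divide every row by the first row, column by column.
normed = prices / prices.iloc[0]
print(normed)
```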

23
Q

Slice and plot SPY and IBM over the daterange ‘2010-03-01’ to ‘2010-04-01’

A

import matplotlib.pyplot as plt

def plot_data(df, title="title"):

    ax = df.plot(title=title, fontsize=12)

    ax.set_xlabel("Date")
    ax.set_ylabel("Price")
    plt.show()

start_index = '2010-03-01'

end_index = '2010-04-01'

columns = ['SPY', 'IBM']

plot_data(df.loc[start_index:end_index, columns], title="SPY and IBM")

24
Q

How do you normalize a dataframe df?

A

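df = df / df.iloc[0, :]

Divide by the first row only; dividing by the slice df.iloc[0:] (all rows) would turn every value into 1.0.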

25
Q
A
26
Q

How do you return the number of rows in an array, a?

A

a.shape[0]

27
Q

How do you return the number of columns in an array, a?

A

a.shape[1]

28
Q

How do you get the number of items in an array, a?

A

a.size

29
Q

How do you get the sum of elements of an array, a?

A

a.sum()

30
Q

How do you get the sum of each column of an array, a?

A

a.sum(axis = 0)

31
Q

How do you get the sum of each row of an array, a?

A

a.sum(axis =1)

32
Q

How do you get the location of the maximum value of an array, a?

A

a.argmax()

33
Q

In an array a, how would you select every row, taking every other column from the first three columns?

A

a[:, 0:3:2]

where : selects all rows

0 indicates start at the first column (index 0)

3 indicates stop before index 3, so the third column (index 2) is the last candidate

2 indicates choose every second column
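
The array attributes, sums, argmax and slicing from the preceding cards can be checked on a small made-up array. Note that argmax() returns a position in the flattened array; np.unravel_index converts it to (row, column):

```python
import numpy as np

# Made-up 2 x 4 array for illustration.
a = np.array([[1, 9, 3, 2],
              [4, 5, 6, 0]])

n_rows, n_cols = a.shape                        # 2 rows, 4 columns
n_items = a.size                                # 8 items
total = a.sum()                                 # 30
col_sums = a.sum(axis=0)                        # [5, 14, 9, 2]
row_sums = a.sum(axis=1)                        # [15, 15]
flat_pos = a.argmax()                           # 1: index into the flattened array
row_col = np.unravel_index(flat_pos, a.shape)   # (0, 1)
every_other = a[:, 0:3:2]                       # columns 0 and 2 of every row

print(every_other)
```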

34
Q

How do you index an array, a, with another array, b?

A

a = np.random.randint(0, 10, size = 5)

Suppose this gives a = [7, 6, 8, 5, 9]

indices = np.array( [1, 1, 2, 3] )

a[indices] returns [6, 6, 8, 5]

35
Q

How would you access all elements greater than 5 in this array?

a = np.array( [1, 6, 5, 3, 8] )

A

a[a > 5]
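
A short sketch of both indexing styles, using the fixed example values from these cards instead of random numbers so the results are reproducible:

```python
import numpy as np

# Indexing with an array of positions (positions may repeat).
a = np.array([7, 6, 8, 5, 9])
indices = np.array([1, 1, 2, 3])
picked = a[indices]

# Indexing with a boolean mask: keep the elements where the condition holds.
b = np.array([1, 6, 5, 3, 8])
mask = b > 5
big = b[mask]

print(picked, big)
```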

36
Q

How do you compute the daily returns of a dataframe, df?

A

The daily return is the change in price relative to the previous day: price[t] / price[t-1] - 1.

daily_returns = df.copy()

daily_returns[1:] = (df[1:] / df[:-1].values) - 1

daily_returns.iloc[0, :] = 0
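
A minimal sketch of the daily-return computation on made-up prices. The variant using shift() is an alternative added here for comparison, not something stated on the card; both should give the same values:

```python
import numpy as np
import pandas as pd

# Made-up prices for illustration only.
df = pd.DataFrame({"SPY": [100.0, 110.0, 99.0]},
                  index=pd.date_range("2010-01-04", periods=3))

# Card's approach: divide each row by the previous row's raw values.
daily_returns = df.copy()
daily_returns[1:] = (df[1:] / df[:-1].values) - 1
daily_returns.iloc[0, :] = 0     # no previous day for the first row

# Alternative: shift the frame down by one row and divide.
alt = (df / df.shift(1)) - 1
alt.iloc[0, :] = 0

print(daily_returns)
```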

37
Q

What is a Bollinger band?

A

A way of quantifying how far a stock price has deviated from some norm.

38
Q

Where are the Bollinger bands?

A

2 standard deviations above and below the rolling mean of the price. When the price crosses below the lower band, this could indicate a buy signal. When the price crosses above the upper band, this could indicate a sell signal.

39
Q

How do you calculate Bollinger bands?

A

upper band = rolling mean + 2 * rolling std

lower band = rolling mean - 2 * rolling std
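
A minimal sketch of the band computation with pandas rolling windows. The prices are made up and the 4-day window is an assumption chosen to keep the example small (20 days is a common choice in practice):

```python
import pandas as pd

# Made-up prices for illustration only.
prices = pd.Series([10.0, 11.0, 12.0, 11.0, 10.0, 9.0, 10.0, 11.0],
                   index=pd.date_range("2010-01-04", periods=8))

window = 4  # assumed window length for this small example
rm = prices.rolling(window=window).mean()     # rolling mean
rstd = prices.rolling(window=window).std()    # rolling standard deviation

upper_band = rm + 2 * rstd
lower_band = rm - 2 * rstd

print(pd.DataFrame({"price": prices, "upper": upper_band, "lower": lower_band}))
```

The first window - 1 rows are NaN because a full window is not yet available.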

40
Q

What is an ETF?

A

An ETF or Exchange-Traded Fund is a basket of equities allocated in such a way that the overall portfolio tracks the performance of a stock exchange index. ETFs can be bought and sold on the market like shares.

41
Q

How do you fill in missing data in a dataframe?

A

df = df.ffill()
df = df.bfill()

Fill forward first, then fill backwards. (Older pandas versions spell these df.fillna(method='ffill') and df.fillna(method='bfill').)
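
A small sketch showing why the order matters, assuming a made-up column named X with gaps at the start, middle and end. Forward fill handles the middle and end; backward fill is only needed for the leading gap:

```python
import numpy as np
import pandas as pd

# Made-up column with missing values at the start, middle and end.
df = pd.DataFrame({"X": [np.nan, 5.0, np.nan, np.nan, 7.0, np.nan]})

forward = df.ffill()        # leading NaN remains; later gaps take the last known value
filled = forward.bfill()    # the leading NaN takes the first known value

print(filled["X"].tolist())
```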

42
Q

What is kurtosis?

A

It tells us about the tails of the distribution.

It tells us how different our distribution is from the Gaussian distribution.

A positive kurtosis means fat tails: more occurrences in the tails than a Gaussian distribution would have.

43
Q

How would you draw a scatter plot of 'SPY' and 'GLD' data?

A

df.plot(kind='scatter', x='SPY', y='GLD')

44
Q

How do we fit a polynomial of degree 1 to a graph?

A

beta, alpha = np.polyfit(dailyret['SPY'], dailyret['XOM'], 1)

plt.plot(dailyret['SPY'], beta * dailyret['SPY'] + alpha)
plt.show()
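
A minimal sketch of the degree-1 fit on made-up returns that lie exactly on a line, so the recovered slope (beta) and intercept (alpha) are known in advance. Plotting is left out so the example has no display dependency:

```python
import numpy as np

# Made-up daily returns: the "stock" is an exact linear function of the "market".
market = np.array([-0.02, -0.01, 0.0, 0.01, 0.02])
stock = 1.5 * market + 0.001

# np.polyfit returns coefficients from highest degree down: slope, then intercept.
beta, alpha = np.polyfit(market, stock, 1)
corr = np.corrcoef(market, stock)[0, 1]   # Pearson correlation of the two series

print(beta, alpha, corr)
```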

45
Q

How do you find the correlation on a dataframe?

A

df.corr(method='pearson')

46
Q

How do you calculate the daily portfolio value?

A

1) normalize prices: normed = prices / prices.iloc[0]
2) apply allocations: alloced = normed * allocs
3) determine position values: pos_vals = alloced * start_val
4) determine portfolio values: port_val = pos_vals.sum(axis = 1)
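
The four steps can be sketched as follows, assuming made-up prices, a 60/40 allocation and a starting value of 1,000,000 (all illustrative):

```python
import pandas as pd

# Made-up prices, allocations and starting value for illustration only.
prices = pd.DataFrame({"SPY": [100.0, 110.0], "GLD": [50.0, 45.0]},
                      index=pd.date_range("2010-01-04", periods=2))
allocs = [0.6, 0.4]
start_val = 1_000_000

normed = prices / prices.iloc[0]     # 1) every column starts at 1.0
alloced = normed * allocs            # 2) scale each column by its allocation
pos_vals = alloced * start_val       # 3) value of each position per day
port_val = pos_vals.sum(axis=1)      # 4) total portfolio value per day

print(port_val)
```

On the second day SPY is up 10% and GLD is down 10%, so the portfolio is worth 660,000 + 360,000.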

47
Q

What is the Sharpe ratio?

A

Risk adjusted return

All else being equal:

lower risk is better

higher return is better

SR also considers risk free rate of return

48
Q

What is the formula for the Sharpe ratio?

A

SR = ( Rp - Rf ) / StdDev

Rp - portfolio return

Rf - risk free rate of return

StdDev - std dev of portfolio return

In expectation form:

SR = ExpectedVal [Rp - Rf] / Std [Rp - Rf]

With daily data:

SR = mean [daily_rets - daily_rf] / std [daily_rets - daily_rf]

Using the shortcut and treating daily_rf as a constant (subtracting a constant does not change the standard deviation):

SR = mean [daily_rets - daily_rf] / std [daily_rets]

49
Q

How do you convert an annual risk free rate into a daily rate?

A

daily_rf = (1.0 + annual_rf) ^ (1/252) - 1, i.e. the 252nd root of (1.0 + annual risk free rate), minus 1

50
Q

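What do you do if the SR varies with how often you sample?

A

SR is considered an annual measure, so values computed from daily, weekly, or monthly samples are scaled by K:

SR_annualized = K * SR

K = sqrt ( #samples per year )

For daily samples:

SR = sqrt (252) * mean ( daily_rets - daily_rf ) / std ( daily_rets )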
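
A minimal sketch of the annualized Sharpe ratio from daily data. The returns and the 2% annual risk free rate are made up, the function name sharpe_ratio is hypothetical, and the sample standard deviation (ddof=1) is an assumption; the cards do not say which form of the standard deviation to use:

```python
import numpy as np

# Made-up daily returns and an assumed 2% annual risk free rate.
daily_rets = np.array([0.01, -0.005, 0.002, 0.007, -0.003])
annual_rf = 0.02

# 252nd root of (1 + annual rate), minus 1.
daily_rf = (1.0 + annual_rf) ** (1.0 / 252) - 1

def sharpe_ratio(daily_rets, daily_rf=0.0, samples_per_year=252):
    """Annualized Sharpe ratio, treating daily_rf as a constant."""
    excess = daily_rets - daily_rf
    k = np.sqrt(samples_per_year)
    return k * excess.mean() / daily_rets.std(ddof=1)   # ddof=1 is an assumption

sr = sharpe_ratio(daily_rets, daily_rf)
print(daily_rf, sr)
```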

51
Q

How do you limit an optimizer to useful data?

A

Ranges - limits on X

Constraints - properties that must be true
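
A minimal sketch of ranges and constraints with scipy.optimize.minimize, assuming a made-up objective: find two weights closest to a hypothetical target, with each weight limited to the range 0 to 1 and the constraint that the weights sum to 1:

```python
import numpy as np
import scipy.optimize as spo

# Made-up target; it sums to 1.3, so it violates the constraint on purpose.
target = np.array([0.5, 0.8])

def objective(w):
    """Squared distance between the weights and the target."""
    return float(np.sum((w - target) ** 2))

bounds = [(0.0, 1.0)] * 2                                        # ranges: limits on X
constraints = ({"type": "eq", "fun": lambda w: np.sum(w) - 1.0},)  # must sum to 1

result = spo.minimize(objective, x0=np.array([0.5, 0.5]), method="SLSQP",
                      bounds=bounds, constraints=constraints)
print(result.x)
```

The optimizer should settle near [0.35, 0.65], the closest point to the target that satisfies both the ranges and the constraint.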

52
Q

What is an optimizer?

A
  • Find minimum values of functions
  • Build parameterized models based on data
  • Refine allocations to stocks in portfolios
53
Q

How do you use an optimizer?

A

1) Provide a function to minimize
2) Provide an initial guess
3) Call the optimizer

54
Q

What is the python library to optimize a function?

A

scipy.optimize

import scipy.optimize as spo

min_result = spo.minimize(func, Xguess, method="SLSQP", options={'disp': True})
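
A runnable sketch of the three steps (function, initial guess, call the optimizer). The function f is a made-up parabola with its minimum at x = 1.5, chosen so the answer is known:

```python
import scipy.optimize as spo

def f(X):
    """Made-up function to minimize: a parabola with minimum 0.5 at x = 1.5."""
    x = X[0]            # the optimizer passes a 1-element array
    return (x - 1.5) ** 2 + 0.5

Xguess = 2.0            # initial guess
min_result = spo.minimize(f, Xguess, method="SLSQP", options={"disp": False})

print(min_result.x, min_result.fun)
```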

55
Q

What is a convex function?

A

A real-valued function f(x) defined on an interval is called convex if the line segment between any two points on the graph of the function lies on or above the graph.

56
Q

How do you build a parameterized model?

A

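Figure out what you are minimizing: typically the error between the model's output and the data.

Use an optimizer to find the parameters that minimize that error.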

57
Q

What are the types of funds?

A

ETFs - Buy/sell like stocks, baskets of stocks, transparent

Mutual Fund - Buy/sell at end of day, quarterly disclosure, less transparent

Hedge Fund - Buy/sell by agreement, no disclosure, not transparent

58
Q

What is liquid?

A

Ease with which one can buy or sell shares in a holding

ETFs are liquid

59
Q

What is large cap?

A

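How much the company is worth in terms of # of shares x price (market capitalization). A large-cap company is one with a large market capitalization.

The price of the stock only reflects what a single share is selling at, not the size of the company.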

60
Q

How can you tell what type a fund is?

A

ETFs - ticker symbols of 3 or 4 letters

Mutual Funds - ticker symbols of 5 letters

Hedge Funds - identified by name

61
Q

How are the managers of these funds compensated?

ETF

Mutual Funds

Hedge Funds

A

ETFs - Expense Ratio as a percentage of AUM (0.01% to 1.00%), tied to an index

Mutual Funds - Expense Ratio (0.5% to 3.0%)

Hedge Funds - Two and Twenty (2% of AUM and 20% of profits)

AUM - Assets Under Management is the total amount of money being managed by the fund.

62
Q

What types of investors use hedge funds?

A

Individuals

Institutions - retirement funds, university foundations

Funds of funds - group together funds of individuals or institutions

63
Q

What are hedge fund goals and metrics?

A

1) Beat a benchmark
2) Absolute returns

64
Q

What is an order?

A

Buy or sell info

Symbol

shares

Limit or Market (market means accept the current market price; limit means no worse than a specified price)

Price

65
Q

How do orders get to the exchange?

A

you -> broker -> exchange

you -> broker <- joe (the broker matches two of its own clients internally)

you -> broker -> dark pool <- broker2 <- lisa

66
Q

What are broker order types?

A

Stop loss

Stop gain

Trailing stop

Selling Short

67
Q

What is the value of a future dollar?

A

PV = FV / (1 + IR)^i

PV - present value

FV - future value

IR - interest rate

i - number of years into the future
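
A worked example of the present-value formula, assuming a made-up payment of 100 received one year from now at a 5% interest rate. The function name present_value is hypothetical:

```python
def present_value(fv, ir, i):
    """Value today of receiving fv after i periods at per-period rate ir."""
    return fv / (1.0 + ir) ** i

pv = present_value(100.0, 0.05, 1)   # 100 / 1.05
print(round(pv, 2))
```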

68
Q

What is the difference between the interest rate and discount rate?

A

Interest rate is used with a given present value, to figure out what the future value would be

Discount rate is used when we have a known or desired Future Value and want to compute the corresponding present value.

69
Q

What is the intrinsic value of a company?

A

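FV / DR

Future Value / Discount Rate, where FV is the payment (e.g. a dividend) received each period. This is what the sum of all future payments, each discounted as FV / (1 + DR)^i, converges to.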
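
A numerical check of FV / DR, assuming a made-up payment of 1.0 per period and a 5% discount rate. Summing the discounted payments over many periods approaches the closed form; the function name discounted_sum is hypothetical:

```python
def discounted_sum(payment, dr, n_periods):
    """Present value of `payment` received at the end of each of n_periods."""
    return sum(payment / (1.0 + dr) ** i for i in range(1, n_periods + 1))

payment, dr = 1.0, 0.05
closed_form = payment / dr                     # FV / DR
approx = discounted_sum(payment, dr, 2000)     # long but finite stream of payments

print(closed_form, approx)
```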

70
Q

What’s the value?

Dividend = d

Discount Rate = dr

A

d / dr
71
Q

What is book value?

A

Total assets minus intangible assets and liabilities

72
Q

What is market capitalization?

A

# of shares * price

73
Q

What is a portfolio?

A

A weighted set of assets.

Wi is the portion of funds in asset i

the sum of absolute value of the weights is 1.0

74
Q

What is the equation for the return on a portfolio?

A

The weight * the return summed for all assets
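
A small sketch of the weighted sum, assuming a made-up two-asset portfolio that is 75% long one asset and 25% short the other, so the absolute values of the weights sum to 1.0:

```python
import numpy as np

# Made-up weights (negative means short) and made-up asset returns.
weights = np.array([0.75, -0.25])
asset_returns = np.array([0.01, -0.02])

abs_weight_sum = np.abs(weights).sum()                      # should be 1.0
portfolio_return = float(np.sum(weights * asset_returns))   # 0.0075 + 0.005

print(portfolio_return)
```

The short position gains when its asset falls, so both terms contribute positively here.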

75
Q

What is the market portfolio?

A

An index that covers a large portion of stocks

US: S&P 500.

An index can be thought of as the “ocean”: when malaise occurs in the market as a whole, it affects most stocks.

Indexes are cap weighted, where the weight of a stock is its market cap / sum of all market caps.

76
Q

What is the CAPM equation?

A

The return for a stock on day t is equal to Beta times the return on the market on day t plus alpha on that day.

Ri (t) = Bi * Rm(t) + Ai (t)

Beta component - market, SLOPE!

Alpha component - residual, y INT!

CAPM says that alpha is expected to be 0.

77
Q

What is CAPM vs Active Management?

A

Passive - buy index and hold

Active - pick stocks (over/under weight stocks)

78
Q

What is the difference between CAPM and Active investors?

A

CAPM says that alpha is random and Expected (alpha) = 0

Active managers believe they can predict alpha, at least more than a coin flip.

79
Q

What are the implications of CAPM?

A

The only way to beat the market is by choosing beta

Expected value of alpha = 0

Efficient Markets Hypothesis says you can't predict the market.

80
Q

What is Arbitrage Pricing Theory (APT)?

A

We ought to consider multiple betas.

Beta for different sectors

81
Q

Why do stocks split?

A

The price is too high

82
Q

What are the problems with regression based forecasting?

A

1) Noisy and uncertain forecasts
2) Challenging to estimate confidence
3) Holding time, allocation

83
Q

Pros and cons of parametric vs non-parametric learners

A

Parametric:

slow training

query fast

Non parametric:

training fast

query slow

complex patterns with no underlying model

84
Q

What is cross validation?

A

Splitting data into many chunks to create different test/train data.

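It does not work well with financial data because the data is time ordered: randomly chosen chunks can put training data after test data, letting the model peek into the future.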

85
Q

What is overfitting?

A

When in-sample error is decreasing and out-of-sample error is increasing