Pandas hw1b - Basic Queries Flashcards
All columns relating to AAPL
ans = df2010.loc[df2010.symb == ‘AAPL’]
ans.head()
All columns from the table and a column with the
difference between the open and close (open-close)
for AAPL on the 7th of January
df2010C = df2010.copy()
df2010C = df2010C.loc[ (df2010.retdate == ‘07-Jan-2010’)
&(df2010.symb == ‘AAPL’) ]
df2010C[ ‘diff’ ] = df2010C[ ‘opn’ ] - df2010C[ ‘cls’ ]
df2010C.head()
Write a query which returns the stock symbol, the date,
the open and close price for the top five differences
(open-close) in 2010 for only those stocks on the New
York Stock Exchange
df2010C = df2010.copy()
df2010C[ ‘diff’ ] = df2010C[ ‘opn’ ] - df2010C[ ‘cls ‘ ]
df2010C = ( df2010C.loc[ (df2010[ ‘exch’ ] == ‘NYSE’ ) ]
.sort_values(‘diff’, ascending = False)
) [ [ ‘symb’, ‘retdate’, ‘opn’, ‘cls’ ] ]
df2010C.head()
The days when AAPL has a volume more than 20 million and where the high is $3 or more greater than the low.
df2010C = df2010.copy()
df2010C = df2010C.loc[ (df2010C.symb == ‘AAPL’)
& (df2010C.vol > 20000000)
& ((df.2010C.high - df.2010C.low) >= 3)
] [ ‘retdate’ ]
df2010C.head()
Write a query which returns 3 columns: the return date,
SYMB and volume, but only for stocks that have a volume larger than 200 million
df2010C = df2010.copy()
df2010C = df2010C.loc[ (df2010C.vol > 200000000)
] [ [ ‘retdate’, ‘symb’, ‘vol’ ] ]
df2010C.head()
Write a query which returns all information about
Google (GOOG), Netflix (NFLX), Amazon (AMZN), and
Microsoft (MSFT) in 2010.
df2010C = df2010C.copy()
df2010C = df2010C.loc[ (df2010C.symb == ‘GOOG’)
| (df2010C.symb == ‘NFLX’)
| (df2010C.symb == ‘AMZN’)
| (df2010C.symb == ‘MSFT’)
]
df2010C.head()
Write a query which returns the date and symbol of
the largest “one-day gainer”, that is the stock which
has the highest close-open on the NYSE
df2010C = df2010.copy()
df2010C[ ‘oneday’ ] = df2010C.cls - df2010C.opn
df2010C = (df2010C.loc[ (df2010C.exch == ‘NYSE’)]
.nlargest(1, ‘oneday’)
) [ [‘retdate’, ‘symb’] ]
Write a query which returns the date and symbol of the
largest “one-day percentage gainer”, that is the stock
which has the highest (close-open)/open on the NYSE
df2010C = df2010.copy()
df2010C[ ‘onedaypct’ ] = (df2010C.cls - df2010C.opn) /
df2010C.opn
df2010C = (df2010C.loc[ (df2010C.exch == ‘NYSE’) ]
.nlargest(1, ‘ondaypct’)
) [ [‘retdate’, ‘symb’] ]
df2010C.head()
Consider stocks on the NYSE which had a volume of more than 1 million. Which stocks (symb and date) had their open price the same as their low and their closing price the same as their high?
df2010C.copy()
df2010C = df2010C.loc[ (df2010C.exch == ‘NYSE’)
& (df2010C.vol > 1000000)
& (df2010c.opn == df2010c.low)
& (df2010c.cls == df2010c.high)
] [ [ ‘symb’, ‘retdate’ ] ]
df2010C.head()
Consider stocks on the NYSE which had a volume of more than 1 million. Which stocks (symbol and date) had their closing price the same as their low and their opening price the same as their high?
df2010C.copy()
df2010C = df2010C.loc[ (df2010C.exch == ‘NYSE’)
& (df2010C.vol > 1000000)
& (df2010c.opn == df2010c.high)
& (df2010c.cls == df2010c.low)
] [ [ ‘symb’, ‘retdate’ ] ]
df2010C.head()
Consider stocks on the NYSE which had a volume of more than 1 milion. Of those days which stock had either (a) open = low and close = high, or (b) open = high and close = low, which symbol and date has the largest volume traded?
df2010C.copy()
df2010C = (df2010C.loc[ (df2010C.exch == ‘NYSE’)
& (df2010C.vol > 1000000)
& (((df2010c.opn == df2010c.high)
& (df2010c.cls == df2010c.low))
| ((df2010c.opn == df2010c.high)
& (df2010c.cls == df2010c.low)))
].nlargest(1, ‘vol’)
[ [ ‘symb’, ‘retdate’ ] ]
df2010C.head()
Which company (ticker symbol) had the highest net income over all the years that are in the FND tables?
dffndC = dffnd.copy()
dffndC = dffnd.nlargest(1, ‘netinc’) [ [ ‘tic’ ] ]
Which company (ticker symbol) had the highest net income in fiscal year 2011? (using the FND table)
dffndC = dffnd.copy()
dffndC = dffnd.loc [ (dffndC.fyear = 2011) ]
.nlargest(1, ‘netinc’) [ [ ‘tic’ ] ]
dffndC.head()
Which company (ticker symbol) had the lowest, non-zero, net income over all the years? (use the FND table)
dffndC = dffnd.copy()
dffndC = dffnd.loc [ (dffndC.fyear = 2011) ]
.nlargest(1, ‘netinc’) [ [ ‘tic’ ] ]
dffndC.head()
Which company (ticker symbol) which had a net-income per employee over $1000, had the largest number of employees (over all the years)? Keep units in mind! (use the FND table)
dffndC = dffnd.copy()
dffndC[ ‘netinc_emp’ ] = (dffndC.netinc /
dffndC.emp)*1000
dffndC = dffndC.loc [ (dffnd.netinc_emp > 1000) ]
.nlargest(1, ‘emp’) [ [ ‘tic’ ] ]
dffndC.head()