Pandas Flashcards
The pandas Series data structure: what is it?
A Series is a one-dimensional array-like object containing a sequence of values (similar to a NumPy array) and an associated array of index labels.
import pandas as pd
obj = pd.Series([4, 5, -3, 2])
obj
0    4
1    5
2   -3
3    2
dtype: int64
output the array values
obj.values
array([ 4, 5, -3, 2], dtype=int64)
output the list of index values in a pandas Series
obj.index
RangeIndex(start=0, stop=4, step=1)
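As a small sketch of the two cards above, the values and the index can be pulled out of a Series and converted to plain Python objects. The variable names `values` and `index_labels` are illustrative only; `to_numpy()` is used here as the explicit way to get the underlying NumPy array.

```python
import pandas as pd

obj = pd.Series([4, 5, -3, 2])

# The underlying data as a NumPy array (same content as obj.values)
values = obj.to_numpy()

# The default index is a RangeIndex; list() expands it to its labels
index_labels = list(obj.index)

print(values)
print(index_labels)
```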
how to assign a different index in a pandas Series
# Specify a different index
obj2 = pd.Series([4, 5, -3, 2], index=['d', 'c', 'a', 'b'])
obj2
get a value by its index label in a pandas Series
pandas offers more flexibility than NumPy in how the index can be used.
obj2['c']
5
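get the same value by its integer position instead of its label
Use .iloc for positional access. Plain obj2[1] relied on a positional fallback that newer pandas versions deprecate when the index holds string labels.
obj2.iloc[1]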
5
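To make the link between a label and its position concrete, here is a minimal sketch. It assumes the same `obj2` as in the cards; `Index.get_loc` is used to look up where the label 'c' sits, and the variable names are illustrative.

```python
import pandas as pd

obj2 = pd.Series([4, 5, -3, 2], index=["d", "c", "a", "b"])

# Position of the label 'c' within the index
position = obj2.index.get_loc("c")

# Label-based and position-based lookups of the same element
by_label = obj2.loc["c"]
by_position = obj2.iloc[position]

print(position, by_label, by_position)
```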
get 2 values using the assigned letters in a pandas Series
obj2[['a', 'd']]
# ['a', 'd'] is a list of index labels. The result is a subset of the original Series, which is itself a Series.
a -3
d 4
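A quick check, written as a sketch, that selecting with a list of labels gives back a Series whose rows follow the order of the requested labels. The name `subset` is illustrative.

```python
import pandas as pd

obj2 = pd.Series([4, 5, -3, 2], index=["d", "c", "a", "b"])

# Selecting with a list of labels returns a new Series
subset = obj2[["a", "d"]]

print(type(subset).__name__)
print(subset)
```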
you can do NumPy-like operations on a Series, such as filtering with a boolean array
obj2[obj2 > 0]
d    4
c    5
b    2
dtype: int64
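The filter above works in two steps, which the following sketch separates: the comparison first builds a boolean Series, and that Series is then used to select rows. Element-wise arithmetic is included as a second NumPy-like operation; the names `mask`, `positives` and `doubled` are illustrative.

```python
import pandas as pd

obj2 = pd.Series([4, 5, -3, 2], index=["d", "c", "a", "b"])

# Step 1: the comparison yields a boolean Series with the same index
mask = obj2 > 0

# Step 2: indexing with the mask keeps only the rows where it is True
positives = obj2[mask]

# Arithmetic is applied element-wise and keeps the index labels
doubled = obj2 * 2

print(mask)
print(positives)
print(doubled)
```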
find the type of an object (for example, Series or DataFrame)
type(new)
find missing data in pandas
pd.isnull(obj4)
obj3.isnull()
get a boolean mask marking the values that are not missing
pd.notnull(obj4)
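The cards do not show how `obj4` was built, so the sketch below makes one up: a Series created from a hypothetical price dict with an index that includes a label ('bread') the dict does not contain, which leaves a missing value there. Only `pd.isnull`, `pd.notnull` and the `isnull` method come from the cards; the data is an assumption.

```python
import pandas as pd

# Hypothetical data: 'bread' has no entry, so it becomes NaN
prices = {"milk": 120, "eggs": 250}
obj4 = pd.Series(prices, index=["milk", "bread", "eggs"])

missing = pd.isnull(obj4)    # True where a value is missing
present = pd.notnull(obj4)   # True where a value is present

# Counting True values gives the number of missing entries
n_missing = int(obj4.isnull().sum())

print(missing)
print(present)
print(n_missing)
```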
assign value 300 to bread
obj4['bread'] = 300
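A short sketch of what label assignment does, again using a made-up `obj4` (the cards do not define it). Assigning to an existing label overwrites the value, here replacing a missing entry; assigning to a label that is not yet in the index appends a new entry. The 'butter' label and all prices are hypothetical.

```python
import pandas as pd

# Hypothetical Series with a missing value at 'bread'
obj4 = pd.Series({"milk": 120.0, "eggs": 250.0}, index=["milk", "bread", "eggs"])

# Overwrite the value stored under an existing label
obj4["bread"] = 300

# Assigning to a new label enlarges the Series
obj4["butter"] = 80

print(obj4)
```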
DataFrame
There are many possible data inputs to DataFrame, such as a NumPy array, a dict of lists or tuples, a dict of Series, a dict of dicts, and so on.
Here we only introduce how to construct a DataFrame from a dict of lists.
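For orientation only, here is a rough sketch of two of the other inputs listed above, a dict of Series and a dict of dicts. It reuses a few population figures from the dict-of-lists card below this one; the variable names and the choice of rows are illustrative assumptions.

```python
import pandas as pd

# Dict of Series: keys become columns, the Series index becomes the row index
from_series = pd.DataFrame({
    "year": pd.Series([2000, 2000], index=["Ohio", "Nevada"]),
    "pop": pd.Series([1.5, 2.4], index=["Ohio", "Nevada"]),
})

# Dict of dicts: outer keys become columns, inner keys become the row index
from_dicts = pd.DataFrame({
    "Ohio": {2000: 1.5, 2001: 1.7},
    "Nevada": {2000: 2.4, 2001: 2.9},
})

print(from_series)
print(from_dicts)
```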
create a dataframe
create a DataFrame through a dict of equal length lists or NumPy arrays:
data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada', 'Nevada'],
        'year': [2000, 2001, 2002, 2000, 2001, 2002],
        'pop': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]}
frame=pd.DataFrame(data)
frame
    state  year  pop
0    Ohio  2000  1.5
1    Ohio  2001  1.7
2    Ohio  2002  3.6
3  Nevada  2000  2.4
4  Nevada  2001  2.9
5  Nevada  2002  3.2
create another DataFrame from a dictionary
election = {'state': ['New Jersey', 'Ohio', 'West Virginia'],
            'Winner': ['Hillary', 'Trump', 'Trump'],
            'Margin': [5, 7, 15]}
election
type(election)
electionresult = pd.DataFrame(election)
# electionresult
electionresult.head()
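A brief sketch of what `head()` returns for this small frame: by default it shows up to the first five rows, so a three-row DataFrame comes back whole, and passing a number limits the output further. The name `first_two` is illustrative.

```python
import pandas as pd

election = {
    "state": ["New Jersey", "Ohio", "West Virginia"],
    "Winner": ["Hillary", "Trump", "Trump"],
    "Margin": [5, 7, 15],
}
electionresult = pd.DataFrame(election)

# Default head() returns up to five rows; this frame only has three
all_rows = electionresult.head()

# An explicit count returns just that many rows from the top
first_two = electionresult.head(2)

print(all_rows)
print(first_two)
```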