quiz 3 data visualization Flashcards
2 types of data visualization
declarative purpose and exploratory purpose; both aim to support better decisions
3 types of data
quantitative, categorical, and ordinal
quantitative data
discrete and continuous
categorical
nominal (order doesn't matter) and takes only a fixed subset of values
M or F
Hair is blonde, brunette, black, etc.
the order of the values carries no meaning
ordinal
a fixed subset of values, but order does matter
such as low, med, high
income: low, med, high
education: high school, college
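A minimal sketch of the nominal vs. ordinal distinction, assuming pandas is available. The variable names (`hair`, `income`) are made up for illustration; `ordered=True` is what tells pandas that low < med < high.

```python
import pandas as pd

# Nominal: a fixed set of values with no meaningful order
hair = pd.Categorical(["blonde", "black", "brunette"], ordered=False)

# Ordinal: the same idea, but the category order carries meaning
income = pd.Categorical(
    ["low", "high", "med"],
    categories=["low", "med", "high"],  # declared from lowest to highest
    ordered=True,
)

# Sorting follows the declared category order, not alphabetical order
income_sorted = pd.Series(income).sort_values().tolist()
lowest_income = income.min()
```

Asking for `hair.min()` would raise an error, since an unordered categorical has no smallest value.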
Find the Median of: 9, 3, 44, 17, 15
3, 9, 15, 17, 44: median is 15. Line up ascending and pick the middle value.
Find the Median of: 8, 3, 44, 17, 12, 6
3, 6, 8, 12, 17, 44. Since the count is even, average the two middle values: (8+12)/2 = 10
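The two median cards can be checked with a short sketch. `median_by_hand` is a hypothetical helper written for this card, not a library function; the standard library's `statistics.median` is shown alongside it for comparison.

```python
from statistics import median

odd_values = [9, 3, 44, 17, 15]
even_values = [8, 3, 44, 17, 12, 6]

def median_by_hand(values):
    """Sort ascending, then take the middle value (or the mean of the two middle values)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle pair

odd_median = median_by_hand(odd_values)    # middle of 3, 9, 15, 17, 44
even_median = median_by_hand(even_values)  # (8 + 12) / 2
```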
explain why a plot could be helpful
as part of the exploratory analysis, to identify outliers, or as part of the end goal.
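create a plot with a range of 0-9
import numpy as np
import matplotlib.pyplot as plt
data=np.arange(10) #arange(10) gives 0 through 9; arange(9) would stop at 8
x=plt.plot(data)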
create a plot with -1,3,5,7
x=plt.plot([-1,3,5,7])
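A small sketch of what `plt.plot` does with a single list: the values go on the y-axis and their positions (0, 1, 2, ...) go on the x-axis. The `Agg` backend line is an assumption added so the sketch runs without a display; drop it in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend, not needed interactively
import matplotlib.pyplot as plt

lines = plt.plot([-1, 3, 5, 7])  # returns a list of Line2D objects
line = lines[0]

x_values = list(line.get_xdata())  # positions 0..3, filled in automatically
y_values = list(line.get_ydata())  # the values that were passed in
```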
create an empty figure object
fig=plt.figure() #an empty figure has nothing to draw; add at least one subplot to plot on
ax1=fig.add_subplot(2, 2, 1) #first cell of a 2x2 grid
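A rough sketch of the figure and subplot relationship: a new figure starts with no axes, and each `add_subplot(rows, cols, position)` call adds one. The `Agg` backend and the sample data are assumptions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend so this runs without a display
import matplotlib.pyplot as plt

fig = plt.figure()
axes_before = len(fig.axes)  # a fresh figure has no axes yet

ax1 = fig.add_subplot(2, 2, 1)  # top-left cell of a 2x2 grid
ax2 = fig.add_subplot(2, 2, 2)  # top-right cell of the same grid
ax1.plot([0, 1, 2])             # draw on a specific subplot

axes_after = len(fig.axes)
```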
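create a figure object with a plot of a random array of 10
data=np.random.randn(10) #10 draws from the standard normal distribution
plt.plot(data) #draws on the current figure, creating one if none exists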
create a figure object with a 1x1 subplot and a plot of 1.5, 3.5, -2, 1.6
fig=plt.figure() #create a Figure object
ax=fig.add_subplot(1,1,1) #create an Axes (subplot) object
ax.plot([1.5,3.5,-2,1.6])
plot a Series
ser=pd.Series(np.random.randn(10).cumsum(), index=np.arange(0,100,10))
ser.plot() #The Series object's index is passed to matplotlib for plotting on the x-axis, though you can disable this by passing use_index=False.
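A hedged sketch of the `use_index` behavior described on this card, plotting the same Series twice and reading back the x values matplotlib received. The two-panel layout, the axis names, and the `Agg` backend are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

ser = pd.Series(np.random.randn(10).cumsum(), index=np.arange(0, 100, 10))

fig, (ax_index, ax_position) = plt.subplots(1, 2)
ser.plot(ax=ax_index)                      # x-axis uses the index: 0, 10, ..., 90
ser.plot(ax=ax_position, use_index=False)  # x-axis uses positions: 0, 1, ..., 9

x_with_index = list(ax_index.lines[0].get_xdata())
x_without_index = list(ax_position.lines[0].get_xdata())
```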
plot a dataframe
df=pd.DataFrame(np.random.randn(10,4).cumsum(0), columns=['A','B','C','D'], index=np.arange(0,100,10))
df.plot()
this draws 4 lines on one plot, one per column of random data, labeled A, B, C, D in the legend
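A minimal sketch of what `df.plot()` produces, assuming the same shape of data as the card: one line per column on a single set of axes, with the column names in the legend. The `Agg` backend is an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend, not needed interactively
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.randn(10, 4).cumsum(0),  # cumulative sum down each column
    columns=["A", "B", "C", "D"],
    index=np.arange(0, 100, 10),
)

ax = df.plot()  # returns the matplotlib Axes that was drawn on

line_count = len(ax.get_lines())
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
```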