Data Wrangling & EDA Flashcards
what is data wrangling?
transforming raw data for analysis (useful for formatting, missing/corrupt values, unit conversion)
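example: a minimal sketch of the three wrangling tasks named above (formatting, missing/corrupt values, unit conversion), using only the Python standard library. The records, the field names `city` and `temp_f`, and the "N/A" marker are all made up for illustration.

```python
# Hypothetical raw records; field names and the "N/A" marker are assumptions.
raw_rows = [
    {"city": " Berkeley ", "temp_f": "68.0"},
    {"city": "oakland", "temp_f": "N/A"},
    {"city": "San Jose", "temp_f": "77"},
]


def clean_row(row):
    """Return a cleaned copy of one raw record."""
    # Formatting: trim stray whitespace and normalize capitalization.
    city = row["city"].strip().title()

    # Missing/corrupt values: anything that does not parse as a number
    # is recorded as None instead of crashing the pipeline.
    try:
        temp_f = float(row["temp_f"])
    except ValueError:
        temp_f = None

    # Unit conversion: Fahrenheit to Celsius, rounded to one decimal place.
    temp_c = None if temp_f is None else round((temp_f - 32) * 5 / 9, 1)
    return {"city": city, "temp_c": temp_c}


clean_rows = [clean_row(r) for r in raw_rows]
```

Storing `None` for unparseable values is one choice among several; dropping the row or imputing a value may suit other analyses better.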
what is EDA?
exploratory data analysis: open-ended analysis to get to know the data
quantitative continuous
can be measured to arbitrary precision – price, temperature
quantitative discrete
countable possible values, usually whole numbers – # of siblings, years of education
qualitative ordinal
categories w/ ordered levels – preferences, level of education
qualitative nominal
categories with no specific ordering – political association, cal ID #
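example: a rough sketch of which summaries make sense for each of the four variable types above. All values and the `LEVELS` ordering are invented for illustration, and the pairing of type to summary is a common rule of thumb rather than a strict rule.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical sample values for each variable type.
prices = [3.50, 4.25, 2.99]                    # quantitative continuous
siblings = [0, 2, 1, 2]                        # quantitative discrete
education = ["BA", "HS", "PhD", "HS"]          # qualitative ordinal
parties = ["green", "independent", "green"]    # qualitative nominal

# Quantitative: arithmetic summaries are meaningful.
avg_price = mean(prices)
median_siblings = median(siblings)

# Ordinal: order is meaningful, but it has to be stated explicitly,
# because alphabetical order is not the order of the levels.
LEVELS = ["HS", "BA", "PhD"]  # assumed ordering, lowest to highest
sorted_education = sorted(education, key=LEVELS.index)

# Nominal: only equality and counting are meaningful.
party_counts = Counter(parties)
```

Note that an ID number is stored as digits but is still nominal: averaging IDs produces nothing interpretable.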
what is granularity?
how fine/coarse is each datum?
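example: a small sketch of changing granularity, assuming a made-up table where each record is one purchase. Summing by customer gives a coarser table where each record is one customer. The field names `customer` and `amount` are hypothetical.

```python
from collections import defaultdict

# Fine granularity: one record per purchase (hypothetical data).
purchases = [
    {"customer": "a", "amount": 5.0},
    {"customer": "b", "amount": 2.5},
    {"customer": "a", "amount": 1.5},
]

# Coarser granularity: one record per customer, found by summing amounts.
totals = defaultdict(float)
for purchase in purchases:
    totals[purchase["customer"]] += purchase["amount"]

per_customer = dict(totals)
```

Going from fine to coarse is always possible by aggregating; going the other way is not, since the per-purchase detail is lost.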
what is scope?
how (in)complete the data is
what is temporality?
how is the data situated in time?
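example: a minimal sketch of two basic temporality checks, the time span covered and the time zone of each timestamp. The timestamps are invented and assumed to be ISO 8601 strings with an explicit UTC offset.

```python
from datetime import datetime, timedelta

# Hypothetical timestamps, assumed to be ISO 8601 with a UTC offset.
stamps = ["2021-03-01T08:00:00+00:00", "2021-03-03T17:30:00+00:00"]

parsed = [datetime.fromisoformat(s) for s in stamps]

# What period does the data cover?
earliest = min(parsed)
latest = max(parsed)
span = latest - earliest

# Are all timestamps recorded in the same zone (here: UTC)?
all_utc = all(t.utcoffset() == timedelta(0) for t in parsed)
```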
what is faithfulness?
how well does the data capture reality?
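example: one simple way to probe faithfulness is to flag values that cannot be real. This sketch assumes a made-up `age` field and an assumed plausible range of 0 to 120; the right bounds depend on the data set.

```python
# Hypothetical records; the "age" field and its values are invented.
records = [{"age": 34}, {"age": -1}, {"age": 212}, {"age": 58}]


def is_plausible_age(age, low=0, high=120):
    """Return True if age falls inside the assumed realistic range."""
    return low <= age <= high


# Records that likely do not reflect reality (entry errors, placeholders).
suspect = [r for r in records if not is_plausible_age(r["age"])]
```

A flagged value is only a prompt to investigate; a value like -1 may be a deliberate placeholder for "unknown" rather than a typo.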