Data And Data Preparation Flashcards
What are descriptive statistics?
A summary of the important aspects of a data set
The unemployment rate and the Dow Jones Industrial Average are examples of what branch of statistics?
Descriptive statistics
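As a rough illustration of what "summarizing a data set" can look like, here is a minimal sketch using Python's standard `statistics` module. The unemployment figures are made up for the example and are not real data.

```python
import statistics

# Hypothetical monthly unemployment rates in percent (invented for illustration)
rates = [4.1, 3.9, 4.0, 4.3, 4.2, 3.8]

# A few common descriptive statistics: center, spread, and range
summary = {
    "mean": statistics.mean(rates),
    "median": statistics.median(rates),
    "stdev": statistics.stdev(rates),  # sample standard deviation
    "min": min(rates),
    "max": max(rates),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Nothing here generalizes beyond the six values themselves, which is what separates this from inferential statistics.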
What is the branch of statistics that draws conclusions from a sample of data called?
Inferential statistics
When is it appropriate to use cross-sectional data?
When the time of measurement doesn’t matter
When is it appropriate to use time series data?
When the timing matters and you are measuring a single subject over time
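The two cards above can be sketched as two differently shaped data sets. This is only an illustration: the country labels and numbers are hypothetical.

```python
# Cross-sectional: many subjects, one point in time (order is irrelevant)
cross_section = {"Country A": 3.6, "Country B": 12.1, "Country C": 2.6}

# Time series: one subject, many points in time (order matters)
time_series = [("2021", 4.4), ("2022", 3.2), ("2023", 3.6)]

# Year-over-year change only makes sense because the series is chronological
changes = [
    round(later - earlier, 1)
    for (_, earlier), (_, later) in zip(time_series, time_series[1:])
]
print(changes)
```

Reordering `cross_section` loses nothing, while shuffling `time_series` would make `changes` meaningless.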
What is the difference between structured and unstructured data?
Structured data is organized like a database table, with rows and columns, while unstructured data has no predefined format. Apparently about 80% of data is unstructured nowadays, which is crazy.
What are the three characteristics of big data?
Volume, velocity, and variety
How can qualitative and quantitative data be described?
As categorical and numerical, respectively
What is the difference between discrete and continuous variables?
Continuous variables can take any value within an interval, while discrete variables take a countable number of distinct values
Which measurement scales are used for categorical variables?
Nominal and ordinal
Which measurement scales are used for numerical variables?
Interval and ratio
What does it mean that interval-scale variables don’t have a meaningful zero?
That the zero does not represent an absence of what’s being measured
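A quick way to see why this matters, using temperature as an example that is not in the original cards: Celsius is an interval scale (0 °C is not "no temperature"), while Kelvin is a ratio scale. The sketch below shows that ratios computed on the Celsius scale are misleading.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert Celsius (interval scale) to Kelvin (ratio scale)."""
    return celsius + 273.15

cold_c, warm_c = 10.0, 20.0

# Dividing interval-scale values gives a number, but not a meaningful one
naive_ratio = warm_c / cold_c

# On a scale with a true zero, the ratio is meaningful
true_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)

print(f"Celsius ratio: {naive_ratio:.2f}")
print(f"Kelvin ratio:  {true_ratio:.3f}")
```

Differences stay meaningful on both scales: the gap between the two readings is 10 degrees either way.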
How can you handle missing values in a dataset?
Either by the omission strategy (removing the observation entirely) or by the imputation strategy (replacing the missing value with the average or some other reasonable value)
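The two strategies from the card above can be sketched in plain Python. The `ages` list is hypothetical, and `None` is assumed to mark a missing value.

```python
from statistics import mean

# Hypothetical observations; None marks a missing value
ages = [25, None, 31, 40, None, 28]

# Omission strategy: drop observations with a missing value
omitted = [x for x in ages if x is not None]

# Imputation strategy: replace each missing value with the mean
# of the observed values (the median is another common choice)
fill_value = mean(omitted)
imputed = [fill_value if x is None else x for x in ages]

print(omitted)
print(imputed)
```

Omission shrinks the data set, while imputation keeps every observation at the cost of introducing estimated values.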
What is subsetting?
To extract a relevant portion of the data
What are some ways of preparing data?
Counting, sorting, and subsetting
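A minimal sketch of all three preparation steps on a small, made-up list of records. The field names (`name`, `dept`, `salary`) are assumptions chosen for the example.

```python
from collections import Counter

# Hypothetical employee records (invented for illustration)
records = [
    {"name": "Ana", "dept": "Sales", "salary": 52000},
    {"name": "Ben", "dept": "IT", "salary": 61000},
    {"name": "Cy", "dept": "Sales", "salary": 48000},
    {"name": "Dee", "dept": "IT", "salary": 75000},
    {"name": "Eli", "dept": "HR", "salary": 45000},
]

# Counting: how many observations fall in each category
dept_counts = Counter(r["dept"] for r in records)

# Sorting: order observations by a numerical variable, highest first
by_salary = sorted(records, key=lambda r: r["salary"], reverse=True)

# Subsetting: extract only the portion of the data that is relevant
sales_only = [r for r in records if r["dept"] == "Sales"]

print(dept_counts)
print([r["name"] for r in by_salary])
print([r["name"] for r in sales_only])
```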