Types of Data Flashcards
What is structured data?
Data with a predefined structure, e.g. tables, spreadsheets, or relational databases
What is unstructured data?
Data with no predefined structure. It can come in any size or form and cannot be easily stored in tables, e.g. free text, images, audio
Quantitative vs. Categorical Data
Quantitative: numerical data, e.g. height, weight
Categorical: data that can be labeled or divided into groups, e.g. race, sex, hair color
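One way to see the difference is to look at which summaries make sense for each kind. The sketch below is only an illustration; the sample values and variable names are made up. It averages a quantitative column and counts the labels in a categorical one.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample values, for illustration only.
heights_cm = [160, 170, 180]                 # quantitative: arithmetic is meaningful
hair_colors = ["brown", "black", "brown"]    # categorical: labels, not magnitudes

# A numeric summary such as the mean fits quantitative data.
average_height = mean(heights_cm)

# Categorical data is usually summarized by counting how often each label occurs.
color_counts = Counter(hair_colors)
most_common_color = color_counts.most_common(1)[0][0]

print(average_height, most_common_color)
```

Taking the mean of the hair colors would not be meaningful, which is a quick test for telling the two kinds apart.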
What is Big Data?
Massive datasets characterized by the 3 V’s: greater variety, increasing volume, and ever-higher velocity. Typically too large to fit in the memory of a single machine
Common data formats
CSV, XML, SQL, JSON, Protocol Buffers
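As a rough illustration of how two of these formats differ, the sketch below reads the same record from CSV and from JSON using only the Python standard library. The record and its field names (`name`, `height_cm`) are hypothetical.

```python
import csv
import io
import json

# Hypothetical sample record, written in two of the formats listed above.
csv_text = "name,height_cm\nAda,170\n"
json_text = '{"name": "Ada", "height_cm": 170}'

# CSV: every field is read back as a string, so numbers need explicit conversion.
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))
csv_record = {
    "name": csv_rows[0]["name"],
    "height_cm": int(csv_rows[0]["height_cm"]),
}

# JSON: basic types (strings, numbers, booleans, null) are carried by the format.
json_record = json.loads(json_text)

# After converting the CSV field, both formats describe the same record.
print(csv_record == json_record)
```

CSV is compact and works well for flat tables, while JSON can also represent nested data, which is one reason both show up so often.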
Data Sources
Company/proprietary data, APIs, government, academic, web scraping/crawling