Intro to Data Science Flashcards
What is structured data?
Data organized in a predefined format, such as the rows and columns of a spreadsheet or a database table
What is unstructured data?
Data such as tweets, images, and newspaper articles that does not necessarily follow a predefined data structure
What kind of data is email?
Mostly structured: it has fields such as to, from, and time. The only unstructured part is the written content of the message itself
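Example: a minimal sketch of the split between the structured and unstructured parts of an email, using Python's standard email module. The addresses, date, and message text are made-up sample values, not real data.

```python
from email import message_from_string

# Made-up sample message (all values are illustrative assumptions).
raw = """From: alice@example.com
To: bob@example.com
Date: Mon, 01 Jan 2024 09:00:00 +0000
Subject: Meeting

Hi Bob, can we move the meeting to Tuesday? Thanks!
"""

msg = message_from_string(raw)

# Structured part: named header fields that every message shares.
structured = {field: msg[field] for field in ("From", "To", "Date", "Subject")}

# Unstructured part: free-form text with no fixed schema.
unstructured = msg.get_payload()

print(structured)
print(unstructured)
```

The header fields could go straight into table columns, while the body would need text processing before it could be analyzed.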
Data science is interdisciplinary
It combines programming/coding, math and statistics, business logic, and algorithms
Data Analytics
Analyzing big data based on history: raw data is used to derive meaningful information (e.g. mean, median, and mode)
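Example: a minimal sketch of deriving mean, median, and mode from raw data with Python's standard statistics module. The daily sales figures are invented for illustration.

```python
from statistics import mean, median, mode

# Invented raw data: cups of coffee sold on five past days.
daily_sales = [480, 500, 500, 520, 650]

summary = {
    "mean": mean(daily_sales),      # arithmetic average
    "median": median(daily_sales),  # middle value when sorted
    "mode": mode(daily_sales),      # most frequent value
}

print(summary)
```

Here the mean (530) sits above the median (500) because the single 650-cup day pulls the average upward.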
Data Science
Deals with the future: uses the information derived through data analytics to make predictions
Machine Learning
Includes distance-based algorithms, regression algorithms, and also deep learning
Deep learning
Works with deep neural networks
AI
Big Data and the 4 V's of Big Data
Data too large to fit on a single machine: information assets with high volume (amount), high velocity (how fast data is generated, e.g. up to about 150000 tweets per second at peak), and/or high variety (could be text, images), together with veracity (the data accurately represents reality), that demand cost-effective, innovative forms of information processing that enable insight, decision making, and process automation
Descriptive analytics
What happened? E.g. Starbucks sells 500 coffees every day (deals with history)
Diagnostic analytics
Why did it happen? E.g. Starbucks sold 2000 coffees in a day, more than usual (was there an event nearby? a free cup of coffee? a good ad?). Deals with history
Predictive analytics
What will happen? E.g. Starbucks will most likely sell this many coffees tomorrow
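Example: a minimal sketch of one way to make such a prediction, assuming sales follow a straight-line trend over time. The helper fit_line and the sales numbers are hypothetical and for illustration only; real predictive models are usually more involved.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Invented history: coffees sold on days 1 through 5.
days = [1, 2, 3, 4, 5]
sales = [500, 510, 520, 530, 540]

slope, intercept = fit_line(days, sales)

# Predict tomorrow (day 6) by extending the fitted trend.
predicted_tomorrow = slope * 6 + intercept
print(predicted_tomorrow)
```

The prediction is only as good as the assumption that the past trend continues, which is why the card says "most likely".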
Prescriptive analytics
What do we need to do? E.g. can you write a formula so that Starbucks sells 1500 coffees a day?