Data Science Flashcards
What is data science
The study of data to extract meaningful insights for businesses, using statistics, scientific computing and algorithms
What can data science include
Data visualisation, machine learning, artificial intelligence, statistics, mathematics, software engineering
What is a desired skill set for a data scientist
Communication, machine learning, statistics and probability, data visualisation, computer science and high-performance computing (HPC), data wrangling and databases, data ethics and regulation, domain expertise
What are some reasons why we may want to find patterns in data
We may want to find patterns in data to detect anomalies or outliers, cluster groups of similar things, identify relationships between things, or apply a label to an observation
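As a rough illustration of the first reason (detecting anomalies or outliers), the sketch below flags values that lie far from the mean. The z-score rule, the function name find_outliers, the threshold of 2.0 and the sample readings are all assumptions chosen for illustration; they are one simple approach among many, not a method prescribed by these cards.

```python
from statistics import mean, stdev


def find_outliers(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold.

    Hypothetical helper: the z-score rule and the default threshold
    are illustrative assumptions, not a standard fixed by the cards.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        # All values are identical, so nothing stands out.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Made-up sensor readings with one obviously unusual value.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
outliers = find_outliers(readings)
print(outliers)
```

With these made-up readings, only the value 50 is more than two sample standard deviations from the mean, so it is the only value flagged.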
Why are identified patterns only useful if they allow us to do something
We want actionable insights that help us make decisions or take actions based on the data
What is a common example of finding patterns in data
A common example of finding patterns in data is predicting whether an email is spam or not
What kind of patterns/insights do we often want to find using data analysis
We often want to find complex patterns or insights in massive amounts of data that are not easily discoverable by humans
Define Datum
A single piece of information
What is a datum in the context of data analysis
A datum is often an observation or measurement of something, recording information about that thing
How is data impacted by the problem we are trying to solve
The type and amount of data we collect will depend on the problem we are trying to solve
What is the “Age of Big Data” and how has it affected the value of data
A period in which massive collections of data are gathered frequently. This has made data an expensive commodity that may be sold to the highest bidders, who would benefit from the insights the data might hold
What are some examples of different types of data that can be collected
Numerical data, text data and images
What is atomic data
A primitive type that cannot be broken down into smaller units (integer, Boolean, character), e.g. your age is an atomic piece of data
What is composite data
A composition or aggregation of data that can be broken down into smaller units (strings, records, lists), e.g. your student record is a piece of composite data, collating your name, age, address, modules, …
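The difference between atomic and composite data can be sketched in Python as follows. The StudentRecord class, its field names and the sample values are hypothetical, chosen only to mirror the student record example; they cover only the fields the card names. Note that Python has no separate character type, so a one-character string stands in for one here.

```python
from dataclasses import dataclass, field

# Atomic data: primitive values that are not broken down further.
age: int = 20
is_enrolled: bool = True
grade: str = "A"  # one-character string standing in for a character


@dataclass
class StudentRecord:
    """Composite data: a record that aggregates smaller pieces.

    Hypothetical structure for illustration only.
    """
    name: str       # a string, itself a sequence of characters
    age: int        # an atomic value held inside the record
    address: str
    modules: list[str] = field(default_factory=list)  # a list of strings


record = StudentRecord(
    name="Alex Example",
    age=age,
    address="1 Example Street",
    modules=["Statistics", "Databases"],
)

# Composite data can be broken down into its smaller units.
first_module = record.modules[0]  # one element of the list
first_letter = record.name[0]     # one character of the string
print(first_module, first_letter)
```

The record, the list and the string can each be taken apart into smaller units, whereas the integer age cannot.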
What is quantitative (numerical) data
A type of data that represents a numerical value which quantifies something