Data Science As A Field Of Knowledge Flashcards
Machine learning
A subset of artificial intelligence that uses statistical techniques to enable computer systems to learn from experience without being explicitly programmed
Supervised learning
A model with a known target variable that “trains” on the data to learn how to predict that target variable.
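As a minimal sketch of the idea, the following fits a straight line to a small made-up data set in which the target (an exam score) is known for every row. The data, the names `hours`, `scores`, `fit_line`, and `predict`, and the choice of ordinary least squares are all assumptions made for illustration; many other supervised methods exist.

```python
# Hypothetical toy data: hours studied (feature) and exam score (known target).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training": learn the parameters from rows where the target is known.
slope, intercept = fit_line(hours, scores)


def predict(x):
    """Predict the target for a new, unseen feature value."""
    return slope * x + intercept


print(predict(6.0))
```

The known `scores` column is what makes this supervised: the fit is judged by how closely its predictions match those known targets.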
Unsupervised model
A model with no known target variable; this type of model is free to make connections between data points without having to consider an outside target variable.
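For contrast with a supervised model, here is a small sketch that groups numbers into two clusters without any target column. The data, the function name `two_means_1d`, and the choice of a basic k-means loop with two clusters are assumptions made for illustration, not something the flashcards specify.

```python
def two_means_1d(values, steps=10):
    """Group 1-D values into two clusters with a basic k-means loop."""
    # Deterministic starting centers: the smallest and largest value.
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(steps):
        groups = [[], []]
        for v in values:
            # Assign each value to whichever center is nearer.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups, centers


# Hypothetical measurements with no labels attached.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
groups, centers = two_means_1d(values)
print(groups)
print(centers)
```

No row says which group is "correct"; the structure comes only from how the data points relate to each other.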
Exploratory data analysis (EDA)
The process of exploring a data set to discover insights, identify patterns, establish relationships and trends, and test assumptions.
Bias
Tendency of a sampling method to overestimate or underestimate the value of an underlying population parameter.
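A small simulation can make this concrete. The sketch below compares two sampling methods on a made-up population; the population, the function names, the sample size, and the number of trials are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: the integers 1 through 100.
population = list(range(1, 101))
true_mean = statistics.mean(population)


def simple_random_sample(size):
    """Every member of the population is equally likely to be chosen."""
    return random.sample(population, size)


def top_half_sample(size):
    """A flawed method that can only reach values above 50."""
    reachable = [v for v in population if v > 50]
    return random.sample(reachable, size)


def average_estimate(sampler, trials=1000, size=10):
    """Average the sample mean over many repeated samples."""
    estimates = [statistics.mean(sampler(size)) for _ in range(trials)]
    return statistics.mean(estimates)


unbiased_avg = average_estimate(simple_random_sample)
biased_avg = average_estimate(top_half_sample)

print("true mean:", true_mean)
print("simple random sampling:", unbiased_avg)
print("top-half sampling:", biased_avg)
```

Averaged over many samples, the simple random method lands close to the population mean, while the top-half method consistently overestimates it. That systematic gap, not the scatter of any one sample, is the bias.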
Target variable
The variable a model is trained to predict; typically, a component of a business problem or objective.
Fixed data set
Does not receive new data regularly.
Interpretability
How easily a human can understand or predict a decision or result.
Iteration
Function that repeats in a specified order, often until a specific result occurs.
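One way to picture this is a loop that repeats the same update step until a stopping condition is met. The sketch below approximates a square root by repeated averaging (the Babylonian method); the function name `iterate_sqrt`, the tolerance, and the step limit are assumptions chosen for illustration.

```python
def iterate_sqrt(value, tolerance=1e-10, max_steps=100):
    """Repeat one update step until the guess stops changing.

    Assumes value is a positive number.
    """
    guess = value if value > 1 else 1.0
    for step in range(1, max_steps + 1):
        # The same operation is applied on every pass.
        new_guess = 0.5 * (guess + value / guess)
        # Stop once the specific result occurs: the change is tiny.
        if abs(new_guess - guess) < tolerance:
            return new_guess, step
        guess = new_guess
    # Safety net: give up after max_steps passes.
    return guess, max_steps


root, steps = iterate_sqrt(2.0)
print(root, steps)
```

The step limit is a design choice: it guarantees the loop ends even if the stopping condition is never reached.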
Iterative
Refers to a process where the design of a product or application is improved by repeated review and testing.
Anomalies
Data points that stand out from other data points in the data set and don’t conform to the normal behavior in the data. (They deviate from the data set’s normal behavioral patterns.)
Outlier
A single data point that lies far from the average value of a group of values. It is markedly different from the norm in some respect.
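There is no single rule for how far is "far". As a sketch, the code below flags outliers with the 1.5 × IQR convention (values more than 1.5 interquartile ranges beyond the quartiles); that rule, the function name `find_outliers`, and the sample data are assumptions made for illustration.

```python
import statistics


def find_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical readings: most are near 10 to 13, one is far away.
readings = [10, 12, 11, 13, 12, 11, 50]
print(find_outliers(readings))
```

Quartiles are used here rather than the mean and standard deviation because a single extreme value can pull those two statistics toward itself and hide the very point being looked for.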
Data Visualization
The representation of data through the use of common graphics, such as charts, plots, infographics, and even animations, to communicate complex data relationships and data-driven insights in a way that is easy to understand.
Features
Individual measurable properties within a recorded dataset.
They are often called “variables” or “attributes.” Relevant features have a correlation or bearing (called feature importance) on a model’s use case.
Relevant Features
have a correlation or bearing (called ____ importance) on a model’s use case.