Exam1 Flashcards
What year was the term AI coined?
1955, Dartmouth
What is the definition of AI
Any system that exhibits behavior that could be interpreted as human intelligence
Weak AI is also called
Narrow AI
____ AI is good for systems that have predefined patterns to eliminate impossible options
Planning
Strong AI is also called
General AI
Definition of Weak AI
model that is confined to a narrow task
What are some examples of weak AI tasks
Language to text processing; picture sorting
Siri is an example of a weak or strong AI?
weak
Definition of strong AI
the machine displays all person-like behavior that you’d expect from an artificial human (emotions, humor, etc)
What was an early name for the nodes in neural networks?
Perceptrons (Rosenblatt at Cornell)
When did the term “Deep Learning” become popular?
1990s
Reasons that machine learning has accelerated
Availability of data
Moore’s Law
IoT
Automated SW coding (sensors and controllers)
Training and test data is
labelled
3 categories of supervised learning
- Binary classification
- Multiclass classification
- Regression analysis
If you have massive amounts of unlabeled data, ____ algorithm could be a good choice
k-means clustering
Bagging, boosting, and stacking are examples of
ensemble modeling
Definition of bagging
create several different versions of the ML algorithm in parallel (like decision trees with different root nodes), then compare the results and average them out
Definition of Boosting
Use several different ML algorithms in sequence to boost accuracy of results (model 2 learns from model 1 etc)
Definition of Stacking
Use several different ML algorithms to boost accuracy (ex. k-NN on top of Naive Bayes)
For abstract reasoning, a _____ reasoning system may be best
symbolic
Definition of bias
gap between predicted value and actual outcome
Definition of variance
how scattered the predicted values are around the actual outcome
What is the Turing test
Can the machine fool a human into thinking it’s a human if it’s behind a wall?
Big data is
unstructured data
One challenge of using AI for predictions is that AI uses _____ data
Historical (ex. how would an AI model handle an unanticipated large event like Covid?)
One of the reasons AI didn’t take off in the 60s and 70s was
limits of technological maturity (memory space, computational power)
When building an AI model, keep _____ in mind
the end goal: who will use this model and why
raw data is
data collected in its original form, prior to any processing or adjustments
3 types of Data analytics
- Descriptive
- Predictive
- Prescriptive
Difference between predictive and prescriptive models
Predictive just predict the future (forecasts, etc), prescriptive change the future (control, optimization, etc)
Examples of types of data
- numeric vs non-numeric
- categorical data (ex fault or no-fault)
- structured vs unstructured
- temporal, spatial, spatio-temporal
- experimental vs operational
Experimental data differs from operational data critically in that ___
experimental data isolates a single variable (or a few) from the others, while operational data is influenced much more by the surrounding environment (which was not controlled)
Definition of Big Data
data that challenges the current capabilities of a single computing unit
What types of data would we encounter in energy systems
- metered data
- sub-metering
- communications
- measured data
- data storage
What does CRISP-DM stand for
Cross industry standard process for data mining
An input is also sometimes referred to as __
an instance
Definition of Data Analytics
the science of analyzing raw data to draw insight, and make conclusions from that data
Linear data cleaning workflow
- Access Data
- Detect Duty Cycles
- Remove Outliers
- Sanitize Gaps
- Check Process Limits
- Analyze data…
In univariate stats, variance =
std_dev^2
Covariance is…
a measure of how 2 variables vary together
Positive Covariance: variable A increases as variable B
increases
In weak covariance, there is…
no apparent linear statistical dependence between the 2 variables
Negative covariance, each variable “varies” ….
inversely to the other variable
Unlike covariance, correlation is…
- normalized to -1 to +1
- unitless
Correlation of A&B =
covariance of A&B / (std.devA*std.devB)
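A minimal sketch of the covariance and correlation cards above, using only the standard library. The function names and sample lists are illustrative assumptions, and the sample (n - 1) form of covariance is assumed.

```python
import math

def covariance(a, b):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

def correlation(a, b):
    """Correlation of A&B = covariance of A&B / (std.devA * std.devB)."""
    std_a = math.sqrt(covariance(a, a))  # the variance of A is cov(A, A)
    std_b = math.sqrt(covariance(b, b))
    return covariance(a, b) / (std_a * std_b)

a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.0]  # rises as a rises: positive covariance
c = [8.0, 6.0, 4.0, 2.0]  # falls as a rises: negative covariance

print(covariance(a, b), correlation(a, b))
print(covariance(a, c), correlation(a, c))
```

Covariance keeps the units of the inputs, while correlation rescales it into the unitless range -1 to +1.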
____ increase modeling risk
outliers
Outliers are:
data points that are significantly different from the rest of the data set
What is the simplest outlier detection technique for univariate samples
Z-score, where Z is the standardized equivalent of the data value = (x-x_mean)/std.dev
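The Z-score check can be sketched as follows. The function name, the sample readings, and the default cutoff of 3 are assumptions for illustration, and the sample standard deviation is assumed.

```python
import statistics

def z_score_outliers(values, cutoff=3.0):
    """Return the values whose |Z| = |(x - x_mean) / std.dev| exceeds the cutoff."""
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)  # sample standard deviation
    return [x for x in values if abs((x - mean) / std_dev) > cutoff]

# Hypothetical readings: 19 values near 10 and one suspicious value of 50.
readings = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10,
            11, 9, 10, 12, 10, 9, 11, 10, 10, 50]
outliers = z_score_outliers(readings)
print(outliers)
```

The outlier itself inflates the mean and standard deviation, so this check works best when outliers are rare relative to the sample size.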
MCD stands for
Minimum Covariance Determinant
MCD can be used to
remove outliers from multivariate samples
Definition of imputation
The process of identifying missing data, then creating a substitute
Why is it important to impute data sets
- missing data is generally not allowed in training data sets
- throwing out entire data points could throw out useful data
- statistical techniques could be biased by missing data
The covariance and correlation matrices are
symmetric
A typical Z score for outlier cutoff would be
3 ( = 3 std dev away from mean)
Imputed data inherently introduces
Bias into subsequent modeling
What are the 2 options to deal with missing data
- throw it out
- fill in the gap
What are some ways to impute?
- simple statistics (use mean, median, a constant)
- Multivariate imputation with bayesian stats
- k-nearest neighbor imputation
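A minimal sketch of the simple-statistics option (mean, median, or a constant). The function name `impute`, its parameters, and the use of `None` to mark a missing value are assumptions for illustration. The multivariate and k-nearest neighbor options are not shown.

```python
import statistics

def impute(values, strategy="mean", constant=0.0):
    """Fill missing entries (None) with a single substitute value."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    elif strategy == "constant":
        fill = constant
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]

print(impute([1.0, None, 3.0]))                   # mean of the observed values
print(impute([1.0, None, 2.0, 9.0], "median"))    # median is less pulled by the 9.0
print(impute([1.0, None], "constant", constant=-1.0))
```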
What are some initial questions to ask when prepping data for ML
- does the data include info that can predict the target?
- does the granularity of the training and prediction match?
- is there labeled data?
- is the data accurate? Do you know where it came from?
- is it easily accessible and readable?
- are the missing values a small percentage of the fields of interest?
Definition of an algorithm (comp sci)
a sequence of explicit instructions which perform a specific task
_____ analysis is used to simplify complexity analysis
asymptotic
_____ is a subset of AI
Machine Learning
Definition of Machine Learning
the study and usage of both algorithms and statistical models, which computer systems use, without explicit instructions, to learn how to perform specific tasks
____ is a subset of Machine Learning
Deep Learning
Machine Learning applies the fields of
Comp Sci; Optimization; Statistics
Unsupervised ML models can be used for
Clustering
Labeled data is data which ___
has an associated category assigned to a specific set of features in the data set
In hard clustering, each data point…
belongs to only 1 cluster
What clustering techniques are examples of hard clustering
k-means, hierarchical
Gaussian Mixture Modeling is an example of
soft clustering
What are some applications of clustering?
- exploratory data analysis
- dimensional (feature) reduction
- image segmentation
- anomaly detection
- data mining
Formula for euclidean distance between 2 pts with 2 features
d = sqrt( ( x1 - x2)^2 + (y1-y2)^2 )
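As a quick sketch, the same formula extends to any number of features by summing the squared difference in each dimension. The function name is an assumption for illustration.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of features."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# With 2 features this is the hypotenuse of a right triangle with legs 3 and 4.
print(euclidean_distance((0, 0), (3, 4)))
```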
In K-means clustering, a centroid is…
the arithmetic mean of the points in each dimension
Hierarchical clustering can be preferable over k-means when dealing with
a smaller amount of data
What are some convergence criteria you could set for k-means
- % reduction in SSE between iterations drops below a threshold
- Hard stop limit (max iterations) to avoid infinite iteration, and/or a known goal
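A minimal sketch of k-means with both stopping rules, assuming fixed starting centroids are passed in (real implementations usually pick them randomly). The names `kmeans`, `max_iter`, and `min_sse_drop` are hypothetical.

```python
import math

def kmeans(points, centroids, max_iter=100, min_sse_drop=0.001):
    """K-means sketch; stops on a small relative SSE drop or at max_iter."""
    prev_sse = None
    for iteration in range(1, max_iter + 1):  # hard stop limit
        # Assignment step: each point joins its single nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: centroid = arithmetic mean of its points in each dimension.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else old  # keep an empty cluster's centroid in place
            for cluster, old in zip(clusters, centroids)
        ]
        # SSE: squared distance of every point to its own centroid.
        sse = sum(math.dist(p, c) ** 2
                  for cluster, c in zip(clusters, centroids)
                  for p in cluster)
        if prev_sse is not None:
            drop = (prev_sse - sse) / prev_sse if prev_sse > 0 else 0.0
            if drop < min_sse_drop:  # SSE has stopped improving
                break
        prev_sse = sse
    return centroids, clusters, sse, iteration

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters, sse, iterations = kmeans(points, [(0, 0), (10, 10)])
print(centroids, iterations)
```

Different starting centroids can settle on different final clusters, so the algorithm is often rerun from several starting points.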
Hierarchical clustering creates a _____
dendrogram
Gaussian Mixture Modeling is a
probabilistic technique
In GMM, the center of the cluster is the
arithmetic mean
A model with overfitting is
too complex, maybe has too many predictors
You could have overfitting when
the model is more complex than the data
overfitting is ____ common than underfitting with AI models
more
What are some applications of classification?
- fault detection
- predictive maintenance
- speech recognition
Classification error is quantified by a
loss function
What is the formula for inverse distance weighting
w_i = (1/dist_i) / (sum over j = 1 to k of (1/dist_j))
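A short sketch of these weights for the k nearest neighbors. The function name and the sample distances are assumptions, and every distance is assumed to be non-zero.

```python
def inverse_distance_weights(distances):
    """Weight each neighbor by 1/distance, normalized so the weights sum to 1."""
    inverse = [1.0 / d for d in distances]  # assumes no distance is zero
    total = sum(inverse)
    return [w / total for w in inverse]

# Hypothetical distances to the k = 3 nearest neighbors.
weights = inverse_distance_weights([1.0, 2.0, 4.0])
print(weights)
```

Closer neighbors receive larger weights, so they count for more in the k-NN vote.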
Euclidean distance in 2D is the same as
formula for the hypotenuse of a right triangle
What are some advantages of k-NN?
- simple algorithm, with flexible options (distance calc method, # of k)
- considered a benchmark for other classification methods
What are some disadvantages of K-NN
- sensitive to outliers and erroneous labels
- memory intensive with larger k, pts, and features (giant distance matrices)
Resubstitution loss is…
the error just on the training set
What are advantages to decision trees?
- can handle non-linear responses
- excellent with categorical variables
- easy to understand for a small number of features
- once you build the model, classification of new data is computationally quick since it is just binary decisions
Disadvantages of decision trees
- struggles with a large number of features with smaller data size
- difficult to understand for a large number of features
Naive Bayes is a ____ classification technique
probabilistic
What are the 3 AI for energy transition principles
- Governing (Risk Management, Standards, Responsibility)
- Designing (Automation, Sustainability, Design)
- Enabling (Data, Incentives, Education)
How much investment does BNEF expect to need for a net-zero scenario
between 92 and 173 trillion USD by 2050
What are the 4 main fields where AI could be used in Energy Systems
- Renewable power gen. and demand forecasting
- Grid optimization and operation
- Management of energy demand and DER
- Materials discovery and innovation
K-means clustering has an inherent risk that the initial clusters converge….
to a local minimum, rather than global minimum SSE
Is K-means sensitive to outliers?
yes
Which has a higher time complexity, k-means or hierarchical clustering?
hierarchical