AI+ML+DS Flashcards
- Deep Learning, Goodfellow - Artificial Intelligence, Norvig - Introduction to Statistical Learning in Python, Hastie
Knowledge Based Approach to AI
Hard-coding knowledge about the world in formal languages so that the computer system can reason about those statements automatically using logical inference rules.
Machine Learning
Subset of artificial intelligence that enables computers to learn from data and improve their performance on a specific task without being explicitly programmed. Instead of being hand-coded with specific rules, machine learning algorithms can identify patterns and make predictions based on the data they are trained on.
Artificial Intelligence
A broad field of computer science that aims to create intelligent agents, which are systems that can reason, learn, and act autonomously.
In simpler terms, AI involves developing machines that can think and behave like humans.
Deep Learning
A subset of machine learning that uses artificial neural networks with multiple layers to learn from data. These neural networks are inspired by the structure and function of the human brain, and they can learn complex patterns and relationships in data that are difficult for traditional machine learning algorithms to capture.
Solves a central problem in representation learning by introducing representations that are expressed in terms of other, simpler representations
Representation of Data
Refers to the way data is structured and encoded so that it can be processed by machine learning algorithms. The choice of data representation can significantly impact the performance and efficiency of a machine learning model.
Feature
Each piece of information included in the representation of an observation
Representation Learning
Use of machine learning to discover not only the mapping from representation to output but also the representation itself
Autoencoder
Quintessential example of representation learning
Combination of an encoder function, which converts the input data into a different representation, and a decoder function, which converts the new representation back into the original format
Trained to preserve as much information as possible when an input is run through the encoder and then the decoder, but also trained to make the new representation have various nice properties
Encoder
Converts the input data into a different representation
Decoder
Converts the new representation (encoded) back into the original format
Factors of Variation
Concepts or abstractions that help us make sense of the rich variability in the data
Multilayer Perceptron (MLP)
A type of artificial neural network that consists of multiple layers of interconnected neurons. Each neuron takes a weighted sum of its inputs, applies an activation function, and passes the result to the next layer.
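The neuron computation described above can be sketched in plain Python. The function names, the example weights, the bias term, and the choice of a sigmoid activation are all assumptions made for illustration; the card does not prescribe them.

```python
import math

def sigmoid(z):
    """Logistic activation, chosen here only for illustration."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def layer(inputs, weight_rows, biases):
    """One layer: every neuron sees the same inputs; its outputs feed the next layer."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

x = [0.5, -1.0]                                              # input layer values
hidden = layer(x, [[1.0, 2.0], [-1.0, 0.5]], [0.0, 0.1])     # hidden layer with 2 neurons
output = layer(hidden, [[1.0, -1.0]], [0.0])                 # output layer with 1 neuron
```

Stacking calls to `layer` is what makes the network "multilayer": each layer's output list becomes the next layer's input list.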
Input/Visible Layer
First layer of neural network that contains the variables we are able to observe
Hidden Layer
Layers of a neural network that are neither the first (input) nor the last (output) layer. They extract increasingly abstract features from the data. Their values are not given in the data; instead, the model must determine which concepts are useful for explaining the relationships in the observed data.
Adaptive Linear Element (ADALINE)
A type of single-layer artificial neural network used for linear regression and classification tasks. It is similar to the perceptron but is trained with the least mean squares (LMS) algorithm, which adjusts the weights in proportion to the error of the linear output, before any threshold is applied.
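A minimal sketch of the LMS (Widrow-Hoff) update, assuming a single linear unit with no bias term and a fixed learning rate. The function name `lms_train`, the toy data (targets equal to twice the input), and the hyperparameters are hypothetical and chosen only so the result is easy to check.

```python
def lms_train(samples, n_features, learning_rate=0.05, epochs=50):
    """LMS rule: nudge each weight in proportion to the error of the linear output."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for inputs, target in samples:
            # Linear output, computed before any thresholding
            output = sum(w * x for w, x in zip(weights, inputs))
            error = target - output
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
    return weights

# Toy data generated by target = 2 * input
samples = [([1.0], 2.0), ([2.0], 4.0), ([3.0], 6.0)]
weights = lms_train(samples, n_features=1)
```

On this noise-free data the single weight converges toward 2, the slope that generated the targets.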
Rectified Linear Unit (ReLU)
A popular activation function used in artificial neural networks. It introduces non-linearity into the model, allowing it to learn complex patterns.
How does it work?
- If the input (x) is positive, the output is the input itself.
- If the input is negative, the output is zero.
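The two rules above amount to f(x) = max(0, x). A minimal sketch in Python, where the function name `relu` is just an illustrative choice:

```python
def relu(x):
    """Return x when it is positive, otherwise zero."""
    return max(0.0, x)

activations = [relu(v) for v in [-2.0, -0.5, 0.0, 1.5, 3.0]]
```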
Linear Algebra
Branch of mathematics that deals with the study of vectors, matrices, and linear transformations. It provides a framework for solving systems of linear equations and analyzing the properties of linear relationships between variables.
Scalar
A single number
Written in italics with lowercase variable names
Can be thought of as a matrix with a single entry
Vector
A 1-D array of ordered scalars
(each number has a specific location in the array)
Written in lowercase names with bold typeface
Can be thought of as matrices that contain only one column
Matrix
2-D array of scalars
Written in uppercase names with bold typeface
Tensor
N-D array of scalars
Written in uppercase names with bold-tensor typeface
bold-tensor typeface is slightly different than our traditional bold typeface
Matrix Operation: Transpose
Taking the mirror image of a matrix across the main diagonal
Main Diagonal
Diagonal line on a matrix running down to the right, starting from its upper left corner.
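Transpose and the main diagonal can be sketched with nested Python lists standing in for matrices. The helper names `transpose` and `main_diagonal` are illustrative, not from any library.

```python
def transpose(matrix):
    """Mirror across the main diagonal: entry (i, j) moves to (j, i)."""
    return [list(column) for column in zip(*matrix)]

def main_diagonal(matrix):
    """Entries (0, 0), (1, 1), ... running down and to the right."""
    size = min(len(matrix), len(matrix[0]))
    return [matrix[i][i] for i in range(size)]

A = [[1, 2, 3],
     [4, 5, 6]]
A_T = transpose(A)  # a 2 x 3 matrix becomes 3 x 2
```

Note that the main diagonal is unchanged by the transpose, since its entries sit on the mirror line.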
Broadcasting
The implicit copying of a scalar or vector to many locations when performing a matrix operation, for example adding the same vector to every row of a matrix
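A sketch of what broadcasting does behind the scenes, written out explicitly in plain Python. The helper `add_broadcast` is hypothetical; array libraries such as NumPy perform this copying implicitly.

```python
def add_broadcast(matrix, b):
    """Add b to a matrix: a scalar is reused for every entry, a vector for every row."""
    if isinstance(b, (int, float)):
        return [[a + b for a in row] for row in matrix]
    return [[a + b_j for a, b_j in zip(row, b)] for row in matrix]

M = [[1, 2],
     [3, 4]]
plus_scalar = add_broadcast(M, 10)        # 10 is copied to all four positions
plus_vector = add_broadcast(M, [10, 20])  # [10, 20] is copied to both rows
```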
Turing Test
A test of a machine’s ability to exhibit intelligent behavior indistinguishable from that of a human. It was proposed by Alan Turing in 1950.
Control Theory
Deals with designing devices that act optimally on the basis of feedback from the environment.
4 Approaches to AI
- Acting Humanly: The Turing Test approach
- Thinking Humanly: The cognitive modeling approach
- Thinking Rationally: The “laws of thought” approach
- Acting Rationally: the rational agent approach
Agent
Anything that can be viewed as perceiving its environment and acting upon that environment.
Perceives through sensors
Acts through actuators
Agent = architecture + program
Rational Agent
An Agent that acts so as to achieve the best outcome, or, when there is uncertainty, the best expected outcome
Rational != Perfect
Rationality maximizes expected performance, while perfection maximizes actual performance
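A small sketch of choosing the action with the best expected outcome. The action names, probabilities, and payoffs are made up for illustration, as are the helper names.

```python
def expected_value(outcomes):
    """Probability-weighted average over (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in outcomes)

def rational_choice(actions):
    """Pick the action whose expected payoff is highest."""
    return max(actions, key=lambda name: expected_value(actions[name]))

actions = {
    "safe": [(1.0, 5.0)],                 # always pays 5
    "risky": [(0.5, 20.0), (0.5, -4.0)],  # pays 20 or loses 4 with equal chance
}
best = rational_choice(actions)
```

Here the rational choice is "risky" (expected payoff 8 versus 5), even though half the time it ends up worse than "safe". That gap between expected and actual performance is the difference between rational and perfect.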
Standard Model
General framework for representing and analyzing systems
The focus on the study and construction of agents that “do the right thing”
What is the “right thing”? -> Defined by the standard model
Control Theory: controller minimizes a cost function
Operations Research: policy maximizes a sum of rewards
Statistics: Decision rule minimizes a loss function
Economics: Where a decision maker maximizes utility or some measure of social welfare
Value Alignment Problem
The values or objectives put into the machine must be aligned with those of the human
Behaviors are not “unintelligent” or “insane”; they are a logical consequence of defining winning as the sole objective for a machine
Algorithm
A step-by-step set of instructions or rules to be followed to solve a specific problem or achieve a particular outcome. It’s like a recipe for a computer, providing a precise sequence of actions to perform
Incompleteness Theorem
States that any consistent formal system that is powerful enough to express basic arithmetic statements is necessarily incomplete. In other words, there will always be true statements within the system that cannot be proven or disproven using the axioms and rules of inference within that system.
Computability
Capable of being computed by an effective procedure
Tractability
Refers to the property of a problem that can be solved by an algorithm in a reasonable amount of time. In other words, a tractable problem is one that can be efficiently solved using a computer.
Intractable
Time required to solve instances of the problem grows exponentially with the size of the instances
NP-Completeness
A concept in theoretical computer science that refers to a class of decision problems that are considered to be among the most difficult in NP. A proposed solution can be verified in polynomial time, but no polynomial-time algorithm for finding a solution is known, and every other problem in NP can be reduced to an NP-complete problem.
Decision Theory
A field of study that deals with making rational choices in the face of uncertainty. It provides a framework for analyzing and making decisions when there are multiple possible outcomes and associated probabilities.
Game Theory
A branch of mathematics and economics that studies strategic decision-making among rational agents. It analyzes situations where the outcome for one agent depends on the choices made by other agents.
Multiagent Systems
Systems composed of multiple autonomous agents that interact with each other and their environment. The agents may cooperate toward a common goal or pursue competing goals. These agents can be software programs, robots, or even humans.
Operations Research
A field of study that applies mathematical and analytical techniques to solve complex problems that arise in business, industry, and other organizations. It focuses on optimizing systems, processes, and decisions to improve efficiency and effectiveness.
Singularity
A hypothetical future point in time when technological growth becomes uncontrollable and irreversible, resulting in unforeseeable changes to human civilization. It is often associated with the development of artificial intelligence (AI) that surpasses human intelligence in all aspects.
Hebbian Learning
A principle in neuroscience that states that neurons that fire together wire together. This means that if two neurons are frequently activated simultaneously, the connection between them is strengthened. Conversely, if two neurons are rarely activated simultaneously, the connection between them weakens.
Matrix Operation: Dot Product
C=AB
If A is of shape m x n and B is of shape n x p, then C is of shape m x p
A must have the same number of columns as B has rows
A = (2x3) B = (3x2) C = (2x2)
A = [[1, 2, 3],
[4, 5, 6]]
B = [[7, 8],
[9, 10],
[11, 12]]
Dot product of A and B = C:
C = [[1*7 + 2*9 + 3*11, 1*8 + 2*10 + 3*12],
[4*7 + 5*9 + 6*11, 4*8 + 5*10 + 6*12]]
C = [[58, 64],
[139, 154]]
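The worked example above can be checked with a short plain-Python sketch. The function name `matmul` is illustrative; each entry C[i][j] is the sum of products of row i of A with column j of B.

```python
def matmul(A, B):
    """Matrix product of an m x n matrix A and an n x p matrix B."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("A must have as many columns as B has rows")
    p = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[1, 2, 3],
     [4, 5, 6]]
B = [[7, 8],
     [9, 10],
     [11, 12]]
C = matmul(A, B)  # shape 2 x 2
```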
Matrix Operation: Element-wise Product
Matrix A and Matrix B must be of same shape. Then you multiply the corresponding elements in each matrix for result.
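A minimal sketch of the element-wise product on nested lists, with a hypothetical helper name and made-up example values:

```python
def elementwise_product(A, B):
    """Multiply corresponding entries of two same-shaped matrices."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("A and B must have the same shape")
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]
H = elementwise_product(A, B)
```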
Matrix Operation: Scalar Product
Multiplying every value of a matrix by a scalar constant
Resulting matrix is the same shape and output is original matrix with each element multiplied by constant scalar
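As a sketch, scalar multiplication on a nested-list matrix looks like this (the helper name `scale` and the values are illustrative):

```python
def scale(matrix, c):
    """Multiply every entry by the scalar c; the shape is unchanged."""
    return [[c * a for a in row] for row in matrix]

A = [[1, 2, 3],
     [4, 5, 6]]
scaled = scale(A, 3)
```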
Matrix Notation: Rows and Columns
m x n
m = # of rows
n = # of columns
System of Linear Equations
A collection of two or more linear equations with the same variables. The goal is to find values for the variables that satisfy all of the equations simultaneously.
With two variables, each equation represents a straight line on a graph, and the solution to the system is the point(s) where these lines intersect.
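A small worked sketch for two equations in two unknowns, solved with Cramer's rule. The rule is a swapped-in technique chosen for brevity, and the function name and the example system (2x + y = 5, x - y = 1) are made up for illustration.

```python
from fractions import Fraction

def solve_2x2(a, b, c, d, e, f):
    """Solve a*x + b*y = e and c*x + d*y = f for integer coefficients."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("no unique solution: the lines are parallel or identical")
    x = Fraction(e * d - b * f, det)
    y = Fraction(a * f - e * c, det)
    return x, y

# 2x + y = 5 and x - y = 1 intersect at a single point
x, y = solve_2x2(2, 1, 1, -1, 5, 1)
```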
Identity Matrix
I
Matrix that does not change any vector when we multiply that vector by that matrix
The structure of the identity matrix is simple: all the entries along the main diagonal are 1, while all other entries are zero
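A quick sketch that builds an identity matrix and checks that it leaves a vector unchanged. The helper names and the example vector are illustrative.

```python
def identity(n):
    """n x n matrix with 1 on the main diagonal and 0 elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matvec(M, v):
    """Multiply matrix M by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

v = [3, -1, 4]
unchanged = matvec(identity(3), v)
```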
Matrix Operation: Inversion
Produces the Inverse Matrix
The inverted matrix is another matrix that, when multiplied by our original matrix, produces the identity matrix.
A * B = B * A = I, where B is the inverse of A (written A^-1)
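A sketch of inversion for the 2 x 2 case only, using the closed-form formula and exact fractions so the check against the identity matrix is not blurred by rounding. The helper names and the example matrix are assumptions for illustration; integer entries are assumed.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of [[a, b], [c, d]] is (1 / (a*d - b*c)) * [[d, -b], [-c, a]]."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix has no inverse")
    factor = Fraction(1, det)
    return [[d * factor, -b * factor],
            [-c * factor, a * factor]]

def matmul(A, B):
    """Plain matrix product, used here only to verify the inverse."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[4, 7],
     [2, 6]]
B = inverse_2x2(A)
I = [[1, 0],
     [0, 1]]
```

Multiplying in either order, `matmul(A, B)` or `matmul(B, A)`, gives back `I`.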
Linear Algebra: Origin
The point specified by the vector of all zeros
Linear Combination of vectors
Produced by multiplying each vector, in a list of vectors (matrix), by a corresponding scalar coefficient and then adding the results.
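A minimal sketch of a linear combination, with vectors as Python lists. The function name and the example vectors and coefficients are illustrative.

```python
def linear_combination(vectors, coefficients):
    """Scale each vector by its coefficient, then add the results entry by entry."""
    length = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coefficients, vectors))
            for i in range(length)]

v1 = [1, 2]
v2 = [3, 4]
combo = linear_combination([v1, v2], [2, 1])  # 2*v1 + 1*v2
```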
Matrix: Span
The set of all points obtainable by linear combination of the original vectors
Column Span/ Range
Refers to the set of all linear combinations of the columns of a matrix. It is a subspace of the vector space containing the columns.
Linear Dependence
Refers to a relationship between a set of vectors where one vector can be expressed as a linear combination of the others.
In other words, if you can find a set of scalars (coefficients), not all zero, such that a linear combination of the vectors equals the zero vector, then the vectors are linearly dependent.
For example, if you can multiply 2 columns by scalar constants and add them together to get another column, those columns are linearly dependent.
Linear Independence
True of a set of vectors if no vector in the set is a linear combination of the other vectors
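One way to test this in code is to count pivots with Gaussian elimination: the columns are independent exactly when every column gets a pivot. This is a sketch using exact fractions, and the helper names `rank` and `columns_independent` are hypothetical.

```python
from fractions import Fraction

def rank(matrix):
    """Number of pivots found by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # index of the next pivot row
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def columns_independent(matrix):
    """True when no column is a linear combination of the others."""
    return rank(matrix) == len(matrix[0])

dependent = [[1, 2],
             [2, 4]]   # second column is 2 times the first
independent = [[1, 0],
               [0, 1]]
```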
Matrix Attribute: Square
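m = n (the matrix has as many rows as columns)
For a matrix to have an inverse, it must be square and all of its columns must be linearly independent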
Square Matrix Attribute: Singular
A square matrix with linearly dependent columns. A singular matrix has no inverse.
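A square matrix is singular exactly when its columns are linearly dependent, which is equivalent to its determinant being zero. The sketch below uses that determinant test; the recursive cofactor expansion is only practical for small matrices, and the helper names are illustrative.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def is_singular(M):
    """True for a square matrix whose columns are linearly dependent."""
    if any(len(row) != len(M) for row in M):
        raise ValueError("only square matrices can be singular")
    return det(M) == 0

singular = [[1, 2],
            [2, 4]]   # second column is 2 times the first
invertible = [[1, 2],
              [3, 4]]
```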