Deep Learning Flashcards
Neural Networks
Interconnected nodes, or “neurons,” organized in layers. Neurons are connected by weighted links that are adjusted during training. Can learn complex patterns and relationships in data.
Inspired by animal brains.
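A minimal sketch of one artificial neuron (NumPy, with made-up input and weight values): the output is just an activation function applied to a weighted sum.

```python
import numpy as np

# One artificial neuron: a weighted sum of inputs plus a bias, passed
# through a nonlinearity. The weights and bias are the values adjusted
# during training; the numbers below are arbitrary stand-ins.
x = np.array([0.5, -1.2, 3.0])   # inputs
w = np.array([0.8, 0.1, -0.4])   # connection weights
b = 0.2                          # bias

z = np.dot(w, x) + b             # weighted sum
output = 1 / (1 + np.exp(-z))    # sigmoid activation squashes z into (0, 1)
print(output)
```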
Layers of a Neural Network
- Input Layer - receives initial data
- Hidden layer(s) - process the data
- Output Layer - produces the final result
Simple neural networks may only have a few layers.
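A toy forward pass (NumPy, random stand-ins for learned weights) shows how data flows through the three kinds of layers:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0, z)

# Input layer: 4 features; hidden layer: 3 neurons; output layer: 1 neuron.
# The weight matrices here are random placeholders for trained values.
x = rng.normal(size=4)                        # input layer: receives raw data
W1, b1 = rng.normal(size=(3, 4)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)

h = relu(W1 @ x + b1)                         # hidden layer: processes the data
y = W2 @ h + b2                               # output layer: final result
print(y)
```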
Deep Learning
Uses neural networks with many layers to progressively extract higher-level features from raw input. Can be supervised, unsupervised, or semi-supervised. Automatically “understands” the data by breaking it down and progressively building up a hierarchy of learned features, from simple patterns to complex, abstract concepts.
Range from a few to thousands of layers.
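In a rough PyTorch sketch, “deep” simply means more hidden layers stacked between input and output; the layer sizes below are arbitrary, for illustration only.

```python
import torch.nn as nn

# Each layer can build on the features the previous one extracted.
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),   # early layers: simple patterns
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),    # deeper layers: more abstract features
    nn.Linear(64, 10),                # output: e.g., 10 class scores
)
```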
Computational Power for Deep Learning
Requires significant computational power, often necessitating the use of GPUs and distributed computing, where multiple computers work together to solve a common problem or perform complex tasks.
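A minimal PyTorch sketch of putting a model and its data on a GPU when one is available:

```python
import torch

# Train on a GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(10, 2).to(device)  # move the model's parameters
x = torch.randn(8, 10, device=device)      # keep the data on the same device
y = model(x)
print(y.shape)                             # torch.Size([8, 2])
```

Distributed training across multiple machines typically layers a wrapper such as torch.nn.parallel.DistributedDataParallel on top of this same pattern.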
Where Deep Learning Excels
Automatically learning representations of data with multiple levels of abstraction. As data passes through the network, learned representations become more abstract and closer to the high-level concepts we care about. This allows the model to understand complex data and handle a wide range of data types and tasks efficiently. It also lets the network learn features by itself, without domain-specific feature engineering.
Example: in an image, first detect edges or colors, then corners …
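A stacked-convolution sketch (PyTorch, arbitrary channel sizes) mirrors that hierarchy; in practice, early convolutional layers tend to learn edge- and color-like filters, while deeper ones respond to corners, textures, and object parts.

```python
import torch
import torch.nn as nn

features = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),   # edges, colors
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),  # corners, textures
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),  # object parts
)

img = torch.randn(1, 3, 32, 32)   # one fake RGB image
print(features(img).shape)        # torch.Size([1, 64, 32, 32])
```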
Examples of Deep Learning
- Self-driving cars
- Virtual assistants (Siri, Alexa)
- Recommendation systems
- Medical image analysis
- Game-playing AI
Transformers
Type of Deep Learning architecture that has revolutionized AI. Can process entire sequences in parallel, speeding up training and inference. Particularly good at sequences like language or time series data.
Introduced in the seminal paper “Attention Is All You Need” by researchers at Google in 2017.
How Transformers Work
- Take a series of inputs, e.g., words in a sentence
- Use “attention” to focus on different parts of the sequence at once and capture how they relate.
- Process the entire sequence in parallel → faster. E.g., reading an entire paragraph at once.
- Have multiple layers, each adding more depth. The first layers might capture simple patterns like word order, while deeper ones understand complex relationships like grammar or meaning.
- By the end, the Transformer has a rich understanding of the input data.
- Once the Transformer understands the input, it can produce outputs like a translation or the answer to a question (see the sketch after this list).
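A minimal sketch using PyTorch’s built-in encoder (toy dimensions, untrained weights) shows a whole sequence being processed at once by a stack of attention layers:

```python
import torch
import torch.nn as nn

# The whole sequence is fed in at once; attention inside each layer lets
# every position look at every other position. Dimensions are illustrative.
encoder_layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=3)  # stacked layers

tokens = torch.randn(1, 10, 64)   # batch of 1 sequence: 10 tokens, 64-dim embeddings
out = encoder(tokens)             # all 10 positions processed in parallel
print(out.shape)                  # torch.Size([1, 10, 64])
```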
Attention
Focusing on different parts of the sequence at once. Helps the model decide which words in a sentence are important and how they relate to each other, even if they are far apart.
Self-Attention
Allows the model to weigh the importance of different words in a sentence relative to each other, enabling it to capture contextual relationships. Each word is transformed into a vector, and the self-attention mechanism calculates attention scores to determine the relevance of each word to the others.
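A toy NumPy sketch of scaled dot-product self-attention; the projection matrices below are random stand-ins for what a real model would learn during training.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
n, d = 5, 8                    # 5 words, 8-dim vectors (toy sizes)
X = rng.normal(size=(n, d))    # one vector per word

# Learned projections (random here) produce queries, keys, and values.
Wq = rng.normal(size=(d, d))
Wk = rng.normal(size=(d, d))
Wv = rng.normal(size=(d, d))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

scores = Q @ K.T / np.sqrt(d)  # relevance of every word to every other word
weights = softmax(scores)      # rows sum to 1: the attention scores
out = weights @ V              # each word becomes a weighted mix of values
print(weights.round(2))
```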
Diffusers
Gradually add noise to data, such as an image, and then learn to reverse this process to “denoise” the data. Generation starts from random noise and progressively refines it into a clear, structured output, like an image or sound.
Examples: DALL-E, Stable Diffusion
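A NumPy sketch of just the forward (noising) half of the process, with an illustrative noise schedule; the reverse half is a trained denoiser, described only in the comments here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Forward (noising) process: blend the clean signal with Gaussian noise,
# a little more at every step. Training teaches a network to undo one
# step at a time; generation runs that denoiser from pure noise backward.
x0 = rng.normal(size=(8, 8))             # stand-in for a clean image
alpha_bar = np.linspace(0.99, 0.01, 10)  # fraction of signal surviving each step

for ab in alpha_bar:
    noise = rng.normal(size=x0.shape)
    xt = np.sqrt(ab) * x0 + np.sqrt(1 - ab) * noise  # noisy version at this step
# By the last step, xt is nearly pure noise; a trained denoiser would
# reverse this chain step by step to generate a new sample.
```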
Transformers vs. Diffusers
Transformer models excel at text-based tasks involving sequential data, like translation, chatbots, and content generation, while diffuser models are generally used for generative tasks like producing images, video, or even sound, where you’re creating new data from scratch.
Parameters
Internal variables that determine how an AI model processes input data and generates output. Learned from training data. Act as the “knobs and dials” that the AI model adjusts during training to minimize the difference between its predictions and actual values. Can be tailored and tuned for specific applications.
Examples: weights, biases, and scaling factors.
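A quick PyTorch illustration: every entry of every weight matrix and bias vector counts as one trainable parameter (layer sizes arbitrary).

```python
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Every weight and bias entry is one trainable "knob".
total = sum(p.numel() for p in model.parameters())
print(total)   # 784*256 + 256 + 256*10 + 10 = 203,530
```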
Hyperparameters
External settings that influence the model’s learning process and architecture. Chosen before training rather than learned from data; examples include the learning rate, batch size, and number of layers.
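A small PyTorch sketch separating the two: the values below are hyperparameters chosen by hand, while model.parameters() holds the learned parameters (all numbers illustrative).

```python
import torch
import torch.nn as nn

# Hyperparameters: set before training, never learned.
hidden_size = 128      # architecture choice
learning_rate = 1e-3   # controls the size of each training update
batch_size = 32        # how many examples per update

model = nn.Sequential(nn.Linear(20, hidden_size), nn.ReLU(),
                      nn.Linear(hidden_size, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
# model.parameters() are learned during training;
# hidden_size, learning_rate, and batch_size are not.
```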
Weights
Parameters that determine the strength of connections between neurons. Learned by the model during training.
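In PyTorch terms, a fully connected layer stores one weight per connection and one bias per neuron:

```python
import torch.nn as nn

layer = nn.Linear(3, 2)    # 3 inputs fully connected to 2 neurons
print(layer.weight.shape)  # torch.Size([2, 3]): one weight per connection
print(layer.bias.shape)    # torch.Size([2]): one bias per neuron
```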