P3 - Processing Power & Scalability Flashcards

1
Q

Why Do Chatbots Need High Processing Power?

A

Training large language models involves billions of parameters and vast datasets, while deployment requires real-time processing of user queries.

2
Q

What type of processor is best suited for small-scale AI models?

A

CPUs (Central Processing Units) are general-purpose processors that are cost-effective for small-scale models but become inefficient for larger ones.

3
Q

Why are GPUs (Graphics Processing Units) ideal for training large AI models?

A

GPUs are designed for parallel processing, which makes them ideal for training large models and handling multiple computations simultaneously.

4
Q

What are TPUs (Tensor Processing Units) and why are they significant in AI?

A

TPUs are specialized hardware developed by Google specifically for machine learning tasks, offering high efficiency for both training and inference at scale.

5
Q

What scalability challenges do AI workloads face as chatbot usage grows?

A

As chatbot usage grows, more computational power is needed to handle larger datasets and user bases, while high hardware costs and energy consumption become limiting factors.

6
Q

What are the cost vs. performance trade-offs for small-scale AI implementations?

A

For small-scale implementations, cost-effective options like CPUs or cloud-based GPU instances are preferable.

7
Q

How do large-scale AI implementations differ in hardware investment?

A

Large-scale implementations often require investments in TPUs or dedicated data centers to ensure scalability and efficient processing of extensive workloads.

8
Q

What are some key CPU options for AI workloads, and what features enhance their performance?

A

CPUs like Intel Xeon and AMD Threadripper are common. Intel Xeon processors include an AI engine in each core and benefit from faster memory, more cores, and a larger last-level cache, which together can reduce LLM latency by up to 5x compared to default PyTorch.
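
As an illustrative sketch (the thread count and layer sizes below are assumptions, not from the card), two of these CPU-side levers can be exercised directly in PyTorch:

    import torch

    # Match intra-op parallelism to the physical core count (8 is an example).
    torch.set_num_threads(8)

    model = torch.nn.Linear(4096, 4096)   # stand-in for a single LLM layer
    x = torch.randn(32, 4096)

    # bfloat16 autocast can engage CPU matrix engines (e.g. Intel AMX) where present.
    with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
        y = model(x)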

9
Q

What is the von Neumann bottleneck in CPUs?

A

It refers to the limitation where memory access speed is much slower than the CPU’s computation speed, which can restrict overall throughput for data-intensive machine learning tasks.
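
A rough NumPy experiment (sizes and the benchmarking helper are illustrative assumptions) makes the bottleneck visible: an elementwise add performs about one operation per element fetched from memory, while a matrix multiply performs about 2n, so the add saturates memory bandwidth long before it saturates the arithmetic units:

    import time
    import numpy as np

    n = 2048
    a = np.random.rand(n, n).astype(np.float32)
    b = np.random.rand(n, n).astype(np.float32)

    def best_time(fn, repeats=5):
        # Best-of-N wall-clock time, to reduce timing noise.
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn()
            best = min(best, time.perf_counter() - start)
        return best

    t_add = best_time(lambda: a + b)   # ~1 FLOP per element touched: memory-bound
    t_mm = best_time(lambda: a @ b)    # ~2n FLOPs per element touched: compute-bound

    print(f"add:    {n * n / t_add / 1e9:6.1f} GFLOP/s")
    print(f"matmul: {2 * n**3 / t_mm / 1e9:6.1f} GFLOP/s")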

10
Q

Why are GPUs, such as the NVIDIA RTX 4090 and A100, preferred for large-scale AI training?

A

GPUs are designed for massively parallel processing, making them excellent for matrix operations and training models (like Transformers) that require simultaneous computations. They provide significant speed-ups over CPUs for these tasks.
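
As a minimal sketch (the shapes and batch size are assumptions), the same PyTorch batched matrix multiply runs unchanged on CPU or GPU; on a CUDA device such as an RTX 4090 or A100, the massively parallel hardware provides the speed-up:

    import torch

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # A batch of attention-sized matrix multiplies, the core Transformer workload.
    a = torch.randn(16, 1024, 1024, device=device)
    b = torch.randn(16, 1024, 1024, device=device)

    c = a @ b                       # batched matmul, spread across thousands of GPU cores
    if device.type == "cuda":
        torch.cuda.synchronize()    # kernels run asynchronously; wait for completion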

11
Q

What are some challenges associated with using GPUs?

A

GPUs come with high upfront costs, consume a lot of energy, and require complex cooling (often needing water cooling). They’re also specialized hardware, so compatibility with smaller systems can be limited.

12
Q

What makes TPUs uniquely suited for AI workloads, especially for Transformers?

A

TPUs (Tensor Processing Units) are specialized for matrix processing. They use a systolic array architecture—like Google’s Cloud TPU v3 with a 128×128 grid of ALUs—to perform fast, highly parallel matrix multiplications, which are core to Transformer attention and backpropagation.
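
To make the systolic idea concrete, here is a small Python simulation (an illustrative sketch, not Google's actual design): operands stream in skewed so that A[i, s] and B[s, j] meet at cell (i, j) at clock tick t = i + j + s, and each cell performs one multiply-accumulate per tick:

    import numpy as np

    def systolic_matmul(A, B):
        """Simulate an output-stationary systolic array computing C = A @ B.

        Rows of A stream in from the left and columns of B from the top,
        each skewed by one tick per row/column so that matching operands
        meet at cell (i, j) in lock-step."""
        n, k = A.shape
        _, m = B.shape
        C = np.zeros((n, m))
        for t in range(n + m + k - 2):                    # global clock ticks
            for i in range(n):
                for j in range(m):
                    s = t - i - j                         # operand pair reaching (i, j) now
                    if 0 <= s < k:
                        C[i, j] += A[i, s] * B[s, j]      # one multiply-accumulate per tick
        return C

    A, B = np.random.rand(4, 3), np.random.rand(3, 5)
    assert np.allclose(systolic_matmul(A, B), A @ B)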

13
Q

What is a primary disadvantage of TPUs compared to CPUs?

A

TPUs are not Turing complete; they’re designed exclusively for neural network workloads and cannot perform general-purpose computing tasks like word processing.

14
Q

In what scenarios are CPUs considered more cost-effective despite their slower parallel processing?

A

CPUs are ideal for smaller-scale models or algorithms (e.g., certain time series or RNN/LSTM models) that do not require extensive parallel processing, making them a cost-effective alternative when GPU use isn’t justified.
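
For instance (the sizes below are illustrative assumptions), a small PyTorch LSTM processes its time steps sequentially, so it gains little from a GPU's parallelism and typically runs fine on a CPU:

    import torch

    lstm = torch.nn.LSTM(input_size=16, hidden_size=64, batch_first=True)
    x = torch.randn(8, 100, 16)     # batch of 8 sequences, 100 time steps each
    out, (h, c) = lstm(x)           # recurrent steps execute one after another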

15
Q

How do the hardware options balance cost vs. performance for different AI workloads?

A

For small-scale implementations, cost-effective options like CPUs or cloud-based GPU instances are sufficient. For large-scale deployments—like high-quality chatbots with billions of parameters—investments in TPUs or dedicated data centers are needed to ensure scalability and efficient performance.
