Introduction to Apache Spark and Scala Programming Flashcards

1
Q

What is Spark?

A

Spark is a distributed execution engine that performs fast computations on large datasets.

2
Q

True or false: Spark offers a big data storage solution.

A

False. Spark is a computation engine and provides no storage layer of its own.

3
Q

What does Spark focus on?

A

Fast computation.

4
Q

What is Spark’s computational method?

A

Spark replaces Hadoop’s implementation of MapReduce with its own implementation, which keeps intermediate data in memory where possible.
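
As a reminder of what the MapReduce model looks like, here is a minimal word-count sketch. It uses plain Scala collections as a stand-in for distributed data and is not Spark's or Hadoop's API; the names `WordCountSketch`, `mapPhase`, `reducePhase` and `wordCount` are hypothetical and chosen only for illustration.

```scala
object WordCountSketch {
  // Map phase: split each line into words and emit a (word, 1) pair per word.
  def mapPhase(lines: Seq[String]): Seq[(String, Int)] =
    lines
      .flatMap(_.split("\\s+"))
      .filter(_.nonEmpty)
      .map(word => (word.toLowerCase, 1))

  // Reduce phase: group the pairs by word and sum the counts in each group.
  def reducePhase(pairs: Seq[(String, Int)]): Map[String, Int] =
    pairs.groupBy(_._1).map { case (word, group) => (word, group.map(_._2).sum) }

  // The full job is the reduce phase applied to the output of the map phase.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    reducePhase(mapPhase(lines))
}
```

Calling `WordCountSketch.wordCount(Seq("spark is fast", "Spark is big"))` should count "spark" and "is" twice each. In a real cluster the two phases would run in parallel across many machines; this sketch only shows the shape of the computation.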

5
Q

What storage does Hadoop have and what does Spark have?

A

Hadoop has the Hadoop Distributed File System (HDFS), while Spark has no storage layer of its own.

6
Q

What MapReduce does Hadoop have and what does Spark have?

A

Hadoop has a built-in MapReduce implementation, and Spark has its own optimized built-in implementation.

7
Q

What speed does Hadoop have and what does Spark have?

A

Hadoop is considered fast, but Spark is often cited as 10-100 times faster, depending on the workload.

8
Q

What resource management does Hadoop have and what does Spark have?

A

Hadoop has YARN, and Spark has its own standalone cluster manager.

9
Q

How does Spark achieve fault tolerance?

A

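Through Resilient Distributed Datasets (RDDs), which can recompute lost partitions from the recorded sequence of transformations (their lineage).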
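
RDDs are generally described as recovering lost data by replaying the transformations that produced it, rather than by replicating the data. The toy sketch below illustrates that idea only. It is not Spark's RDD implementation, and `LineageSketch`, `ToyDataset`, `Source` and `Mapped` are hypothetical names introduced for this example.

```scala
object LineageSketch {
  // A toy dataset that stores the recipe for its data rather than the data itself.
  sealed trait ToyDataset[A] {
    // Rebuild the records by replaying the lineage from the source.
    def compute(): Seq[A]

    // Derive a new dataset; nothing is computed until compute() is called.
    def map[B](f: A => B): ToyDataset[B] = Mapped(this, f)
  }

  // The root of a lineage: knows how to (re)load the original records.
  final case class Source[A](load: () => Seq[A]) extends ToyDataset[A] {
    def compute(): Seq[A] = load()
  }

  // A derived dataset: remembers its parent and the function applied to it.
  final case class Mapped[A, B](parent: ToyDataset[A], f: A => B) extends ToyDataset[B] {
    def compute(): Seq[B] = parent.compute().map(f)
  }
}
```

For example, `LineageSketch.Source(() => Seq(1, 2, 3)).map(n => n * n)` holds no results of its own. If a computed result were lost, calling `compute()` again would rebuild it from the source, which is the recovery idea in miniature.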

10
Q

What is Scala?

A

Scala (short for “Scalable Language”) is an object-oriented and functional programming language.
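
A minimal sketch of how the two styles can sit side by side in Scala. The names `ScalaStyleSketch`, `Rectangle` and `totalArea` are hypothetical and exist only for this illustration.

```scala
object ScalaStyleSketch {
  // Object-oriented side: a class bundles data with behaviour.
  final case class Rectangle(width: Double, height: Double) {
    def area: Double = width * height
  }

  // Functional side: functions are passed as values and the list is never mutated.
  def totalArea(shapes: List[Rectangle]): Double =
    shapes.map(_.area).foldLeft(0.0)(_ + _)
}
```

Here `Rectangle` is an ordinary class with a method, while `totalArea` is written as a chain of higher-order functions (`map`, `foldLeft`) over an immutable list.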

11
Q

Why does Spark use Scala?

A

Scala is the preferred language for writing Spark programs because it runs on the JVM, which makes interaction with Hadoop easier.

12
Q

What message broker model does Kafka use?

A