Week 5 - practice quiz Flashcards
Which company created the MapReduce framework as a concept?
1) Amazon
2) Oracle
3) Microsoft
4) Google
4) Google
Which company implemented Hadoop as an open-source version of MapReduce?
1) Google
2) Amazon
3) Microsoft
4) Yahoo
4) Yahoo
Which of the following is true about the Hadoop file system?
1) Files are append-only
2) Files are split into 1 GB blocks
3) Meta node stores metadata
4) Each node stores distinct data blocks
1) Files are append-only
What does HDFS stand for?
1) Highly Distributed File System
2) Highly Disturbed File System
3) High Definition File System
4) Hadoop File System
4) Hadoop File System
Hadoop Distributed File System
What is the data type used by Hadoop for a MapReduce process?
1) Column-based
2) Document-based
3) Graph-based
4) Key-value
4) Key-value
What is the output of the Map function in a MapReduce process?
1) List of graph nodes
2) List of key-value pairs.
3) List of table columns
4) List of network nodes
2) List of key-value pairs.
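A minimal sketch of the answer above, assuming a word-count job. This is plain Python rather than the Hadoop API, and the function name `map_words` is hypothetical. It shows a map function turning one input record into a list of key-value pairs:

```python
def map_words(line):
    """Emit one (word, 1) key-value pair per word in the input line."""
    return [(word.lower(), 1) for word in line.split()]


pairs = map_words("the cat saw the dog")
print(pairs)
# [('the', 1), ('cat', 1), ('saw', 1), ('the', 1), ('dog', 1)]
```

Note that the mapper does not combine the two `the` entries; merging values that share a key happens later.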
Where do mapper nodes save their outputs before serving to reducer nodes?
1) Local disk
2) Another node
3) Central node
4) Master node
1) Local disk
What does Hadoop do with a task that crashes in a node?
1) The task is retried on another node.
2) The node is rebooted.
3) The task is failed.
4) The node is shut down.
1) The task is retried on another node.
Apache Spark orders its data processing operations, such as collect, filter, and sort, by building a graph called a DAG. What does DAG stand for?
1) Derived Apache Graph
2) Distributed Apache Graph
3) Directed Acyclic Graph
4) Distributed Asymmetric Graph
3) Directed Acyclic Graph
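To illustrate what "directed" and "acyclic" buy you, here is a small sketch using Python's standard `graphlib` module (Python 3.9+), not Spark itself. The operation names and the `is_acyclic` helper are hypothetical. Each operation lists the operations it depends on, and a valid execution order exists only if the graph has no cycle:

```python
from graphlib import TopologicalSorter, CycleError

# Each key depends on the operations in its set (edges are directed).
plan = {"filter": {"load"}, "sort": {"filter"}, "collect": {"sort"}}
order = list(TopologicalSorter(plan).static_order())
print(order)  # ['load', 'filter', 'sort', 'collect']


def is_acyclic(graph):
    """Return True if the dependency graph can be ordered, i.e. has no cycle."""
    try:
        list(TopologicalSorter(graph).static_order())
        return True
    except CycleError:
        return False


print(is_acyclic(plan))                        # True
print(is_acyclic({"a": {"b"}, "b": {"a"}}))    # False
```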
Which of the following statements about the difference between Hadoop and Spark is true?
1) Hadoop supports in-memory cluster computing.
2) Hadoop is faster than Spark.
3) Both Hadoop and Spark can load data from the Hadoop Distributed File System (HDFS).
4) Hadoop provides multiple built-in data processing operations such as filter and join.
3) Both Hadoop and Spark can load data from the Hadoop Distributed File System (HDFS).
What is the input for the Reduce function in a MapReduce process?
1) Keys and their corresponding list of values.
2) Keys and their corresponding maps.
3) Keys and their corresponding nodes.
4) Maps and their corresponding values.
1) Keys and their corresponding list of values.
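A rough sketch of how mapper output becomes reducer input, again in plain Python rather than the Hadoop API (the `shuffle` helper is a hypothetical name). Grouping the key-value pairs by key yields each key with its corresponding list of values:

```python
from collections import defaultdict


def shuffle(pairs):
    """Group mapper output by key so each key maps to the list of its values."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


mapped = [("the", 1), ("cat", 1), ("the", 1)]
reduce_input = shuffle(mapped)
print(reduce_input)  # {'the': [1, 1], 'cat': [1]}
```

A reducer would then be called once per key, for example with `"the"` and `[1, 1]`.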
What is the output of the Reduce function in a MapReduce process?
1) List of key-value pairs
2) List of key-node pairs.
3) List of key-reducer pairs.
4) List of key-mapper pairs.
1) List of key-value pairs
Which of the following is the correct sequence of phases in a MapReduce process?
1) Input, Splitting, Shuffling, Mapping, Reducing, Output
2) Input, Splitting, Mapping, Reducing, Shuffling, Output
3) Input, Splitting, Mapping, Shuffling, Reducing, Output
4) Input, Mapping, Splitting, Shuffling, Reducing, Output
3) Input, Splitting, Mapping, Shuffling, Reducing, Output
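The phase order in the answer above can be sketched as a single-process word count. This is an illustration under simplifying assumptions, not how Hadoop is implemented: all function names are hypothetical, splits are taken per line, and everything runs in one process instead of across nodes.

```python
from collections import defaultdict


def split_input(text):
    # Splitting: one split per line (an assumption made for this sketch).
    return text.splitlines()


def map_phase(split):
    # Mapping: emit a (word, 1) pair for every word in the split.
    return [(word, 1) for word in split.split()]


def shuffle_phase(mapped):
    # Shuffling: group all values that share a key.
    grouped = defaultdict(list)
    for key, value in mapped:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reducing: collapse each key's list of values into one key-value pair.
    return (key, sum(values))


def word_count(text):
    splits = split_input(text)                                 # Input -> Splitting
    mapped = [pair for s in splits for pair in map_phase(s)]   # Mapping
    grouped = shuffle_phase(mapped)                            # Shuffling
    return [reduce_phase(k, v) for k, v in sorted(grouped.items())]  # Reducing -> Output


result = word_count("a b a\nb c")
print(result)  # [('a', 2), ('b', 2), ('c', 1)]
```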
What does Hadoop do with a task that repeatedly crashes in a MapReduce system?
1) The task is failed.
2) The task is retried on another system.
3) The system is rebooted.
4) The system is shut down.
1) The task is failed.
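The two task-failure cards (a crashed task is retried on another node; a repeatedly crashing task is failed) can be pictured together with a small sketch. It is not Hadoop's scheduler: `run_with_retries`, the node names, and the limit of 4 attempts are assumptions chosen for illustration.

```python
def run_with_retries(task, nodes, max_attempts=4):
    """Try the task on successive nodes; mark it failed after max_attempts crashes."""
    for attempt in range(max_attempts):
        node = nodes[attempt % len(nodes)]  # move to a different node each attempt
        try:
            return ("succeeded", task(node))
        except RuntimeError:
            continue  # the task crashed here, so retry elsewhere
    return ("failed", None)  # repeated crashes: give up on the task


def flaky(node):
    # Hypothetical task that crashes only on node-1.
    if node == "node-1":
        raise RuntimeError("crash")
    return f"done on {node}"


def always_crashes(node):
    # Hypothetical task that crashes on every node.
    raise RuntimeError("crash")


first = run_with_retries(flaky, ["node-1", "node-2"])
second = run_with_retries(always_crashes, ["node-1", "node-2"])
print(first)   # ('succeeded', 'done on node-2')
print(second)  # ('failed', None)
```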
What does Hadoop do when a node crashes during a MapReduce process?
1) Ignores all of the maps created on all of the nodes.
2) Ignores all of the maps created on the node crashed.
3) Re-launches any maps the node previously ran.
4) Re-launches any maps all of the nodes previously ran.
3) Re-launches any maps the node previously ran.
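Because mappers keep their output on local disk, that output is lost when the node goes down, which is why the maps it ran have to be launched again elsewhere. A simplified sketch of that reassignment follows; `relaunch_maps` and the task and node names are hypothetical, and it assumes at least one node survives.

```python
def relaunch_maps(assignments, crashed):
    """Return new assignments with the crashed node's map tasks moved to survivors."""
    survivors = [node for node in assignments if node != crashed]
    new_assignments = {node: list(assignments[node]) for node in survivors}
    # Spread only the crashed node's maps round-robin over the surviving nodes.
    for i, task in enumerate(assignments[crashed]):
        new_assignments[survivors[i % len(survivors)]].append(task)
    return new_assignments


before = {"node-1": ["map-0", "map-1"], "node-2": ["map-2"], "node-3": ["map-3"]}
after = relaunch_maps(before, "node-1")
print(after)
# {'node-2': ['map-2', 'map-0'], 'node-3': ['map-3', 'map-1']}
```

Maps that ran on the surviving nodes (`map-2`, `map-3`) are left where they are; only the crashed node's maps are re-run.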