Class 3 Flashcards
1
Q
Block
A
Block: a small chunk of data.
Example: 200 MB of data with a 64 MB block size gives 4 blocks (three full 64 MB blocks plus one partial 8 MB block).
Concepts of Hadoop:
- Hadoop automatically splits the file into blocks based on the block size
- Hadoop automatically replicates each block based on the replication factor
min replication - 1
default replication - 3
max replication - 512
- Hadoop automatically maintains the metadata in the namenode
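The block and replication arithmetic above can be sketched as follows; the 200 MB file, 64 MB block size, and replication factor 3 are the example values from this card:

```python
import math

def num_blocks(file_size_mb, block_size_mb=64):
    """Number of HDFS blocks needed; the last block may be only partially filled."""
    return math.ceil(file_size_mb / block_size_mb)

def raw_storage_mb(file_size_mb, replication=3):
    """Total raw cluster storage consumed once every block is replicated."""
    return file_size_mb * replication

print(num_blocks(200))       # 200 MB / 64 MB -> 4 blocks (3 full + one 8 MB)
print(raw_storage_mb(200))   # 600 MB of raw storage at the default replication of 3
```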
2
Q
Design of HDFS
A
- Storing large amounts of data
- Streaming data access -> write once, read many times concept
You can update or overwrite the entire file, but record-level updates are not supported
- Commodity hardware - e.g.: 1 GB RAM, 10 GB storage
3
Q
HDFS drawbacks
A
- Low-latency data access -> slow response time -> use Hadoop ecosystem tools as a workaround
- Lots of small files
e.g.: a 1 MB folder with 10,000 files means the namenode must maintain 10,000 filenames in its metadata. We can overcome this problem with the SequenceFile input format.
- Multiple writers and arbitrary file modifications are not allowed; overcome using HBase
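The small-files overhead can be sketched with a rough estimate. The ~150 bytes of namenode heap per file/block object used below is a commonly cited rule of thumb, not an exact figure; the point is that metadata cost scales with the file count, not the data size:

```python
BYTES_PER_OBJECT = 150  # rough rule of thumb for namenode heap per file or block object

def namenode_heap_bytes(num_files, blocks_per_file=1):
    """Approximate namenode heap: one object per file plus one per block."""
    return num_files * (1 + blocks_per_file) * BYTES_PER_OBJECT

# 10,000 tiny files (one block each) vs. the same data packed into one SequenceFile
print(namenode_heap_bytes(10_000))  # ~3,000,000 bytes of namenode heap
print(namenode_heap_bytes(1))       # ~300 bytes for a single packed file
```

This is why packing many small records into one SequenceFile helps: the namenode tracks one file instead of thousands.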
4
Q
cluster summary
A
- Configured capacity, e.g.: 2000 GB
- Non-DFS used - system programs and OS, e.g.: 800 GB
- DFS used: 200 GB
- DFS remaining: 1000 GB
df -h -> to check disk usage; a web UI is also available
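The capacity figures above should balance out. A minimal check using the example values from this card:

```python
configured_capacity_gb = 2000
non_dfs_used_gb = 800   # space taken by the OS and system programs
dfs_used_gb = 200       # space holding HDFS block data

# What is left over for new HDFS blocks
dfs_remaining_gb = configured_capacity_gb - non_dfs_used_gb - dfs_used_gb
print(dfs_remaining_gb)  # 1000 GB
```

On a real cluster, `hdfs dfsadmin -report` prints this same summary per datanode and for the cluster as a whole.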