Module 5: Block, File & Object Based Storage Systems (File Based Storage Systems) Flashcards

1
Q

How do applications access data?

A

in the form of files

2
Q

What is metadata?

A

additional data that describes the raw data in the file (e.g., whether a picture file is a JPG or a PNG)

3
Q

What is a file system?

A

logical representation of how an OS manages where and how data is stored

4
Q

How are files stored?

A

typically in folders - folders are organized in a hierarchical tree structure so files can be accessed directly or found by sequential search
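
A minimal Python sketch of the two access styles, assuming a throwaway folder tree; the folder and file names are hypothetical and exist only for the demonstration. Direct access uses a known full path, while a sequential search walks the tree looking for a name.

```python
import os
import tempfile
from pathlib import Path


def find_by_name(root, name):
    """Sequentially search the tree under root for files called name."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if name in filenames:
            matches.append(Path(dirpath) / name)
    return matches


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical folder layout, created only for this demonstration.
    (root / "photos" / "2024").mkdir(parents=True)
    target = root / "photos" / "2024" / "beach.jpg"
    target.write_bytes(b"raw image bytes")

    # Direct access: the full path is already known.
    direct_hit = target.exists()

    # Sequential search: only the file name is known.
    found = find_by_name(root, "beach.jpg")
    search_hit = found == [target]
```

Direct access touches one path; the search may visit every folder in the tree before it finds the file.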

5
Q

What is included in file metadata?

A

describes how an app can access the raw data in the correct format
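
As a rough illustration, the Python sketch below reads a few pieces of metadata that a file system keeps alongside the raw data. The file name is hypothetical, and the fields shown are only a sample of what `os.stat` reports.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "picture.png"  # hypothetical file name
    # Raw data: just the 8-byte PNG signature, enough for a demonstration.
    path.write_bytes(b"\x89PNG\r\n\x1a\n")

    info = os.stat(path)
    metadata = {
        "name": path.name,
        "extension": path.suffix,    # hints at the format an app should expect
        "size_bytes": info.st_size,  # length of the raw data
        "modified": info.st_mtime,   # last modification time, seconds since the epoch
    }
```

None of these values are part of the image bytes themselves; they describe the file so that an application knows how to handle it.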

6
Q

What is the relationship between hosts and file systems?

A

each server has its own file system - that file system is only accessible to that server

7
Q

What is file sharing?

A

allows access to different file systems across different hosts

8
Q

What is the relationship between servers and file sharing in general purpose file sharing?

A

benefits of file sharing degrade as more general purpose servers are added to share pool

9
Q

What are the major issues with network file sharing?

A

lack of scalability
file system incompatibilities across OSs
complex admin and data maintenance

10
Q

What are the two main OSs in file systems?

A

Windows and Linux

each based on different set of protocols - cross file sharing between OSs is a complicated process

11
Q

What is NAS (network-attached storage)?

A

purpose-built file storage systems that take the place of general purpose servers for file sharing and storage

12
Q

What are the benefits of NAS over general purpose servers?

A

centralizes file share operations
uses specialized and optimized file IO
enables Linux, UNIX, and Windows users to share data more efficiently

13
Q

What is clustering in a NAS system?

A

enables multiple NAS controllers or nodes to function as a single entity - allows for workload distribution

14
Q

What are the two components of a NAS system?

A

NAS Controller
File Storage

15
Q

What is a NAS controller?

A

compute system that contains network, memory, and CPU resources for NAS

houses specialized file OS

responsible for managing RAID, creating LUNs, installing file systems and exporting file shares

16
Q

What is file data storage?

A

block based storage is used to store raw NAS data and metadata

17
Q

What is scale-up NAS?

A

provides the ability to independently grow capacity and performance

if you scale only compute or only storage, that’s scaling up

18
Q

What happens when a NAS begins approaching its capacity limits?

A

performance of system starts degrading

19
Q

What is scale-out NAS?

A

ability to increase storage and compute simultaneously

20
Q

What are the benefits of scale out NAS?

A

pools multiple Nodes to work as single device

scales performance/capacity simultaneously

clients connected to any node can access any file system on the cluster

stripes data across nodes with mirror or parity protection
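
Parity protection can be sketched with XOR, as below. This is a simplified illustration rather than any vendor's layout: it assumes one chunk per data node, a single parity chunk, and zero padding, and the function names are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def stripe_with_parity(data, data_nodes):
    """Split data into one equal-sized chunk per data node, plus an XOR parity chunk."""
    chunk_len = -(-len(data) // data_nodes)  # ceiling division
    padded = data.ljust(chunk_len * data_nodes, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(data_nodes)]
    parity = reduce(xor_bytes, chunks)
    return chunks, parity


def rebuild_chunk(chunks, parity, lost):
    """Recover the chunk at index `lost` from the surviving chunks and the parity."""
    survivors = [c for i, c in enumerate(chunks) if i != lost]
    return reduce(xor_bytes, survivors, parity)


file_data = b"scale-out NAS stripes data"
chunks, parity = stripe_with_parity(file_data, data_nodes=3)

# Pretend the node holding chunk 1 failed and rebuild its contents.
recovered = rebuild_chunk(chunks, parity, lost=1)
```

Because XOR is its own inverse, combining the parity with the surviving chunks yields the missing one; this tolerates the loss of a single chunk only.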

21
Q

How does scale out NAS networking work?

A

internal network provides intra cluster communication - each node connects to internal network

external network connection enables clients to access and share file data

22
Q

What are the features of the internal network for scale-out NAS?

A

offers high throughput and low latency

high-speed networking such as InfiniBand or Gigabit Ethernet

23
Q

How do clients access the nodes in scale out NAS?

A

nodes must be connected to external Ethernet network

24
Q

What is CIFS?

A

CIFS = Common Internet File System - enables clients to request files and services from remote computers over TCP/IP

25
Q

What is the difference between CIFS and SMB?

A

CIFS is a public (open) variation of SMB (Server Message Block), the file sharing protocol used by Microsoft Windows

26
Q

How does CIFS enable file sharing?

A

using special locks

27
Q

What are the features specific to CIFS?

A

uses file and record locking to prevent user overwriting

supports fault tolerance and automatically restores connections and reopens files after interruptions

28
Q

What is the naming scheme for remote file systems?

A

\\server\share or \\servername.domain.suffix\share
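
A small Python sketch of how such a name breaks down. Paths of this kind start with two backslashes, followed by the server and then the share; the server, share, and file names here are hypothetical.

```python
from pathlib import PureWindowsPath

# Hypothetical server, share, and file names.
unc = PureWindowsPath(r"\\fileserver01\projects\reports\q1.docx")

# pathlib treats \\server\share as the "drive" of the path.
server, share = unc.drive.lstrip("\\").split("\\", 1)

# Everything after the share is the path inside the remote file system.
path_in_share = unc.parts[1:]
```

`PureWindowsPath` only parses the string, so this runs on any OS and never contacts a server.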

29
Q

What is NFS?

A

network file system - common file protocol for UNIX systems

uses machine independent model to represent data

30
Q

What is used for inter-process communication between two computers running NFS?

A

Remote Procedure Call (RPC)

31
Q

What is HDFS?

A

Hadoop Distributed File System - supported by many of the major NAS vendors

32
Q

What is required to run HDFS?

A

requires programmatic access because the file system can’t be mounted

all HDFS communication is layered on top of TCP/IP protocol

33
Q

What type of architecture does HDFS run on?

A

primary and secondary

cluster consists of single Name Node that acts as management server

34
Q

What is an HDFS cluster made up of?

A

a Name Node and Data Nodes - the Name Node has in-memory maps of every file, the file locations, and the blocks within the files and the Data Nodes on which they reside

35
Q

What is the Name Node responsible for in HDFS?

A

manages file system namespace and controls access to the files by clients
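
To make the bookkeeping concrete, here is a toy Python model of the maps a Name Node keeps in memory. It is an assumption-laden sketch, not the Hadoop API: the class, method, block, and node names are all hypothetical.

```python
class NameNode:
    """Toy stand-in for the namespace and block maps; not the Hadoop API."""

    def __init__(self):
        self.file_to_blocks = {}  # file path -> ordered list of block IDs
        self.block_to_nodes = {}  # block ID -> Data Nodes holding a copy

    def add_file(self, path, placements):
        """placements: list of (block_id, [data node names]) in file order."""
        self.file_to_blocks[path] = [block_id for block_id, _ in placements]
        for block_id, nodes in placements:
            self.block_to_nodes[block_id] = list(nodes)

    def locate(self, path):
        """Tell a client which Data Nodes hold each block of the file."""
        return [(b, self.block_to_nodes[b]) for b in self.file_to_blocks[path]]


name_node = NameNode()
name_node.add_file(
    "/logs/app.log",
    [("blk_1", ["dn1", "dn2", "dn3"]), ("blk_2", ["dn2", "dn3", "dn4"])],
)
locations = name_node.locate("/logs/app.log")
```

The model holds only metadata. The file contents would live on the Data Nodes, which the client contacts after looking up the block locations.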

36
Q

What are Data Nodes responsible for in HDFS?

A

serve read/write requests from clients and perform block creation/deletion/replication

37
Q

What are the features of an HDFS file system?

A

spans multiple nodes and enables user data to be stored in files

traditional hierarchical file system

presents streaming interface to run apps through MapReduce framework

38
Q

What is FTP?

A

File Transfer Protocol - client-server protocol that enables file transfer over an IP network

uses TCP as the transport protocol

39
Q

What is the read/write process of a scale up NAS?

A

client packages IO into TCP/IP and forwards it to network

NAS receives the request from the network and converts the file-level IO request into the corresponding physical storage request, which is a block-level IO

operation is then performed on physical storage

when NAS receives data from physical storage it packages it into correct file protocol

NAS packages into TCP/IP again and sends back over network
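
The file-to-block conversion step can be sketched as simple arithmetic. This assumes a fixed block size of 4096 bytes and a file whose blocks are numbered contiguously from zero; a real NAS file system maps these file-relative blocks to physical locations through its own metadata. The function name is hypothetical.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


def blocks_for_read(offset, length, block_size=BLOCK_SIZE):
    """Return the file-relative block numbers covering bytes [offset, offset + length)."""
    if length <= 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return list(range(first, last + 1))


# A client asks for 5000 bytes starting at byte 6000 of a file.
needed = blocks_for_read(offset=6000, length=5000)
```

Here the request spans bytes 6000 to 10999, so it touches blocks 1 and 2: the NAS issues block-level reads for those and returns only the requested byte range to the client.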

40
Q

How is a write operation performed in a scale out NAS?

A

client sends file to NAS

node to which the client is connected receives the file

file is striped across the nodes

41
Q

How is a read operation performed in a scale out NAS?

A

client requests file

node to which the client is connected receives the request

node retrieves and rebuilds the file and gives to client
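
The write and read paths can be sketched together in Python. This is a simplified model under stated assumptions: each node is a plain dict, stripe units are placed round-robin, the stripe size is tiny for readability, and protection (mirroring or parity) is left out. All names are hypothetical.

```python
STRIPE_SIZE = 4  # bytes per stripe unit; tiny for readability (assumed)


def write_file(nodes, name, data):
    """The receiving node splits the file and places stripe units round-robin."""
    for index in range(0, len(data), STRIPE_SIZE):
        unit_number = index // STRIPE_SIZE
        node = nodes[unit_number % len(nodes)]
        node[(name, unit_number)] = data[index:index + STRIPE_SIZE]


def read_file(nodes, name):
    """The receiving node gathers the units from every node and reassembles them in order."""
    units = {}
    for node in nodes:
        for (stored_name, unit_number), chunk in node.items():
            if stored_name == name:
                units[unit_number] = chunk
    return b"".join(units[n] for n in sorted(units))


cluster = [{}, {}, {}]  # three nodes, each modelled as a dict
write_file(cluster, "report.txt", b"quarterly numbers")
rebuilt = read_file(cluster, "report.txt")
```

The client only ever talks to one node, yet every node ends up holding part of the file, and the read rebuilds it from all of them.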

42
Q

What is true about scale-out NAS architecture?

A

even though a client is connected to only one node at a time, every read or write operation from that node is striped across the whole cluster

43
Q

How does the connected node rebuild a file that’s been striped across multiple nodes in a read request?

A

uses the back-end InfiniBand network

44
Q

What is a data lake?

A

hub for data ingestion and consumption systems

allows customers to bring analytics to their data and avoid high cost of multiple systems