Lecture 8 Flashcards

1
Q

What are the 3 V’s that describe big data?

A

Volume - Increased volume of data

Velocity - Increased speed at which data arrives and must be processed to produce results

Variety - Diversity of data and data types

2
Q

Describe the storage model and the data model

A

Storage model – describes the layout of a data structure in physical storage

Data model – captures the most important logical aspects of a data structure in a database

3
Q

Describe the Two abstract models of storage

A
  1. Cell storage – Storage consists of cells of the same size, and each object fits exactly into one cell. The model reflects physical media: primary memory is organised as an array of memory cells, and secondary storage is organised in sectors that are read + written as a unit.
  2. Journal storage – A system that records the changes being made in a journal before applying them to the main file system.
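The two models above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `CellStorage` and `JournalStorage`, the in-memory list used as "storage", and the `history` helper are all hypothetical and not taken from the lecture.

```python
class CellStorage:
    """Fixed-size cells; each object must fit into exactly one cell."""

    def __init__(self, num_cells, cell_size):
        self.cell_size = cell_size
        self.cells = [b""] * num_cells

    def fits(self, data):
        return len(data) <= self.cell_size

    def write(self, index, data):
        if not self.fits(data):
            raise ValueError("object does not fit into one cell")
        self.cells[index] = data  # the whole cell is written as a unit

    def read(self, index):
        return self.cells[index]  # the whole cell is read as a unit


class JournalStorage:
    """Logs every change in a journal before applying it to cell storage."""

    def __init__(self, cell_storage):
        self.cell_storage = cell_storage
        self.journal = []  # append-only list of (index, data) entries

    def write(self, index, data):
        if not self.cell_storage.fits(data):
            raise ValueError("object does not fit into one cell")
        self.journal.append((index, data))    # 1. record the change
        self.cell_storage.write(index, data)  # 2. apply it to the cells

    def history(self, index):
        # Every value ever written to this cell, oldest first.
        return [data for i, data in self.journal if i == index]
```

Writing `b"a"` and then `b"b"` to cell 0 through `JournalStorage` leaves `b"b"` in the cell, while `history(0)` still returns both values.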
4
Q

Describe database, DBMS, query language + database models

A
  • Database – A collection of records
  • Database Management System (DBMS) – Software that controls access to the database
  • Query language – A programming language used to develop database applications
  • Database models – Reflect the limitations of the hardware available at the time and the requirements of the popular applications of each period
5
Q

What are the requirements of Cloud Applications?

A
  • Rapid application development + short time to market
  • Low latency
  • Scalability
  • High availability
  • Consistent view of the data
6
Q

Describe the file, file pointer, logical + physical organisation of a file

A
  • File – A linear array of cells stored on a persistent storage device. An application views the file as a collection of logical records
  • File pointer – Identifies the cell used as the starting point for a read or write operation
  • The logical organisation of a file – Reflects the data model and the application's view of the data
  • The physical organisation of a file – Reflects the storage model and describes how the file is stored on a storage medium
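The file pointer idea can be tried out with Python's built-in file objects, where `seek` moves the pointer and `tell` reports it. This is a small sketch, assuming a temporary file of ten one-byte "cells" created only for the demonstration.

```python
import os
import tempfile

# Create a temporary file holding ten one-byte "cells".
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"0123456789")
    path = f.name

with open(path, "r+b") as f:
    f.seek(4)            # move the file pointer to cell 4
    chunk = f.read(3)    # read three cells starting at the pointer
    position = f.tell()  # the pointer has advanced past the cells read
    f.seek(0)            # move the pointer back to the first cell
    f.write(b"AB")       # overwrite the cells at positions 0 and 1
    f.seek(0)
    contents = f.read()  # read the whole file from the start

os.remove(path)
```

Each read or write starts at the current pointer and moves it forward by the number of cells transferred.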
7
Q

Describe file systems, distributed file systems and the two types of distributed file system

A
  • File system – A collection of directories, each of which provides information about a set of files
  • Distributed file systems – Address the need to share files among a number of clients interconnected by a LAN:
    • Network File Systems (NFS) – client server architecture
    • Parallel File Systems (PFS) – Scalable, capable of distributing files across many nodes
8
Q

What is Unix File System (UFS)?

A

A tree-structured file system that organises and stores large amounts of data, making the data easy to manage

  • Uses inodes as its basic storage structure; an inode holds information about each file or directory
  • Stores metadata for files and directories (file owner, access rights, creation time, last modification, file size etc.)
  • Separates the physical file structure from the logical
  • Flexible storage allocation
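On a Unix-like system the metadata kept for a file can be inspected from Python with `os.stat`. The sketch below is illustrative only: the temporary file and the `metadata` dictionary are made up for the example, and creation time is left out because `st_ctime` means different things on different platforms.

```python
import os
import stat
import tempfile
import time

# Create a small temporary file to inspect.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    path = f.name

info = os.stat(path)  # asks the file system for the file's metadata
metadata = {
    "inode": info.st_ino,                          # inode number
    "owner_uid": info.st_uid,                      # file owner
    "access_rights": stat.filemode(info.st_mode),  # e.g. "-rw-------"
    "size_bytes": info.st_size,                    # file size
    "last_modified": time.ctime(info.st_mtime),    # last modification
}

os.remove(path)
```

None of these values are stored in the file's contents, which is one way to see the separation between physical and logical structure.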
9
Q

What is Network File System (NFS)?

A
  • A client/server application that lets a user view + store + update files on a remote computer as though they were on the user’s own computer
  • Client and server interact through Remote Procedure Calls (RPC)
  • Ensures compatibility with existing applications
10
Q

What is Parallel File System (PFS)?

A
  • Allows multiple clients to read from + write to the same file
  • Concurrently executes multiple input/output operations
  • Supports parallel I/O, which is essential for the performance of many applications
11
Q

What is GPFS distributed locking and what are its techniques?

A
  • Distributed locking mechanism: a central lock manager grants lock tokens to the local lock managers running in each I/O node. Lock tokens are also used by the cache management system.

Techniques:

  • Byte-range tokens – used for read + write operations on data files:
    • Node 1 writes to a file and obtains a token covering the whole file; it can then read + write without asking for further permission
    • UNTIL Node 2 attempts to write to the same file
    • The range of Node 1’s token is then restricted
  • Data-shipping – an alternative to byte-range locking that allows fine-grained sharing
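The byte-range token behaviour can be pictured with a toy simulation. This is a simplified sketch and not GPFS's actual protocol: the `TokenManager` class, its `request` method, and the rule that a new writer takes over everything from its starting offset onwards are assumptions made for illustration.

```python
import math


class TokenManager:
    """Toy central lock manager handing out byte-range write tokens."""

    def __init__(self):
        self.tokens = {}  # node -> (start, end), end exclusive

    def request(self, node, start):
        """`node` wants to write from byte offset `start` onwards."""
        if not self.tokens:
            # First writer: the token covers the whole file.
            self.tokens[node] = (0, math.inf)
        elif node not in self.tokens:
            for holder, (h_start, h_end) in list(self.tokens.items()):
                if h_start <= start < h_end:
                    # Conflict: restrict the holder's range and give
                    # the rest of that range to the new writer.
                    self.tokens[holder] = (h_start, start)
                    self.tokens[node] = (start, h_end)
                    break
        return self.tokens.get(node)


manager = TokenManager()
first = manager.request("node1", 0)      # node1 gets the whole file
second = manager.request("node2", 1000)  # node1 is restricted to [0, 1000)
```

After the second request, `node1` holds bytes 0 to 999 and `node2` holds everything from byte 1000 onwards, so both can keep writing without contacting the manager again.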

12
Q

Describe the Google File System (GFS) and its design considerations

A
  • Uses many storage systems built from a variety of components to provide storage to a large user community with diverse needs
  • Scalability + reliability
  • File sizes range from GB to TB

Design considerations:

  • Each file is divided into several chunks of a predefined size
  • Implement an atomic file append operation, allowing multiple applications running concurrently to append to the same file
  • Build for high bandwidth rather than low latency
  • Eliminate caching at the client site
  • Minimise the involvement of the master in file operations
  • Support efficient checkpointing + fast recovery mechanisms
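Dividing a file into fixed-size chunks means that a byte offset maps to a chunk by simple integer arithmetic. The sketch below assumes a 64 MB chunk size (the card only says "predefined size"), and the helper names `chunk_location` and `chunks_for_range` are hypothetical.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # assumed chunk size of 64 MB


def chunk_location(byte_offset, chunk_size=CHUNK_SIZE):
    """Return (chunk index, offset inside that chunk) for a byte offset."""
    return divmod(byte_offset, chunk_size)


def chunks_for_range(offset, length, chunk_size=CHUNK_SIZE):
    """Return the indices of all chunks touched by a read or write."""
    if length <= 0:
        return []
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return list(range(first, last + 1))


# Byte 200 MB of a file falls 8 MB into the fourth chunk (index 3).
index, inner = chunk_location(200 * 1024 * 1024)
```

Because a client can work out the chunk index on its own, it only needs the master to tell it where that chunk is stored, which fits the goal of keeping the master out of most file operations.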
13
Q

What is the Paxos algorithm and what are its three phases?

A

Used by a group of nodes to reach agreement on a value

Phases:

  1. Elect a node to be the master/coordinator. To ensure the result of the election is unique, each candidate generates a sequence number in the range (1, r), where r is the number of replicas, and broadcasts it in a propose (prepare) message
  2. The master selects the value and sends an accept message to all nodes. Acceptors can reply with reject or accept.
  3. Once a majority of the nodes have accepted, consensus is reached and the master broadcasts a commit message
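The three phases can be walked through with a very reduced simulation. This is not a full Paxos implementation: it ignores node failures, promise replies, message loss and retries, and the function `run_round` with its `rejecting` parameter is a hypothetical helper used only to show the majority rule.

```python
def run_round(proposals, value, rejecting=frozenset()):
    """Simulate one simplified round.

    proposals: dict mapping node name -> unique sequence number
    value:     the value the elected master asks the nodes to accept
    rejecting: nodes that answer the accept message with "reject"
    """
    # Phase 1: the node with the highest sequence number becomes master.
    master = max(proposals, key=proposals.get)

    # Phase 2: the master sends accept(value); every node replies.
    replies = {node: node not in rejecting for node in proposals}

    # Phase 3: consensus needs acceptance from a majority of the nodes.
    accepted = sum(replies.values())
    committed = accepted > len(proposals) // 2

    return master, (value if committed else None)


nodes = {"a": 1, "b": 3, "c": 2}
ok = run_round(nodes, "x")                           # all three accept
failed = run_round(nodes, "x", rejecting={"a", "c"})  # only one accepts
```

With three nodes, two acceptances are enough for a commit, so the first round commits `"x"` and the second commits nothing.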