Exam 3. Presentation 4 Flashcards
What is cloud computing?
Cloud Computing (CC) is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
public cloud
when it is made available in a pay-as-you-go
manner to the general public
private cloud
when the cloud infrastructure is
operated solely for a business or an organization
Cloud Computing services
Software-as-a-Service (SaaS), Platform-as-a-Service (PaaS), Infrastructure-as-a-Service (IaaS)
Software-as-a-Service (SaaS)
Applications are accessible from several client devices.
The provider is responsible for the application, for example: Email, CRM, Collaboration, ERP.
Platform-as-a-Service (PaaS)
The client is responsible for the end-to-end life cycle in terms of developing, testing and deploying applications.
The provider supplies all the systems (operating systems, applications, and development environment).
For example: Application Development, Decision Support, Web, Streaming.
Infrastructure-as-a-Service (IaaS)
In this type of service the client manages the storage and development environments for Cloud Computing applications, such as the Hadoop Distributed File System (HDFS) and the MapReduce development framework.
For example: Caching, Legacy, Networking, etc.
NoSQL
Not Only SQL refers to an eclectic and increasingly familiar group of non-relational data management systems, in which databases are not built primarily on tables and generally do not use SQL for data manipulation.
NoSQL Focus
NoSQL databases focus on the analytical processing of large-scale datasets, offering increased scalability over commodity hardware.
Why NoSQL Databases?
1. The exponential growth of the volume of data generated by users, systems and sensors.
2. The increasing interdependency and complexity of data, accelerated by the Internet, Web 2.0, social networks, and open and standardized access to data sources from a large number of different systems.
The CAP-Theorem
postulates that only two of the following three aspects of scaling out can be fully achieved at the same time
Aspects of scaling
Strong Consistency, High Availability, Partition-tolerance
CAP theorem and NoSQL
Many NoSQL databases have loosened the requirement on Consistency in order to achieve better Availability and Partition tolerance (AP).
primary uses of NoSQL Database
- Large-scale data processing (parallel processing over distributed systems).
- Embedded IR (basic machine-to-machine information look-up and retrieval).
- Exploratory analytics on semi-structured data (expert level).
- Large-volume data storage (unstructured, semi-structured, small-packet structured).
Classification of NoSQL Databases
Key-Value stores, Document databases, Wide-Column stores, Graph databases
Key-Value store
these Data Management Systems (DMS) store items as alphanumeric identifiers (keys) and associated values in simple, standalone tables (referred to as "hash tables"). The values may be simple text strings or more complex lists and sets.
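As an illustration only (not from the slides), a minimal key-value sketch in Java, with an in-memory HashMap standing in for the store; the "user:42:..." key scheme is made up:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class KeyValueDemo {
        public static void main(String[] args) {
            // Keys are alphanumeric identifiers; the store treats values as opaque
            Map<String, Object> store = new HashMap<>();
            store.put("user:42:name", "Alice");                  // simple text string value
            store.put("user:42:tags", List.of("admin", "beta")); // more complex list value
            System.out.println(store.get("user:42:name"));       // look-up happens by key only
        }
    }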
Document databases
designed to manage
and store documents which are encoded in a
standard data exchange format such as XML,
JSON or BSON
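A hedged illustration (all field names invented): one self-describing JSON document of the kind such a database stores, held here in a Java text block:

    public class DocumentDemo {
        public static void main(String[] args) {
            // A self-describing document; fields and nesting may differ per document
            String doc = """
                {
                  "_id": "user-42",
                  "name": "Alice",
                  "tags": ["admin", "beta"],
                  "address": { "city": "Monterrey" }
                }
                """;
            System.out.println(doc);
        }
    }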
Key-Value store values
The values may be simple text strings or
more complex lists and sets.
Document Databases Value
The values are whole documents (e.g., XML, JSON, or BSON), which may themselves contain simple text strings as well as more complex lists and sets.
Wide-Column
This type of
NoSQL Database employs a distributed,
column-oriented data structure that
accommodates multiple attributes per key
Graph Databases
Graph databases replace
relational tables with structured relational
graphs of interconnected key-value pairings
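A rough sketch (invented data) of "interconnected key-value pairings": each node key maps to its properties plus an adjacency list, and traversal is a chain of look-ups:

    import java.util.List;
    import java.util.Map;

    public class GraphDemo {
        public static void main(String[] args) {
            // Each node is a key-value record whose "follows" entry links to other node keys
            Map<String, Map<String, Object>> nodes = Map.of(
                "alice", Map.of("type", "person", "follows", List.of("bob")),
                "bob",   Map.of("type", "person", "follows", List.of()));
            // One-hop traversal from alice
            System.out.println(nodes.get("alice").get("follows")); // prints [bob]
        }
    }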
Hadoop HBase
HBase is a column-oriented
database management system that runs on top
of Hadoop Distributed File System (HDFS).
The reason to store values on a per-column
basis
for specific queries, not all of the values are needed, so a column-oriented layout lets a query read only the columns it actually uses
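For instance, the HBase client lets a query name exactly the columns it needs; a sketch (the "info"/"email" family and qualifier names are assumptions):

    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ColumnProjectionDemo {
        public static void main(String[] args) {
            // Restrict a scan to one column; a column-oriented store can then
            // avoid reading the values of every other column entirely
            Scan scan = new Scan();
            scan.addColumn(Bytes.toBytes("info"), Bytes.toBytes("email"));
            System.out.println(scan);
        }
    }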
HBase Architecture
Consistency, Atomic Read and Write, Sharding, High Availability, Client API, Scalability, Distributed Storage, HDFS/Hadoop integration, Data Replication, Load sharing and Support for Failure, API Support, MapReduce Support, Sorted Row Keys, Real Time Processing of Data
Consistency
HBase provides consistent read and write operations and thus can be used for high-speed requirements. This also helps to increase the overall throughput of the system.
Atomic Read and Write
Atomic read and write
means that only one process can perform a
given task at a given time. For example when
one process is performing write operation no
other processes can perform the write
operation on that data.
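A minimal sketch of an atomic read-modify-write with the HBase client, assuming a running cluster and an existing table "counters" with column family "stats" (both names hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class AtomicIncrementDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("counters"))) {
                // The increment is applied atomically on the server side:
                // concurrent writers can never lose an update
                long views = table.incrementColumnValue(
                        Bytes.toBytes("page-1"), Bytes.toBytes("stats"),
                        Bytes.toBytes("views"), 1L);
                System.out.println("views = " + views);
            }
        }
    }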
Sharding
HBase offers automatic and manual splitting of regions. This means that if a region reaches its threshold size, it automatically splits into smaller sub-regions.
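That split threshold is configurable; a hedged hbase-site.xml fragment (the value shown is only an example):

    <!-- hbase-site.xml -->
    <property>
      <name>hbase.hregion.max.filesize</name>
      <!-- example threshold: the region splits once a store file grows past ~10 GB -->
      <value>10737418240</value>
    </property>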
High Availability
HBase provides Local Area Network (LAN) and Wide Area Network (WAN) support with failure recovery. There is a master server which monitors all the regions and the metadata of the cluster.
Client API
HBase offers access through a Java API, which allows applications to use HBase programmatically.
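A minimal sketch of that Java API, assuming a running cluster and an existing table "users" with column family "info" (hypothetical names):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ClientApiDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("users"))) {
                // Write one cell: row key "user1", family "info", qualifier "email"
                Put put = new Put(Bytes.toBytes("user1"));
                put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("email"),
                        Bytes.toBytes("user1@example.com"));
                table.put(put);
                // Read it back
                Result result = table.get(new Get(Bytes.toBytes("user1")));
                byte[] value = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("email"));
                System.out.println(Bytes.toString(value));
            }
        }
    }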
Scalability
This is one of the important characteristics of non-relational databases. HBase supports both linear and modular scalability.
Distributed Storage
This feature of HBase enables the use of distributed storage such as HDFS.
HDFS/Hadoop integration
HBase can run on top of
various systems such as Hadoop/HDFS
Data Replication
The data in HBase is replicated over a number of clusters. This helps to recover data in case of loss and provides high availability of the data.
Load sharing and Support for Failure
HDFS is internally distributed and supports automatic recovery. As HBase runs on top of HDFS, it recovers automatically as well.
API Support
HBase supports a Java API, which makes it easily accessible programmatically from Java.
MapReduce Support
HBase supports MapReduce, which helps in parallel processing of data.
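A sketch of an HBase-backed MapReduce job that counts rows with a TableMapper; the "users" table name and the output-path argument are assumptions:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

    public class RowCountJob {
        // Map tasks read HBase regions in parallel; each row emits ("rows", 1)
        static class CountMapper extends TableMapper<Text, IntWritable> {
            private static final Text KEY = new Text("rows");
            private static final IntWritable ONE = new IntWritable(1);
            @Override
            protected void map(ImmutableBytesWritable row, Result value, Context ctx)
                    throws java.io.IOException, InterruptedException {
                ctx.write(KEY, ONE);
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            Job job = Job.getInstance(conf, "hbase-row-count");
            job.setJarByClass(RowCountJob.class);
            // Wire the (hypothetical) "users" table as the job's input
            TableMapReduceUtil.initTableMapperJob(
                    "users", new Scan(), CountMapper.class,
                    Text.class, IntWritable.class, job);
            job.setReducerClass(IntSumReducer.class); // sums the per-row 1s
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileOutputFormat.setOutputPath(job, new Path(args[0]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }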
Sorted Row Keys
HBase stores row keys in lexicographical order, which optimizes range scans and look-ups.
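A small illustration (plain Java, not HBase itself) of what lexicographic key order means in practice:

    import java.util.TreeMap;

    public class KeyOrderDemo {
        public static void main(String[] args) {
            // TreeMap's natural String order mimics HBase's lexicographic row-key sort
            TreeMap<String, String> rows = new TreeMap<>();
            rows.put("row-2", "...");
            rows.put("row-10", "...");
            rows.put("row-1", "...");
            // Prints row-1, row-10, row-2: "row-10" sorts before "row-2",
            // which is why numeric keys are often zero-padded (row-01, row-02, row-10)
            rows.keySet().forEach(System.out::println);
        }
    }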
Real Time Processing of Data
HBase performs real-time processing of data and supports block caching and Bloom filters.
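A hedged sketch of enabling those two features on a column family with the HBase 2.x API (the family name "info" is made up):

    import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
    import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
    import org.apache.hadoop.hbase.regionserver.BloomType;
    import org.apache.hadoop.hbase.util.Bytes;

    public class FastReadFamilyDemo {
        public static void main(String[] args) {
            // Column family tuned for low-latency reads
            ColumnFamilyDescriptor cf = ColumnFamilyDescriptorBuilder
                    .newBuilder(Bytes.toBytes("info"))
                    .setBlockCacheEnabled(true)        // keep recently read blocks in memory
                    .setBloomFilterType(BloomType.ROW) // skip files that cannot hold a row
                    .build();
            System.out.println(cf);
        }
    }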
Denormalization, Duplication, and Intelligent Keys
(DDI)
It is about rethinking how data is stored in
Bigtable-like storage systems, and how to make use of
it in an appropriate way
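One common instance of an "intelligent key" (a sketch; the user-id scheme is invented): a composite row key of user id plus reversed timestamp, so a user's newest rows sort first under lexicographic order:

    import org.apache.hadoop.hbase.util.Bytes;

    public class IntelligentKeyDemo {
        public static void main(String[] args) {
            // Composite row key: fixed-meaning prefix (user id) + reversed timestamp,
            // so a scan starting at the prefix sees the newest events first
            long reversedTs = Long.MAX_VALUE - System.currentTimeMillis();
            byte[] rowKey = Bytes.add(Bytes.toBytes("user-42|"), Bytes.toBytes(reversedTs));
            System.out.println(rowKey.length + "-byte row key");
        }
    }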
table row keys
are byte arrays, so almost anything can serve as a row key, from strings to binary representations of longs or even serialized data structures
Design Aspects
- Tables are declared up front at schema definition time (see the table-creation sketch after this list).
- Rows are lexicographically sorted, with the lowest order appearing first in a table.
- Columns are grouped into column families.
- The column family prefix must be composed of printable characters.
- The qualifying tail, the column family qualifier, can be made of any arbitrary bytes.
- Column families must be declared up front at schema definition time.
- A {row, column, version} tuple exactly specifies a cell in HBase.
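A sketch of declaring a table and its column family up front with the HBase 2.x Admin API (the table "users" and family "info" are hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.TableDescriptor;
    import org.apache.hadoop.hbase.client.TableDescriptorBuilder;

    public class CreateTableDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Admin admin = conn.getAdmin()) {
                // Table and column family are fixed at schema definition time;
                // column qualifiers, by contrast, can be added per Put at write time
                TableDescriptor table = TableDescriptorBuilder
                        .newBuilder(TableName.valueOf("users"))
                        .setColumnFamily(ColumnFamilyDescriptorBuilder.of("info"))
                        .build();
                admin.createTable(table);
            }
        }
    }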