Lecture 25 Flashcards

1
Q

What is the hierarchical model? What is the problem?

A

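Data is organised into a tree-like structure of records linked by parent-child relationships, each of which is 1:N (a parent can have many children, but each child has exactly one parent).
The structure has to be designed, defined and then built up front. After that it is difficult to extend or alter, and it lacks the flexibility of the relational model.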
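
A minimal sketch of the tree idea, assuming a made-up `Record` class and made-up company data (neither is from the lecture). Each record holds one parent and many children, which is the 1:N relationship described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical record in a hierarchical database: one parent, many children."""
    name: str
    # repr/compare are disabled to avoid following the parent link in circles.
    parent: Optional["Record"] = field(default=None, repr=False, compare=False)
    children: List["Record"] = field(default_factory=list)

    def add_child(self, name: str) -> "Record":
        # A child is always created under exactly one parent (the 1:N rule).
        child = Record(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        # The only way to reach a record is by walking down from the root.
        parts = []
        node: Optional[Record] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


company = Record("Company")
sales = company.add_child("Sales")
alice = sales.add_child("Alice")
print(alice.path())
```

Placing Alice under a second department as well would need a duplicate record, which hints at why the model is hard to alter later.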

2
Q

What is the network model?

A

The objects of the data model are records and sets. A record is a group of related data values, and a set is a description of a 1:N relationship between two record types. Each set has a name, an owner and members, and sets are drawn using Bachman diagrams.
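
A rough sketch of records and sets, assuming a hypothetical `SetType` class and invented department/project data. It only illustrates the name, owner and member parts of a set, not how a real network DBMS stores them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SetType:
    """Hypothetical named 1:N relationship between an owner and a member record type."""
    name: str
    owner_type: str
    member_type: str
    # Maps each owner record to the member records it owns in this set.
    occurrences: Dict[str, List[str]] = field(default_factory=dict)

    def connect(self, owner: str, member: str) -> None:
        self.occurrences.setdefault(owner, []).append(member)

    def members_of(self, owner: str) -> List[str]:
        return list(self.occurrences.get(owner, []))


# Two sets that share the same member record type (assumed example data).
works_in = SetType("WORKS_IN", owner_type="Department", member_type="Employee")
assigned_to = SetType("ASSIGNED_TO", owner_type="Project", member_type="Employee")

works_in.connect("Sales", "Alice")
works_in.connect("Sales", "Bob")
assigned_to.connect("Website", "Alice")

print(works_in.members_of("Sales"))
print(assigned_to.members_of("Website"))
```

Here the Employee record "Alice" is a member of two different sets, something a strict tree cannot express without duplication.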

3
Q

Why is big data an important recent development?

A

Under the older database models, a few companies generated data and everyone else consumed it. Now all of us both generate and use data.
This makes it more important to be able to manage and analyse data in a timely and scalable fashion.

4
Q

What is big data? What are the characteristics?

A

Data whose scale, diversity, and complexity require new architectures, techniques, algorithms, and analytics to manage it and to extract value and hidden knowledge from it.

  1. Volume: the scale is much larger.
  2. Variety: the data is much more variable and complex.
  3. Velocity: data is generated fast and needs to be processed fast.
  4. Veracity: uncertainty due to inconsistency (not always counted as a characteristic).
5
Q

What is NoSQL? What are some perks?

A

NoSQL stands for "Not Only SQL". It has a looser schema definition and is designed for large, distributed databases. It generally does not use joins. Horizontal scaling (adding or removing machines) is possible with NoSQL, which makes scaling up or down very easy.
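
One way to picture horizontal scaling is hash partitioning, where each key is assigned to one of several machines. The sketch below is an assumption for illustration only: `shard_for` and the shard count are made up, and it does not describe any particular NoSQL product.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Pick a shard (machine) for a key using a stable hash. Hypothetical helper."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# Spread some example keys over three assumed shards.
keys = ["user:1", "user:2", "user:3", "user:4"]
placement = {key: shard_for(key, 3) for key in keys}
print(placement)
```

With plain modulo hashing, changing the number of shards moves many keys, so this is only the simplest possible scheme.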

6
Q

What are some NoSQL types?

A

Key-value: a simple mapping between a key and a value. Values can only be looked up by key. Very simple and very scalable.
Wide-column: a row key leads to a set of columns and their values.
Document: uses JavaScript Object Notation (JSON) or XML. Good when data is semi-structured or when the structure changes over the lifetime of the application.
Graph: useful when the objective is to quickly find connections, patterns and relationships within large amounts of data.
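
The four types can be loosely imitated with plain Python structures. This is only a sketch with invented data and an invented `reachable` helper. Real stores add persistence, distribution and query languages on top.

```python
import json

# Key-value: the value is opaque and is only reachable through its key.
key_value = {"user:1": "Alice"}

# Wide-column: a row key leads to columns, and rows need not share columns.
wide_column = {
    "user:1": {"name": "Alice", "city": "Leeds"},
    "user:2": {"name": "Bob"},
}

# Document: a semi-structured JSON document with nested fields.
document = json.loads('{"name": "Alice", "orders": [{"item": "book", "qty": 2}]}')

# Graph: nodes and edges, stored here as an adjacency list.
follows = {"Alice": ["Bob"], "Bob": ["Carol"], "Carol": []}


def reachable(graph, start):
    """Return every node connected to start by following edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


print(key_value["user:1"])
print(wide_column["user:2"])
print(document["orders"][0]["item"])
print(reachable(follows, "Alice"))
```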

7
Q

What is the MapReduce programming model?

A

A programming model and associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster.

  1. Read lots of data.
  2. Map: extract something we care about from each record.
  3. Shuffle and sort.
  4. Reduce: aggregate, summarize, filter, or transform.
  5. Write the results.
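
The five steps above can be sketched as a single-machine word count. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the input are assumptions for illustration. A real framework would run the map and reduce steps in parallel across a cluster.

```python
from collections import defaultdict


def map_phase(record):
    # Map: extract (word, 1) pairs from one input record.
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    # Shuffle and sort: group all values by key, ordered by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(key, values):
    # Reduce: aggregate the values collected for one key.
    return key, sum(values)


def word_count(records):
    mapped = [pair for record in records for pair in map_phase(record)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Read the input, run the pipeline, write the results.
counts = word_count(["big data", "Big ideas"])
print(counts)
```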