4.11/12 Big data and functional programming Flashcards

1
Q

What is a function?

A

A rule that maps each value in a domain to exactly one value in a co-domain. Not every member of the co-domain needs to be an output.

2
Q

What is the domain?

A

The set from which the function’s input values are chosen

3
Q

What is the co-domain?

A

The set from which the function’s output values are chosen. Not every member of the co-domain needs to be an output.
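
The domain and co-domain idea can be sketched in Python. This is only an illustration: the function `square` and the two sets are assumptions chosen for the example, not part of the original cards.

```python
# Illustrative assumption: the domain is a small set of integers and
# the co-domain is the integers from 0 to 10.
domain = {-2, -1, 0, 1, 2}
co_domain = set(range(0, 11))

def square(x: int) -> int:
    """Map each domain value to exactly one co-domain value."""
    return x * x

# The outputs actually produced (sometimes called the range).
outputs = {square(x) for x in domain}

print(outputs <= co_domain)   # every output lies in the co-domain
print(outputs == co_domain)   # not every co-domain member is an output
```

Here only 0, 1 and 4 are ever produced, even though the co-domain also contains values such as 2 and 3.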

4
Q

What is Big Data?

A

A catch-all term for data that won’t fit the usual containers, cannot be stored/processed on a single server, and that must be processed at very high speeds.

5
Q

What are the three Vs of big data?

A
  • (very large) Volume (of data)
  • Velocity (at which data is generated)
  • Variety (of data types in the data)
6
Q

What does Volume mean?
Why is it a problem?
How is it solved?

A

What it means
- The data is too big to be stored or processed on a single server

Why it’s a problem
- Relational databases don’t scale well across multiple machines
- The processing associated with the data must be split across multiple machines

How it’s solved
- Functional programming is a solution: it makes it easier to write correct code that can be distributed across more than one server

7
Q

What does Velocity mean?

A

The data is generated and/or processed at very high speed, so the system may need to respond within seconds or milliseconds

8
Q

What does Variety mean?

A
  • The data is in many forms such as structured, unstructured, text, multimedia.
  • The most difficult aspect of Big Data involves its lack of structure.
9
Q

What is the most difficult aspect of Big Data? Why?

A

Its lack of structure (under Variety). This poses challenges because:

  • Analysing the data is made significantly more difficult
  • Relational databases are not appropriate because they require data to fit into a row-and-column format
10
Q

What technique is used to discern patterns in data and to extract useful information?

A

Machine learning

11
Q

What is the advantage of functional programming for big data?

A

Its features make it easier to write
- Correct code
- Code that can be distributed to run across more than one server

12
Q

What are 4 features of functional programming that make it suitable for Big Data?

A
  1. Immutable data structures
  2. Statelessness
  3. Higher-order functions
  4. Programs do not specify order of execution (meaning they work well on parallel processing systems)
13
Q

What about immutable data structures makes them suitable for Big Data?

A
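  • Immutable data structures cannot be changed during program execution
  • Because the data cannot change, the same input always gives the same output
  • This makes parallel processing much easier, since no processor can alter data that another processor is using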
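
A minimal sketch of an immutable data structure in Python, assuming a frozen dataclass as the stand-in. The `Reading` class and its fields are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError

# Hypothetical record type; frozen=True makes instances immutable.
@dataclass(frozen=True)
class Reading:
    sensor: str
    value: float

r = Reading("s1", 20.5)

try:
    r.value = 99.0          # attempted in-place change
    changed = True
except FrozenInstanceError:
    changed = False         # the change is rejected

# "Changing" a value means building a new object instead.
r2 = Reading(r.sensor, r.value + 1.0)
```

The original `r` is untouched, so any other part of the program reading it at the same time still sees the same values.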
14
Q

What about statelessness makes it suitable for Big Data?

A
  • Statelessness means computations have no side effects
  • so code is easier to write correctly, and it is easier to understand and predict how the program will behave
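
The difference between a stateful and a stateless computation can be sketched as follows. The function names and the `running_total` variable are assumptions made for this example.

```python
# Stateful: the result depends on hidden state that the call also changes.
running_total = 0

def add_to_total(x: int) -> int:
    global running_total
    running_total += x      # side effect on shared state
    return running_total

# Stateless: the result depends only on the arguments.
def add(total: int, x: int) -> int:
    return total + x

first = add_to_total(5)
second = add_to_total(5)    # same argument, different result

pure_first = add(0, 5)
pure_second = add(0, 5)     # same arguments, same result
```

The stateless version is easier to predict because nothing outside the call can influence its result.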
15
Q

What about higher order functions makes them suitable for Big Data?

A
  • Higher-order functions take a function as an argument, return a function as a result, or both.
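  • Higher-order functions such as map and filter can be easily parallelised, because the function they are given is applied to each element independently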
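
A short sketch of both kinds of higher-order function in Python. The names `apply_twice` and `make_multiplier` are hypothetical and exist only for this example.

```python
from typing import Callable

def apply_twice(f: Callable[[int], int], x: int) -> int:
    """Takes a function as an argument."""
    return f(f(x))

def make_multiplier(n: int) -> Callable[[int], int]:
    """Returns a function as a result."""
    def multiply(x: int) -> int:
        return n * x
    return multiply

double = make_multiplier(2)
result = apply_twice(double, 3)   # double(double(3))
```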
16
Q

What is a fact in a fact-based model?

A
  • Each fact within a fact-based model captures a single piece of information
  • Each fact is immutable and timestamped
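
One possible way to picture a fact, assuming a frozen dataclass with a timestamp field. The `Fact` fields, the sample data and the `current_value` helper are all hypothetical; the sketch also assumes that an updated value is stored as a new fact rather than by editing an old one.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical fact type: one piece of information, immutable, timestamped.
@dataclass(frozen=True)
class Fact:
    subject: str
    attribute: str
    value: str
    timestamp: datetime

# A change of city is recorded as a new fact; the old one is kept.
facts = (
    Fact("user42", "city", "Leeds", datetime(2020, 1, 1, tzinfo=timezone.utc)),
    Fact("user42", "city", "York", datetime(2023, 6, 1, tzinfo=timezone.utc)),
)

def current_value(facts, subject, attribute):
    """Return the value from the most recent matching fact."""
    matching = [f for f in facts
                if f.subject == subject and f.attribute == attribute]
    return max(matching, key=lambda f: f.timestamp).value
```

Calling `current_value(facts, "user42", "city")` picks the fact with the latest timestamp, while the earlier fact remains available as history.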
17
Q

What is a graph schema?

A
  • Graph schemas can be used to capture the structure of a data set. They can be easily extended without impacting existing facts (because the facts are immutable)
  • Nodes are used to represent the core entities in the data set
  • Edges are used to represent the relationships between the nodes
  • Properties are used to capture information about the nodes
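
A rough sketch of a graph schema using plain Python dictionaries and tuples. The entity names (`customer`, `product`, `review`) and the `related` helper are assumptions made for illustration only.

```python
# Nodes: the core entities. Properties: information about each node.
nodes = {
    "customer": {"properties": ["name", "email"]},
    "product": {"properties": ["title", "price"]},
}

# Edges: (from_node, relationship, to_node).
edges = [
    ("customer", "purchased", "product"),
]

def related(node: str) -> list[tuple[str, str]]:
    """List (relationship, other_node) pairs leaving the given node."""
    return [(rel, dst) for src, rel, dst in edges if src == node]

# Extending the schema adds new entries; nothing existing is rewritten.
nodes["review"] = {"properties": ["rating"]}
edges.append(("customer", "wrote", "review"))
```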
18
Q

What is a first class object?

A

First class objects are objects which may:
- R - be returned in function calls
- A - be passed as arguments
- V - be assigned to a variable
- E - appear in expressions

Functions are first-class objects in functional programming languages
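
Python also treats functions as first-class objects, so the four properties can be sketched there. The helper names below are hypothetical.

```python
def square(x: int) -> int:
    return x * x

def apply(f, x):
    return f(x)

def get_square():
    return square                  # R: returned from a function call

returned = get_square()
as_argument = apply(square, 4)     # A: passed as an argument
sq = square                        # V: assigned to a variable
in_expression = [square, abs][0](3) + 1   # E: appears in an expression
```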

19
Q

What does function application mean?

A

Applying a function to its arguments

20
Q

What does partial function application mean?

A

Partial function application means only applying a function to some of its arguments. The result is a function.
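
A minimal sketch using `functools.partial` from the Python standard library. The `power` function is a hypothetical example.

```python
from functools import partial

def power(base: int, exponent: int) -> int:
    return base ** exponent

# Supply only the first argument; the result is a new function
# that is still waiting for the exponent.
power_of_two = partial(power, 2)

result = power_of_two(10)   # same as power(2, 10)
```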

21
Q

What is functional composition?

A
  • Combining two functions to get a new function
  • g ∘ f means apply f first, then g, so (g ∘ f)(x) = g(f(x))
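
The ordering rule (f first, then g) can be checked with a small sketch. The `compose` helper and the two sample functions are assumptions for this example.

```python
def compose(g, f):
    """Return the composition of g and f: apply f first, then g."""
    return lambda x: g(f(x))

def add_one(x: int) -> int:
    return x + 1

def double(x: int) -> int:
    return x * 2

double_after_add = compose(double, add_one)   # double(add_one(x))
add_after_double = compose(add_one, double)   # add_one(double(x))
```

With an input of 3 the two compositions give different answers (8 and 7), which shows that the order of composition matters.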
22
Q

Describe in words what map does

A

Applies a given function to each element of a list, returning a list of results
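
A short sketch with Python's built-in `map`; the `square` function and the sample list are illustrative.

```python
def square(x: int) -> int:
    return x * x

numbers = [1, 2, 3, 4]

# map applies square to each element; list() collects the results.
squares = list(map(square, numbers))
```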

23
Q

Describe in words what filter does

A

Processes a list to produce a new list containing exactly those elements that match a given condition
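
A short sketch with Python's built-in `filter`; the `is_even` condition and the sample list are illustrative.

```python
def is_even(x: int) -> bool:
    return x % 2 == 0

numbers = [1, 2, 3, 4, 5, 6]

# filter keeps exactly the elements for which is_even returns True.
evens = list(filter(is_even, numbers))
```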

24
Q

Describe in words what reduce or fold does

A

Reduces a list of values to a single value by repeatedly applying a combining function to the list values
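
A short sketch with `functools.reduce` from the Python standard library; the combining function `add` and the starting value 0 are choices made for this example.

```python
from functools import reduce

def add(accumulator: int, x: int) -> int:
    return accumulator + x

numbers = [1, 2, 3, 4]

# Evaluated as add(add(add(add(0, 1), 2), 3), 4).
total = reduce(add, numbers, 0)
```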

25
Q

How is machine learning used in Big Data?

A

Machine learning is used to discern patterns in data and to extract useful information

26
Q

What is meant by “parallel processing” in Big Data?

A

When more than one processor works on different parts of a large data set at the same time, with no processor changing the part that another is working on
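
A rough sketch of the idea, assuming a thread pool stands in for the separate processors (on a real Big Data system the workers would typically be separate machines). The data, the chunk size and the `chunk_sum` function are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk_sum(chunk: tuple[int, ...]) -> int:
    """Pure function: reads its own chunk and changes nothing else."""
    return sum(chunk)

# Immutable input data, split into four independent parts.
data = tuple(range(1, 101))
chunks = [data[i:i + 25] for i in range(0, len(data), 25)]

# Each worker handles a different chunk; the pool stands in for
# separate processors in this sketch.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(chunk_sum, chunks))

# Combine the partial results into the final answer.
total = sum(partial_sums)
```

Because each worker only reads its own chunk, the chunks can be processed in any order and the combined result is the same.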