Lecture 9 - Big Data and data science Flashcards
Give examples of SQL databases
MySQL, SQL Server, PostgreSQL, Vertica
Give an example of a NoSQL database
Graph database (DBpedia, for instance, publishes its data as an RDF graph)
What does API stand for?
Application Programming Interface
Under which circumstances should SQL / NoSQL be used?
SQL: data is structured and its schema rarely changes
NoSQL:
- storing a large volume of data with little structure
- data (or its structure) changes rapidly
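A minimal sketch of the contrast above, using only the Python standard library. The `users` table, the sample records, and the dictionary used as a stand-in for a document store are all assumptions made for illustration; a real NoSQL system would be a separate server, not a Python dict.

```python
import json
import sqlite3

# SQL side: the schema is declared up front and every row must fit it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)
conn.executemany(
    "INSERT INTO users (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Linus", "Helsinki")],
)
sql_names = [row[0] for row in conn.execute("SELECT name FROM users ORDER BY id")]

# NoSQL-style side (hypothetical stand-in): each record is a free-form
# JSON document, and different documents may carry different fields.
documents = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Linus", "tags": ["kernel", "git"]},  # no "city", extra "tags"
]
doc_store = {doc["id"]: json.dumps(doc) for doc in documents}
doc_names = [json.loads(raw)["name"] for raw in doc_store.values()]

print(sql_names)
print(doc_names)
```

Adding a `tags` field on the SQL side would require changing the table definition first, whereas the document side accepts it without any schema change. That flexibility is the reason loosely structured, fast-changing data tends to be a better fit for NoSQL.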
What is Hadoop?
Open-source Java framework that implements Map-Reduce (it also includes HDFS, a distributed file system)
What is Map-Reduce?
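Programming model / framework to distribute and parallelize processing tasks across multiple computers: a map step processes each piece of the data independently, and a reduce step combines the intermediate results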
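The idea can be sketched with the classic word-count example. This is a single-process illustration, not how Hadoop itself is written: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and in a real cluster each chunk would be mapped on a different machine.

```python
from collections import defaultdict
from functools import reduce


def map_phase(chunk):
    """Map: turn one chunk of text into (word, 1) pairs, independently of other chunks."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key so each key can be reduced on its own."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(chunks):
    # Each map_phase call could run on a separate computer; here they run in sequence.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


chunks = ["big data big", "data science"]  # stand-ins for file blocks on different nodes
counts = word_count(chunks)
print(counts)
```

Because every `map_phase` call only looks at its own chunk, and every `reduce_phase` call only looks at one key, both steps can be spread across many machines without the workers needing to talk to each other.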
What is Apache Spark?
Open-source engine for distributed data processing, often described as the “successor” of Hadoop Map-Reduce
Name the advantages of Spark over Hadoop
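- provides in-memory processing (intermediate results are kept in RAM instead of being written to disk between steps)
- often much faster than Hadoop Map-Reduce, especially for iterative jobs
-> suitable for streaming (near) real-time data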
Why SaaS?
- pay as you go
- scale up/down
- low maintenance
- performance, better infrastructure
(disadvantage: data privacy)