data Flashcards
Scaling
Increasing or decreasing the number of resources (such as machines) to match the workload.
Queue decoupling
Decoupling components removes any implementation dependencies between them. Benefits include independent releases, streamlined and faster development, and improved testability of computing components.
node and instance
A machine unit, such as a web server, that processes data in some way.
vertical scaling
Take one machine and give it more resources, such as additional RAM or storage.
immutable
Data that will not change once it has been created.
MapReduce
A programming model used for batch analysis in a wide range of applications, including web analytics, networking, e-commerce, and finance.
MapReduce - Map
The input to the map phase is in key-value pairs. The phase has two steps: splitting the input, then mapping each split to intermediate key-value pairs.
MapReduce - Reduce
The output of the reduce phase is in key-value pairs. The phase shuffles and sorts the intermediate pairs by key, then the reducer combines the key-value pairs that share a key.
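The map and reduce phases can be sketched in a few lines of plain Python. This is only an illustration of the model on a single machine, assuming a word-count task; the function names (map_phase, shuffle_and_sort, reduce_phase, word_count) are hypothetical and do not come from any MapReduce framework.

```python
from collections import defaultdict


def map_phase(documents):
    """Map: split each document into words and emit (word, 1) pairs."""
    pairs = []
    for doc in documents:
        for word in doc.lower().split():
            pairs.append((word, 1))
    return pairs


def shuffle_and_sort(pairs):
    """Shuffle & sort: group values by key and order the groups by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_phase(grouped):
    """Reduce: combine the values for each key into one (key, total) pair."""
    return {key: sum(values) for key, values in grouped}


def word_count(documents):
    """Run the three steps in order on a list of text documents."""
    return reduce_phase(shuffle_and_sort(map_phase(documents)))


if __name__ == "__main__":
    print(word_count(["big data is big", "data is data"]))
```

In a real cluster the map and reduce steps would run in parallel on many nodes; here they run one after another so that the flow of key-value pairs is easy to follow.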
What are the four types of analytics?
- Descriptive Analytics
- Diagnostic Analytics
- Predictive Analytics
- Prescriptive Analytics
Descriptive Analytics
What has happened? Example: What is the average number of visitors to a website in a day?
Diagnostic Analytics
Why did it happen? Example: What is the reason that this patient's heart failed at exactly 12:03 PM?
Predictive Analytics
What is likely to happen? Example: When will the stock prices for Amazon begin to go down again?
Prescriptive Analytics
What can we do to make it happen? Example: What is the best route to drive to Alderwood Mall at 5:00 PM?
What is Big Data?
Collections of datasets whose volume, velocity, and variety are so large that it is difficult to store, manage, process, and analyze the data using traditional databases and data processing tools.
How BIG is Big Data?
About 2.5 quintillion bytes of data are created every day.
5 Characteristics of Big Data
- Volume
- Velocity
- Variety
- Veracity
- Value
Volume
How much data there is.
Velocity
How fast the data is generated and has to be processed.
Variety
The different forms the data takes: structured, unstructured, and semi-structured.