Infrastructure & Architecture Flashcards
scale-up (vertical scaling)
upgrade an existing machine with more CPU, RAM, or storage (e.g. SMP, MPP)
scale-out (horizontal scaling)
adding more machines to the network; capacity grows with each node, in principle without a fixed upper limit
Symmetric Multi-processing (SMP)
traditional PC architecture: multiple processors share the same RAM and storage
Massively Parallel Processing (MPP)
each processor has its own dedicated RAM and storage; typically proprietary hardware, so there is a risk of vendor lock-in
Cluster Architecture
many computers connected to work as a single system.
How are nodes connected in a cluster?
usually Gigabit Ethernet, with 8-64 nodes per rack
Only advantage of MPP over a cluster
faster message passing between nodes; ideal for single-purpose, vertical solutions such as data warehousing
commodity hardware
standardized, market-priced hardware
distributed computing
tasks split into smaller units processed simultaneously across machines
challenges of distributed computing (4) (ARFA)
Assigning split tasks, Resource allocation, Fault tolerance, Aggregating results
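The four ARFA challenges can be sketched on a single machine. This is a minimal illustration only: the "workers" are simulated, and `split`, `run_on_worker`, `distributed_sum`, the round-robin allocation, and the retry policy are all hypothetical names and choices, not part of any real framework.

```python
import random

def split(data, n_chunks):
    """Assigning: cut the input into roughly equal chunks, one per task."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]

def run_on_worker(worker_id, chunk, fail_rate, rng):
    """Simulated remote worker; fails at random to mimic a node crash."""
    if rng.random() < fail_rate:
        raise RuntimeError(f"worker {worker_id} crashed")
    return sum(chunk)

def distributed_sum(data, n_workers=4, fail_rate=0.3, max_retries=50, seed=0):
    rng = random.Random(seed)
    chunks = split(data, n_workers)              # Assigning split tasks
    partials = []
    for task_id, chunk in enumerate(chunks):
        worker = task_id % n_workers             # Resource allocation (round-robin)
        for _ in range(max_retries):             # Fault tolerance (retry elsewhere)
            try:
                partials.append(run_on_worker(worker, chunk, fail_rate, rng))
                break
            except RuntimeError:
                worker = (worker + 1) % n_workers
        else:
            raise RuntimeError(f"task {task_id} failed after {max_retries} retries")
    return sum(partials)                         # Aggregating results
```

Calling `distributed_sum(list(range(100)))` returns the same total as `sum(range(100))`, even though some simulated workers fail along the way.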
solution to challenges of distributed computing
Use a framework to hide complexity of distributed computing from developers
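To show what "hiding complexity" looks like, here is a small sketch that uses Python's standard `concurrent.futures` as a single-machine stand-in for a distributed framework. The developer writes only the per-item function; splitting, scheduling, and collecting results are handled by the pool. The function names `word_count` and `total_words` are illustrative assumptions, and a real distributed framework would run the work across machines rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor

def word_count(line):
    """Per-item work: the only logic the developer has to write."""
    return len(line.split())

def total_words(lines, max_workers=4):
    # The pool assigns items to workers and returns results in input order;
    # the caller only aggregates them.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(word_count, lines))
```

Usage: `total_words(["big data", "scale out not up"])` counts the words of each line in parallel and sums the counts.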
Grid computing
- Collects computing resources from multiple locations
- Each node can perform a different task
- Multi-purpose
What is HPC?
high-performance computing: pooled compute power for intensive workloads (often GPU-intensive)
What is the NIST reference architecture for?
describing how a Big Data system should be designed
5 Components of NIST Reference Architecture
- Big Data Framework Provider
  - processing
  - storage
  - networking
- Data Provider
- Application Provider
- Data Consumer
- System Orchestrator
  - integrate components
  - meet goals