Chapter 1: Introduction, Motivation & Overview Flashcards
Distributed system: working definition for this course
A distributed system is a system composed of several physically disjoint compute resources interconnected by a network.
Leslie Lamport’s anecdotal remark
- “A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.”
Why build a distributed system?
- A centralized system is simpler in all respects, but:
- It has scalability limitations
- It is a single point of failure
- Availability and redundancy
- Many resources are inherently distributed
- Many resources used in a shared fashion
Client‐server model: Examples
- Clients and servers are often separated by a network, but may also be running on the same machine
- Clients initiate requests; the server awaits client requests
- Example servers: Web server, database server, FTP server, name server, print server, mail server, file server, compute server (software servers vs. physical server nodes)
- Example clients: Web browser, email clients, chat clients
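The request/response roles above can be sketched with a TCP socket on the loopback interface. This is a minimal illustration, not part of the course material: the helper names `serve_once` and `send_request` and the "echo" reply format are assumptions made for the example.

```python
import socket
import threading

def serve_once(server_sock):
    """Server side: await one client request, then answer it."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"echo: " + data)

def send_request(port, payload):
    """Client side: initiate a request and return the server's reply."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(payload)
        return client.recv(1024)

# Client and server run on the same machine here, but still talk over a socket.
server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server_sock.listen()
port = server_sock.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(server_sock,))
worker.start()
reply = send_request(port, b"hello")
worker.join()
server_sock.close()
print(reply)
```

A real server would loop on `accept()` and handle requests concurrently, which is where the implementation challenges on the next card come in.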
Client‐server model: Implementation challenges
- Software server architecture
- Authentication, access control, encryption, …
- Concurrent processing of client requests
- Concurrent access to shared resources
DS Challenges
- How to keep server replicas consistent?
- How to detect failures?
- How many failures can a given design tolerate?
- What kind of failures can it tolerate?
- How to recover from failures?
Tiered architecture
- Client - Web Server - DB
- Web server, application server, database server
Multi‐tiered architecture
- Data persistence tier <–> Business logic tier <–> Presentation tier <–> Client
N‐tiered architecture
- Client‐server architecture style
- Requests pass through tiers in a linear fashion
- Logical and physical separation of functions
- Typical function of each tier:
  - Presentation (user interface)
  - Application processing (business logic)
  - Data management (data persistence)
- The 3‐tiered architecture is the one predominantly used
- Fosters flexibility, re‐usability, modularity, separation of concerns
- Tiers can be more independently modified, upgraded and replaced
- Layers vs. tiers: layers are the logical structuring of software; tiers are the physical structure of the infrastructure
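A rough sketch of the three typical functions and the linear request path. All class names and the sample account data are hypothetical. Because everything runs in one process, these are strictly layers; in a tiered deployment each class would sit on its own machine behind a network call.

```python
class DataTier:
    """Data management (data persistence); a dict stands in for a database."""
    def __init__(self):
        self._rows = {1: {"name": "Ada", "balance_cents": 12050}}

    def load_account(self, account_id):
        return self._rows.get(account_id)

class LogicTier:
    """Application processing (business logic); talks only to the data tier."""
    def __init__(self, data):
        self._data = data

    def account_summary(self, account_id):
        row = self._data.load_account(account_id)
        if row is None:
            return None
        return {"name": row["name"], "balance": row["balance_cents"] / 100}

class PresentationTier:
    """Presentation (user interface); talks only to the logic tier."""
    def __init__(self, logic):
        self._logic = logic

    def render(self, account_id):
        summary = self._logic.account_summary(account_id)
        if summary is None:
            return "Account not found"
        return f"{summary['name']}: {summary['balance']:.2f}"

# Requests pass through the tiers in a linear fashion.
ui = PresentationTier(LogicTier(DataTier()))
print(ui.render(1))
print(ui.render(2))
```

Since each class depends only on its immediate neighbour, one tier can be replaced (for example, the dict by a real database) without touching the others.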
DNS: The Domain Name System
- Cornerstone of the Internet (like a phone book)
- Maps domain names to IP addresses
- www.example.com to IP address of host serving this domain
- A distributed database of name servers
- E.g., used by clients (Web/email) to resolve names
- Developed to replace a centralized resolution scheme
- Early example of a distributed system
DNS name resolution
- What is the IP address of some‐webserver.com? Please reply to my IP address.
- Q: Where can I find the IP address of some‐webserver.com?
- A: I don’t know, but the .com namespace should have the answer.
- Q: What is the IP address of some‐webserver.com?
- A: The primary DNS server of some‐webserver.com knows it.
- Q: What is the IP address of some‐webserver.com?
- A: Here is the IP address of some‐webserver.com.
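The question/answer chain above can be simulated with plain dictionaries. This is a toy model, not real DNS: the server names (`root`, `com-namespace`, `primary`) are made up for the example, and 192.0.2.10 is an address from a block reserved for documentation.

```python
# Each hypothetical server maps a name suffix to a referral or a final answer.
SERVERS = {
    "root": {"com": ("referral", "com-namespace")},
    "com-namespace": {"some-webserver.com": ("referral", "primary")},
    "primary": {"some-webserver.com": ("answer", "192.0.2.10")},
}

def ask(server, name):
    """Return the entry for the longest suffix of `name` that `server` knows."""
    zone = SERVERS[server]
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in zone:
            return zone[suffix]
    raise LookupError(f"{server} cannot help with {name}")

def resolve(name):
    """Start at the root and follow referrals until a server answers."""
    server, path = "root", []
    while True:
        path.append(server)
        kind, value = ask(server, name)
        if kind == "answer":
            return value, path
        server = value  # "I don't know, but this server should"

ip, path = resolve("some-webserver.com")
print(ip, path)
```

Each loop iteration corresponds to one Q/A pair: two referrals, then the final answer.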
Web data platform
- A.k.a. key‐value store (NoSQL)
- Emerged around 2004 with Google’s BigTable, Facebook’s Cassandra, Yahoo!’s PNUTS, etc.
- Data model based on keys associated with values
- K/V stores are not new, but scale of deployment and use was unprecedented
- Backs Web properties of major Internet companies
- Meant to manage petabytes of data
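A minimal sketch of a key-value interface spread over several nodes. Hash partitioning is an assumption chosen for illustration; the card does not say how the systems named above place data. `PartitionedKVStore` is a hypothetical class, and each "node" is just an in-memory dict.

```python
import hashlib

class PartitionedKVStore:
    """Toy key-value store; each dict stands in for one storage node."""

    def __init__(self, num_nodes):
        self._nodes = [{} for _ in range(num_nodes)]

    def _node_for(self, key):
        # A stable hash decides which node owns the key (assumed scheme).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._nodes)

    def put(self, key, value):
        self._nodes[self._node_for(key)][key] = value

    def get(self, key, default=None):
        return self._nodes[self._node_for(key)].get(key, default)

    def sizes(self):
        """Number of keys held by each node."""
        return [len(node) for node in self._nodes]

store = PartitionedKVStore(num_nodes=4)
for i in range(100):
    store.put(f"user:{i}", {"visits": i})
print(store.get("user:42"))
print(store.sizes())
```

Lookups touch only the node that owns the key, which is what lets capacity grow by adding nodes rather than by buying a larger machine.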
Summary: Distributed systems examples
- Client‐server model
- Multi‐tiered enterprise architectures
- Cyber‐physical systems
- Power grid and smart power grid
- Cellular networks
- ATM and banking networks
- Large‐scale distributed systems
- Distributed application
Characteristics of distributed systems
- Reliable
- Fault‐tolerant
- Highly available
- Recoverable
- Consistent
- Scalable
- Predictable performance
- Secure
- Heterogeneous
- Open
Reliability
- Probability that a system performs its required functions under stated conditions for a specified period of time
- Ability to run continuously without failure
- Expressed as mean time between failures (MTBF) or failure rate
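A small worked example linking the two measures. It assumes a constant failure rate (the exponential model), which the card itself does not state; under that assumption the failure rate is 1/MTBF and the probability of running for t hours without failure is exp(-t/MTBF). The 10,000-hour MTBF is a made-up figure.

```python
import math

def failure_rate(mtbf_hours):
    """Failures per hour, assuming a constant failure rate."""
    return 1.0 / mtbf_hours

def reliability(t_hours, mtbf_hours):
    """Probability of running t_hours without failure (exponential model)."""
    return math.exp(-t_hours / mtbf_hours)

mtbf = 10_000.0                      # hypothetical MTBF in hours
rate = failure_rate(mtbf)            # 1e-4 failures per hour
one_year = reliability(8_760, mtbf)  # 8,760 hours in a non-leap year
print(f"failure rate: {rate:.1e} per hour")
print(f"reliability over one year: {one_year:.3f}")
```

Under this model, the reliability over a period equal to the MTBF is exp(-1), about 0.37, so a system is more likely than not to have failed by the time its MTBF has elapsed.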