2 - Data Centre Arch & Management Flashcards
Logically, where are datacentres on the internet?
In the core
What do data centres peer with?
ISPs and Internet eXchange Points
Data centres have rows of …
racks
Problems with blade servers in higher density
Require more power
Require better cooling
Require larger bandwidth
Block Storage
Data stored in fixed size blocks.
Several blocks build a file
Each volume can be treated independently
Managed by OS
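Worked example (a minimal sketch, not how any particular OS or volume manager does it): the idea that several fixed-size blocks build a file can be illustrated by splitting a byte string into blocks and joining them again. The 4096-byte block size and the function names `split_into_blocks` and `join_blocks` are assumptions made for illustration.

```python
def split_into_blocks(data: bytes, block_size: int = 4096) -> list[bytes]:
    """Split data into fixed-size blocks, zero-padding the final block."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Every block has the same size, so the last one is padded.
        blocks.append(block.ljust(block_size, b"\x00"))
    return blocks


def join_blocks(blocks: list[bytes], file_length: int) -> bytes:
    """Rebuild the file from its blocks, dropping the padding."""
    return b"".join(blocks)[:file_length]
```

The file length has to be remembered separately, because the blocks themselves carry no information about where the real data ends.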
Object Storage
Data file with corresponding metadata
Each object has a Unique ID
Stored across all disks in array
Managed by application not OS
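Worked example (a sketch under stated assumptions): an object bundles data with its metadata and is looked up by a unique ID instead of a path or block address. The class names `StoredObject` and `ObjectStore` are hypothetical, and this in-memory dictionary does not model spreading objects across the disks of an array.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict
    # Each object gets its own unique ID when it is created.
    object_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class ObjectStore:
    """Toy in-memory store: the application, not the OS, manages objects."""

    def __init__(self) -> None:
        self._objects: dict[str, StoredObject] = {}

    def put(self, data: bytes, metadata: dict) -> str:
        obj = StoredObject(data=data, metadata=dict(metadata))
        self._objects[obj.object_id] = obj
        return obj.object_id

    def get(self, object_id: str) -> StoredObject:
        return self._objects[object_id]
```

Usage: `oid = store.put(b"hello", {"content_type": "text/plain"})` returns the ID, and `store.get(oid)` returns the data together with its metadata.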
DAS
Directly Attached Storage
In which context do hard drives work best?
In terms of data NOT anything else
Sequential accesses (over random)
Due to the mechanical movement of the arm
How much per-server storage is generally used, and for what?
Relatively - not looking for any numbers
Little. Only for boot and swap etc
NAS
Network Attached Storage
SAN
Storage Area Network
What does SAN provide?
Block-oriented storage that resides across a network
What networking tech does SAN traditionally use?
Fibre Channel (SAN Switch)
What are the 4 typical types of network accesses/physical networks in data centres?
Client-server
Server-server
Storage access
Management
InfiniBand
Commonly used in supercomputing as the physical interconnect, and in large data centres as a switching backbone
InfiniBand material
Copper or fibre
InfiniBand bandwidth
1, 10 and 40 Gbps (100 Gbps being introduced)
Routers
Connect networks to the internet
Expensive
Switches
Definition and port density
Interconnect data centre devices.
High port density, 48 ports are common
Middleboxes
Provide additional services between router/switch.
Low port density
Operate on Layer 4-7
In terms of cost, why are hierarchical networks used in data centres?
Equipment higher up the hierarchy carries more traffic, so it is more expensive
High end or low end server:
memory or IO bound apps
Low end
Parallelisation
Splitting a computational task into separate packages
then assigning each package to a node for processing.
The results are then aggregated.
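Worked example (a minimal sketch): the split, assign and aggregate steps can be shown with a sum of squares. Worker threads stand in for cluster nodes here, which is an assumption made to keep the example self-contained. The function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_packages(items: list, n_packages: int) -> list[list]:
    """Split the task into at most n_packages roughly equal packages."""
    if n_packages < 1:
        raise ValueError("n_packages must be at least 1")
    if not items:
        return []
    size = -(-len(items) // n_packages)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_package(package: list) -> int:
    """Work done by one node on its own package."""
    return sum(x * x for x in package)


def parallel_sum_of_squares(items: list, n_nodes: int = 4) -> int:
    packages = split_into_packages(items, n_nodes)
    # Assign each package to a worker (a stand-in for a node).
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partial_results = list(pool.map(process_package, packages))
    # Aggregate the partial results into the final answer.
    return sum(partial_results)
```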
Why might it NOT be preferable to use single core, single processor servers in clusters?
Reducing serialisation and communication overheads becomes increasingly difficult
Load balancing becomes harder - hard to predict response times
Highly parallel programs are hard to write
How many chassis does each rack hold? How many Us per rack?
4 chassis per rack, each can hold 16U.
How does DAS work?
Directly attached to the server using USB, SATA etc.
NAS vs SAN
NAS manages its own storage and is a file/object-level solution, accessible to any client on the network.
SAN provides block-level access to pools of virtualised storage that are accessible to all servers on the SAN.
NAS Downsides
High latency (ethernet)
Low bandwidth (Ethernet)
High Level file access not suitable for some apps (database management)
What is Fibre Channel networking?
Prompt: protocol
High-speed data transfer protocol providing in-order, lossless delivery of raw block data.
What is Fibre Channel used to connect?
Connect storage to servers in storage area networks in commercial data centers
Downsides to FC networking
Need separate infrastructure and specialised network admins (expensive)
Middlebox use examples
Load balancer
NAT
Firewall
Intrusion detection system
Hierarchical Network
What they provide and how they are connected
Provide redundancy, high bandwidth and fault tolerance
Servers connect to top of rack switches (1-10Gbps), these connect to layers of aggregation switches.
Execution time formula
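execution time = 1 ms + f × [100 ns/n + 100 µs × (1 − 1/n)]
(100 ns = local access latency, 100 µs = remote access latency; a fraction 1/n of global accesses is local)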
n= number of nodes
f= the number of global accesses per 1ms work unit
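Worked example (a sketch, with assumed parameters): the formula can be evaluated by weighting a local access latency by 1/n and a remote access latency by (1 − 1/n). The defaults below assume 100 ns for a local access and 100 µs for a remote one. The function name and parameter names are hypothetical, and the latencies can be changed to match whatever figures are being used.

```python
def execution_time_ns(
    f: float,
    n: int,
    work_ns: float = 1_000_000,   # 1 ms work unit
    local_ns: float = 100,        # assumed local access latency
    remote_ns: float = 100_000,   # assumed remote access latency (100 µs)
) -> float:
    """Execution time of one work unit, in nanoseconds.

    f: number of global accesses per 1 ms work unit
    n: number of nodes; 1/n of the accesses are assumed to be local
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    local_fraction = 1 / n
    per_access_ns = local_ns * local_fraction + remote_ns * (1 - local_fraction)
    return work_ns + f * per_access_ns
```

With n = 1 every access is local, so only the local latency contributes. As n grows, almost every access becomes remote and the communication term dominates for large f.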
Advantages of low end servers
Cheaper per server
Higher memory bandwidth/IO-to-compute ratio (better for IO-bound applications)
Why are data centres ideal for parallelisation?
Large collection of affordable servers & storage
Large high speed data interconnects between servers
Challenges of Data center management
Managing and provisioning resources
Managing and detecting faults
Programming
Debugging
In data center management, what do you need an operating system for?
Resource management, utilisation and health monitoring
Deployment and maintenance
Programming framework support
Data center management operating system
OpenStack