Datacenters Flashcards
how can workloads be measured in the network? 3 ways, with downsides of each?
- Log everything all the time?
- Generally expensive and infeasible
- Log with sampling?
- Samples may miss events of interest
- Replay with logging turned on?
- Completely overlooks “Heisenbugs”
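To make the sampling downside concrete, here is a small sketch (not from the cards) of how likely a sampled log is to miss an event entirely. It assumes each occurrence is sampled independently with the same probability; the function name and the numbers are made up for illustration.

```python
def miss_probability(sample_rate: float, occurrences: int) -> float:
    """Probability that none of `occurrences` independent events gets sampled."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be in [0, 1]")
    return (1.0 - sample_rate) ** occurrences


# Illustrative numbers only: 1-in-1000 sampling, event happens 100 times.
print(f"{miss_probability(0.001, 100):.2f}")  # about 0.90, so it is usually missed
```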
what does a rack look like
a ToR (top-of-rack) switch is connected to the rack and acts as the entry point to the servers on the rack. The ToR switch is connected to other fabric switches
diff ways to connect racks? 4
big switch sux cause single point of failure/congestion
lots of big switches increases complexity
tree yay but back to single point at the top
fat tree yay
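built entirely from k-port switches: k pods, each with k/2 edge and k/2 aggregation switches, each edge switch connected to k/2 machines, plus (k/2)^2 core switches on top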
low cost because of commodity switches
increased throughput between racks: full bisection bandwidth, so any distinct pair of hosts can talk at full line rate
redundant connections
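A quick sketch of how big a fat tree gets for a given switch port count. It assumes the standard 3-tier k-ary layout (k pods, k/2 edge and k/2 aggregation switches per pod, k/2 hosts per edge switch, (k/2)^2 core switches); the function name is made up for this card.

```python
def fat_tree_sizes(k: int) -> dict:
    """Element counts for a 3-tier fat tree built from k-port switches."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even number >= 2")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,         # k/2 per pod
        "aggregation_switches": k * half,  # k/2 per pod
        "core_switches": half * half,      # (k/2)^2
        "hosts": k * half * half,          # k^3 / 4
    }


print(fat_tree_sizes(4)["hosts"])   # 16
print(fat_tree_sizes(48)["hosts"])  # 27648
```

So commodity 48-port switches are already enough for tens of thousands of hosts.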
what is bisection bandwidth
In computer networking, if the network is bisected into two partitions, the bisection bandwidth of a network topology is the bandwidth available between the two partitions.[1] Bisection should be done in such a way that the bandwidth between two partitions is minimum.[2] Bisection bandwidth gives the true bandwidth available in the entire system. Bisection bandwidth accounts for the bottleneck bandwidth of the entire network. Therefore bisection bandwidth represents bandwidth characteristics of the network better than any other metric.
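The definition above can be checked by brute force on a tiny topology: try every split into two equal halves and keep the smallest total capacity crossing the split. This is only a sketch for small graphs (the number of splits grows exponentially), and it treats every node the same rather than separating hosts from switches; names and link capacities are made up.

```python
from itertools import combinations


def bisection_bandwidth(nodes, links):
    """Brute-force bisection bandwidth of a small graph.

    nodes: iterable of node names (even count)
    links: list of (u, v, capacity) tuples, treated as undirected
    """
    nodes = list(nodes)
    if len(nodes) < 2 or len(nodes) % 2 != 0:
        raise ValueError("need an even number of nodes (at least 2)")
    half = len(nodes) // 2
    first, others = nodes[0], nodes[1:]
    best = None
    # Pin the first node to side A so each split is only counted once.
    for rest in combinations(others, half - 1):
        side_a = {first, *rest}
        cut = sum(cap for u, v, cap in links if (u in side_a) != (v in side_a))
        if best is None or cut < best:
            best = cut
    return best


ring = [("A", "B", 10), ("B", "C", 10), ("C", "D", 10), ("D", "A", 10)]
chain = ring[:3]  # same as the ring without the D-A link
print(bisection_bandwidth("ABCD", ring))   # 20
print(bisection_bandwidth("ABCD", chain))  # 10
```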
how might u calculate throughput upperbound
throughput per flow ≤ total capacity / (# of flows × mean path length)
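A tiny worked example of the bound. The topology numbers (48 links of 10 Gbps, 16 flows, 4 hops on average) are made up purely for illustration, and the function name is hypothetical.

```python
def throughput_upper_bound(total_capacity: float, num_flows: int,
                           mean_path_length: float) -> float:
    """Upper bound on per-flow throughput, in the same unit as total_capacity."""
    if num_flows <= 0 or mean_path_length <= 0:
        raise ValueError("num_flows and mean_path_length must be positive")
    # Each flow consumes capacity on every link along its path, so the
    # total demand is (# of flows * mean path length) link-units.
    return total_capacity / (num_flows * mean_path_length)


total = 48 * 10  # assumed: 48 links at 10 Gbps each
print(throughput_upper_bound(total, num_flows=16, mean_path_length=4))  # 7.5
```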
how does tcp REALLY look
Congestion window (cwnd) is the number of in-flight bytes awaiting ACK that the sender allows. The sender is also limited by rwnd (dictated by flow control), so the effective window is min(cwnd, rwnd). Init value of cwnd is 10 MSS
slow start until timeout
on timeout, SSTHRESH = 1/2 CW
slow start again UNTIL CW > SSTHRESH
at which point it just becomes linear increase
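The steps above can be sketched as a round-by-round simulation (one step per RTT, cwnd in units of MSS). Assumptions not on the card: ssthresh starts out infinite, cwnd restarts at 1 MSS after a timeout, and ssthresh never drops below 2 MSS. The function and parameter names are made up.

```python
def simulate_cwnd(rounds, timeout_rounds, init_cwnd=10):
    """Return cwnd (in MSS) at the start of each RTT round."""
    cwnd = init_cwnd
    ssthresh = float("inf")  # assumption: no threshold until the first timeout
    history = []
    for r in range(rounds):
        history.append(cwnd)
        if r in timeout_rounds:
            ssthresh = max(cwnd // 2, 2)  # SSTHRESH = 1/2 CW
            cwnd = 1                      # assumption: restart slow start from 1 MSS
        elif cwnd < ssthresh:
            cwnd *= 2                     # slow start: double every RTT
        else:
            cwnd += 1                     # linear increase: +1 MSS per RTT
    return history


print(simulate_cwnd(14, timeout_rounds={4}))
# [10, 20, 40, 80, 160, 1, 2, 4, 8, 16, 32, 64, 128, 129]
```

The timeout hits at cwnd = 160, so SSTHRESH becomes 80; slow start then doubles back up past 80 and growth turns linear.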
what is the tcp incast problem. preconditions for it?
a client requests data from lots of servers over tcp all at the same time; the servers all reply at once (many to one) through one switch port, which can overflow the switch buffers - causes drastic reduction in throughput
RTT in the data centre is << RTO, so after a timeout the application may be idle for a relatively long time
Preconditions for TCP Incast
• High-bandwidth, low-latency networks
• with small switch buffers (as it should be)
• Concurrent barrier-synchronized requests
• nothing happens until all responses received
• Servers returning a relatively small amount of data per request
Imbalance between low link latency (µs) and RTO (ms)
how can we solve the tcp incast problem
modify the tcp used to remove the RTO minimum
BUT this isn’t good enough by itself, it’ll still lead to a drop in throughput because the datacentre’s RTTs are <<< TCP’s clock granularity
so we use a microsecond RTO rather than a millisecond one
then we gucci
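A rough toy model of why the RTO size matters so much: if every barrier-synchronized round loses a packet and has to wait out one RTO before finishing, the link is only busy for a small fraction of the time. The 1 ms transfer time and 200 µs RTO are made-up illustration values; 200 ms is a common default minimum RTO. The function name is hypothetical.

```python
def goodput_fraction(transfer_time_s: float, rto_s: float) -> float:
    """Fraction of wall-clock time spent actually sending data,
    assuming (toy model) exactly one timeout per round."""
    if transfer_time_s <= 0 or rto_s < 0:
        raise ValueError("transfer_time_s must be > 0 and rto_s >= 0")
    return transfer_time_s / (transfer_time_s + rto_s)


transfer = 0.001  # assumed: each round needs 1 ms of real sending
print(f"{goodput_fraction(transfer, 0.200):.1%}")   # millisecond-scale RTO: 0.5%
print(f"{goodput_fraction(transfer, 0.0002):.1%}")  # microsecond-scale RTO: 83.3%
```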
how do mice vs elephants differ, and how do they affect DCTCP
mice are < 1MB (micebyte :3)
- (query, control state, advertising/bidding) etc
- delay sensitive!
- Large ‘Elephant’ flows:
- 1MB → 100s of MB
- (backups, updates)
- throughput-sensitive flows
most flows are mice, but most bytes come from elephants
what is ECN
explicit congestion notification
ECN allows end-to-end notification of network congestion without dropping packets
an ECN-aware router may set a mark in the IP header instead of dropping a packet in order to signal impending congestion. The receiver of the packet echoes the congestion indication to the sender, which reduces its transmission rate as if it detected a dropped packet.
Two key ideas of DCTCP
- React in proportion to the extent of congestion, not its presence.
- Mark based on instantaneous queue length.
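The two ideas can be sketched with the usual DCTCP update rules, which are not spelled out on the card: the switch marks a packet when the instantaneous queue exceeds a threshold K, the sender keeps a running estimate alpha of the fraction of marked packets, and it cuts cwnd by alpha/2 instead of always halving. The gain g = 1/16 is a commonly recommended value; function names and example numbers are made up.

```python
def switch_marks(queue_len_pkts: int, threshold_k: int) -> bool:
    """Mark based on the instantaneous queue length (no averaging)."""
    return queue_len_pkts > threshold_k


def update_alpha(alpha: float, marked: int, total: int, g: float = 1 / 16) -> float:
    """Once per window: alpha <- (1 - g) * alpha + g * F,
    where F is the fraction of packets that came back marked."""
    if total <= 0:
        raise ValueError("total must be positive")
    frac = marked / total
    return (1 - g) * alpha + g * frac


def reduce_cwnd(cwnd: float, alpha: float) -> float:
    """React in proportion to the extent of congestion."""
    return cwnd * (1 - alpha / 2)


print(reduce_cwnd(100, 1.0))  # 50.0  (everything marked: same as a plain TCP halving)
print(reduce_cwnd(100, 0.1))  # 95.0  (light congestion: small cut)
```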