Test 2 Prep Qs and L6 Flashcards
What causes congestion collapse to occur?
Congestion collapse occurs when the dropped packets and excessive queuing delays that result
from congestion in turn exacerbate the problem, causing more drops and delays,
and so on.
Dropped packets cause retransmissions that add additional traffic to the congested
path, while excessive delays can cause spurious retransmissions (i.e., a timeout occurs when
the packet was merely delayed, not lost).
Note that normal traffic contributing to
congestion is not the cause of collapse; it is the extra traffic caused by congestion that
leads to collapse.
What is the difference between fairness and efficiency in a congestion control scheme?
Efficiency is how much of the available bandwidth is used, i.e., efficient congestion control
leaves little or no bandwidth wasted. (Some definitions of efficiency may refer specifically to
bandwidth used to do “productive work”, thus excluding overhead traffic.)
Fairness is how bandwidth is allocated among the different flows. Two common definitions of fairness are that all flows get equal throughput, or that all flows get throughput proportional to their demand
(i.e., how much they want to send).
Assuming traditional TCP Reno with AIMD behavior (i.e., the version presented in the
lecture videos), suppose a TCP flow’s bottleneck link has 1 Gbps capacity, and that link is not
being shared with any other flows. What will the average throughput of that flow be, in
megabits per second (Mbps)?
Additive increase will increase the throughput until it equals the bandwidth, at which point a
packet loss will occur and trigger multiplicative decrease. At that point, throughput immediately
drops to ½ the bandwidth. Additive increase then resumes, raising throughput linearly until it
reaches the total bandwidth again. Thus the average throughput is the average of ½ bandwidth
and 1x bandwidth = ¾ bandwidth.
Therefore, the average throughput on a 1 Gbps link will be ¾
x 1 Gbps = 750 Mbps. (A more detailed approach may look at the area beneath the throughput
curve, but this results in the same math since the additive increase is linear.)
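A quick numerical check of this reasoning, as a minimal Python sketch (the per-RTT increment and the sampling granularity are assumptions; only the ½-to-1 sawtooth shape matters):

```python
# A minimal sketch (hypothetical step size): simulate the AIMD sawtooth on an
# unshared 1 Gbps bottleneck and confirm the 3/4-of-capacity average.
CAPACITY_MBPS = 1000.0
STEP_MBPS = 1.0                  # additive increase per RTT (assumed)

throughput = CAPACITY_MBPS / 2   # start just after a multiplicative decrease
samples = []
for _ in range(100_000):         # many RTTs
    samples.append(throughput)
    throughput += STEP_MBPS
    if throughput >= CAPACITY_MBPS:    # loss: multiplicative decrease
        throughput = CAPACITY_MBPS / 2

print(sum(samples) / len(samples))     # ~750 Mbps, i.e. 3/4 of capacity
```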
What circumstances lead to the incast problem? (In other words, what factors must be
present for incast to occur?)
The incast problem occurs when collective communication (i.e., many-to-one or many-to-many
patterns) takes place across high fan-in switches. This results in many small packets arriving at the switch
at the same time, causing some of the packets to be lost. The last necessary factor is a
low-latency network, which means the timeout delay will be much greater than the
round-trip time of the network. Consequently, large delays occur in which the system is simply
waiting for the timeouts to occur. This slows the whole application, since hearing from all the
senders in collective communication is usually necessary before the application can proceed. As
a real-world example, suppose a web app has to query a back-end database and needs to check
with 100 database nodes to do this. It needs to hear back from all 100 nodes before
proceeding, or else it risks missing some of the results. (This is the implicit “barrier” that occurs
in some data center applications that are not explicitly using barrier synchronization.) Because
they are all responding to the same query, all the nodes will reply at roughly the same time.
This means a high fan-in switch will have to handle many of these database replies at the same
time. Such traffic bursts may cause only a few of these packets to be lost, while the rest are
delivered. However, the application still cannot proceed until it receives replies from these few,
so it waits. After a significant delay, retransmissions finally occur and may be delivered,
allowing the application to proceed.
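A minimal sketch of this timing argument in Python (all constants are assumed, illustrative values, not measurements):

```python
# N servers answer one query at the same instant through a switch port that
# can buffer only B packets. Anything beyond the buffer is dropped and is
# recovered only after a retransmission timeout (RTO) far larger than the RTT.
N_SENDERS = 100      # database nodes replying at once
BUFFER_PKTS = 64     # shallow switch buffer (assumed)
RTT_MS = 0.1         # low-latency data center network
RTO_MS = 200.0       # minimum retransmission timeout, >> RTT

delivered_first_round = min(N_SENDERS, BUFFER_PKTS)
dropped = N_SENDERS - delivered_first_round

# Without drops the query would finish in ~1 RTT; with even a few drops the
# application stalls until the RTO fires and the retransmissions arrive.
completion_ms = RTT_MS if dropped == 0 else RTO_MS + RTT_MS
print(f"{dropped} replies dropped; query completes in ~{completion_ms} ms")
```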
fairness
Fairness is how
bandwidth is allocated among the different flows.
efficiency
Efficiency is how much of the available bandwidth is used, i.e., efficient congestion control
leaves little or no bandwidth wasted. (Some definitions of efficiency may refer specifically to
bandwidth used to do “productive work”, thus excluding overhead traffic.)
collective communication
many-to-one or many-to-many patterns
Factors for Incast Problem
Collective communication occurs on high fan-in switches
Many small packets arrive at switch at same time (packet loss)
Low-latency network (timeout delay much more than RTT of network)
Consequence: large delays in which the system simply waits for timeouts to occur
Suppose you are working on some live video call software (think Skype or Google
Hangouts) and decide to build your application protocol on top of UDP (rather than TCP). Give
as many different points as you can (minimum two) to help justify that decision.
● Latency is critical – retransmissions are pointless since they will arrive too late anyway
● Dropped frames aren’t a big deal – the next frame will advance the video state before a
retransmitted frame could arrive anyway
● Congestion control and flow control could cause unacceptable delays, as video frames
get backed up in the sender host’s local buffer (what is needed instead is for the
application itself to reduce the frame rate that it tries to send)
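As a rough illustration of the first two points, a minimal Python sketch of a sender that pushes frames over UDP and never retransmits (the address, port, frame size, and header format are all assumptions):

```python
# Send each encoded frame in a UDP datagram tagged with a sequence number;
# the receiver plays the newest frame it has and never requests a resend,
# since a retransmitted frame would arrive after the video has moved on.
import socket
import struct
import time

DEST = ("127.0.0.1", 9999)   # placeholder receiver address
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

def send_frame(seq: int, frame: bytes) -> None:
    # Prepend a sequence number so the receiver can discard stale frames.
    sock.sendto(struct.pack("!I", seq) + frame, DEST)

for seq in range(30):                 # one second of 30 fps video
    send_frame(seq, b"\x00" * 1000)   # placeholder encoded frame
    time.sleep(1 / 30)                # the app itself paces the frame rate
```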
Why does the linear growth rate of TCP-RENO (1/RTT) perform poorly for short-lived flows
in networks with large bandwidth-delay products?
The time required for the congestion window to reach its maximum value is very large (on the order of minutes to hours) for TCP-RENO on paths with large bandwidth-delay products. Short-lived flows may never reach a congestion event, meaning the flow transmits more slowly than necessary over its entire lifetime to avoid congestion.
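A worked example, as a minimal Python sketch with assumed (but representative) numbers:

```python
# How long Reno's additive increase (+1 segment per RTT) takes to climb from
# W/2 back to the full window W on a fast, long-delay path.
LINK_BPS = 10e9    # 10 Gbps link (assumed)
RTT_S = 0.100      # 100 ms round-trip time (assumed)
SEG_BYTES = 1250   # segment size (assumed)

bdp_segments = LINK_BPS * RTT_S / (SEG_BYTES * 8)   # full window W
rtts_to_recover = bdp_segments / 2                  # +1 segment per RTT
print(f"W = {bdp_segments:.0f} segments")
print(f"recovery takes {rtts_to_recover * RTT_S / 60:.0f} minutes")
# W = 100000 segments; recovery takes ~83 minutes -- far longer than the
# lifetime of a short flow.
```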
Describe the operation of BIC-TCP’s binary search algorithm for setting the congestion
window. Once stable, how does BIC-TCP react to changes in available bandwidth, i.e. what
happens when there is a sudden increase or decrease in available bandwidth?
At a high level, when BIC-TCP experiences a packet loss event, the congestion window is
set to the midpoint between the window size at which the loss occurred (WMAX) and the
reduced window size that was loss-free for at least one RTT (WMIN). This is often referred to as
a binary search, as it follows intuitively that the maximum possible stable window value lies
somewhere between a value known to be stable and the value reached just prior to
the loss event. The algorithm “searches” for this maximum stable window value by effectively halving the range of possible values each round trip that passes without loss.
Once this maximum stable window size has been reached, if there is a sudden increase in
available bandwidth, the max probing phase of BIC-TCP will rapidly increase the window beyond the value of WMAX until another loss event occurs, which resets the value of WMAX. If a sudden decrease in available bandwidth occurs, and a loss occurs below the value of WMAX,
then the window size is reduced by a multiplicative factor (β), enabling a safe reaction to the
lower saturation point.
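A minimal Python sketch of the binary-search idea (a simplification, not the kernel implementation; the constants and function names are assumptions):

```python
# One step of BIC-TCP's binary search per loss-free RTT, plus the reduction
# applied on a loss event.
BETA = 0.125     # multiplicative decrease factor (assumed value)
S_MAX = 32.0     # cap on how far the window may jump in a single RTT

def on_loss(cwnd):
    """Loss at cwnd: remember it as w_max, reduce the window to w_min."""
    w_max = cwnd
    w_min = cwnd * (1 - BETA)
    return w_min, w_min, w_max          # (new cwnd, w_min, w_max)

def on_rtt_no_loss(cwnd, w_min, w_max):
    """A loss-free RTT means the current window is safe: it becomes w_min,
    and the window jumps toward the midpoint of [w_min, w_max]."""
    w_min = cwnd
    midpoint = (w_min + w_max) / 2
    cwnd = min(midpoint, cwnd + S_MAX)  # clamp large jumps (additive phase)
    return cwnd, w_min, w_max

cwnd, w_min, w_max = on_loss(100.0)
for _ in range(5):
    cwnd, w_min, w_max = on_rtt_no_loss(cwnd, w_min, w_max)
    print(f"cwnd={cwnd:.2f}, searching in [{w_min:.2f}, {w_max:.2f}]")
```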
BIC-TCP binary search
At a high level, when BIC-TCP experiences a packet loss event, the congestion window is
set to the midpoint between the window size at which the loss occurred (WMAX) and the
reduced window size that was loss-free for at least one RTT (WMIN)
the maximum possible stable window value lies
somewhere between a value known to be stable and the value reached just prior to
the loss event
How does the replacement of this congestion control algorithm with a cubic growth function in CUBIC-TCP improve on BIC-TCP? Discuss.
CUBIC retains the strengths of BIC-TCP, but makes many improvements. First, BIC-TCP is a
rather complex algorithm that approximates a cubic function. Its growth function has both
linear and logarithmic elements, and many different phases (additive increase, binary search,
max probing). Additionally, on short RTT and low speed networks, BIC-TCP’s growth function
can be too aggressive (recall it was designed to achieve high utilization on large bandwidth,
long RTT networks), making it fairly unfriendly to other TCP flows competing for bandwidth.
CUBIC replaces the growth function in BIC-TCP with a cubic growth function based on the
elapsed time since the last congestion event. CUBIC maintains the multiplicative decrease
used by many TCP variants, but records the window size at a congestion event as WMAX.
Using this value of WMAX, the cubic growth function can be restarted, with the plateau
occurring at WMAX. This eliminates the need for multiple growth phases and for maintaining values like SMAX/MIN. The plateau of the cubic growth function retains BIC-TCP’s stability and utilization strengths.
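A minimal Python sketch of the cubic growth function, following the form given in RFC 8312: W(t) = C·(t − K)^3 + WMAX, with K = cuberoot(WMAX·(1 − β)/C). The constants below are the commonly cited defaults, not values from the lecture:

```python
# CUBIC window as a function of time since the last congestion event.
C = 0.4       # scaling constant (assumed default)
BETA = 0.7    # multiplicative decrease: window drops to BETA * w_max

def cubic_window(t: float, w_max: float) -> float:
    """Congestion window t seconds after the last congestion event."""
    k = (w_max * (1 - BETA) / C) ** (1 / 3)   # time at which W(t) = w_max
    return C * (t - k) ** 3 + w_max

w_max = 100.0
for t in [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]:
    print(f"t={t:.0f}s  W={cubic_window(t, w_max):.1f}")
# Output shows the concave climb toward w_max, the plateau near t = K
# (~4.2 s here), and the convex max-probing growth beyond it.
```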
BIC-TCP
First, BIC-TCP is a
rather complex algorithm that approximates a cubic function. Its growth function has both
linear and logarithmic elements, and many different phases (additive increase, binary search,
max probing). Additionally, on short RTT and low speed networks, BIC-TCP’s growth function
can be too aggressive (recall it was designed to achieve high utilization on large bandwidth,
long RTT networks), making it fairly unfriendly to other TCP flows competing for bandwidth.
CUBIC
CUBIC replaces the growth function in BIC-TCP with a cubic growth function based on the
elapsed time since the last congestion event. CUBIC maintains the multiplicative decrease
used by many TCP variants, but records the window size at a congestion event as WMAX.
Using this value of WMAX, the cubic growth function can be restarted, with the plateau
occurring at WMAX. This eliminates the need for multiple growth phases and for maintaining values like SMAX/MIN. The plateau of the cubic growth function retains BIC-TCP’s stability and
utilization strengths.
CUBIC growth function
Concave region
The concave region of CUBIC’s growth function rapidly increases the congestion window
to the previous value where a congestion event occurred, allowing for a quick recovery
and high utilization of available bandwidth following a congestion event.
CUBIC growth function
Convex region
The convex region of CUBIC’s growth function exists to rapidly converge on a new value
of WMAX following a change in available bandwidth. When the congestion window
exceeds WMAX and continues to increase past the end of the plateau, it likely
indicates that some competing flows have terminated and more bandwidth is available. This
is considered a max probing phase, as the congestion window grows increasingly quickly in
this region until another congestion event occurs and WMAX is reset.
CUBIC growth function
Plateau region
The plateau is also known as the TCP friendly region. In this region of the growth curve,
the congestion window is nearly constant as it approaches and potentially exceeds
WMAX. This achieves stability, as WMAX represents the point where network utilization
is at its highest under steady state conditions.
How does CUBIC’s fast convergence mechanism detect a reduction in available bandwidth
(i.e. a new flow competing for bandwidth)?
When new flows start competing for bandwidth, other flows must release some bandwidth to
maintain fairness. CUBIC employs the fast convergence mechanism to accomplish this. When
two successive congestion events indicate a reduction in available bandwidth (i.e., a reduced
value of WMAX), the new value of WMAX is further reduced (based on the multiplicative
decrease factor used for resetting the congestion window) to free up additional bandwidth and
reduce the number of congestion events required for all flows to converge on a fair distribution
of bandwidth.
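A minimal Python sketch of the fast convergence rule as RFC 8312 describes it (the function and variable names are mine):

```python
# If the window at this loss is below the previous W_max, the flow is losing
# ground to a new competitor, so it releases extra room instead of merely
# remembering the loss point.
BETA = 0.7   # multiplicative decrease factor (assumed default)

def on_congestion_event(cwnd: float, prev_w_max: float) -> float:
    """Return the new W_max recorded after a loss at window `cwnd`."""
    if cwnd < prev_w_max:
        # Bandwidth shrank: set W_max below the loss point to free capacity.
        return cwnd * (1 + BETA) / 2
    return cwnd   # bandwidth steady or growing: normal reset

print(on_congestion_event(80.0, prev_w_max=100.0))   # 68.0: extra release
print(on_congestion_event(120.0, prev_w_max=100.0))  # 120.0: normal reset
```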
What kinds of web traffic stand to benefit most from utilizing the TFO option? How does
TFO improve the performance of these flows?
Short-lived TCP connections (small data sizes) on links with large propagation delays. The
performance of these flows is dominated by the round-trip time (RTT), and as such, the
three-way handshake (3WHS) used in standard TCP constitutes a large amount of overhead. By enabling the client
and server to exchange some of the payload (data) during the 3WHS, it is possible to
reduce the number of RTTs required for the flow to complete, reducing the RTT penalty
incurred by the 3WHS.
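For a concrete picture, a minimal Linux-only Python sketch of a TFO client: sendto() with MSG_FASTOPEN lets the kernel place the request data in the SYN itself once a cookie for this server is cached (the address and request are placeholders):

```python
# TFO client: the request rides in the SYN, saving one full RTT on a short
# flow. On first contact the kernel instead requests a cookie and the data
# follows the handshake as usual.
import socket

request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Note: no connect() call -- sendto() with MSG_FASTOPEN both opens the
# connection and queues `request` for the SYN.
sock.sendto(request, socket.MSG_FASTOPEN, ("198.51.100.1", 80))  # placeholder
reply = sock.recv(4096)   # response to the data carried in the SYN
sock.close()
```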
Describe how a trivial implementation of TCP Fast Open (in which the server replies to
all HTTP GET requests with a TCP SYN-ACK packet with data attached) can be exploited to
mount a source-address spoofing attack. How does TFO prevent this?
An attacker can send many HTTP GET requests for large resources to a victim server, spoofing a
victim host address as the requestor. The victim server would then perform the expensive data
fetch operations and transmit large volumes of data to a victim host. The result is a Denial of
Service attack on both victims.
TFO prevents this by using an encrypted cookie that the requester must obtain from the server before
initiating requests. The server uses this cookie to verify that the requester’s address is not a
forgery.
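A minimal sketch of the cookie idea (illustrative only: RFC 7413 suggests a block cipher over the client IP under a server secret; an HMAC stands in for that keyed function here):

```python
# The cookie is bound to the client's source address, so only a host that
# actually received packets at that address holds a valid cookie for it.
import hashlib
import hmac
import os

SERVER_SECRET = os.urandom(16)   # server-side key, periodically rotated

def make_cookie(client_ip: str) -> bytes:
    """Derive a cookie from the client's source address."""
    return hmac.new(SERVER_SECRET, client_ip.encode(), hashlib.sha256).digest()[:8]

def verify_cookie(client_ip: str, cookie: bytes) -> bool:
    """Reject SYN-with-data whose cookie doesn't match the source address."""
    return hmac.compare_digest(cookie, make_cookie(client_ip))

cookie = make_cookie("203.0.113.7")           # issued on a prior round trip
print(verify_cookie("203.0.113.7", cookie))   # True: genuine address
print(verify_cookie("198.51.100.9", cookie))  # False: spoofed address
```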
What threat do network middleboxes pose to negotiating MPTCP connections? How
does the design of MPTCP mitigate this?
Network middleboxes may strip out unrecognized TCP options (flags) used during the 3-way
handshake that negotiates an MPTCP connection. This means that while the sender and
receiver may both be MPTCP capable with multiple viable interfaces, a middlebox along the
route may ultimately prevent an MPTCP connection.
MPTCP is designed to fall back to single-path TCP when both ends of the connection cannot
support MPTCP. In this case, when the sender’s MPTCP-capable flag is stripped out by a
middlebox en route to the receiver, the receiver thinks that the sender is not MPTCP capable
and proceeds with a single-path TCP connection.
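A minimal sketch of that fallback decision (not real TCP option parsing; the names are assumptions):

```python
# MPTCP is used only if the MP_CAPABLE option survives the handshake in both
# directions; a middlebox stripping it silently downgrades to plain TCP.
def negotiate(syn_options: set, synack_options: set) -> str:
    if "MP_CAPABLE" in syn_options and "MP_CAPABLE" in synack_options:
        return "MPTCP"
    return "single-path TCP"   # safe fallback, connection still works

# A middlebox strips the option from the SYN en route to the receiver, so
# the receiver never echoes it back:
print(negotiate(syn_options=set(), synack_options=set()))  # single-path TCP
```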
How does the design of MPTCP mitigate the threat from network middleboxes?
The sender will see that traffic returning from the receiver is not MPTCP enabled (the flag is
carried on all packets until acknowledged) and will therefore revert to single-path TCP.
Why are receive buffer sizes required to be larger for MPTCP enabled connections?
The receive buffer allows out-of-order data to continue flowing in the event a packet is dropped
and must be resent. For a standard TCP connection, the required buffer size is determined by
the bandwidth-delay product of the connection.
With multiple subflows across a single connection present in MPTCP, the worst-case scenario is
that a packet drop occurs early and must be resent across the slowest link (like a 3G mobile
connection). This requires the other subflows (like high-bandwidth WiFi connections) to have
larger buffers than if each were the only connection, because they can send data
much faster than the slower link that is retransmitting the lost packet.
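A back-of-the-envelope comparison as a minimal Python sketch (the link speeds and RTTs are assumed values):

```python
# MPTCP's receive buffer must let data keep flowing on every subflow while a
# loss is repaired over the slowest path, so it scales with the TOTAL
# bandwidth times the worst-case (slowest-subflow) RTT, not each link's BDP.
WIFI_BPS = 50e6      # fast subflow (assumed)
WIFI_RTT_S = 0.020
CELL_BPS = 2e6       # slow 3G subflow (assumed)
CELL_RTT_S = 0.300   # worst-case RTT, on the slow path

# Single-path WiFi needs only its own bandwidth-delay product:
print(f"WiFi alone: {WIFI_BPS * WIFI_RTT_S / 8 / 1024:.0f} KiB")

# MPTCP worst case: total bandwidth buffered for the slow path's RTT while a
# retransmission completes over 3G (rules of thumb double this for headroom):
mptcp_buf = (WIFI_BPS + CELL_BPS) * CELL_RTT_S / 8
print(f"MPTCP: {mptcp_buf / 1024:.0f} KiB (x2 with retransmit headroom)")
# ~122 KiB vs ~1900 KiB -- an order of magnitude larger.
```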