Final Study Guide Flashcards
Why did SDN arise?
To make computer networks more programmable.
Why are computer networks complex/difficult to manage?
Diversity of equipment
Proprietary Technologies
What is SDN’s main idea? What does that mean in practice?
Separation of tasks. Split the network into the Control Plane and the Data Plane
What are the three historical phases of SDN?
- Active networks
- Control and data plane separation
- OpenFlow API and network operating systems
Summarize the Active Networks Phase
Took place from the mid-1990s to the early 2000s
Active networks emerged, aimed at opening up network control.
Too ambitious, didn’t focus on security, required knowledge of Java
What is active networking?
The network is not just a carrier of bits but a computer itself that can be interacted with and that exposes services through an API
What are the two types of programmable modelling that are part of Active Networking?
- Capsule model – carried in‑band in data packets
- Programmable router/switch model – established by out‑of‑band mechanisms
Summarize the Control and data plane separation Phase
Lasted from 2001 to 2007
Network reliability, performance, and predictability were key
Spurred innovation for network administrators rather than end users
Summarize the OpenFlow API Phase
Took place from 2007 to 2010
Born from interest in network experimentation at scale
Ensure practicality of real world deployment
Was adopted in the industry, unlike predecessors
What does the control plane do?
The control plane contains the logic that controls the forwarding behavior of routers such as routing protocols and network middlebox configurations
What does the data plane do?
The data plane performs the actual forwarding as dictated by the control plane
Why separate the control plane and data plane?
1: Independent evolution and development
2: Control from high‑level software program
Why did SDN lead to opportunities in various areas, such as data centers, routing, enterprise networks, and research networks?
Made network management easier.
More control in path selection.
Improved security.
Allows research networks to coexist with production networks
What are the two primary functions of the network layer?
Forwarding and Routing
What is forwarding?
Determining which output link a packet should be sent through.
What is routing?
Determining the path from the sender to the receiver across the network.
Forwarding is a function of what? Hardware or Software?
Data Plane, Hardware
Routing is a function of what?
Control Plane
What is the difference between a traditional and SDN approach in terms of coupling of control and data plane?
In the traditional approach, the control and data planes are closely coupled.
In the SDN approach, a remote controller, physically separate from the router, computes and distributes the forwarding tables.
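Example sketch: the split between a remote controller and simple forwarding elements can be illustrated in Python. This is a minimal sketch, not a real controller. The topology, the function names, and the choice of breadth-first search (fewest hops) for path selection are all assumptions made for the example.

```python
from collections import deque

def compute_forwarding_tables(topology):
    """Control plane (hypothetical): for every switch, compute the next hop
    toward every other switch using breadth-first search (fewest hops)."""
    tables = {}
    for src in topology:
        next_hop = {}
        visited = {src}
        queue = deque()
        # Direct neighbors are their own next hop.
        for nbr in topology[src]:
            visited.add(nbr)
            next_hop[nbr] = nbr
            queue.append(nbr)
        # Farther nodes inherit the next hop of the node they were reached from.
        while queue:
            node = queue.popleft()
            for nbr in topology[node]:
                if nbr not in visited:
                    visited.add(nbr)
                    next_hop[nbr] = next_hop[node]
                    queue.append(nbr)
        tables[src] = next_hop
    return tables

def forward(tables, switch, destination):
    """Data plane: a plain table lookup, with no path computation."""
    return tables[switch].get(destination)

# Hypothetical three-switch line topology: s1 - s2 - s3
topology = {"s1": ["s2"], "s2": ["s1", "s3"], "s3": ["s2"]}
tables = compute_forwarding_tables(topology)
print(forward(tables, "s1", "s3"))
```

The controller function runs once over the whole topology, while each switch only consults the table it was given, which mirrors the routing (control plane) versus forwarding (data plane) distinction in the cards above.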
Routing is a function of what? Hardware or software?
Control Plane, Software
What are the main components of SDN?
SDN‑controlled network elements
SDN controller
Network‑control applications
What do the SDN‑controlled network elements do?
The SDN‑controlled network elements, sometimes called the infrastructure layer, are responsible for the forwarding of traffic in a network based on the rules computed by the SDN control plane.
What does the SDN controller do?
The SDN controller is a logically centralized entity that acts as an interface between the network elements and the network‑control applications.
Midpoint between Northbound and Southbound
What do the Network‑control applications do?
Manage the underlying network by collecting information about the network elements with the help of SDN controller
What are the four defining features of an SDN architecture?
Flow‑based forwarding
Separation of data plane and control plane
Network control functions
A programmable network
What are the three layers of SDN Architecture?
Communication layer
Network‑wide state‑management layer
Interface to the network‑control application layer
What does the Communication layer do?
Handles communication between the controller and the network elements
What does the Network‑wide state‑management layer do?
Stores information about the network state
What does the Interface to the network‑control application layer do?
Handles communication between the controller and the applications
What does a ‘northbound’ interface communicate with?
Network‑control applications
What does a ‘southbound’ interface communicate with?
Controlled devices
What are the three parts of the OpenDaylight controller architecture?
Southbound interface
Northbound interface
Model Driven Service Abstraction Layer (or MD‑SAL)
A few of the main reasons that SDN arose are: a diversity of different network equipment (eg routers, switches, firewalls, etc.) using different protocols that made managing the network difficult, and second a lack of a central platform to control network equipment. True or False?
True
The main idea behind SDNs is to divide tasks into smaller functions so the code is more modular and easy to manage. True or False?
True
With SDNs the control plane and data plane have independent evolution and development. True or False?
True
In the SDN approach, the SDN controller is physically located at each router that is present in a network. True or False?
False
By separating the control plane and the data plane, controlling the router’s behavior became easier using higher order programs. For example, it is easier to update the router’s state or control the path selection. True or False?
True
In the SDN approach, ISPs or other third parties can take up the responsibility for computing and distributing the router’s forwarding tables. True or False?
True
Having the software implementations for SDNs controllers increasingly open and publicly available makes it hard to control, since any person could modify the software easily. True or False?
False
In SDN networks, the SDN controller is responsible for the forwarding of traffic. True or False?
False
The network-control applications are programs that manage the underlying network with the help of the SDN controller. True or False?
True
In SDN networks forwarding rules of traffic still have to be based on IP destination and cannot be based on other metrics, packet header info etc. True or False?
False
SDN-controlled switches operate on the:
Data Plane
In an SDN Architecture, the northbound interface keeps track of information about the state of the hosts, links, switches and other controlled elements in the network, as well as copies of the flow tables of the switches. True or False?
False
In SDN networks, the southbound interface is responsible for the communication between SDN controller and the controlled devices. True or False?
True
In SDN networks, the controller needs to be implemented over a centralized server. True or False?
False
As IP networks grew in adoption worldwide, what were the challenges that emerged?
Handling the ever growing complexity and dynamic nature of networks
Tightly coupled architecture
What does SDN stand for?
Software Defined Networking
What are the three planes of functionality for SDN?
Data plane
Control plane
Management plane
What does the Data Plane Layer do?
These are functions and processes that forward data in the form of packets or frames.
What does the Control Plane Layer do?
These refer to functions and processes that determine which path to use by using protocols to populate forwarding tables of data plane elements
What does the Management Plane Layer do?
These are services that are used to monitor and configure the control functionality, e.g. SNMP‑based tools.
What are the advantages of SDNs over traditional networks?
Shared abstractions
Consistency of same network information
Locality of functionality placement
Simpler integration
What are the three perspectives of the SDN landscape?
(a) a plane‑oriented view
(b) the SDN layers
(c) a system design perspective
What are the layers of SDN?
Infrastructure
Southbound Interfaces
Network Virtualization
Network Operating Systems
Northbound Interfaces
Language-Based Virtualization
Network Programming Languages
Network Applications
What is SDN infrastructure made up of?
routers, switches and other middlebox hardware
What are SDN Southbound interfaces?
These are interfaces that act as connecting bridges between control and forwarding elements
What is SDN Network virtualization?
Interfacing with the physical network components via software
What are SDN Network operating systems?
Ease network management and solve networking problems by using a logically centralized controller by way of a network operating system
What is a problem with SDN Northbound interfaces?
There is no normalized standard
Each entry of a flow table has which parts?
a) a matching rule
b) actions to be executed on matching packets
c) counters that keep statistics of matching packets.
In OpenFlow, what happens when a packet arrives?
In an OpenFlow device, when a packet arrives, the lookup process starts in the first table and ends either with a match in one of the tables of the pipeline or with a miss (when no rule is found for that packet).
What are possible actions for a packet in OpenFlow?
- Forward the packet to outgoing port
- Encapsulate the packet and forward it to controller
- Drop the packet
- Send the packet to normal processing pipeline
- Send the packet to next flow table
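Example sketch: the flow-table entry (matching rule, actions, counters) and the table-by-table lookup described in the cards above can be modeled in Python. This is an illustration only and not the OpenFlow wire format. The field names, the action strings, and the choice to send a missed packet to the controller are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class FlowEntry:
    match: dict        # matching rule: header field -> required value
    actions: list      # actions to execute on matching packets
    packets: int = 0   # counter: number of packets that matched

    def matches(self, packet):
        return all(packet.get(k) == v for k, v in self.match.items())

def process(pipeline, packet):
    """Look the packet up table by table, starting with the first table.
    Returns the matched actions, or ["send_to_controller"] on a miss
    (an assumed miss policy, chosen for illustration)."""
    table_id = 0
    while table_id < len(pipeline):
        entry = next((e for e in pipeline[table_id] if e.matches(packet)), None)
        if entry is None:
            return ["send_to_controller"]   # miss: no rule found
        entry.packets += 1                  # update the entry's counter
        if entry.actions == ["goto_next_table"]:
            table_id += 1                   # continue in the next flow table
            continue
        return entry.actions
    return ["send_to_controller"]

# Hypothetical two-table pipeline
pipeline = [
    [FlowEntry({"ip_dst": "10.0.0.2"}, ["goto_next_table"]),
     FlowEntry({"ip_dst": "10.0.0.9"}, ["drop"])],
    [FlowEntry({"tcp_dst": 80}, ["output:2"])],
]
print(process(pipeline, {"ip_dst": "10.0.0.2", "tcp_dst": 80}))
```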
What are the main purposes of Southbound Interfaces?
The Southbound interfaces or APIs are the separating medium between the control plane and data plane functionality.
What is the current southbound standard for SDNs?
OpenFlow
What are three information sources provided by the OpenFlow protocol?
- Event‑based messages that are sent by forwarding devices to controller when there is a link or port change
- Flow statistics are generated by forwarding devices and collected by controller
- Packet messages are sent by forwarding devices to controller when they do not know what to do with a new incoming flow
What are the core functions of an SDN controller?
topology, statistics, notifications, device management, along with shortest path forwarding and security mechanisms
What distinguishes a centralized controller in SDN?
In this architecture, we typically see a single entity that manages all forwarding devices in the network, which is a single point of failure and may have scaling issues.
What distinguishes a distributed controller in SDN?
A distributed network operating system (controller) can be scaled to meet the requirements of potentially any environment ‑ small or large networks
What are the two types of SDN distributed controllers?
It can be a centralized cluster of nodes or physically distributed set of elements
When would a distributed controller be preferred to a centralized controller?
Scales more easily, no single point of failure
What does ONOS stand for?
Open Network Operating System
Describe ONOS at a high level
There are several ONOS instances running in a cluster. The management and sharing of the network state across these instances is achieved by maintaining a global network view.
To make forwarding and policy decisions, the applications consume information from the view and then update these decisions back to the view.
How does ONOS achieve fault tolerance?
To achieve fault tolerance, ONOS redistributes the work of a failed instance to other remaining instances.
What does P4 stand for?
P4 (Programming Protocol‑independent Packet Processors)
What is P4?
A high‑level programming language to configure switches which works in conjunction with SDN control protocols.
What are the primary goals of P4?
Reconfigurability
Protocol independence
Target independence
What are the two main operations of P4 forwarding model?
Configure
Populate
What does P4’s Configure do?
These sets of operations are used to program the parser. They specify the header fields to be processed in each match+action stage and also define the order of these stages.
What does P4’s Populate do?
The entries in the match+action tables specified during configuration may be altered using the populate operations. It allows addition and deletion of the entries in the tables
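Example sketch: the difference between Configure and Populate can be modeled in Python. This is not P4 syntax. The class, the helper names, and the default "drop" action are hypothetical and only illustrate that configuration fixes the structure of the match+action stages while populate operations add or delete entries at run time.

```python
class MatchActionTable:
    def __init__(self, match_fields):
        # "Configure": fix which header fields this stage matches on.
        self.match_fields = tuple(match_fields)
        self.entries = {}

    def add_entry(self, key, action):
        # "Populate": add an entry without changing the table's structure.
        if len(key) != len(self.match_fields):
            raise ValueError("key does not fit the configured match fields")
        self.entries[tuple(key)] = action

    def delete_entry(self, key):
        # "Populate": remove an entry.
        self.entries.pop(tuple(key), None)

    def apply(self, headers, default="drop"):
        # Build the lookup key from the configured header fields.
        key = tuple(headers.get(f) for f in self.match_fields)
        return self.entries.get(key, default)

def configure(stage_fields):
    """Define the ordered match+action stages (hypothetical helper)."""
    return [MatchActionTable(fields) for fields in stage_fields]

stages = configure([["eth_dst"], ["ip_dst"]])
stages[1].add_entry(["10.0.0.1"], "forward:1")
print(stages[1].apply({"ip_dst": "10.0.0.1"}))
```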
What are the applications of SDN? Provide examples of each application.
Traffic Engineering - ElasticTree
Mobility and Wireless - OpenRadio, The Odin Network
Measurement and Monitoring - OpenSketch, OpenSample and PayLess
Security and Dependability - CloudWatcher
Data Center Networking - LIME, FlowDiff
Which BGP limitations can be addressed by using SDN?
SDN can perform multiple actions on the traffic by matching over various header fields, not only by matching on the destination prefix.
What’s the purpose of SDX?
To implement the following:
Application specific peering
Traffic engineering
Traffic load balancing
Traffic redirection through middleboxes
Describe SDX Architecture
In the SDX architecture, each AS is given the illusion of its own virtual SDN switch that connects its border router to every other participant AS. For example, AS A has a virtual switch connecting to the virtual switches of ASes B and C. Each AS can have its own SDN applications for dropping, modifying, or forwarding its traffic.
What are the applications of SDX in the domain of wide-area traffic delivery?
Application specific peering
Inbound traffic engineering
Wide‑area server load balancing
Redirection through middle boxes
An OpenFlow switch can function as a router. True or False?
True
Which plane executes a network policy?
Data Plane
Which type of network can implement load balancing?
Both Conventional and SDN
Which type of network decouples the control and data planes?
SDNs
Middleboxes can only be used in conventional networks. True or False?
False
What can be implemented as a network application in software-defined networking?
Routing
Security Enforcement
Quality of Service Enforcement
The networking operating system (NOS) is a part of the data plane. True or False?
False
The physical devices in an SDN network have embedded intelligence and control required to perform forwarding tasks. True or False?
False
When a packet arrives in an OpenFlow device and it does not match any of the rules in one of the tables, that packet is always dropped. True or False?
False
The Southbound interfaces are the separating medium between the Network-control Applications and the Control plane functionality. True or False?
False
OpenFlow enables the communication between the control plane and data plane through event-based messages, flow statistics and packet messages that are sent from forwarding devices to controller. True or False?
True
One of the disadvantages of an SDN centralized controller architecture is that it can introduce a single point of failure and also scaling issues. True or False?
True
A distributed controller can be a centralized cluster of nodes or a physically distributed set of elements. True or False?
True
A distributed controller can only be used in large networks. True or False?
False
ONOS is an example of a centralized controller platform. True or False?
False
In order to make forwarding and policy decisions in ONOS, applications get information from the view and then update these decisions back to the view. True or False?
True
In order to achieve fault tolerance, whenever there is a failure of an ONOS instance, a master is chosen randomly for each of the switches that were controlled by the failed instance. True or False?
False
The purpose of the creation of the P4 language was to offer programmability on the control plane. True or False?
False
P4 acts as an interface between the switches and the controller, and its main goal is to allow the controller to define how the switches operate. True or False?
True
The P4 model allows the design of a common language to write packet processing programs that are independent of the underlying devices. True or False?
True
What are the properties of secure communication?
Confidentiality, Integrity, Authentication, Availability
How does Round Robin DNS (RRDNS) work?
Responds to a DNS request with a list of DNS A Records, which it cycles through in a RR manner.
DNS client can then pick one from this list using its own metric
If the request is made again, the records are returned in a different order
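Example sketch: the rotation behavior of Round Robin DNS can be modeled in Python. The class name and the addresses (taken from a documentation address range) are hypothetical. A real DNS server does much more than this.

```python
from collections import deque

class RoundRobinDNS:
    def __init__(self, a_records):
        self._records = deque(a_records)

    def resolve(self):
        """Return the full list of A records, then rotate the list by one
        so that the next query sees a different order."""
        answer = list(self._records)
        self._records.rotate(-1)
        return answer

dns = RoundRobinDNS(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
print(dns.resolve())
print(dns.resolve())
```

Each response contains every record, so a client that always picks the first entry is spread across the servers over successive queries.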
What is the goal of Round Robin DNS?
To distribute large loads of incoming traffic to several different servers; used by big companies
How does DNS-based content delivery work?
CDN computes the ‘nearest edge server’ and returns its IP address to the DNS client. Basically chooses nearest one in order to deliver content quickly
How do Fast-Flux Service Networks work?
Short TTL, and after it expires, it returns a different set of records rather than the same list of records cycled through
What are the main data sources used by FIRE (Finding Rogue Networks) to identify hosts that likely belong to rogue networks?
Botnet command and control providers
Drive‑by‑download hosting providers
Phish hosting providers
The design of ASwatch is based on monitoring global BGP routing activity to learn the control plane behavior of a network. Describe 2 phases of this system.
Training phase - The system learns control‑plane behavior typical of both types of ASes
Operational phase ‑ Given an unknown AS, it then calculates the features for this AS. It uses the model to then assign a reputation score to the AS.
What are the three main families of features for the Training Phase of ASwatch?
Rewiring activity
IP Space Fragmentation and Churn
BGP Routing Dynamics
What are three classes of features used to determine the likelihood of a security breach within an organization?
Mismanagement symptoms
Malicious Activities
Security Incident Reports
Which features are used for Mismanagement Symptoms?
Open Recursive Resolvers – misconfigured open DNS resolvers
DNS Source Port Randomization – many servers still do not implement this
BGP Misconfiguration – short‑lived routes can cause unnecessary updates to the global routing table
Untrusted HTTPS Certificates – can detect the validity of a certificate by TLS handshake
Open SMTP Mail Relays – servers should filter messages so that only those in the same domain can send mails/messages.
What are the three sub-types of Malicious Activities?
Capturing spam activity
Capturing phishing and malware activities
Capturing scanning activity
What are the three collections of Security Incident Reports?
VERIS Community Database
Hackmageddon
The Web Hacking Incidents Database
What is the classification by affected prefix?
In this class of hijacking attacks, we are primarily concerned with the IP prefixes that are advertised by BGP.
Exact prefix hijacking, sub-prefix, squatting
What is Exact prefix hijacking?
When two different ASes (one is genuine and the other one is counterfeit) announce a path for the same prefix. This disrupts routing in such a way that traffic is routed towards the hijacker wherever the AS‑path route is shortest, thereby disrupting traffic.
What is Sub‑prefix hijacking?
This is an extension of exact prefix hijacking, except that in this case the hijacking AS announces a sub‑prefix of the genuine prefix of the real AS. This exploits BGP’s preference for more specific prefixes, and as a result a large portion, or all, of the traffic is routed to the hijacking AS
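Example sketch: the preference for more specific prefixes can be shown with longest-prefix matching in Python. The prefixes and AS names are hypothetical, and the function models only the prefix-length comparison, not full BGP route selection.

```python
import ipaddress

def best_route(routes, destination):
    """Longest-prefix match: among the routes that cover the destination,
    pick the one with the longest (most specific) prefix."""
    addr = ipaddress.ip_address(destination)
    candidates = [(ipaddress.ip_network(prefix), origin)
                  for prefix, origin in routes
                  if addr in ipaddress.ip_network(prefix)]
    if not candidates:
        return None
    net, origin = max(candidates, key=lambda c: c[0].prefixlen)
    return str(net), origin

# AS1 legitimately announces the /16; AS4 (hijacker) announces a /17 inside it.
routes = [("10.10.0.0/16", "AS1"), ("10.10.0.0/17", "AS4")]
print(best_route(routes, "10.10.1.5"))
print(best_route(routes, "10.10.200.1"))
```

In this sketch only the lower half of the /16 is diverted. A hijacker that announced both /17 halves would attract traffic for the entire /16.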
What is Squatting?
In this type of attack, the hijacking AS announces a prefix that has not yet been announced by the owner AS.
What is Classification by AS‑Path announcement?
In this class of attacks, an illegitimate AS announces the AS‑path for a prefix for which it doesn’t have ownership rights.
Type-0, Type-N, Type-U
What is Type‑0 hijacking?
This is simply an AS announcing a prefix not owned by itself
What is Type‑N hijacking?
This is an attack where the counterfeit AS announces an illegitimate path for a prefix that it does not own to create a fake link (path) between different ASes.
What is Type‑U hijacking?
In this attack the hijacking AS does not modify the AS‑PATH but may change the prefix.
What is Classification by Data‑Plane traffic manipulation?
In this class of attacks, the intention of the attacker is to hijack the network traffic and manipulate the redirected network traffic on its way to the receiving AS.
What is a blackholing (BH) attack?
When traffic is dropped by a hijacker.
What is a man‑in‑the‑middle attack?
When traffic is eavesdropped or manipulated before it reaches the receiving AS
What is an imposture (IM) attack?
When the hijacker impersonates the victim AS: the victim’s network traffic is impersonated, and the response to this traffic is sent back to the sender.
What are the causes or motivations behind BGP attacks?
Human error - mistake
Targeted Attack - stealthy
High Impact Attack - obvious
Explain the scenario of prefix hijacking.
- The attacker uses a router to announce the prefix 10.10.0.0/16 that belongs to AS1, with a new origin AS4, pretending that the prefix belongs to AS4.
- This new announcement causes a conflict of origin for the ASes that receive it (Multiple Origin AS or MOAS).
- As a result of the new announcement, AS2, AS3 and AS5 receive the false advertisement and they compare it with the previous entries in their RIB.
- AS2 will not select the route as the best route as it has the same path length with an existing entry.
- AS3 and AS5 will believe the new advertisement, and they will update their entries (10.10.0.0/16 with path 4,2,1) to (10.10.0.0/16 with path 4). Therefore AS5 and AS3 will send all traffic for prefix 10.10.0.0/16 to AS4 instead of AS1.
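Example sketch: the route comparison in the scenario above can be reduced to AS-path length in Python. Real BGP applies further criteria (local preference, policies, and so on) that are not modeled here. The existing path assumed for AS2 is a guess made for the example, since the scenario only states that it has the same length as the new announcement.

```python
def select_route(current_path, announced_path):
    """Keep the current AS path unless the announced path is strictly shorter.
    Only path length is modeled; ties keep the existing entry."""
    if current_path is None or len(announced_path) < len(current_path):
        return announced_path
    return current_path

hijack = [4]  # AS4 falsely originates 10.10.0.0/16

# AS2: assumed existing path [1] has the same length, so the entry is kept.
print(select_route([1], hijack))

# AS3 and AS5: existing path (4,2,1) is longer, so the false route is adopted.
print(select_route([4, 2, 1], hijack))
```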
Explain the scenario of hijacking a path.
- AS1 advertises the prefix 10.10.0.0/16.
- AS2 and AS3 receive and propagate legitimately the path for the prefix.
- At AS4, the attacker compromises the update for the path by changing it to 4,1 and propagates it to the neighbors AS3, AS2, and AS5. Therefore it claims that it has direct link to AS1 so that others believe the new false path.
- AS5 receives the false path (4,1), “believes” it, and adopts it. The rest of the ASes don’t adopt the new path because they already have either a shorter path or an equally long path to AS1 for the same prefix. The key observation here is that the attacker does not need to announce a new prefix; rather, it manipulates an advertisement before propagating it.
What are the key ideas behind ARTEMIS?
A configuration file: lists all the prefixes owned by the network for reference
A mechanism for receiving BGP updates: this allows receiving updates from local routers and monitoring services
What are the two automated techniques used by ARTEMIS to protect against BGP hijacking?
Prefix deaggregation and Mitigation with Multiple Origin AS (MOAS)
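Example sketch: checking a BGP update against a list of owned prefixes, followed by prefix deaggregation, can be illustrated in Python. This is not ARTEMIS code. The OWNED table, the AS names, and both function names are hypothetical, and the check only compares the origin AS of an announcement with the configured legitimate origins.

```python
import ipaddress

# Hypothetical configuration: owned prefix -> set of legitimate origin ASes
OWNED = {"10.10.0.0/16": {"AS1"}}

def is_suspicious(prefix, origin_as, owned=OWNED):
    """Flag an announcement for an owned prefix (or any sub-prefix of it)
    that comes from an origin AS not listed in the configuration."""
    net = ipaddress.ip_network(prefix)
    for owned_prefix, origins in owned.items():
        owned_net = ipaddress.ip_network(owned_prefix)
        if net.subnet_of(owned_net) and origin_as not in origins:
            return True
    return False

def deaggregate(prefix):
    """Split a prefix into its two more-specific halves, which the owner
    can announce so that they are preferred over the hijacked prefix."""
    network = ipaddress.ip_network(prefix)
    return [str(n) for n in network.subnets(prefixlen_diff=1)]

if is_suspicious("10.10.0.0/16", "AS4"):
    print(deaggregate("10.10.0.0/16"))
```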
What are two findings from ARTEMIS?
Outsource the task of BGP announcement to third parties
Filtering is less optimal than outsourcing
Explain the structure of a DDoS attack.
A Distributed Denial of Service (DDoS) attack is an attempt to compromise a server or network resources with a flood of traffic. To achieve this, the attacker first compromises and deploys flooding servers (slaves). Later, when initiating an attack, the attacker instructs these flooding servers to send a high volume of traffic to the victim. This results in the victim host either becoming unreachable or in exhaustion of its bandwidth.
What is spoofing?
IP spoofing is the act of setting a false IP address in the source field of a packet with the purpose of impersonating a legitimate server
Describe a Reflection and Amplification attack.
A reflection/amplification attack is a combination of the two attacks that allows the attacker to generate an enormous amount of traffic and at the same time keep its identity hidden by spoofing the victim’s IP address.
What are the defenses against DDoS attacks?
Traffic Scrubbing Services
ACL Filters
BGP Flowspec
Explain provider-based blackholing.
What is blackholing?
With this mechanism, all the attack traffic to a targeted DoS destination is dropped to a null location. The premise of this approach is that the traffic is stopped closer to the source of the attack and before it reaches the targeted victim
Explain IXP/Provider Based Blackholing
The victim AS uses BGP to communicate the attacked destination prefix to its upstream AS, which then drops the attack traffic towards this prefix. The provider (or the IXP) then advertises a more specific prefix and modifies the next‑hop address so that the attack traffic is diverted to a null interface. The blackhole messages are tagged with a specific BGP blackhole community attribute, usually publicly available, to differentiate them from regular routing updates.
How do networks assist with blackholing?
They provide the blackholing community to be used
How do IXPs assist with blackholing?
The attacked member AS sends the blackholing messages to the IXP route server to which it is connected. The route server then announces the message to all the connected IXP member ASes, which then drop the traffic towards the blackholed prefix
What is DNS censorship?
DNS censorship is a large-scale network traffic filtering strategy adopted by a network to enforce control and censorship over Internet infrastructure and to suppress material that it deems objectionable.
What are the properties of GFW (Great Firewall of China)?
Locality of GFW nodes (likely on the edge)
Centralized management
Load balancing
What are the three steps involved in DNS injection?
- DNS probe is sent to the open DNS resolvers
- The probe is checked against the blocklist of domains and keywords
- For domain level blocking, a fake DNS A record response is sent back. There are two levels of blocking domains: the first one is by directly blocking the domain, and the second one is by blocking it based on keywords present in the domain
How does DNS injection work?
Certain DNS requests are captured, and a fake DNS response is sent instead of what was actually requested
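Example sketch: the blocklist check and fake response of DNS injection can be modeled in Python. The domain list, keyword list, and injected address are hypothetical placeholders, and `real_lookup` stands in for normal DNS resolution.

```python
BLOCKED_DOMAINS = {"blocked.example"}   # hypothetical domain blocklist
BLOCKED_KEYWORDS = {"forbidden"}        # hypothetical keyword blocklist
FAKE_ADDRESS = "203.0.113.1"            # hypothetical injected A record

def is_blocked(domain):
    """Two levels of blocking: exact domain, or a keyword inside the domain."""
    domain = domain.lower().rstrip(".")
    return domain in BLOCKED_DOMAINS or any(k in domain for k in BLOCKED_KEYWORDS)

def answer_query(domain, real_lookup):
    """Return a fake A record for blocked names, the real answer otherwise."""
    if is_blocked(domain):
        return FAKE_ADDRESS
    return real_lookup(domain)

def real_lookup(domain):
    # Stand-in for a genuine resolution; always returns the same address.
    return "198.51.100.7"

print(answer_query("blocked.example", real_lookup))
print(answer_query("news-forbidden-topic.example", real_lookup))
print(answer_query("allowed.example", real_lookup))
```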
What are five DNS censorship techniques?
Packet Dropping, DNS Poisoning, Content Inspection, Blocking with Resets, Immediate Reset of Connections
What is Packet Dropping?
All network traffic going to a set of specific IP addresses is discarded
What is DNS Poisoning?
When a DNS resolver receives a query to resolve a hostname to an IP address and either returns no answer or returns an incorrect answer to redirect or mislead the user, this is called DNS poisoning.
Like packet dropping, but for host names instead of IP addresses
What is Proxy-Based Content Inspection?
It allows for all network traffic to pass through a proxy where the traffic is examined for content, and the proxy rejects requests that serve objectionable content.
What is Blocking with Resets?
It sends a TCP reset (RST) to block individual connections that contain requests with objectionable content
What is Immediate Reset of Connections?
Censorship systems like GFW have blocking rules in addition to inspecting content, to suspend traffic coming from a source immediately, for a short period of time.
Which DNS censorship technique is susceptible to overblocking?
Packet Dropping
What are the strengths and weaknesses of the “packet dropping” DNS censorship technique?
Strengths:
Easy to implement
Low cost
Weaknesses:
Maintenance of blocklist
Overblocking
What is overblocking?
When two sites share an IP address, blocking one risks blocking both, etc
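Example sketch: overblocking under IP-based packet dropping can be shown with a small Python function. The site names and addresses are hypothetical.

```python
def blocked_sites(hosting, target_site):
    """Return every site that becomes unreachable when all traffic to the
    IP address of target_site is dropped."""
    ip = hosting[target_site]
    return sorted(site for site, addr in hosting.items() if addr == ip)

# Hypothetical shared hosting: two sites behind the same IP address
hosting = {
    "a.example": "192.0.2.10",
    "b.example": "192.0.2.10",
    "c.example": "192.0.2.20",
}
print(blocked_sites(hosting, "a.example"))
```

Blocking a.example by address also takes b.example offline, even though b.example was never on the blocklist.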
What are the strengths and weaknesses of the “DNS poisoning” DNS censorship technique?
Strengths:
No overblocking
Weaknesses:
The blocklist is probably still hard to maintain
What are the strengths and weaknesses of the “content inspection” DNS censorship technique?
Strengths:
Precise censorship
Flexible
Weaknesses:
Not scalable
What are the strengths and weaknesses of the “blocking with resets” DNS censorship technique?
Not stated in the source material.
What are the strengths and weaknesses of the “immediate reset of connections” DNS censorship technique?
Also not stated in the source material.
Our understanding of censorship around the world is relatively limited. Why is it the case? What are the challenges?
Difficult to determine where an ISP’s rules take effect and which countries are affected by which ISPs
Huge amount of internet usage, need more volunteers
Hard to differentiate natural fluctuations in DNS behavior from malicious behavior
Ethical issues in obtaining this data
What are the limitations of main censorship detection systems?
Typically rely on volunteer data, makes getting continuous and diverse data difficult
What kind of disruptions does Augur focus on identifying?
IP‑based disruptions as opposed to DNS‑based manipulations
How does Iris counter the lack of diversity while studying DNS manipulation?
Iris uses open DNS resolvers located all over the globe
What are the steps involved in the global measurement process using DNS resolvers?
- Performing global DNS queries – Iris queries thousands of domains across thousands of open DNS resolvers. To establish a baseline for comparison, the creators included 3 DNS domains which were under their control to help calculate metrics used for evaluating DNS manipulation.
- Annotating DNS responses with auxiliary information – To enable the classification, Iris annotates the IP addresses with additional information such as their geo‑location, AS, port 80 HTTP responses, etc. This information is available from the Censys dataset.
- Additional PTR and TLS scanning – One IP address could host several websites via virtual hosting. So, when Censys retrieves certificates from port 443, the certificate could differ from one retrieved via TLS’s Server Name Indication (SNI) extension. This results in discrepancies that could cause Iris to label virtual hosting as DNS inconsistencies. To avoid this, Iris adds PTR and SNI certificates.
What are the steps associated with the Iris Counter proposed process?
- Scanning the Internet’s IPv4 space for open DNS resolvers
- Identifying Infrastructure DNS Resolvers
What metrics does Iris use to identify DNS manipulation once data annotation is complete? Describe the metrics.
Consistency Metrics - Network properties and infrastructure/content are similar when accessed from differing locations
Independent Verifiability Metrics - Externally verified metrics, such as HTTPS Certificate
Under what condition do we declare the response from Iris as being manipulated?
If neither the Consistency Metrics nor the Independent Verifiability Metrics are satisfied
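Example sketch: the decision rule above can be written as a small Python function. The inputs are assumed to be lists of booleans, one per individual metric, which is a simplification made for the example and not Iris's actual data model.

```python
def is_manipulated(consistency_checks, verifiability_checks):
    """Flag a response as manipulated only if no consistency metric and no
    independent verifiability metric is satisfied."""
    return not any(consistency_checks) and not any(verifiability_checks)

# A response that matches the control on one consistency metric is not flagged.
print(is_manipulated([True, False], [False]))

# A response that satisfies no metric at all is flagged.
print(is_manipulated([False, False], [False]))
```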
How to identify DNS manipulation via machine learning with Iris?
Perform Global DNS Queries
Annotate the responses with auxiliary information
Additional PTR and TLS scanning
Clean the Data Set
Evaluate with Consistency Metrics and Independent Verifiability Metrics
How is it possible to achieve connectivity disruption using the routing disruption approach?
Disrupts the critical routers which send information on which parts of the network are reachable, causing the pathways to no longer be valid
How is it possible to achieve connectivity disruption using the packet filtering approach?
Blocks packets matching certain criteria, preventing that information from disseminating through the network
Out of the Routing Disruption approach and the Packet Filtering approach, which is harder to detect?
Packet Filtering
Explain a scenario of connectivity disruption detection in the case when no filtering occurs.
- The measurement machine probes the IP ID of the reflector by sending a TCP SYN‑ACK packet. It receives a RST response packet with IP ID set to 6 (IPID (t1)).
- Now, the measurement machine performs perturbation by sending a spoofed TCP SYN to the site.
- The site sends a TCP SYN‑ACK packet to the reflector and receives a RST packet as a response. The IP ID of the reflector is now incremented to 7.
- The measurement machine again probes the IP ID of the reflector and receives a response with the IP ID value set to 8 (IPID (t4)).
Explain a scenario of connectivity disruption detection in the case of inbound blocking.
The scenario where filtering occurs on the path from the site to the reflector is termed as inbound blocking. In this case, the SYN‑ACK packet sent from the site in step 3 does not reach the reflector. Hence, there is no response generated and the IP ID of the reflector does not increase. The returned IP ID in step 4 will be 7 (IPID(t4)) as shown in the figure. Since the measurement machine observes the increment in IP ID value as 1, it detects filtering on the path from the site to the reflector.
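Example sketch: the two IP ID scenarios above can be simulated in Python. The Reflector class assumes a single global IP ID counter that increases by one for every packet the reflector sends, with no other traffic in between. The "inconclusive" label for any other difference is an assumption of this sketch, since only the two cases above are covered here.

```python
class Reflector:
    """Hypothetical host with one global IP ID counter that is incremented
    for every packet it sends."""
    def __init__(self, ipid):
        self.ipid = ipid

    def send_rst(self):
        self.ipid += 1
        return self.ipid

def measure(reflector, inbound_blocked):
    """Run the four steps and return IPID(t4) - IPID(t1)."""
    ipid_t1 = reflector.send_rst()   # step 1: SYN-ACK probe, RST reply
    # step 2: spoofed SYN sent to the site (reflector sends nothing yet)
    if not inbound_blocked:
        reflector.send_rst()         # step 3: site's SYN-ACK arrives, RST reply
    ipid_t4 = reflector.send_rst()   # step 4: second probe, RST reply
    return ipid_t4 - ipid_t1

def classify(delta):
    if delta == 2:
        return "no filtering"
    if delta == 1:
        return "inbound blocking"
    return "inconclusive"            # cases not covered in this sketch

print(classify(measure(Reflector(5), inbound_blocked=False)))
print(classify(measure(Reflector(5), inbound_blocked=True)))
```

Starting the counter at 5 reproduces the values in the cards: 6, 7, 8 without filtering, and 6, 7 with inbound blocking.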
A censorship technique can use any combination of criteria based on content, source IP and destination IP to block access to objectionable content. True or False?
True
DNS injection uses DNS replies to censor network traffic based on the source and destination IP address. True or False?
False
With a censorship technique based on packet dropping, all network traffic going to a set of specific IP addresses is discarded. True or False?
True
When using DNS Poisoning, all traffic passes through a proxy where it is examined for content, and the proxy rejects requests that serve objectionable content. True or False?
False
When using the Blocking with Resets technique, if a client sends a request containing flaggable keywords, only the connection containing requests with objectionable content is blocked. True or False?
True
With the Immediate Reset of Connections technique, whenever a request is sent containing flaggable keywords, any subsequent request will receive resets from the firewall for a certain amount of time. True or False?
True
One of the obstacles to fully understand DNS censorship is the heterogeneity of DNS manipulation across the globe. True or False?
True
It is easy to infer if there is DNS manipulation based on few indications such as inconsistent or anomalous DNS responses. True or False?
False
There is a need for methods and tools independent of human intervention and participation in order to achieve the scalability necessary to measure Internet censorship. True or False?
True
It is considered safe for volunteers to participate in censorship measurement studies and access DNS resolvers or DNS forwarders. True or False?
False
Which censorship detection system aims to identify IP-based disruptions as opposed to DNS-based manipulations?
Augur
[Iris] The Iris system uses home routers to identify DNS manipulation. True or False?
False
[Iris] In order to infer DNS manipulation, Iris relies solely on metrics that can be externally verified using external data sources. True or False?
False
[Augur] Assume a scenario where there is inbound blocking. The Measurement Machine sends a SYN-ACK to the reflector, what should happen?
The return IPID from the reflector to the Measurement Machine will increase by 1.
Rank these by bitrate usage: Browsing Facebook, Playing Music, Playing Video
Least: Music
Mid: Facebook
Most: Video
What are the characteristics of streaming stored video?
Streamed - Can start without waiting for download to finish
Interactive - Pause, play, etc
Typically stored on a CDN
What are the characteristics of streaming live audio and video?
Delay sensitive, but not as much as conversational applications
Typically many simultaneous users
Some delay between capture and playout is okay
What are the characteristics of conversational voice and video over IP?
Realtime, highly delay sensitive
Loss-tolerant
What does VoIP stand for?
Voice over IP
How does the encoding of analog audio work (in simple terms)?
Audio is encoded by taking many (as in, thousands of) samples per second, and then rounding each sample’s value to a discrete number within a particular range
What is quantization?
Rounding to a discrete value
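Sampling and quantization can be sketched in a few lines of Python. The 8000 samples per second, the 440 Hz test tone, and the 256 levels are assumptions chosen for illustration; the cards do not fix any of them:

```python
import math

def quantize(sample: float, levels: int = 256) -> int:
    """Round a sample in [-1.0, 1.0] to one of `levels` discrete codes."""
    clipped = max(-1.0, min(1.0, sample))
    # Shift to [0, 1], scale to [0, levels - 1], then round to an integer.
    return round((clipped + 1.0) / 2.0 * (levels - 1))

# Take 8000 samples of one second of a 440 Hz tone, then quantize each one.
SAMPLE_RATE = 8000
samples = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
           for n in range(SAMPLE_RATE)]
codes = [quantize(s) for s in samples]
```

More levels mean a smaller rounding error per sample but more bits per sample.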
What are the three major categories of VoIP encoding schemes?
Narrowband, broadband, and multimode (which can operate on either)
What are the functions that signaling protocols are responsible for?
1) User location ‑ the caller locating where the callee is.
2) Session establishment ‑ handling the callee accepting, rejecting, or redirecting a call.
3) Session negotiation ‑ the endpoints synchronizing with each other on a set of properties for the session.
4) Call participation management ‑ handling endpoints joining or leaving an existing session.
What are three QoS VoIP metrics?
end‑to‑end delay
jitter
packet loss
What kind of delays are included in “end-to-end delay”?
the time it takes to encode the audio
the time it takes to put it in packets,
all the normal sources of network delay that network traffic encounters such as queueing delays,
“playback delay,” which comes from the receiver’s playback buffer,
decoding delay, which is the time it takes to reconstruct the signal.
How does “delay jitter” occur?
When one packet is delayed 300ms, another 100ms, another 250ms, another 50ms, etc
What are the mitigation techniques for delay jitter?
the “jitter buffer” or the “play‑out buffer”
Basically buffer packets and play them at a steady rate
Increases end to end delay or dropped packets, depending on how you optimized
What are the three major methods for dealing with packet loss in VoIP protocols?
FEC (Forward Error Correction), interleaving, and error concealment
What is FEC (Forward Error Correction)?
FEC works by transmitting redundant data alongside the main transmission, which allows the receiver to replace lost data with the redundant data. The redundant data may be a full copy of the original or a lower-quality version of it.
What are the downsides of FEC?
The more redundant data transmitted, the more bandwidth is consumed. Also, some of these FEC techniques require the receiving end to receive more chunks before playing out the audio, and that increases playout delay
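One simple way to produce the redundant data is an XOR parity chunk per group of chunks. This scheme is an assumption chosen for illustration (the cards do not name a specific one), and the function names are hypothetical:

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(chunks):
    """Redundant chunk: the XOR of every chunk in the group."""
    return reduce(xor_bytes, chunks)

def recover(received, parity):
    """Rebuild at most one lost chunk (marked None) using the parity chunk."""
    missing = [i for i, chunk in enumerate(received) if chunk is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can repair only one lost chunk per group")
    repaired = list(received)
    if missing:
        present = [chunk for chunk in received if chunk is not None]
        # XOR-ing the parity with all surviving chunks leaves the lost one.
        repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired
```

Both downsides are visible here: each group of n chunks costs one extra chunk of bandwidth, and the receiver has to wait for the rest of the group before it can repair a loss.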
What is interleaving?
Interleaving works by mixing chunks of audio together so that if one set of chunks is lost, the lost chunks aren’t consecutive. The idea is that many smaller audio gaps are preferable to one large audio gap.
What are the downsides of interleaving?
The tradeoff for interleaving is that the receiving side has to wait longer to receive consecutive chunks of audio, and that increases latency. Unfortunately, that means this technique is limited in usefulness for VoIP, although it can have good performance for streaming stored audio
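A minimal sketch of interleaving, assuming 12 audio chunks spread over 4 packets; the helper names are hypothetical:

```python
def interleave(chunks, n_packets):
    """Packet i carries chunks i, i + n_packets, i + 2 * n_packets, ..."""
    return [chunks[i::n_packets] for i in range(n_packets)]

def deinterleave(packets, n_chunks):
    """Restore playout order; chunks from a lost packet (None) stay None."""
    n_packets = len(packets)
    out = [None] * n_chunks
    for i, packet in enumerate(packets):
        if packet is not None:
            out[i::n_packets] = packet
    return out

chunks = list(range(12))
packets = interleave(chunks, 4)
packets[1] = None                      # simulate losing one whole packet
played = deinterleave(packets, 12)     # gaps land at positions 1, 5 and 9
```

Losing one packet leaves three isolated one-chunk gaps instead of one three-chunk gap. The cost is that the receiver cannot play chunk 1 until the second packet has arrived, which is the added latency described above.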
What is error concealment?
Basically “guessing” what the lost audio packet might be. Similar to audio compression.
What developments led to the popularity of consuming media content over the Internet?
One, the bandwidth of both the core network and last‑mile access links has increased tremendously over the years.
Two, video compression technologies have become more efficient. This makes it possible to stream high‑quality video without using a lot of bandwidth.
Finally, the development of Digital Rights Management culture has encouraged content providers to put their content on the Internet.
Provide a high-level overview of adaptive video streaming.
1: Video content is created at a high quality
2: It is compressed with an algorithm
3: It is secured with DRM and hosted on a server
4: Content providers duplicate it using CDNs
5: The end users download, decode, and render
What are two ways to achieve efficient video compression?
Exploiting spatial redundancy or temporal redundancy. The former is compression within a single image, the latter is compression across consecutive frames
What are the four steps of JPEG compression?
Step 1: Transform it into color components (Cb, Cr) and brightness component (Y) matrices
Step 2: For each matrix, subdivide and apply the Discrete Cosine Transformation
Step 3: Compress the matrix of the coefficients using a pre‑defined Quantization table
Step 4: Perform a lossless encoding of the quantized coefficients
Explain video compression and temporal redundancy using I-, B-, and P-frames.
Encode the first image in full (I-frame), then encode only the difference from the previous frame (P-frame), or the difference from both the previous and the next frame (B-frame)
Why is video compression unable to use P-frames all the time?
Because there may be a cut to a new scene, in which case the new frame shares little with the previous one and encoding it as a difference would not make sense
What is the difference between constant bitrate encoding and variable bitrate encoding (CBR vs. VBR)?
CBR - output size of video is fixed over time
VBR - output size varies based on scene complexity. Image quality is better than CBR at the same average bitrate, but encoding is more computationally expensive
Which protocol is preferred for video content delivery - UDP or TCP? Why?
TCP, as it has an implicit reliability promise
What was the original vision of the application-level protocol for video content delivery, and why was HTTP chosen eventually?
The original vision was to have everything be server-side with a specialized protocol, e.g., you hit pause and the server stops transmission
HTTP was chosen because the original approach would have required specialized hardware; it was cheaper to use HTTP
Summarize how progressive download works.
As the client watches video, its playout buffer depletes. When the buffer reaches a threshold, e.g., 10 seconds left, the client requests more video, refilling the buffer. This prevents unnecessary downloads while keeping playback seamless
How to handle network and user device diversity relative to videos?
Videos are usually stored in short segments at different bitrates, and an appropriate bit rate is sent depending on network capacity, and can switch if that capacity changes
How does the bitrate adaptation work in DASH?
A video in DASH is divided into chunks and each chunk is encoded at multiple bitrates. Each time the video player needs to download a video chunk, it calls the bitrate adaptation function, say f. The function f takes some input and outputs the bitrate of the chunk to be downloaded
What are the goals of bitrate adaptation?
Low or zero re‑buffering
High video quality
Low video quality variations
Low startup latency
What are the different signals that can serve as an input to a bitrate adaptation algorithm?
Network Throughput
Video Buffer
Explain buffer-filling rate and buffer-depletion rate calculation.
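For stall‑free streaming, the buffer‑filling rate should be greater than the buffer‑depletion rate. With C(t) as the network bandwidth and R(t) as the bitrate of the chunk being downloaded, the buffer fills at C(t)/R(t) seconds of video per second and depletes at 1 second of video per second, so we need:
C(t)/R(t) > 1, or equivalently C(t) > R(t).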
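As a small numeric check of this condition, here is a sketch that assumes C(t) is the available bandwidth and R(t) is the bitrate of the chunk being downloaded, both in kbps; the function name is hypothetical:

```python
def buffer_change_per_second(bandwidth_kbps: float, bitrate_kbps: float) -> float:
    """Net seconds of video gained (+) or lost (-) per second of wall-clock time."""
    filling_rate = bandwidth_kbps / bitrate_kbps   # C(t) / R(t)
    depletion_rate = 1.0                           # 1 s of video played per second
    return filling_rate - depletion_rate
```

Downloading a 1500 kbps chunk over a 3000 kbps link grows the buffer by one second of video per second, while downloading a 2000 kbps chunk over a 1000 kbps link drains it by half a second per second and eventually stalls.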
What steps does a simple rate-based adaptation algorithm perform?
Estimation
Quantization
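The two steps can be sketched as below. The harmonic mean over the last five chunks and the 0.85 safety factor are assumptions chosen for illustration, and the bitrate ladder in the usage note is hypothetical:

```python
def estimate_throughput(past_kbps, window=5):
    """Estimation: harmonic mean of the throughput seen for recent chunks."""
    recent = past_kbps[-window:]
    return len(recent) / sum(1.0 / x for x in recent)

def quantize_bitrate(estimate_kbps, ladder_kbps, safety=0.85):
    """Quantization: highest available bitrate not above safety * estimate."""
    budget = safety * estimate_kbps
    candidates = [b for b in ladder_kbps if b <= budget]
    # If even the lowest bitrate exceeds the budget, fall back to the lowest.
    return max(candidates) if candidates else min(ladder_kbps)
```

For example, with a ladder of 300, 750, 1500, and 3000 kbps and a steady measured throughput of 2000 kbps, the budget is 1700 kbps and the player would request the 1500 kbps encoding.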
Explain the problem of bandwidth over-estimation with rate-based adaptation.
The throughput estimate is based on past chunks, so when the available bandwidth drops, the drop is not reflected immediately. The player keeps requesting a bitrate higher than the network can sustain, draining the buffer and possibly re-buffering, until the estimate catches up
When streaming stored multimedia applications, the user must first download the entire content before it can start playing. True or False?
False
With streaming stored multimedia applications, the user can pause, fast forward, skip ahead the audio/video. True or False?
True
What common application/usage is the least sensitive to network delays?
File Transfer
What common application/usage is the least tolerant to packet losses? Assume there is no packet retransmission.
File Transfer
Consider packet loss with VoIP application. Using TCP instead of UDP for VoIP applications results in __________ packet loss.
Less
Consider end-to-end delay with VoIP application. Using TCP instead of UDP for VoIP applications results in __________ end-to-end delay.
More
Available bandwidth is one of the QoS metrics for VoIP applications. True or False?
False
A longer jitter buffer reduces the number of packets that are discarded because they were received too late, but that adds to the end-to-end delay. True or False?
True
A shorter jitter buffer will not add to the end-to-end delay as much, but that can lead to more dropped packets, which reduces the speech quality. True or False?
True
Network conditions such as buffer sizes, queueing delays, network congestion levels have an impact on packet jitter. True or False?
True
In VoIP applications, we have a harsher definition for packet loss, as we consider a packet to be lost if it never arrives or if it arrives after its scheduled playout. True or False?
True
With Forward Error Correction we also transmit redundant data that can be used for reconstructing the stream at the receiver’s side. This approach to error recovery can lead to more bandwidth consumption. True or False?
True
With interleaving we mix chunks of audio together so we avoid scenarios where consecutive chunks are lost. This approach can lead to increased latency. True or False?
True
Which transport-level protocol is preferred for video content delivery?
TCP
What are the characteristics of a good quality of experience from the user’s perspective?
Low or zero re-buffering
High video quality
Low video quality variations
Low start up latency
With throughput-based rate adaptation, our goal is to have a buffer-filling rate that is greater than the buffer-depletion rate. True or False?
True
With rate-based adaptation, when the bandwidth changes rapidly, the player takes some time to converge to the right estimate of the bandwidth, which can lead to overestimation of the future bandwidth. True or False?
True
What are the three drawbacks to using the traditional approach of having a single, publicly accessible web server?
1: Users are worldwide; what if somebody far away wants it?
2: What if the same thing is requested again and again? Can be wasteful
3: What if the single server has an outage?
What is a CDN?
Content Distribution Network - Network of geographically distributed servers with copies of content
What are the six major challenges that Internet applications face?
Peering point congestion
Inefficient routing protocols
Unreliable networks
Inefficient communication protocols
Scalability
Application limitations and slow rate of change adoption
What are the major shifts that have impacted the evolution of the Internet ecosystem?
Shift to a focus on large scale content delivery
Topological Flattening of providers thanks to IXPs and ASes in addition to ISPs
Compare the “enter deep” and “bring home” approach to CDN server placement.
Enter Deep - Deploy many small CDN server clusters deep inside access networks in order to minimize geographic distances from recipients to content
Bring Home - Fewer but larger clusters in IXPs, easier to manage but larger delays
What is the role of DNS in the way CDN operates?
When a user makes a request, the user’s local DNS server (LDNS) queries the CDN’s DNS, which returns to the LDNS the IP address of an appropriate content server holding the content.
What are the two main steps in CDN server selection?
The first step consists of mapping the client to a cluster.
In the next step, a server is selected from the cluster.
What is the simplest approach to selecting a cluster? What are the limitations of this approach?
Pick the geographically closest one. If you’re using a remote LDNS, it can pick the wrong one. There can also be routing inefficiencies.
What metrics could be considered when using measurements to select a cluster?
Network-level, ie delay or available bandwidth
Application-level, ie re-buffering ratio and avg bitrate
How are the metrics for cluster selection obtained?
Actively: ie the LDNS will ping the clusters and see how quickly they respond
Passively: Keep track of how operations from the same IP address have performed prior
Explain the distributed system that uses a 2-layered system.
A coarse‑grained global layer operates at larger time scales (timescale of a few tens of seconds (or minutes)). This layer has a global view of client quality measurements. It builds a data‑driven prediction model of video quality.
A fine‑grained per‑client decision layer that operates at the millisecond timescale. It makes actual decisions upon a client request. This is based on the latest (but possibly stale) pre‑computed global model and up‑to‑date per‑client state.
What are the challenges of the distributed system using a 2-layered system?
It needs to have data for different subnet‑cluster pairs. Thus, some of the clients deliberately need to be routed to sub‑optimal clusters.
What are the strategies for server selection? What are the limitations of these strategies?
Assign one randomly - could end up picking one with a higher workload
Use least-loaded server - not all servers have all the content at all time, so you could get one which currently doesn’t have it, leading to a longer wait while it is copied over
What is consistent hashing? How does it work?
The main idea behind consistent hashing is that servers and content objects are mapped to the same ID space. For instance, imagine we map the servers to the edge of a circle (say uniformly), and each item is handled by the next server after it on the circle. Given the order Server 1, Item 4, Item 8, Server 12, Item 13: Server 12 is responsible for Items 4 and 8, but not 13. If a server leaves, the next server on the circle takes over its responsibilities.
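The example above can be reproduced with a sorted list of server IDs. This is a minimal sketch that uses the IDs from the card directly; a real deployment would first hash server names and object names into the ID space. The class name and the ID space size of 16 are assumptions:

```python
import bisect

class Ring:
    """Servers and items share one circular ID space of size `space`."""

    def __init__(self, server_ids, space=16):
        self.space = space
        self.servers = sorted(server_ids)

    def owner(self, item_id):
        """First server at or after the item's ID, wrapping around the circle."""
        i = bisect.bisect_left(self.servers, item_id % self.space)
        return self.servers[i % len(self.servers)]

    def remove(self, server_id):
        """A server leaves; its items fall to the next server on the circle."""
        self.servers.remove(server_id)

ring = Ring([1, 12])
owners_before = {item: ring.owner(item) for item in (4, 8, 13)}
ring.remove(12)
owners_after = {item: ring.owner(item) for item in (4, 8, 13)}
```

Before the removal, Items 4 and 8 belong to Server 12 and Item 13 wraps around to Server 1. After Server 12 leaves, only its items move; Item 13 stays where it was.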
Why would a centralized design with a single DNS server not work?
It would be a single point of failure, it could not handle the volume of queries, it would be far away from many clients, and a single huge database would be hard to maintain.
What are the main steps that a host takes to use DNS?
- The user host runs the client side of the DNS application
- The browser extracts the hostname www.someschool.edu and passes it to the client side of the DNS application
- The DNS client sends a query containing the hostname to a DNS server
- The DNS client eventually receives a reply which includes the IP address for the hostname
- As soon as the host receives the IP address, it can initiate a TCP connection to the HTTP server located at port 80 at that IP address
What are the services offered by DNS, apart from hostname resolution?
Mail server/Host aliasing
Load distribution
What is the structure of the DNS hierarchy?
Distributed Hierarchical Database: Each node can have multiple children
Why does DNS use a hierarchical scheme?
Would be too much traffic otherwise, would have a single point of failure otherwise, allows for many transactions with many clients to happen concurrently
What are the layers of the DNS hierarchy?
Root DNS servers
Top level domain (TLD) Servers
Authoritative servers
Local DNS servers
What is the difference between iterative and recursive DNS queries?
In the iterative query process, the querying host is referred to a different DNS server in the chain, until it can fully resolve the request.
Whereas in the recursive query process, the querying host, and then each DNS server in the chain, queries the next server and delegates the query to it.
What is DNS caching?
The idea of DNS Caching is that, in both iterative and recursive queries, after a server receives the DNS reply of mapping from any host to IP address, it stores this information in the Cache memory before sending it to the client
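A minimal sketch of such a cache, assuming each cached mapping carries a time-to-live (TTL) in seconds after which it must be discarded. The class and method names are hypothetical, and the injectable clock exists only to make the behavior easy to test:

```python
import time

class DnsCache:
    """Hostname -> IP address cache whose entries expire after their TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}   # hostname -> (ip_address, expiry_time)

    def put(self, hostname, ip_address, ttl_seconds):
        """Store a mapping learned from a DNS reply."""
        self._entries[hostname] = (ip_address, self._clock() + ttl_seconds)

    def get(self, hostname):
        """Return the cached IP address, or None if absent or expired."""
        entry = self._entries.get(hostname)
        if entry is None:
            return None
        ip_address, expiry = entry
        if self._clock() >= expiry:
            del self._entries[hostname]   # stale entry: force a fresh query
            return None
        return ip_address
```

A server would call `get` before forwarding a query and `put` after receiving a reply, so repeated queries for the same hostname are answered locally until the entry expires.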
What is a DNS resource record?
The DNS servers store the mappings between hostnames and IP addresses as resource records (RRs). These resource records are contained inside the DNS reply messages. A DNS resource record has four fields: (name, value, Type, TTL). The TTL specifies the time (in sec) a record should remain in the cache. The name and the value depend on the type of the resource record.
What are the most common types of resource records?
A, NS, CNAME, MX
TYPE=A: the name is a domain name and value is the IP address of the hostname. (abc.com, 190.191.192.193, A)
TYPE=NS: the name is the domain name, and the value is the appropriate authoritative DNS server that can obtain the IP addresses for hosts in that domain. (abc.com, dns.abc.com, NS)
TYPE=CNAME: the name is the alias hostname, and the value is the canonical name, (abc.com, relay1.dnsserver.abc.com, CNAME)
TYPE=MX: the name is the alias hostname of a mail server, and the Value is the canonical name of the email server. (abc.com, mail.dnsserver.abc.com, MX)
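The four-field record and the example tuples above can be modeled as follows. The TTL of 300 seconds and the `lookup` helper are assumptions added for illustration:

```python
from typing import NamedTuple

class ResourceRecord(NamedTuple):
    name: str
    value: str
    type: str
    ttl: int   # seconds the record may stay in a cache

RECORDS = [
    ResourceRecord("abc.com", "190.191.192.193", "A", 300),
    ResourceRecord("abc.com", "dns.abc.com", "NS", 300),
    ResourceRecord("abc.com", "mail.dnsserver.abc.com", "MX", 300),
]

def lookup(records, name, rtype):
    """Return the values of all records matching the given name and type."""
    return [r.value for r in records if r.name == name and r.type == rtype]
```

Here `lookup(RECORDS, "abc.com", "NS")` returns the authoritative server name, while the same name queried with type A returns the IP address.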
Describe the DNS message format.
ID, Flags, Question, Answer, Authority, Additional
The first field is an ID that is an identifier for the query and it allows the client to match queries with responses.
The flags section has multiple fields. For example, one field specifies whether the DNS message is a query or a response. Another field specifies whether a query is recursive or not.
The question section contains information about the query that is being made, for example the hostname that is being queried and the type of the query (A, MX, etc).
In the answer section, if the message is a reply, we will have the resource records for the hostname that was originally queried.
In the authority section, we have resource records for other authoritative servers.
The additional section contains other helpful records. For example, if the original query was for an MX record, then the answer section will contain the resource record for the canonical hostname of the mail server, and the additional section will contain the IP address for the canonical hostname
What is IP Anycast?
The main goal of IP anycast is to route a client to the “closest” server, as determined by BGP (Border Gateway Protocol), a routing protocol used for inter‑AS-routing.
What is HTTP Redirection?
Essentially, when a client sends a GET request to a server, say A, it can redirect the client to another server, say B, by sending an HTTP response with a code 3xx and the name of the new server.
Having a single server for providing Internet content has what disadvantages?
Single point of failure.
Bandwidth waste in high demand for the same content.
Scalability issues.
Potentially big geographic distance between Internet hosts/users and the server.
One of the advantages of using CDNs is that the routing protocols they use take important aspects into consideration, such as congestion, latency, etc., in order to best deliver the content to the Internet users. True or False?
False
There are several factors that can make a CDN network unreliable, such as misconfigured routers, power outages, malicious attacks or natural disasters. True or False?
True
As the Internet evolves, the topology of the ISPs has become flatter, and the number of IXPs increases as time progresses due to the services they offer and the lower costs for the ISPs. True or False?
True
The major drawback of the “Enter Deep” approach is that, if one server is lost, that geographic area will experience a higher delay and lower throughput. True or False?
False
When using CDN servers for content delivery, there is more overhead than when using the traditional approach. True or False?
True
For a CDN to deliver content to an Internet user, a cluster is mapped to a client first and then a server within that cluster is selected. True or False?
True
Terei Pyrope is a cool character. True or False?
True
By using consistent hashing for server selection, in the case of a server failure, the objects that the server was responsible for can be taken care of by a random server within the same ID space. True or False?
False
When using DNS caching, what would happen if a host A makes a request for a domain that was just previously queried by another host?
The local DNS server will immediately answer the host with the IP address.
What is the type of the following resource record: (amazon.com, dns.amazon.com, ?, TTL)?
NS
IP Anycast assigns the same IP address to multiple servers in order to deliver content from CDNs by using the closest server to a client based on BGP path length. True or False?
True
HTTP redirection can only be used in order to share the load of content requests among servers. True or False?
False
What is packet classification?
Forwarding based on more than just longest prefix matching
Describe how an OpenFlow Switch works
Switch receives a packet
Switch determines highest priority matching rule
Perform action associated with highest priority matching rule
Increment internal counter
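The four steps above can be sketched as follows. The field names, action strings, and table contents are hypothetical and do not follow the actual OpenFlow wire format:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    priority: int
    match: dict          # header field -> required value
    action: str
    packet_count: int = 0

def process(packet, flow_table):
    """Apply the highest-priority matching rule and count the packet."""
    matching = [rule for rule in flow_table
                if all(packet.get(k) == v for k, v in rule.match.items())]
    if not matching:
        return None      # table miss; how misses are handled is not modeled
    best = max(matching, key=lambda rule: rule.priority)
    best.packet_count += 1
    return best.action

table = [
    Rule(10, {"dst_ip": "10.0.0.2"}, "forward:2"),
    Rule(100, {"dst_ip": "10.0.0.2", "tcp_dst": 22}, "drop"),
]
```

A packet to 10.0.0.2 on TCP port 22 matches both rules, and the priority-100 rule wins, so the packet is dropped. A packet to the same host on port 80 matches only the first rule and is forwarded.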
In the SDN approach, how is a forwarding table updated?
A remote controller computes and distributes forwarding tables that are used by each router
Which layers belong to the Management Plane?
Network Applications
Programming Languages
Language-Based Virtualization
Which layers belong to the Control Plane?
Northbound Interface
Network Operating System
Network Hypervisor
Which layers belong to the Data Plane?
Southbound Interface
Network Infrastructure
In simple terms, what do the Southbound interfaces do?
Act as connectors between control and data plane
In simple terms, what does the Network operating systems (NOS) do?
Provides abstractions, acts as a centralized controller for the SDN
In simple terms, what do the Northbound interfaces do?
They connect network applications to the network operating system. No common standard has emerged yet, so they are typically custom software
What are some examples of Network programming languages?
Pyretic, Frenetic, Merlin, Nettle, Procera, FML, etc
What do Network applications implement?
Control plane logic
In ONOS, how do instances relate to each other?
Each switch has one instance as its “master”. If an instance fails, an “election” is held among the remaining instances to find a new master for each switch the failed instance controlled.
What is the purpose of Traffic Engineering, an application of SDNs?
Optimizing the traffic flow so as to minimize power consumption
What’s an example of Traffic Engineering?
ElasticTree
What is the purpose of Mobility and Wireless, an application of SDNs?
Connecting to mobile networks, e.g., WLANs
What are two examples of Mobility and Wireless?
OpenRadio, Odin Network
What is the purpose of Measurement and Monitoring, an application of SDNs?
Keep better metrics to respond to change in network conditions
What are three examples of Measurement and Monitoring?
OpenSketch, OpenSample and PayLess
What is the purpose of Security and Dependability, an application of SDNs?
Make the network more secure
What’s an example of Security and Dependability?
CloudWatcher
What is the purpose of Data Center Networking, an application of SDNs?
Identifying issues and troubleshooting, real‑time monitoring of networks, etc
What are two examples of Data Center Networking?
LIME, FlowDiff
At a high level, what is the purpose of an SDX?
To give each participating AS the illusion of its own independent SDN switch, so it can apply its own policies while still benefitting from an IXP
What is “Integrity” in the context of internet security?
Message/content has not been modified
What is the goal of DNS abuse?
To keep malicious actions undetectable for longer
What does FIRE stand for?
Finding Rogue Networks
ASwatch uses information from which plane? To do what?
Control Plane, Identify Malicious Networks
What is Rewiring activity?
Frequent changes in provider, using lesser known providers, etc
What is IP Space Fragmentation and Churn?
Malicious ASes use very small BGP prefixes
What is BGP Routing Dynamics?
Monitoring whether the announcements (updates/withdrawals) follow normal patterns
What is a Man In The Middle Attack?
When traffic is intercepted (and possibly inspected or modified) by the attacker before it reaches its destination AS
What is the difference between a Targeted Attack and a High Impact Attack?
High Impact Attacks are meant to be noticed, Targeted Attacks are meant to be discreet
What is the goal of ARTEMIS?
To safeguard a network’s own prefixes against malicious BGP hijacking attempts
What is Prefix deaggregation?
When you announce a more specific prefix than a targeted prefix, redirecting traffic to the new one you just announced.
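This works because routers forward on the longest matching prefix. A minimal sketch using Python's standard `ipaddress` module; the prefixes and the AS labels are hypothetical:

```python
import ipaddress

def best_route(dst, announcements):
    """Longest-prefix match: the most specific covering prefix wins."""
    addr = ipaddress.ip_address(dst)
    covering = [(ipaddress.ip_network(prefix), origin)
                for prefix, origin in announcements
                if addr in ipaddress.ip_network(prefix)]
    if not covering:
        return None
    return max(covering, key=lambda item: item[0].prefixlen)[1]

# Only the /23 is announced: traffic for 10.0.1.7 follows AS-A.
routes = [("10.0.0.0/23", "AS-A")]
before = best_route("10.0.1.7", routes)

# AS-B announces the two more specific /24s that cover the same space.
routes += [("10.0.0.0/24", "AS-B"), ("10.0.1.0/24", "AS-B")]
after = best_route("10.0.1.7", routes)
```

Once the /24s are announced, traffic for every address in the /23 follows AS-B, even though the /23 announcement is still present.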
What is Mitigation with Multiple Origin AS (MOAS)?
Have third party networks/providers do BGP announcements for a targeted network
What does “DDoS” Attack stand for?
Distributed Denial-of-Service (DDoS) attack
What are Traffic Scrubbing Services?
Incoming traffic is diverted to a “scrubber”, where “clean” and “unclean” traffic are separated. Clean traffic is sent to the destination.
What are ACL Filters?
Blocklists provided by ISPs/IXPs to prevent unwanted traffic
What is BGP Flowspec?
Sets up rules for how ASes/BGP lets traffic in, which can mitigate DDoS attacks.
Why is blackholing considered effective?
It drops the traffic nearer to the sender, saving “energy” for the targeted site
What is the downside of blackholing?
All traffic, including valid traffic, is dropped
In VoIP, what is Signaling?
Setting up calls, managing them, tearing them down
Most of the time, VoIP uses which protocol?
UDP
How does the client know about the different encoding bitrates that are available, and how does it know about the URL of each of the video segments?
It receives a manifest file over HTTP with all of the metadata
What is the challenge of, “Peering point congestion”?
Little business/financial incentive to prioritize the “middle” between end users and hosts where peers connect
What is the challenge of, “Inefficient routing protocols”?
BGP was not designed for modern infrastructures
What is the challenge of, “Unreliable networks”?
Outages, DDoS attacks, anything that prevents access
What is the challenge of, “Inefficient communication protocols”?
TCP was not designed for modern internet, distance is a bottleneck, it’s hard to update TCP protocols
What is the challenge of, “Scalability”?
Accounting for situations where usage shoots up, ie viral video
What is the challenge of, “Application limitations and slow rate of change adoption”?
It’s hard to update protocols and processes since so many services/applications can only use the old one
What overall change does the proliferation of IXPs/usage of CDNs lead to?
More local traffic
What are the three different network protocols that can be used for server selection?
DNS, HTTP Redirection, IP Anycast
What is the point of Load Distribution?
Distribute traffic across different servers
What’s the typical pattern for DNS queries?
One recursive, then however many iterative it takes
What are the three data plane traffic manipulation techniques?
Dropped (blackholing)
Man-in-the-middle
Impersonation