6. Tracking and Surveillance Flashcards
IP Address
The internet protocol address is a numerical identifier given to internet-connected devices. A major transition is currently occurring from IPv4 addresses, which have effectively been exhausted, to much larger IPv6 addresses. An IPv4 address is 32 bits (2^32 possible values), while IPv6 addresses are 128 bits (2^128 possible values).
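The difference in address space can be checked with a minimal sketch using Python's standard `ipaddress` module. The two addresses are assumptions taken from the reserved documentation ranges, not real hosts.

```python
import ipaddress

# Example addresses from reserved documentation ranges (assumed, not real hosts).
v4 = ipaddress.ip_address("192.0.2.1")
v6 = ipaddress.ip_address("2001:db8::1")

# max_prefixlen is the address width in bits: 32 for IPv4, 128 for IPv6.
v4_bits = v4.max_prefixlen
v6_bits = v6.max_prefixlen

# The size of each address space is 2 raised to the number of bits.
v4_space = 2 ** v4_bits
v6_space = 2 ** v6_bits

print(f"IPv{v4.version}: {v4_bits} bits, {v4_space} addresses")
print(f"IPv{v6.version}: {v6_bits} bits, {v6_space} addresses")
```

IPv6 does not merely quadruple the space: each added bit doubles it, so 96 extra bits multiply the number of addresses by 2^96.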
IP Packets
Each IP packet consists of a header and the data payload. The exact format of the packet depends on the protocol, but includes the IP address of the data’s source and the address of its destination. It also includes a checksum over the header for error checking, as well as information about how the packet should be routed and the protocol that the packet is using.
In the typical case, the information included in an IP packet allows it to be transmitted across networks using packet routing. Using the information included in the header of the IP packet, each router passes a packet on to the next router closer to its final destination. Once packets reach their final destination, the contents are reassembled into their original form, such as an image or other user-friendly file.
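As a rough illustration of the header fields described above, the sketch below packs and parses a minimal 20-byte IPv4 header with no options. The helper names (`ipv4_checksum`, `build_header`, `parse_header`) and the example addresses are assumptions made for this sketch; real network stacks handle many more cases.

```python
import socket
import struct

HEADER_FORMAT = "!BBHHHBBH4s4s"  # 20-byte IPv4 header without options

def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum over the header's 16-bit words."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    # Fold carry bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_header(src: str, dst: str, payload_len: int, protocol: int = 6) -> bytes:
    """Pack a header; protocol 6 is TCP, 17 is UDP."""
    version_ihl = (4 << 4) | 5  # version 4, header length of five 32-bit words
    fields = [version_ihl, 0, 20 + payload_len, 0, 0, 64, protocol, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    # The checksum is computed with its own field set to zero, then filled in.
    fields[7] = ipv4_checksum(struct.pack(HEADER_FORMAT, *fields))
    return struct.pack(HEADER_FORMAT, *fields)

def parse_header(header: bytes) -> dict:
    """Unpack the fields a router would read from the header."""
    (version_ihl, _tos, total_len, _ident, _frag,
     ttl, protocol, _checksum, src, dst) = struct.unpack(HEADER_FORMAT, header[:20])
    return {
        "version": version_ihl >> 4,
        "total_length": total_len,
        "ttl": ttl,
        "protocol": protocol,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
        # Summing a valid header, checksum included, yields zero.
        "checksum_ok": ipv4_checksum(header[:20]) == 0,
    }

header = build_header("192.0.2.1", "198.51.100.7", payload_len=100)
info = parse_header(header)
print(info)
```

Because the checksum covers only the header, a router can verify it without reading the payload.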
TCP/IP & UDP
Two of the most popular protocols that sit on top of IP are the transmission control protocol (TCP) and user datagram protocol (UDP). Whereas TCP guarantees delivery of a packet and encompasses mechanisms for verifying delivery and resending packets that did not make their way to the destination, UDP makes no such guarantees. As a result, TCP is generally used when it is important that data be delivered in its entirety, even if it takes longer. For instance, TCP would normally be used when downloading a photograph from a website. In contrast, by not making guarantees about the eventual delivery of data, UDP can operate more quickly and with less overhead. In cases where speed trumps reliability, such as in a video stream of a live sports event, UDP is generally used. If the data for a few seconds of this live video stream were to be lost in transit, it would not be useful to invoke a retransmission procedure and receive this data at a later point since the moment would have passed.
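The contrast can be seen in how programs use the two protocols. The following is a minimal sketch over the local loopback interface, assuming a machine where loopback sockets are permitted. The payload strings are placeholders, and loss is not simulated.

```python
import socket

LOOPBACK = ("127.0.0.1", 0)  # port 0 asks the operating system for any free port

# UDP: no connection setup. The datagram is sent once; if it were lost,
# nothing in UDP itself would notice or resend it.
udp_receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_receiver.bind(LOOPBACK)
udp_receiver.settimeout(2.0)
udp_sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_sender.sendto(b"video-frame-1", udp_receiver.getsockname())
udp_data, _ = udp_receiver.recvfrom(1024)

# TCP: a connection is established first. The operating system then
# acknowledges, orders, and retransmits segments on the program's behalf.
tcp_listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_listener.bind(LOOPBACK)
tcp_listener.listen(1)
tcp_client = socket.create_connection(tcp_listener.getsockname(), timeout=2.0)
tcp_server, _ = tcp_listener.accept()
tcp_server.settimeout(2.0)
message = b"photo-bytes"
tcp_client.sendall(message)

# TCP is a byte stream, so keep reading until the whole message has arrived.
tcp_data = b""
while len(tcp_data) < len(message):
    chunk = tcp_server.recv(1024)
    if not chunk:
        break
    tcp_data += chunk

for sock in (udp_sender, udp_receiver, tcp_client, tcp_server, tcp_listener):
    sock.close()

print(udp_data, tcp_data)
```

The TCP half needs a listening socket, a connection, and an accept step before any data moves, which is part of the overhead that UDP avoids.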
Mail User Agent
A user creates an email message using a mail user agent (MUA) at the application level of their computer. A desktop email client like Microsoft Outlook is an example of a MUA. The email message is made up of a message header and a body. The body includes the email message. The header includes a variety of addressing fields, such as the sender’s and recipients’ email addresses, the subject, and cc’d recipients.
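The header and body structure can be sketched with Python's standard `email` package. The addresses, subject, and message text are made-up examples.

```python
from email.message import EmailMessage

msg = EmailMessage()

# Header: addressing fields and the subject (all values are made-up examples).
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Cc"] = "carol@example.com"
msg["Subject"] = "Lunch on Friday?"

# Body: the message text itself.
msg.set_content("Are you free at noon?")

headers = dict(msg.items())
body = msg.get_content()

print(headers)
print(body)
```

Printing `msg.as_string()` would show the same message as it is serialized for transfer: header lines first, then a blank line, then the body.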
SMTP
The email message is transmitted to the user’s outgoing mail server and then sent across the internet to its destination using the Simple Mail Transfer Protocol (SMTP).
IMAP/POP/POP3
Once the email reaches its destination mail server, it is available for access either directly or by using a mail server protocol, such as the Internet Message Access Protocol (IMAP) or the Post Office Protocol (POP). When using IMAP, the emails remain on the server for access later or for access by multiple clients (e.g., a MUA on a desktop computer or a smartphone). In contrast, in POP, the MUA removes the emails from the server after storing them locally. In POP3, the email server can be configured to leave emails in the inbox.
HTTP/HTTPS
Hypertext Transfer Protocol (Secure)
The service component of a URL specifies the protocol that will be used for the request. Most commonly, web pages use HTTP for communication between a web browser and the web server that hosts the page. Messages sent over HTTP are sent in plaintext, and thus are susceptible to monitoring and tampering by any of the intermediary nodes through which the HTTP packets are sent.
To prevent monitoring or tampering of data traveling over the internet, HTTPS (hypertext transfer protocol secure) can be used. This protocol is similar to HTTP except that data is encrypted using transport layer security (TLS).
Historically, many websites preferred to send traffic over HTTP, rather than HTTPS, for performance reasons. Unfortunately, this decision to use HTTP and therefore send web traffic in plaintext also meant that the bulk of web traffic could be monitored. However, the adoption of HTTPS greatly accelerated around 2017. A number of factors appear to have spurred HTTPS adoption, ranging from how web browsers began to flag the insecurity of sites served over HTTP to the nonprofit Let’s Encrypt certificate authority (CA) beginning to offer the X.509 certificates required for HTTPS deployment for free.
HTTPS DOES NOT PROVIDE ANONYMITY. Network observers still can see the source and destination of traffic, which are left unencrypted in the packet headers so that the request or response can be routed to the right destination. For instance, a user who visits example.com over HTTPS will reveal to network observers that their IP address is communicating with example.com’s IP address. While the body of the request or response, such as the precise page requested and delivered, is encrypted, the privacy provided can be imperfect. Which page is being viewed can sometimes be inferred simply based on the size and the timing of the encrypted data returned, even without observing the unencrypted data itself. Anonymizers can be used to mask the link between the source (the user) and the destination of the network traffic. Two major types of anonymizers are anonymous proxies and onion routers.
X.509 Certificate
An X.509 certificate is a digital certificate that uses the widely accepted international X.509 public key infrastructure (PKI) standard to verify that a public key belongs to the user, computer or service identity contained within the certificate.
A public key is a large numerical value used to encrypt data or check the legitimacy of a digital signature. A PKI, moreover, is the underlying framework that enables entities like users and servers to securely exchange information using digital certificates.
The X.509 certificate is a safeguard against malicious network impersonators. When a certificate is signed by a trusted authority, or is otherwise validated, the device holding the certificate can validate documents. It can also use a public key certificate to secure communications with a second party.
Port
Along with the host, a port can optionally be specified. Ports allow numerous programs and processes on one computer to communicate simultaneously with many other machines without accidentally jumbling the conversations, similar to the way mail can be correctly routed to a resident of a large apartment building that has a single street address by specifying an apartment number. Although a single computer has 65,535 ports for use by both TCP and UDP, there are default ports to which requests following particular protocols should be made. For instance, HTTP requests are sent to TCP port 80 by default, while HTTPS requests are sent to TCP port 443. When a URL does not specify a port, the default port for its protocol is used; an HTTPS URL with no port, for example, is requested on port 443.
Host of URL/domain
The host portion of the URL specifies who will receive the request, most often a computer server owned or contracted by the group represented by the website. The host can also be referred to as the site’s domain.
Resource of URL
Finally, the resource portion of the URL specifies exactly which page, image or other object should be returned.
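The service, host, port, and resource components of a URL can be pulled apart with Python's standard `urllib.parse` module. In this minimal sketch, the `describe_url` helper, the `DEFAULT_PORTS` table, and the example URLs are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Default ports for the two protocols discussed in these cards.
DEFAULT_PORTS = {"http": 80, "https": 443}

def describe_url(url: str) -> dict:
    """Split a URL into its service, host, port, and resource components."""
    parts = urlsplit(url)
    # Fall back to the protocol's default port when the URL names none.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "service": parts.scheme,
        "host": parts.hostname,
        "port": port,
        "resource": parts.path or "/",
    }

implicit = describe_url("https://www.example.com/articles/today.html")
explicit = describe_url("http://www.example.com:8080/logo.png")
print(implicit)
print(explicit)
```

The first URL names no port, so the sketch reports the HTTPS default; the second overrides the HTTP default with an explicit port.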
Deep Packet Inspection
Only the IP header, the first part of a packet, is required for network hardware to accurately route a packet to its destination. It is possible for network hardware to examine header information for other protocols or the full body of the network packet for a variety of purposes. When nodes look at this additional data, it is called deep packet inspection.
Deep packet inspection serves multiple purposes. For example, the ability to examine additional information within packets before they pass into a local organizational network can help determine whether or not the packets contain malicious content, such as known viruses. Alternatively, examining packets before they leave a network can help prevent data leaks, assuming the organization can scan these packets to detect sensitive information that should not leave the organization.
Deep packet inspection is also used for a variety of nonorganizational purposes. It is used by advertisers to track users’ online behavior to better target ads and by government entities to censor or track citizens’ online behaviors; both of these activities raise privacy concerns. In China, deep packet inspection is used as part of the “Great Firewall,” which the government uses to perform large-scale censorship on potentially sensitive topics. Some opponents of deep packet inspection note that it can be used to violate the principle of net neutrality because it allows network traffic and bandwidth shaping based on the content of a packet.
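The difference between ordinary routing and deep packet inspection can be sketched as follows. The packet is modeled as a plain dictionary, and the signature list and function names are hypothetical; real inspection systems parse actual protocol formats and use far richer rules.

```python
# Hypothetical signatures an organization might scan for (illustrative only).
BLOCKED_SIGNATURES = [b"EICAR-TEST", b"CONFIDENTIAL"]

def route_only(packet: dict) -> str:
    """Ordinary routing: read the destination from the IP header and nothing else."""
    return packet["header"]["destination"]

def deep_inspect(packet: dict) -> list:
    """Deep packet inspection: also scan the payload for known signatures."""
    return [sig.decode() for sig in BLOCKED_SIGNATURES if sig in packet["payload"]]

packet = {
    "header": {"source": "192.0.2.1", "destination": "198.51.100.7"},
    "payload": b"Q3 plan CONFIDENTIAL draft",
}

next_destination = route_only(packet)
matches = deep_inspect(packet)
print(next_destination, matches)
```

If the payload were encrypted, for example by HTTPS, a substring scan like this would find nothing, which is one reason encryption limits what inspection can reveal.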
Wi-Fi eavesdropping, packet sniffing
Monitoring can also occur on Wi-Fi networks. It is possible to eavesdrop on or capture data being sent over a wireless network at the packet level. Several systems for Wi-Fi eavesdropping, including packet sniffing and analysis tools, are freely available.
Unsecured communications sent over an open, or shared, wireless network can be intercepted easily by others. This risk is often present in Wi-Fi hotspots in public spaces, such as hotels or coffee shops, where many users share a common Wi-Fi network that is either unprotected or protected with a password known to a large group of users.
Packet-sniffing systems capture packets sent over such networks. If the data is unencrypted, these packets can be examined and reassembled. These reassembled packets can then provide information about all the network user’s activities, including websites they visited, emails and files sent, and the data included in session cookies (such as website authentication information). Wireshark is one example of a packet sniffing and network analysis tool. It captures packet-level data on wired or wireless networks to which a user has access, allowing a user to examine and reassemble packet content. Other examples of packet sniffers include Kismet for Unix and Eavesdrop for Mac.
There are also more specialized Wi-Fi eavesdropping systems. One such tool enabled HTTP session hijacking, or “side-jacking,” attacks. When a user logs in to an internet site, the initial login process is usually encrypted. Sites often store a token on the user’s computer, and this token is sent along with future HTTP requests as proof that the user has logged in. However, some popular sites previously used HTTP, rather than HTTPS, to send this token, which means the token was sent unencrypted.
Defenses against Wi-Fi eavesdropping
There are several potential defenses against Wi-Fi eavesdropping. First, Wi-Fi eavesdropping requires that the eavesdropper have access to the Wi-Fi network and be able to read the packets that are sent. Ensuring that Wi-Fi networks are encrypted using strong passwords can limit the danger of Wi-Fi eavesdropping by preventing some adversaries from reading the traffic passing across the network. However, one Wi-Fi encryption scheme that is still in limited use, Wired Equivalent Privacy (WEP), has significant vulnerabilities and can often be broken within seconds. The Wi-Fi Protected Access (WPA) encryption scheme is also considered insecure and should not be used. At the time of press, however, its successor WPA2 was still considered secure even though its own successor, WPA3, had already been announced. Even the more recent security protocols for Wi-Fi routers can sometimes be defeated, which means that strong Wi-Fi passwords are often not sufficient to protect this communication channel. Virtual private networks (VPNs), which allow users to create secure, encrypted tunnels to send data through more trusted channels, offer a defense against interception on unsecured networks. Additionally, regardless of the security of the network itself, encrypting web requests using HTTPS can prevent eavesdroppers from intercepting sensitive or personally identifiable data.
Spyware
Spyware is malicious software that is covertly installed on a user’s computer, often by tricking users through social engineering attacks. Spyware can then monitor the user’s activities through a variety of methods. It can track online activity in several ways, including capturing cookie data to determine browsing history or directly monitoring and reporting on browsing behavior. Spyware can also directly monitor what a user is doing on their computer, either by performing screen capture and transmitting an image of the user’s screen back to the attacker, or by performing keylogging. In keylogging, malware is installed that tracks all keystrokes performed by the user. This data is then sent back to the attacker, allowing them to capture sensitive information typed by the user, such as passwords.
Anonymous proxy
Anonymous proxies allow users to anonymize their network traffic by forwarding the traffic through an intermediary. Thus, the user’s traffic appears to come from the proxy server’s IP address, rather than the original user’s IP address. JonDonym is a service that anonymizes traffic by routing packets through a mix of multiple user-chosen anonymous proxies. However, the use of an anonymous proxy requires that the user trust the anonymous proxy, and this approach runs the risk of presenting a single point of failure.
Onion-routing system
Onion-routing systems, or mix networks, are an alternative to anonymous proxies. Similar to the layers of an onion, packets sent through an onion-routing system are encrypted in layers and then sent through a series of relays in a way that is very difficult to trace. At each stage of the circuit, a node receives a packet from the previous node, strips off a layer of encryption and sends it on to the next node. Because there are multiple nodes within the circuit, each internal node does not know anything beyond the node it received the packet from and the node to which it needs to forward the packet. This configuration allows a layer of anonymity to be inserted into network traffic. However, encryption is still required to keep the data itself anonymous once it leaves the virtual circuit. Tor (The Onion Router) is an implementation of the onion-routing protocol that uses a network of volunteer-run relay nodes to enable a variety of anonymous services.
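The layering idea can be sketched with a toy example. The cipher below is a deliberately simple XOR keystream derived from SHA-256: it is NOT secure and is not what Tor uses. The relay names, keys, and helper functions are all hypothetical. The point is only to show that each relay removes one layer and learns nothing beyond the next hop.

```python
import hashlib

def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Toy XOR stream cipher (NOT secure). Applying it twice with the same key undoes it."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def wrap(message: bytes, circuit: list, destination: str) -> bytes:
    """Sender side: add the last relay's layer first so the first relay's layer is outermost."""
    packet = message
    next_hop = destination
    for name, key in reversed(circuit):
        # Each layer hides both the inner packet and the name of the next hop.
        packet = toy_cipher(key, next_hop.encode() + b"|" + packet)
        next_hop = name
    return packet

def peel(key: bytes, packet: bytes) -> tuple:
    """Relay side: strip one layer, revealing only the next hop and an inner packet."""
    plain = toy_cipher(key, packet)
    next_hop, _, inner = plain.partition(b"|")
    return next_hop.decode(), inner

# Hypothetical three-relay circuit with made-up keys.
circuit = [("relay-1", b"key-one"), ("relay-2", b"key-two"), ("relay-3", b"key-three")]
packet = wrap(b"GET /page", circuit, destination="example.com")

hops = []
for _name, key in circuit:
    next_hop, packet = peel(key, packet)
    hops.append(next_hop)

print(hops, packet)
```

After the last layer is removed, the message is back in its original form, which is why the text notes that encryption such as HTTPS is still needed once traffic leaves the circuit.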
Cookies
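Web browsers typically communicate with web servers using HTTP or HTTPS to access websites. Although these protocols are stateless, which means they are not expected to remember past transactions, it is useful for websites to be able to save state about a particular user. Cookies, small pieces of data that a website stores in the browser and that the browser sends back with later requests to that site, provide this state.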
Single origin policy for cookies
Web domains can only read and write cookies that they themselves have set, a practice known generally as the single-origin policy (more commonly called the same-origin policy).
Third-party cookies
It is often the case that visiting a single website will result in cookies from multiple companies being placed on a user’s computer because websites that appear as a single entity to the user may actually be cobbled together transparently from many different sources. For instance, a news website might load articles from its own internet domain (for instance, www.news-website.com). These sorts of cookies from the primary page that the user is visiting are known as first-party cookies. However, images on this page might be downloaded from another company (such as www.photojournalism-aggregator.com), while each advertisement on the page might be served by a different advertising network (such as www.xyz-advertising.com). Each of these domains can also set its own cookies. Cookies set from all companies other than the primary website whose URL is displayed in a browser are known as third-party cookies.
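The first-party versus third-party distinction can be sketched by comparing each resource's host with the host of the page in the address bar. The `classify_hosts` helper and the resource paths are assumptions for this sketch. Real browsers compare registrable domains rather than exact hostnames, so this is a simplification.

```python
from urllib.parse import urlsplit

def classify_hosts(page_url: str, resource_urls: list) -> dict:
    """Label each resource host as first-party or third-party relative to the page."""
    page_host = urlsplit(page_url).hostname
    labels = {}
    for url in resource_urls:
        host = urlsplit(url).hostname
        # Simplification: exact hostname match instead of registrable-domain match.
        labels[host] = "first-party" if host == page_host else "third-party"
    return labels

# Domains taken from the example in the text; the paths are made up.
labels = classify_hosts(
    "https://www.news-website.com/story.html",
    [
        "https://www.news-website.com/article-body.html",
        "https://www.photojournalism-aggregator.com/photo.jpg",
        "https://www.xyz-advertising.com/banner.js",
    ],
)
print(labels)
```

Any cookie set by a host labeled third-party here would be a third-party cookie in the sense used above.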
Beacons or Web Bugs
Elements used for tracking that are not visible to the user in the rendered web page are known as beacons or web bugs. Beacons are loaded onto a page using elements of the HTML markup language, which is the most widely used language for specifying the layout of a web page. HTML allows text and multimedia from a variety of different sources to be brought together to form a web page.
The most canonical example of a beacon is a one-pixel image whose sole purpose is to generate an HTTP request. If a user visits website A and website A embeds third-party content, such as a beacon or an advertisement, the browser will visit the third-party site to get the content and will receive a cookie alongside the content. The third-party tracker receives the cookie with the user’s unique ID, as well as the referring URL, thereby concluding that this particular pseudonymous user visited this particular URL. When the user visits a completely different site, website B, that site might also reference content from the same third party. If it does, the browser again visits the third party to fetch that content, and the cookie received on the visit to website A is sent back to the third party. The third party then knows that the user has visited both website A and website B.
Although a company can only track a user’s visits to websites on which it serves content, widespread tracking is still possible since a small number of companies include their content and beacons on many popular websites across the internet.
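The website A and website B scenario can be simulated in a few lines. The `Tracker` and `Browser` classes, the domain names, and the ID format are all hypothetical; the sketch models only the cookie and referring-URL exchange, not real HTTP.

```python
import itertools

class Tracker:
    """Hypothetical third party that serves a one-pixel beacon."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.visits = {}  # cookie ID -> list of referring URLs

    def serve_beacon(self, cookie, referer):
        # First contact: assign a pseudonymous ID, returned to the browser as a cookie.
        if cookie is None:
            cookie = f"user-{next(self._ids)}"
        # Every request reveals which page embedded the beacon.
        self.visits.setdefault(cookie, []).append(referer)
        return cookie

class Browser:
    """Minimal browser model: one cookie per third-party domain."""

    def __init__(self):
        self.cookie_jar = {}  # domain -> cookie value

    def visit(self, page_url, embedded):
        # Loading the page triggers a request to every embedded third party,
        # carrying any cookie that party set earlier.
        for domain, tracker in embedded.items():
            cookie = self.cookie_jar.get(domain)
            self.cookie_jar[domain] = tracker.serve_beacon(cookie, page_url)

tracker = Tracker()
browser = Browser()
browser.visit("https://site-a.example/story", {"tracker.example": tracker})
browser.visit("https://site-b.example/shop", {"tracker.example": tracker})

print(tracker.visits)
```

The tracker ends up with both URLs filed under one ID even though the two sites never communicated with each other.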
Behavioural / targeted advertising
Tracking across popular sites supports online behavioral advertising, also known as targeted advertising, which is the practice of targeting advertisements using a profile of a user based on the websites they visit. A common method of profiling involves having a list of interest categories, such as “home and garden.” These interest categories are either selected or unselected for a particular user based on inferences the company makes. As a basis for these inferences, the company can consider which web pages the user has visited. The company can also leverage information collected from other sources, both online and offline, that is funneled through data brokers. They might also misuse personal data provided to them for other purposes, such as security, for targeted advertising. Based on this data, advertisers can choose from among tens of thousands of different characteristics on which to target an advertisement.
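The selected or unselected interest categories can be sketched as a simple lookup. The site names, the category table, and the `build_profile` helper are hypothetical; actual profiling systems draw on far more signals than a list of visited hosts.

```python
# Hypothetical mapping from sites to interest categories (illustrative only).
SITE_CATEGORIES = {
    "garden-tips.example": "home and garden",
    "diy-decks.example": "home and garden",
    "marathon-news.example": "sports",
    "car-reviews.example": "autos",
}
ALL_CATEGORIES = sorted(set(SITE_CATEGORIES.values()))

def build_profile(visited_hosts: list) -> dict:
    """Mark each interest category as selected (True) or unselected (False)."""
    inferred = {SITE_CATEGORIES[h] for h in visited_hosts if h in SITE_CATEGORIES}
    return {category: category in inferred for category in ALL_CATEGORIES}

profile = build_profile(["garden-tips.example", "unknown.example", "marathon-news.example"])
print(profile)
```

An advertiser targeting "home and garden" would match this profile, while one targeting "autos" would not.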
Excerpt From
IAPP_T_TB_Introduction-to-Privacy-for-Technology_1.1
This material may be protected by copyright.