4. Identity and Anonymity Flashcards
Forms of identity
- identified individual
- pseudonym (we can detach online presence from the actual person; however, this kind of privacy can be illusory, as it is often possible to identify the actual person behind the pseudonym)
- anonymity (“The weakest form of identity is anonymity. With truly anonymous data, we not only do not know the individual the data is about, we cannot even tell if two data items are about the same individual.”)
The differences can easily be seen using a formal definition. Assume we have a set of data items D = {d1, …, dn}, and an identity function I(d) that gives us information on whom the data item d is about. If we can say that, for a known individual i, I(d) = i, then I(d) is an identified individual. If we can say that I(dj) = I(dk) (the two data items are about the same individual), but we do not know who that individual is, then I(dk) is a pseudonym. If we cannot make either statement (identified individual or pseudonym), then the data is anonymous.
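The three cases in the formal definition can be sketched in a few lines of Python. This is only an illustration: the function name classify, the two lookup tables, and the sample values are assumptions made for the example, not part of the source definition.

```python
from typing import Dict

def classify(item: str,
             known_identity: Dict[str, str],
             link_key: Dict[str, str]) -> str:
    """Return 'identified', 'pseudonymous', or 'anonymous' for a data item.

    known_identity: data item -> known individual, i.e. I(d) = i.
    link_key: data item -> opaque key; two items with the same key are
    known to be about the same (unnamed) individual, i.e. I(dj) = I(dk).
    """
    if item in known_identity:
        return "identified"
    key = link_key.get(item)
    if key is not None:
        # Pseudonymous only if some other item is linked to the same key.
        for other, other_key in link_key.items():
            if other != item and other_key == key:
                return "pseudonymous"
    return "anonymous"

# Hypothetical sample data for the three cases.
known = {"d1": "Alice"}
links = {"d2": "user-7f3a", "d3": "user-7f3a"}
print(classify("d1", known, links))  # identified
print(classify("d2", known, links))  # pseudonymous
print(classify("d4", known, links))  # anonymous
```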
Reasons for a system to know a person’s identity
- Access control
- Attribution (the ability to prove who performed an action)
- Enhance user experience (pseudonym might be enough here)
Representing an identity
- External, easy to remember identifier (e.g. name and date of birth or address)
- user specified identifier (user ID)
- externally created identifiers (e.g. email address)
- using systems created for that purpose. The X.500 standard provides a flexible framework for storing and maintaining identifying information, as do commercial systems such as Microsoft Passport or Google Wallet. Cryptographic certificates and public-key infrastructure (see Chapter 3) also provide mechanisms to verify identity. These systems generally combine representations of identity with other identity-related information (name, address) and can provide authentication mechanisms.
- Biometrics
X.500
X.500 is a series of computer networking standards used to develop the equivalent of an electronic directory that is very similar to the concept of a physical telephone directory. Its purpose is to centralize an organization’s contacts so that anyone within (and sometimes without) the organization who has Internet access can look up other people in the same organization by name or department. Several large institutions and multinational corporations have implemented X.500.
Authentication categories
Authentication is used to ensure that an individual performing an action matches the expected identity. Authentication can be accomplished by a variety of mechanisms, each with advantages and drawbacks. These mechanisms fall into four main categories:
- What you know: secret knowledge held only by the individual corresponding to the identity
- What you have: authentication requires an object possessed by the individual
- Where you are: the location matches the expected location
- What you are: biometric data from the individual
Authentication methods typically involve authentication information held by the user, complementation information held by the server/host, and an authentication function that takes a piece of authentication information and a piece of complementation information and determines whether they do or do not match.
A man in the middle attack
A simple example of a password attack performed directly through the system is the man-in-the-middle attack, in which a computer program intercepts traffic and reads the password contained in the intercept. To combat this attack, passwords are typically protected cryptographically. Instead of the password itself (authentication information) being presented to the system, a one-way hash of the password is used, and the system stores only the hash (complementation information). As it is extremely difficult to discover the password from the hash, this prevents the man in the middle, or an intruder who has gained access to the system, from obtaining a user’s password.
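The idea that the system keeps only a one-way hash, and that an authentication function compares what the user supplies against it, can be sketched as follows. This is a minimal illustration and not a prescribed design: the function names enroll and authenticate, the per-user salt, the choice of PBKDF2 with SHA-256, and the iteration count are all assumptions added for the example (the notes only say "one-way hash").

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor, chosen for illustration only

def enroll(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive the value the system stores; the password itself is not kept."""
    if salt is None:
        salt = os.urandom(16)  # random per-user salt (an added assumption)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def authenticate(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Authentication function: hash the supplied password and compare."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, stored_digest)

salt, stored = enroll("correct horse battery staple")
print(authenticate("correct horse battery staple", salt, stored))  # True
print(authenticate("wrong guess", salt, stored))                   # False
```

An intruder who copies salt and stored learns neither the password nor a quick way to recover it, which is the property the notes rely on.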
Passwords/PINs - what you know approach to authentication
One of the most common approaches to authenticating a user is through passwords or PINs. This is an example of what you know authentication: It is assumed that only the proper individual knows the password. Passwords can provide a high level of assurance that the correct individual is being identified, but when used improperly, they can easily be broken.
Attacks on password-based authentication fall into two categories: attacks on the password itself (e.g., guessing short passwords) and password attacks performed directly through the system (e.g., a man-in-the-middle attack).
Replay attack
While the man in the middle may not know the password, he only needs to replay the hash of the password to gain access; this is called a replay attack. This kind of attack is easily combated through system design. Challenge response authentication issues a unique challenge for each authentication: The response must be correct for each challenge. With a hashed password, the challenge is an encryption key sent by the system. The user application uses the key to encrypt the hash of the password; this is compared with the system’s encryption of the stored value of the hashed password. Each authentication uses a different key, and thus a replay attack fails because the replayed password (response) is not encrypted with the current key (challenge).
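A challenge-response exchange of this kind might look like the sketch below. It substitutes a keyed hash (HMAC-SHA-256, with the challenge as the key) for the "encrypt the hash with the key" step described above, which is a swapped-in technique chosen because it needs only the standard library. The function names and the use of unsalted SHA-256 for the stored hash are assumptions for illustration.

```python
import hashlib
import hmac
import os

def password_hash(password: str) -> bytes:
    """Hash held by both the user application and the system."""
    return hashlib.sha256(password.encode("utf-8")).digest()

def new_challenge() -> bytes:
    """System side: a fresh random challenge for each authentication."""
    return os.urandom(16)

def respond(challenge: bytes, pw_hash: bytes) -> bytes:
    """User side: bind the password hash to this particular challenge."""
    return hmac.new(challenge, pw_hash, hashlib.sha256).digest()

def verify(challenge: bytes, response: bytes, stored_hash: bytes) -> bool:
    """System side: recompute the expected response from the stored hash."""
    expected = respond(challenge, stored_hash)
    return hmac.compare_digest(expected, response)

stored = password_hash("s3cret")

first = new_challenge()
captured = respond(first, password_hash("s3cret"))
print(verify(first, captured, stored))   # True: legitimate login

second = new_challenge()
print(verify(second, captured, stored))  # False: replayed response is rejected
```

Because each login uses a different challenge, a response captured by the man in the middle is useless for any later attempt.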
What you have approach to authentication / Devices
The what you have approach to authentication typically uses computing devices. Identification badges or smart cards can be used; these require that the computing terminal have the ability to read the computing device. A convenient approach is to embed a radio frequency identification (RFID) chip in the device; this does require a reader, but the user doesn’t actually have to swipe the card. This particular technology introduces a privacy risk in that a malicious actor with a remote RFID reader can detect when the user is nearby, even though they are not actually trying to authenticate. If the actor can read the RFID card, then they may be able to “become” that individual through a replay attack; more advanced RFID approaches use a challenge-response approach to mitigate this attack.
Devices also exist that don’t require special hardware at the client’s terminal. These are typically in the form of small devices that display a changing PIN; the timing and sequence of PINs are known to the system. The user can type the PIN being displayed by the device just like a password, and the system checks to see if the given PIN matches what the device should be displaying.
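One widely used scheme for such changing PINs is the time-based one-time password (TOTP) of RFC 6238; the notes do not say which scheme a given device uses, so treat this as one possible instance. The sketch below assumes the device and the system share a secret and have roughly synchronized clocks. The function names and the one-step drift allowance are illustrative choices.

```python
import hashlib
import hmac
import struct

def totp(secret: bytes, at_time: float, step: int = 30, digits: int = 6) -> str:
    """PIN the device should display at at_time (seconds since the epoch)."""
    counter = int(at_time // step)                       # index of the time window
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                              # dynamic truncation
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def check_pin(secret: bytes, submitted: str, now: float, step: int = 30) -> bool:
    """System side: accept the PIN for the current window or its neighbours."""
    for drift in (-1, 0, 1):                             # tolerate small clock skew
        t = now + drift * step
        if t >= 0 and hmac.compare_digest(totp(secret, t, step), submitted):
            return True
    return False

secret = b"12345678901234567890"                         # RFC 6238 test secret
print(totp(secret, 59, digits=8))                        # 94287082 (RFC 6238 test vector)
print(check_pin(secret, totp(secret, 1000), 1010))       # True: same 30-second window
```

The system never needs to contact the device: it recomputes what the device should be showing and compares.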
Lastly, the computing device may be the computer the person uses to access the system (e.g., a home computer, laptop, smartphone). The system stores the IP address of the device or uses browser cookies to store a unique key on the machine; this allows the system to check whether the attempt to authenticate comes from a device previously used. Since the user already has the device, this requires no additional hardware.
Device-based authentication becomes problematic when devices are lost or stolen—until the loss is recognized and reported, access to the system may be compromised. As a result, these systems are typically combined with passwords or some other form of authentication so that the lost device alone cannot be used to gain access.
Where you are based authentication / Location
Location-based authentication is typically used in corporate networks. Access to corporate resources is limited to computers physically located in the company. This requires an attacker to gain physical access as well as defeat other authentication (such as passwords), making unauthorized access far more difficult. Of course, this also prevents legitimate use from outside the network, requiring the use of virtual private networks (VPNs). A VPN provides an encrypted link to the corporate network, and typically requires a high standard of authentication to make up for the loss of location-based authentication.
Note that location-based authentication can be used in other ways as well. Credit card issuers may reject transactions at unfamiliar locations unless the customer has provided advance notice of travel. While this may seem invasive from a privacy point of view, such location information will likely be made available anyway—for example, from the credit card use or the IP address used to connect to the system, so little additional information is disclosed when providing a list of authorized locations, such as a travel itinerary.
While location is useful, it should almost always be viewed as a secondary form of authentication, used to provide stronger evidence that the primary form of authentication is correct.
What you are authentication / Biometrics
What you are as a form of authentication is growing increasingly popular. Notebook computers are available with fingerprint readers, and cameras and microphones are becoming standard equipment on many devices. Fingerprints, face and voice recognition, and other biometric methods for authentication are becoming increasingly available. This brings advantages, but also raises privacy issues.
First, systems using biometric data must protect that data. If a user’s password is compromised, the user can change it—but cannot be asked to change a face or fingerprint. As with passwords, careful system design is needed to ensure that an attacker cannot obtain or spoof the biometric data.
Use of biometric data raises inherent privacy concerns. While passwords can be associated with a pseudonym, a fingerprint is inherently identifying, and a pseudonymous account using a fingerprint for authentication should probably be considered individually identifiable. There may also be cultural issues; some users may be reluctant to have a photograph taken or to display their face for use in biometric authentication.
A second type of biometrics is based on behavior—for example, typing rate or patterns of mouse movement. While these give only a degree of assurance, they provide the opportunity for continuous authentication. Once a user authenticates to the system, the behavior in using the system can be used to ensure that the user hasn’t walked away and someone else has stepped in to use the account.
MFA
The idea behind multifactor authentication is to require two different mechanisms, coming from two of the above categories (what you know, what you have, where you are, what you are). A common example is the use of a device (often an individual’s cell phone) in addition to a password.
Good implementation of two-factor authentication can make many types of attacks, such as man-in-the-middle attacks, more difficult. The key is that the two factors should proceed through independent channels, such as a password combined with a one-time temporary security code sent via a text message (SMS). While this does not eliminate attacks, an attacker must now compromise two independent systems. Conversely, forms of two-factor authentication that draw both factors from the same category (such as a password and security questions) are much less effective; a targeted attack to use personal information to guess a password will likely acquire the personal information needed to answer the security questions as well.
Authentication
Authentication is the means by which a system knows that the identity matches the individual who is actually using the system. There are several approaches to authentication. Often, these can be used in combination, significantly decreasing the risk of a successful attack or attempt at impersonating the user.
Authentication can be separated from the systems requiring authenticity. This is a feature of single-sign-on systems. Authentication is performed by a service that provides a time-stamped cryptographic token to the user’s system (e.g., web browser). This token can be provided to other systems, which can decide if the source of authentication (secured using a digital certificate), recency of authentication and user identified by the token satisfy access policy without requiring separate authentication.
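The token flow can be sketched as follows. To stay within the standard library, this example authenticates the token with a shared-secret HMAC instead of the certificate-based signature the notes mention, so it is a simplified stand-in and not how any particular single-sign-on product works. The token layout, function names, and policy parameters (max_age, allowed_users) are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional, Set

def issue_token(user: str, key: bytes, now: float) -> str:
    """Authentication service: emit a time-stamped, integrity-protected token."""
    payload = json.dumps({"user": user, "issued_at": now}, sort_keys=True).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode("ascii") + "."
            + base64.urlsafe_b64encode(tag).decode("ascii"))

def accept_token(token: str, key: bytes, now: float,
                 max_age: float, allowed_users: Set[str]) -> Optional[str]:
    """Relying system: check source, recency, and user against its own policy."""
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:
        return None                                   # malformed token
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None                                   # not from the trusted service
    claims = json.loads(payload)
    if now - claims["issued_at"] > max_age:
        return None                                   # authentication too old
    if claims["user"] not in allowed_users:
        return None                                   # user not permitted here
    return claims["user"]

key = b"k" * 32                                       # hypothetical shared key
token = issue_token("alice", key, now=1000.0)
print(accept_token(token, key, now=1060.0, max_age=300, allowed_users={"alice"}))  # alice
print(accept_token(token, key, now=2000.0, max_age=300, allowed_users={"alice"}))  # None
```

Each relying system applies its own max_age and user list, so the same token can be accepted by one system and refused by a stricter one without the user authenticating again.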
Authentication must balance assuring the accuracy of an individual’s identity and the usability of the system. While authentication needs to be strong enough to protect personal information, excessive use of technology to perform authentication can reduce the practical effectiveness of the system and create new privacy issues by collecting sensitive personal information needed to implement complex authentication mechanisms.
Radio Frequency Identification (RFID)
Radio Frequency Identification (RFID) is a technology that is similar in theory to barcode identification. It is the wireless, non-contact use of radio frequency electromagnetic fields to transfer data, for the purpose of automatically identifying and tracking tags attached to objects.
The tags contain electronically stored information. Some tags are powered and read at short ranges by magnetic fields. Others are powered by a local power source such as a battery, or in some cases they don’t have a battery but collect energy from the interrogating EM field, and then act as a passive transponder to emit microwaves or UHF radio waves.
TAILS
Tails (The Amnesic Incognito Live System) is a live operating system that you can start on almost any computer from a USB stick. It aims to preserve your privacy and anonymity by routing all your internet traffic through the Tor network and by providing a host of other privacy-centric features.