Book Three-Chapter 1-Steganography Flashcards

Question 1

Q

Define Steganography

Answer

A

Steganography is the practice of embedding hidden messages within a carrier medium. Mathematicians, military personnel, and scientists have used it for centuries. The use of steganography dates back to ancient Egypt. Today steganography, in its digital form, is widely used on the Internet and in a variety of multimedia forms.

Modern steganography works by replacing bits of useless or unused data in regular computer files with bits of different, invisible information. When a file cannot be encrypted, the next best option for safe transfer is steganography. Steganography can also be used to supplement encryption. Used this way, it provides a double measure of protection: even if the encrypted file is deciphered, the message hidden by steganography still cannot be seen. The receiver of the file has to use special software to extract a message hidden by steganography.

A stegosystem is the mechanism that is used in performing steganography. The following components make up a stegosystem:

Embedded message: The original secret message to be hidden behind the cover medium

Cover medium: The medium used to hide the message

Stego-key: The secret key used to encrypt and decrypt the message

Stego-medium: The combined cover medium and embedded message

Steganography can be used for a variety of legal and illegal uses. It can be used for the following purposes:

Medical records: Steganography is used in medical records to avoid any mix-up of patients’ records. Every patient has an EPR (electronic patient record), which has examinations and other medical records stored in it.

Workplace communication: Steganography can be used as an effective method for employees who desire privacy in the workplace to bypass the normal communication channels. In this area, steganography can be an obstacle to network security.

Digital music: Steganography is also used to protect music from being copied by introducing subtle changes into a music file that act as a digital signature. BlueSpike Technology removes a few select tones in a narrow band. Verance adds signals that are out of the frequency range detectable by the human ear. Others adjust the sound by changing the frequency slightly. Digital audio files can also be modified to carry a large amount of information. Some files simply indicate that the content is under copyright. More sophisticated steganography versions can include information about the artist.

Terrorism: Certain extremist Web sites have been known to use pictures and text to secretly communicate messages to terrorist cells operating globally. Servers and computers globally provide a new twist on this covert activity.

The movie industry: Steganography can also be used as copyright protection for DVDs and VCDs. The DVD copy-protection program is designed to support a copy generation management system. Second-generation DVD players with digital video recording capabilities continue to be introduced in the black market. To protect itself against piracy, the movie industry needs to copyright DVDs.

Question 2

Q

Name the 3 main types of steganography:

Answer

A

Technical steganography

Linguistic steganography

Digital steganography

Technical Steganography:
In technical steganography, physical or chemical methods are used to hide the existence of a message. Technical steganography can include the following methods:

Invisible inks: These are colorless liquids that need heating or lighting in order to be read. For example, if onion juice or milk is used to write a message, the writing cannot be seen unless heat is applied, which makes the ink turn brown.

Microdots: This method shrinks a page-sized photograph to 1 mm in diameter. The photograph is reduced with the help of a reverse microscope.

Linguistic Steganography:
Linguistic steganography hides messages in the carrier in several ways. The two main techniques of linguistic steganography involve the use of semagrams and open codes.

Semagrams
Semagrams hide information through the use of signs or symbols. Objects or symbols can be embedded in data to send messages. Semagrams can be classified into the following types:

Visual semagrams: In this technique a drawing, painting, letter, music, or any other symbol is used to hide the information. For example, the position of items on a desk or Web site may be used to hide some kind of message.

Text semagrams: In this technique, a message is hidden by changing the appearance of the carrier text. Text can be changed by modifying the font size, using extra spaces between words, or by using different flourishes in letters or handwritten text.

Open Codes
Open codes make use of openly readable text. This text contains words or sentences that can be hidden in a reversed or vertical order. The letters should be in selected locations of the text. Open codes can be either jargon codes or covered ciphers.

Jargon codes: In this type of open code, a certain language is used that can only be understood by a particular group of people while remaining meaningless to others. A jargon message is similar to a substitution cipher in many respects, but rather than replacing individual letters the words themselves are changed.

Covered ciphers: This technique hides the message in a carrier medium that is visible to everyone. Any person who knows how the message is hidden can extract this type of message. Covered ciphers can be both null and grill ciphers.

Null ciphers: Null ciphers hide the message within a large amount of useless data. The original data may be mixed with the unused data in any order—for example, diagonally, vertically, or in reverse order—allowing only the person who knows the order to understand it.

Grill ciphers: It is possible to encrypt plaintext by writing it onto a sheet of paper through a separate pierced sheet of paper or cardboard. When an identical pierced sheet is placed on the message, the original text can be read. The grill system is difficult to crack and decipher, as only the person with the grill (sheet of paper) can decipher the hidden message.
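
As a minimal sketch of the null cipher described above, assume the sender and receiver have agreed that the hidden message is formed by the first letter of every word in the carrier text. The carrier sentence and the function name extract_null_cipher are illustrative assumptions, not part of the source material.

```python
def extract_null_cipher(carrier: str) -> str:
    """Read the first letter of each word; the rest of each word is filler."""
    return "".join(word[0] for word in carrier.split()).lower()


# Hypothetical carrier text: only the first letters carry the message.
carrier = "Have every lamp powered; move everything"
print(extract_null_cipher(carrier))  # helpme
```

Anyone who knows the reading rule can extract the message, which is why a covered cipher hides the message without encrypting it.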

Digital Steganography:
In digital steganography, the secret messages are hidden in a digital medium. The following techniques are used in digital steganography:

Injection

Least significant bit (LSB)

Transform-domain techniques

Spread-spectrum encoding

Perceptual masking

File generation

Statistical method

Distortion technique

Injection
With the injection technique, the secret information is placed inside a carrier or host file. The secret message is directly inserted into a host medium, which could be a picture, sound file, or video clip. The drawback to this technique is that the size of the host file increases, making it easy to detect. This can be overcome by deleting the original file once the file with the secret message is created. It is difficult to detect the presence of any secret message once the original file is deleted.
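
The size drawback of injection can be illustrated with a small sketch. It assumes the secret is simply appended to the host file's bytes after a delimiter; the MARKER value, the stand-in host bytes, and the function names are hypothetical choices made for this example only.

```python
MARKER = b"--HIDDEN--"  # hypothetical delimiter; assumed not to occur in the host


def inject(host: bytes, secret: bytes) -> bytes:
    """Place the secret inside the host by appending it after a marker."""
    return host + MARKER + secret


def extract_injected(stego: bytes) -> bytes:
    """Return everything after the first marker (empty if no marker is found)."""
    _, _, secret = stego.partition(MARKER)
    return secret


host = b"\xff\xd8 stand-in image data \xff\xd9"
stego = inject(host, b"meet at noon")
print(len(host), len(stego))    # the stego file is larger than the host
print(extract_injected(stego))  # b'meet at noon'
```

Comparing the two lengths shows why an investigator who still has the original file can spot the injection.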

A type of digital steganography: Least Significant Bit (LSB)
With the least-significant-bit (LSB) technique, the rightmost bit in the binary notation is substituted with a bit from the embedded message. The rightmost bit has the least impact on the binary data. If an attacker knows that this technique is used, then the data are vulnerable.
Bit planes of a grayscale image are imprinted with the most significant bit (MSB) on top. The dark boxes represent binary value 0, and the light boxes represent binary value 1. The LSB plane of the cover image is replaced with the hidden data.

Transform-Domain Techniques
A transformed space is generated when a file is compressed at the time of transmission. This transformed space is used to hide data. The three transform techniques used when embedding a message are: discrete cosine transform (DCT), discrete Fourier transform (DFT), and discrete wavelet transform (DWT). These techniques embed the secret data in the cover at the time of the transmission process. The transformation can be applied either to an entire carrier file or to its subparts. The embedding process is performed by modifying the coefficients, which are selected based on the protection required. The hidden data in the transform domain is present in more robust areas, and it is highly resistant to signal processing.

Example: Images sent through Internet channels typically use JPEG format because the image is compressed when the file is saved. JPEG compression stores an approximation of the image to reduce the file’s size and removes the excess bits from the image. This change and approximation results in transform space that can be used to hide information.

Spread-Spectrum Encoding
Spread-spectrum encoding encodes a small-band signal into a wide-band cover. The encoder modulates a small-band signal over a carrier.

Spread-spectrum encoding can be used in the following ways:

Direct sequence: In direct-sequence encoding, the information is divided into small parts that are allocated to the frequency channel of the spectrum. The data signal is combined during transmission with a higher data-rate bit sequence that divides the data based on the predetermined spread ratio. The redundant nature of the higher data-rate bit sequence helps the signal resist interference, allowing the original data to be recovered.

Frequency hopping: This technique is used to divide the bandwidth’s spectrum into many possible broadcast frequencies. Frequency hopping devices require less power and are cheaper, but are less reliable when compared to direct sequence spectrum systems.
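
The redundancy in direct-sequence encoding can be sketched at the bit level. This is a simplified model, not a radio implementation: each data bit is XORed with a higher-rate chip sequence, and the receiver recovers it by majority vote. The seed-derived chip sequence, the spread ratio of 7, and all function names are assumptions for illustration.

```python
import random


def make_chips(length: int, seed: int) -> list:
    """Derive a pseudo-random chip sequence from a seed shared by both parties."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(length)]


def spread(bits: list, chips: list) -> list:
    """Combine every data bit with the whole chip sequence (spread ratio = len(chips))."""
    return [bit ^ chip for bit in bits for chip in chips]


def despread(signal: list, chips: list) -> list:
    """Undo the chip sequence and take a majority vote over each block."""
    n = len(chips)
    bits = []
    for start in range(0, len(signal), n):
        block = signal[start:start + n]
        votes = sum(s ^ c for s, c in zip(block, chips))
        bits.append(1 if votes > n / 2 else 0)
    return bits


chips = make_chips(7, seed=42)
data = [1, 0, 1, 1]
signal = spread(data, chips)
signal[0] ^= 1  # simulate interference flipping one chip in the first block
signal[8] ^= 1  # and one chip in the second block
print(despread(signal, chips) == data)  # True
```

Because each bit is carried by seven chips, a single corrupted chip per block does not change the majority, which is the interference resistance the text describes.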

Perceptual Masking
Perceptual masking is the interference of one perceptual stimulus with another, resulting in a decrease in perceptual effectiveness. This type of steganography makes one signal hard to identify due to the presence of another signal.

File Generation
Rather than selecting a cover to hide a message, this technique generates a new cover file solely for the purpose of hiding data. A picture is created that has a hidden message in it. In the modern form of file generation, a spam-mimic program is used. Spam mimic embeds the secret message into a spam message that can be e-mailed to any destination.

Statistical Method
This method uses a one-bit steganographic scheme. It embeds one bit of information in a digital carrier, creating a statistical change. A statistical change in the cover is indicated as a 1. A 0 indicates that a bit was left unchanged. The method relies on the receiver’s ability to differentiate between modified and unmodified covers.

Distortion Technique
This technique creates a change in the cover object in order to hide the information. An encoder performs a sequence of modifications to the cover that corresponds to a secret message. The secret message is recovered by comparing the distorted cover with the original. The decoder in this technique needs access to the original cover file.
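
A minimal sketch of the distortion technique follows, assuming the cover is a list of 8-bit sample values and that a 1 bit is encoded by nudging a sample while a 0 bit leaves it alone. The nudge-by-one rule and the function names are illustrative assumptions; the point is that recovery requires the original cover.

```python
def embed_distortion(cover: list, bits: list) -> list:
    """Modify one cover sample for every 1 bit; leave it untouched for a 0 bit."""
    if len(bits) > len(cover):
        raise ValueError("cover is too small for the message")
    stego = list(cover)
    for i, bit in enumerate(bits):
        if bit:
            # Nudge the value by one while staying inside the 0..255 range.
            stego[i] = stego[i] + 1 if stego[i] < 255 else stego[i] - 1
    return stego


def recover_distortion(stego: list, cover: list, n_bits: int) -> list:
    """Compare the distorted cover with the original: a difference means 1."""
    return [1 if stego[i] != cover[i] else 0 for i in range(n_bits)]


cover = [10, 200, 255, 0, 37]
bits = [1, 0, 1, 1, 0]
stego = embed_distortion(cover, bits)
print(recover_distortion(stego, cover, len(bits)) == bits)  # True
```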
Digital File Types
The various techniques used in steganography are applied differently depending on the type of file that is being used to encode the message. The four digital file types are text, image, audio, and video files.

The following steganography methods are used in text files:

Open-space

Syntactic

Semantic

Open-Space Steganography
This method uses white space on the printed page. Open-space methods can be categorized in the following three ways:

Intersentence spacing: This method encodes a binary message by inserting one or two spaces after every terminating character. This method is inefficient since it requires more space for a small message, and the white spaces can be easily spotted.

End-of-line spacing: Secret data is placed at the end of a line in the form of spaces. This allows more room to insert a message but can create problems when the program automatically removes extra spaces or the document is printed as hard copy.

Interword spacing: This method uses right justification, by which the justification spaces can be adjusted to allow binary encoding. A single space between words is 0, and two spaces is 1.
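
The interword-spacing rule (one space is 0, two spaces is 1) can be sketched as follows. The sample words and the function names are assumptions made for this illustration; unused gaps are padded with single spaces.

```python
import re


def encode_spacing(words: list, bits: list) -> str:
    """Join words with one space for a 0 bit and two spaces for a 1 bit."""
    if len(bits) > len(words) - 1:
        raise ValueError("not enough gaps between words for the message")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        bit = bits[i] if i < len(bits) else 0  # pad unused gaps with 0
        out.append(("  " if bit else " ") + word)
    return "".join(out)


def decode_spacing(text: str) -> list:
    """Read each run of spaces: a run of length two is 1, anything else is 0."""
    return [1 if len(gap) == 2 else 0 for gap in re.findall(r" +", text)]


words = "the quick brown fox jumps".split()
text = encode_spacing(words, [1, 0, 1, 1])
print(repr(text))            # 'the  quick brown  fox  jumps'
print(decode_spacing(text))  # [1, 0, 1, 1]
```

The doubled spaces are visible in the printed representation, which matches the weakness noted above: a program that normalizes white space destroys the message.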

Syntactic Steganography
This method manipulates punctuation to hide messages.

Look at the following example:

Laptop, iPod, USB

Laptop iPod USB

The punctuation marks are missing in the second phrase. These punctuation marks can be used to hide the message.

Semantic Steganography
This method of data hiding involves changing the words themselves. Semantic steganography assigns two synonyms primary and secondary values. When decoded, the primary value is read as 1 and the secondary as 0.
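
Semantic steganography can be sketched with a small synonym table. The sentence template, the synonym pairs, and the function names below are hypothetical; in each pair the first word is treated as the primary value (read as 1) and the second as the secondary value (read as 0).

```python
# Hypothetical synonym table: (primary, secondary) for each slot in the template.
SYNONYM_PAIRS = [("big", "large"), ("quick", "fast"), ("begin", "start")]
TEMPLATE = "The {} dog was {} to {} running"


def encode_semantic(bits: list) -> str:
    """Pick the primary synonym for a 1 bit and the secondary for a 0 bit."""
    if len(bits) != len(SYNONYM_PAIRS):
        raise ValueError("this template carries exactly three bits")
    words = [pair[0] if bit else pair[1] for bit, pair in zip(bits, SYNONYM_PAIRS)]
    return TEMPLATE.format(*words)


def decode_semantic(text: str) -> list:
    """Scan the words and map primaries to 1 and secondaries to 0."""
    primaries = {primary for primary, _ in SYNONYM_PAIRS}
    secondaries = {secondary for _, secondary in SYNONYM_PAIRS}
    bits = []
    for word in text.split():
        if word in primaries:
            bits.append(1)
        elif word in secondaries:
            bits.append(0)
    return bits


sentence = encode_semantic([1, 0, 1])
print(sentence)                   # The big dog was fast to begin running
print(decode_semantic(sentence))  # [1, 0, 1]
```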

Question 3

Q

Digital File Types
The various techniques used in steganography are applied differently depending on the type of file that is being used to encode the message. The four digital file types are text, image, audio, and video files.

Answer

A

Text Files
The following steganography methods are used in text files:

Open-space

Syntactic

Semantic

Open-Space Steganography
This method uses white space on the printed page. Open-space methods can be categorized in the following three ways:

Intersentence spacing: This method encodes a binary message by inserting one or two spaces after every terminating character. This method is inefficient since it requires more space for a small message, and the white spaces can be easily spotted.

End-of-line spacing: Secret data is placed at the end of a line in the form of spaces. This allows more room to insert a message but can create problems when the program automatically removes extra spaces or the document is printed as hard copy.

Interword spacing: This method uses right justification, by which the justification spaces can be adjusted to allow binary encoding. A single space between words is 0, and two spaces is 1.

Syntactic Steganography
This method manipulates punctuation to hide messages.

Look at the following example:

Laptop, iPod, USB

Laptop iPod USB

The punctuation marks are missing in the second phrase. These punctuation marks can be used to hide the message.

Semantic Steganography
This method of data hiding involves changing the words themselves. Semantic steganography assigns two synonyms primary and secondary values. When decoded, the primary value is read as 1 and the secondary as 0.

Image Files
Image files commonly use the following formats:

Graphics Interchange Format (GIF): GIF files are compressed image files that make use of a compression algorithm developed by CompuServe. GIF files are based on a palette of 256 colors. They are mainly used for small icons and animated images since they do not have the color ranges needed for high-quality photos.

Joint Photographic Experts Group (JPEG): JPEG files are the proper format for photo images that need to be small in size. JPEG compression can reduce a file by about 90%, to roughly one-tenth of the size of the original data.

Tagged Image File Format (TIFF): The TIFF file format was designed to minimize the problems with mixed file formats. This file format did not evolve from a de facto standard. It was made as the standard image file format for image file exchange.

The following steganography techniques are used to hide a message in an image file:

LSB insertion

Masking and filtering

Algorithms and transformation

LSB Insertion
Using the LSB insertion method, the binary representation of the hidden data can be used to overwrite the LSB of each byte inside the image. If the image properties indicate that the image is 24-bit color, the net change is minimal and can be indiscernible to the human eye.

The following steps are involved in hiding the data:

The steganography tool makes a copy of an image palette with the help of the red, green, and blue (RGB) model.

The LSB of each pixel’s eight-bit binary number is substituted with one bit of the hidden message.

A new RGB color in the copied palette is produced.

With the new RGB color, the pixel is changed to an eight-bit binary number.

Look at the following example:

01001101 00101110 10101110 10001010 10101111 10100010 00101011 10101011
Shown above are eight adjacent pixel values, each made up of eight bits. Suppose the letter H, represented in binary as 01001000, needs to be hidden in this file. In practice, the data would typically be compressed before being hidden.

After H is embedded, the changed binary values would be as seen below:

01001100 00101111 10101110 10001010 10101111 10100010 00101010 10101010
The eight bits of H have been successfully hidden, one in the LSB of each byte, while only four of the eight LSBs actually had to change. The above example is meant to be a high-level overview. This method can be applied to eight-bit color images. Grayscale images are also used for steganographic purposes. The drawback to these methods is that they can be detected by anyone who knows where to search for them.
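
The embedding of the letter H into the eight example bytes can be reproduced with a short sketch. It operates on raw bytes rather than a real image format, and the function names embed_lsb and extract_lsb are assumptions for illustration.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytes:
    """Overwrite the LSB of successive cover bytes with the message bits (MSB first)."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for the message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the bit
    return bytes(stego)


def extract_lsb(stego: bytes, n_bytes: int) -> bytes:
    """Rebuild n_bytes of message from the LSBs of the stego bytes."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for byte in stego[i * 8:(i + 1) * 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)


cover_bits = "01001101 00101110 10101110 10001010 10101111 10100010 00101011 10101011"
cover = bytes(int(b, 2) for b in cover_bits.split())
stego = embed_lsb(cover, b"H")
print(" ".join(f"{b:08b}" for b in stego))
# 01001100 00101111 10101110 10001010 10101111 10100010 00101010 10101010
print(extract_lsb(stego, 1))                       # b'H'
print(sum(a != b for a, b in zip(cover, stego)))   # 4 bytes actually changed
```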

Masking and Filtering
Masking and filtering techniques are commonly used on 24-bit and grayscale images. Grayscale images that hide information are similar to watermarks on paper and are sometimes used as digital watermarks. Masking an image entails changing the luminance of the masked area. The smaller the luminance change, the less chance there is that it can be detected. Masked steganographic images retain higher fidelity than LSB-encoded images through compression, cropping, and image processing. Images encoded with masking degrade less under JPEG compression because the message is hidden in significant areas of the picture. The tool named Jpeg-Jsteg takes advantage of JPEG compression and keeps high message fidelity. This program takes a message and a lossless cover image as input and produces an output image in JPEG format.

Algorithms and Transformation
Mathematical functions that are used in compression algorithms can also be used to hide data. In this technique, the data are embedded in the cover image by changing the coefficients of a transform of the image (e.g., discrete cosine transform coefficients).

If information is embedded in the spatial domain, it may be lost if the image undergoes processing such as compression. To overcome this problem, the information can instead be hidden in the frequency domain. Because digital image data is discrete rather than continuous, transformations are applied to the image in order to analyze its data.

Audio Files
Hiding information in an audio file can be done by using either LSB or frequencies that are inaudible to the human ear. Frequencies over 20,000 Hz cannot be detected by the human ear.

Information can also be hidden using musical tones with a substitution scheme. For example, tone F could represent 0, and tone C could represent 1. By using the substitution technique a simple musical piece can be composed with a secret message, or an existing piece can be used with an encoded scheme that represents a message.
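
The tone substitution scheme can be sketched directly from the example mapping (F for 0, C for 1). The sketch only produces a sequence of note names rather than real audio, and the function names and the ASCII, most-significant-bit-first encoding are assumptions.

```python
TONE_FOR_BIT = {0: "F", 1: "C"}  # mapping taken from the example in the text
BIT_FOR_TONE = {"F": 0, "C": 1}


def message_to_tones(message: str) -> list:
    """Turn each ASCII character into eight tones, most significant bit first."""
    tones = []
    for byte in message.encode("ascii"):
        for shift in range(7, -1, -1):
            tones.append(TONE_FOR_BIT[(byte >> shift) & 1])
    return tones


def tones_to_message(tones: list) -> str:
    """Read the tones back eight at a time to rebuild the characters."""
    out = bytearray()
    for start in range(0, len(tones), 8):
        value = 0
        for tone in tones[start:start + 8]:
            value = (value << 1) | BIT_FOR_TONE[tone]
        out.append(value)
    return out.decode("ascii")


tones = message_to_tones("H")
print("".join(tones))           # FCFFCFFF
print(tones_to_message(tones))  # H
```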

Low-Bit Encoding in Audio Files
Digital steganography is based on the fact that artifacts, such as bitmaps and audio files, contain redundant information. Compression techniques such as JPEG and MP3 remove parts of the redundancy, allowing the files to be compressed. By using the DigSteg tool, the computer forensic investigator can replace some of the redundant information with other data.

Low-bit encoding replaces the LSB of information in each sampling point with a coded binary string. The low-bit method encodes large amounts of hidden data into an audio signal at the expense of producing significant noise in the upper frequency range.

Phase Coding
Phase coding involves substituting the phase of an initial audio segment with a reference phase that represents the data. This method is carried out using the following steps:

1. The original sound sequence is divided into short segments.

2. A matrix of the phase and magnitude is created for each segment by using the DFT algorithm.

3. The phase difference is calculated between each adjacent segment.

4. New phase frames are created for all other segments.

5. A new segment is created by combining the new phase and the original magnitude.

6. These new segments are combined together to create the encoded output.

Spread Spectrum
In most communication channels, audio data is limited to a narrow range of frequencies to protect the bandwidth of the channel. Unlike phase coding, direct-sequence spread spectrum (DSSS) introduces some random noise to the signal. The encoded data is spread across as much of the frequency spectrum as possible.

Spread spectrum is used in audio files both to embed data in the audio file and to send the audio file.

Echo Data Hiding
In this technique, an echo is introduced into the original signal. Three properties of this echo can then be varied to hide data:

Initial amplitude

Decay rate

Offset

Video Files
Discrete cosine transform (DCT) manipulation is used to add secret data at the time of the transformation process of the video. The techniques used in audio and image files can also be used in video files, as video consists of audio and images. A large number of secret messages can be hidden in video files because a video is a moving stream of images and sound. Due to this, an individual watching the video will not observe any distortion in the video caused by the hiding of data.

Question 4

Q

How is steganography used with audio files?

Answer

A

Audio Files
Hiding information in an audio file can be done by using either LSB or frequencies that are inaudible to the human ear. Frequencies over 20,000 Hz cannot be detected by the human ear.

Information can also be hidden using musical tones with a substitution scheme. For example, tone F could represent 0, and tone C could represent 1. By using the substitution technique a simple musical piece can be composed with a secret message, or an existing piece can be used with an encoded scheme that represents a message.

Low-Bit Encoding in Audio Files
Digital steganography is based on the fact that artifacts, such as bitmaps and audio files, contain redundant information. Compression techniques such as JPEG and MP3 remove parts of the redundancy, allowing the files to be compressed. By using the DigSteg tool, the computer forensic investigator can replace some of the redundant information with other data.

Low-bit encoding replaces the LSB of information in each sampling point with a coded binary string. The low-bit method encodes large amounts of hidden data into an audio signal at the expense of producing significant noise in the upper frequency range.

Phase Coding
Phase coding involves substituting the phase of an initial audio segment with a reference phase that represents the data. This method is carried out using the following steps:

1. The original sound sequence is divided into short segments.

2. A matrix of the phase and magnitude is created for each segment by using the DFT algorithm.

3. The phase difference is calculated between each adjacent segment.

4. New phase frames are created for all other segments.

5. A new segment is created by combining the new phase and the original magnitude.

6. These new segments are combined together to create the encoded output.

Spread Spectrum
In most communication channels, audio data is limited to a narrow range of frequencies to protect the bandwidth of the channel. Unlike phase coding, direct-sequence spread spectrum (DSSS) introduces some random noise to the signal. The encoded data is spread across as much of the frequency spectrum as possible.

Spread spectrum is used in audio files both to embed data in the audio file and to send the audio file.

Echo Data Hiding
In this technique, an echo is introduced into the original signal. Three properties of this echo can then be varied to hide data:

Initial amplitude

Decay rate

Offset

Question 5

Q

Name 2 legal uses for steganography:

Answer

A

Medical records: Steganography is used in medical records to avoid any mix-up of patients’ records. Every patient has an EPR (electronic patient record), which has examinations and other medical records stored in it.

Digital music: Steganography is also used to protect music from being copied by introducing subtle changes into a music file that act as a digital signature. BlueSpike Technology removes a few select tones in a narrow band. Verance adds signals that are out of the frequency range detectable by the human ear. Others adjust the sound by changing the frequency slightly. Digital audio files can also be modified to carry a large amount of information. Some files simply indicate that the content is under copyright. More sophisticated steganography versions can include information about the artist.

Question 6

Q

Explain the least-significant-bit method of steganography:

Answer

A

A type of digital steganography: Least Significant Bit (LSB)
With the least-significant-bit (LSB) technique, the rightmost bit in the binary notation is substituted with a bit from the embedded message. The rightmost bit has the least impact on the binary data. If an attacker knows that this technique is used, then the data are vulnerable.
Bit planes of a grayscale image are imprinted with the most significant bit (MSB) on top. The dark boxes represent binary value 0, and the light boxes represent binary value 1. The LSB plane of the cover image is replaced with the hidden data.

Question 7

Q

Explain the process of echo data hiding:

Answer

A

Echo Data Hiding (in a digital audio file)
In this technique, an echo is introduced into the original signal. Three properties of this echo can then be varied to hide data:

Initial amplitude

Decay rate

Offset

Question 8

Q

Name 2 technical methods used to embed messages in a text file:

Answer

A

Text Files
The following steganography methods are used in text files:

Open-space

Syntactic

Semantic

Open-Space Steganography
This method uses white space on the printed page. Open-space methods can be categorized in the following three ways:

Intersentence spacing: This method encodes a binary message by inserting one or two spaces after every terminating character. This method is inefficient since it requires more space for a small message, and the white spaces can be easily spotted.

End-of-line spacing: Secret data is placed at the end of a line in the form of spaces. This allows more room to insert a message but can create problems when the program automatically removes extra spaces or the document is printed as hard copy.

Interword spacing: This method uses right justification, by which the justification spaces can be adjusted to allow binary encoding. A single space between words is 0, and two spaces is 1.

Syntactic Steganography
This method manipulates punctuation to hide messages.

Look at the following example:

Laptop, iPod, USB

Laptop iPod USB

The punctuation marks are missing in the second phrase. These punctuation marks can be used to hide the message.

Semantic Steganography
This method of data hiding involves changing the words themselves. Semantic steganography assigns two synonyms primary and secondary values. When decoded, the primary value is read as 1 and the secondary as 0.

Question 9

Q

How is steganography different from cryptography?

Answer

A

Cryptography is the art of writing text or data in a secret code.
Steganography is defined as the art of hiding data within other data. It replaces bits of unused data from various media files with other bits that, when assembled, reveal a hidden message. The hidden data can be plaintext, ciphertext, an audio clip, or an image.

In cryptography an encrypted message that is communicated can be detected but cannot be read. In steganography, the existence of the message is hidden. Steganography is used to hide information when encryption is not a safe option. From a security point of view, steganography should be used to hide a file in an encrypted format. This is done to ensure that even if the encrypted file is decrypted, the message will still remain hidden.

Another contrast between steganography and cryptography is that the former requires caution when reusing pictures or sound files, while the latter requires caution when reusing keys.

In steganography, only one key is used to hide and extract data. In cryptography, the same key or two different keys for encryption and decryption can be used.

Question 10

Q

What is a watermark?

Answer

A

Digital watermarks are, in essence, digital stamps embedded into digital signals.

Often, the digital data found hidden in a watermark are a digital multimedia object. While digital images are most often mentioned in reference to digital watermarking, it is important to remember that watermarks can be applied to other forms of digital data such as audio and video segments.

Watermarking is used to facilitate the following processes:

Embedding copyright statements into images that provide authentication to the owner of the data

Monitoring and tracking copyright material automatically on the Web

Providing automatic audits of radio transmissions. These audits show any music or advertisement that is broadcast.

Supporting data augmentation. This enables users to add more information to the existing data present on the Web

Supporting fingerprint applications

Steganography hides the message in one-to-one communication, while watermarking hides the message in one-to-many communication.

Watermarks are split into two categories: visible and invisible.

Visible: A visible watermark is the most robust as it is not part of the foundation of the image. The watermark’s presence is clearly noticeable and often difficult to remove. A good example of a visible watermark is a television identification logo that appears on a television screen. The watermark can be either solid or semitransparent. Removing it would require a great deal of work.

Invisible: The main purpose of an invisible watermark is to identify and verify a particular piece of information in data. An invisible watermark is imperceptible but can be extracted through computational methods. An invisible watermark contains information about the watermark itself or the information present in the image that is hiding the data. The data hidden in the image can be accessed with a password, called a watermark key. There is a big difference between a watermark key and an encryption key. A watermark key is used only for watermarks, whereas an encryption key is used for information that is to be encrypted.

Watermarks and Compression
The application of watermarks in the modern world mainly concerns images, audio, and video. Watermarks are used in the case of MP3s and DVDs as a tool to ensure copyrights are enforced.

Types of Watermarks
Semifragile: Semifragile watermarks are used for soft image authentication and integrity verification. They are robust to common image processing such as lossy compression, but are fragile in case of any malicious tampering that changes the image content.

Fragile: Fragile watermarks are less robust when modified. A small change in the content will destroy the embedded information and show that an attack has occurred. Any tampering with the image will modify its integrity.

Robust: A robust watermark can be either visible or invisible. Robust watermarks are designed to withstand common attacks without noticeably affecting the quality of the data. They are difficult to remove or damage. Robust watermarks are used for copyright protection and access control. Most of these are found on television broadcasts during which the channels impose their logos in the corner of the screen to let people know what they are viewing and to signify copyright.
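The fragile case can be illustrated with a minimal sketch. It is not taken from any particular watermarking product: it assumes 8-bit grayscale pixels held in a plain Python list, and the function names `embed_fragile` and `verify_fragile` are hypothetical. A hash of the upper seven bits of every pixel is written into the least significant bits, so any later change breaks the match.

```python
import hashlib

def _digest_bits(pixels):
    """Hash the upper 7 bits of every pixel; return the digest as a list of bits."""
    upper = bytes(p & 0xFE for p in pixels)
    digest = hashlib.sha256(upper).digest()
    return [(byte >> (7 - k)) & 1 for byte in digest for k in range(8)]

def embed_fragile(pixels):
    """Overwrite each LSB with a digest bit (illustrative scheme, an assumption)."""
    bits = _digest_bits(pixels)
    return [(p & 0xFE) | bits[i % len(bits)] for i, p in enumerate(pixels)]

def verify_fragile(pixels):
    """Return True only if every LSB still matches the recomputed digest."""
    bits = _digest_bits(pixels)
    return all((p & 1) == bits[i % len(bits)] for i, p in enumerate(pixels))

cover = [(i * 7) % 256 for i in range(64)]   # made-up pixel data
marked = embed_fragile(cover)
tampered = list(marked)
tampered[0] ^= 1                              # flip a single bit
print(verify_fragile(marked), verify_fragile(tampered))
```

A semifragile or robust scheme would need to survive operations such as lossy compression, which this sketch deliberately does not.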

Question 11

Q

What is a cover medium?

Answer

A

A stegosystem is the mechanism that is used in performing steganography (Figure 1-1). The following components make up a stegosystem:

Embedded message: The original secret message to be hidden behind the cover medium

Cover medium: The medium used to hide the message

Question 12

Q

Attacks on Watermarks:

Answer

A

Attacks on Watermarking
Robustness Attack
This attack attempts to remove watermarks from an image. It can be divided into the following categories:

Signal-processing attacks: These attacks apply techniques such as compression, filtering, resizing, printing, and scanning to remove the watermark.

Analytical and algorithmic attacks: These attacks use algorithmic techniques of watermark insertion and detection to remove the watermark from the image.

Presentation attacks: Presentation attacks are carried out to change the watermarked data in such a way that a detector cannot detect it. The watermark will appear as it did before the attack. It is not necessary to eliminate the watermark to carry out the attack. The following instances are examples of presentation attacks:

An automated detector cannot detect the misalignment of a watermarked image.

A detector cannot detect the rotation and enlargement of a watermark.

Interpretation attacks: Interpretation attacks exploit weaknesses of watermarks, such as wrong or multiple interpretations. A watermark can be created from the existing watermarked image with the same strength as the original watermark.

Legal attacks: Legal attacks mainly target digital information and copyright laws. Attackers can change the watermarked copyrights in order to create doubts about copyright in a court of law. These attacks depend upon the following conditions:

Existing and future legislation on copyright laws and digital information ownership

The credibility of the owner and the attacker

The financial strength of the owner and the attacker

The expert witnesses

The competence of the lawyers

The following techniques are commonly used to remove watermarks:

Collusion attack: A collusion attack is carried out by searching for a number of different objects having the same watermark, allowing the forensic investigator to isolate and remove the watermark by comparing the copies.

Jitter attack: A jitter attack upsets the placement of the bits that identify a watermark by applying a jitter effect to the image. By applying a jitter effect, the forensic investigator is able to gauge the integrity of the watermark.

StirMark: A StirMark attack applies small distortions that are designed to simulate the printing or scanning process. When a hard-copy photograph is scanned, subtle distortions are introduced, no matter how careful the user is. The StirMark attack can be used for JPEG scaling and rotation. This attack is effective, as some watermarks are resistant to only one type of modification.

Anti–soft bot: A benefit of watermarking in the realm of the Internet is the ability to use software robots, sometimes called soft bots or spiders, to search through Web pages for watermarked images. If the soft bot finds a watermarked image, it can use the information to determine if there is a copyright violation.

Attacks on echo hiding: Echo hiding is a signal processing technique that places information into an audio data stream in the form of closely spaced echoes. These echoes place digital tags into the sound file with minimal sound degradation. Echo hiding is also resistant to jitter attacks, making a removal attack the usual method for getting rid of the watermark. In echo hiding, most echo delays are between 0.5 and 3 milliseconds; in anything above 3 milliseconds the echo becomes noticeable.
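The echo-hiding description above can be sketched in a few lines. This is only an illustration of the embedding side: the 44.1 kHz sample rate, the decay of 0.3, and the choice of 1 ms for a 0 bit and 2 ms for a 1 bit are assumptions picked to fall inside the 0.5 to 3 millisecond range mentioned, and the function names are hypothetical.

```python
def ms_to_samples(delay_ms, sample_rate=44100):
    """Convert an echo delay in milliseconds to a whole number of samples."""
    return round(delay_ms * sample_rate / 1000)

def add_echo(signal, delay, decay=0.3):
    """Mix a delayed, attenuated copy of the signal back into itself."""
    out = list(signal)
    for i in range(delay, len(signal)):
        out[i] += decay * signal[i - delay]
    return out

def embed_bit(block, bit, sample_rate=44100):
    """Encode one bit as the echo delay of one audio block (assumed delays)."""
    delay_ms = 2.0 if bit else 1.0
    return add_echo(block, ms_to_samples(delay_ms, sample_rate))

impulse = [1.0] + [0.0] * 199        # a single click makes the echo easy to see
print(ms_to_samples(0.5), ms_to_samples(3.0))
print(embed_bit(impulse, 0)[44], embed_bit(impulse, 1)[88])
```

Recovering the bit requires estimating the echo delay in each block, which is not shown here.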

Mosaic Attack
A mosaic attack works by splitting an image into multiple pieces and stitching them back together using JavaScript code. In this attack the marked image can be unmarked, and later all the pixels are rendered in a similar fashion to the original marked image.

This attack was prompted by automatic copyright detection systems that contain watermarking techniques and crawlers that download images from the Internet to determine whether or not they are watermarked.

Question 13

Q

Detecting Steganography

The following indicators are likely signs of steganography:

Answer

A

Software clues on the computer: The investigator should determine the filenames and Web sites the suspect used by viewing the browser’s cookies or history. An investigator should also look in registry key entries, the mailbox of the suspect, chat or instant messaging logs, and communication or comments made by the suspect. Because these data are important for investigation, they give clues to the investigator for further procedures.

Other program files: It is also important to check other program files because it is possible that a nonprogram file may be a cover file that hides other files inside it. The investigator should also check software that is not normally used for steganography such as binary (hex) editors, disk-wiping software, or other software used for changing data from one code to another.

Multimedia files: The investigator should look for large files in the system, as they can be used as carrier files for steganography. If the investigator finds a number of large duplicate files, then it is possible that they are used as carrier files.

Detection Techniques
Detecting steganographic content is difficult, especially when low payloads are used. The following techniques are used for detecting steganography:

Statistical tests: These tests reveal that an image has been modified by examining the statistical properties of the original. Some of the tests are not dependent on the data format and will measure the entropy of the redundant data, so the images with hidden data will have more entropy than the original image.

Stegdetect: Stegdetect is an automated tool that detects the hidden content in images. It detects different steganographic methods for embedding steganographic messages in images.

Stegbreak: Stegbreak breaks the encoding password with the help of dictionary guessing. It can be used in launching dictionary attacks against JSteg-Shell, JPHS, and OutGuess.

Visible noise: Attacks on hidden information can employ detection, extraction, and disabling or damaging hidden information. The images that have large payloads display distortions from the hidden data.

Appended spaces and invisible characters: Using invisible characters or appended spaces is a form of hiding data in the spaces of the text. The presence of many white spaces is an indication of steganography.

Color palettes: Some application characteristics are exclusive to steganography tools. The color palettes used in steganographic programs have unique characteristics. Modifications in the color palettes create a detectable steganographic signature.

Detecting Text, Image, Audio, and Video Steganography
Hidden information is detected in different ways depending on the type of file that is used. The following file types require specific methods to detect hidden messages.

Text Files
When a message is hidden in a text file so that the message can be detected only with the knowledge of the secret file, it was probably hidden by altering the cover source. For text files, the alterations are made to the character positions. These alterations can be detected by looking for text patterns or disturbances, the language used, and an unusual number of blank spaces.
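As a minimal sketch of the blank-space check, the following function lists the lines that end in spaces or tabs and reports what fraction of the file they make up. The function name and the sample text are hypothetical, and trailing white space alone is only an indicator, not proof, of hidden data.

```python
def trailing_whitespace_report(text):
    """Return ([(line_number, trailing_char_count), ...], fraction_of_lines_flagged)."""
    lines = text.split("\n")
    flagged = []
    for number, line in enumerate(lines, start=1):
        stripped = line.rstrip(" \t")
        extra = len(line) - len(stripped)
        if extra:
            flagged.append((number, extra))
    return flagged, len(flagged) / len(lines)

sample = "clean line\nhidden \t  \t\nanother clean line\nmore\t"
print(trailing_whitespace_report(sample))
```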

Image Files
The hidden data in an image can be detected by determining changes in size, file format, last modified time stamp, and color palette of the file.

Statistical analysis methods can be used when scanning an image. Assuming that the least significant bit is more or less random is an incorrect assumption since applying a filter that shows the LSBs can produce a recognizable image. Therefore, it can be concluded that LSBs are not random. Rather, they consist of information about the entire image.

When a secret message is inserted into an image, the LSBs no longer reflect the structure of the image. With encrypted data that has high entropy, the LSBs of the stego-image do not contain information about the original and are more or less random. By using statistical analysis on the LSBs, the difference between random values and real values can be identified.
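One simple statistic that illustrates this idea is how often neighboring LSBs agree. The sketch below is an assumption-laden toy, not a published steganalysis test: it uses a synthetic gradient as the cover, Python's `random` module as a stand-in for an encrypted payload, and hypothetical function names. Structured LSBs agree with their neighbors well above half the time, while random LSBs agree about half the time.

```python
import random

def lsb_agreement(pixels):
    """Fraction of adjacent pixel pairs whose least significant bits are equal."""
    lsbs = [p & 1 for p in pixels]
    same = sum(1 for a, b in zip(lsbs, lsbs[1:]) if a == b)
    return same / (len(lsbs) - 1)

def embed_random_bits(pixels, seed=0):
    """Replace every LSB with a pseudorandom bit (stand-in for encrypted data)."""
    rng = random.Random(seed)
    return [(p & 0xFE) | rng.getrandbits(1) for p in pixels]

cover = [i // 4 for i in range(1024)]      # smooth synthetic gradient
stego = embed_random_bits(cover)
print(lsb_agreement(cover), lsb_agreement(stego))
```

Real photographs are noisier than this gradient, so a practical test would compare against a model of typical cover images rather than a fixed threshold.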

Audio Files
Statistical analysis methods can be used for audio files since LSB modifications are also used on audio.

The following techniques are also useful for detecting hidden data:

Scanning information for inaudible frequencies

Determining odd distortions and patterns that show the existence of secret data

Video Files
Detection of secret data in video files includes a combination of the methods used in image and audio files.

Question 14

Q

Steganalysis/Attacks on Steganography

Answer

A

Steganalysis
Steganalysis is the reverse process of steganography. Steganography hides data, while steganalysis is used to detect hidden data. Steganalysis detects the encoded hidden message and, if possible, recovers that message. The messages are detected by verifying the differences between bit patterns and unusually large file sizes.

Steganalysis Methods/Attacks on Steganography
Steganography attacks are categorized by the following seven types:

Stego-only attack: The stego-only attack takes place when only the stego-medium is available to carry out the attack. The only way to succeed is by detecting and extracting the embedded message from the stego-medium alone.

Known-cover attack: The known-cover attack is used with the presence of both a stego-medium and a cover medium. The attacker can compare both media and detect the format change.

Known-message attack: The known-message attack presumes that the message and the stego-medium are present and the technique by which the message was embedded can be determined.

Known-stego attack: In this attack the steganography algorithm is known, and the original object and the stego-objects are available.

Chosen-stego attack: The chosen-stego attack takes place when the forensic investigator generates a stego-medium from the message using a special tool.

Chosen-message attack: The steganalyst obtains a stego-object from a steganography tool or algorithm of a chosen message. This attack is intended to find the patterns in the stego-object that point to the use of specific steganography tools or algorithms.

Disabling or active attacks: These attacks are categorized into the following six types:

1) Blurring: Blurring attacks can smooth transitions and reduce contrast by averaging the pixels next to the hard edges of defined lines and the areas where there are significant color transitions.
2) Noise reduction: Random noise in the stego-medium inserts random-colored pixels into the image. The uniform noise inserts pixels and colors that look similar to the original pixels. Noise reduction decreases the noise in the image by adjusting the colors and averaging the pixel values.
3) Sharpening: Sharpening is the opposite of the blurring effect. It increases the contrast between the adjacent pixels where there are significant color contrasts that are usually at the edge of objects.
4) Rotation: Rotation turns the stego-medium around its center point.
5) Resampling: Resampling involves a process known as interpolation. This process is used to reduce the raggedness associated with the stego-medium. It is normally used to resize the image.
6) Softening: Softening of the stego-medium applies a uniform blur to an image in order to smooth edges and reduce contrasts. It causes less distortion than blurring.
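A minimal sketch shows why a disabling attack such as blurring works against LSB embedding. It assumes a one-dimensional row of 8-bit pixels and a three-pixel averaging window; the function names are hypothetical. The blur changes each pixel by at most one level here, yet the hidden bits no longer come out intact.

```python
def embed_lsb(pixels, bits):
    """Write one message bit into the LSB of each leading pixel."""
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return out

def extract_lsb(pixels, count):
    """Read back the first `count` least significant bits."""
    return [p & 1 for p in pixels[:count]]

def box_blur(pixels):
    """Average each pixel with its two neighbors (edges reuse the edge pixel)."""
    last = len(pixels) - 1
    return [
        (pixels[max(i - 1, 0)] + pixels[i] + pixels[min(i + 1, last)]) // 3
        for i in range(len(pixels))
    ]

bits = [0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]   # ASCII "hi"
stego = embed_lsb([100] * 16, bits)
print(extract_lsb(stego, 16) == bits)              # message intact before the attack
print(extract_lsb(box_blur(stego), 16) == bits)    # message damaged after blurring
```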

Stego-Forensics
Stego-forensics is an area of forensic science dealing with steganography techniques to investigate a source or cause of a crime. Different methods of steganalysis can be used to unearth secret communications between antisocial elements and criminals.

Question 15

Q

Tools used for steganography:

Answer

A

**2Mosaic**
2Mosaic is a small, command-line utility for Windows that will break apart any JPEG file and generate the HTML code needed to reconstruct the picture.

2Mosaic is a presentation attack against digital watermarking systems. It is of general applicability and possesses the property that allows a marked image to be unmarked and still rendered by a standard browser in exactly the same way as the marked image.

The attack was motivated by an automatic system that was fielded for copyright piracy detection. It consists of a watermarking scheme plus a Web crawler that downloads pictures from the Internet and checks whether they contain a watermark.

It consists of chopping an image up into a number of smaller subimages that are embedded in a suitable sequence in a Web page. Common Web browsers render juxtaposed subimages stuck together, so they appear identical to the original image. This attack appears to be quite general; all marking schemes require the marked image to have some minimal size (one cannot hide a meaningful mark in just one pixel). Thus, by splitting an image into sufficiently small pieces, the mark detector will be confused. The best that one can hope for is that the minimal size could be quite small, rendering the method impractical.
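The chopping step can be sketched as follows. This is not 2Mosaic's code: the tile size, the `tile_<row>_<col>.jpg` naming scheme, and the use of a borderless HTML table are assumptions for illustration, and the actual cropping of the image into files would be done with an image library that is not shown.

```python
def tile_boxes(width, height, tile):
    """Return rows of (left, top, right, bottom) boxes that cover the image."""
    rows = []
    for top in range(0, height, tile):
        row = []
        for left in range(0, width, tile):
            row.append((left, top, min(left + tile, width), min(top + tile, height)))
        rows.append(row)
    return rows

def mosaic_html(width, height, tile, prefix="tile"):
    """Build a borderless table that places the subimages edge to edge."""
    lines = ['<table cellspacing="0" cellpadding="0" border="0">']
    for r, row in enumerate(tile_boxes(width, height, tile)):
        cells = "".join(
            f'<td><img src="{prefix}_{r}_{c}.jpg" width="{x1 - x0}" height="{y1 - y0}"></td>'
            for c, (x0, y0, x1, y1) in enumerate(row)
        )
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)

print(mosaic_html(100, 60, 40))
```

Each subimage is smaller than the minimum size a detector needs, while the browser shows the pieces as one picture.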

**BlindSide**
The BlindSide tool can hide files of any file type within a Windows bitmap image. The original and the encoded image look identical to the human eye. However, when the image is processed by BlindSide, the concealed data can be extracted and retrieved. For added security, the data can be scrambled with a password so that no one without the password will be able to access the data. The BlindSide tool analyzes color differentials in an image so that it will only alter pixels it knows will not be noticeable to the human eye. The main limitation of BlindSide is that each image has its own capacity, which depends on the color patterns within it.

The BlindSide tool can be used in many ways. The main advantage of BlindSide is that it uses a steganographic technique, supplemented with a cryptographic algorithm. This means that one can pass messages without arousing suspicion. BlindSide allows the user to encrypt messages with a password-based encryption so that even if someone did examine these images, they would need a password to obtain the secret data. Digital publishers typically use BlindSide to embed a license file and copyright notice within the images that are to be published. A similar procedure could be applied to images on a company’s Web pages.

**S-Tools**
The S-Tools steganographic tool has the ability to hide multiple files within a single object. S-Tools first compresses the individual files, which are stored with their names, and then it inserts filler on the front of the data to prevent two identical sets of files from encrypting in the same way. All files are then encrypted using the passphrase that the user supplies. The encryption algorithms operate in cipher-feedback mode. The S-Tools application seeds a cryptographically strong pseudorandom number generator from the passphrase and uses its output to choose the position of the next bit from the cover data to be used.

For example, if a sound file had 100 bits available for hiding and the user wanted to use 10 of those bits to hold a message, S-Tools would not choose bits zero through nine as they are easily detected by a potential enemy. Instead, it might choose bits 63, 32, 89, 2, 53, 21, 35, 44, 99, and 80.
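The position-selection idea in this example can be sketched as below. This is not S-Tools' algorithm: hashing the passphrase with SHA-256 and feeding it to Python's `random.Random` are assumptions made for illustration, and `random` is not cryptographically strong, unlike the generator the text describes. The function name is hypothetical.

```python
import hashlib
import random

def pick_positions(passphrase, cover_bits, needed):
    """Choose `needed` distinct bit positions, reproducible from the passphrase."""
    seed = int.from_bytes(hashlib.sha256(passphrase.encode()).digest(), "big")
    rng = random.Random(seed)          # illustrative PRNG, not cryptographically strong
    return rng.sample(range(cover_bits), needed)

# 100 available bits, 10 needed, as in the example above
print(pick_positions("correct horse", 100, 10))
```

Because the sequence depends only on the passphrase, the receiver can regenerate the same positions to read the bits back in order.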

**StegHide**
StegHide is a steganography tool that is able to hide information in images and audio files. The color frequencies (in images) and sample frequencies (in audio) are not changed during the embedding process. Features of this tool include compression of the embedded data, encryption of the embedded information, and automatic integrity checking using a checksum. JPEG, BMP, and WAV file formats are supported for use as a cover file. No such restrictions are imposed on the format of the secret data.

StegHide also uses the graph-theoretic approach to steganography. The investigator does not need to know anything about graph theory to use the StegHide application. The following steps illustrate the working of an embedding algorithm:

1. The secret information is compressed and encrypted.

2. Based on a pseudorandom number, a sequence of pixel positions, which is initialized with a passphrase, is created.

3. By using a graph-theoretic matching algorithm, the application finds pairs of positions so that exchanging their values has the effect of embedding the information.

4. The pixels at the remaining positions are also modified to contain the embedded information.

The default encryption algorithm is Rijndael, with a key size of 128 bits in the cipher block-chaining mode.
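Steps 2 through 4 can be illustrated with a greatly simplified toy. It is not StegHide's algorithm: it skips compression and encryption, replaces the graph-theoretic matching with a naive first-come pairing, swaps values without checking how similar they are, and uses Python's `random` seeded from a SHA-256 hash of the passphrase. All function names are hypothetical.

```python
import hashlib
import random

def positions_for(passphrase, n_samples, n_bits):
    """Step 2: passphrase-seeded pseudorandom sequence of sample positions."""
    seed = int.from_bytes(hashlib.sha256(passphrase.encode()).digest(), "big")
    return random.Random(seed).sample(range(n_samples), n_bits)

def embed_by_exchange(samples, bits, passphrase):
    """Steps 3 and 4: swap values between mismatched positions, then fix leftovers."""
    positions = positions_for(passphrase, len(samples), len(bits))
    out = list(samples)
    need_odd = [p for p, b in zip(positions, bits) if b == 1 and out[p] % 2 == 0]
    need_even = [p for p, b in zip(positions, bits) if b == 0 and out[p] % 2 == 1]
    for p, q in zip(need_odd, need_even):      # exchanging keeps value counts unchanged
        out[p], out[q] = out[q], out[p]
    paired = min(len(need_odd), len(need_even))
    for p in need_odd[paired:] + need_even[paired:]:
        out[p] ^= 1                            # unpaired positions are modified directly
    return out

def extract_bits(samples, n_bits, passphrase):
    """Regenerate the positions from the passphrase and read their LSBs."""
    positions = positions_for(passphrase, len(samples), n_bits)
    return [samples[p] & 1 for p in positions]

cover = [(i * 13) % 256 for i in range(200)]   # made-up sample data
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_by_exchange(cover, message, "pw")
print(extract_bits(stego, len(message), "pw") == message)
```

Exchanging values rather than overwriting them is what leaves the frequency of each value largely unchanged.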

**Snow**
Snow is a steganography tool that exploits the nature of white space. It achieves this by appending white space to the end of lines in ASCII text to conceal messages. White-space steganography can be detected by applications such as Word.

Snow is susceptible to this kind of detection. The basic assumption of Snow is that spaces and tabs are generally not visible in text viewers and therefore, a message can be effectively hidden without affecting the text’s visual representation for the casual observer. Encryption is provided using the Information Concealment Engine (ICE) encryption algorithm in one-bit cipher-feedback (CFB) mode. Because of ICE’s arbitrary key size, passwords of any length up to 1,170 characters are supported. Snow takes advantage of the fact that, since trailing spaces and tabs occasionally occur naturally, their existence will not be sufficient to immediately alert an observer who may stumble across them.

The Snow program runs in two modes: message concealment and message extraction. The data are concealed in the text file by appending sequences of up to seven spaces, interspersed with tabs. This usually allows three bits to be stored every eight columns. The start of the data is indicated by an appended tab character, which allows the insertion of e-mail and news headers without corrupting the data. Snow provides rudimentary compression, using Huffman tables optimized for English text. However, if the data are not text, or if there is a lot of data, the use of an external compression program such as compress or gzip is recommended. If a message string or message file is specified on the command line, Snow attempts to conceal the message in the input file, or in standard input otherwise. The resulting file is written to the output file if specified, or to standard output if not. If no message string is provided, Snow attempts to extract a message from the input file. The result is written to the output file or standard output.
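The white-space encoding described above can be sketched as a toy. It follows the idea of a leading tab marker and runs of zero to seven spaces separated by tabs, three bits per run, but it is not Snow's actual file format: there is no compression or encryption, and the fixed `groups_per_line` limit and the function names are assumptions.

```python
def to_groups(data):
    """Split the message into 3-bit values, zero-padding the final group."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 3)
    return [int(bits[i:i + 3], 2) for i in range(0, len(bits), 3)]

def conceal(cover, data, groups_per_line=4):
    """Append a tab marker, then runs of 0-7 spaces each closed by a tab."""
    groups = to_groups(data)
    out = []
    for line in cover.split("\n"):
        chunk, groups = groups[:groups_per_line], groups[groups_per_line:]
        tail = "\t" + "".join(" " * g + "\t" for g in chunk) if chunk else ""
        out.append(line.rstrip(" \t") + tail)
    if groups:
        raise ValueError("cover text has too few lines for this message")
    return "\n".join(out)

def extract(stego):
    """Read the space runs after each line's tab marker back into bytes."""
    bits = ""
    for line in stego.split("\n"):
        tail = line[len(line.rstrip(" \t")):]
        if tail.startswith("\t"):
            # drop the marker; the final split item is the empty string after the last tab
            for run in tail[1:].split("\t")[:-1]:
                bits += f"{len(run):03b}"
    bits = bits[:len(bits) - len(bits) % 8]     # discard padding bits
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

cover = "Dear Bob,\nSee you at noon.\nRegards, Alice"
stego = conceal(cover, b"hi")
print(extract(stego))
```

The visible text is unchanged; only the trailing white space differs, which is also what a detector would look for.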

**Camera/Shy**

Camera/Shy is a simple steganography tool that allows users to encrypt information and hide it in standard GIF images. What makes this program different from most steganography tools is its ease of use, making it a desirable component of a cracker’s arsenal.

While other steganography programs are command-line based, Camera/Shy is embedded in a Web browser. Other programs require users to know beforehand that an image contains embedded content. Camera/Shy, however, allows users to check images for embedded messages, read them, and embed their own return messages with the click of a mouse.

The Camera/Shy program allows Internet users to conceal information, viruses, or exploitative software inside graphics files on Web pages. Camera/Shy bypasses most known monitoring methods. Utilizing LSB steganographic techniques and AES 256-bit encryption, this application enables users to share censored information with their friends by hiding it in plain view as an ordinary GIF image. Moreover, it leaves no trace on the user’s system. It allows a user to make a Web site C/S-enabled (Camera/Shy-enabled) and allows a reader to decrypt images from an HTML page on the fly.

**Steganos**
Steganos is a steganography tool that combines cryptography and steganography to hide information. It first encrypts the information and then hides it with steganographic techniques. With the help of Steganos the user can store a file with a copyright and prove ownership of a picture if someone tries to use it.

Steganos can hide a file inside a BMP, VOC, WAV, or ASCII file.