Digital images Flashcards

1
Q

Describe RGB coding and its relationship with human vision.

A

RGB stands for Red, Green, and Blue.

The wavelength-dependent sensitivity of the human eye to light is described by spectra, and vision is divided into scotopic and photopic vision.
Scotopic vision uses rod cells, which contain a single pigment: because of that you do not see colors, only brightness. It is more sensitive and is used in the dark or when the light is dim.
Photopic vision uses cone cells, which contain three pigments most sensitive to red, green, and blue light; their spectra overlap considerably.
Our perception of color is given by the combination of the excitation of these three families of light-sensitive cells; the signals are then processed in the brain.

RGB coding started after the definition in 1931 of the CIE color space, which relates the distribution of wavelengths in the visible electromagnetic spectrum to the colors physiologically perceived in human vision. A color is defined by a combination of three spectra (also known as x, y, z) referred to the three primaries red, green, and blue; once I have these spectra I can build as many colors as I want: for instance, if all three are 0 I get black, while if all three are at their maximum I get white. When the three spectra are combined they reproduce the same palette of colors that we can see, so RGB coding is based on the fact that, to reproduce a color, I overlap these three spectra to get as close as possible to the color I see in reality.
Regarding RGB coding and computer images: a colored image is based on the fact that colors are combinations of three values, so when you save an image you are saving the superposition of three combined images, the red one, the green one, and the blue one, each with the same color depth. Each pixel holds three values, the red one (IR), the green one (IG), and the blue one (IB): I have RGB inside the single pixel, and each value has the same color depth but a different meaning, because each should drive light with the spectrum it was recorded for, to reproduce the color as faithfully as possible. In this case, a pixel stores a set of three pieces of information.
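A minimal sketch in Python/NumPy (the pixel values are made up) of how an RGB image is stored as three superimposed planes, one value per channel per pixel:

```python
import numpy as np

# A 2x2 RGB image: shape (height, width, 3), 8 bits per channel.
# Each pixel stores three values: I_R, I_G, I_B.
img = np.array([
    [[255,   0,   0], [  0, 255,   0]],   # red pixel, green pixel
    [[  0,   0, 255], [255, 255, 255]],   # blue pixel, white pixel
], dtype=np.uint8)

# The red, green and blue planes are three monochrome images
# with the same size and the same color depth.
red_plane, green_plane, blue_plane = img[..., 0], img[..., 1], img[..., 2]
print(img.shape, red_plane.shape)   # (2, 2, 3) (2, 2)
```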

2
Q

Describe main features that define an electronic image.

A
  • linearity: by doubling the light, the pixel intensity also doubles
  • sensitivity: minimum and maximum amount of photons I can detect
  • signal/noise ratio: how large the error in the measured intensity is because of electronic noise (dark counts)
  • spectral sensitivity
  • integration time: how long the light is recorded.

A digital image is defined by two key parameters: spatial sampling and brightness (luminance) quantization. First, an image is digital in space because we divide it into pixels (picture elements) or voxels, i.e. into squares: in the end, we do not have spatial continuity.
The second digitization concerns the information stored in each pixel: no stored number can have infinitely many digits, so we have to decide the size of what we store inside each pixel; this value is expressed by a group of bits and indicates the level of brightness and color. The quantization determines the nominal depth of the image, i.e. how finely the intensity is represented. In detail, each pixel is associated with one number (or more); the simplest case is one bit, which can be 0 or 1, so one binary piece of information. If you want more information you can use 2 bits, which correspond to 4 values from the combinations of 0 and 1: 00, 01, 10, 11, and so on.
To sum up, the main steps to make an image are: break the space, i.e. sample it by dividing it into pixels, and then quantize the value stored in each pixel, as sketched below.
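A minimal Python/NumPy sketch of these two steps, assuming an already-sampled 4x4 scene with intensities between 0 and 1 (the values and the function name are illustrative):

```python
import numpy as np

def digitize(scene, n_bits):
    """Quantize a sampled scene (values in 0..1) to 2**n_bits grey levels."""
    levels = 2 ** n_bits                     # 1 bit -> 2 levels, 2 bits -> 4 levels, ...
    quantized = np.floor(scene * (levels - 1) + 0.5).astype(int)
    return quantized

# Spatial sampling: the scene is already broken into a 4x4 grid of pixels.
scene = np.random.rand(4, 4)                 # continuous intensities in [0, 1]
print(digitize(scene, 1))                    # only 0 or 1
print(digitize(scene, 2))                    # 0, 1, 2 or 3 (i.e. 00, 01, 10, 11)
```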
3
Q

If one would need to capture an electronic image with a microscope to observe tiny features and tiny changes of luminosity, what are the crucial requirements for the electronic image?

A

Two things happen when making a digital image: I sample the space, and I quantize the intensity within that space; that is, I break both the space and the information into digital form.
SAMPLING: the image is made of M×N pixels (picture elements). To observe tiny features, the spatial sampling must be fine enough, i.e. the pixels must be small compared with the features of interest.
QUANTIZATION: each pixel (or voxel) has one or more numbers associated with it; the value, expressed by a group of bits, indicates the level of brightness and color, and the quantization gives the nominal depth of the image. To observe tiny changes of luminosity, the depth must be large enough (many bits per pixel) so that small intensity differences fall into different levels.

4
Q

Describe what is image depth. How is it determined in an image? Is it the same for all electronic images?

A

Depth of the image: the number of bits that are used for defining the luminosity of the picture.

Image depth results from the number of bits used to express the luminosity of the picture; the quality of the image depends on the number of bits, which is not the same for all electronic images. For instance:
• If you use 1 bit, the information corresponds to 2^1 values, 0 or 1: the image will be black (0) and white (1), and each pixel carries only that single 0/1 information. In this case, the depth of the 1-bit image is 2 levels.
• If you use 3 bits, the information corresponds to 2^3 values: 000, 001, 010, 011, 100, 101, 110, 111. The range between black and white is broken into 8 values, so the greyness of the image is expressed with 8 numbers corresponding to 8 different shades of grey. In this case, the depth of the 3-bit image is 8 levels.
• If you use 8 bits (the standard case), the information corresponds to 2^8 = 256 values: the continuous range of brightness is divided into 256 shades of grey between black and white. In this case, the depth of the image is 256 levels.
The real depth of an image: the interval of grey levels used in the image.

To sum up: the most used images are those made of 8, 16, and 32 bits per pixel which, in a binary system, correspond to 2^8, 2^16, and 2^32 levels, i.e. 256, 65,536, and about 4.3×10^9 values respectively.
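A minimal Python/NumPy sketch relating bit depth to the number of grey levels, and to the "real depth" (the interval of levels actually used); the example pixel values are invented:

```python
import numpy as np

for n_bits in (1, 3, 8, 16, 32):
    print(n_bits, "bits ->", 2 ** n_bits, "levels")   # 2, 8, 256, 65536, ~4.3e9

# "Real depth": the interval of grey levels actually used in a given image.
img = np.array([[12, 40, 200], [180, 33, 97]], dtype=np.uint8)  # nominal depth 256
print("real depth:", img.min(), "-", img.max())                 # 12 - 200
```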

5
Q

An 8-bit image taken with a microscope has many pixels whose value is saturated: what does it mean? Could that be an issue? Would it be the same if they were 0?

A

The “saturation point” is the value above which the detector no longer responds to additional light: for instance, when you take a picture with your camera and it is saturated, it comes out all white. A saturated pixel is an issue because the true intensity was at or above the maximum recordable value, so differences between bright features are lost. A pixel equal to 0 gives the same problem at the other end of the scale: everything darker than the lowest recordable level is clipped to black.
For example, consider the representation of a single color, in this case blue, ranging from 0 to 255: 0 is black and 255 is the maximum intensity of blue light, so 255 is the “saturation point”. I have to be aware that, depending on the experiment I am doing, I can decide what the maximum value is and how much physical intensity (how many watts) it corresponds to. To sum up, there is no absolute blue: I can work on the intensity scale so that the signal of interest falls inside it.
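A minimal Python/NumPy sketch (on a fake random 8-bit image) for counting clipped pixels at both ends of the range:

```python
import numpy as np

img = np.random.randint(0, 256, size=(512, 512), dtype=np.uint8)  # fake 8-bit image

saturated = np.count_nonzero(img == 255)   # clipped at the top: true value unknown
black     = np.count_nonzero(img == 0)     # clipped at the bottom: same loss of information
print(f"saturated pixels: {saturated}, zero pixels: {black}")
```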

6
Q

Give a quantitative estimate of the size (in bits and in bytes) of a false-color image 1000x600px with 6 (8-bit) color channels. Would it be a realistic situation?

A

1000 × 600 × (6 × 8) = 28.8 × 10^6 bits; 28.8 × 10^6 / 8 = 3.6 × 10^6 bytes.
It would not be a realistic situation as a color picture: 6 channels of 8 bits give 48 bits per pixel, i.e. 2^48 possible color combinations, far more than can be displayed with full fidelity, so such an image would have to be reduced to a 24-bit or 8-bit color image.
—————————————-
First, we have to look at the link between pixels and file size: greyscale, RGB, and CMYK images have either 1, 3, or 4 bytes per pixel (i.e. 8 bits per channel) or 2, 6, or 8 bytes per pixel (i.e. 16 bits per channel).
For example, RGB has 3 channels: the size of a pixel is 3 × 8 = 24 bits per pixel (24 ÷ 8 = 3 bytes).

Let’s examine our case of a false-color image of 1000 × 600 pixels with 6 color channels (8-bit):
first, we calculate the number of pixels, which is 1000 × 600 = 600,000.
Second, from the number of color channels, which is 6, we calculate the pixel size knowing that each channel is made of 8 bits: 6 × 8 = 48 bits.
Finally, we calculate the image size as the product of the number of pixels and the size of each of them: 600,000 × 48 = 28,800,000 bits, which is equal to 3,600,000 bytes.

It would not be a realistic situation for a true-color picture, because the six channels do not refer to real colors: each channel can store its own information, and the displayed result is a superposition of them.
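A minimal Python sketch of the size calculation above (pure arithmetic, same numbers as in the worked example):

```python
width, height = 1000, 600
channels, bits_per_channel = 6, 8

n_pixels       = width * height                      # 600,000 pixels
bits_per_pixel = channels * bits_per_channel         # 48 bits per pixel
size_bits      = n_pixels * bits_per_pixel           # 28,800,000 bits
size_bytes     = size_bits // 8                      # 3,600,000 bytes (~3.6 MB)
print(size_bits, size_bytes)
```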

7
Q

Describe the main difference between GIF/PNG formats and TIFF

A

TIFF, PNG, and GIF are the most common raster-based formats by which you can save images.
GIF (Graphics Interchange Format) is a graphic format that lets you save color images with a reduced file size. It plays tricks with colors: the number of colors is reduced to a maximum of 256 indexed colors (or grey levels) defined in a look-up table, keeping only the most important/most used ones, and the indexed data are then compressed losslessly with LZW. Because of the color reduction it is good only for visualization, not for scientific data. In the image, each color is replaced by a code chosen by a sophisticated algorithm. Of these three formats, only GIF supports animation.
PNG (Portable Network Graphics) is an evolution of GIF, so you still save space by playing with colors, but it is more extended because its look-up table is not limited to 256 entries as in GIF, so the color palette can be much richer.
PNG images mainly come in two modes, PNG8 and PNG24: PNG8 supports up to 256 colors whereas PNG24 handles up to about 16 million, so the look-up table is more complex; moreover, PNG has a transparency (alpha) channel, a rule by which you decide how two layers overlap. You can also choose how strongly the colors are reduced. PNG compression is about 5–25% better than GIF compression, so GIF images are now mainly used only when the image contains animations.
TIFF can be read by almost any program, so it is the most universal and generally used; it is very flexible regarding size in pixels and depth in bits (it can have any of them). Moreover, it may contain meta-information in memory locations called “Tags”. The most common Tags carry information about resolution in dpi, compression, color code, and ROIs (regions of interest), because you can save additional things, such as selected regions, on top of the image.
Finally, TIFF can be saved with either lossy or lossless compression, unlike the previous two formats where one cannot choose; the most typical lossless compression is LZW, which is effective for images containing a large number of repeated elements: repeated strings are replaced by codes without losing information.

GIF is lossless when the image is already encoded in a form suited to GIF, i.e. an image with an 8-bit color palette.
If the source image is a JPEG or a full-color image such as a 24-bit PNG, creating a GIF causes a loss of detail due to the necessary quantization and possibly dithering.
If you do 24-bit PNG -> GIF -> PNG you will not end up with the same image, so in that sense it is not lossless; if you do 8-bit PNG -> GIF -> PNG you will end up with an identical image. GIF is lossless for images of at most 8-bit color or 8-bit grayscale.
Also, if you convert a video file to an animated GIF there is a substantial loss of information: videos usually look fuzzy and pixelated when converted to GIF.

8
Q

Describe what lossy and lossless compressions are. Why are lossy compressions used?

A

Lossy compression is a way of saving data in which you lose information, in a manner that is not very noticeable to the human eye or ear, in order to reduce the size of the data; lossless compression is a way to compress a file, i.e. reduce its size, without losing information or reducing the quality of the data. Moreover, lossless compression is reversible.

An example of lossless compression is LZW: the computer searches the picture for strings of pixels (sequences of repeated pixels); these are recorded separately, replaced by a code/character, and saved in a new file together with the whole dictionary of strings (the set of strings defines a look-up table). Knowing the dictionary, the file can be decompressed exactly, so it is a reversible process.

Lossy compression is irreversible, such as the color reduction used by GIF: one has to decide which are the most important colors in a picture, because GIF uses a look-up table that defines 256 indexed colors, which replace the original colors in the image according to a sophisticated algorithm.
Lossy compression reduces the quality of the data because the purpose is not scientific analysis, but visualization.
Lossy compressions are used because they are more effective in reducing file size than lossless compression: that makes a difference for a user with several GB of files, and the resulting quality is acceptable in most cases.
——————————
We compress images/files when we want to save memory, using less space on the computer. Compression can be lossy or lossless: lossy compression is a type of compression in which we lose some of the file’s information (JPEG images are an example), so we save a lot of space; with lossless compression we compress the file without losing information, so we save somewhat less space than with lossy compression. Lossless compression is less effective in reducing file size than lossy compression. Even though with lossy compression we lose information, we lose only details, not the major information, so the quality of the image is only slightly lower. Lossy compression is used because, with a minor loss of quality, we save a lot of space on the computer.
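A minimal Python sketch of the dictionary idea behind LZW (a simplified encoder, not the exact GIF/TIFF variant; the input string is made up):

```python
def lzw_encode(data):
    """Minimal LZW encoder: repeated strings are replaced by dictionary codes."""
    dictionary = {bytes([i]): i for i in range(256)}   # one code per byte value
    next_code = 256
    w = b""
    codes = []
    for byte in data:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc                       # keep growing the current string
        else:
            codes.append(dictionary[w])  # emit the code for the known string
            dictionary[wc] = next_code   # add the new string to the dictionary
            next_code += 1
            w = bytes([byte])
    if w:
        codes.append(dictionary[w])
    return codes

# Repeated patterns compress well: 12 bytes become 6 codes, and the process
# is exactly reversible if the decoder rebuilds the same dictionary.
print(lzw_encode(b"ABABABABABAB"))
```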

9
Q

Describe the relationship between JPEG compression and FFT (Fast Fourier Transform).

A

JPEG images are compressed in the frequency domain because this makes it possible to save a lot of space on the computer without losing too much information: the aim is to keep the important part of the spectrum and discard the finest detail, and this is what the Fourier transform allows (in practice JPEG uses the discrete cosine transform, a close relative of the FFT). First we take a normal picture (3 color channels) with normal depth and convert it into 3 channels, two for color and one for brightness, because human eyes are more sensitive to brightness than to color. Since the color information can be stored at half resolution, one color value is shared by a block of 2×2 pixels, which leads to roughly a 50% reduction of the file size. Then the image is split into 8×8-pixel blocks, each treated as a sum of spatial-frequency components; with the transform we take away the components that contain the finest details, i.e. a fraction of the spectrum, making the information lighter, so we save space. How much of the finest structure is lost is decided by the quantization factors. Then we apply the inverse transform and obtain the image without its finest components: we lose some resolution but gain space on the computer.
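A minimal Python/NumPy sketch of the principle on one 8×8 block, using the FFT as in the question (real JPEG uses the DCT and quantization tables); the block values and the size of the kept low-frequency region are arbitrary:

```python
import numpy as np

block = np.random.rand(8, 8) * 255            # one 8x8 block of brightness values

spectrum = np.fft.fft2(block)                 # spatial domain -> frequency domain
spectrum_shifted = np.fft.fftshift(spectrum)  # put low frequencies at the center

# Keep only the central (low-frequency) 4x4 region; discard the finest details.
mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1
filtered = np.fft.ifftshift(spectrum_shifted * mask)

# Inverse transform: the block is recovered without its finest components.
approx = np.real(np.fft.ifft2(filtered))
print(np.abs(block - approx).mean())          # small average error, fewer stored coefficients
```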

10
Q

Why are stacks of images useful in confocal microscopy? How many dimensions does an image stack have?

A

A stack of images is a structure that allows me to layer different images that share the same information in x and y but differ in a third dimension; we collect 2D images that share the same parameters and stack them along the third dimension, which is different in every image. This third dimension lets me play with different parameters, such as time, λ, z, etc. In particular, stacks of images are useful in confocal microscopy because they allow me to see the sample in 3D: with a confocal microscope the image is taken in optical sections, and I can build up the volume by collecting the slices obtained at each depth thanks to the scanning mechanism. Stacks are also useful in confocal microscopy because you can overlap images in which different fluorophores stain different specific elements of the sample, so with a stack you can see all the fluorophores together.
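A minimal Python/NumPy sketch of a confocal z-stack as a 3D array, and of a multi-channel stack as a 4D array (the sizes are invented):

```python
import numpy as np

# A z-stack: 20 optical slices, each a 512x512 image -> 3 dimensions (z, y, x).
z_stack = np.zeros((20, 512, 512), dtype=np.uint16)

# The same stack with 3 fluorophore channels -> 4 dimensions (z, channel, y, x).
multi_channel_stack = np.zeros((20, 3, 512, 512), dtype=np.uint16)

max_projection = z_stack.max(axis=0)   # one common way to view a 3D stack as a 2D image
print(z_stack.shape, multi_channel_stack.shape, max_projection.shape)
```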

11
Q

Define what is an image stack. Describe at least two situations in which stacks are useful.

A

An image stack is a structure in which at least two 2D images are associated: they share the information in x and y but differ in some parameter, so in the third dimension we can stack images that differ in z, in λ, in time, etc. Image stacks are useful because in confocal microscopy they allow us to see the sample in 3D, but also because, if we have a stack of images as a time-lapse, we can make a video; if the images are stacked in λ they can be used for spectral analysis, and if they are stacked in color channels we can see the different fluorophores.

12
Q

A 16-bit monochromatic image is visualized with a color LUT. Is that a color image? How many color planes does the image have? Discuss the answer

A

A color LUT transforms a range of input values into a range of colors; for example, we can select yellow, magenta, red, etc. to display the image. In a 16-bit image there are 65,536 (2^16) levels, but there is only one color plane, because it is not an RGB image, which has 3 color planes (one for red, one for blue, and one for green) that together allow all the colors to be displayed. This image is monochromatic, which means it is not a color image, but we can visualize it with different shades of color.

13
Q

Define what a LUT (Look Up Table) is. Give an estimate of the number of lines of a look-up table used in a 16-bit image. And in an RGB image?

A

A look-up table is the way the computer visualizes a digital image: it associates an intensity of a shade of a particular color with each value in the image. A 16-bit image means we have 2^16 = 65,536 levels (because the computer is a binary machine), and for each level the output intensity is written in the look-up table. In an RGB 16-bit image we have a combination of three 16-bit images (one for blue, one for red, and one for green), so the number of lines will be 2^16 × 3.
—————————–
A look-up table is an array that links an index number to an output value. The range of index numbers depends on the bit depth: for an 8-bit image the index numbers go from 0 to 255 on the greyscale. The LUT thus gives the computer one possible visualization of an electronic image, with a color scale correlated with the different values of recorded photon intensity. The number of rows of the LUT is the number of intensity levels in the image, so for a 16-bit image I will have 65,536 lines, one for each level of intensity, and 2 columns, one for the index numbers and one for the output of the chosen color scale. For an RGB image I will have the same number of rows but 4 columns: one for the index and 3 for the color channels, red, green, and blue.
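A minimal Python/NumPy sketch of a LUT, assuming an 8-bit monochrome image and an invented two-ramp color palette (a 16-bit image would simply have 65,536 rows):

```python
import numpy as np

n_bits = 8
levels = 2 ** n_bits                          # 256 rows for 8 bits, 65,536 for 16 bits

# Each index (intensity level) maps to an (R, G, B) output value.
lut = np.zeros((levels, 3), dtype=np.uint8)
lut[:, 0] = np.arange(levels)                                # red ramps up with intensity
lut[:, 1] = np.clip(np.arange(levels) * 2 - 255, 0, 255)     # green comes in later

img = np.random.randint(0, levels, size=(4, 4))   # monochrome image (one color plane)
displayed = lut[img]                              # visualization only: the data are unchanged
print(lut.shape, displayed.shape)                 # (256, 3) (4, 4, 3)
```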

14
Q

Why are LUTs used in electronic imaging? If I change the LUT, am I changing the image information?

A

LUTs are used in electronic imaging because they allow us to analyze and visualize images in the way we prefer, based on the intensity values. If I change the LUT I am not changing the image information, only its visualization: what I have is a range of intensity values, and the computer can display them using different channels and different color scales.

15
Q

Define what does it mean to change brightness and contrast of an image. Are both reversible processes? Discuss the answer.

A

Changing the brightness and contrast of an image means changing how the intensity information is displayed. With brightness you change the value of the maximum or the minimum of the displayed intensity (separately); with contrast you change the difference in brightness between regions by changing the width of the displayed range. For example, if I set the display range of an 8-bit image to 140–230, every recorded value below 140 is shown as zero intensity and every value above 230 is shown as the maximum, so I flatten the differences outside that interval. If I apply these modifications to the image data (rather than only to the display LUT), they are irreversible, because part of the recorded information is lost: the mapping is changed permanently, and different original values end up with the same output value.
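A minimal Python/NumPy sketch of applying the 140–230 contrast window mentioned above to a fake 8-bit image; once the result overwrites the original, the clipped values cannot be recovered:

```python
import numpy as np

img = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)

lo, hi = 140, 230                                   # new display range (contrast window)
stretched = (img.astype(float) - lo) / (hi - lo)    # values below lo -> <0, above hi -> >1
stretched = np.clip(stretched, 0, 1)                # clipping flattens differences outside the window
adjusted = (stretched * 255).astype(np.uint8)

# If 'adjusted' is saved over 'img', the clipped information is lost for good.
print(img, adjusted, sep="\n")
```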

16
Q

Describe the steps that one must follow to binarize an image. Why is a threshold used? Is it a technique free from subjective interpretations?

A

Binarizing an image means that each pixel can take only two intensity values, 0 or the maximum. To do this, I need to choose a threshold: a value above which pixels are set to the maximum intensity and below which they are set to 0 (or vice versa). To choose a good threshold, neither too low (including the background) nor too high (excluding the target), it can be useful to subtract the background first. Binarization is used when you have to make topological measurements (number of cells, replication rate, area, perimeter), so the target of the analysis is not tiny differences in intensity but only the presence or absence of objects. It is not free from subjective interpretations, because the choice of the threshold and of how much background to remove is subjective.
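A minimal Python/NumPy sketch of the steps on a fake image: crude background subtraction, then thresholding; the threshold value here is an arbitrary (subjective) choice:

```python
import numpy as np

img = np.random.randint(0, 256, size=(6, 6), dtype=np.uint8)

background = np.median(img)                    # a crude background estimate
corrected = img.astype(int) - int(background)  # subtract the background
corrected = np.clip(corrected, 0, 255)

threshold = 50                                 # subjective choice
binary = np.where(corrected >= threshold, 255, 0).astype(np.uint8)  # only 0 or the maximum
print(binary)
```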

17
Q

What process can I perform if I want to extract topological information from an image (area, perimeter, distances…)? Do I need to have a calibrated image?

A

To extract topological information from an image we can select the objects and measure them on the selected areas. In ImageJ, you first tell the program which measurements you want, then prepare the image by binarizing it: you can subtract the background and then apply a threshold. Analyze Particles is an operation that can be run only on a binary image; through this procedure you can obtain data such as distances, areas, and perimeters.
Image calibration provides the pixel-to-pixel distance: we need a calibrated image if we want real-world values, because from the number of pixels an object covers and the pixel-to-pixel distance of the calibrated image we can easily obtain its perimeter and area.
—————————–
To extract topological information from an image, I have to binarize the image because I don’t care anymore about tiny differences in intensity but I have to focus only on the size and the presence of my target. I need a calibrated image if my focus is the real dimensions of objects in the image. If I need only the number of cells or the replication rate, I can use the non-calibrated image.
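A minimal Python/NumPy sketch of converting a pixel count from a binary mask into a calibrated area; the pixel size of 0.2 µm is an assumed calibration value and the mask is invented:

```python
import numpy as np

binary = np.zeros((100, 100), dtype=np.uint8)
binary[20:40, 30:60] = 255                     # one rectangular "object" of 20x30 pixels

pixel_size_um = 0.2                            # calibration: micrometres per pixel (assumed)
n_pixels = np.count_nonzero(binary)            # 600 pixels: enough for counting objects
area_um2 = n_pixels * pixel_size_um ** 2       # 600 px * 0.04 µm² = 24 µm² (needs calibration)
print(n_pixels, area_um2)
```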

18
Q

Describe how can FFT (Fast Fourier Transform) be used to assess the background of an image.

A

FFT is a process that decomposes an image into its corresponding sine and cosine components, i.e. it represents the image (given in the spatial domain) in the equivalent frequency domain.
If the FFT mainly contains low-frequency components (found at the center of the transformed image), it means that the high-frequency components, which carry the fine details of the image such as background noise, are weak.
If the FFT contains both low-frequency and high-frequency components (the latter lying away from the center), there is considerable background noise in the image.
We can select the low-frequency region and perform the inverse FFT to obtain an image without much of the background noise.
———————-
FFT can be used to assess the background of an image because, when you compute it, you obtain an image made of points which represent the original image in terms of its spatial frequencies. You can select a zone and cut off the components with higher spatial frequency (far from the center), which correspond to the background noise or the tiniest details. When you then invert the FFT, you obtain the original image without the details/background represented by the deleted spatial frequencies.
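A minimal Python/NumPy sketch of this idea on a whole (fake) image: keep only the low-frequency region with a circular mask and invert the FFT; the image and the cut-off radius are arbitrary:

```python
import numpy as np

img = np.random.rand(128, 128)                       # stand-in for a noisy micrograph

f = np.fft.fftshift(np.fft.fft2(img))                # low frequencies now at the center
y, x = np.indices(img.shape)
r = np.hypot(y - img.shape[0] / 2, x - img.shape[1] / 2)

low_pass = np.real(np.fft.ifft2(np.fft.ifftshift(f * (r < 10))))
# 'low_pass' keeps only slowly varying intensity: an estimate of the background,
# which can then be inspected or subtracted from the original image.
print(low_pass.shape)
```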

19
Q

Why are time-lapse and slow-motion videos used in scientific measurements? If one wants to obtain a 100x slow motion, what frame rate needs to be set on the camera? How much longer will the slow-motion video be, in order to have a fluid visualization?

A

Time-lapse and slow-motion videos are used in scientific measurements to analyze events that happen too fast or too slowly for the normal timescale: for example, in the future we will need picosecond-scale frame intervals to look at enzymes or other parts of cell metabolism. With faster and faster cameras we can use slow motion to study cells and molecules, while time-lapse lets us look at what is slower than our timescale, for example one picture of a plant every day, from which we can see the plant moving.
For a 100x slow motion, if I want to play the video back at 30 fps, I have to record it at 30 fps × 100 = 3000 fps. If I then want to reproduce 3000 frames (one second of recording) at a frame rate of 30 fps, I will need 3000 frames / 30 fps = 100 seconds to play it back with a fluid visualization, i.e. the video becomes 100 times longer.
——————————————–
Time-lapse and slow-motion videos are used in scientific measurements to study processes that are too fast or too slow to be seen in real-time. For example, if I want to observe changes in the body of a running jaguar, I need to slow down to appreciate details, so I need a slow-motion video. If I want to obtain a 100x slow motion, I need to record the phenomenon and reproduce it by dividing the real framerate by 100. So in the end the video will be 100x longer than the original one.
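A minimal Python sketch of the 100x slow-motion arithmetic above (assuming, as an example, a 1-second event and 30 fps playback):

```python
playback_fps = 30
slowdown = 100

capture_fps = playback_fps * slowdown                      # 3000 fps needed on the camera
event_duration_s = 1                                       # e.g. a 1-second event
frames_recorded = capture_fps * event_duration_s           # 3000 frames
playback_duration_s = frames_recorded / playback_fps       # 100 s: 100x longer than the event
print(capture_fps, playback_duration_s)
```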

20
Q

What happens if I subtract two images? What is a situation in which image subtraction could help?

A

If I subtract two images, for each pixel I subtract the intensity of the second image from that of the first. If the subtraction gives a negative value and I want to keep the same bit depth, that pixel is set to zero; otherwise, I can choose to use a 32-bit image to also cover negative values. Subtraction is useful to focus on tiny differences that you cannot appreciate in any other way, for example changes in the intensity of fluorophores between two frames.
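A minimal Python/NumPy sketch of the two options described above, on two tiny made-up 8-bit images:

```python
import numpy as np

a = np.array([[100, 50], [200, 10]], dtype=np.uint8)
b = np.array([[ 60, 80], [ 90,  5]], dtype=np.uint8)

# Same 8-bit depth: negative results are clipped to 0.
clipped = np.clip(a.astype(int) - b.astype(int), 0, 255).astype(np.uint8)

# 32-bit (signed) result: negative differences are preserved.
signed = a.astype(np.int32) - b.astype(np.int32)
print(clipped, signed, sep="\n")
```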