2.2 Data storage Flashcards
Units of capacity
The capacity of a device is how much data it can store. It is important to know the units of capacity so that we can compare the capacity of different devices.
The smallest unit of data that we can store is called a binary digit, or a bit for short.
The value of a bit can be either a 0 or a 1.
Because a bit is so small, we do not usually use it to measure data.
Instead, we group bits into larger groups such as:
Nibble (4 bits).
Byte (8 bits).
Denary
In everyday life, we use a denary number system.
We use 10 symbols (0 to 9) to represent each digit.
Each digit’s place value is multiplied by 10 as we move from right to left.
Binary
Only 1 and 0 are used in binary.
Each digit’s place value is multiplied by two as we move from right to left.
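As a worked illustration of binary place values, here is a minimal Python sketch that converts a binary string to denary by doubling the place value at each step from right to left. The function name binary_to_denary is made up for this example.

```python
def binary_to_denary(bits: str) -> int:
    """Convert a string of 0s and 1s to denary using place values."""
    total = 0
    place_value = 1  # the right-most place is worth 1
    for digit in reversed(bits):
        if digit == "1":
            total += place_value
        place_value *= 2  # each place to the left is worth twice as much
    return total


print(binary_to_denary("1011"))  # 8 + 0 + 2 + 1 = 11
```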
Hexadecimal
Hexadecimal uses 16 different symbols for each place.
Hexadecimal uses the digits 0-9 then A-F:
A = 10 in denary, B = 11…
Each digit’s place value is multiplied by 16 as we move from right to left.
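The same place-value idea can be sketched for hexadecimal. In this Python example, the names HEX_DIGITS and hex_to_denary are assumptions made for illustration; each symbol's position in HEX_DIGITS is its denary value.

```python
HEX_DIGITS = "0123456789ABCDEF"  # position in this string = denary value


def hex_to_denary(hex_string: str) -> int:
    """Convert a hexadecimal string to denary using place values."""
    total = 0
    place_value = 1  # the right-most place is worth 1
    for symbol in reversed(hex_string.upper()):
        total += HEX_DIGITS.index(symbol) * place_value
        place_value *= 16  # each place to the left is worth 16 times more
    return total


print(hex_to_denary("2F"))  # 2 x 16 + 15 x 1 = 47
```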
Binary to hexadecimal
Binary can be converted to hexadecimal by grouping into groups of 4 bits:
Start grouping from the right-hand side.
Convert each 4-bit group separately.
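The grouping steps above can be sketched in Python. This is one possible approach, not the only one: it pads the left-hand side with zeros so that grouping effectively starts from the right. The function name binary_to_hex is hypothetical.

```python
def binary_to_hex(bits: str) -> str:
    """Convert a binary string to hexadecimal, 4 bits at a time."""
    # Pad on the left so the length is a multiple of 4.
    # This has the same effect as grouping from the right-hand side.
    padded_length = -(-len(bits) // 4) * 4
    bits = bits.zfill(padded_length)
    hex_digits = ""
    for start in range(0, len(bits), 4):
        group = bits[start:start + 4]  # one 4-bit group
        hex_digits += "0123456789ABCDEF"[int(group, 2)]
    return hex_digits


print(binary_to_hex("10101111"))  # 1010 -> A, 1111 -> F, so AF
print(binary_to_hex("111010"))    # padded to 0011 1010, so 3A
```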
Hexadecimal to binary
Hexadecimal can be converted to binary by splitting each digit into 4 bits:
Separately convert each hexadecimal digit into 4 bits of binary.
Put all of the 4-bit groups together.
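The reverse conversion can be sketched the same way. In this Python example (hex_to_binary is a name chosen for illustration), each hexadecimal digit becomes exactly 4 bits and the groups are then joined.

```python
def hex_to_binary(hex_string: str) -> str:
    """Convert a hexadecimal string to binary, one digit at a time."""
    groups = []
    for symbol in hex_string.upper():
        value = int(symbol, 16)              # denary value of one hex digit
        groups.append(format(value, "04b"))  # always 4 bits, e.g. 3 -> 0011
    return "".join(groups)


print(hex_to_binary("3A"))  # 0011 and 1010, so 00111010
```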
Advantages of hexadecimal
An 8-bit binary number can be represented by two hexadecimal digits.
This makes hexadecimal a much shorter way to write values than binary; the computer itself still stores and processes them in binary.
It is easier and faster to write two hexadecimal digits than the full 8-bit binary sequence.
It is easier for a human to process hexadecimal than binary.
HTML colours
In HTML, colours are defined by how much red, green and blue (RGB) there is on a scale of 0 to 255.
0 to 255 is the range of numbers that can be represented as a single byte.
This means we use 2 hexadecimal digits to represent each RGB value.
HTML colour codes start with a hash symbol followed by 3 pairs of hexadecimal numbers like #00FF00.
We can break down the HTML colour code #00FF00 in the following way:
The red value is 00, which means we have 0 units of red.
The green value is FF, so we have 255 units of green.
The blue value is 00, so we have 0 units of blue.
The colour code #00FF00 corresponds to a bright green colour.
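The breakdown above can be checked with a short Python sketch. The function name parse_html_colour is an assumption for this example; it simply splits the six hexadecimal digits into three pairs and converts each pair to denary.

```python
def parse_html_colour(code: str) -> tuple:
    """Split an HTML colour code such as #00FF00 into (red, green, blue)."""
    hex_part = code.lstrip("#")    # drop the leading hash symbol
    red = int(hex_part[0:2], 16)   # first pair of hex digits
    green = int(hex_part[2:4], 16) # second pair
    blue = int(hex_part[4:6], 16)  # third pair
    return red, green, blue


print(parse_html_colour("#00FF00"))  # (0, 255, 0)
```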
MAC addressing
A MAC address is a number that uniquely identifies a networked device.
A MAC address is made up of 48 bits which are shown as 6 groups of 2 hexadecimal digits:
NN-NN-NN-DD-DD-DD or NN:NN:NN:DD:DD:DD
The first 6 hex digits identify the device manufacturer.
The last 6 hex digits identify the serial number of the device.
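A minimal Python sketch of splitting a MAC address into its two halves. The address used here is made up for illustration and does not belong to a real device; split_mac_address is a hypothetical helper name.

```python
def split_mac_address(mac: str) -> tuple:
    """Return (manufacturer part, serial part) of a MAC address."""
    # Remove either kind of separator so only the hex digits remain.
    digits = mac.replace("-", "").replace(":", "").upper()
    if len(digits) != 12:
        raise ValueError("a MAC address has 12 hexadecimal digits (48 bits)")
    manufacturer = digits[:6]  # first 6 hex digits
    serial = digits[6:]        # last 6 hex digits
    return manufacturer, serial


# Made-up example address, shown in both common formats.
print(split_mac_address("00-1A-2B-3C-4D-5E"))  # ('001A2B', '3C4D5E')
print(split_mac_address("00:1a:2b:3c:4d:5e"))  # ('001A2B', '3C4D5E')
```

Each hexadecimal digit stands for 4 bits, so 12 digits give the 48 bits mentioned above.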
Binary addition
If we add 0 + 0 we get 0.
If we add 1 + 0 (or 0 + 1) we get 1.
If we add 1 + 1, then we cannot use the symbol 2. So we need to carry the 1 and put 0 in the current place.
It might be the case that we have 1 + 1 and also a 1 carried over from the previous column.
If this is the case, then we carry the 1 and have 1 left over.
So we carry 1 and put 1 in the current place.
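The addition rules above can be sketched as column-by-column addition in Python. This is an illustrative version working on strings of 0s and 1s; add_binary is a name chosen for the example.

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary strings column by column, carrying where needed."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # line the columns up
    result = ""
    carry = 0
    for column in range(width - 1, -1, -1):  # right-most column first
        column_total = int(a[column]) + int(b[column]) + carry
        result = str(column_total % 2) + result  # digit for this place
        carry = column_total // 2                # 1 if the total was 2 or 3
    if carry:
        result = "1" + result  # a final carry needs an extra place
    return result


print(add_binary("0111", "0011"))  # 7 + 3 = 10, which is 1010
print(add_binary("1", "1"))        # 10
```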
Binary shifts
A binary shift is how a computer system performs basic multiplication and division.
Binary digits are moved left or right a set number of times.
A left shift of one place multiplies a binary number by 2 (x2).
A right shift of one place divides a binary number by 2 (/2).
A shift can move more than one place at a time; the principle remains the same.
A left shift of 2 places would multiply the original binary number by 4 (x4).
Binary shifts can cause a loss of precision by discarding bits, which can lead to changes in the numerical value.
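A short Python sketch of these shifts, using the built-in << and >> operators. The helper names shift_left and shift_right are only for illustration. Note that Python integers are not fixed-length, so a left shift here never overflows; the right shift does discard bits.

```python
def shift_left(value: int, places: int) -> int:
    """Move every bit left; each place doubles the value."""
    return value << places


def shift_right(value: int, places: int) -> int:
    """Move every bit right; bits shifted off the end are discarded."""
    return value >> places


print(format(shift_left(0b00000110, 1), "08b"))   # 00001100 (6 x 2 = 12)
print(format(shift_left(0b00000110, 2), "08b"))   # 00011000 (6 x 4 = 24)
print(format(shift_right(0b00000110, 1), "08b"))  # 00000011 (6 / 2 = 3)
print(format(shift_right(0b00000101, 1), "08b"))  # 00000010 (5 / 2 = 2.5, stored as 2)
```

The last line shows the loss of precision: the right-most 1 is discarded, so the stored result is 2 rather than 2.5.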
Left binary shift
In a left binary shift, each digit is moved one place to the left.
This has the effect of multiplying the number by two.
You must take care, when performing a left shift, that there is no overflow error (where we run out of space to store the left-most digit of the number).
Right binary shift
In a right binary shift, each digit is moved one place to the right.
This has the effect of dividing the number by two.
You must take care when performing a right shift that no data is shifted off the right-hand side. This can cause a loss of accuracy.
Overflow error
Binary numbers are stored as a fixed length.
If a digit is carried past the left-most place column, then this is called an overflow error.
Overflow errors can lead to inaccurate results and software crashes.
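A minimal Python sketch of overflow in an 8-bit store. It assumes that any bits beyond the eighth are simply discarded; real systems may behave differently (for example by raising an error). The names BITS, MAX_VALUE and store_8bit are made up for this example.

```python
BITS = 8
MAX_VALUE = 2 ** BITS - 1  # 255, the largest value 8 bits can hold


def store_8bit(value: int) -> tuple:
    """Return (value actually stored, whether an overflow happened)."""
    overflow = value > MAX_VALUE
    stored = value & MAX_VALUE  # keep only the lowest 8 bits
    return stored, overflow


print(store_8bit(200 + 100))        # (44, True): 300 does not fit in 8 bits
print(store_8bit(0b11000000 << 1))  # (128, True): a left shift can overflow too
print(store_8bit(100 + 27))         # (127, False): fits, so no overflow
```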
Character sets
A character set is a defined list of characters that can be understood by a computer.
Each character is given a unique binary code.
Character sets are ordered logically: the code for ‘B’ is one more than the code for ‘A’.
A character set provides a standard for computers to communicate and send/receive information.
Without a character set, one system might interpret the same binary number differently from another.
The number of characters that can be represented is determined by the number of bits used by the character set.
Two common character sets are:
American Standard Code for Information Interchange (ASCII)
Unicode (a universal character set)
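Two of the ideas above can be checked with a short Python sketch: character codes are in a logical order, and the number of bits sets how many characters can be represented. Python's built-in ord returns the Unicode code, which matches ASCII for these letters. The function name max_characters is an assumption for this example.

```python
def max_characters(bits_per_character: int) -> int:
    """Number of different codes available with this many bits."""
    return 2 ** bits_per_character


print(ord("A"))                 # 65
print(ord("B"))                 # 66, one more than the code for 'A'
print(format(ord("A"), "07b"))  # 1000001, the code for 'A' as 7 bits
print(chr(67))                  # C
print(max_characters(7))        # 128
print(max_characters(8))        # 256
```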
ASCII
The American Standard Code for Information Interchange (ASCII) character set is one of the most widely used character sets.
Each character in ASCII is represented by a seven-bit binary code.
That means there is a maximum of 128 characters.
ASCII includes all commonly used letters and symbols in the English language.
Because each code uses only seven bits, the extra bit in an 8-bit system can be used as a check digit.
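One way the spare eighth bit might be used as a check is a parity bit. The Python sketch below assumes an even parity scheme, which the notes above do not specify, and places the parity bit at the front of the code; add_even_parity is a hypothetical name.

```python
def add_even_parity(character: str) -> str:
    """Return an 8-bit code: 1 parity bit followed by the 7-bit ASCII code."""
    code = format(ord(character), "07b")   # 7-bit code, e.g. 'A' -> 1000001
    # Assumption: even parity, so the total number of 1s must be even.
    parity_bit = str(code.count("1") % 2)
    return parity_bit + code


print(add_even_parity("A"))  # 01000001 (two 1s already, so parity bit is 0)
print(add_even_parity("C"))  # 11000011 (three 1s, so parity bit is 1)
```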
Limitations of ASCII
128 characters is perfectly fine for the English language. But it does not leave space for characters from other languages.
An extended ASCII set was released which used all eight bits, but it was still not enough.
This led to the release of Unicode.
Unicode
Unicode is a character set which was released because of the need to standardise character sets internationally.
Unicode aims to represent every possible character in the world.
The most common form of Unicode is UTF-8, which uses between 8 and 32 bits to represent each character.
The first 128 characters in Unicode are identical to ASCII (and the first 256 match the Latin-1 extended set), which makes UTF-8 backwards compatible with documents encoded in ASCII.
Unicode represents characters from all major alphabets of the world.
Unicode is also used to represent emojis.
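A short Python sketch showing how many bits UTF-8 uses for different characters, using the built-in str.encode method. The helper name utf8_bits and the sample characters are chosen for illustration.

```python
def utf8_bits(character: str) -> int:
    """Number of bits UTF-8 uses to store one character."""
    return len(character.encode("utf-8")) * 8  # encode gives bytes; 8 bits each


print(utf8_bits("A"))           # 8  (same single byte as ASCII)
print(utf8_bits("\u00e9"))      # 16 (e with an acute accent)
print(utf8_bits("\u20ac"))      # 24 (euro sign)
print(utf8_bits("\U0001F600"))  # 32 (grinning face emoji)
```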
Bitmap
A bitmap image is made up of squares called pixels.
A pixel is the smallest element of a bitmap image.
Each pixel is stored as a binary code.
Each colour in the image has its own unique binary code.
Colour depth
Colour depth is the number of bits stored per pixel in a bitmap image.
The colour depth is dependent on the number of colours needed in the image.
In general, the higher the colour depth, the more colours can be represented and the higher the quality of the image.
In a black & white image the colour depth would be 1, meaning 1 bit is enough to create a unique binary code for each colour in the image (1=white, 0=black).
Common colour depths are 1-bit, 8-bit, 16-bit, and 24-bit.
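The link between colour depth and the number of available colours can be sketched in Python: each extra bit doubles the number of unique codes. The function name number_of_colours is an assumption for this example.

```python
def number_of_colours(colour_depth: int) -> int:
    """Number of different colours available at this colour depth (in bits)."""
    return 2 ** colour_depth


for depth in [1, 8, 16, 24]:
    print(depth, "bit:", number_of_colours(depth), "colours")
# 1 bit: 2 colours
# 8 bit: 256 colours
# 16 bit: 65536 colours
# 24 bit: 16777216 colours
```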
Resolution
Resolution represents the number of pixels in an image.
The number of pixels is found by multiplying the width by the height of the image.
An example resolution is 1080p which is 1920x1080.
Metadata
Metadata is extra information that is added to an image file such as:
The resolution.
The colour-depth.
The encoding format.
The time and date of taking the photo.
File size
The size (in bits) of an image file is calculated as follows:
File size (in bits) = image width x image height x colour depth (in bits)
The size of the image file in bytes is equal to:
File size (in bytes) = file size (in bits) ÷ 8
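A worked example of the two formulas above as a Python sketch, using a 1920x1080 image with a 24-bit colour depth. The function names are made up for illustration, and the result ignores any metadata stored in the file.

```python
def image_size_bits(width: int, height: int, colour_depth: int) -> int:
    """File size in bits = width x height x colour depth."""
    return width * height * colour_depth


def bits_to_bytes(bits: int) -> float:
    """File size in bytes = file size in bits / 8."""
    return bits / 8


size_in_bits = image_size_bits(1920, 1080, 24)
print(size_in_bits)                 # 49766400
print(bits_to_bytes(size_in_bits))  # 6220800.0
```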
Converting binary to image
In a black and white image, each pixel is represented by 1 bit:
A 0 will represent a black area.
A 1 will represent a white area.
The image will be represented by a binary string. Use the value of each bit to colour each pixel in the right colour.
The binary string starts at the top-left of the image, and represents the first row, followed by the second row, and continues until the end of the image.
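The steps above can be sketched in Python by cutting the binary string into rows and drawing each bit as a text character. The image width is an assumption that has to be known in advance (for example from the metadata). Here "#" stands for a black pixel and "." for a white pixel; render_bitmap is a hypothetical name.

```python
def render_bitmap(bits: str, width: int) -> list:
    """Turn a binary string into rows of text pixels, top row first."""
    rows = []
    for start in range(0, len(bits), width):
        row_bits = bits[start:start + width]  # the bits for one row
        # 0 is a black pixel (#) and 1 is a white pixel (.)
        rows.append("".join("#" if bit == "0" else "." for bit in row_bits))
    return rows


# A made-up 3x3 image stored as 9 bits, read row by row from the top-left.
for row in render_bitmap("101010101", 3):
    print(row)
# .#.
# #.#
# .#.
```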