3 Fundamentals Of Data Representation Flashcards
What is the capacity of a device?
The capacity of a device is how much data it can store
Why is it important to know the units of capacity of a device?
It is important to know the units of capacity so that we can compare the capacity of different devices
What is a bit?
A bit (short for binary digit) is the smallest unit of data that we can store.
The value can only be either a 0 or 1
Why don’t we use bits to measure data?
A single bit is too small a unit to be practical for measuring data
How large is a nibble?
4 bits
How large is a Byte?
8 bits
What are the most common units for comparing data storage?
Kilobyte (KB) (1000 bytes).
Megabyte (MB) (1000 KB).
Gigabyte (GB) (1000 MB).
Terabyte (TB) (1000 GB).
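As a rough sketch, the units above can be converted to bytes by repeated multiplication by 1000. The names `UNITS` and `to_bytes` are made up for illustration.

```python
# Illustrative helper (hypothetical names): each unit is 1000 times the previous one.
UNITS = {"B": 1, "KB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4}

def to_bytes(value, unit):
    """Return how many bytes there are in `value` of the given unit."""
    return value * UNITS[unit]

print(to_bytes(2, "MB"))                        # 2000000
print(to_bytes(1, "GB") // to_bytes(1, "MB"))   # 1000
```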
What are the three things required to calculate the amount of data that can be stored within a certain capacity?
The three things required to calculate how much data can be stored within a capacity are:
- Size of data being stored
- Available Capacity
- How to convert between units
Why is it important to know the size of data being stored?
The bigger the size of the data being stored, the larger the capacity of the storage media needed
Why is it important to know the available capacity of our device?
The more capacity available, the more data we can store
Why is it important to know how to convert between units?
To find out if we have enough capacity, it is important to put the capacity of the media and the size of the data into the same units
What is the equation to calculate required capacity of a device?
Required capacity = number of files × size of a single file
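A minimal sketch of this equation, using made-up figures (500 photos of 4 MB each, and a 16 GB card). Both values are put into the same unit (MB) before comparing.

```python
def required_capacity(number_of_files, size_of_single_file):
    """Required capacity = number of files x size of a single file."""
    return number_of_files * size_of_single_file

# Example figures are assumptions chosen only for illustration.
needed_mb = required_capacity(500, 4)   # size of data in MB
available_mb = 16 * 1000                # 16 GB expressed in MB

print(needed_mb)                  # 2000
print(needed_mb <= available_mb)  # True, so the files fit
```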
What is Denary?
Denary is our everyday number system (base 10), where each digit is one of 10 symbols (0-9)
What is Binary?
Binary is a number system where only the digits zero and one are used.
Each digit's place value is multiplied by 2 as we move from right to left.
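The place values can be sketched in code as follows. The function name `binary_to_denary` is a hypothetical choice; the bits are passed in as a string.

```python
def binary_to_denary(bits):
    """Add up the place value of every column that holds a 1."""
    total = 0
    place_value = 1                  # the rightmost column is worth 1
    for digit in reversed(bits):     # work from right to left
        if digit == "1":
            total += place_value
        place_value *= 2             # each column is worth double the last
    return total

print(binary_to_denary("1011"))  # 11  (8 + 2 + 1)
```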
What is hexadecimal?
Hexadecimal uses 16 different symbols for each place. Hexadecimal uses the digits 0-9 and A-F
Each digit's place value is multiplied by 16 as we move from right to left
How do you convert binary into hexadecimal?
First, you split the binary into groups of 4 bits, starting from the right
You then convert each group separately and put the digits together
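The grouping method above might look like this in code. This is only a sketch: `binary_to_hex` and `HEX_DIGITS` are illustrative names, and padding with leading zeros is used so that the groups of 4 line up from the right.

```python
HEX_DIGITS = "0123456789ABCDEF"

def binary_to_hex(bits):
    """Convert a string of bits to hexadecimal, 4 bits at a time."""
    # Pad on the left so the groups of 4 line up from the right.
    padded_length = -(-len(bits) // 4) * 4
    bits = bits.zfill(padded_length)
    hex_string = ""
    for i in range(0, len(bits), 4):
        group = bits[i:i + 4]                     # one group of 4 bits
        hex_string += HEX_DIGITS[int(group, 2)]   # convert the group on its own
    return hex_string

print(binary_to_hex("10111100"))  # BC  (1011 -> B, 1100 -> C)
print(binary_to_hex("111010"))    # 3A  (0011 -> 3, 1010 -> A)
```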
How do you convert hexadecimal to denary?
To convert from hexadecimal to denary, you should:
Write out the powers of 16 above each hexadecimal digit.
For each place column, multiply the hexadecimal digit by the power of 16 above it.
Add together all of these results.
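The three steps above can be sketched as below. The names `hex_to_denary` and `HEX_DIGITS` are assumptions made for this example.

```python
HEX_DIGITS = "0123456789ABCDEF"

def hex_to_denary(hex_string):
    """Multiply each hexadecimal digit by its power of 16 and add the results."""
    total = 0
    power = len(hex_string) - 1            # power of 16 for the leftmost digit
    for symbol in hex_string.upper():
        digit_value = HEX_DIGITS.index(symbol)
        total += digit_value * 16 ** power
        power -= 1
    return total

print(hex_to_denary("2F"))  # 47  (2 x 16 + 15 x 1)
```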
What are overflow errors?
If a number is carried past the last place column during binary arithmetic, then this is called an overflow error
What do overflow errors lead to?
Overflow errors can lead to inaccurate results and software crashes
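To illustrate, here is a sketch of adding two numbers when only 8 bits of storage are available. The function name `add_8_bit` and the 8-bit limit are assumptions for the example.

```python
def add_8_bit(a, b):
    """Add two numbers, keeping only 8 bits, and report any overflow."""
    total = a + b
    overflow = total > 0b11111111    # a 1 was carried past the 8th column
    stored = total & 0b11111111      # only the lowest 8 bits are kept
    return stored, overflow

print(add_8_bit(200, 100))  # (44, True): the stored result is inaccurate
print(add_8_bit(1, 2))      # (3, False)
```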
What is a binary shift?
A binary shift is a technique for performing multiplication or division on a binary number
Extra 0 bits are added to the start or end of the binary number to fill any missing spaces
What is a right binary shift?
A right binary shift is when each digit is moved one place to the right
This has the effect of dividing the number by two
You must take care when performing a right shift that no data is shifted off the right hand side. This can cause a loss of accuracy.
What is a left binary shift?
A left binary shift is when each digit is moved one place to the left.
This has the effect of multiplying the number by two.
You must take care, when performing a left shift, that there is no overflow error (where we run out of space to store the last digit of the number)
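Both shifts can be tried out with Python's `<<` and `>>` operators. This is a sketch that assumes 8 bits of storage, so `& 0xFF` is used to throw away anything that does not fit.

```python
number = 0b00010110            # 22 in denary

left = (number << 1) & 0xFF    # left shift, keeping 8 bits
right = number >> 1            # right shift
print(left, right)             # 44 11

odd = 0b00000111               # 7 in denary
print(odd >> 1)                # 3, not 3.5: the 1 shifted off the right is lost

big = 0b11000000               # 192 in denary
print((big << 1) & 0xFF)       # 128, not 384: the top bit overflowed
```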
What is text data made up of?
Text data is made up of characters
What is a character set?
A character set is a collection of all the characters that a computer recognises, along with their binary codes
What is included in a character set?
Alphanumeric characters e.g. letters, numbers, and symbols
Special characters e.g. new line
What are the two main character sets in use?
American Standard Code for Information Interchange
Unicode
What does ASCII stand for?
American Standard Code for Information Interchange
What is the most common character set?
American Standard Code for Information Interchange
What is each character in ASCII represented by?
Each character in ASCII is represented by a seven-bit binary code
What is the maximum number of characters allowed in ASCII?
ASCII can include a maximum of 128 characters
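The figure of 128 comes from the number of different 7-bit patterns. A short sketch using Python's built-in `ord` and `chr`:

```python
print(2 ** 7)                # 128 different 7-bit codes

code = ord("A")              # the character code for A
print(code)                  # 65
print(format(code, "07b"))   # 1000001, the 7-bit binary code
print(chr(0b1000001))        # A
```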
What does ASCII include?
ASCII includes all the commonly used letters and symbols in the English language
Why is it useful that ASCII is represented in 7-bits?
It is useful that ASCII is represented in 7-bits as the extra bit remaining can be used as a check bit (such as a parity bit) in an 8-bit system
What are the limitations of ASCII?
128 characters is perfectly fine for the English language but it does not leave space for characters from other languages
An extended ASCII set was released which used all eight bits, but it was still not enough
This led to the release of Unicode
What is the aim of Unicode?
The aim of Unicode is to represent every possible character in the world
What is the most common form of Unicode?
The most common form of Unicode is UTF-8 which uses between 8 and 32 bit binary codes to represent each character
How is Unicode compatible with ASCII?
The first 128 characters of Unicode are identical to ASCII. This makes it backwards compatible with documents encoded using older character sets
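A quick way to see the link between ASCII and UTF-8 is to encode a few characters in Python. The sample characters below are arbitrary choices for illustration.

```python
# Sample characters, written as escapes so the code is unambiguous.
samples = {
    "A": "A",
    "e-acute": "\u00e9",
    "euro": "\u20ac",
    "emoji": "\U0001F600",
}

# Number of bits UTF-8 uses for each sample character.
bit_lengths = {name: len(ch.encode("utf-8")) * 8 for name, ch in samples.items()}
print(bit_lengths)  # {'A': 8, 'e-acute': 16, 'euro': 24, 'emoji': 32}

# Plain English text produces the same bytes in ASCII and in UTF-8.
same_bytes = "Hello".encode("ascii") == "Hello".encode("utf-8")
print(same_bytes)   # True
```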
What characters does Unicode represent?
Unicode represents all characters from all major alphabets of the world
Unicode is also used to represent emojis
What are alphanumerical characters?
Letters
Numbers
Symbols
What is each bitmap image split up into?
Pixels
How is colour stored in a bitmap image?
Each pixel of a bitmap image has a colour which is stored as a binary number
What is colour-depth?
Colour-depth is the number of bits used to store the colour of each pixel
What does a greater colour-depth mean?
The greater the number of bits used to represent each pixel, the more unique colours can be stored.
What are the common colour-depths?
Common colour depths are 1-bit, 8-bit, 16-bit, and 24-bit.
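The number of unique colours doubles with every extra bit, so it can be worked out as 2 to the power of the colour-depth. A small sketch (the function name is hypothetical):

```python
def number_of_colours(colour_depth):
    """Each extra bit doubles the number of colours that can be stored."""
    return 2 ** colour_depth

for depth in (1, 8, 16, 24):
    print(depth, number_of_colours(depth))
# 1 2
# 8 256
# 16 65536
# 24 16777216
```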
What does resolution represent?
Resolution represents the number of pixels in an image
How can you calculate the resolution of an image?
Height of image × Width of an image = Resolution of image
Give an example of resolution.
1080p which is 1920×1080
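The 1080p example can be worked through in code. The second half is an extra step that assumes an uncompressed image with a 24-bit colour-depth and ignores metadata.

```python
width, height = 1920, 1080         # the 1080p example
resolution = width * height
print(resolution)                  # 2073600 pixels

colour_depth = 24                  # assumed example value, in bits per pixel
size_in_bits = resolution * colour_depth
print(size_in_bits // 8)           # 6220800 bytes, roughly 6.2 MB
```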
What is metadata?
Metadata is extra information that is added to an image file such as:
- The resolution
- The colour-depth
- The encoding format
- The time and date of taking the photo
How are images stored?
All images are stored as binary
How do you convert an image to binary?
Start at the top left of the image, and work across the first row:
Write a 0 if the pixel is black.
Write a 1 if the pixel is white.
Continue this process until the end of the image.
In a black and white image, what is the value of one pixel?
Each pixel is represented as one bit
0 represents a black area and 1 represents a white area
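Here is a sketch of converting a tiny black and white image, using a made-up 3×3 image stored as a list of rows. The names `image` and `image_to_binary` are illustrative only.

```python
# A made-up 3x3 image, listed from the top row down.
image = [
    ["black", "white", "black"],
    ["white", "white", "white"],
    ["black", "white", "black"],
]

def image_to_binary(pixels):
    """Write 0 for each black pixel and 1 for each white pixel."""
    bits = ""
    for row in pixels:          # start with the top row
        for pixel in row:       # work across from left to right
            bits += "0" if pixel == "black" else "1"
    return bits

print(image_to_binary(image))  # 010111010
```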
What format does sound need to be for a computer to understand it?
Sound needs to be converted from analogue waves to a digital format
What is sampling for sound?
When sound is recorded by a computer its amplitude is recorded at regular intervals.
The value of the amplitude at each sample is stored as a binary value.
The number of bits used to store each sample is known as the sample size.
The number of samples taken per second is known as the sampling rate.
How do you increase the quality of audio?
Increasing the sampling rate will increase the quality of the audio.
Increasing the sample size will also increase the quality of the audio.
Unfortunately, increasing these makes the file size larger.
What is bitrate?
The bit rate is the amount of data stored per second of audio.
How do you calculate bitrate?
Bit rate = sampling rate × sample size.
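A worked example of the bit rate calculation. The figures (44,100 samples per second, 16 bits per sample, one channel, 60 seconds) are assumptions chosen for illustration.

```python
def bit_rate(sampling_rate, sample_size):
    """Bits stored per second = samples per second x bits per sample."""
    return sampling_rate * sample_size

rate = bit_rate(44_100, 16)
print(rate)             # 705600 bits per second

seconds = 60
print(rate * seconds // 8)  # 5292000 bytes for one minute of audio
```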
What does compression help with?
Compression helps reduce the size of files so we can store more data
What is lossless compression?
Lossless compression is when none of the original data is lost.
An algorithm can be used to perfectly restore the original file when needed.
Lossless compression causes file size to reduce moderately.
When is lossless compression useful?
Lossless compression is especially useful for executable files, where all of the data is necessary.
What is lossy compression?
An algorithm is applied to remove unnecessary detail from the original file.
Some data is permanently lost, but enough remains so that the file is still useful and there is barely a noticeable difference.
Lossy compression results in dramatic file size reduction.
What is Run Length encoding (RLE)?
RLE is a form of lossless compression that replaces repeating sequences of zeroes and ones with more efficient representations.
Each repeating string will be replaced by a code which represents the character and the number of times it is to be repeated.
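A minimal sketch of RLE on a string of bits. Storing each run as a (count, symbol) pair is just one possible representation, and the function names are hypothetical.

```python
def rle_encode(bits):
    """Replace each run of repeated symbols with a (count, symbol) pair."""
    runs = []
    i = 0
    while i < len(bits):
        j = i
        while j < len(bits) and bits[j] == bits[i]:
            j += 1                        # move to the end of the current run
        runs.append((j - i, bits[i]))
        i = j
    return runs

def rle_decode(runs):
    """Rebuild the original data exactly, showing that RLE is lossless."""
    return "".join(symbol * count for count, symbol in runs)

encoded = rle_encode("0000011100")
print(encoded)               # [(5, '0'), (3, '1'), (2, '0')]
print(rle_decode(encoded))   # 0000011100
```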
What is image compression?
Image compression is when pixels that are similar colours are grouped to create one average colour
The RLE algorithm is then run on the new image
The technique is lossy compression
What is Huffman coding?
Huffman coding is a lossless text compression algorithm which is most commonly used for long pieces of text data
How does Huffman coding work?
Huffman coding works by assigning fewer bits to the most frequently used characters
How do you create a Huffman tree?
List the characters in ascending order of frequency, and write their frequency alongside them.
Pair up the lowest frequency letters at the bottom of the tree:
For each pair, join them to a higher node with a value of the combined frequency.
Repeat this process until every node is joined under a single top node
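One common way to carry out these steps in code is to keep joining the two lowest-frequency nodes using a priority queue. The sketch below uses Python's `heapq` and `Counter`; the tree layout (a letter is a string, a joined node is a `(left, right)` tuple) and the function name are assumptions for this example, and the text must not be empty.

```python
import heapq
from collections import Counter

def build_huffman_tree(text):
    """Return (total frequency, tree) for a non-empty piece of text."""
    frequencies = Counter(text)
    # Heap entries are (frequency, tie_breaker, node); the tie_breaker is a
    # unique number so that two nodes are never compared directly.
    heap = [(freq, i, char)
            for i, (char, freq) in enumerate(sorted(frequencies.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        freq_a, _, node_a = heapq.heappop(heap)   # lowest frequency
        freq_b, _, node_b = heapq.heappop(heap)   # next lowest frequency
        # Join the pair under a new node with the combined frequency.
        heapq.heappush(heap, (freq_a + freq_b, next_id, (node_a, node_b)))
        next_id += 1
    return heap[0][0], heap[0][2]

total, tree = build_huffman_tree("aaaabbc")
print(total)  # 7
print(tree)   # (('c', 'b'), 'a')
```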
How do you convert a Huffman tree to binary?
Starting at the top, traverse the tree to a letter node:
For each left branch, append a 0 to the letter’s binary string.
For each right branch, append a 1 to the letter’s binary string.
Use each letter’s reduced binary code to represent the original text
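The traversal can be sketched with a small hand-made tree. The tree layout (a letter is a string, a joined node is a `(left, right)` tuple) and the name `assign_codes` are assumptions for this example.

```python
# Hypothetical tree for the text "aaaabbc".
tree = (("c", "b"), "a")

def assign_codes(node, prefix="", codes=None):
    """Walk from the top of the tree, adding 0 for left and 1 for right."""
    if codes is None:
        codes = {}
    if isinstance(node, str):            # reached a letter node
        codes[node] = prefix or "0"      # a one-letter tree still needs a code
    else:
        left, right = node
        assign_codes(left, prefix + "0", codes)
        assign_codes(right, prefix + "1", codes)
    return codes

codes = assign_codes(tree)
print(codes)    # {'c': '00', 'b': '01', 'a': '1'}

encoded = "".join(codes[letter] for letter in "aaaabbc")
print(encoded)  # 1111010100
```

The most frequent letter (a) gets the shortest code, so the 7 letters take 10 bits in total.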
How do you read Huffman code?
For each letter, begin at the top of the tree and use the binary string as a set of directions to reach the next letter:
For each 0, go to the left.
For each 1, go to the right.
Once you reach a letter node, you can find the next letter by restarting this process from the top of the tree.
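Reading the code back can be sketched in the same way. The hand-made tree (a letter is a string, a joined node is a `(left, right)` tuple) and the name `huffman_decode` are assumptions for this example.

```python
# Hypothetical tree for the text "aaaabbc".
tree = (("c", "b"), "a")

def huffman_decode(bits, tree):
    """Follow the bits as directions: 0 goes left, 1 goes right."""
    decoded = ""
    node = tree                                   # begin at the top of the tree
    for bit in bits:
        node = node[0] if bit == "0" else node[1]
        if isinstance(node, str):                 # reached a letter node
            decoded += node
            node = tree                           # restart from the top
    return decoded

print(huffman_decode("1111010100", tree))  # aaaabbc
```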
What is Huffman coding best for?
Long Data
Text data
Infrequently accessed data
A left branch of a Huffman tree should be labelled with which number?
0