Data Representation Flashcards
Data
Raw numbers, symbols, images, sounds, etc.
Unprocessed
Meaningless
Information
Interpreted and processed data
With context and meaning
Knowledge
Information to which human experience has been applied
Knowledge Base
A collection of knowledge
Static Data
Data that does not normally change
e.g. A person's date of birth
Dynamic Data
Data that changes automatically without user intervention
e.g. Live score, stock price
Static information source
e.g. Published book, CD
Information does not change on a regular basis
Information may go out of date
Information can be viewed offline since no update
More likely to be accurate, since it has been available for a long time and there has been more chance to check it
Dynamic information source
e.g. Websites
Information is updated automatically
More likely to be up to date
Network connection is required to update
Data may have been produced very quickly and may contain errors
Direct Data Source
Data collected to serve that particular purpose
e.g. Census to collect statistical data of residents
Indirect Data source
Data that was collected for another purpose; usually whoever uses the data is different from whoever collected it
e.g. A census is carried out by the government, but a company can use the census data for a different purpose
Direct Data Source (vs)
Data is relevant since it is collected for that purpose
Original source is known, so it can be trusted
Takes a long time to gather
Data is likely up to date
Data can be collected and presented in the required format
Indirect Data Source (vs)
Some required data may not exist
Original source may not be known, so the data may not be reliable
Data is immediately available (someone collected already)
Data may not be up to date
Data may exist in a different format and need time to process
Attributes of (quality) information
Accuracy
Relevance: How closely the data relates to its purpose
Age: How up to date the data is; data that is too old may no longer be valid
Level of Detail: The right amount of detail for the purpose. Too little and there may not be enough information to use; too much and it may be hard to extract the information needed
Completeness: Whether all the required information is provided
Coding
Representing data by assigning a code to it for classification or identification
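A minimal sketch of coding, assuming a made-up set of department codes (the codes and names below are hypothetical, not from any standard). Each short code stands for one longer description, which keeps stored data compact and easy to classify:

```python
# Hypothetical codes chosen only for illustration.
DEPARTMENT_CODES = {
    "HR": "Human Resources",
    "IT": "Information Technology",
    "FIN": "Finance",
}


def describe(code):
    """Turn a stored code back into its full description."""
    return DEPARTMENT_CODES[code]


def code_for(description):
    """Find the code assigned to a full description."""
    for code, name in DEPARTMENT_CODES.items():
        if name == description:
            return code
    raise KeyError(description)
```

Storing "FIN" instead of "Finance" saves space and avoids spelling variations, at the cost of needing the lookup table to interpret the data.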
Encoding
Storing data in a specific format
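A short illustration of encoding, using text as the data: the same string is stored as different bytes depending on the format chosen. UTF-8 and UTF-16 are picked here only as familiar examples.

```python
def encode_text(text, encoding):
    """Store text as bytes in the given format."""
    return text.encode(encoding)


def decode_text(raw, encoding):
    """Read the bytes back; the same format must be used."""
    return raw.decode(encoding)


word = "café"
as_utf8 = encode_text(word, "utf-8")       # 'é' takes two bytes in UTF-8
as_utf16 = encode_text(word, "utf-16-le")  # every character here takes two bytes
```

Decoding with the wrong format gives garbled text or an error, which is why the format has to be known by whoever reads the data.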
Encryption
Scrambling data to make it unreadable if intercepted
Data is encrypted using an encryption algorithm and a key
Symmetric Key encryption
Encryption and decryption use the same key
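A toy sketch of the "same key both ways" idea, using a repeating XOR. This is an assumption made for illustration only: XOR with a short repeating key is NOT secure, and real systems use vetted algorithms such as AES. The function name is hypothetical.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR each byte with the repeating key.

    Applying it twice with the same key returns the original data,
    so the one function both encrypts and decrypts. Not secure.
    """
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


secret_key = b"k3y"
ciphertext = xor_cipher(b"meet at noon", secret_key)
plaintext = xor_cipher(ciphertext, secret_key)
```

Because sender and receiver share one key, the main practical problem with symmetric encryption is getting that key to the other side safely.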
Asymmetric key encryption
Also referred to as public key encryption
Encrypt with public key, decrypt with private key
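The public key can encrypt but cannot decrypt
The public and private keys are generated together as a mathematically linked pair; the private key cannot feasibly be worked out from the public key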
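A toy sketch of a public/private key pair, using textbook RSA with tiny primes. The numbers are an assumption chosen to keep the arithmetic readable; real keys are thousands of bits long and use padding, so this must not be used for anything real. It needs Python 3.8 or later for the modular inverse.

```python
# Tiny primes for illustration only; real RSA primes are enormous.
P, Q = 61, 53
N = P * Q                  # modulus, part of both keys
PHI = (P - 1) * (Q - 1)    # used only while generating the keys
E = 17                     # public exponent
D = pow(E, -1, PHI)        # private exponent: modular inverse of E

PUBLIC_KEY = (E, N)        # can be shared with anyone
PRIVATE_KEY = (D, N)       # must be kept secret


def encrypt(message: int, public_key) -> int:
    """Encrypt a number smaller than N with the public key."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, private_key) -> int:
    """Recover the number with the matching private key."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


ciphertext = encrypt(65, PUBLIC_KEY)
recovered = decrypt(ciphertext, PRIVATE_KEY)
```

Anyone can encrypt with the public key, but only the holder of the private key can reverse it, so no secret has to be shared in advance.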
SSL/TLS
Secure protocols used in HTTPS
Establish secure, trusted and encrypted communication between browser and web server
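A minimal sketch of how a client might set up a TLS connection with Python's standard `ssl` module. The helper names `make_tls_context` and `fetch_tls_version` are hypothetical, and requiring TLS 1.2 as a minimum is an assumption rather than something stated above.

```python
import socket
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client context that verifies the server's certificate."""
    context = ssl.create_default_context()
    # Assumption: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def fetch_tls_version(host: str, port: int = 443):
    """Connect to host and report the negotiated protocol version."""
    context = make_tls_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        # server_hostname lets the certificate be checked against the host name.
        with context.wrap_socket(sock, server_hostname=host) as tls_sock:
            return tls_sock.version()
```

The default context checks both the certificate chain (trusted) and the host name (the right server) before any encrypted data is exchanged.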
Codec
Short for coder/decoder
A particular method to encode multimedia data
Validation
Automatically checking data according to predefined rules
- Presence
- Format
- Lookup
- Range
- Type
- Length
- Consistency
- Check Digit
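The checks listed above can be sketched as one small function each. The rules used here (the product-code pattern, the list of titles, ISBN-13 as the check digit scheme) are assumptions picked for illustration; real rules depend on the data being entered.

```python
import re
from datetime import date

VALID_TITLES = {"Mr", "Mrs", "Ms", "Dr"}  # hypothetical lookup list


def presence_check(value):
    """Presence: the field must not be left empty."""
    return value is not None and str(value).strip() != ""


def format_check(value, pattern=r"[A-Z]{2}\d{4}"):
    """Format: must match a pattern (here two capitals then four digits)."""
    return re.fullmatch(pattern, value) is not None


def lookup_check(value, allowed=VALID_TITLES):
    """Lookup: must be one of a fixed list of accepted values."""
    return value in allowed


def range_check(value, low, high):
    """Range: must lie between a lower and an upper limit."""
    return low <= value <= high


def type_check(value, expected_type):
    """Type: must be of the expected data type."""
    return isinstance(value, expected_type)


def length_check(value, min_len, max_len):
    """Length: must have an acceptable number of characters."""
    return min_len <= len(value) <= max_len


def consistency_check(start, end):
    """Consistency: related fields must agree (end not before start)."""
    return start <= end


def check_digit_ok(isbn13):
    """Check digit: recompute the ISBN-13 check digit and compare."""
    if not re.fullmatch(r"\d{13}", isbn13):
        return False
    digits = [int(c) for c in isbn13]
    # Weights alternate 1, 3, 1, 3, ... over the first twelve digits.
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]
```

Passing every check only shows that the data is reasonable, not that it is correct: a date of birth can be in range and well formatted and still be the wrong date.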
Verification
Ensuring that the data entered matches the original source
- Double entry
- Visual Check
Often done manually
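Double entry is the verification method that is easiest to automate: the data is entered twice and the two entries are compared. A minimal sketch, with a hypothetical function name:

```python
def double_entry_matches(first_entry: str, second_entry: str) -> bool:
    """Accept the data only if both entries are exactly the same."""
    return first_entry == second_entry


def first_difference(first_entry: str, second_entry: str):
    """Return the position of the first mismatch, or None if identical."""
    for index, (a, b) in enumerate(zip(first_entry, second_entry)):
        if a != b:
            return index
    if len(first_entry) != len(second_entry):
        # One entry is a prefix of the other; they differ where the shorter ends.
        return min(len(first_entry), len(second_entry))
    return None
```

This is why sign-up forms ask for a new password or email address twice: a typing mistake is unlikely to be repeated identically.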
Proofreading
Checking accuracy of data manually
Audio representation
An audio file records the “loudness” (amplitude) of a sound over time. Recording the analog loudness as digital numbers is called sampling
Sampling Rate
How often the “loudness” is recorded. Usually measured in Hz (samples per second)
CD quality sound is recorded at 44.1 kHz, or 44,100 times per second
The higher the rate, the better the quality, but the more storage is needed
Sample Resolution (bit Depth)
How many “steps” are available when the analog loudness is converted into digital numbers.
e.g. CD audio has 16-bit resolution, meaning that the loudness is recorded in 65,536 discrete steps (2^16)
Again, the higher the sample resolution the better quality (and more storage)
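The effect of sampling rate and sample resolution on storage can be worked out directly. The sketch below is an illustration with hypothetical function names; it samples a sine wave as a stand-in for a real analog signal, and the two-channel (stereo) figure in the usage lines is an added assumption about the recording.

```python
import math


def quantisation_levels(bit_depth: int) -> int:
    """Number of distinct steps available at a given sample resolution."""
    return 2 ** bit_depth


def quantise(sample: float, bit_depth: int) -> int:
    """Round an analog value in [-1.0, 1.0] to the nearest integer step.

    Uses a symmetric signed range, e.g. -32767..32767 for 16 bits.
    """
    max_step = 2 ** (bit_depth - 1) - 1
    clipped = max(-1.0, min(1.0, sample))
    return round(clipped * max_step)


def sample_sine(frequency_hz, sample_rate_hz, duration_s, bit_depth):
    """Record a sine wave sample_rate_hz times per second."""
    count = round(sample_rate_hz * duration_s)
    return [
        quantise(math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz), bit_depth)
        for n in range(count)
    ]


def uncompressed_size_bytes(sample_rate_hz, bit_depth, channels, duration_s):
    """Storage = samples per second x bits per sample x channels x seconds / 8."""
    return sample_rate_hz * bit_depth * channels * duration_s // 8


cd_levels = quantisation_levels(16)
one_minute_stereo = uncompressed_size_bytes(44100, 16, 2, 60)  # 10,584,000 bytes
```

Doubling either the sampling rate or the sample resolution doubles the uncompressed size, which is the storage cost of the extra quality.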