Paper 2 Unit 6 Flashcards
What is data?
Raw facts, observations or measurements that have little meaning on their own.
What is information?
Processed and organised data that has meaning and context. It is derived from data through interpretation, analysis and contextualisation.
What is knowledge?
Knowledge goes beyond information and represents the understanding, insights and expertise gained from information and experience.
What is human readable data?
Data in a form that people can read and interpret directly, such as a block of free text; it is often unstructured, which makes it hard for programs to process reliably
What is machine readable data?
Structured data, such as a set of instructions, that can be processed by computer programs
What is big data?
Large, complex, and layered groups of data that can be analysed to spot patterns and trends
Define a data type.
A classification that determines how a value is stored and which operations are valid on it (string, integer, float etc.)
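As a small illustration, the sketch below shows the same value held as different data types. The variable names are made up for this example.

```python
# The same "42" held as four different data types.
values = ["42", 42, 42.0, True]

# type(v).__name__ gives the name of each value's data type.
type_names = [type(v).__name__ for v in values]

# The data type decides what an operation means:
text_result = "42" + "1"   # strings are joined together
number_result = 42 + 1     # integers are added

print(type_names)
print(text_result, number_result)
```

The same `+` operator joins text but adds numbers, which is why the data type of a field needs to be known before processing it.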
What is data wrangling?
The process of transforming raw data into a format suitable for its intended purpose
Name the stages of data wrangling
Discover, structure, clean, enrich, validate, share
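A minimal sketch of some of these stages, assuming a tiny made-up data set. The field names, the `wrangle` function and the validation rule are hypothetical, chosen only to show the idea. Discover (inspecting the raw data) and share (passing the result on) happen outside the function.

```python
# Hypothetical raw records; names and values are invented for illustration.
raw_rows = [
    {"name": " Alice ", "age": "34"},
    {"name": "Bob", "age": ""},       # missing age
    {"name": "carol", "age": "29"},
]

def wrangle(rows):
    """Structure, clean, enrich and validate a list of raw records."""
    result = []
    for row in rows:
        # Structure: put each field into a consistent form.
        name = row["name"].strip().title()
        age_text = row["age"].strip()
        # Clean: drop records where the age is missing.
        if not age_text:
            continue
        age = int(age_text)
        # Validate: keep only ages in a plausible range (assumed rule).
        if not 0 <= age <= 120:
            continue
        # Enrich: add a field derived from the existing data.
        result.append({"name": name, "age": age, "is_adult": age >= 18})
    return result

wrangled = wrangle(raw_rows)
print(wrangled)
```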
Why do organisations need data?
To analyse market trends and identify patterns, analyse system performance, monitor users, target marketing, inform decision making, and assess threats and opportunities
How is data generated?
Human input, AI, sensors, Internet of Things, Transactional data
Name different data formats
ASCII, CSV, fixed width text file, XML, JSON
What are the benefits and drawbacks of ASCII?
Benefits: Standard format supported by virtually all computer systems; represents the standard English alphabet, digits and common punctuation
Drawbacks: Limited to 128 characters; largely superseded by Unicode, which includes other alphabets and symbols and so can be used more widely
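The character limit can be seen in a short sketch. The sample strings are chosen for illustration; `str.encode` is Python's built-in method for converting text to bytes.

```python
text_english = "data"
text_accented = "café"   # contains a character outside ASCII

# Every character in "data" has an ASCII code, so this succeeds.
ascii_bytes = text_english.encode("ascii")

# "é" has no ASCII code, so encoding as ASCII fails.
try:
    text_accented.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8, a Unicode encoding, can store the same text.
utf8_bytes = text_accented.encode("utf-8")

print(ascii_bytes, ascii_ok, utf8_bytes)
```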
What are the benefits and drawbacks of CSV?
Benefits: Common format understood by most applications
Drawbacks: No single enforced standard; delimiters other than the comma can be used (tab is common, giving the widely used TSV format), so the delimiter must be known before a file can be read correctly
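The delimiter issue can be sketched with Python's standard `csv` module. The sample data and the `read_delimited` helper are made up for this example.

```python
import csv
import io

# The same small table, once comma-separated and once tab-separated.
comma_text = "name,age\nAlice,34\n"
tab_text = "name\tage\nAlice\t34\n"

def read_delimited(text, delimiter):
    """Parse delimited text into a list of rows (hypothetical helper)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

comma_rows = read_delimited(comma_text, ",")
tab_rows = read_delimited(tab_text, "\t")

# Reading the tab-separated text with the wrong delimiter does not
# raise an error; each line simply comes back as one unsplit field.
wrong_rows = read_delimited(tab_text, ",")

print(comma_rows)
print(tab_rows)
print(wrong_rows)
```

Because the wrong delimiter fails silently, the mistake may only be noticed later, when the fields are used.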
What are the benefits and drawbacks of Fixed Width data formats?
Benefits: For very large data files it’s easy to calculate the location of data to retrieve it since the length is fixed
Drawbacks: Fixed sizes for fields, padding character and alignment need to be known before data can be retrieved accurately, needs to be carefully planned before setting up and saving data
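A minimal sketch of both points, assuming a made-up layout of a 10-character name followed by a 3-character age. The widths, padding and helper names are assumptions for illustration only.

```python
# Assumed layout: name padded on the right to 10 characters,
# age padded on the left to 3 characters.
NAME_WIDTH = 10
AGE_WIDTH = 3
RECORD_WIDTH = NAME_WIDTH + AGE_WIDTH

def make_record(name, age):
    """Build one fixed-width record using space padding."""
    return name.ljust(NAME_WIDTH) + str(age).rjust(AGE_WIDTH)

# Records are stored back to back with no separators.
data = make_record("Alice", 34) + make_record("Bob", 7) + make_record("Carol", 29)

def read_record(data, index):
    """Jump straight to record number `index` by calculating its position."""
    start = index * RECORD_WIDTH
    record = data[start:start + RECORD_WIDTH]
    name = record[:NAME_WIDTH].rstrip()       # remove the padding
    age = int(record[NAME_WIDTH:].strip())
    return name, age

print(read_record(data, 1))
```

The position of any record is found by multiplication alone, but only because the widths, padding character and alignment are known in advance.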
What are the benefits and drawbacks of XML?
Benefits: Platform independent so can be used on any system, supports Unicode so can cope with data in many languages, can be displayed in a GUI using HTML
Drawbacks: Requires opening and closing tags around each item of data, making files large
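The effect of tags on file size can be sketched by storing one made-up record as both XML and JSON, using Python's standard `xml.etree.ElementTree` and `json` modules. The element and key names are invented for this example.

```python
import json
import xml.etree.ElementTree as ET

# One record stored as XML: every value needs an opening and closing tag.
xml_text = "<people><person><name>Alice</name><age>34</age></person></people>"

# The same record stored as JSON.
json_text = json.dumps({"people": [{"name": "Alice", "age": 34}]})

# Both formats can be parsed back into the same values.
root = ET.fromstring(xml_text)
xml_name = root.find("person/name").text
xml_age = int(root.find("person/age").text)

json_person = json.loads(json_text)["people"][0]

print(len(xml_text), len(json_text))
print(xml_name, xml_age, json_person)
```

For this record the XML text is longer than the JSON text, because each tag name appears twice.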