Data Standards Flashcards
Data stewardship
Careful, responsible management of something entrusted to one’s care (in this case data) on behalf of others
How can data stewardship be enforced?
By assigning named people responsibility for deciding and acting on how data is stored and accessed
Data governance policy
Determines how an organisation collects/stores/uses data
Data governance policies should…
Comply with relevant laws and cover the entire life cycle of data (from collection to deletion)
Four pillars of data governance
- Stewardship
- Quality
- Management
- Use cases
Why is data governance important?
It helps to prevent data breaches, in which sensitive data is leaked and then misused (e.g. for blackmail or identity theft)
General Data Protection Regulation (GDPR)
- You should collect the minimum amount of data needed
- You should only collect relevant data
- Steps should be taken to protect data and report breaches
- Data should be retained for the shortest time possible
- People can request access to the data held about them and request that it is deleted
True or false: Anonymous data is not protected under GDPR
True! Data that can’t be linked to a person isn’t protected. However, pseudonymous data is protected, as it can be reverse-engineered to identify someone.
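Example (a minimal sketch, not a legal test): the code below illustrates why pseudonymous data can still identify someone. The key, the function names and the email addresses are all hypothetical; the keyed-hash approach is just one assumed way of pseudonymising an identifier.

```python
import hashlib
import hmac
from typing import List, Optional

# Hypothetical secret key; in practice it would be stored apart from the data.
SECRET_KEY = b"example-key-kept-separately"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a keyed hash (a pseudonym)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def reidentify(pseudonym: str, candidates: List[str],
               key: bytes = SECRET_KEY) -> Optional[str]:
    """Anyone holding the key can recompute pseudonyms and find the match."""
    for candidate in candidates:
        if hmac.compare_digest(pseudonymise(candidate, key), pseudonym):
            return candidate
    return None


pseudonym = pseudonymise("alice@example.com")
match = reidentify(pseudonym, ["bob@example.com", "alice@example.com"])
```

Because the pseudonym can be linked back to a person by whoever holds the key, the data is not anonymous.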
True or false: Under the GDPR, organisations cannot share your data with third parties for any reason
False! Data can be shared with other organisations in certain circumstances (e.g. for a criminal investigation).
What makes data valuable?
o Relevance of the data
o Correctness of the data
o Potential to make money
What costs can come with data?
o Storing and retrieving data
o Ensuring the data is appropriately protected
o Hardware and software costs
o Staff costs
o Legal costs
Thematic content analysis
Categorising data based on themes
Data versioning
Any changes made to data should be recorded and the original copy retained
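Example (a minimal sketch): one possible way to record every change while retaining the original copy. The class name VersionedRecord and its methods are hypothetical, invented for illustration; real systems would more likely rely on a database or a dedicated versioning tool.

```python
from copy import deepcopy
from datetime import datetime, timezone


class VersionedRecord:
    """Append-only history: versions are added, never overwritten."""

    def __init__(self, data: dict):
        self._versions = []
        self._append(data, "original")

    def _append(self, data: dict, note: str) -> None:
        self._versions.append({
            "data": deepcopy(data),
            "note": note,
            "changed_at": datetime.now(timezone.utc).isoformat(),
        })

    def update(self, changes: dict, note: str) -> None:
        """Record a change as a new version built from the latest one."""
        new_data = self.current
        new_data.update(changes)
        self._append(new_data, note)

    @property
    def current(self) -> dict:
        return deepcopy(self._versions[-1]["data"])

    @property
    def original(self) -> dict:
        return deepcopy(self._versions[0]["data"])

    def history(self) -> list:
        """Version numbers with the note explaining each change."""
        return [(i, v["note"]) for i, v in enumerate(self._versions)]


record = VersionedRecord({"name": "Ada", "city": "Leeds"})
record.update({"city": "York"}, "moved house")
```

After the update, record.current holds the new city while record.original still returns the first copy.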
How can data quality be improved?
o Thematic content analysis
o Merging data sources
o Recording relevant metadata
Information life cycle
- Tier 1 - Peak value, should be processed and interpreted to maximise value
- Tier 2 - New, unprocessed data, or older data that may not be as relevant
- Tier 3 - Old data that is to be archived and is unlikely to be useful
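Example (a rough sketch): the tiers above could be assigned automatically. The age thresholds (90 and 730 days), the processed flag and the function name assign_tier are assumptions made for illustration; the tiers themselves are defined by value, not by fixed ages.

```python
def assign_tier(age_days: int, processed: bool,
                peak_days: int = 90, archive_days: int = 730) -> int:
    """Return 1, 2 or 3 for a piece of data (thresholds are hypothetical)."""
    if age_days > archive_days:
        return 3  # old data, to be archived
    if processed and age_days <= peak_days:
        return 1  # recent and processed, so treated as peak value
    return 2      # new but unprocessed, or older and less relevant


tiers = [
    assign_tier(10, processed=True),
    assign_tier(10, processed=False),
    assign_tier(200, processed=True),
    assign_tier(1000, processed=True),
]
```

A rule like this could decide where data is stored, with Tier 1 kept easiest to access and Tier 3 moved to an archive.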
Data should be available at the right time and in the right way. How can this be done?
o Recent/useful data should be easy to access
o Older/less useful data should be stored for possible future use
o Metadata should be used where appropriate
Organisational enablers
o Highly focused business strategy
o Aligned IT and business strategy
o Centralised IT structure
Organisational inhibitors
o Complex mixture of products and services
o Misaligned strategies
o Decentralised IT structure
Industry enablers
o Regulations, especially those that apply to multiple regions
o Predictable rate of data growth
o Using industry-wide data standards
Industry inhibitors
o Regulations that vary by region
o Lack of industry-wide data standards
Technological enablers
o Promoting strategic use of IT
o Standardisation
Technological inhibitors
o Data hoarding
o Legacy/outdated IT systems
How can a data governance policy be defined?
o A single document
o Spread across multiple groups depending on how they use data
o Incorporated into staff training
o Built into standard working practices (e.g. through data management software)