21-30 Flashcards
Data Warehousing
The practice of centralising and analysing large amounts of data from multiple sources to help businesses make decisions.
Big Data
Extremely large and diverse collections of structured, unstructured, and semi-structured data that continue to grow exponentially over time.
Structured Data
Data that has a standardised format for efficient access by software and humans alike. It is typically tabular with rows and columns that clearly define data attributes.
Unstructured Data
Information that doesn’t fit into a standard format. Examples: binary files, images, videos, and audio files.
Semi-Structured Data
Data that combines elements of structured and unstructured data and is easier to analyse than unstructured data. Examples: JSON files, emails, and data held in NoSQL databases.
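To make the "easier to analyse" point concrete, here is a minimal sketch of turning JSON records into table-like rows. The sample records and field names are made up for illustration; they are not taken from the flashcards.

```python
import json

# Hypothetical JSON: each record has keys, but not every record has the same keys.
raw = '[{"name": "Ana", "email": "ana@example.com"}, {"name": "Ben", "phones": ["021", "022"]}]'

records = json.loads(raw)

# Collect every key that appears in any record to use as column names.
columns = sorted({key for rec in records for key in rec})

# Build one row per record; missing keys become None.
rows = [[rec.get(col) for col in columns] for rec in records]

print(columns)
for row in rows:
    print(row)
```

The keys give the data enough shape to line it up in columns, even though the records do not all share one fixed layout.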
Data Quality
A measure of how well a dataset meets the standards for accuracy, completeness, consistency, reliability, and timeliness.
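As a rough illustration, two of these standards can be checked with a few lines of Python. The records, field names, and the two checks below are assumptions chosen for the example, not a standard way of scoring data quality.

```python
# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "c@example.com"},
    {"id": 3, "email": "c@example.com"},
]

def completeness(rows, field):
    """Share of rows where the field is filled in (not None or empty)."""
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)

def duplicate_ids(rows):
    """IDs that appear more than once, a simple consistency check."""
    seen, dupes = set(), set()
    for r in rows:
        if r["id"] in seen:
            dupes.add(r["id"])
        seen.add(r["id"])
    return sorted(dupes)

email_completeness = completeness(records, "email")
dupes = duplicate_ids(records)
print(email_completeness, dupes)
```

Here three of the four records have an email address, and ID 3 is repeated, so the dataset falls short on both completeness and consistency.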
Data Governance
The policies, roles, and processes an organisation uses to manage and protect its data.
Data Privacy
The practice of protecting personal information and ensuring that individuals have control over how their data is used, shared, and accessed.
Data Security
Keeping the data itself safe from unauthorised access and theft, for example by hackers.
ETL
ETL stands for “Extract, Transform, and Load” and describes the set of processes to extract data from one system, transform it, and load it into a target repository.
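The three steps can be sketched in a few lines of Python. This is a minimal illustration only: the CSV text, the orders table, and the function names are hypothetical, with an in-memory SQLite database standing in for the target repository.

```python
import csv
import io
import sqlite3

# Hypothetical export from a source system; the last row has no amount.
SOURCE_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,20.00
3,carol,
"""

def extract(text):
    """Extract: read raw rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: tidy names, convert types, and skip rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue
        cleaned.append(
            (int(row["order_id"]), row["customer"].strip().title(), float(amount))
        )
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into the target (here, a SQLite table)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)

result = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(result)
```

Keeping each step in its own function mirrors the definition: the source and target can be swapped out without touching the transform logic.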