Module 4 Flashcards
It is the practice of collecting, keeping and using data securely, efficiently and cost-effectively.
Data management
Data is an ____?
asset
It is crucial for an organization’s success
Efficient data management
Relational databases, tables (MySQL)
Structured data
Emails, videos, social media content (NoSQL databases)
Unstructured data
Two types of data
Structured and unstructured data.
Needs specialized tools like text mining, sentiment analysis, and machine learning algorithms
Unstructured data
Ideal for transactional systems like customer records or sales databases.
Structured data
Ways of data storage and retrieval (databases and file systems)
Database Management Systems (MySQL, PostgreSQL, MongoDB)
Cloud Storage (AWS S3, Google Cloud Storage)
Query Languages for Data Retrieval (SQL for DBMS, MongoDB queries for NoSQL)
Techniques for faster data retrieval
Indexing, caching and optimization
The entire data lifecycle (CSPDA)
collection, storage, processing, dissemination, archiving
This enhances database performance by reducing the number of disk accesses needed to process a query.
Indexing
It is a data structure that allows quick data retrieval by creating indexes from specific database fields.
Indexing
They act as pointers to the data, similar to a book’s index, making queries faster and more efficient by providing a quick lookup method for the requested information.
Indexes
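Example (indexing): a minimal sketch of the cards above, using Python's built-in sqlite3 module. The customers table, its columns, and the index name idx_customers_email are hypothetical and chosen only for illustration. It asks SQLite how it plans to run the same query before and after an index is created on the searched field.

```python
import sqlite3

# In-memory database with a hypothetical "customers" table (illustrative names).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (email, city) VALUES (?, ?)",
    [(f"user{i}@example.com", f"city{i % 50}") for i in range(1000)],
)

query = "SELECT id FROM customers WHERE email = ?"
params = ("user500@example.com",)


def query_plan(sql, args):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, args).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " ".join(row[-1] for row in rows)


# Without an index, the plan is expected to describe a scan of the whole table.
plan_before = query_plan(query, params)

# Create an index on the field used in the WHERE clause.
conn.execute("CREATE INDEX idx_customers_email ON customers (email)")

# With the index, the plan is expected to mention idx_customers_email.
plan_after = query_plan(query, params)

print(plan_before)
print(plan_after)
```

The exact plan wording differs between SQLite versions, but the second plan should name the index, which is the "pointer" lookup the card describes.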
It is the process of temporarily storing copies of files or data in a cache for faster access.
Caching
It saves the data accessed the first time a user visits a website or opens an application, allowing quicker loading on subsequent visits by retrieving the stored data instead of downloading it again.
Caching
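Example (caching): a minimal sketch of the idea in the cards above, using functools.lru_cache from Python's standard library. The load_page function and the URL are hypothetical stand-ins for a slow download, not a real browser cache. The first call does the work, and the repeated call is served from the cache.

```python
import functools

fetch_count = 0  # counts how often the "slow" source is actually hit


@functools.lru_cache(maxsize=128)
def load_page(url):
    """Hypothetical stand-in for downloading a page (illustrative only)."""
    global fetch_count
    fetch_count += 1
    return f"<html>content of {url}</html>"


first = load_page("https://example.com/home")   # cache miss: fetched from the source
second = load_page("https://example.com/home")  # cache hit: returned from the cache

info = load_page.cache_info()
print(fetch_count, info.hits, info.misses)
```

The counter shows that the second visit did not trigger another fetch, which is why cached pages load faster on return visits.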
It is a fundamental process in the realm of information management that focuses on improving data sets to maximize their efficiency, utility, and accuracy.
Optimization
Refers to how well data meets the needs for its intended use.
Data Quality
Ensures data remains complete, accurate, and reliable over its lifecycle.
Protects data from unauthorized access or corruption.
Data Integrity
Key Dimensions of data quality
Accuracy, completeness, consistency, timeliness and validity
Data must be correct and free from errors
Accuracy
All required data should be present (no missing fields).
Completeness
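Example (completeness): a minimal sketch of a completeness check, assuming records are plain Python dictionaries. The field names in REQUIRED_FIELDS and the sample records are hypothetical. It lists which required fields are missing or empty in each record and computes the share of fully complete records.

```python
REQUIRED_FIELDS = ("name", "email", "amount")  # hypothetical required fields

records = [
    {"name": "Ana", "email": "ana@example.com", "amount": 120.0},
    {"name": "Ben", "email": "", "amount": 75.5},  # email present but empty
    {"name": "Cara", "amount": 30.0},              # email field missing
]


def missing_fields(record):
    """Return the required fields that are absent or empty in a record."""
    return [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]


# Map each incomplete record to the fields it lacks.
incomplete = {r["name"]: missing_fields(r) for r in records if missing_fields(r)}

# Share of records with every required field filled in.
completeness = 1 - len(incomplete) / len(records)

print(incomplete)
print(round(completeness, 2))
```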