Module 4 Flashcards

1
Q

It is the practice of collecting, keeping and using data securely, efficiently and cost-effectively.

A

Data management

2
Q

Data is an ___.

A

asset

3
Q

It is crucial for an organization’s success

A

efficient data management

4
Q

Relational databases, tables (MySQL)

A

Structured data

5
Q

Emails, videos, social media content (NoSQL databases)

A

Unstructured data

6
Q

Two types of data

A

Structured and unstructured data.

7
Q

Needs specialized tools like text mining, sentiment analysis, and machine learning algorithms

A

unstructured data.

8
Q

Ideal for transactional systems like customer records or sales databases.

A

Structured data

9
Q

Methods of data storage and retrieval (databases and file systems)

A

Database Management Systems (MySQL, PostgreSQL, MongoDB)

Cloud Storage (AWS S3, Google Cloud Storage)

Query Languages for Data Retrieval (SQL for relational DBMSs, MongoDB queries for NoSQL)

10
Q

Techniques for faster data retrieval

A

Indexing, caching and optimization

11
Q

The entire data lifecycle (CSPDA)

A

collection, storage, processing, dissemination, archiving

12
Q

This enhances database performance by reducing the number of disk accesses needed to process a query.

A

Indexing

13
Q

It is a data-structure technique that allows quick data retrieval by creating indexes from specific database fields.

A

Indexing

14
Q

They act as pointers to the data, similar to a book’s index, making queries faster and more efficient by providing a quick lookup method for the requested information.

A

Indexes
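
Note: a minimal sketch of how an index changes a lookup, using Python's built-in sqlite3 module. The `customers` table, its columns, and the index name are hypothetical, chosen only for illustration. It compares the query plan SQLite reports before and after the index is created.

```python
import sqlite3

# In-memory database with a hypothetical "customers" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (email, city) VALUES (?, ?)",
    [(f"user{i}@example.com", f"city{i % 50}") for i in range(10_000)],
)

def query_plan(sql, params=()):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan text

lookup = "SELECT id FROM customers WHERE email = ?"
args = ("user4242@example.com",)

plan_before = query_plan(lookup, args)  # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_customers_email ON customers (email)")
plan_after = query_plan(lookup, args)   # the index on email is now used

print(plan_before)
print(plan_after)
```

The exact plan wording varies between SQLite versions, but the first plan should describe a scan of the table and the second a search through `idx_customers_email`.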

15
Q

It is the process of temporarily storing copies of files or data in a cache for faster access.

A

Caching

16
Q

It saves the data accessed the first time a user visits a website or opens an application, allowing quicker loading on subsequent visits by retrieving the stored data instead of downloading it again.

A

Caching
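
Note: a minimal sketch of caching with Python's standard `functools.lru_cache`. The `load_page` function and the URL are hypothetical stand-ins for a slow download; the counter only shows how often the real work runs.

```python
from functools import lru_cache

calls = {"count": 0}  # how many times the "slow" work actually ran

@lru_cache(maxsize=128)
def load_page(url):
    """Stand-in for a slow download (hypothetical, no network access)."""
    calls["count"] += 1
    return f"<html>content of {url}</html>"

first = load_page("https://example.com/home")   # cache miss: does the work
second = load_page("https://example.com/home")  # cache hit: served from memory
info = load_page.cache_info()                   # hit/miss statistics

print(calls["count"], info.hits, info.misses)
```

The second call returns the stored copy, so the slow work runs only once for the same URL.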

17
Q

It is a fundamental process in the realm of information management that focuses on improving data sets to maximize their efficiency, utility, and accuracy.

A

Optimization

18
Q

Refers to how well data meets the needs of its intended use.

A

Data Quality

19
Q

Ensures data remains complete, accurate, and reliable over its lifecycle.
Protects data from unauthorized access or corruption.

A

Data Integrity

20
Q

Key Dimensions of data quality

A

Accuracy, completeness, consistency, timeliness and validity

21
Q

Data must be correct and free from errors

A

Accuracy

22
Q

All required data should be present (no missing fields).

A

Completeness

23
Q

Data should be uniform across systems and sources.

A

Consistency

24
Q

Data must be up-to-date and available when needed

A

Timeliness

25
Data should conform to expected formats and values
Validity
26
Causes of poor data quality and integrity
Human error, system integration issues, outdated data
27
Data entry mistakes are common.
Human error
28
Merging data from different sources can introduce inconsistencies.
System Integration Issues
29
Without regular updates, data becomes irrelevant or inaccurate.
Outdated data
30
Removing duplicates, correcting errors, and filling missing data.
Data Cleaning
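Note: a minimal sketch of the three cleaning steps named in this card (removing duplicates, correcting errors, filling missing data). The sample records, the `clean` helper, and the "Unknown" default are hypothetical.

```python
records = [
    {"id": 1, "name": " alice ", "city": "Manila"},
    {"id": 1, "name": " alice ", "city": "Manila"},  # duplicate of the first row
    {"id": 2, "name": "BOB", "city": None},          # missing city
]

def clean(rows, default_city="Unknown"):
    """Drop repeated ids, tidy names, and fill in missing cities."""
    seen = set()
    cleaned = []
    for row in rows:
        if row["id"] in seen:  # remove duplicates (keyed on id)
            continue
        seen.add(row["id"])
        cleaned.append({
            "id": row["id"],
            "name": row["name"].strip().title(),  # correct spacing and casing
            "city": row["city"] or default_city,  # fill missing data
        })
    return cleaned

cleaned = clean(records)
print(cleaned)
```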
31
A process in data cleaning that applies rules to check the format, range, and consistency of data at the time of entry.
Validation
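Note: a minimal sketch of entry-time validation with one rule each for format, range, and consistency. The field names, the simplified email pattern, and the 0 to 120 age range are assumptions made for illustration.

```python
import re
from datetime import date

# Deliberately simple email pattern; real-world rules are usually stricter.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate(entry):
    """Return a list of rule violations; an empty list means the entry passes."""
    errors = []
    if not EMAIL_RE.match(entry.get("email", "")):  # format check
        errors.append("email: bad format")
    if not 0 <= entry.get("age", -1) <= 120:        # range check
        errors.append("age: out of range")
    if entry["end_date"] < entry["start_date"]:     # consistency check
        errors.append("dates: end before start")
    return errors

good = {"email": "ana@example.com", "age": 30,
        "start_date": date(2024, 1, 1), "end_date": date(2024, 6, 1)}
bad = {"email": "not-an-email", "age": 150,
       "start_date": date(2024, 6, 1), "end_date": date(2024, 1, 1)}

good_errors = validate(good)
bad_errors = validate(bad)
print(good_errors, bad_errors)
```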
32
A process in data cleaning that implements policies and assigns roles (e.g., data stewards) to oversee data quality.
Governance
33
Data Integrity Methods
Encryption, audit trails, referential integrity
34
Ensures data security during storage and transmission.
Encryption
35
Tracks who accessed or changed data.
Audit Trails
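Note: a minimal sketch of an audit trail kept as an append-only list in memory. The `update_field` helper, the record layout, and the user name are hypothetical; a real system would typically write these entries to protected storage.

```python
from datetime import datetime, timezone

audit_log = []                       # append-only history of changes
record = {"id": 1, "balance": 100}   # hypothetical record being protected

def update_field(user, rec, field, new_value):
    """Change a field and log who changed it, what changed, and when."""
    audit_log.append({
        "user": user,
        "record_id": rec["id"],
        "field": field,
        "old": rec[field],
        "new": new_value,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    rec[field] = new_value

update_field("ana", record, "balance", 250)
print(audit_log[0]["user"], audit_log[0]["old"], audit_log[0]["new"])
```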
36
Ensuring relationships between tables in a database remain consistent.
Referential Integrity
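Note: a minimal sketch of referential integrity using a foreign key in Python's built-in sqlite3 module. The `customers` and `orders` tables are hypothetical. SQLite enforces foreign keys only when the `foreign_keys` pragma is turned on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    )
""")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ana')")
conn.execute("INSERT INTO orders (customer_id) VALUES (1)")  # valid: customer 1 exists

try:
    # Customer 99 does not exist, so this order would be an orphan row.
    conn.execute("INSERT INTO orders (customer_id) VALUES (99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(orphan_rejected, order_count)
```

The database refuses the second order, so every row in `orders` still points at an existing customer.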
37
Republic Act 10173
Data Privacy Act of 2012
38
Data Privacy Act of 2012
Republic Act 10173
39
“to protect the fundamental human right of privacy, of communication while ensuring free flow of information to promote innovation and growth”
Republic Act 10173 or Data Privacy Act of 2012
40
Refers to extremely large datasets that are too complex and voluminous for traditional data-processing tools to handle.
Big Data
41
5 V's of Big Data
Volume, Velocity, Variety, Veracity, Value
42
The amount of data (e.g., terabytes, petabytes)
Volume
43
The speed at which data is generated and processed (e.g., real-time data from social media).
Velocity
44
Different types of data (e.g., text, images, video, structured and unstructured data).
Variety
45
Data accuracy and trustworthiness
Veracity
46
The potential insights that can be gained from analyzing Big Data.
Value
47
The process of examining large datasets (Big Data) to uncover patterns, trends, and insights that can support decision-making.
Data Analytics
48
Types of Data Analytics:
Descriptive Analysis (What happened?)
Diagnostic Analysis (Why did it happen?)
Predictive Analysis (What will happen?)
Prescriptive Analysis (What should be done?)
49
Big Data Tools and Technologies
Distributed computing: Hadoop, Spark
Data storage: NoSQL databases like MongoDB or Cassandra
Data visualization: Tableau, Power BI, D3.js
50
A framework for distributed storage and processing of large datasets
Hadoop
51
A fast, in-memory processing engine for Big Data that supports real-time analytics
Spark
52
Tools like __________________ help present complex data in understandable visual formats like charts and graphs.
Tableau, Power BI, or D3.js
53
Applications of Big Data
Healthcare, finance, retail, entertainment
54
Challenges of Big Data
Data privacy and security, data integration, data quality
55
Handling sensitive data securely
Data Privacy and Security
56
Combining data from multiple sources.
Data Integration
57
Ensuring that the data is accurate and useful for analysis.
Data Quality