Data Repositories and File Formats Flashcards

1
Q

Data Repositories

A
  • Databases.
  • Relational Databases.
  • Non-Relational Databases (NoSQL).
  • Data Warehouses.
  • Data Marts.
  • Data Lakes.
  • Big Data Stores.
  • Cloud-based Relational Databases.
2
Q

File Formats

A
  • Delimited Text file format, or CSV.
  • Microsoft Excel Open XML Spreadsheet, or XLSX.
  • Extensible Markup Language, or XML.
  • Portable Document Format, or PDF.
  • JavaScript Object Notation, or JSON.
3
Q

Databases

A

A database is an organized collection of data used to store and manage information for specific purposes (transactions, queries, or reports). It is controlled by a database management system (DBMS).

4
Q

Data Warehouse

A

A data warehouse is a central repository that merges information from disparate sources. It consolidates the gathered data through the extract, transform, and load (ETL) process into one comprehensive database for analytics and business intelligence.

5
Q

Data Marts

A

A data mart is a subsection of the data warehouse. One data warehouse can contain multiple data marts, each holding the specific data that a group of users works with, depending on their business function, purpose, or community.

6
Q

Data Lakes

A

A data lake is a storage repository that can hold large amounts of structured, semi-structured, and unstructured data in its native (raw) format.

7
Q

Big Data Stores

A

A big data store is a storage system designed to efficiently store, retrieve, and analyze massive amounts of data in varied structures that are not stored in traditional relational databases.

8
Q

ETL process

A

ETL stands for Extract, Transform, and Load.
- Extract: collect data from different data sources.
- Transform: convert the data into a clean and usable state.
- Load: write the data into the data repository (in this case, the data warehouse).
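
The three steps can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the sample data in `RAW_CSV`, the `sales` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw source data; a real pipeline would read files, APIs, or databases.
RAW_CSV = "name,amount\n alice ,10\nBOB,\ncarol,7\n"

def extract(text):
    """Extract: read rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with a missing amount and normalize the rest."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append((row["name"].strip().title(), int(amount)))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the warehouse
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()
print(loaded)
```

Keeping each step in its own function makes it easy to swap the source or the target without touching the cleaning logic.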

9
Q

Data Pipeline

A

A data pipeline is a broader term that covers moving data from one system to another; the ETL process is one kind of data pipeline. A data pipeline does not have to transform the data, or it may transform the data only after loading it.

10
Q

Delimited Text (CSV)

A

A delimited text file stores data in rows and columns, with each value in a row separated from the next by a delimiter character.

11
Q

Delimiter Character

A

The delimiter can be a comma “,”, a tab “\t”, a colon “:”, a vertical bar “|”, or a space. In CSV (comma-separated values) files the delimiter is a comma, and in TSV (tab-separated values) files it is a tab.
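
As a small illustration, Python's standard `csv` module can read the same table with different delimiters. The helper name `parse_delimited` and the sample text are hypothetical.

```python
import csv
import io

def parse_delimited(text, delimiter):
    """Split delimited text into a list of rows, each a list of values."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

# The same two-row table written with three different delimiters.
csv_rows = parse_delimited("id,city\n1,Lima\n", ",")
tsv_rows = parse_delimited("id\tcity\n1\tLima\n", "\t")
pipe_rows = parse_delimited("id|city\n1|Lima\n", "|")

print(csv_rows)
print(csv_rows == tsv_rows == pipe_rows)
```

Only the `delimiter` argument changes; the parsed rows are the same in all three cases.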

12
Q

Extensible Markup Language (XML)

A

XML is a markup language and file format for storing, transmitting, and reconstructing arbitrary data (data of any kind).
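
A short sketch of reading XML with Python's standard `xml.etree.ElementTree` module follows. The `customers` document and its tag names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: tags describe the data they wrap.
XML_TEXT = (
    "<customers>"
    "<customer id='1'><name>Ana</name></customer>"
    "<customer id='2'><name>Luis</name></customer>"
    "</customers>"
)

root = ET.fromstring(XML_TEXT)

# Map each customer's id attribute to the text of its <name> child.
customers = {c.get("id"): c.findtext("name") for c in root.findall("customer")}
print(customers)
```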

13
Q

JavaScript Object Notation (JSON)

A

JSON is a text-based format for storing and exchanging data that is both human-readable and machine-parsable.
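
Both properties can be seen with Python's standard `json` module. The `record` below is a hypothetical example.

```python
import json

# Hypothetical record with nested data.
record = {"id": 1, "name": "Ana", "tags": ["new", "vip"]}

# Serialize to indented text that a person can read...
text = json.dumps(record, indent=2)
print(text)

# ...and parse the same text back into an equivalent object.
restored = json.loads(text)
print(restored == record)
```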

14
Q

Relational Databases

A

A relational database stores and provides access to data organized into tables (rows and columns), where tables can be linked, or related, to other tables that share common information.
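
The sketch below uses Python's built-in `sqlite3` module to show two tables related through a shared column. The `customers` and `orders` tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),  -- the common column
    total REAL
);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Luis');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 5.0), (12, 2, 8.5);
""")

# Join the two tables on the shared customer id and total each customer's orders.
totals = conn.execute("""
SELECT c.name, SUM(o.total)
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.id, c.name
ORDER BY c.name
""").fetchall()
print(totals)
```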

15
Q

Non-Relational Databases

A

It is a database design that provides flexible schemas for the storage and retrieval of structured, semi-structured, and unstructured data.

16
Q

Four types of NoSQL databases

A
  • Key-value store.
  • Document based.
  • Column based.
  • Graph based.
17
Q

Key-value store

A

A key-value database stores data as a collection of key-value pairs. The key represents an attribute of the data and serves as a unique identifier.

18
Q

Document Based

A

In a document-based database, each record and its associated data are stored within a single document.

19
Q

Column Based

A

In a column-based database, data is stored in cells grouped as columns rather than rows.
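
The difference in layout can be illustrated with plain Python structures. This is only a conceptual sketch with made-up data; real column-based databases add their own storage formats and grouping on top of this idea.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "city": "Lima", "sales": 10},
    {"id": 2, "city": "Quito", "sales": 7},
]

# Column-oriented layout: all values of one field are kept together.
columns = {key: [row[key] for row in rows] for key in rows[0]}
print(columns)

# Aggregating one field only has to touch that field's column.
total_sales = sum(columns["sales"])
print(total_sales)
```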

20
Q

Graph Based

A

A graph-based database uses a graph model (nodes connected by edges) to represent and store data.
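
A toy version of the idea, using a plain dictionary as the graph, might look like this. The names, relationship labels, and helper functions are hypothetical and do not reflect any particular graph database's API.

```python
from collections import defaultdict

# Hypothetical edges: (source node, relationship, target node).
edges = [
    ("Ana", "FOLLOWS", "Luis"),
    ("Luis", "FOLLOWS", "Mia"),
    ("Ana", "LIKES", "Post1"),
]

# Store outgoing edges per node.
graph = defaultdict(list)
for source, relation, target in edges:
    graph[source].append((relation, target))

def neighbors(node, relation):
    """Return the nodes reachable from `node` over edges labeled `relation`."""
    return [target for rel, target in graph.get(node, []) if rel == relation]

def follows_of_follows(node):
    """Traverse two FOLLOWS edges starting at `node`."""
    return [far for near in neighbors(node, "FOLLOWS")
            for far in neighbors(near, "FOLLOWS")]

print(neighbors("Ana", "FOLLOWS"))
print(follows_of_follows("Ana"))
```

Queries here are traversals along relationships, which is the kind of question graph models are meant to answer directly.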

21
Q

Database Management System (DBMS)

A

A DBMS is a set of programs that creates and maintains the database. It lets you store, modify, and extract information from the database through a function called querying.

22
Q

Cloud-based Relational Databases

A

They work like regular relational databases, but they draw on the virtually limitless compute and storage capabilities offered by the cloud to manage an organization's data.