1. Database Management - Segment 1 [Week 1 & 2 - Data and Data Sources] Flashcards

1
Q

What does data analysis rely on?

A

Data analysis relies on data and data sources.

2
Q

In what form is data collected from various data sources?

A

Data is collected in raw format.

3
Q

What are the different types of data?

A
  1. Scientific data
  2. Multimedia data
  3. Transactional data or structured data
  4. Relational data
  5. Web data
  6. Flat-file data
4
Q

What is scientific data?

A

Data that comes from sensors and scientific equipment.

5
Q

What is multimedia data?

A

Data that comes from cameras, satellite imagery, videos, and CCTV footage is referred to as multimedia data. It typically contains audio and video content recorded over a period of time.

6
Q

What is transactional data or structured data?

A

Data with a predefined structure, recorded at different timestamps.

7
Q

What is relational data?

A

Data organized in rows and columns (tables).
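
As a minimal sketch of the row-and-column idea, the snippet below builds a small table in an in-memory SQLite database using Python's standard `sqlite3` module. The table name `readings` and its columns are hypothetical, chosen only for illustration.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table: each column has a name and a type.
conn.execute("CREATE TABLE readings (sensor_id TEXT, temperature REAL)")

# Each tuple becomes one row of the table.
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("s1", 21.5), ("s2", 19.0), ("s1", 22.1)],
)

# Select rows by the value in one column.
rows = conn.execute(
    "SELECT sensor_id, temperature FROM readings "
    "WHERE sensor_id = ? ORDER BY temperature",
    ("s1",),
).fetchall()
print(rows)
conn.close()
```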

8
Q

What is web data?

A

Data collected from websites, typically by web scraping.

9
Q

What is flat-file data?

A

Data in CSV or Excel files stored on a local system.
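
For illustration, here is a small sketch of reading flat-file data with Python's standard `csv` module. The sample content and the column names `name` and `city` are made up; an in-memory string stands in for a local file so the example runs on its own.

```python
import csv
import io

# Stand-in for a local CSV file. With a real file, one would instead use
# open("some_file.csv", newline="") (the path here is hypothetical).
csv_text = "name,city\nAsha,Pune\nRavi,Delhi\n"

# DictReader uses the first line as the header and yields one dict per row.
reader = csv.DictReader(io.StringIO(csv_text))
records = list(reader)

for record in records:
    print(record["name"], record["city"])
```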

10
Q

What is big data?

A

Non-relational, unstructured data is generally referred to as big data.

11
Q

What are the types of storage based on connectivity?

A
  1. Direct-Attached Storage (DAS)
  2. Network-Attached Storage (NAS)
  3. Storage Area Network (SAN)
12
Q

What is DAS?

A

Direct-attached storage (DAS).
The file system and disk storage are directly connected and are available in the same physical location.

13
Q

What is NAS?

A

In network-attached storage (NAS), both the file system and the disk storage are remote and are accessed over the network.

14
Q

What is SAN?

A

In a storage area network (SAN), only the disk storage is remote. The file system resides on the host system itself and accesses the storage over the network.

15
Q

What is the basic difference between SAN and NAS?

A

In NAS, both the file system and the storage are on the remote side.
In SAN, only the storage is on the remote side; the file system is on the host system itself.

A NAS is typically a single storage device, while a SAN is a tightly coupled network of multiple devices.
NAS devices deliver shared storage as network-mounted volumes and use protocols such as NFS and SMB/CIFS,
while SAN-connected disks appear to the user as local drives.

16
Q

What are the types of storage based on the location of nodes?

A

There are two types:
  1. Warehouse storage / on-premises storage
  2. Cloud storage

17
Q

Define the types of storage based on the location of nodes.

A
  1. Warehouse storage / on-premises storage - Nodes are present in the same physical location. This ensures that data access is quick and that network delays do not impact applications.
  2. Cloud storage - Data is stored on cloud nodes. Cloud storage is often less expensive than physical storage. Real-time data can be ingested and stored directly in cloud storage, which scales in and out in response to data volume.
18
Q

What is the Hadoop model?

A

Hadoop is an open-source framework for processing large datasets. It uses its own file system, the Hadoop Distributed File System (HDFS). Internally, this file system can be connected to any type of storage model: DAS, NAS, or SAN.
HDFS provides an abstraction, so the storage appears as a locally attached disk.

19
Q

What is HDFS?

A

HDFS stands for Hadoop Distributed File System.
It is the file system used in the Hadoop model.

20
Q

What is a good solution for handling big data?

A

Hadoop.
It scales storage as the data continues to grow.

21
Q

What are the basic requirements of big data?

A
  1. A suitable type of storage
  2. The ability to handle large amounts of data
  3. The ability to keep scaling as the data continues to grow
22
Q

What is the processing of data?

A

Processing of data means transforming raw data into the required usable format.

23
Q

Why is data important?

A

The importance of data is defined in terms of how the data is going to be used:
1. **Improve people’s lives** - smartwatches, tracking applications
2. Making decisions - organisations use data to make decisions
3. Strategies - data is used for making strategies
4. Anticipating strategy and decision outcomes - based on the outcomes of earlier strategies
5. Monitor - data is used for real-time monitoring
6. Access resource - reusability of data, as in search engines and apps

24
Q

What are information and knowledge?

A

A piece of data gives information, and information gives knowledge about something.

25
What is wisdom?
Wisdom is the ability to make a judgement or decision based on knowledge acquired from information.
26
What is an example of data?
27
What is information?
Streamlining data into patterns gives information.
28
What is knowledge?
A well-organised body of information.
29
Describe data, information, knowledge, and wisdom.
**Data - 100. Information - 100 miles. Knowledge - 100 miles is quite far. Wisdom - walking 100 miles is difficult, but commuting by vehicle would be easier.**
30
What are the data collection methods?
  1. Oral history
  2. Online marketing, social media marketing
  3. Interviews
  4. Questionnaires
  5. Focus groups
  6. Observations
  7. Documents and records
  8. Logs stored on servers
31
What is data processing?
Transformation of raw data. This process includes:
  1. Filtering of data
  2. Segregation of data
  3. Normalisation of data
  4. Cleaning of data
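The four steps above can be sketched on a tiny in-memory dataset. This is only an illustration: the records, field names, and the 150 threshold are invented, and normalisation is interpreted here as min-max scaling, which is an assumption about what the card means.

```python
# Hypothetical raw records; amounts arrive as text and one is missing.
raw = [
    {"id": 1, "dept": "sales", "amount": "100"},
    {"id": 2, "dept": "hr", "amount": ""},
    {"id": 3, "dept": "sales", "amount": "300"},
    {"id": 4, "dept": "hr", "amount": "200"},
]

# Cleaning: drop records with a missing amount and convert text to numbers.
cleaned = [{**r, "amount": float(r["amount"])} for r in raw if r["amount"]]

# Filtering: keep only records that meet a condition (here, amount >= 150).
filtered = [r for r in cleaned if r["amount"] >= 150]

# Segregation: group record ids by department.
by_dept = {}
for r in cleaned:
    by_dept.setdefault(r["dept"], []).append(r["id"])

# Normalisation (assumed to mean min-max scaling into the range [0, 1]).
amounts = [r["amount"] for r in cleaned]
lo, hi = min(amounts), max(amounts)
normalised = [(a - lo) / (hi - lo) for a in amounts]

print(filtered, by_dept, normalised)
```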
32
What is a data bucket?
In data science, a bucket is a data warehouse that holds all the processed data.
33
What is data curation?
Data curation is the process of creating, organizing and maintaining data sets so they can be accessed and used by people looking for information.
34
How is knowledge discovered from processed data?
Knowledge is discovered from processed data by identifying patterns in the streamlined data.
35
How is discovered knowledge represented?
Discovered knowledge is represented in the form of reports, tables, and characterization rules.
36
What is a database?
A database is an organized collection of interrelated data.
37
What is data science?
Data science is the study of data to extract meaningful insights for business. It is a multidisciplinary approach that combines principles and practices from the fields of mathematics, statistics, artificial intelligence, and computer engineering to analyze large amounts of data.
38
What is an information system?
An information system (IS) is a formal, sociotechnical, organizational system designed to collect, process, store, and distribute information.
39
What is curated data?
Transformed and processed data is called curated data.
40
What is a data warehouse?
A data warehouse is a storage repository for structured, filtered data that has already been processed for a specific purpose.
41
What is a data lake?
A vast pool of raw data whose purpose is not yet defined. It is a superset of the data warehouse.
42
What is a data mart?
A subset of a data warehouse that contains repositories of summarised data collected for analysis of a specific section or unit. A data warehouse can contain any number of data marts.
43
Compare data lake, data warehouse, and data mart.
44
What is KDD?
Knowledge Discovery in Databases (KDD) is the process of discovering knowledge from a collection of data. It is a powerful and systematic technique for deriving value from raw data.
45
When was KDD formalised?
KDD was formalised in 1989.
46
What are the steps in KDD?
  1. Data selection / segmentation
  2. Data pre-processing
  3. Data transformation
  4. Data mining
  5. Interpretation of discovered data
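The five KDD stages can be sketched as a chain of small functions, each feeding the next. Everything here is hypothetical: the function names, the `temp` field, and the threshold are invented for illustration, and the "mining" step is only a summary statistic standing in for a real algorithm.

```python
from statistics import mean

def select(records, field):
    """Data selection: keep only records that contain the field of interest."""
    return [r for r in records if field in r]

def preprocess(records, field):
    """Pre-processing: drop records whose value is missing."""
    return [r for r in records if r[field] is not None]

def transform(records, field):
    """Transformation: reduce each record to the numeric value needed."""
    return [r[field] for r in records]

def mine(values):
    """Data mining (stand-in): summarise the values as a simple pattern."""
    return {"mean": mean(values), "max": max(values)}

def interpret(pattern, threshold):
    """Interpretation: turn the pattern into a statement a decision maker can use."""
    return "above threshold" if pattern["mean"] > threshold else "within threshold"

records = [{"temp": 20.0}, {"temp": None}, {"humidity": 40}, {"temp": 30.0}]
selected = select(records, "temp")
cleaned = preprocess(selected, "temp")
values = transform(cleaned, "temp")
pattern = mine(values)
conclusion = interpret(pattern, threshold=22.0)
print(pattern, conclusion)
```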
47
What is the importance of knowledge discovery for decision support?
Today we have very large amounts of data. Effective decision making requires correct information drawn from large data sets, and KDD was introduced for this purpose. KDD is a high-level technique used to present and analyse data for decision makers. It is used to develop an optimal representation of the structure of the data.
48
What is data selection / segmentation?
It is the first stage of knowledge discovery. In this stage, the data to be used is selected based on the criteria or intention of the analysis.
49
What is data pre-processing / cleaning?
Data pre-processing and cleaning depend on the type of data available. Noisy and inconsistent data is removed; only the required data is retained, and redundant data is discarded.
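A minimal sketch of this cleaning step, assuming timestamped temperature readings: out-of-range values are treated as noise and exact duplicates as redundant. The sample records and the `VALID_RANGE` bounds are assumptions made for the example.

```python
# Hypothetical (timestamp, temperature) readings with noise and a duplicate.
records = [
    ("09:00", 21.5),
    ("09:00", 21.5),    # exact duplicate (redundant)
    ("09:05", -999.0),  # sensor error code (noise)
    ("09:10", 22.0),
    ("09:15", 150.0),   # implausible value (noise)
]

VALID_RANGE = (-50.0, 60.0)  # assumed plausible temperature range

# Remove noisy records whose value falls outside the plausible range.
in_range = [r for r in records if VALID_RANGE[0] <= r[1] <= VALID_RANGE[1]]

# Remove redundant records; dict.fromkeys keeps the first occurrence in order.
deduplicated = list(dict.fromkeys(in_range))

print(deduplicated)
```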
50
What is data ingestion?
Data ingestion is the process of importing large, assorted data files from multiple sources into a single, cloud-based storage medium (a data warehouse, data mart, or database) where it can be accessed and analyzed.
51
What is data transformation?
Data transformation is where data is transformed to make it suitable for knowledge discovery. Columns are removed, or new columns are derived from existing columns.
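As a small illustration of removing a column and deriving a new one from existing columns, the sketch below drops a `notes` column and adds a `bmi` column computed from height and weight. The data, the column names, and the choice of BMI are all hypothetical.

```python
# Hypothetical rows; "notes" is not needed for analysis.
rows = [
    {"name": "Asha", "height_cm": 160, "weight_kg": 55, "notes": "n/a"},
    {"name": "Ravi", "height_cm": 175, "weight_kg": 70, "notes": ""},
]

def transform(row):
    # Remove a column that is not needed.
    out = {key: value for key, value in row.items() if key != "notes"}
    # Add a new column derived from two existing columns.
    height_m = row["height_cm"] / 100
    out["bmi"] = round(row["weight_kg"] / height_m ** 2, 1)
    return out

transformed = [transform(r) for r in rows]
print(transformed)
```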
52
What is data mining?
The process of extracting patterns from data using various algorithms and methods. Data analysts, data engineers, and data scientists use these algorithms and methods to extract patterns.
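One simple instance of pattern extraction is counting which items occur together frequently. The sketch below counts item pairs across a few made-up shopping baskets and keeps the pairs that meet a minimum count; the baskets and the `MIN_SUPPORT` value are assumptions, and real data mining methods are considerably more sophisticated.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is one shopping basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "milk", "eggs"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

MIN_SUPPORT = 3  # assumed minimum number of baskets for a pair to count as a pattern
frequent_pairs = {pair: n for pair, n in pair_counts.items() if n >= MIN_SUPPORT}

print(frequent_pairs)
```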
53
What is interpretation and evaluation?
At the interpretation and evaluation stage, the extracted patterns are converted into knowledge. This knowledge, in turn, is used to support decision-making by data scientists, data analysts, or data engineers.
54
What is data collection?
The systematic process of obtaining observations or measurements is known as data collection.