1: Lecture 1 (Notes) Flashcards

1
Q

Define Data Science.

A

Data Science is an emerging field that draws on computer science, statistics, machine learning, visualization, and human-computer interaction to collect, clean, integrate, analyze, visualize, and interact with data in order to create data products.

2
Q

What is a Knowledge Base?

A

A Knowledge Base is a collection of entities, facts, and relationships that conforms to a particular data model and helps machines understand humans, languages, and the world.
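
As a rough illustration of "entities, facts, and relationships under a data model", the sketch below stores facts as (subject, predicate, object) triples. The triple model, the `TripleStore` class, and its method names are assumptions chosen for this example; the card does not say which data model the lecture had in mind.

```python
from collections import defaultdict


class TripleStore:
    """Tiny in-memory knowledge base of (subject, predicate, object) facts.

    Hypothetical helper for illustration only, not a real library API.
    """

    def __init__(self):
        # Maps each subject entity to the set of (predicate, object) pairs about it.
        self._by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        """Record one fact, e.g. ("London", "capital_of", "United Kingdom")."""
        self._by_subject[subject].add((predicate, obj))

    def objects(self, subject, predicate):
        """Return all objects related to `subject` by `predicate`, sorted."""
        facts = self._by_subject.get(subject, set())
        return sorted(o for p, o in facts if p == predicate)


kb = TripleStore()
kb.add("Ada Lovelace", "born_in", "London")
kb.add("Ada Lovelace", "occupation", "mathematician")
kb.add("London", "capital_of", "United Kingdom")

# Follow two relationships in a row: where was Ada Lovelace born,
# and which country is that city the capital of?
birthplace = kb.objects("Ada Lovelace", "born_in")[0]
country = kb.objects(birthplace, "capital_of")
print(birthplace, country)
```

Chaining lookups like this is what lets a machine answer a question that no single stored fact covers.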

3
Q

What are the primary sources of Big Data?

A

Big Data sources include online activities (clicks, impressions), Internet of Things (machine-to-machine interactions like smart homes), scientific computing (genomic data), and user-generated content (social networks, reviews).

4
Q

What are the 5 V’s of Big Data?

A

Volume (sheer size), Velocity (rate of change), Variety (types of data), Veracity (data quality), and Value (usefulness for decision-making).

5
Q

How does Data Science contrast with Business Intelligence?

A

Business Intelligence queries the past, focusing on what has already happened. Data Science queries the past, present, and future, making predictions and suggesting actions.

6
Q

Describe the Data Science Pipeline Process Model.

A

The pipeline runs through the stages Discover, Wrangle, Profile, Model, Evaluate, Visualize, and Report, then Iterates, using feedback to improve on earlier stages.
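
One way to picture these stages is as a chain of small functions. The sketch below is only an illustration: the function names, the made-up toy data (hours studied versus exam score), and the choice of a least-squares line as the model are all assumptions, not something taken from the lecture.

```python
from statistics import mean


def discover():
    # Made-up toy data: (hours_studied, exam_score); None marks a missing value.
    return [(1, 52), (2, 58), (None, 70), (4, 71), (5, 77)]


def wrangle(rows):
    # Clean the data: drop rows with a missing field.
    return [(x, y) for x, y in rows if x is not None and y is not None]


def profile(rows):
    # Summarize the cleaned data before modeling.
    return {"n": len(rows),
            "mean_x": mean(x for x, _ in rows),
            "mean_y": mean(y for _, y in rows)}


def model(rows):
    # Fit y = slope * x + intercept by ordinary least squares.
    mx = mean(x for x, _ in rows)
    my = mean(y for _, y in rows)
    slope = (sum((x - mx) * (y - my) for x, y in rows)
             / sum((x - mx) ** 2 for x, _ in rows))
    return slope, my - slope * mx


def evaluate(rows, slope, intercept):
    # Mean absolute error of the fitted line on the same rows.
    return mean(abs(y - (slope * x + intercept)) for x, y in rows)


def visualize(rows):
    # Crude text chart: one '#' per 10 score points.
    return [f"{x:>2} | {'#' * round(y / 10)}" for x, y in rows]


def run_pipeline():
    rows = wrangle(discover())
    slope, intercept = model(rows)
    return {"profile": profile(rows),
            "slope": slope,
            "intercept": intercept,
            "mae": evaluate(rows, slope, intercept),
            "chart": visualize(rows)}


# Report: print the findings. Iterate would mean revisiting an earlier
# stage (for example, collecting more data) when the error is too high.
result = run_pipeline()
print("\n".join(result["chart"]))
print(f"slope={result['slope']:.2f}, mae={result['mae']:.2f}")
```

In a real project each stage is far larger, and the Iterate step typically loops back to Discover or Wrangle rather than simply rerunning the model.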

7
Q

What are the challenges in Data Science?

A

Challenges include preparing data (dealing with noise, diversity, and incompleteness), analyzing data (ensuring scalability and accuracy), representing analysis results effectively, and managing workflows.

8
Q

What skills are essential for a Data Scientist?

A

Skills include Data Management (collection, storage, cleaning), Large-scale Parallel Data Processing, Statistics and Machine Learning, and Interface and Data Visualization.
