Introduction to Data Science Part 1 Flashcards

1
Q

What is Data Science?

A

The practice of using scientific tools such as programming and statistics to extract knowledge and uncover insights from data.

2
Q

What are the key objectives of Data Science?

A

Extract knowledge from data, uncover insights, and make informed decisions.

3
Q

What are common terms associated with Data Science?

A

Big Data, Machine Learning, Artificial Intelligence, Data Mining, Predictive Analytics.

4
Q

What are the types of data in Data Science?

A

Qualitative (descriptive data) and Quantitative (measurable values).
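
As a rough illustration of the distinction, the sketch below uses made-up survey records (the field names `favorite_color` and `height_cm` are assumptions, not part of the card). Counting categories suits the qualitative field, while arithmetic such as a mean suits the quantitative one.

```python
from collections import Counter
from statistics import mean

# Hypothetical survey records: one qualitative and one quantitative field each.
records = [
    {"favorite_color": "blue", "height_cm": 170.0},
    {"favorite_color": "green", "height_cm": 165.0},
    {"favorite_color": "blue", "height_cm": 181.0},
]

# Qualitative values are labels, so counting how often each one occurs is meaningful.
color_counts = Counter(r["favorite_color"] for r in records)

# Quantitative values are measurements, so arithmetic such as a mean is meaningful.
average_height = mean(r["height_cm"] for r in records)

print(color_counts.most_common(1))
print(average_height)
```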

5
Q

What are the three main data formats?

A

Structured, Unstructured, and Semi-structured data.

6
Q

What are examples of Structured Data?

A

Relational databases, spreadsheets, and data tables.
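
A minimal sketch of what "structured" means in practice: every row shares the same named columns, so the data can be read straight into records. The table contents here are invented for illustration.

```python
import csv
import io

# Hypothetical table: every row has the same three columns.
csv_text = """id,product,price
1,keyboard,49.90
2,mouse,19.50
3,monitor,189.00
"""

# DictReader uses the header row as the keys of each record.
rows = list(csv.DictReader(io.StringIO(csv_text)))
columns = list(rows[0].keys())

# Because the schema is fixed, a column can be aggregated directly.
total_price = sum(float(row["price"]) for row in rows)

print(columns)
print(round(total_price, 2))
```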

7
Q

What are examples of Unstructured Data?

A

Images, videos, social media posts, and PDFs.

8
Q

What are examples of Semi-structured Data?

A

JSON, XML, and HTML documents.
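
To make the idea concrete, here is a small sketch with an invented JSON payload. The records carry their own field names and nesting, but they need not all share the same fields, which is why the lookup supplies a default.

```python
import json

# Hypothetical payload: the second record has no "profile" object.
raw = """
[
  {"user": "ana", "tags": ["python", "sql"], "profile": {"city": "Lisbon"}},
  {"user": "ben", "tags": []}
]
"""

records = json.loads(raw)

# Fields may be missing, so fall back to a default instead of indexing directly.
cities = [r.get("profile", {}).get("city", "unknown") for r in records]

print(cities)
```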

9
Q

What are the major sources of data?

A

Web data, financial transactions, online trading, social networks, business records.

10
Q

What is Big Data?

A

Data whose size, speed, or variety makes it expensive to manage and difficult to extract value from with conventional tools.

11
Q

What are the 5 V’s of Big Data?

A

Volume, Velocity, Variety, Veracity, and Value.

12
Q

What does ‘Volume’ refer to in Big Data?

A

The size of data being generated.

13
Q

What does ‘Velocity’ refer to in Big Data?

A

The speed at which data is generated and must be processed and analyzed.

14
Q

What does ‘Variety’ refer to in Big Data?

A

Different types of data sources and formats.

15
Q

What does ‘Veracity’ refer to in Big Data?

A

Data quality and reliability.

16
Q

What does ‘Value’ refer to in Big Data?

A

The potential business benefits derived from analyzing data.

17
Q

What is Machine Learning?

A

A field of AI that enables systems to learn and improve from experience without being explicitly programmed.

18
Q

What are the three main types of Machine Learning?

A

Supervised Learning, Unsupervised Learning, and Reinforcement Learning.

19
Q

What is the goal of Supervised Learning?

A

To learn a mapping from inputs to outputs using labeled data.
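
One of the simplest instances of this idea is fitting a straight line by ordinary least squares. The sketch below is only an illustration: the helper `fit_line` and the hours/scores data are made up, and the labels follow an exact line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Labeled training data (hypothetical): inputs paired with known outputs.
hours = [1.0, 2.0, 3.0, 4.0]
scores = [52.0, 54.0, 56.0, 58.0]

# "Learning the mapping" here means estimating the two line parameters.
slope, intercept = fit_line(hours, scores)

# The learned mapping can then be applied to an input it has not seen.
predicted = slope * 5.0 + intercept
print(slope, intercept, predicted)
```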

20
Q

What is the goal of Unsupervised Learning?

A

To find patterns or structure in data without labeled responses.
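
As one possible illustration, the sketch below runs a bare-bones k-means on one-dimensional points. No labels are given; the algorithm only alternates between assigning points to the nearest center and moving each center to the mean of its points. The function name `kmeans_1d`, the data, and the starting centers are assumptions chosen for the example.

```python
def kmeans_1d(points, centers, iterations=10):
    """Very small k-means for 1-D data; returns final centers and clusters."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else old
            for c, old in zip(clusters, centers)
        ]
    return centers, clusters

# Unlabeled data (hypothetical) with two visible groups.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers)
print(clusters)
```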

21
Q

What is Reinforcement Learning?

A

A type of learning where an agent interacts with an environment to maximize cumulative reward.
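
A very reduced sketch of that loop is an epsilon-greedy agent choosing among a few actions ("arms"). Everything here is an assumption made for illustration: the function `run_bandit`, the fixed per-arm rewards (real environments are usually noisy), and the choice to try each arm once before exploiting.

```python
import random

def run_bandit(rewards, steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy environment with fixed rewards per arm."""
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)  # agent's running estimate of each arm
    counts = [0] * len(rewards)       # how often each arm was chosen
    total = 0.0                       # cumulative reward collected
    for step in range(steps):
        if step < len(rewards):
            arm = step                            # try every arm once first
        elif rng.random() < epsilon:
            arm = rng.randrange(len(rewards))     # explore a random arm
        else:
            arm = max(range(len(rewards)), key=lambda a: estimates[a])  # exploit
        reward = rewards[arm]                     # environment responds
        counts[arm] += 1
        # Incremental mean update of the estimate for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return estimates, counts, total

estimates, counts, total = run_bandit([0.2, 0.8, 0.5])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_arm, counts, round(total, 1))
```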

22
Q

What is AI (Artificial Intelligence)?

A

The simulation of human intelligence processes by machines, including learning, reasoning, and self-correction.

23
Q

What is the difference between Data Science and Machine Learning?

A

Data Science is the broader discipline and produces insights from data, while Machine Learning is one of its tools and produces predictions from learned models.

24
Q

What are the main application areas of Data Science?

A

Industrial processes, business, text data, image data, and medical data applications.

25
Q

What are some industrial applications of Data Science?

A

Fault prediction, preventive maintenance, demand forecasting, inventory management, price optimization.

26
Q

What are some business applications of Data Science?

A

Market trend analysis, churn analysis, credit risk modeling.

27
Q

What are some text data applications of Data Science?

A

Sentiment Analysis, Topic Modeling, Conversational AI.

28
Q

What are some image data applications of Data Science?

A

Computer Vision, Machine Vision.

29
Q

What are some medical applications of Data Science?

A

Disease diagnosis, patient data analysis, medical imaging analysis.

30
Q

What is the CRISP-DM process?

A

The Cross-Industry Standard Process for Data Mining, a standard process model with six phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment.

31
Q

What are the six phases of CRISP-DM?

A

Business understanding, Data understanding, Data preparation, Modeling, Evaluation, Deployment.

32
Q

What is the TDSP (Team Data Science Process)?

A

A methodology developed by Microsoft for structuring data science projects.

33
Q

What are the key steps in a Data Science project?

A

Problem definition, Data Collection, Data Processing, Model Building, Model Evaluation, Deployment.

34
Q

What key questions does Data Science aim to answer?

A

What is the problem? What data is needed? Where does data come from? How should data be processed? How should models be evaluated?

35
Q

What are common ways to visualize data for insights?

A

Charts, graphs, heatmaps, scatter plots, histograms.
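
As a dependency-free illustration of one of these, the sketch below builds a text histogram by grouping values into fixed-width bins. The helper `text_histogram` and the age values are invented; a plotting library would normally be used for real charts.

```python
from collections import Counter

def text_histogram(values, bin_width):
    """Return one text line per non-empty bin, with one '#' per value."""
    # Map each value to the lower edge of its bin, then count per bin.
    bins = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    for start in sorted(bins):
        label = f"{start}-{start + bin_width - 1}"
        lines.append(f"{label:<7} {'#' * bins[start]}")
    return lines

# Hypothetical ages, grouped into decades.
ages = [21, 23, 25, 31, 34, 35, 38, 42]
lines = text_histogram(ages, bin_width=10)
for line in lines:
    print(line)
```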

36
Q

What are key qualities of a good Data Scientist?

A

Inquisitive, knowledgeable, proficient in machine learning, statistics, and probability, skilled in coding, and strong in domain knowledge.

37
Q

What coding skills should a Data Scientist have?

A

Python, R, SQL, and tools for data processing like Pandas, NumPy, and Scikit-Learn.
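
A small sketch of Python and SQL working together, using only the standard-library `sqlite3` module and an in-memory database. The `sales` table and its rows are made up for the example.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0)],
)

# SQL does the aggregation; Python receives the result as a list of tuples.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(totals)
```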

38
Q

Why is domain knowledge important for a Data Scientist?

A

It helps in interpreting data correctly and drawing insights that are meaningful to the industry.

39
Q

What are some emerging research topics in Data Science?

A

Big Data Modeling, AI Ethics, Fairness in Machine Learning, Explainable AI, Edge Computing.

40
Q

What are some challenges in Data Science?

A

Data privacy concerns, data bias, computational complexity, data storage and management.

41
Q

What is the impact of Data Science in healthcare?

A

Predicting diseases, personalizing treatments, improving patient care, optimizing hospital operations.