3 Flashcards

1
Q

Data Mining vs Data Profiling

A

Data mining is the process of discovering relevant, previously unidentified information in data (converting raw data into valuable information).

Data profiling is done to evaluate a dataset for uniqueness, logic, and consistency.
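A rough sketch of what profiling for uniqueness, logic, and consistency might look like in pandas. The `orders` table, its columns, and the specific checks are hypothetical, chosen only for illustration:

```python
import pandas as pd

# Hypothetical dataset with a few deliberate quality problems.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 4],
    "quantity": [3, -1, 5, None],
    "country": ["US", "us", "DE", "DE"],
})

profile = {
    # Uniqueness: how many order IDs repeat an earlier one?
    "duplicate_ids": int(orders["order_id"].duplicated().sum()),
    # Logic: a quantity should not be negative.
    "negative_quantities": int((orders["quantity"] < 0).sum()),
    # Completeness: count missing quantities.
    "missing_quantities": int(orders["quantity"].isna().sum()),
    # Consistency: the same country spelled in different ways.
    "country_spellings": sorted(orders["country"].unique()),
}
print(profile)
```

Nothing is fixed here; profiling only reports what is in the data so that later cleaning can be targeted.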

2
Q

Data Wrangling

A

Data wrangling is the process of transforming and structuring data from its raw form into a desired format, with the intent of improving data quality and making the data more consumable and useful for analytics or machine learning.
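A minimal wrangling sketch in pandas, assuming a hypothetical `raw` table with messy names, dates stored as text, and a non-numeric spend value. The column names and cleaning rules are illustrative assumptions, not a fixed recipe:

```python
import pandas as pd

# Hypothetical raw data as it might arrive from an export.
raw = pd.DataFrame({
    "Customer Name": ["  alice ", "BOB", None],
    "signup": ["2024-01-05", "2024-02-10", "2024-03-15"],
    "spend": ["10.5", "n/a", "7"],
})

tidy = (
    raw.rename(columns={"Customer Name": "customer_name"})  # consistent column name
       .dropna(subset=["customer_name"])                    # drop rows with no customer
       .assign(
           # Trim whitespace and normalize capitalization.
           customer_name=lambda d: d["customer_name"].str.strip().str.title(),
           # Parse text dates into real datetime values.
           signup=lambda d: pd.to_datetime(d["signup"]),
           # Convert spend to numbers; unparseable entries become NaN.
           spend=lambda d: pd.to_numeric(d["spend"], errors="coerce"),
       )
)
print(tidy)
```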

3
Q

Steps in an analytics project

A

Understanding the problem: Define the business or analytical question, determine key objectives, and establish success criteria.

Collecting data: Gather relevant data from internal databases, APIs, web scraping, or third-party sources to support analysis.

Cleaning data: Remove inconsistencies, handle missing values, and standardize formats to ensure data quality and reliability.

Exploring data: Analyze distributions, detect patterns, and visualize relationships to gain insights and guide further analysis.

Interpreting results: Draw meaningful conclusions, derive actionable insights, and communicate findings to stakeholders for decision-making.

4
Q

How to melt a pandas DataFrame called ‘report’. Just an exercise to understand the parameters of the melt function.

A

report.melt(
id_vars = column(s) kept as identifiers and left unchanged,
value_vars = column(s) to unpivot into long format (defaults to all columns not in id_vars),
var_name = name of the new column that holds the original column names,
value_name = name of the new column that holds the corresponding values)
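To see the four parameters in action, here is a small runnable sketch. The contents of ‘report’ (a `store` column plus two quarterly sales columns) and the new names `quarter` and `sales` are made up for illustration:

```python
import pandas as pd

# Hypothetical wide-format report: one row per store, one column per quarter.
report = pd.DataFrame({
    "store": ["A", "B"],
    "q1_sales": [100, 150],
    "q2_sales": [120, 130],
})

long_report = report.melt(
    id_vars=["store"],                    # kept as-is, repeated for each melted row
    value_vars=["q1_sales", "q2_sales"],  # columns to unpivot
    var_name="quarter",                   # new column holding the old column names
    value_name="sales",                   # new column holding the old cell values
)
print(long_report)
```

The two wide columns become four rows: each store appears once per quarter, with the former column name in `quarter` and the former cell value in `sales`.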
