3 Flashcards
Data Mining vs Data Profiling
Data mining is the process of discovering relevant information that has not been identified before (converting raw data into valuable information).
Data profiling is done to evaluate a dataset for uniqueness, logic, and consistency.
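A minimal sketch of what profiling checks for uniqueness, logic, and consistency might look like in pandas. The DataFrame and its column names (customer_id, age, country) are hypothetical and invented purely for illustration.

```python
import pandas as pd

# Hypothetical toy dataset, invented for illustration only.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 4],
    "age": [34, -5, 41, 29],
    "country": ["US", "us", "DE", "DE"],
})

# Uniqueness: count repeated values in a column that should be a key.
duplicate_ids = int(df["customer_id"].duplicated().sum())

# Logic: count values that cannot be valid (a negative age).
illogical_ages = int((df["age"] < 0).sum())

# Consistency: the same country written with different casing
# shows up as extra distinct values until it is normalized.
raw_countries = df["country"].nunique()
normalized_countries = df["country"].str.upper().nunique()

print(duplicate_ids, illogical_ages, raw_countries, normalized_countries)
```

Profiling only reports these problems; fixing them belongs to data cleaning or wrangling.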
Data Wrangling
Data wrangling is the process of transforming and structuring data from a raw form into a desired format, with the intent of improving data quality and making it more consumable and useful for analytics or machine learning.
Steps in an analytics project
Understanding the problem: Define the business or analytical question, determine key objectives, and establish success criteria.
Collecting data: Gather relevant data from internal databases, APIs, web scraping, or third-party sources to support analysis.
Cleaning data: Remove inconsistencies, handle missing values, and standardize formats to ensure data quality and reliability.
Exploring data: Analyze distributions, detect patterns, and visualize relationships to gain insights and guide further analysis.
Interpreting results: Draw meaningful conclusions, derive actionable insights, and communicate findings to stakeholders for decision-making.
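The cleaning and exploring steps above can be sketched in pandas as follows. This is only an illustration under assumptions: the data, the column names (region, sales), and the choice to fill missing values with the median are hypothetical, not a prescribed method.

```python
import pandas as pd

# Hypothetical raw data standing in for the output of the collecting step.
raw = pd.DataFrame({
    "region": ["North", "north ", "South", "South"],
    "sales": [100.0, None, 250.0, 150.0],
})

# Cleaning: standardize text formats and handle missing values.
clean = raw.copy()
clean["region"] = clean["region"].str.strip().str.title()
# Assumption: filling with the median is acceptable for this toy data.
clean["sales"] = clean["sales"].fillna(clean["sales"].median())

# Exploring: summarize the distribution and compare groups.
summary = clean["sales"].describe()
sales_by_region = clean.groupby("region")["sales"].sum()

print(summary)
print(sales_by_region)
```

Whether to fill, drop, or flag missing values depends on the question defined in the first step.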
How to melt a pandas DataFrame called ‘report’. This is just an exercise to understand the parameters of the melt function.
report.melt(
id_vars = columns to keep unchanged as identifiers,
value_vars = columns to unpivot into long format (by default, all columns not listed in id_vars),
var_name = name of the new column that holds the former column names (default 'variable'),
value_name = name of the new column that holds the values from those columns (default 'value'))
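A runnable version of the melt call, as a minimal sketch. The contents of report and the column names (store, jan, feb, month, sales) are hypothetical and chosen only to show what each parameter does.

```python
import pandas as pd

# Hypothetical wide-format 'report': one row per store, one column per month.
report = pd.DataFrame({
    "store": ["A", "B"],
    "jan": [10, 20],
    "feb": [30, 40],
})

long_report = report.melt(
    id_vars=["store"],          # kept as-is and repeated for each unpivoted row
    value_vars=["jan", "feb"],  # columns unpivoted into rows
    var_name="month",           # new column holding the old column names
    value_name="sales",         # new column holding the old cell values
)

print(long_report)
```

The result has one row per store and month combination, so two stores and two months give four rows with the columns store, month, and sales.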