WEEK 2: Clean data is a must Flashcards
Dirty data
Is data that’s incomplete, incorrect, or irrelevant to the problem you’re trying to solve.
Data warehousing specialists
develop processes and procedures to effectively store and organize data. They make sure that data is available, secure, and backed up to prevent loss.
Data engineers
transform data into a useful format for analysis and give it a reliable infrastructure. This means they develop, maintain, and test databases, data processors and related systems.
Split
is helpful when you have more than one piece of data in a cell and you want to separate them out.
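As a rough illustration of what Split does, here is a minimal Python sketch. The helper name split_cell, the comma delimiter, and the sample cell text are assumptions made for this example; they are not part of any spreadsheet tool.

```python
def split_cell(cell, delimiter=","):
    """Split one cell's text on a delimiter and trim spaces around each piece."""
    return [part.strip() for part in cell.split(delimiter)]


# Hypothetical cell holding three pieces of data at once.
cell = "Austin, TX, 78701"
city, state, zip_code = split_cell(cell)
print(city)      # first piece
print(state)     # second piece
print(zip_code)  # third piece
```

Each piece ends up in its own variable, much as Split places each piece in its own column.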
Function MID
is a function that gives you a segment from the middle of a text string.
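The idea behind MID can be sketched in Python with string slicing. The function below is a hypothetical stand-in written for this example, assuming the usual spreadsheet argument order (text, starting position, number of characters) with positions counted from 1.

```python
def mid(text, start, length):
    """Return `length` characters of `text`, beginning at 1-based position `start`."""
    if start < 1 or length < 0:
        raise ValueError("start must be >= 1 and length must be >= 0")
    begin = start - 1  # convert the 1-based position to Python's 0-based index
    return text[begin:begin + length]


# Hypothetical code where the middle five characters are the part we want.
code = "CA-90210-US"
segment = mid(code, 4, 5)
print(segment)
```

Here the segment starts at the fourth character and runs for five characters, so it picks out the digits between the two hyphens.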
VLOOKUP
stands for vertical lookup. It’s a function that searches for a certain value in a column and returns a corresponding piece of information from the same row.
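A simplified sketch of the lookup idea in Python, assuming an exact match on the first column of a small table. The function name vlookup, the products table, and the choice to return None when nothing matches are all assumptions for this example (a spreadsheet would show an error instead).

```python
def vlookup(search_key, table, col_index):
    """Find the first row whose first column equals search_key and
    return the value in that row's 1-based column col_index."""
    for row in table:
        if row[0] == search_key:
            return row[col_index - 1]
    return None  # assumption: no match is reported as None


# Hypothetical table: product ID, product name, unit price.
products = [
    ["P-100", "Widget", 4.50],
    ["P-200", "Gadget", 12.00],
    ["P-300", "Gizmo", 7.25],
]

print(vlookup("P-200", products, 2))  # name stored in the matching row
print(vlookup("P-200", products, 3))  # price stored in the matching row
```

The search runs down the first column only, and the answer always comes from the same row as the match.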
Plotting
Is very useful when trying to identify any skewed data or outliers.
When you plot data, you put it in a graph, chart, table, or other visual to help you quickly see what it looks like.
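To show why a visual makes outliers easy to spot, here is a crude text-only bar plot in Python. The function text_plot and the sample values are made up for this example, and the sketch assumes every value is positive; a real chart would normally come from a spreadsheet or a plotting library.

```python
def text_plot(values, width=40):
    """Return one bar line per value, scaled so the largest value fills `width` characters."""
    largest = max(values)
    lines = []
    for value in values:
        bar = "#" * round(value / largest * width)
        lines.append(f"{value:>6} | {bar}")
    return lines


# Hypothetical measurements; one of them is far larger than the rest.
readings = [12, 15, 14, 13, 95, 16]
lines = text_plot(readings)
for line in lines:
    print(line)
```

The bar for the unusually large reading is several times longer than the others, which is much harder to miss than the same number sitting in a column of cells.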