Data Cleaning Techniques Flashcards
Most of the time, the data you want to analyze is not in a usable format, i.e., it contains blank cells, duplicate values, merged columns, etc. Before using this data for analysis, we need to clean it so that it does not produce irrelevant
results. Doing so ensures accuracy and reliability in your analyses.
Data Cleaning
___ Removes leading and trailing spaces.
___ Eliminates non-printable characters.
___ Capitalizes the first letter of each word.
TRIM
CLEAN
PROPER
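For readers who want to see the three behaviors outside a spreadsheet, here is a minimal Python sketch. The helper names `trim`, `clean`, and `proper` are hypothetical, and they only approximate the Excel functions: it is assumed here that CLEAN targets ASCII control characters (codes 0 to 31) and that `str.title()` is close enough to PROPER for ordinary names.

```python
import re


def trim(text: str) -> str:
    # Strip leading/trailing spaces, then collapse runs of spaces to one.
    return re.sub(r" +", " ", text.strip(" "))


def clean(text: str) -> str:
    # Drop ASCII control characters (codes 0-31); an approximation of CLEAN.
    return "".join(ch for ch in text if ord(ch) >= 32)


def proper(text: str) -> str:
    # Capitalize the first letter of each word and lowercase the rest.
    return text.title()


raw = "  john\x07   SMITH  "
tidy = proper(trim(clean(raw)))
print(tidy)  # John Smith
```

Chaining the three calls mirrors the common spreadsheet pattern of nesting the functions around a single cell reference.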
Text data often harbors errors like typos, extra spaces, and inconsistent capitalization.
Clean Text Data
______ can help to prevent errors from being entered into your data in the first place. You can use data validation to specify the type of data that can be entered into a cell, as well as the range of valid values.
Data validation
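The idea of restricting a cell to a data type and a range of valid values can be sketched in Python as a simple check. The function name `is_valid_whole_number` and the 1 to 100 range are assumptions chosen for illustration, not part of any spreadsheet feature.

```python
def is_valid_whole_number(value: str, minimum: int, maximum: int) -> bool:
    # Rule 1: the entry must be a whole number.
    try:
        number = int(value.strip())
    except ValueError:
        return False
    # Rule 2: the number must fall inside the allowed range.
    return minimum <= number <= maximum


entries = ["42", "150", "abc"]
accepted = [e for e in entries if is_valid_whole_number(e, 1, 100)]
print(accepted)  # ['42']
```

Rejecting bad entries at input time is the point of the technique: nothing invalid reaches the dataset, so there is less to clean later.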
___ is a powerful tool that can be used to clean and transform your data. ___ can be used to import data from a variety of sources,
clean and transform the data, and then load the data into an Excel table.
Power Query
____ is useful when you have data in a single column that you
want to split into multiple columns. This is particularly handy when dealing with data imported from external sources, such as CSV files, or when data is not organized in a way that suits your analysis.
Parse Data
We need to split our data based on the spaces between the values. That is why we choose space as the ____
delimiter
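Splitting one column into several on a delimiter can be sketched in Python as follows. The function name `split_column` and the sample names are hypothetical; the space delimiter matches the card above.

```python
def split_column(values, delimiter=" "):
    # Split every entry on the delimiter; each piece becomes its own column.
    return [value.split(delimiter) for value in values]


names = ["Ada Lovelace", "Alan Turing"]
rows = split_column(names)
print(rows)  # [['Ada', 'Lovelace'], ['Alan', 'Turing']]
```

Passing a different delimiter, such as `","`, would handle comma-separated entries the same way.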
____ is used to remove extra spaces from a text string,
leaving only a single space between words and no leading or trailing spaces.
TRIM function
_____ is handy for quickly locating specific
data and replacing it with new values. This can be useful for correcting errors, updating information, or making changes to a large dataset.
“Find and Replace”
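The same locate-and-substitute step can be sketched in Python. The function name `find_and_replace` and the misspelled sample values are assumptions made for illustration.

```python
def find_and_replace(values, find, replace_with):
    # Replace every occurrence of `find` in each entry with `replace_with`.
    return [value.replace(find, replace_with) for value in values]


cities = ["New Yrok", "Boston", "New Yrok City"]
fixed = find_and_replace(cities, "Yrok", "York")
print(fixed)  # ['New York', 'Boston', 'New York City']
```

Entries that do not contain the search text, such as "Boston" here, pass through unchanged.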
_____ is a powerful tool that allows you to set rules or criteria
for the data entered into a cell or range of cells. This can be particularly useful for ensuring data accuracy and consistency.
Data validation