EXCEL FUNCTIONS FOR DATA CLEANING Flashcards
Removes leading, trailing, and extra spaces from a text string.
Example: =TRIM(“ Hello, world! “) returns “Hello, world!”
TRIM
Use case: Cleaning up text data with inconsistent spacing.
TRIM
Removes non-printable characters from a text string.
Example: =CLEAN(“Hello, world! \t \r”) removes tab and carriage return
characters.
CLEAN
Removing control characters or other unwanted characters.
CLEAN
Replaces occurrences of old_text with new_text in a text string.
Example: =SUBSTITUTE(“Hello, world!”, “world”, “there”) returns “Hello,
there!”
SUBSTITUTE
Use case: Correcting typos, standardizing data formats, or removing specific words or phrases.
SUBSTITUTE
Returns a specified number of characters from the beginning of a text string.
Example: =LEFT(“Hello, world!”, 5) returns “Hello”
LEFT
Use case: Extracting specific parts of text data, such as first names or initials
LEFT
Returns a specified number of characters from the end of a
text string.
Example: =RIGHT(“Hello, world!”, 5) returns “world”
RIGHT
Use case: Extracting last names or suffixes.
RIGHT
Returns a specified number of characters from a
text string, starting at a specified position.
Example: =MID(“Hello, world!”, 7, 5) returns “world”
MID
Extracting specific parts of text data, such as city names or zip codes
MID
Returns the number of characters in a text string.
Example: =LEN(“Hello, world!”) returns 13
LEN
Use case: Determining the length of text strings for validation or formatting
purposes.
LEN
Converts all characters in a text string to uppercase.
Example: =UPPER(“hello, world!”) returns “HELLO, WORLD!”
UPPER
Converts all characters in a text string to lowercase.
Example: =LOWER(“HELLO, WORLD!”) returns “hello, world!”
LOWER
Returns the serial number of a date.
Example: =DATE(2023, 12, 25) returns the serial number for December 25,
2023.
Use case: Converting text dates to numerical values for calculations
DATE
Returns the serial number of a time.
Example: =TIME(15, 30, 0) returns the serial number for 3:30 PM.
Use case: Converting text times to numerical values for calculations.
TIME
Use case: Converting text dates to numerical values for calculations.
DATE
Use case: Converting text times to numerical values for calculations.
TIME
Returns the current date and time as a serial number.
NOW()
Use case: Timestamping data or calculating time differences.
NOW()
Returns one value if a condition is true, and another value if it is false.
Example: =IF(A1>100, “Greater than 100”, “Less than or equal to 100”)
IF
Use case: Creating conditional statements for data cleaning tasks, such as flagging errors or inconsistencies.
IF