PostgreSQL Flashcards
What steps do you want for cleaning string data?
Restrict capitalization in column names
Remove extra divider spaces in column names
make column names uniform
INITCAP()
SELECT INITCAPT(column)
Output: Hello Friend!
Fixes capitalization
REPLACE()
SELECT REPLACE(column, string_to_replace, replacement)
SELECT REPLACE(streets, ‘Main Street’, ‘St’);
Output: Main St
LPAD(input_string, length [, fill_value])
SELECT LPAD(‘column, 7, ‘X’);
OUTPUT: XXXX123
Prepending text values to a string. The fourth value has to be a string.
Like
Like is used for pattern matching using wildcards such as % or _
SELECT *
FROM employees
WHERE firstname LIKE ‘Ali%’ or firstname LIKE ‘_li’
%
Matches 0 or more characters
_
Matches exactly 1 character
Can use multiple _ to represent more than 1 character
REGEXP_REPLACE()
REGEXP_REPLACE(column, ‘regrex’, replacement, ‘g’)
‘g’ = global
DIFFERENCE()
DIFFERENCE is used to compare the SOUNDEX values of two strings and returns a score between 0 and 4:
DIFFERENCE(vehicle_color, ‘GRAY’) = 4;
SOUNDEX()
SOUNDEX is a phonetic algorithm used to match words or strings that sound similar in English. It converts a word to a code based on its pronunciation.
Example:
SOUNDEX(‘GRAY’) → G600
SOUNDEX(‘GREY’) → G600