SQL CLEANING Flashcards

1
Q

How can we delete columns of a table? code

A

ALTER TABLE HouseProject..House
DROP COLUMN OwnerAddress, TaxDistrict, PropertyAddress;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

How can we remove duplicate rows of a table? code

A

Using Window function row_number and delete as below:

WITH DuplicateTable AS(
SELECT *, ROW_NUMBER() OVER (PARTITION BY SalePrice,SaleDate,LegalReference,ParcelID,PropertyAddress ORDER BY UniqueID) AS duplicates
FROM HouseProject..House)
DELETE
FROM DuplicateTable
WHERE duplicates>1

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

How can we replace values in a column with certain other values? for example turning ‘Y’ to ‘Yes’ and ‘N’ to ‘No’

A

Using Case and Update
UPDATE HouseProject..House
SET SoldAsVacant= CASE SoldAsVacant
WHEN ‘Y’ THEN ‘Yes’
WHEN ‘N’ THEN ‘No’
ELSE SoldAsVacant
END

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What are 2 ways of splitting a string in a column? How can we create new columns with the split values?

A

1- Using SUBSTRING and CHARINDEX for splitting and ALTER TABLE for adding a new column

SELECT
SUBSTRING(PropertyAddress,1,CHARINDEX(‘,’,PropertyAddress)-1) AS Address,
SUBSTRING(PropertyAddress,CHARINDEX(‘,’,PropertyAddress)+1,LEN(PropertyAddress)) AS City
FROM HouseProject..House

ALTER TABLE HouseProject..House
ADD Address Nvarchar(255)

UPDATE HouseProject..House
SET Address= SUBSTRING(PropertyAddress,1,CHARINDEX(‘,’,PropertyAddress)-1)

ALTER TABLE HouseProject..House
ADD City Nvarchar(255)

UPDATE HouseProject..House
SET City= SUBSTRING(PropertyAddress,CHARINDEX(‘,’,PropertyAddress)+1,LEN(PropertyAddress))

2-Using PARSENAME and ALTER TABLE

SELECT PARSENAME(REPLACE(OwnerAddress,’,’,’.’),3) AS OwnerSplitAddress,
PARSENAME(REPLACE(OwnerAddress,’,’,’.’),2)AS OwnerSplitCity,
PARSENAME(REPLACE(OwnerAddress,’,’,’.’),1)AS OwnerSplitState
FROM HouseProject..House

–Adding OwnerSplitAddress
ALTER TABLE HouseProject..House
ADD OwnerSplitAddress Nvarchar(255);

UPDATE HouseProject..House
SET OwnerSplitAddress= PARSENAME(REPLACE(OwnerAddress,’,’,’.’),3)

–Adding OwnerSplitCity
ALTER TABLE HouseProject..House
ADD OwnerSplitCity Nvarchar(255);

UPDATE HouseProject..House
SET OwnerSplitCity= PARSENAME(REPLACE(OwnerAddress,’,’,’.’),2)

–Adding OwnerSplitState
ALTER TABLE HouseProject..House
ADD OwnerSplitState Nvarchar(255);

UPDATE HouseProject..House
SET OwnerSplitState= PARSENAME(REPLACE(OwnerAddress,’,’,’.’),1)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Does PARSENAME function starts parsing from the end or the beginning?

A

it starts parsing from the end of str

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What do we do when the delimiters are ‘,’ and we want to use PARSENAME?

A

Replace the ‘,’ with ‘.’ below example does this:

332 CHERRY ROAD, SACRAMENTO, CA
PARSENAME(REPLACE(OwnerAddress,’,’,’.’),3)

RESULT:
332 CHERRY ROAD

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

How can we fill the missing values in a column, using another column?

A

Using self join, coalesce and update as below:
in the dataset, parcelIDs are repeated, we can use the parcelIDs that have the value for PropertyAddress and self join, to fill the null values of other repititions of that certain ParcelID

SELECT h1.PropertyAddress,h1.ParcelID,h2.PropertyAddress,h2.ParcelID,COALESCE(h1.PropertyAddress,h2.PropertyAddress)
FROM HouseProject..House h1
JOIN HouseProject..House h2
ON h1.ParcelID=h2.ParcelID
AND h1.[UniqueID ] != h2.[UniqueID ]
AND h1.PropertyAddress IS NULL
AND h2.PropertyAddress IS NOT NULL

UPDATE h1
SET PropertyAddress=COALESCE(h1.PropertyAddress,h2.PropertyAddress)
FROM HouseProject..House h1
JOIN HouseProject..House h2
ON h1.ParcelID=h2.ParcelID
AND h1.[UniqueID ] != h2.[UniqueID ]
AND h1.PropertyAddress IS NULL
AND h2.PropertyAddress IS NOT NULL

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

How does coalesce work?

A

It either fills the missing values of a column with a fixed value.
Or fills the missing values using another column in that row

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is the difference between ALTER and UPDATE in SQL?

A

ALTER and UPDATE are the two modifying commands of SQL. ALTER is used to modify the structure of the relations (Tables) in the database (e.g. adding columns, changing data type of a column)
UPDATE Command is used to modify the data stored in a relation of the database (e.g. changing some values on a column)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

How can we change data type of a column?

A

ALTER TABLE HouseProject..House
ALTER COLUMN SaleDate date;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly