Python questions Flashcards

1
Q

Write a SQL query to find the top 5 products with the highest sales.

A

SELECT productID, SUM(price * quantity) AS totalSales
FROM DATA
GROUP BY productID
ORDER BY totalSales DESC  -- highest sales first
LIMIT 5;
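To check a query like this one, it can be run against a small in-memory SQLite database from Python. This is only a sketch: the sample rows are made up, and the table is assumed to have `productID`, `price`, and `quantity` columns.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not real sales data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (productID INTEGER, price REAL, quantity INTEGER)")
rows = [
    (1, 10.0, 3), (2, 5.0, 9), (3, 20.0, 1), (4, 2.5, 4),
    (5, 8.0, 5), (6, 1.0, 2), (1, 10.0, 2),
]
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)

query = """
SELECT productID, SUM(price * quantity) AS totalSales
FROM data
GROUP BY productID
ORDER BY totalSales DESC
LIMIT 5
"""
top_five = conn.execute(query).fetchall()
for product_id, total_sales in top_five:
    print(product_id, total_sales)
```

Without `DESC`, the same query would return the five lowest-selling products, because `ORDER BY` sorts in ascending order by default.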

2
Q

Difference between normalization and denormalization

A

Normalization: The process of organizing data in a database into separate, related tables to reduce redundancy and improve data integrity.

Denormalization: The process of combining normalized tables to improve read performance by reducing the number of joins required when querying, at the cost of storing some data redundantly.
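A minimal sketch of the trade-off, using Python's built-in `sqlite3` module. The `customers`, `orders`, and `orders_wide` tables and their contents are hypothetical, chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized: each customer's city is stored once.
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    amount REAL
);
INSERT INTO customers VALUES (1, 'Toronto'), (2, 'Montreal');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);

-- Denormalized: the city is copied onto every order row.
CREATE TABLE orders_wide AS
SELECT o.order_id, o.customer_id, c.city, o.amount
FROM orders AS o
JOIN customers AS c ON c.customer_id = o.customer_id;
""")

# The normalized schema needs a join to total sales per city.
normalized = conn.execute("""
    SELECT c.city, SUM(o.amount)
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.city
    ORDER BY c.city
""").fetchall()

# The denormalized table answers the same question without a join.
denormalized = conn.execute("""
    SELECT city, SUM(amount)
    FROM orders_wide
    GROUP BY city
    ORDER BY city
""").fetchall()
```

Both queries return the same totals. The denormalized table is simpler to read from, but a customer who moves cities would have to be updated on every one of their order rows.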

3
Q

How would you deal with missing data?

A

If only a few rows have missing data, I would probably drop those rows with the dropna function in pandas. However, if a substantial share of a column is missing, one might consider dropping the entire column.
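A rough sketch of that approach with pandas. The column names, the values, and the 50% cut-off for dropping a column are all assumptions made for illustration, not fixed rules.

```python
import pandas as pd

# Hypothetical data with gaps.
df = pd.DataFrame({
    "age": [34, None, 29, 41],
    "income": [52000, 61000, None, 58000],
    "notes": [None, None, None, "ok"],
})

# Share of missing values in each column.
missing_share = df.isna().mean()

# Drop columns that are mostly empty (assumed threshold: more than 50% missing).
sparse_columns = missing_share[missing_share > 0.5].index
trimmed = df.drop(columns=sparse_columns)

# Drop the remaining rows that still contain a missing value.
complete_rows = trimmed.dropna()
print(complete_rows)
```

Checking `missing_share` first makes the choice between dropping rows and dropping a column explicit rather than a guess.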

4
Q

List 6 built-in Python data types

A

int, float, str, tuple, dict, bool
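A quick way to confirm the type names is to ask Python directly. The sample values below are arbitrary.

```python
# One literal of each of the six types listed above.
samples = [42, 3.14, "text", (1, 2), {"key": "value"}, True]

# type(...).__name__ gives the name Python itself uses for each type.
type_names = [type(value).__name__ for value in samples]
print(type_names)
```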

5
Q

Which Python data types are mutable?

A

Lists, sets, and dictionaries (bytearray is also mutable)
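A small illustration of what mutability means in practice. The variable names and values are made up for the example.

```python
# Mutable: in-place changes are visible through every reference to the object.
scores = [90, 85]
alias = scores          # same list object, not a copy
alias.append(70)

tags = {"sql"}
tags.add("python")

inventory = {"apples": 3}
inventory["pears"] = 5

# Immutable: a tuple rejects item assignment with a TypeError.
point = (1, 2)
try:
    point[0] = 9
    tuple_changed = True
except TypeError:
    tuple_changed = False
```

Because `alias` and `scores` name the same list, the appended value shows up under both names.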

6
Q

How would you approach a dataset with outliers?

A

I would analyze the data to determine whether the outliers are due to data entry errors or are true variations in the data. For example, if I were given the incomes of households in a neighborhood and one household's annual income was significantly lower than the mean, I would have to determine whether this was a data entry error or whether the household in question is a single-income household in a neighborhood where the majority of households are dual-income.
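Before judging whether an outlier is an error or a true variation, it first has to be found. One common heuristic, not mentioned in the answer above and used here only as an assumption, is the 1.5 × IQR rule. The incomes are hypothetical.

```python
from statistics import quantiles

# Hypothetical annual household incomes for one neighborhood.
incomes = [21000, 68000, 72000, 75000, 79000, 81000, 84000, 90000]

# Quartiles, then the conventional 1.5 * IQR fences.
q1, _median, q3 = quantiles(incomes, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Flag values for manual review instead of deleting them outright.
flagged = [x for x in incomes if x < lower_fence or x > upper_fence]
print(flagged)
```

The flagged values are only candidates. Whether each one is a typo or a genuine single-income household still needs the kind of investigation described above.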

7
Q

Describe a time when you used data visualization to communicate a complex idea. What tools did you use?

A

I used Tableau and Seaborn to plot GDP per capita against emissions per capita by country, to show which countries manage to be both relatively developed and eco-friendly and which do not. I also investigated whether countries with high emissions bear the brunt of the consequences of climate change, using metrics such as temperature increase and the frequency of natural disasters.

8
Q

Bar graph vs Histogram

A

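Bar graph: Compares a value (such as a count) across discrete categories; the bars are separated and can be reordered
Histogram: Shows the distribution of continuous data by grouping values into adjacent bins and plotting the frequency of each bin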
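The difference shows up in how the bar heights are computed, even before anything is drawn. A plain-Python sketch with made-up data and an assumed bin width of 10:

```python
from collections import Counter

# Bar graph input: one count per category.
colours = ["red", "blue", "red", "green", "red", "blue"]
bar_heights = Counter(colours)

# Histogram input: continuous values are first grouped into bins.
heights_cm = [151.2, 158.9, 162.4, 165.0, 171.3, 178.8]
bin_width = 10  # assumed bin width
histogram = Counter(int(h // bin_width) * bin_width for h in heights_cm)

print(dict(bar_heights))
print(dict(histogram))   # keys are the lower edge of each bin
```

Changing `bin_width` changes the shape of the histogram, whereas the categories of a bar graph are fixed by the data.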

9
Q

Correlation vs Causation

A

Correlation: A numerical value between -1 and 1 that indicates the strength and direction of the linear relationship between two variables

Causation: A relationship in which a change in one variable is directly responsible for a change in the other
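As a sketch, the Pearson correlation coefficient can be computed by hand in a few lines. The helper `pearson_r` and the study-hours data are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)

# Hypothetical data: hours studied and exam score for five students.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 61, 70, 74]

r = pearson_r(hours_studied, exam_score)
print(round(r, 3))
```

A value of `r` close to 1 says the two variables move together. It does not by itself show that studying caused the higher scores, since a third factor could drive both.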

10
Q

Linear regression & Assumptions

A

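A statistical method used to model the relationship between a dependent variable and one or more independent variables.

The assumptions:
The relationship between the dependent and independent variables is linear
Observations are independent
Residuals have constant variance at every level of the independent variables (homoscedasticity)
Residuals are approximately normally distributed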
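A minimal sketch of simple (one-variable) linear regression by ordinary least squares, with the residuals that the assumptions refer to. The helper `fit_line` and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(xs, ys)

# Residuals: observed value minus fitted value for each point.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(slope, intercept)
```

Plotting `residuals` against `xs` is a common informal check: a curved pattern suggests the linearity assumption fails, and a funnel shape suggests the variance is not constant.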

11
Q

How do you prioritize your tasks when working on multiple projects with tight deadlines?

A

I use a priority matrix (the Eisenhower matrix), which sorts tasks by urgency and importance, and I tackle the tasks that are both urgent and important first.

12
Q

Give an example of how you explained a technical concept to a non-technical stakeholder

A

Simplicity
Relevance
Visual aids
Engagement

13
Q

Why do you want to work at RBC?

A

RBC is the biggest bank in Canada, so I would have the pleasure of working with some of the most brilliant minds in the country, particularly in the field of data science, which would open me up to an exceptional level of growth.

14
Q

Where do you see yourself in five years in the field of data analysis?

A

With the latest version of ChatGPT released to the public, it is abundantly clear that just being a data analyst is not enough. In five years, I therefore see myself playing a pivotal role in, perhaps, launching an AI banking assistant for RBC.
