Data Analyst Interview Prep Flashcards

data management techniques, SQL query optimization, and strategies for handling missing values and outliers. It covers data visualization using tools like Power BI and Tableau, as well as statistical analysis concepts such as correlation vs. causation and hypothesis testing. Additionally, the deck delves into MicroStrategy-specific knowledge, addressing security filters, report migration to cloud environments, and attribute roles.

1
Q

How would you find duplicate records in a table?

A

Use a SQL query that groups by the key fields and counts the occurrences. For example:
SELECT column_name, COUNT() FROM table_name GROUP BY column_name HAVING COUNT() > 1;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Write a query to join two tables and filter results based on a specific condition.

Imagine you are tasked with combining the customer lists from two different branches of an alarm system company. How would you write a query to merge these lists and filter the results to only include customers with a premium alarm package?

A

SELECT a.column1, b.column2 FROM table_a a INNER JOIN table_b b ON a.id = b.id WHERE a.condition = ‘specific_condition’;

To merge the customer lists from two branches and filter for those with a premium alarm package, you could use a SQL query like this:

SELECT b1.customer_name, b2.branch_location
FROM branch1 b1
INNER JOIN branch2 b2 ON b1.customer_id = b2.customer_id
WHERE b1.package_type = ‘Premium’;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Explain the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN, and how you would use these in SQL queries.

A

INNER JOIN returns only matching rows; LEFT JOIN returns all from left table and matched from right; RIGHT JOIN returns all from right and matched from left.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

How do you optimize a slow-running query?

A

Analyze the execution plan, create indexes, avoid SELECT *, and limit dataset with WHERE conditions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What steps would you take to clean a dataset with missing values?

A

Identify missing values, decide on a strategy (imputation or removal), and document the process.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

How do you handle outliers in your data?

A

Use statistical methods to identify outliers and decide to remove, transform, or investigate.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Explain the process of normalizing and standardizing data.

A

Normalization rescales values to a range, while standardization rescales data to have mean 0 and std dev 1.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

How do you use pivot tables for data analysis?

A

Pivot tables summarize data dynamically, allowing users to rearrange fields for quick comparisons.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

How do you create and use calculated fields to derive new insights from existing data?

A

Create calculated fields in Microstrategy to derive metrics using existing fields.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Describe how you would create interactive dashboards in Power BI/Tableau.

A

Connect the data source, create visualizations, and use filters and slicers for interaction.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Explain the difference between correlation and causation.

A

Correlation indicates a relationship; causation implies one variable influences another.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

How would you test a hypothesis using data?

A

Formulate null and alternative hypotheses, select a statistical test, and analyze the data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is the Central Limit Theorem, and why is it important?

A

The Central Limit Theorem states sample means approach a normal distribution as sample size increases, crucial for inferential statistics.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

How would you approach analyzing sales data to identify trends and patterns?

A

Clean and prepare data, use visualizations to spot trends, and apply time series analysis for forecasting.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Imagine a scenario where a company’s profits are declining. What data would you look at, and how would you analyze it to find the root cause?

A

Analyze sales, customer feedback, and market trends; use regression analysis to find significant factors.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is the difference between a Security filter and a View filter?

A

A Security filter restricts data access; a View filter restricts data displayed without affecting security.

17
Q

How can we migrate MicroStrategy Reports and Dashboards from on-premise to cloud (AWS, GCP)?

A

Use migration tools to export and import reports into the cloud environment, ensuring compatibility.

18
Q

Define Attribute role with a use case.

A

Attribute roles define how attributes relate to facts (e.g., a customer may have roles in sales and marketing).

19
Q

How do you change the data type in MicroStrategy?

A

In MicroStrategy Desktop, right-click on the attribute or metric and change the data type in properties.

20
Q

What are the types of Transformation, and where can we use them? (Provide a case scenario.)

A

Types include Aggregation, Derivation, and Filtering. Use case: Transforming sales data to show the top 10 products.

21
Q

How is one fact table shown in SQL Query in MicroStrategy?

A

A fact table appears as a joined table in SQL queries, aggregating data over dimensions.