Ace_the_Data_Science_Interview Flashcards by Austin Whaley

Two teams play a series of games (best of 7 - whoever wins 4 games first) in which each team has a 50% chance of winning any given round (no draws allowed). What is the probability that the series goes to 7 games?

5.1

Say you roll a die three times. What is the probability of getting two sixes in a row?

5.2

You roll three dice, one after another. What is the probability that you obtain three numbers in a strictly increasing order?

5.3

Assume you have a deck of 100 cards with values ranging from 1 to 100, and that you draw two cards at random without replacement. What is the probability that the number of one card is precisely double that of the other?

5.4

Imagine you are in a 3D space. From (0,0,0) to (3,3,3), how many paths are there if you can move only up, right, and forward?

5.5

One in a thousand people have a particular disease, and the test for the disease is 98% correct in testing for the disease. On the other hand, the test has a 1% error rate if the person being tested does not have the disease. If someone tests positive, what are the odds they have the disease?

5.6

Assume two coins, one fair and the other unfair (both sides having tails). You pick one at random, flip it five times, and observe that it comes up as tails all five times. What is the probability that you are flipping the unfair coin?

5.7

Players A and B are playing a game where they take turns flipping a biased coin, with probability p of landing on heads (and winning). Player A starts the game, and then the players pass the coin back and forth until one person flips heads and wins. What is the probability that A wins?

5.8

Three friends in Seattle each told you it is rainy, and each person has a 1/3 probability of lying. What is the probability that Seattle is rainy, assuming that the likelihood of rain on any given day is 0.25?

5.9

You draw a circle and choose two chords at random. What is the probability that those chords will intersect?

5.10

You and your friend are playing a game. The two of you will continue to toss a coin until the sequence HH or TH shows up. If HH shows up first, you win. If TH shows up first, your friend wins. What is the probability of you winning?

5.11

Say you are playing a game where you roll a 6-sided die up to two times and can choose to stop following the first roll if you wish. You will receive a dollar amount equal to the final amount rolled. How much are you willing to pay to play this game?

5.12

Facebook has a content team that labels pieces of content on the platform as either spam or not spam. 90% are good raters and will mark 20% of the content as spam and 80% as not spam. The remaining 10% of raters are bad raters and will mark 0% of the content as spam and 100% as not-spam. Assume the pieces of content are labeled independently of one another, for every rater. Given that a rater has labeled four pieces of content as good, what is the probability that this rater is a good rater?

5.13

A couple has two children. You discover that one of their children is a boy. What is the probability that the second child is also a boy?

5.14

A desk has 8 drawers. There is a probability of 1/2 that someone placed a letter in one of the desk’s 8 drawers and a probability of 1/2 that this person did not place a letter in any of the 8 drawers. You open the first 7 drawers and do not find a letter. What is the probability that the 8th drawer has a letter in it?

5.15

Two players are playing a tennis match and are at deuce (they will play back and forth until one person has scored two more points than the other). The first player has a 60% chance of winning every point, and the second player has a 40% chance of winning every point. What is the probability that the first player wins the match?

5.16

Say you have a deck of 50 cards made up of 5 different colors, with 10 cards of each color, numbered 1 through 10. What is the probability that two cards you pick at random do not have the same color and are also not the same number?

5.17

Suppose you have ten fair dice. If you randomly throw these dice simultaneously, what is the probability that the sum of all the top faces is divisible by 6?

5.18

Player A and B play the following game: a number k from 1-6 is chosen, and player A and player B will toss a die until the first person throws a die showing side k, after which that person is awarded $100 and the game is over. How much is player A willing to pay to play first in this game?

5.19

You are given an unfair coin having an unknown bias towards heads or tails. How can you generate fair odds using this coin?

5.20

Suppose you are given a white cube that is broken into 3 x 3 x 3 = 27 pieces. However, before the cube was broken, all 6 of its faces were painted green. You randomly pick a small cube and see that 5 faces are white. What is the probability that the bottom face is also white?

5.21

Assume you take a stick of length 1 and you break it uniformly at random into three parts. What is the probability that the three pieces can be used to form a triangle?

5.22

What is the probability that, in a random sequence of H’s and T’s, HHT shows up before HTT?

5.23

A fair coin is tossed twice, and you are asked to decide whether it is more likely that two heads showed up given that either (a) at least one toss was heads or (b) the second toss was heads. Does your answer change if you are told that the coin is unfair?

5.24

Three ants are sitting at the corners of an equilateral triangle. Each ant randomly picks a direction and begins moving along an edge of the triangle. What is the probability that none of the ants meet? What would your answer be if there are, instead, k ants sitting on all k corners of an equilateral polygon?

5.25

A biased coin, with probability p of landing on heads, is tossed n times. Write a recurrence relation for the probability that the total number of heads after n tosses is even.

5.26

Alice and Bob are playing a game together. They play a series of rounds until one of them wins two more rounds than the other. Alice wins a round with probability p. What is the probability that Bob wins the overall series?

5.27

Say you have three draws of a uniformly distributed random variable between (0,2). What is the probability that the median of the three is greater than 1.5?

5.28

Say you have 150 friends, and 3 of them have phone numbers that have the last 4 digits with some permutation of the digits 0, 1, 4, 9. What's the probability of this occurring?

5.29

A fair die is rolled n times. What is the probability that the largest number rolled is r, for each r in 1,...,6?

5.30

Say you have a jar initially containing a single amoeba. Once every minute, the amoeba does one of four things, each with probability 1/4: (1) dying out, (2) doing nothing, (3) splitting into two amoebas, or (4) splitting into three amoebas. What is the probability that the jar will eventually contain no living amoeba?

5.31

A fair coin is tossed n times. Given that there are k heads in the n tosses, what is the probability that the first toss was heads?

5.32

You have N i.i.d. draws of numbers following a normal distribution with parameters mu and sigma. What is the probability that k of those draws are larger than some value Y?

5.33

You pick three random points on a unit circle and form a triangle from them. What is the probability that the triangle includes the center of the unit circle?

5.34

You have r red balls and w white balls in a bag. You continue to draw balls from the bag until the bag only contains balls of one color. What is the probability that you run out of white balls first?

5.35

Explain the Central Limit Theorem. Why is it useful?

6.1 Given a starting distribution with a finite mean and variance (it does not need to be normal), we can take repeated random samples from it and calculate their means (sample means). As the size of each sample grows, the distribution of those sample means approaches a normal distribution centered on the mean of the starting distribution, with a standard deviation equal to the starting standard deviation divided by the square root of the sample size. This is useful because many statistical tests and confidence intervals rely on the sample mean being approximately normal, so the CLT lets us apply them even when the underlying data are not normally distributed. A common rule of thumb is that each sample should contain AT LEAST 30 observations for the normal approximation to be reasonable; heavily skewed distributions may need more.
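
The explanation above can be checked with a small simulation. The following is a minimal sketch, not part of the original card: it assumes an exponential starting distribution with rate 1 (chosen only because it is clearly non-normal), and the function names, sample size of 30, and number of samples are illustrative choices.

```python
import random
import statistics

def sample_means(draw, sample_size, num_samples, seed=0):
    """Return num_samples means, each computed over sample_size draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean([draw(rng) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]

def draw_exponential(rng):
    # Skewed starting distribution (assumption): exponential with rate 1,
    # which has mean 1 and standard deviation 1.
    return rng.expovariate(1.0)

means = sample_means(draw_exponential, sample_size=30, num_samples=5000)
center = statistics.fmean(means)
spread = statistics.stdev(means)

# The CLT predicts a center near 1.0 and a spread near 1 / sqrt(30).
print(f"mean of sample means:    {center:.3f}")
print(f"std dev of sample means: {spread:.3f}")
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the individual draws are strongly right-skewed.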

How would you explain a confidence interval to a non-technical audience?

6.2 A confidence interval is a range of values, with a lower and upper bound, built from a sample so that it is likely to contain the true value of the parameter of interest. "95% confidence" means that if you repeated the sampling many times and built an interval each time, about 95% of those intervals would contain the true value. For example, if we are asked for the average height of everyone in the United States (the population), we cannot measure everyone. So, instead, we randomly sample N people (the sample) and use their average (the sample average) as a substitute for the population average. Note: this sample average changes from sample to sample by some amount. The sample average is rarely EXACTLY the true population average, so we say that we are a certain amount confident that the true population average lies within a range around our sample average. The width of this range is determined by how confident we want to be, how large our sample is, and how much the individual measurements vary.
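
The "95% of the time" interpretation above can be illustrated by simulation. This is a rough sketch under stated assumptions, not part of the original card: the population is taken to be normal with a made-up mean of 170 cm and standard deviation of 10 cm, and the interval uses the normal critical value 1.96 with the sample standard deviation, which is only an approximation for a sample of 50.

```python
import math
import random
import statistics

TRUE_MEAN = 170.0  # hypothetical population mean height (cm)
TRUE_SD = 10.0     # hypothetical population standard deviation (cm)
Z_95 = 1.96        # normal critical value for a 95% interval

def confidence_interval(sample):
    """Approximate 95% confidence interval for the mean of the sample."""
    mean = statistics.fmean(sample)
    half_width = Z_95 * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - half_width, mean + half_width

def coverage(num_trials=2000, sample_size=50, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_trials):
        sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(sample_size)]
        low, high = confidence_interval(sample)
        hits += low <= TRUE_MEAN <= high
    return hits / num_trials

rate = coverage()
print(f"fraction of intervals containing the true mean: {rate:.3f}")
```

Each simulated interval either contains the true mean or it does not; the 95% figure describes how often the procedure succeeds across repeated samples, which is why the printed fraction should land close to 0.95.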

What are some common pitfalls encountered in A/B testing?

6.3

Explain both covariance and correlation formulaically and compare and contrast them.

6.4

Say you flip a coin 10 times and observe only one heads. What would be your null hypothesis and p-value for testing whether the coin is fair or not?

6.5

Describe hypothesis testing and p-values in layman's terms.

6.6

Describe what Type I and Type II errors are, and the trade-offs between them.

6.7

Explain the statistical background behind power.

6.8

What is a Z-Test? When would you use it versus a t-test?

6.9

Say you are testing hundreds of hypotheses, each with a t-test. What considerations would you take into account when doing this?

6.10

How would you derive a confidence interval for the probability of flipping heads from a series of coin tosses?

6.11

What is the expected number of coin flips needed to get two consecutive heads?

6.12

What is the expected number of rolls needed to see all six sides of a fair die?

6.13

Say you're rolling a fair six-sided die. What is the expected number of rolls until you roll two consecutive 5s?

6.14

A coin was flipped 1,000 times, and 550 times it showed heads. Do you think the coin is biased? Why or why not?

6.15

You are drawing from a normally distributed random variable X~N(0,1) once a day. What is the approximate expected number of days until you get a value greater than 2?

6.16

Say you have two random variables X and Y, each with a standard deviation. What is the variance of aX + bY for constants a and b?

6.17

Say we have X~Uniform(0,1) and Y~Uniform(0,1) and two are independent. What is the expected value of the minimum of X and Y?

6.18

Say you have an unfair coin which lands on heads 60% of the time. How many coin flips are needed to detect that the coin is unfair?

6.19

Say you have n numbers 1...n, and you uniformly sample from this distribution with replacement n times. What is the expected number of distinct values you would draw?

6.20

There are 100 noodles in a bowl. At each step, you randomly select two noodle ends from the bowl and tie them together. What is the expectation on the number of loops formed?

6.21

What is the expected value of the max of two dice rolls?

6.22

Derive the mean and variance of the uniform distribution U(a,b).

6.23

How many cards would you expect to draw from a standard deck before seeing the first ace?

6.24

Say you draw n samples from a uniform distribution U(a,b). What are the MLE estimates of a and b?

6.25

Assume you are drawing from an infinite set of i.i.d random variables that are uniformly distributed from (0,1). You keep drawing as long as the sequence you are getting is monotonically increasing. What is the expected length of the sequence you draw?

6.26

There are two games involving dice that you can play. In the first game, you roll two dice at once and receive a dollar amount equivalent to the product of the rolls. In the second game, you roll one die and get the dollar amount equivalent to the square of the value. Which has the higher expected value and why?

6.27

What does it mean for an estimator to be unbiased? What about consistent? Give examples of an unbiased but not consistent estimator, and biased but consistent estimator.

6.28

What are MLE and MAP? What is the difference between the two?

6.29

Say you are given a random Bernoulli trial generator. How would you generate values from a standard normal distribution?

6.30

Derive the expectation for a geometric random variable.

6.31

Say you have a random variable X~D, where D is an arbitrary distribution. What is the distribution F(X) where F is the CDF of X?

6.32

Describe what a moment generating function (MGF) is. Derive the MGF for a normally distributed random variable X.

6.33

Say you have N independent and identically distributed draws of an exponential random variable. What is the best estimator for the parameter lambda?

6.34

Assume that log X~N(0,1). What is the expectation of X?

6.35

Say you have two distinct subsets of a dataset for which you know their means and standard deviations. How do you calculate the blended mean and standard deviation of the total dataset? Can you extend it to K subsets?

6.36

Say we have two random variables X and Y. What does it mean for X and Y to be independent? What about uncorrelated? Give an example where X and Y are uncorrelated but not independent.

6.37

Say we have X~Uniform(-1,1) and Y=X^2. What is the covariance of X and Y?

6.38

How do you uniformly sample points at random from a circle with radius R?

6.39

Say you continually sample from some i.i.d. uniformly distributed (0,1) random variables until the sum of the variables exceeds 1. How many samples do you expect to make?

6.40

Say you are building a binary classifier for an unbalanced dataset. How do you handle this situation?

7.1

What are some differences you would expect in a model that minimizes squared error versus a model that minimizes absolute error? In which cases would each error metric be appropriate?

7.2

When performing K-means clustering, how do you choose k?

7.3

How can you make your models more robust to outliers?

7.4

Say that you are running a multiple linear regression and that you have reason to believe that several of the predictors are correlated. How will the results of the regression be affected? How would you deal with this problem?

7.5

Describe the motivation behind random forests. What are two ways in which they improve upon individual decision trees?

7.6

Given a large dataset of payment transactions, say we want to predict the likelihood of a given transaction being fraudulent. However, there are many rows with missing values for various columns. How would you deal with this?

7.7

Say you are running a simple logistic regression to solve a problem but find the results to be unsatisfactory. What are some ways you might improve your model, or what other models might you look into using instead?

7.8

Say you are running a linear regression for a dataset but you accidentally duplicated every data point. What happens to your beta coefficient?

7.9

Compare and contrast gradient boosting and random forests.

7.10

Say DoorDash is launching in Singapore and, for this new market, we want to predict the ETA for a delivery after an order is placed. From an earlier beta test in Singapore, there were 10,000 deliveries made. Do we have enough training data to create an accurate ETA model?

7.11

Say we are running a binary classification loan model and rejected applicants must be supplied with a reason why they are rejected. Without digging into the weights of features, how would you supply these reasons?

7.12

Say you are given a very large corpus of words - how would you identify synonyms?

7.13

What is the bias-variance trade-off? How is it expressed using an equation?

7.14

Define the cross-validation process. What is the motivation behind using it?

7.15

How would you build a lead scoring algorithm to predict whether a prospective company is likely to convert into being an enterprise customer?

7.16

How would you approach creating a music recommendation algorithm?

7.17

Define what it means for a function to be convex. Give an example of a machine learning algorithm that is not convex and describe why that is so.

7.18

Explain what information gain and entropy are in the context of a decision tree and walk through a numerical example.

7.19

What are L1 and L2 regularization? What are the differences between the two?

7.20

Describe gradient descent and the motivation behind stochastic gradient descent.

7.21

Assume we have a classifier that produces a score between 0 and 1 for the probability of a particular loan application being fraudulent. Say that for each application's score, we take the square root of that score. How would the ROC curve change? If it doesn't change, what kinds of functions would change the curve?

7.22

Say X is a univariate Gaussian random variable. What is the entropy of X?

7.23

How would you build a model to calculate a customer's propensity to buy a particular item? What are some pros and cons of your approach?

7.24

Compare and contrast Gaussian Naive Bayes (GNB) and logistic regression. When would you use one over the other?

7.25

What loss function is used in k-means clustering given k clusters and n sample points? Compute the update formula using (1) batch gradient descent and (2) stochastic gradient descent for a cluster mean for cluster k using a learning rate epsilon.

7.26

Describe the kernel trick in SVMs and give a simple example. How do you decide what kernel to choose?

7.27

Say we have N observations for some variable which we model as being drawn from a Gaussian distribution. What are your best guesses for the parameters of the distribution?

7.28

Say we are using a Gaussian mixture model (GMM) for anomaly detection of fraudulent transactions to classify incoming transactions into K classes. Describe the model setup formulaically and how to evaluate the posterior probabilities and log likelihood. How can we determine if a new transaction should be deemed fraudulent?

7.29

Walk me through how you'd build a model to predict whether a particular user will churn.

7.30

Suppose you are running a linear regression and model the error terms as being normally distributed. Show that in this setup, maximizing the likelihood of the data is equivalent to minimizing the sum of the squared residuals.

7.31

Describe the idea behind Principal Component Analysis (PCA) and describe its formulation and derivation in matrix form. Next, go through the procedural description and solve the constrained maximization.

7.32

Describe the model formulation behind logistic regression. How do you maximize the log-likelihood of a given model (using the two-class case)?

7.33

How would you approach creating a music recommendation algorithm for Discover Weekly (a 30-song weekly playlist personalized to an individual user)?

7.34

Derive the variance-covariance matrix of the least squares parameter estimates in matrix form.

7.35

Ace_the_Data_Science_Interview Flashcards

(110 cards)