Ace_the_Data_Science_Interview Flashcards
Two teams play a series of games (best of 7 - whoever wins 4 games first) in which each team has a 50% chance of winning any given round (no draws allowed). What is the probability that the series goes to 7 games?
5.1
Say you roll a die three times. What is the probability of getting two sixes in a row?
5.2
You roll three dice, one after another. What is the probability that you obtain three numbers in a strictly increasing order?
5.3
Assume you have a deck of 100 cards with values ranging from 1 to 100, and that you draw two cards at random without replacement. What is the probability that the number of one card is precisely double that of the other?
5.4
Imagine you are in a 3D space. From (0,0,0) to (3,3,3), how many paths are there if you can move only up, right, and forward?
5.5
One in a thousand people have a particular disease, and the test for the disease is 98% correct in testing for the disease. On the other hand, the test has a 1% error rate if the person being tested does not have the disease. If someone tests positive, what are the odds they have the disease?
5.6
Assume two coins, one fair and the other unfair (both sides having tails). You pick one at random, flip it five times, and observe that it comes up as tails all five times. What is the probability that you are flipping the unfair coin?
5.7
Players A and B are playing a game where they take turns flipping a biased coin, with probability p of landing on heads (and winning). Player A starts the game, and then the players pass the coin back and forth until one person flips heads and wins. What is the probability that A wins?
5.8
Three friends in Seattle each told you it is rainy, and each person has a 1/3 probability of lying. What is the probability that Seattle is rainy, assuming that the likelihood of rain on any given day is 0.25?
5.9
You draw a circle and choose two chords at random. What is the probability that those chords will intersect?
5.10
You and your friend are playing a game. The two of you will continue to toss a coin until the sequence HH or TH shows up. If HH shows up first, you win. If TH shows up first, your friend wins. What is the probability of you winning?
5.11
Say you are playing a game where you roll a 6-sided die up to two times and can choose to stop following the first roll if you wish. You will receive a dollar amount equal to the final amount rolled. How much are you willing to pay to play this game?
5.12
Facebook has a content team that labels pieces of content on the platform as either spam or not spam. 90% are good raters and will mark 20% of the content as spam and 80% as not spam. The remaining 10% of raters are bad raters and will mark 0% of the content as spam and 100% as not-spam. Assume the pieces of content are labeled independently of one another, for every rater. Given that a rater has labeled four pieces of content as good, what is the probability that this rater is a good rater?
5.13
A couple has two children. You discover that one of their children is a boy. What is the probability that the second child is also a boy?
5.14
A desk has 8 drawers. There is a probability of 1/2 that someone placed a letter in one of the desk’s 8 drawers and a probability of 1/2 that this person did not place a letter in any of the 8 drawers. You open the first 7 drawers and do not find a letter. What is the probability that the 8th drawer has a letter in it?
5.15
Two players are playing a tennis match and are at deuce (they will play back and forth until one person has scored two more points than the other). The first player has a 60% chance of winning every point, and the second player has a 40% chance of winning every point. What is the probability that the first player wins the match?
5.16
Say you have a deck of 50 cards made up of 5 different colors, with 10 cards of each color, numbered 1 through 10. What is the probability that two cards you pick at random do not have the same color and are also not the same number?
5.17
Suppose you have ten fair dice. If you randomly throw these dice simultaneously, what is the probability that the sum of all the top faces is divisible by 6?
5.18
Player A and B play the following game: a number k from 1-6 is chosen, and player A and player B will toss a die until the first person throws a die showing side k, after which that person is awarded $100 and the game is over. How much is player A willing to pay to play first in this game?
5.19
You are given an unfair coin having an unknown bias towards heads or tails. How can you generate fair odds using this coin?
5.20
Suppose you are given a white cube that is broken into 3 x 3 x 3 = 27 pieces. However, before the cube was broken, all 6 of its faces were painted green. You randomly pick a small cube and see that 5 faces are white. What is the probability that the bottom face is also white?
5.21
Assume you take a stick of length 1 and you break it uniformly at random into three parts. What is the probability that the three pieces can be used to form a triangle?
5.22
What is the probability that, in a random sequence of H’s and T’s, HHT shows up before HTT?
5.23
A fair coin is tossed twice, and you are asked to decide whether it is more likely that two heads showed up given that either (a) at least one toss was heads or (b) the second toss was a head. Does your answer change if you are told that the coin is unfair?
5.24
Three ants are sitting at the corners of an equilateral triangle. Each ant randomly picks a direction and begins moving along an edge of the triangle. What is the probability that none of the ants meet? What would your answer be if there are, instead, k ants sitting on all k corners of an equilateral polygon?
5.25
A biased coin, with probability p of landing on heads, is tossed n times. Write a recurrence relation for the probability that the total number of heads after n tosses is even.
5.26
Alice and Bob are playing a game together. They play a series of rounds until one of them wins two more rounds than the other. Alice wins a round with probability p. What is the probability that Bob wins the overall series?
5.27
Say you have three draws of a uniformly distributed random variable between (0,2). What is the probability that the median of the three is greater than 1.5?
5.28
Say you have 150 friends, and 3 of them have phone numbers that have the last 4 digits with some permutation of the digits 0, 1, 4, 9. What’s the probability of this occurring?
5.29
A fair die is rolled n times. What is the probability that the largest number rolled is r, for each r in 1,…,6?
5.30
Say you have a jar initially containing a single amoeba in it. Once every minute, the amoeba has a 1 in 4 chance of doing one of four things:
(1) dying out
(2) doing nothing
(3) splitting into two amoebas, or
(4) splitting into three amoebas
What is the probability that the jar will eventually contain no living amoeba?
5.31
A fair coin is tossed n times. Given that there are k heads in the n tosses, what is the probability that the first toss was heads?
5.32
You have N i.i.d. draws of numbers following a normal distribution with parameters mu and sigma. What is the probability that k of those draws are larger than some value Y?
5.33
You pick three random points on a unit circle and form a triangle from them. What is the probability that the triangle includes the center of the unit circle?
5.34
You have r red balls and w white balls in a bag. You continue to draw balls from the bag until the bag only contains balls of one color. What is the probability that you run out of white balls first?
5.35
Explain the Central Limit Theorem. Why is it useful?
6.1
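Given almost ANY starting distribution (it needs a finite mean and variance), we can take repeated random samples of size n from it and calculate their means (sample means). As n grows, the distribution of those sample means becomes approximately normal, with a mean that matches that of the starting distribution and a standard deviation equal to the starting distribution's standard deviation divided by sqrt(n).
This is important because many statistical procedures (z-tests, t-tests, confidence intervals for a mean) rely on the sample mean being approximately normally distributed, so the CLT lets us use them even when the underlying data is not normal. It does not transform the data itself; it describes the behavior of the sample mean.
Rule of thumb is that each sample should contain AT LEAST 30 observations for the normal approximation to be reasonable. Heavily skewed distributions may need more.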
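The behavior described in this answer can be checked with a short simulation. The sketch below is only an illustration under assumed choices: the starting distribution is an exponential with mean 1 and standard deviation 1 (clearly not normal), each sample has 30 observations, and the helper names `sample_means` and `draw_exponential` are made up for this example.

```python
import math
import random
import statistics


def draw_exponential(rng):
    # Assumed starting distribution: exponential with mean 1 and sd 1.
    return rng.expovariate(1.0)


def sample_means(draw, sample_size, num_samples, seed=0):
    # Draw num_samples independent samples and return the mean of each one.
    rng = random.Random(seed)
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(num_samples)
    ]


sample_size = 30
means = sample_means(draw_exponential, sample_size, num_samples=5000)

center = statistics.fmean(means)   # should be near the population mean, 1
spread = statistics.stdev(means)   # should be near 1 / sqrt(30)
predicted_spread = 1.0 / math.sqrt(sample_size)

# For a normal distribution, about 95% of values fall within 1.96 sd of the mean.
within = sum(abs(m - 1.0) <= 1.96 * predicted_spread for m in means) / len(means)

print(center, spread, predicted_spread, within)
```

If the CLT approximation holds here, `center` lands near 1, `spread` lands near `predicted_spread`, and `within` lands near 0.95, even though individual exponential draws are strongly skewed.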
How would you explain a confidence interval to a non-technical audience?
6.2
A confidence interval is a range of values, with a lower and upper bound, computed from a sample. If you were to repeat the sampling a large number of times and build a 95% confidence interval from each sample, about 95% of those intervals would contain the true value of the parameter of interest.
If we are asked to get the average height of everyone in the United States (population), we are unable to measure everyone in the US. So, instead, we randomly sample N people (sample) and use their average (the sample average) as a substitute for the population average. Note: This sample average changes from sample to sample by some amount.
The sample average is rarely EXACTLY the true population average, so we say that we are a certain amount confident that the true population average lies within a range around our sample average. The width of this range is determined by how confident we want to be, how large our sample is, and how much the individual measurements vary.
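The "95% of the time" reading can be illustrated by simulation. The sketch below is a rough illustration, not a definitive method: it assumes a made-up population of heights that is normal with mean 170 and standard deviation 10, uses the normal critical value (about 1.96) rather than a t value, and the function names `confidence_interval` and `coverage` are hypothetical.

```python
import math
import random
import statistics


def confidence_interval(sample, confidence=0.95):
    # Normal-approximation interval for the mean: mean +/- z * standard error.
    n = len(sample)
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err


def coverage(true_mean, true_sd, sample_size, num_repeats, seed=0):
    # Fraction of repeated samples whose interval contains the true mean.
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_repeats):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(sample_size)]
        low, high = confidence_interval(sample)
        if low <= true_mean <= high:
            hits += 1
    return hits / num_repeats


# Assumed population: heights in cm, normal with mean 170 and sd 10.
rate = coverage(true_mean=170.0, true_sd=10.0, sample_size=50, num_repeats=2000)
print(rate)
```

The printed rate should sit close to 0.95. Each individual interval either contains 170 or it does not; the 95% describes how often the procedure succeeds across repeated samples.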
What are some common pitfalls encountered in A/B testing?
6.3
Explain both covariance and correlation formulaically and compare and contrast them.
6.4
Say you flip a coin 10 times and observe only one heads. What would be your null hypothesis and p-value for testing whether the coin is fair or not?
6.5
Describe hypothesis testing and p-values in layman’s terms.
6.6
Describe what Type I and Type II errors are, and the trade-offs between them.
6.7
Explain the statistical background behind power.
6.8
What is a Z-Test? When would you use it versus a t-test?
6.9