Non-coding fundamentals Flashcards
Frequentist probability
An interpretation of probability as describing how often a particular outcome would occur in an experiment if that experiment were repeated over and over.
Bayesian probability
An interpretation of probability as describing how likely an observer expects a particular outcome to be in the future, based on previous experience and expert knowledge.
Prior probability
Also called the prior, the probability based on previous experiences, according to the Bayesian approach
Randomness
An apparent lack of pattern or predictability in events
Random sampling
The process of sampling a subset of subjects at random, such that the sample is reflective of the greater population
Selection bias
Systematic differences between the sample and the population
Normal distribution (Gaussian)
A probability distribution in which most values cluster in the center of the range, with the rest tapering off symmetrically to the left and right.
Bernoulli distribution
A probability distribution describing a single trial with exactly two possible outcomes: success, with probability p, and failure, with probability 1 − p
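The Bernoulli distribution can be sketched with the standard library alone; the success probability p = 0.3 below is an illustrative value, not one from the text.

```python
import random

# A minimal sketch: each Bernoulli trial succeeds (1) with
# probability p and fails (0) otherwise.
random.seed(0)

p = 0.3  # illustrative success probability

def bernoulli_trial(p):
    """Return 1 (success) with probability p, else 0 (failure)."""
    return 1 if random.random() < p else 0

trials = [bernoulli_trial(p) for _ in range(10_000)]

# The long-run success rate should be close to p.
print(sum(trials) / len(trials))
```

Repeating many such trials and counting the successes gives the related binomial distribution.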
Gamma distribution
A probability distribution that represents the time until an event, when the event starts out unlikely, becomes more likely and then becomes less likely again
Conditional distribution
A distribution that indicates the probability that a randomly selected item in a subpopulation has a given characteristic
Poisson distribution
A probability distribution that represents the number of times that a given event will occur during a given time interval.
central limit theorem
The proposition that the sampling distribution of the sample means of any variable will be normal if the sample size is large enough
Null hypothesis
Hypothesis that proposes that no statistically significant difference exists between two specified populations
Alternative hypothesis
Hypothesis that proposes that a statistically significant difference does exist between specified populations
p-value
The probability of observing a sample statistic at least as extreme as the one that you have, assuming that the null hypothesis is true.
Confidence interval
A statistical range with a specified probability that a given parameter lies within the range
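A 95% confidence interval for a mean can be sketched with the normal approximation, mean ± 1.96 · standard error; the simulated data below are invented for illustration.

```python
import math
import random
import statistics

# Sketch of a 95% confidence interval for a mean via the normal
# approximation; the data are simulated for illustration only.
random.seed(1)
data = [random.gauss(mu=100, sigma=15) for _ in range(200)]

mean = statistics.fmean(data)
sem = statistics.stdev(data) / math.sqrt(len(data))  # standard error of the mean

lower, upper = mean - 1.96 * sem, mean + 1.96 * sem
print(f"95% CI: ({lower:.1f}, {upper:.1f})")
```

For small samples, a t-based multiplier would replace the 1.96 drawn from the normal distribution.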
t-statistic
The ratio of the departure of an estimated parameter from its hypothesized value to its standard error
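The one-sample t-statistic, t = (sample mean − hypothesized mean) / (s / √n), can be computed by hand; the sample values and the hypothesized mean of 50 below are made up for illustration.

```python
import math
import statistics

# Hand-rolled one-sample t-statistic:
# t = (sample mean - hypothesized mean) / (s / sqrt(n)).
sample = [52.1, 48.3, 55.0, 51.2, 49.8, 53.7, 50.5, 54.1]
mu0 = 50.0  # hypothesized population mean (illustrative)

n = len(sample)
mean = statistics.fmean(sample)
s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

t = (mean - mu0) / (s / math.sqrt(n))
print(f"t = {t:.3f}")
```

Comparing t against the t-distribution with n − 1 degrees of freedom yields the p-value defined above.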
Independent-samples t-test
A test that compares the means of two independent groups in order to determine whether there is statistical evidence that the associated population means are significantly different
Kurtosis
Measure of the sharpness of a distribution’s peak
Skewness
Measure of the degree of asymmetry of a distribution
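Both measures can be sketched from central moments: skewness = m₃ / m₂^1.5 and kurtosis = m₄ / m₂², where mₖ is the k-th central moment (a normal distribution has skewness 0 and kurtosis 3). The tiny samples below are invented for illustration.

```python
import statistics

# Moment-based sample skewness and kurtosis.
def central_moment(data, k):
    mean = statistics.fmean(data)
    return sum((x - mean) ** k for x in data) / len(data)

def skewness(data):
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def kurtosis(data):
    return central_moment(data, 4) / central_moment(data, 2) ** 2

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]  # long right tail

print(skewness(symmetric))     # 0 for a symmetric sample
print(skewness(right_skewed))  # positive for a right-skewed sample
```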
Lean startup methodology
An approach that aims to introduce research-driven techniques into the core of a company’s business model
Ceteris Paribus
Latin for “all other things held equal”; the assumption that every variable except the treatment variable is held equal between groups
Control group
The group of test subjects not exposed to some change, then compared with treated subjects in order to validate the results of the change
Control variable
a variable that the researcher controls (holds) constant during an experiment, so as to avoid biased results
Treatment group
The group to which some treatment is given
Simpson’s paradox
Also called the lurking variable problem, the phenomenon when an average over several groups shows one trend, but an average for each individual group shows the opposite trend or no trend.
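A numeric sketch makes the paradox concrete. The made-up success counts below mirror a classic textbook pattern: treatment A wins inside each subgroup, yet loses on the combined data because the subgroup sizes differ.

```python
# Simpson's paradox with invented counts: A beats B within every
# subgroup, but B beats A in the aggregate.
groups = {
    # subgroup: (A successes, A total, B successes, B total)
    "mild cases":   (81, 87, 234, 270),
    "severe cases": (192, 263, 55, 80),
}

total_a = total_a_n = total_b = total_b_n = 0
for name, (a, a_n, b, b_n) in groups.items():
    print(f"{name}: A {a / a_n:.0%} vs B {b / b_n:.0%}")  # A higher in both
    total_a += a; total_a_n += a_n
    total_b += b; total_b_n += b_n

# Aggregated, the trend reverses: the lurking variable is case severity.
print(f"overall: A {total_a / total_a_n:.0%} vs B {total_b / total_b_n:.0%}")
```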
Independent variable
The variable being changed or controlled to test the effect on the dependent variable
Dependent variable
The variable being tested and measured in a scientific test
Parametric test
A test that uses some known set of parameter estimates (such as mean and standard deviation) to represent the information in the data; this type of test works only on variables with mathematically understood distributions
Nonparametric test
A test that relies on estimates that represent certain pieces of information within a variable, but not the whole variable; this type of test is appropriate for variables that don’t conform to a distribution type
Shapiro-Wilk test
A nonparametric statistical test used to infer whether a variable's distribution is significantly different from a normal distribution
Wilcoxon signed-rank test
The nonparametric equivalent of the paired t-test, for samples that aren’t normally distributed; this test uses ranked values to test whether the difference in pairs follows a symmetric distribution around zero.
Paired t-test
A test used to compare two dependent (or paired) groups
Tukey’s honest significant differences
Also called Tukey’s HSD test, a parametric statistical test that compares all possible pairs of means and uses a variability estimate based on variability from all the groups combined
Kruskal-Wallis test
A nonparametric statistical test of whether samples originate from the same distribution
One-way analysis of variance
One-way ANOVA, a parametric statistical test used to determine whether the means of two or more samples are significantly different
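The ANOVA F-statistic, the ratio of between-group to within-group variance, can be computed from scratch with the standard library; the three samples below are invented for illustration.

```python
import statistics

# From-scratch one-way ANOVA F-statistic:
# F = (between-group variance) / (within-group variance).
def one_way_anova_f(*groups):
    k = len(groups)                  # number of groups
    n = sum(len(g) for g in groups)  # total observations
    grand_mean = statistics.fmean(x for g in groups for x in g)

    # Between-group sum of squares (k - 1 degrees of freedom).
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2
                     for g in groups)
    # Within-group sum of squares (n - k degrees of freedom).
    ss_within = sum((x - statistics.fmean(g)) ** 2
                    for g in groups for x in g)

    return (ss_between / (k - 1)) / (ss_within / (n - k))

f = one_way_anova_f([5.1, 4.9, 5.3], [6.2, 6.0, 6.4], [5.0, 5.2, 4.8])
print(f"F = {f:.2f}")
```

A large F indicates that the spread between group means is large relative to the spread within groups.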
Type I error
Also called a false positive, rejection of a true null hypothesis
Type II error
Also called a false negative, failure to reject a false null hypothesis