Terms and Definitions Flashcards

1
Q

A/B Test

A

A method of comparing two versions of a webpage, feature, or app against each other to determine which performs better.

2
Q

Null Hypothesis (H₀)

A

The hypothesis that there is no true difference between the control and test groups; any observed difference is attributed to chance.

3
Q

Alternative Hypothesis (H₁)

A

The hypothesis that there is a true difference between the control and test groups.

4
Q

P-Value

A

The probability of observing results at least as extreme as those measured, assuming the null hypothesis is true.
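As a minimal sketch of how a p-value might be computed for an A/B test on conversion rates, the block below uses a two-sided two-proportion z-test. The function name and the conversion counts are hypothetical, and the normal approximation assumes reasonably large samples.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for H0: both groups share the same conversion rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # shared rate assumed under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Probability of a |z| at least this large if H0 is true
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts: 200/2000 conversions in control, 250/2000 in test
p_value = two_proportion_p_value(200, 2000, 250, 2000)
reject_null = p_value < 0.05  # compare against a chosen threshold
```

A small p-value says the observed gap would be unusual if the null hypothesis were true; it is not the probability that the null hypothesis is true.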

5
Q

Significance Level (α)

A

The p-value threshold below which the null hypothesis is rejected (commonly set at 0.05); it is also the accepted probability of a Type I error.

6
Q

Confidence Interval (CI)

A

A range of values that is likely to contain the true effect size or metric with a given level of confidence (e.g., 95%).
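One common way to build such an interval for an A/B test is the normal-approximation (Wald) interval for the difference between two conversion rates. The sketch below assumes large samples; the function name and counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def diff_confidence_interval(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Wald interval for (rate_b - rate_a) using the normal approximation."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    diff = p_b - p_a
    return diff - z * se, diff + z * se


# Hypothetical counts: 200/2000 conversions in control, 250/2000 in test
low, high = diff_confidence_interval(200, 2000, 250, 2000)
```

If the interval excludes zero, the result agrees with rejecting the null hypothesis at the matching significance level.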

7
Q

Control Group

A

The group that does not receive the treatment or variant being tested.

8
Q

Test Group

A

The group that receives the treatment or variant being tested.

9
Q

Randomization

A

Assigning participants to groups in a way that each participant has an equal chance of being in any group.
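A minimal sketch of random assignment, assuming two equally weighted groups. The function name, group labels, and seed are illustrative; the seed only makes the example reproducible.

```python
import random


def assign_groups(user_ids, seed=42):
    """Give every user an equal, independent chance of landing in either group."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    return {uid: rng.choice(["control", "test"]) for uid in user_ids}


assignments = assign_groups(range(10_000))
control_count = sum(1 for g in assignments.values() if g == "control")
```

With enough users the two groups end up roughly equal in size, though not exactly equal.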

10
Q

Power Analysis

A

A calculation to determine the minimum sample size required to detect a given effect size with sufficient power.
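For a test comparing two conversion rates, a standard normal-approximation formula gives the sample size per group. The sketch below is one such calculation under that assumption; the function name and the example rates are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_test, alpha=0.05, power=0.80):
    """Approximate users needed per group to detect p_control vs p_test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)          # desired power (1 - beta)
    variance = p_control * (1 - p_control) + p_test * (1 - p_test)
    effect = p_test - p_control
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical goal: detect a lift from 10% to 12% conversion
n_per_group = sample_size_per_group(0.10, 0.12)
```

Smaller effect sizes or higher power targets push the required sample size up.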

11
Q

Effect Size

A

The magnitude of the difference between groups (e.g., a 5% increase in conversion rate).

12
Q

Type I Error

A

Rejecting the null hypothesis when it is actually true (false positive).

13
Q

Type II Error

A

Failing to reject the null hypothesis when it is false (false negative).

14
Q

Bonferroni Correction

A

A method for multiple comparisons that divides the significance level α by the number of tests, keeping the overall chance of at least one false positive at or below α.
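The adjustment can be sketched in a few lines: each test is compared against α divided by the number of tests. The function name and the p-values below are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test threshold and which tests remain significant."""
    threshold = alpha / len(p_values)  # stricter cutoff for each comparison
    return threshold, [p < threshold for p in p_values]


# Hypothetical p-values from three metrics tested in one experiment
threshold, significant = bonferroni([0.004, 0.020, 0.049])
# Only the first p-value falls below 0.05 / 3
```

All three p-values would pass an unadjusted 0.05 cutoff, which shows how the correction trades some power for fewer false positives.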

15
Q

Simpson’s Paradox

A

A trend appears in different groups of data but disappears or reverses when the groups are combined.
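The reversal is easiest to see with numbers. In the sketch below, the segment names and (conversions, visitors) counts are illustrative only: variant A has the higher rate in every segment, yet B has the higher rate once segments are combined, because most of B's traffic falls in the segment where conversion is high for both variants.

```python
# Illustrative (conversions, visitors) counts per segment
data = {
    "A": {"mobile": (81, 87), "desktop": (192, 263)},
    "B": {"mobile": (234, 270), "desktop": (55, 80)},
}


def rate(conversions, visitors):
    return conversions / visitors


def overall_rate(segments):
    """Conversion rate after pooling all segments together."""
    conversions = sum(c for c, _ in segments.values())
    visitors = sum(v for _, v in segments.values())
    return conversions / visitors


a_wins_each_segment = all(
    rate(*data["A"][s]) > rate(*data["B"][s]) for s in ("mobile", "desktop")
)
b_wins_overall = overall_rate(data["B"]) > overall_rate(data["A"])
# Both flags are True: the per-segment trend reverses in the pooled data
```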

16
Q

Descriptive Statistics

A

Summarizing and describing the features of a dataset (e.g., mean, median, mode).

17
Q

Inferential Statistics

A

Using a sample to make generalizations about a population (e.g., hypothesis testing, confidence intervals).

18
Q

Mean

A

The average value of a dataset.

19
Q

Median

A

The middle value in a dataset when ordered.

20
Q

Mode

A

The most frequently occurring value in a dataset.

21
Q

Variance

A

The average of the squared deviations of values from the mean, measuring how spread out a dataset is.

22
Q

Standard Deviation

A

The square root of the variance, representing data dispersion in the same units as the data.
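The five summary measures above (mean, median, mode, variance, standard deviation) can be computed with Python's standard `statistics` module. The dataset is made up for illustration, and the population forms (`pvariance`, `pstdev`) are used; the sample forms (`variance`, `stdev`) divide by n - 1 instead of n.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data

mean = statistics.mean(values)           # 40 / 8 = 5
median = statistics.median(values)       # middle of the sorted values: 4.5
mode = statistics.mode(values)           # most frequent value: 4
variance = statistics.pvariance(values)  # mean squared deviation: 32 / 8 = 4
std_dev = statistics.pstdev(values)      # square root of the variance: 2.0
```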

23
Q

Z-Test

A

A hypothesis test for comparing means when the population variance is known (or the sample is large enough for the sample variance to approximate it well).

24
Q

T-Test

A

A hypothesis test for comparing means when the population variance is unknown.

25
Q

ANOVA (Analysis of Variance)

A

A test to compare the means of three or more groups.

26
Q

Chi-Square Test

A

A test for relationships between categorical variables.

27
Q

Linear Regression

A

A method to model the relationship between a dependent variable and one or more independent variables.

28
Q

Logistic Regression

A

A regression model used when the dependent variable is categorical.

29
Q

Bayesian Statistics

A

An approach to statistics that incorporates prior beliefs or evidence.

30
Q

Frequentist Statistics

A

A traditional approach to statistics that interprets probability as the long-run frequency of an outcome over repeated sampling, without incorporating prior beliefs.

31
Q

SELECT

A

A SQL command used to retrieve data from a database.

32
Q

FROM

A

Specifies the table to retrieve data from.

33
Q

WHERE

A

Filters rows based on conditions.

34
Q

GROUP BY

A

Groups rows that share the same values in the specified columns so that aggregate functions can be applied to each group.

35
Q

HAVING

A

Filters grouped rows based on aggregated values.
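The clauses covered so far (SELECT, FROM, WHERE, GROUP BY, HAVING) can be seen working together in one query. The sketch below runs it against an in-memory SQLite database through Python's `sqlite3` module; the `orders` table and its rows are hypothetical, and ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ana", 50.0, "paid"),
        ("ana", 70.0, "paid"),
        ("ben", 20.0, "paid"),
        ("ben", 90.0, "refunded"),
        ("cy", 200.0, "paid"),
    ],
)

rows = conn.execute(
    """
    SELECT customer, SUM(amount) AS total  -- columns to retrieve
    FROM orders                            -- source table
    WHERE status = 'paid'                  -- row filter, applied before grouping
    GROUP BY customer                      -- one output row per customer
    HAVING SUM(amount) > 100               -- group filter, applied after aggregation
    ORDER BY customer
    """
).fetchall()
# rows == [("ana", 120.0), ("cy", 200.0)]
```

Note that WHERE removes the refunded order before the sums are taken, while HAVING drops the customer whose paid total is too small.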

36
Q

JOIN

A

Combines rows from two or more tables based on a related column.

37
Q

INNER JOIN

A

Returns rows with matching values in both tables.

38
Q

LEFT JOIN

A

Returns all rows from the left table and the matching rows from the right table, with NULLs where there is no match.

39
Q

RIGHT JOIN

A

Returns all rows from the right table and the matching rows from the left table, with NULLs where there is no match.

40
Q

OUTER JOIN

A

Returns all rows from both tables, with NULLs where no match exists (written FULL OUTER JOIN in standard SQL).
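The difference between join types shows up in which unmatched rows survive. The sketch below compares INNER JOIN and LEFT JOIN on two hypothetical tables in an in-memory SQLite database; RIGHT and FULL OUTER JOIN are left out because older SQLite versions (before 3.39) do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER, name TEXT);
    CREATE TABLE orders (user_id INTEGER, amount REAL);
    INSERT INTO users VALUES (1, 'ana'), (2, 'ben'), (3, 'cy');
    INSERT INTO orders VALUES (1, 50.0), (1, 70.0), (3, 200.0);
    """
)

# INNER JOIN: only users that have at least one matching order
inner_rows = conn.execute(
    "SELECT u.name, o.amount FROM users u "
    "INNER JOIN orders o ON o.user_id = u.id ORDER BY u.id, o.amount"
).fetchall()

# LEFT JOIN: every user; users without orders get NULL (None in Python)
left_rows = conn.execute(
    "SELECT u.name, o.amount FROM users u "
    "LEFT JOIN orders o ON o.user_id = u.id ORDER BY u.id, o.amount"
).fetchall()
# 'ben' has no orders, so he appears only in left_rows, with amount None
```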

41
Q

ORDER BY

A

Sorts the result set by specified columns.

42
Q

LIMIT

A

Restricts the number of rows returned in a query.

43
Q

Subquery

A

A query nested within another query.

44
Q

CTE (Common Table Expression)

A

A named temporary result set, defined with a WITH clause, that can be referenced within a single SQL query.
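A short sketch that combines a CTE with a subquery, ORDER BY, and LIMIT, again using Python's `sqlite3` module with a hypothetical `orders` table. The CTE `totals` is defined once and then referenced twice in the main query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (customer TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('ana', 50.0), ('ana', 70.0), ('ben', 20.0), ('cy', 200.0);
    """
)

top_customers = conn.execute(
    """
    WITH totals AS (                               -- CTE: named temporary result
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
    )
    SELECT customer, total
    FROM totals
    WHERE total > (SELECT AVG(total) FROM totals)  -- subquery
    ORDER BY total DESC                            -- largest totals first
    LIMIT 2                                        -- keep at most two rows
    """
).fetchall()
# top_customers == [("cy", 200.0), ("ana", 120.0)]
```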

45
Q

Pandas

A

A Python library for data manipulation and analysis.

46
Q

NumPy

A

A Python library for numerical computations.

47
Q

Matplotlib

A

A Python library for creating static visualizations.

48
Q

Seaborn

A

A Python library for statistical data visualization.

49
Q

scipy.stats

A

A module of the SciPy library for statistical functions and tests.

50
Q

Statsmodels

A

A Python module for statistical modeling and hypothesis testing.

51
Q

A/B Test Simulation

A

A process to mimic test results using random sampling or bootstrapping.
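A minimal sketch of the random-sampling variant, assuming each user converts independently with a fixed probability. The function name, rates, and sample sizes are hypothetical, and the seed only makes the run reproducible.

```python
import random


def simulate_ab_test(rate_a, rate_b, n_per_group, n_simulations=200, seed=0):
    """Fraction of simulated experiments in which B's conversions exceed A's."""
    rng = random.Random(seed)
    b_wins = 0
    for _ in range(n_simulations):
        # Each simulated user converts with the group's assumed true rate
        conv_a = sum(rng.random() < rate_a for _ in range(n_per_group))
        conv_b = sum(rng.random() < rate_b for _ in range(n_per_group))
        b_wins += conv_b > conv_a
    return b_wins / n_simulations


# Hypothetical true rates: 10% for A, 12% for B, 1000 users per group
share_b_wins = simulate_ab_test(0.10, 0.12, n_per_group=1000)
```

Repeating the simulation with different sample sizes gives a feel for how often a real lift of that size would show up in the observed data.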

52
Q

Data Visualization

A

Representing data graphically to communicate insights.

53
Q

Dashboard

A

A visual interface that displays key performance metrics and data.

54
Q

Power BI

A

A business analytics tool for creating dashboards and visualizations.

55
Q

Tableau

A

A software tool for data visualization and business intelligence.

56
Q

Funnel Analysis

A

A method to track user journey and identify drop-off points.
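Finding the drop-off point comes down to comparing user counts between consecutive steps. The sketch below uses hypothetical step names and counts.

```python
# Hypothetical number of users reaching each step of the journey
funnel = [("visit", 1000), ("signup", 400), ("add_to_cart", 200), ("purchase", 50)]


def step_conversion(funnel):
    """Share of users who continue from each step to the next one."""
    return [
        (prev_name, name, count / prev_count)
        for (prev_name, prev_count), (name, count) in zip(funnel, funnel[1:])
    ]


steps = step_conversion(funnel)
# The transition with the lowest conversion is the biggest drop-off point
worst_step = min(steps, key=lambda s: s[2])
# worst_step == ("add_to_cart", "purchase", 0.25)
```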

57
Q

Cohort Analysis

A

Analyzing behavior over time by grouping users who share a characteristic, such as signup month.

58
Q

Customer Journey

A

The path a customer takes from initial interaction to conversion.

59
Q

Clickstream Data

A

Data collected about user interactions on a website or app.

60
Q

Hadoop

A

A framework for distributed storage and processing of large datasets.

61
Q

Telemetry

A

The collection of data about the usage of a digital product.

62
Q

Data Pipeline

A

A series of steps to process and analyze data from source to destination.

63
Q

Hypothesis Validation

A

The process of testing assumptions with data.

64
Q

Exploratory Data Analysis (EDA)

A

Initial analysis to summarize data characteristics.

65
Q

ETL (Extract, Transform, Load)

A

A process that extracts data from source systems, transforms it into a usable format, and loads it into a destination such as a data warehouse.