Week 4 Flashcards

1
Q

Data

A

recorded values of qualitative or quantitative observations.

2
Q

Population

A

the collection of all subjects of interest.

3
Q

Sample

A

a subset of the population of interest.

4
Q

Parameters

A

a characteristic of a population.

5
Q

Statistic

A

a characteristic of a sample.

6
Q

Levels of Measurement

A

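qualitative [nominal (categories that cannot be put in any order) & ordinal (categories that can be ordered)] & quantitative [interval (equal spacing between values but no true zero, so values can be negative, e.g. temperature in °C) & ratio (equal spacing with a true zero, so values run from 0 upward, e.g. height)]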

7
Q

Measure of Central Tendency

A

Mean (average of data points), Median (middle of data points) and Mode (most recurring data point)
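
As a rough illustration, the three measures can be computed with Python's standard `statistics` module. The `scores` list is made-up data, not part of the course material.

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]  # hypothetical data points

mean_value = statistics.mean(scores)      # sum of the values divided by their count
median_value = statistics.median(scores)  # middle value (average of the two middle values here)
mode_value = statistics.mode(scores)      # value that occurs most often

print(mean_value, median_value, mode_value)
```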

8
Q

Measure of Position

A

Mean, Median, Mode, Min, Max.

9
Q

Measure of Dispersion

A

Range, frequency, variance, standard deviation.

10
Q

Measures of Relationship

A

Covariance, Correlation, Regression, Trend, Forecast.

11
Q

Measures of Asymmetry

A

Skewness and Kurtosis.

12
Q

Statistics

A

the science of collecting, summarizing, and drawing valid conclusions from data. It involves selecting models to validate hypotheses and test assumptions, determining the relationships between variables, assessing data trends and trajectories, identifying patterns and groupings, and detecting mistakes and outliers.

13
Q

Uniform Distribution

A

distribution (continuous or discrete) whose data points lie within a range and all have equal probability of appearing.

14
Q

Binomial Distribution

A

discrete probability distribution, with parameters n and p, of the number of successes in a sequence of n independent experiments, each with a Boolean-valued outcome: success (with probability p) or failure (with probability q = 1 - p).
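
A minimal sketch of the binomial probability mass function, P(X = k) = C(n, k) p^k (1 - p)^(n - k). The function name `binomial_pmf` and the coin-flip numbers are assumptions chosen for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: exactly 2 heads in 4 fair coin flips.
prob_two_heads = binomial_pmf(2, 4, 0.5)  # 6 * 0.5**4 = 0.375

# The probabilities over k = 0..n add up to 1.
total = sum(binomial_pmf(k, 4, 0.5) for k in range(5))
```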

15
Q

Poisson distribution

A

discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event.
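
A small sketch of the Poisson probability mass function, P(X = k) = λ^k e^(-λ) / k!, where λ is the constant mean rate. The name `poisson_pmf` and the rate of 2 events per interval are illustrative assumptions.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean rate per interval is lam."""
    return lam**k * exp(-lam) / factorial(k)

rate = 2.0  # hypothetical mean number of events per interval

prob_no_events = poisson_pmf(0, rate)  # equals exp(-2)
prob_total = sum(poisson_pmf(k, rate) for k in range(50))  # very close to 1
```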

16
Q

Normal distribution

A

continuous, bell-shaped probability distribution defined by its mean and standard deviation. Its importance stems from the fact that the means of large enough samples drawn from a distribution with finite variance are approximately normally distributed, even when that underlying distribution is not normal (CLT).

17
Q

Central Limit Theorem

A

no matter the underlying distribution of the dataset (provided its variance is finite), the sampling distribution of the mean approximates a normal distribution as the sample size n grows. The mean of the sampling distribution equals the mean of the original distribution, and its variance is n times smaller (σ²/n).
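
The theorem can be checked with a small simulation. This is only a sketch: the exponential distribution, the sample size of 30, and the 5000 repetitions are arbitrary choices for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 30                  # size of each sample
num_samples = 5000      # number of samples drawn

# Exponential(1) is strongly skewed; its mean and variance are both 1.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)          # should be close to 1
variance_of_means = statistics.pvariance(sample_means)  # should be close to 1 / n
```

A histogram of `sample_means` would look roughly bell-shaped even though the individual draws are skewed.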

18
Q

Hypothesis Testing

A

the testing of a hypothesis (an idea that can be tested: a supposition or proposed explanation made on the basis of limited evidence as a starting point for further investigation).

19
Q

ANOVA (Analysis of Variance)

A

a collection of statistical models and their associated estimation procedures used to analyze the difference among means. Based on the law of total variance, ANOVA provides a statistical test of whether two or more population means are equal.
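
A minimal sketch of the one-way ANOVA F statistic, which compares the variance between group means with the variance within groups. The helper `one_way_anova_f` and the three groups are hypothetical. Turning F into a p-value requires the F distribution, which is not covered here.

```python
def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - sum(group) / len(group)) ** 2
        for group in groups
        for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical measurements for three groups.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])  # 13 / 1 = 13
```

A large F suggests that at least one group mean differs from the others.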

20
Q

Chi-Squared Analysis

A

a statistical hypothesis test that is valid to perform when the test statistic is chi-squared distributed under the null hypothesis. Used to determine whether there is a statistically significant difference between the expected frequencies and the observed frequencies in one or more categories of a contingency table.
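
As a sketch, the chi-squared statistic sums (observed - expected)² / expected over the categories. The counts below are invented for illustration. The critical value 5.991 is the standard table value for 2 degrees of freedom at the 0.05 significance level.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [50, 30, 20]  # hypothetical observed counts
expected = [40, 40, 20]  # hypothetical expected counts under the null hypothesis

chi2 = chi_squared_statistic(observed, expected)  # 2.5 + 2.5 + 0 = 5.0

CRITICAL_VALUE_DF2_ALPHA_05 = 5.991  # 3 categories - 1 = 2 degrees of freedom
reject_null = chi2 > CRITICAL_VALUE_DF2_ALPHA_05  # False for these counts
```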

21
Q

Standardization

A

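rescaling a variable by subtracting its mean and dividing by its standard deviation so that it has mean 0 and standard deviation 1. Applied to a normally distributed variable, it yields the standard normal distribution N(0,1).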

22
Q

Z score

A

the standard score calculated by subtracting the population mean from an individual raw score and dividing the difference by the population standard deviation.
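
A short sketch of the calculation, z = (x - μ) / σ. The exam score, mean, and standard deviation are made-up numbers, and `z_score` is a hypothetical helper name.

```python
import statistics

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

# Hypothetical exam: population mean 70, population standard deviation 10.
z = z_score(85, 70, 10)  # 1.5 standard deviations above the mean

# Converting every value in a dataset gives mean 0 and standard deviation 1.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mu = statistics.mean(data)
sigma = statistics.pstdev(data)  # population standard deviation
z_scores = [z_score(x, mu, sigma) for x in data]
```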

23
Q

Arithmetic mean, Median, Mode

A

average of data points, center of data points and data point that appears most frequently.

24
Q

Range, Average Deviation, Variance

A

difference between the maximum and minimum data points; average of the absolute deviations of the data points from the mean; average of the squared deviations from the mean (the standard deviation squared).

25
Q

Standard deviation

A

number that indicates how much data points typically deviate from the mean; the square root of the variance.
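
A rough sketch of range, average deviation, variance, and standard deviation on made-up data. It uses the population versions, which divide by n. The sample versions (`statistics.variance`, `statistics.stdev`) divide by n - 1 instead.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data points
mean = statistics.mean(data)

data_range = max(data) - min(data)                             # max minus min
average_deviation = sum(abs(x - mean) for x in data) / len(data)  # mean absolute deviation
variance = statistics.pvariance(data)                          # mean squared deviation
std_dev = statistics.pstdev(data)                              # square root of the variance

print(data_range, average_deviation, variance, std_dev)
```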

26
Q

Covariance

A

a measure of the joint variability of two variables.

27
Q

Correlation

A

a standardized measure of covariance: the covariance of two variables divided by the product of their standard deviations, giving a value between -1 and 1.
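
A minimal sketch relating covariance and correlation, using the sample (n - 1) formulas. The helper names and the perfectly linear data are illustrative assumptions.

```python
import statistics

def sample_covariance(xs, ys):
    """Average product of paired deviations from the means (n - 1 denominator)."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Covariance rescaled by both standard deviations, so it lies in [-1, 1]."""
    return sample_covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))

xs = [1, 2, 3, 4, 5]   # hypothetical variable
ys = [2, 4, 6, 8, 10]  # exactly 2 * xs, so the relationship is perfectly linear

cov_xy = sample_covariance(xs, ys)  # 20 / 4 = 5
corr_xy = correlation(xs, ys)       # approximately 1
```

Covariance depends on the units of the variables, while correlation does not.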

28
Q

Skewness

A

a measure of asymmetry that indicates whether the observations in a dataset are concentrated on one side.
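
One common way to compute skewness is the third central moment divided by the cube of the standard deviation (the population moment coefficient). The sketch below assumes that definition; other software may apply a small-sample correction. Both datasets are made up.

```python
def skewness(data):
    """Population skewness: m3 / m2**1.5, using central moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # variance
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2**1.5

symmetric = [1, 2, 3, 4, 5]        # balanced around the mean: skewness 0
right_skewed = [1, 1, 2, 2, 3, 10]  # long tail to the right: positive skewness

skew_symmetric = skewness(symmetric)
skew_right = skewness(right_skewed)
```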

29
Q

Probability Sampling

A

each element from the population dataset has a known, nonzero chance of being selected for the sample. Ex. Simple, Stratified, Cluster, and Systematic random sampling.
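
A rough sketch of three of these methods using Python's `random` module. The population of 100 unit IDs, the sample size of 10, and the two strata are assumptions made for illustration.

```python
import random

rng = random.Random(42)        # fixed seed so the sketch is repeatable
population = list(range(100))  # hypothetical population of 100 unit IDs

# Simple random sampling: every unit has the same chance of selection.
simple = rng.sample(population, 10)

# Systematic sampling: pick a random start, then take every step-th unit.
step = len(population) // 10
start = rng.randrange(step)
systematic = population[start::step]

# Stratified sampling: split into strata and sample within each one.
strata = {"low": population[:50], "high": population[50:]}
stratified = [unit for group in strata.values() for unit in rng.sample(group, 5)]
```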

30
Q

Non Probability Sampling

A

the practice of sampling without the assurance that every element has a known chance of being selected. Ex. Convenience, Voluntary, Snowball, Quota, and Purposive sampling.

31
Q

Bias

A

the systematic tendency of a subset of a population to misrepresent the overall population, for example because some members are more likely to be selected than others.