statistics Flashcards

1
Q

population

A

the entire set of individuals or objects of interest

2
Q

sample

A

a portion, a selected part, of the population

3
Q

individuals

A

the smallest unit of the population that can be studied

4
Q

statistic

A

a quantity calculated from the sample data, used as an approximation to the population parameter

5
Q

simple random (probability) sampling

A

number the units of the population from 1 to N, then pick the sample by drawing random numbers from 1 to N

6
Q

cluster (probability) sampling

A

divide the population into clusters, then take a simple random sample of whole clusters

7
Q

stratified (probability) sampling

A
  • divide the population into strata
  • simple random sampling within each stratum
  • the strata might have different sample sizes (or not)
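
The three probability sampling schemes on cards 5 to 7 can be compared side by side. This is a minimal sketch using only the standard library; the population of 100 numbered units and the four groups are made up for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: units numbered 1..N, each assigned to one of 4 groups.
N = 100
population = list(range(1, N + 1))
groups = {}
for unit in population:
    groups.setdefault(unit % 4, []).append(unit)

# Simple random sampling: draw 10 distinct units from 1..N.
simple_sample = rng.sample(population, 10)

# Cluster sampling: treat the groups as clusters, pick 2 whole clusters
# at random, and keep every unit inside them.
chosen_clusters = rng.sample(sorted(groups), 2)
cluster_sample = [u for c in chosen_clusters for u in groups[c]]

# Stratified sampling: treat the groups as strata and take a simple random
# sample inside every stratum (3 per stratum here; sizes may differ).
stratified_sample = [u for s in sorted(groups) for u in rng.sample(groups[s], 3)]
```

Cluster sampling keeps only some groups but all of their units; stratified sampling keeps every group but only some of its units.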
8
Q

convenience (non-probability) sampling

A

when units are selected for inclusion in the sample because they are the easiest for the researcher to access

9
Q

snowball (non-probability) sampling

A

a recruitment technique in which research participants are asked to assist researchers in identifying other potential subjects.

10
Q

quota (non-probability) sampling

A

it relies on the non-random selection of a predetermined number or proportion of units.

11
Q

frequency

A

the number of observations for each group

12
Q

absolute frequencies

A

the count of observations in each group

13
Q

relative frequencies

A

percentage (or fraction) of observations in each group
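
A short sketch of absolute versus relative frequencies, using made-up observations:

```python
from collections import Counter

observations = ["a", "b", "a", "c", "a", "b"]  # made-up data

# Absolute frequencies: how many observations fall in each group.
absolute = Counter(observations)

# Relative frequencies: each count as a fraction of all observations.
n = len(observations)
relative = {group: count / n for group, count in absolute.items()}
```

The relative frequencies always sum to 1 (or to 100 when written as percentages).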

14
Q

measures of centrality

A

trying to summarize the data by identifying the central position of the data

15
Q

mean (or average)

A

the sum of the data divided by the number of observations

16
Q

median

A

the middle value once the data are ordered by size (with an even number of values, the average of the two middle ones)

17
Q

mode

A

most frequent observation
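
The three measures of centrality on cards 15 to 17 can be checked on a small made-up data set with Python's standard `statistics` module:

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # made-up data, already ordered by size

mean_value = statistics.mean(data)      # sum 30 divided by 6 observations
median_value = statistics.median(data)  # average of the two middle values, 3 and 5
mode_value = statistics.mode(data)      # 3 appears twice, more than any other value
```

Here the mean is 5, the median is 4, and the mode is 3, so the three measures need not agree.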

18
Q

dispersion

A

informs about the variability in the data

19
Q

variance

A

the average of the squared differences between each value and the mean; it measures how far the values in a data set spread out around the mean

20
Q

standard deviation

A

a statistic that measures the dispersion of a dataset relative to its mean and it is calculated as the square root of the variance
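
A minimal sketch of variance and standard deviation on made-up data. The cards do not say whether to divide by n (population formula) or by n - 1 (sample formula), so both are shown as an assumption.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5

# Population formulas: divide the sum of squared deviations (32) by n = 8.
pop_variance = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)  # square root of the variance

# Sample formulas: divide by n - 1 = 7 instead.
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)
```

With these values the population variance is 4 and the population standard deviation is 2.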

21
Q

boxplot

A

a univariate descriptive plot: it summarizes a single variable by its median, quartiles, and outliers

22
Q

multivariate descriptive statistics

A

shows the relation between two or more variables, which can be of different types

23
Q

statistical inference

A

using data analysis on a sample to draw conclusions about the underlying probability distribution

24
Q

hypothesis testing

A

a procedure in which an analyst uses sample data to test an assumption about a population parameter

25
Q

null hypothesis

A

the hypothesis (H0) that there is no effect or difference, so that any pattern in the observations is due to chance alone

26
Q

false positive

A

(type I error) the investigator rejects a null hypothesis that is actually true in the population. It is usually considered more problematic than a false negative

27
Q

false negative

A

the investigator fails to reject a null hypothesis that is actually false in the population.

28
Q

probability (alpha)

A

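the significance level: the probability of rejecting H0 when it is actually true (a false positive), fixed before running the test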

29
Q

p-value

A

probability, under the null hypothesis, of sampling a test statistic at least as extreme as that which was observed

30
Q

low p-value

A

(p-value below alpha) reject H0 and accept H1

31
Q

high p-value

A

cannot reject H0 and cannot accept H1
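
Cards 28 to 31 can be tied together with a worked example. This is a sketch under stated assumptions: a made-up experiment of 10 coin flips with 9 heads, H0 "the coin is fair", an alpha of 0.05, and "extreme" measured as distance from the expected 5 heads.

```python
from math import comb

n, heads = 10, 9  # made-up data
alpha = 0.05      # significance level chosen before the test (assumption)

def prob_under_h0(k):
    """Probability of exactly k heads in n flips if the coin is fair (H0)."""
    return comb(n, k) * 0.5 ** n

# p-value: probability, under H0, of a result at least as extreme as the
# observed one, i.e. at least as far from n / 2 in either direction.
observed_distance = abs(heads - n / 2)
p_value = sum(
    prob_under_h0(k)
    for k in range(n + 1)
    if abs(k - n / 2) >= observed_distance
)

reject_h0 = p_value < alpha  # low p-value: reject H0
```

The outcomes counted as extreme are 0, 1, 9, and 10 heads, so the p-value is 22/1024, about 0.021, which is below alpha.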

32
Q

homogeneity of contingency tables

A

when the distribution of observations in the rows (or in the columns) could be explained by random sampling of the observations in the columns (or rows)

33
Q

Shapiro-Wilk

A

a normality test, usually recommended for small samples (n < 50)

34
Q

Kolmogorov-Smirnov

A

a test that compares one sample against a reference distribution, or two samples against each other
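
For the one-sample case, the test is based on the largest gap between the sample's empirical distribution and the reference distribution. The sketch below computes only that gap (the statistic, not the p-value); the helper `ks_statistic` and the data are hypothetical, written for illustration.

```python
from statistics import NormalDist

def ks_statistic(sample, cdf):
    """Largest vertical gap between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x,
        # so check the gap on both sides of the jump.
        gap = max(gap, i / n - f, f - (i - 1) / n)
    return gap

sample = [-1.2, -0.4, 0.1, 0.6, 1.5]  # made-up data
d = ks_statistic(sample, NormalDist().cdf)  # compare with a standard normal
```

A small gap means the sample is compatible with the reference distribution; a statistics package would turn the gap into a p-value.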

35
Q

95% confidence interval

A

an interval built so that, over repeated samples, 95% of the intervals obtained contain the parameter value
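
The 95% figure can be checked by simulation. This is a sketch under stated assumptions: a hypothetical normal population with known mean and standard deviation, and the usual z-based interval for the mean.

```python
import random
from statistics import NormalDist, mean

rng = random.Random(1)  # fixed seed so the sketch is repeatable

mu, sigma, n = 10.0, 2.0, 30  # hypothetical population and sample size
z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
half_width = z * sigma / n ** 0.5

trials = 2000
hits = 0
for _ in range(trials):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    m = mean(sample)
    # Count the intervals that contain the true parameter value.
    if m - half_width <= mu <= m + half_width:
        hits += 1

coverage = hits / trials  # fraction of intervals that contain mu
```

The coverage should land close to 0.95: the 95% describes how often the procedure captures the parameter, not a single interval.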